It is very often the case that the ultra-formal notation and style that mathematicians adopt can seem pedantic, or even pretentious, but the abstract problems they try to precisely define are often equivalent to an everyday puzzle. 3SAT is a good example. On the one hand, I could babble away about it being an NP-complete language with a polynomial time reduction from CNF-SAT, and I will, but the problem underlying 3SAT can also be explained in terms of model trains.

Suppose you're an amateur rail enthusiast. You are designing a new station for your little trains to be late at, but unlike the real rail networks, you want some guarantee that your station's signalling system will definitely prevent any collisions between your lovingly crafted model engines. Each of the signals can take one of two colours, green or red. The requirements for safety will be very precisely defined, e.g. "*platform 9 can only be green if platform 8 and 10 are red or if platform 6 is green and platform 5 is red*". To find out just what sort of timetabling your safety constraits allow, you try and list all the different signal combinations that are allowed. This is an equivalent problem to 3SAT.

Although the above problem is *equivalent* to 3SAT, it's not exactly the same. Whereas the signalling problem is basically finding solutions to a large boolean expression, 3SAT is only concerned with boolean expressions in a certain, strict form.

### What is 3SAT?

3SAT is the set of satisfiable boolean expressions that are expressed in 3CNF. 3CNF is just a particular way to write down boolean expressions - all expressions can be converted to it, and it is a canonical form. For an expression to be in 3CNF form, it must satisfy two constraints:

**the expression is in conjunctive normal form**: this is a less strict convention that requires an expression to be made up of `AND`

ed together clauses, which themselves are made up of `OR`

red together literals.
**each clause contains at most 3 literals**: there can be any number of clauses `AND`

ed together, but there can only be one, two or three literals in each clause.

So,

`( x | y | ¬z) & ( ¬d | e )`

is in 3SAT
`( w | x | y | z )`

isn't - it has four literals in a clause
`( x ) & ( ¬x )`

isn't - it is unsatisfiable; it cannot evaluate to `true`

`( x & y | z )`

isn't - it has got an `AND`

inside a clause.

Every boolean expression can be converted into 3CNF, so every satisfiable boolean expression can be mapped onto an expression in 3SAT.

To show that a language is NP-complete, all that is required is to show that an existing NP-complete problem can be simulated efficiently by the target language. For example, in 3-colorability, I use the fact that 3SAT is an NP-complete language to prove that 3-colourability is itself NP-complete. To show that 3SAT is NP-complete I will simulate CNF-SAT, which can be shown to be NP-complete from first principles.

CNF-SAT is a less strict version of 3SAT. All its elements are satisfiable boolean expressions, like 3SAT, but they only have to satisfy the first constraint listed above, not the second. Therefore, unlike in 3SAT, `( w | x | y | z )`

is in CNF-SAT, as there is no limit on the number of literals in a clause.

To show that CNF-SAT can be simulated by 3SAT, all we have to do is find a way to efficiently (in polynomial time), convert an expression in CNF-SAT into an equivalent expression in 3SAT.

To do so, we must consider the "problem clauses"; ones with more than 3 literals in, such as this:

`( l`_{1} | l_{2} | ... | l_{k-1} | l_{k}

)

To convert this clause, a recursive algorithm can be applied that strips literals off the right hand end of the expression, always preserving the meaning of the overall expression. The above clause would be converted into these equivalent clauses:

`( l`_{1} | n_{1} ) & ( ¬n_{1} | l_{2} | n_{2} ) & ( ¬n_{2} | l_{3} | n_{3} ) & ... & ( ¬n_{k-2} | l_{k-1} | n_{k-1} ) & ( ¬n_{k-1} | l_{k} )

In this expression, which is equivalent to the original, the `n`

variables have been inserted as links between the `l`

literals. The assignment to these variables doesn't affect the overall result of the expression - try it yourself with a small example!!

This algorithm is very efficient - it only needs to do perform a one-time pass down all the literals in all the clauses. Therefore, we have found a way of simulating a CNF-SAT language using 3SAT. As CNF-SAT is NP-complete so is 3SAT.