Set theory treats all mathematical objects as the same kind of thing, a set, from whose elementary properties all of mathematics can gradually be built. The essence of a set is that it contains other things (called its elements or members). There is no need to specify the sense of containment in any detail: it is a purely abstract notion.

Georg Cantor began with the loose idea of a set or collection or group of things (not group in the technical sense), and worked out how to treat these collections as individual objects in themselves. He bit the bullet and studied properties of infinite collections, and found there were definite rules, and in fact different sizes of infinity. At the time this was held by many to be unacceptable, and even absurd. He was hotly attacked by finitist mathematicians such as Leopold Kronecker.

Gottlob Frege considered classes to exemplify some property, and in particular thought about the numbers as abstract entities: what every three-member collection had in common was the property of having three members, so this property defined the class of all such things. This he identified as the meaning of the number.

Bertrand Russell discovered the paradox that bears his name (the class of all classes that are not members of themselves would have to be a member of itself exactly when it is not), and it was conceded that you could not simply consider any class you pleased and expect it to behave the way Cantor had worked out. Some classes were inherently contradictory. Russell and Alfred North Whitehead developed a solution to this in their Principia Mathematica, requiring objects to exist in a hierarchy of types, so that a lower one could not dominate a higher one.

Ernst Zermelo also proposed a hierarchical resolution in about 1908, but his was an axiomatic construction: that is, it began with the most elementary properties, and admitted to existence only those entities that could be shown to obey the properties. (Whether there could be entities other than those that can be explicitly constructed is a permanently open question: the assertion that there are none is the axiom V = L, and it has been shown to be independent of Zermelo's construction of set theory.)

He began with the notions of membership and equality. Then the only thing that can be said about a thing a and a set b is whether a ∈ b, that is, whether a is a member of b. Two sets are the same set if they have the same members. A set with several members a, b, and c is denoted {a, b, c}. From here Zermelo admitted axioms, principles so obvious that they must be true. If a and b are sets then {a, b} is a set. The set uniting all their elements is a set. If some of the elements of a set have a property in common, then there exists a set composed of exactly those elements that have the property: a subset. The set of all subsets of a set is a set.
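For readers who think better in code, the four constructions just described can be mimicked in Python with frozenset, which is hashable and so can itself be a member of another frozenset. This is only a rough sketch for finite sets: the function names pair, union, separate and power_set are my own labels for illustration, and Python's sets are in no sense a model of Zermelo's theory.

```python
from itertools import chain, combinations


def pair(a, b):
    """Pairing: from sets a and b, form the set {a, b}."""
    return frozenset({a, b})


def union(s):
    """Union: collect the members of every member of s into one set."""
    return frozenset(chain.from_iterable(s))


def separate(s, prop):
    """Separation: the subset of s whose members satisfy prop."""
    return frozenset(x for x in s if prop(x))


def power_set(s):
    """Power set: the set of all subsets of s (finite s only)."""
    items = list(s)
    return frozenset(
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    )
```

For instance, with a = frozenset() and b = frozenset({a}), union(pair(a, b)) gives back b, and power_set(pair(a, b)) has four members.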

There are about ten axioms in all (depending on the exact formulation), and they are called the Zermelo-Fraenkel axioms, or ZF, also honouring Abraham Fraenkel, who refined them around 1921. This write-up is a very general survey, and mathematicians reading it will be painfully aware of how vague I'm being. More detail will no doubt be found under specific heads such as Axiom of Separation when I or someone else gets around to them. ZF is absolutely central to all of modern mathematics. Alternatives have been proposed, but they seem mere shadows of the original conception. In the standard theory, everything is a set. It is possible to work with ZF plus atoms (called ZFA), where an atom is something that isn't a set but can be a member of a set -- but it is unnecessary.

There exists an empty set (or null set). It has no members, is denoted { } or Ø, and there is only one empty set: any two sets having no members have the same members, so are the same set. There exist one-member sets. If a is a set then {a, a} is a set (stated above), and that has the same members as {a}, so {a} is a set.
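The "same members, same set" rule is easy to watch in action. Python's frozenset happens to compare by membership alone, so it serves as a small illustration (not a proof) of why there is only one empty set and why {a, a} is just {a}. The variable names here are made up for the example.

```python
# Two empty sets, built in different ways.
empty_one = frozenset()
empty_two = frozenset([])

# Equality looks only at the members, so these are the same set.
same_empty = (empty_one == empty_two)

# Listing a member twice adds nothing: {a, a} has the same members as {a}.
a = empty_one
doubled = frozenset([a, a])
single = frozenset([a])
same_singleton = (doubled == single)

print(same_empty, same_singleton)
```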

Build the numbers as follows. 0 = { }. 1 = {0}. 2 = {0, 1}, i.e. {0, {0}}. Build up from there, 3 = {0, 1, 2} and so on. All the familiar properties of the numbers 0, 1, 2, ... can be reformulated equivalently as set-theoretic properties of this particular hierarchy of sets. Then other structures such as relations and functions can be considered as subsets of certain large combinations of these.
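The construction above (usually credited to von Neumann) can be sketched in a few lines of Python. Each number is the set of all smaller numbers, so the next number after n is n together with n itself as an extra member. The helper names successor and numeral are mine, chosen for the example.

```python
def successor(n):
    """The next number after n: all members of n, plus n itself."""
    return n | frozenset({n})


def numeral(k):
    """Build the set standing for the ordinary integer k (k >= 0)."""
    n = frozenset()  # 0 is the empty set
    for _ in range(k):
        n = successor(n)
    return n


zero, one, two, three = (numeral(k) for k in range(4))
```

With these definitions "less than" turns into plain membership: two in three is True, and the number of members of numeral(k) is k.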

To define an infinite set requires a new axiom in addition to the ones I've introduced, and that allows a set omega equal to {0, 1, 2, 3, ...}. This introduces a completely new arithmetic of infinities, and leads to the creation of two hierarchies, the cardinals and the ordinals. These go on for ever and ever and become seriously more infinite in profound ways as they go on, but some infinities would be so large that they cannot be reached with the existing axioms, and a new axiom has to be adopted just to create them. Such cardinals are called inaccessible.


Disclaimer for mathematicians. This is written largely off the top of my head, so I'd appreciate correction by /msg if you spot a substantive factual error: but I'm trying to keep technical notation and detail out of it. It's already a long enough introduction. Feel free to beat me to nodes for any of the details, though I'll get to them eventually.