Constructive set theory

Axiomatic constructive set theory is an approach to mathematical constructivism following the program of axiomatic set theory. The same first-order language with " $=$ " and " $\in$ " of classical set theory is usually used, so this is not to be confused with a constructive types approach. On the other hand, some constructive theories are indeed motivated by their interpretability in type theories.

In addition to rejecting the principle of excluded middle ( ${\mathrm {PEM} }$ ), constructive set theories often require some logical quantifiers in their axioms to be set bounded. The latter is motivated by results tied to impredicativity.

Introduction

Constructive outlook

Preliminary on the use of intuitionistic logic

The logic of the set theories discussed here is constructive in that it rejects the principle of excluded middle ${\mathrm {PEM} }$ , i.e. that the disjunction $\phi \lor \neg \phi$ automatically holds for all propositions $\phi$ . This is also often called the law of excluded middle ( ${\mathrm {LEM} }$ ) in contexts where it is assumed. Constructively, as a rule, to prove the excluded middle for a proposition $P$ , i.e. to prove the particular disjunction $P\lor \neg P$ , either $P$ or $\neg P$ needs to be explicitly proven. When either such proof is established, one says the proposition is decidable, and this then logically implies the disjunction holds. Similarly and more commonly, a predicate $Q(x)$ for $x$ in a domain $X$ is said to be decidable when the more intricate statement $\forall (x\in X).{\big (}Q(x)\lor \neg Q(x){\big )}$ is provable. Non-constructive axioms may enable proofs that formally claim decidability of such $P$ (and/or $Q$ ) in the sense that they prove excluded middle for $P$ (resp. the statement using the quantifier above) without demonstrating the truth of either side of the disjunction(s). This is often the case in classical logic. In contrast, axiomatic theories deemed constructive tend to not permit many classical proofs of statements involving properties that are provenly computationally undecidable.

The law of noncontradiction is a special case of the propositional form of modus ponens. Using the former with any negated statement $\neg P$ , one valid De Morgan's law thus implies $\neg \neg (P\lor \neg P)$ already in the more conservative minimal logic. In words, intuitionistic logic still posits: It is impossible to rule out a proposition and rule out its negation both at once, and thus the rejection of any instantiated excluded middle statement for an individual proposition is inconsistent. Here the double-negation captures that the disjunction statement now provenly can never be ruled out or rejected, even in cases where the disjunction may not be provable (for example, by demonstrating one of the disjuncts, thus deciding $P$ ) from the assumed axioms.
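
The derivation can be checked mechanically. The following is a minimal sketch in Lean 4, assuming only its core library and no classical axioms; modeling propositions as terms of `Prop` is an assumption of this illustration and not part of the set theories discussed. Neither proof invokes explosion, so both are also valid in minimal logic.

```lean
-- Lean 4, core library only; no classical axioms are used.

-- Noncontradiction: a proof of ¬P is a function P → False,
-- so applying it to a proof of P is an instance of modus ponens.
example (P : Prop) : ¬(P ∧ ¬P) :=
  fun h => h.2 h.1

-- Double-negated excluded middle: assuming h : ¬(P ∨ ¬P),
-- first refute P via the left disjunct, then feed that refutation
-- back into h via the right disjunct.
example (P : Prop) : ¬¬(P ∨ ¬P) :=
  fun h => h (Or.inr (fun p => h (Or.inl p)))
```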

More generally, constructive mathematical theories tend to prove classically equivalent reformulations of classical theorems. For example, in constructive analysis, one cannot prove the intermediate value theorem in its textbook formulation, but one can prove theorems with algorithmic content that, as soon as double negation elimination and its consequences are assumed legal, are at once classically equivalent to the classical statement. The difference is that the constructive proofs are harder to find.

The intuitionistic logic underlying the set theories discussed here, unlike minimal logic, still permits double negation elimination for individual propositions $P$ for which excluded middle holds. In turn, the theorem formulations regarding finite objects tend not to differ from their classical counterparts. Given a model of all natural numbers, the equivalent for predicates, namely Markov's principle, does not automatically hold, but may be considered as an additional principle.
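
As a hedged illustration of the first claim, the following Lean 4 sketch (core library only, with propositions modeled as terms of `Prop` as an assumption of the example) derives double negation elimination for a single proposition from excluded middle for that proposition. The second case relies on explosion, here in the form of `absurd`, which is exactly the step unavailable in minimal logic.

```lean
-- Lean 4, core library only; no classical axioms are used.

-- If P ∨ ¬P is available, then ¬¬P → P follows by case analysis.
example (P : Prop) (em : P ∨ ¬P) : ¬¬P → P :=
  fun nnp =>
    em.elim
      (fun p => p)                 -- case P: done
      (fun np => absurd np nnp)    -- case ¬P: contradicts ¬¬P, use explosion
```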

In an inhabited domain and using explosion, the disjunction $P\lor \exists (x\in X).\neg Q(x)$ implies the existence claim $\exists (x\in X).(Q(x)\to P)$ , which in turn implies ${\big (}\forall (x\in X).Q(x){\big )}\to P$ . Classically, these implications are always reversible. If one of the former is classically valid, it can be worth trying to establish it in the latter form. For the special case where $P$ is rejected, one deals with a counter-example existence claim $\exists (x\in X).\neg Q(x)$ , which is generally constructively stronger than a rejection claim $\neg \forall (x\in X).Q(x)$ : Exemplifying a $t$ such that $Q(t)$ is contradictory of course means it is not the case that $Q$ holds for all possible $x$ . But one may also demonstrate that $Q$ holding for all $x$ would logically lead to a contradiction without the aid of a specific counter-example, and even while not being able to construct one. In the latter case, constructively, here one does not stipulate an existence claim.
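
The chain of implications can be sketched in Lean 4 as follows. This is an illustration under stated assumptions: the domain $X$ is modeled as a type, the inhabitant is supplied as a hypothetical element `x₀`, and only the core library is used without classical axioms.

```lean
-- Lean 4, core library only; no classical axioms are used.

-- From P ∨ ∃x.¬Q(x) to ∃x.(Q(x) → P), using the inhabitant x₀ and explosion.
example {X : Type} (x₀ : X) (P : Prop) (Q : X → Prop) :
    (P ∨ ∃ x, ¬Q x) → ∃ x, (Q x → P) :=
  fun h => h.elim
    (fun p => ⟨x₀, fun _ => p⟩)
    (fun ⟨x, nq⟩ => ⟨x, fun q => absurd q nq⟩)

-- From ∃x.(Q(x) → P) to (∀x.Q(x)) → P.
example {X : Type} (P : Prop) (Q : X → Prop) :
    (∃ x, (Q x → P)) → (∀ x, Q x) → P :=
  fun ⟨x, h⟩ => fun all => h (all x)

-- A counter-example implies the rejection claim; the converse is not
-- constructively derivable in general.
example {X : Type} (Q : X → Prop) : (∃ x, ¬Q x) → ¬∀ x, Q x :=
  fun ⟨x, nq⟩ => fun all => nq (all x)
```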

Imposed restrictions on a set theory

Compared to the classical counterpart, one is generally less likely to prove the existence of relations that cannot be realized. A restriction to the constructive reading of existence a priori leads to stricter requirements regarding which characterizations of a set $f\subset X\times Y$ involving unbounded collections constitute a (mathematical, and so always meaning total) function. This is often because the predicate in a case-wise would-be definition may not be decidable. Adopting the standard definition of set equality via extensionality, the full Axiom of Choice is such a non-constructive principle that implies ${\mathrm {PEM} }$ for the formulas permitted in one's adopted Separation schema, by Diaconescu's theorem. Similar results hold for the Axiom of Regularity existence claim, as shown below. The latter has a classically equivalent inductive substitute. So a genuinely intuitionistic development of set theory requires the rewording of some standard axioms to classically equivalent ones. Apart from demands for computability and reservations regarding impredicativity,^[1] the technical question of which non-logical axioms effectively extend the underlying logic of a theory is also a research subject in its own right.

Metalogic

With computably undecidable propositions already arising in Robinson arithmetic, even just Predicative separation lets one define elusive subsets easily. In stark contrast to the classical framework, constructive set theories may be closed under the rule that any property that is decidable for all sets is already equivalent to one of the two trivial ones, $\top$ or $\bot$ . Also the real line may be taken to be indecomposable in this sense. Undecidability of disjunctions also affects the claims about total orders such as that of all ordinal numbers, expressed by the provability and rejection of the clauses in the order defining disjunction $(\alpha \in \beta )\lor (\alpha =\beta )\lor (\beta \in \alpha )$ . This determines whether the relation is trichotomous. A weakened theory of ordinals in turn affects the proof theoretic strength defined in ordinal analysis.

In exchange, constructive set theories can exhibit attractive disjunction and existence properties, as is familiar from the study of constructive arithmetic theories. These are features of a fixed theory which metalogically relate judgements of propositions provable in the theory. Particularly well-studied are those features that can be expressed in Heyting arithmetic, with quantifiers over numbers and which can often be realized by numbers, as formalized in proof theory. In particular, those are the numerical existence property and the closely related disjunction property, as well as being closed under Church's rule, witnessing any given function to be computable.^[2]

A set theory does not only express theorems about numbers, and so one may consider a more general so-called strong existence property that is harder to come by, as will be discussed. A theory has this property if the following can be established: For any property $\phi$ , if the theory proves that a set exists that has that property, i.e. if the theory claims the existence statement, then there is also a property $\psi$ that uniquely describes such a set instance. More formally, for any predicate $\phi$ there is a predicate $\psi$ so that

${\mathsf {T}}\vdash \exists x.\phi (x)\implies {\mathsf {T}}\vdash \exists !x.\phi (x)\land \psi (x)$

The role analogous to that of realized numbers in arithmetic is played here by defined sets proven to exist by (or according to) the theory. Questions concerning the axiomatic set theory's strength and its relation to term construction are subtle. While many theories discussed tend to have all the various numerical properties, the existence property can easily be spoiled, as will be discussed. Weaker forms of existence properties have been formulated.

Some theories with a classical reading of existence can in fact also be constrained so as to exhibit the strong existence property. In Zermelo–Fraenkel set theory with sets all taken to be ordinal-definable, a theory denoted ${\mathsf {ZF}}+({\mathrm {V} }={\mathrm {HOD} })$ , no sets without such definability exist. The property is also enforced via the constructible universe postulate in ${\mathsf {ZF}}+({\mathrm {V} }={\mathrm {L} })$ . For contrast, consider the theory ${\mathsf {ZFC}}$ given by ${\mathsf {ZF}}$ plus the full axiom of choice existence postulate: Recall that this collection of axioms proves the well-ordering theorem, implying well-orderings exist for any set. In particular, this means that relations $W\subset {\mathbb {R} }\times {\mathbb {R} }$ formally exist that establish the well-ordering of ${\mathbb {R} }$ (i.e. the theory claims the existence of a least element for all subsets of ${\mathbb {R} }$ with respect to those relations). This is despite the fact that definability of such an ordering is known to be independent of ${\mathsf {ZFC}}$ . The latter implies that for no particular formula $\psi$ in the language of the theory does the theory prove that the corresponding set is a well-ordering relation of the reals. So ${\mathsf {ZFC}}$ formally proves the existence of a subset $W\subset {\mathbb {R} }\times {\mathbb {R} }$ with the property of being a well-ordering relation, but at the same time no particular set $W$ for which the property could be validated can possibly be defined.

Anti-classical principles

As mentioned above, a constructive theory ${\mathsf {T}}$ may exhibit the numerical existence property, ${\mathsf {T}}\vdash \exists e.\psi (e)\implies {\mathsf {T}}\vdash \psi ({\underline {\mathrm {e} }})$ , for some number ${\mathrm {e} }$ and where ${\underline {\mathrm {e} }}$ denotes the corresponding numeral in the formal theory. Here one must carefully distinguish between provable implications between two propositions, ${\mathsf {T}}\vdash P\to Q$ , and a theory's properties of the form ${\mathsf {T}}\vdash P\implies {\mathsf {T}}\vdash Q$ . When adopting a metalogically established schema of the latter type as an inference rule of one's proof calculus and nothing new can be proven, one says the theory ${\mathsf {T}}$ is closed under that rule.

One may instead consider adjoining the rule corresponding to the meta-theoretical property as an implication (in the sense of " $\to$ ") to ${\mathsf {T}}$ , as an axiom schema or in quantified form. A situation commonly studied is that of a fixed ${\mathsf {T}}$ exhibiting the meta-theoretical property of the following type: For an instance from some collection of formulas of a particular form, here captured via $\phi$ and $\psi$ , one establishes the existence of a number ${\mathrm {e} }$ so that ${\mathsf {T}}\vdash \phi \implies {\mathsf {T}}\vdash \psi ({\underline {\mathrm {e} }})$ . Here one may then postulate $\phi \to \exists (e\in {\mathbb {N} }).\psi (e)$ , where the bound $e$ is a number variable in the language of the theory. For example, Church's rule is an admissible rule in first-order Heyting arithmetic ${\mathsf {HA}}$ and, furthermore, the corresponding Church's thesis principle ${\mathrm {CT} }_{0}$ may consistently be adopted as an axiom. The new theory with the principle added is anti-classical, in that it may not be consistent anymore to also adopt ${\mathrm {PEM} }$ . Similarly, adjoining the excluded middle principle ${\mathrm {PEM} }$ to some theory ${\mathsf {T}}$ , the theory thus obtained may prove new, strictly classical statements, and this may spoil some of the meta-theoretical properties that were previously established for ${\mathsf {T}}$ . In such a fashion, ${\mathrm {CT} }_{0}$ may not be adopted in ${\mathsf {HA}}+{\mathrm {PEM} }$ , also known as Peano arithmetic ${\mathsf {PA}}$ .

The focus in this subsection shall be on set theories with quantification over a fully formal notion of an infinite sequence space, i.e. function space, as will be introduced further below. A translation of Church's rule into the language of the theory itself may here read

$\forall (f\in {\mathbb {N} }^{\mathbb {N} }).\exists (e\in {\mathbb {N} }).{\Big (}\forall (n\in {\mathbb {N} }).\exists (w\in {\mathbb {N} }).T(e,n,w)\land U(w,f(n)){\Big )}$

Kleene's T predicate together with the result extraction expresses that any input number $n$ being mapped to the number $f(n)$ is, through $w$ , witnessed to be a computable mapping. Here ${\mathbb {N} }$ now denotes a set theory model of the standard natural numbers and $e$ is an index with respect to a fixed program enumeration. Stronger variants have been used, which extend this principle to functions $f\in {\mathbb {N} }^{X}$ defined on domains $X\subset {\mathbb {N} }$ of low complexity. The principle rejects decidability for the predicate $Q(e)$ defined as $\exists (w\in {\mathbb {N} }).T(e,e,w)$ , expressing that $e$ is the index of a computable function halting on its own index. Weaker, double negated forms of the principle may be considered too, which do not require the existence of a recursive implementation for every $f$ , but which still make principles inconsistent that claim the existence of functions which provenly have no recursive realization. Some forms of Church's thesis as a principle are even consistent with the classical, weak so-called second-order arithmetic theory ${\mathsf {RCA}}_{0}$ , a subsystem of the two-sorted first-order theory ${\mathsf {Z}}_{2}$ .

The collection of computable functions is classically subcountable, which classically is the same as being countable. But classical set theories will generally claim that ${\mathbb {N} }^{\mathbb {N} }$ holds also other functions than the computable ones. For example there is a proof in ${\mathsf {ZF}}$ that total functions (in the set theory sense) do exist that cannot be captured by a Turing machine. Taking the computable world seriously as ontology, a prime example of an anti-classical conception related to the Markovian school is the permitted subcountability of various uncountable collections. When adopting the subcountability of the collection of all unending sequences of natural numbers ( ${\mathbb {N} }^{\mathbb {N} }$ ) as an axiom in a constructive theory, the "smallness" (in classical terms) of this collection, in some set theoretical realizations, is then already captured by the theory itself. A constructive theory may also adopt neither classical nor anti-classical axioms and so stay agnostic towards either possibility.

Constructive principles already prove $\forall (x\in X).\neg \neg {\big (}Q(x)\lor \neg Q(x){\big )}$ for any $Q$ . And so for any given element $x$ of $X$ , the corresponding excluded middle statement for the proposition cannot be negated. Indeed, for any given $x$ , by noncontradiction it is impossible to rule out $Q(x)$ and rule out its negation both at once, and the relevant De Morgan's rule applies as above. But a theory may in some instances also permit the rejection claim $\neg \forall (x\in X).{\big (}Q(x)\lor \neg Q(x){\big )}$ . Adopting this does not necessitate providing a particular $t\in X$ witnessing the failure of excluded middle for the particular proposition $Q(t)$ , i.e. witnessing the inconsistent $\neg {\big (}Q(t)\lor \neg Q(t){\big )}$ . Predicates $Q(x)$ on an infinite domain $X$ correspond to decision problems. Motivated by provenly computably undecidable problems, one may reject the possibility of decidability of a predicate without also making any existence claim in $X$ . As another example, such a situation is enforced in Brouwerian intuitionistic analysis, in a case where the quantifier ranges over infinitely many unending binary sequences and $Q(x)$ states that a sequence $x$ is everywhere zero. Concerning this property, of being conclusively identified as the sequence which is forever constant, adopting Brouwer's continuity principle strictly rules out that this could be proven decidable for all the sequences.

So in a constructive context with a so-called non-classical logic as used here, one may consistently adopt axioms which are in contradiction to quantified forms of excluded middle and also non-constructive in the computable sense or as gauged by meta-logical existence properties discussed previously. In that way, a constructive set theory can also provide the framework to study non-classical theories, say rings modeling smooth infinitesimal analysis.
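
The asymmetry between a rejection claim and a witness for it can be made concrete. The Lean 4 sketch below (core library only, no classical axioms, with the domain $X$ modeled as a type as an assumption of the example) shows that no $t\in X$ can witness a failure of excluded middle for $Q(t)$ . It says nothing about the rejection claim $\neg \forall (x\in X).{\big (}Q(x)\lor \neg Q(x){\big )}$ , which can only be obtained from additional axioms.

```lean
-- Lean 4, core library only; no classical axioms are used.

-- There is no witness t for which Q(t) ∨ ¬Q(t) is refuted.
example {X : Type} (Q : X → Prop) : ¬∃ t, ¬(Q t ∨ ¬Q t) :=
  fun ⟨_, h⟩ => h (Or.inr (fun q => h (Or.inl q)))
```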

History and overview

Historically, the subject of constructive set theory (often also " ${\mathsf {CST}}$ ") began with John Myhill's work on the theories also called ${\mathsf {IZF}}$ and ${\mathsf {CST}}$ .^[3]^[4]^[5] In 1973, he had proposed the former as a first-order set theory based on intuitionistic logic, taking the most common foundation ${\mathsf {ZFC}}$ and throwing out the Axiom of choice as well as the principle of the excluded middle, initially leaving everything else as is. However, different forms of some of the ${\mathsf {ZFC}}$ axioms which are equivalent in the classical setting are inequivalent in the constructive setting, and some forms imply ${\mathrm {PEM} }$ , as will be demonstrated. In those cases, the intuitionistically weaker formulations were consequently adopted. The far more conservative system ${\mathsf {CST}}$ is also a first-order theory, but of several sorts and bounded quantification, aiming to provide a formal foundation for Errett Bishop's program of constructive mathematics.

The main discussion presents a sequence of theories in the same language as ${\mathsf {ZF}}$ , leading up to Peter Aczel's well-studied ${\mathsf {CZF}}$ ,^[6] and beyond. Many modern results trace back to Rathjen and his students. ${\mathsf {CZF}}$ is also characterized by the two features present also in Myhill's theory: On the one hand, it uses Predicative Separation instead of the full, unbounded Separation schema. Boundedness can be handled as a syntactic property or, alternatively, the theories can be conservatively extended with a higher boundedness predicate and its axioms. On the other hand, the impredicative Powerset axiom is discarded, generally in favor of related but weaker axioms. The strong form is very casually used in classical general topology. Adding ${\mathrm {PEM} }$ to a theory even weaker than ${\mathsf {CZF}}$ recovers ${\mathsf {ZF}}$ , as detailed below.^[7] The system, which has come to be known as Intuitionistic Zermelo–Fraenkel set theory ( ${\mathsf {IZF}}$ ), is a strong set theory without ${\mathrm {PEM} }$ . It is similar to ${\mathsf {CZF}}$ , but less conservative or predicative. The theory denoted ${\mathsf {IKP}}$ is the constructive version of ${\mathsf {KP}}$ , the classical Kripke–Platek set theory without a form of Powerset and where even the Axiom of Collection is bounded.

Models

Many theories studied in constructive set theory are mere restrictions of Zermelo–Fraenkel set theory ( ${\mathsf {ZF}}$ ) with respect to their axioms as well as their underlying logic. Such theories can then also be interpreted in any model of ${\mathsf {ZF}}$ .

Peano arithmetic ${\mathsf {PA}}$ is bi-interpretable with the theory given by ${\mathsf {ZF}}$ minus Infinity and without infinite sets, plus the existence of all transitive closures. (The latter is also implied after promoting Regularity to the Set Induction schema, which is discussed below.) Likewise, constructive arithmetic can also be taken as an apology for most axioms adopted in ${\mathsf {CZF}}$ : Heyting arithmetic ${\mathsf {HA}}$ is bi-interpretable with a weak constructive set theory,^[8]^[9] as also described in the article on ${\mathsf {HA}}$ . One may arithmetically characterize a membership relation " $\in$ " and with it prove - instead of the existence of a set of natural numbers $\omega$ - that all sets in its theory are in bijection with a (finite) von Neumann natural, a principle denoted ${\mathrm {V} }={\mathrm {Fin} }$ . This context further validates Extensionality, Pairing, Union, Binary Intersection (which is related to the Axiom schema of predicative separation) and the Set Induction schema. Taken as axioms, the aforementioned principles constitute a set theory that is already identical with the theory given by ${\mathsf {CZF}}$ minus the existence of $\omega$ but plus ${\mathrm {V} }={\mathrm {Fin} }$ as axiom. All those axioms are discussed in detail below. Relatedly, ${\mathsf {CZF}}$ also proves that the hereditarily finite sets fulfill all the previous axioms. This is a result which persists when passing on to ${\mathsf {PA}}$ and ${\mathsf {ZF}}$ minus Infinity.

As far as constructive realizations go, there is a relevant realizability theory. Relatedly, Aczel's constructive Zermelo–Fraenkel theory ${\mathsf {CZF}}$ has been interpreted in Martin-Löf type theories, as sketched in the section on ${\mathsf {CZF}}$ . In this way, theorems provable in this and weaker set theories are candidates for a computer realization.

Presheaf models for constructive set theories have also been introduced. These are analogous to presheaf models for intuitionistic set theory developed by Dana Scott in the 1980s.^[10]^[11] Realizability models of ${\mathsf {CZF}}$ within the effective topos have been identified, which, say, at once validate full Separation, relativized dependent choice ${\mathrm {RDC} }$ , independence of premise ${\mathrm {IP} }$ for sets, but also the subcountability of all sets, Markov's principle ${\mathrm {MP} }$ and Church's thesis ${\mathrm {CT} _{0}}$ in the formulation for all predicates.^[12]

Notation

In an axiomatic set theory, sets are the entities exhibiting properties. But there is then a more intricate relation between the set concept and logic. For example, the property of being a natural number smaller than 100 may be reformulated as being a member of the set of numbers with that property. The set theory axioms govern set existence and thus govern which predicates can be materialized as an entity in itself, in this sense. Specification is also directly governed by the axioms, as discussed below. For a practical consideration, consider for example the property of being a sequence of coin flip outcomes that overall show more heads than tails. This property may be used to separate out a corresponding subset of any set of finite sequences of coin flips. Relatedly, the measure theoretic formalization of a probabilistic event is explicitly based around sets and provides many more examples.

This section introduces the object language and auxiliary notions used to formalize this materialization.

Language

The propositional connective symbols used to form syntactic formulas are standard. The axioms of set theory give a means to prove equality " $=$ " of sets and that symbol may, by abuse of notation, be used for classes. A set in which the equality predicate is decidable is also called discrete. Negation " $\neg$ " of equality is sometimes called the denial of equality, and is commonly written " $\neq$ ". However, in a context with apartness relations, for example when dealing with sequences, the latter symbol is also sometimes used for something different.

The common treatment, as also adopted here, formally only extends the underlying logic by one primitive binary predicate of set theory, " $\in$ ". As with equality, negation of elementhood " $\in$ " is often written " $\notin$ ".

Variables

Below the Greek $\phi$ denotes a proposition or predicate variable in axiom schemas and $P$ or $Q$ is used for particular such predicates. The word "predicate" is sometimes used interchangeably with "formulas" as well, even in the unary case.

Quantifiers only ever range over sets and those are denoted by lower case letters. As is common, one may use argument brackets to express predicates, for the sake of highlighting particular free variables in their syntactic expression, as in " $Q(z)$ ". Unique existence $\exists !x.Q(x)$ here means $\exists x.\forall y.{\big (}y=x\leftrightarrow Q(y){\big )}$ .
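
The definition of unique existence given above packs existence and uniqueness into a single biconditional. As a hedged check, the following Lean 4 sketch (core library only, no classical axioms, with the domain of sets modeled as a type `X`, an assumption of this illustration) shows it to be intuitionistically equivalent to the more familiar formulation "some $x$ satisfies $Q$ , and every $y$ satisfying $Q$ equals $x$ ".

```lean
-- Lean 4, core library only; no classical axioms are used.

example {X : Type} (Q : X → Prop) :
    (∃ x, ∀ y, (y = x ↔ Q y)) ↔ (∃ x, Q x ∧ ∀ y, Q y → y = x) :=
  ⟨-- Instantiating y := x yields Q x; the reverse direction of ↔ yields uniqueness.
   fun ⟨x, h⟩ => ⟨x, (h x).mp rfl, fun y qy => (h y).mpr qy⟩,
   -- Conversely, y = x transports Q x to Q y, and uniqueness gives the other half.
   fun ⟨x, qx, uniq⟩ => ⟨x, fun y => ⟨fun e => e.symm ▸ qx, uniq y⟩⟩⟩
```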

Classes

As is also common, one makes use of set builder notation for classes, which, in most contexts, are not part of the object language but used for concise discussion. In particular, one may introduce notation declarations of the corresponding class via " $A=\{z\mid Q(z)\}$ ", for the purpose of expressing any $Q(a)$ as $a\in A$ . Logically equivalent predicates can be used to introduce the same class. One also writes $\{z\in B\mid Q(z)\}$ as shorthand for $\{z\mid z\in B\land Q(z)\}$ . For example, one may consider $\{z\in B\mid z\notin C\}$ and this is also denoted $B\setminus C$ .

One abbreviates $\forall z.{\big (}z\in A\to Q(z){\big )}$ by $\forall (z\in A).Q(z)$ and $\exists z.{\big (}z\in A\land Q(z){\big )}$ by $\exists (z\in A).Q(z)$ . The syntactic notion of bounded quantification in this sense can play a role in the formulation of axiom schemas, as seen in the discussion of axioms below. Express the subclass claim $\forall (z\in A).z\in B$ , i.e. $\forall z.(z\in A\to z\in B)$ , by $A\subset B$ . For a predicate $Q$ , trivially $\forall z.{\big (}(z\in B\land Q(z))\to z\in B{\big )}$ . And so it follows that $\{z\in B\mid Q(z)\}\subset B$ . The notion of subset-bounded quantifiers, as in $\forall (z\subset A).z\in B$ , has been used in set theoretical investigation as well, but will not be further highlighted here.

If there provenly exists a set inside a class, meaning $\exists z.(z\in A)$ , then one calls it inhabited. One may also use quantification in $A$ to express this as $\exists (z\in A).(z=z)$ . The class $A$ is then provenly not the empty set, introduced below. While classically equivalent, constructively non-empty is a weaker notion with two negations and ought to be called not uninhabited. Unfortunately, the word for the more useful notion of 'inhabited' is rarely used in classical mathematics.
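
The relation between the two notions can be sketched in Lean 4 (core library only, no classical axioms). In this illustration a class $A$ is modeled as a predicate `A : X → Prop` on a type `X` of sets, which is an assumption of the example. Being inhabited implies not being empty, and the latter is the same as the double negation of being inhabited; the converse of the first statement is not constructively derivable in general.

```lean
-- Lean 4, core library only; no classical axioms are used.

-- Inhabited implies not empty.
example {X : Type} (A : X → Prop) : (∃ z, A z) → ¬∀ z, ¬A z :=
  fun ⟨z, hz⟩ => fun hEmpty => hEmpty z hz

-- "Not empty" unfolds to "not uninhabited", a statement with two negations.
example {X : Type} (A : X → Prop) : (¬∀ z, ¬A z) ↔ ¬¬∃ z, A z :=
  ⟨fun h ne => h (fun z hz => ne ⟨z, hz⟩),
   fun h hEmpty => h (fun ⟨z, hz⟩ => hEmpty z hz)⟩
```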

Two ways to express that classes are disjoint do capture many of the intuitionistically valid negation rules: ${\big (}\forall (x\in A).x\notin B{\big )}\leftrightarrow \neg \exists (x\in A).x\in B$ . Using the above notation, this is a purely logical equivalence and in this article the proposition will furthermore be expressible as $A\cap B=\{\}$ .
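
That this equivalence is purely logical and needs no classical reasoning can be confirmed with a short Lean 4 sketch (core library only). Classes are modeled here as predicates on a type `X`, an assumption made for the illustration.

```lean
-- Lean 4, core library only; no classical axioms are used.

example {X : Type} (A B : X → Prop) :
    (∀ x, A x → ¬B x) ↔ ¬∃ x, A x ∧ B x :=
  ⟨fun h ⟨x, ha, hb⟩ => h x ha hb,
   fun h x ha hb => h ⟨x, ha, hb⟩⟩
```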

A subclass $A\subset B$ is called detachable from $B$ if the relativized membership predicate is decidable, i.e. if $\forall (x\in B).x\in A\lor x\notin A$ holds. It is also called decidable if the superclass is clear from the context - often this is the set of natural numbers.
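
One consequence of detachability, sketched below in Lean 4 under the assumption that classes are modeled as predicates on a type `X` (core library only, no classical axioms), is that $B$ splits into $A$ and the relative complement $B\setminus A$ . Without the decidability hypothesis `dec`, only the right-to-left direction is available.

```lean
-- Lean 4, core library only; no classical axioms are used.

example {X : Type} (A B : X → Prop)
    (sub : ∀ x, A x → B x)            -- A is a subclass of B
    (dec : ∀ x, B x → A x ∨ ¬A x) :   -- A is detachable from B
    ∀ x, B x ↔ (A x ∨ (B x ∧ ¬A x)) :=
  fun x =>
    ⟨fun hb => (dec x hb).elim Or.inl (fun na => Or.inr ⟨hb, na⟩),
     fun h => h.elim (sub x) (fun p => p.1)⟩
```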

Extensional equivalence

Denote by $A\simeq B$ the statement expressing that two classes have exactly the same elements, i.e. $\forall z.(z\in A\leftrightarrow z\in B)$ , or equivalently $(A\subset B)\land (B\subset A)$ . This is not to be conflated with the concept of equinumerosity also used below.
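
The stated equivalence of the two formulations holds intuitionistically, as the following Lean 4 sketch indicates (core library only, no classical axioms; classes are modeled as predicates on a type `X`, an assumption of the example).

```lean
-- Lean 4, core library only; no classical axioms are used.

example {X : Type} (A B : X → Prop) :
    (∀ z, A z ↔ B z) ↔ ((∀ z, A z → B z) ∧ (∀ z, B z → A z)) :=
  ⟨fun h => ⟨fun z => (h z).mp, fun z => (h z).mpr⟩,
   fun ⟨ab, ba⟩ z => ⟨ab z, ba z⟩⟩
```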

With $A$ standing for $\{z\mid Q(z)\}$ , and using the convenient notational relation between $x\in A$ and $Q(x)$ , axioms of the form $\exists a.\forall z.{\big (}z\in a\leftrightarrow Q(z){\big )}$ postulate that the class of all sets for which $Q$ holds actually forms a set. Less formally, this may be expressed as $\exists a.a\simeq A$ . Likewise, the proposition $\forall a.(a\simeq A)\to P(a)$ conveys " $P(A)$ when $A$ is among the theory's sets." For the case where $P$ is the trivially false predicate, the proposition is equivalent to the negation of the former existence claim, expressing the non-existence of $A$ as a set.
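
Both remarks can be illustrated in Lean 4 (core library only, no classical axioms). The sketch assumes a type `S` of sets together with an abstract binary relation `mem` standing in for " $\in$ "; these names are hypothetical and not part of any set theory library. The first example is the equivalence for the trivially false $P$ . The second instantiates $Q(z)$ with $z\notin z$ , a well-known case in which the non-existence claim is provable without any set theoretical axioms.

```lean
-- Lean 4, core library only; no classical axioms are used.
-- `S` and `mem` are illustrative stand-ins for the sets and for ∈.

-- With P trivially false, "∀a.(a ≃ A) → P(a)" is the negation of "∃a. a ≃ A".
example {S : Type} (mem : S → S → Prop) (Q : S → Prop) :
    (∀ a, (∀ z, mem z a ↔ Q z) → False) ↔ ¬∃ a, ∀ z, mem z a ↔ Q z :=
  ⟨fun h ⟨a, ha⟩ => h a ha, fun h a ha => h ⟨a, ha⟩⟩

-- The class {z | z ∉ z} is not a set, by a constructive argument.
example {S : Type} (mem : S → S → Prop) : ¬∃ a, ∀ z, mem z a ↔ ¬mem z z :=
  fun ⟨a, ha⟩ =>
    have hn : ¬mem a a := fun h => (ha a).mp h h
    hn ((ha a).mpr hn)
```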

Further extensions of class comprehension notation as above are in common use in set theory, giving meaning to statements such as " $\{f(z)\mid Q(z)\}\simeq \{\langle x,y,z\rangle \mid T(x,y,z)\}$ ", and so on.

Syntactically more general, a set $w$ may also be characterized using another 2-ary predicate $R$ through $\forall x.x\in w\leftrightarrow R(x,w)$ , where the right hand side may depend on the actual variable $w$ , and possibly even on membership in $w$ itself.

Subtheories of ZF

Here a series of familiar axioms is presented, or the relevant slight reformulations thereof. It is emphasized how the absence of ${\mathrm {PEM} }$ in the logic affects what is provable and it is highlighted which non-classical axioms are, in turn, consistent.

Equality

Using the notation introduced above, the following axiom gives a means to prove equality " $=$ " of two sets, so that through substitution, any predicate about $x$ translates to one of $y$ . By the logical properties of equality, the converse direction of the postulated implication holds automatically.

Extensionality

$\forall x.\forall y.\ \ x\simeq y\to x=y$

In a constructive interpretation, the elements of a subclass $A=\{z\in B\mid Q(z)\lor \neg Q(z)\}$ of $B$ may come equipped with more information than those of $B$ , in the sense that being able to judge $b\in A$ is being able to judge $Q(b)\lor \neg Q(b)$ . And (unless the whole disjunction follows from axioms) in the Brouwer–Heyting–Kolmogorov interpretation, this means to have proven $Q(b)$ or having rejected it. As $\{z\in B\mid Q(z)\}$ may not be detachable from $B$ , i.e. as $Q$ may be not decidable for all elements in $B$ , the two classes $A$ and $B$ must a priori be distinguished.

Consider a predicate $Q$ that provenly holds for all elements of a set $y$ , so that $y\simeq \{z\in y\mid Q(z)\}$ , and assume that the class on the right hand side is established to be a set. Note that, even if this set on the right informally also ties to proof-relevant information about the validity of $Q$ for all the elements, the Extensionality axiom postulates that, in our set theory, the set on the right hand side is judged equal to the one on the left hand side.

This above analysis also shows that a statement of the form $\forall (x\in w).Q(x)$ , which in informal class notation may be expressed as $w\subset \{x\mid Q(x)\}$ , is then equivalently expressed as $\{x\in w\mid Q(x)\}=w$ . This means that establishing such $\forall$ -theorems (e.g. the ones provable from full mathematical induction) enables substituting the subclass of $w$ on the left hand side of the equality for just $w$ , in any formula.
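
The logical core of this observation is the extensional equivalence $\{x\in w\mid Q(x)\}\simeq w$ , from which the Extensionality axiom then yields equality. A hedged Lean 4 sketch of that equivalence follows (core library only, no classical axioms). Membership in $w$ is modeled by a predicate `w : X → Prop`, which is an assumption of the illustration.

```lean
-- Lean 4, core library only; no classical axioms are used.

-- ∀(x ∈ w).Q(x) holds exactly when {x ∈ w | Q(x)} and w have the same elements.
example {X : Type} (w Q : X → Prop) :
    (∀ x, w x → Q x) ↔ (∀ x, (w x ∧ Q x) ↔ w x) :=
  ⟨fun h x => ⟨fun p => p.1, fun hw => ⟨hw, h x hw⟩⟩,
   fun h x hw => ((h x).mpr hw).2⟩
```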

Note that adopting " $=$ " as a symbol in a predicate logic theory makes equality of two terms a quantifier-free expression.

Alternative approaches

While often adopted, this axiom has been criticized in constructive thought, as it effectively collapses differently defined properties, or at least the sets viewed as the extension of these properties, a Fregian notion.

Modern type theories may instead aim at defining the demanded equivalence " $\simeq$ " in terms of functions, see e.g. type equivalence. The related concept of function extensionality is often not adopted in type theory.

Other frameworks for constructive mathematics might instead demand that a particular rule for equality or apartness come with the elements $z\in x$ of each and every set $x$ discussed. But also in an approach to sets emphasizing apartness may the above definition in terms of subsets be used to characterize a notion of equality " $\simeq$ " of those subsets. Relatedly, a loose notion of complementation of two subsets $u\subset x$ and $v\subset x$ is given when any two members $s\in u$ and $t\in v$ are provably apart from each other. The collection of complementing pairs $\langle u,v\rangle$ is algebraically well behaved.

Merging sets

Define class notation for the pairing of a few given elements via disjunctions. E.g. $z\in \{a,b\}$ is the quantifier-free statement $(z=a)\lor (z=b)$ , and likewise $z\in \{a,b,c\}$ says $(z=a)\lor (z=b)\lor (z=c)$ , and so on.

Two other basic existence postulates given some other sets are as follows. Firstly,

Pairing

$\forall x.\forall y.\ \ \exists p.\{x,y\}\subset p$

Given the definitions above, $\{x,y\}\subset p$ expands to $\forall z.(z=x\lor z=y)\to z\in p$ , so this is making use of equality and a disjunction. The axiom says that for any two sets $x$ and $y$ , there is at least one set $p$ , which holds at least those two sets.

With bounded Separation below, also the class $\{x,y\}$ exists as a set. Denote by $\langle x,y\rangle$ the standard ordered pair model $\{\{x\},\{x,y\}\}$ , so that e.g. $q=\langle x,y\rangle$ denotes another bounded formula in the formal language of the theory.
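As an illustration of the pair and ordered pair notation, here is a minimal sketch that models finite sets by Python's `frozenset`. The function names `pair` and `ordered_pair` are assumptions made for this example only.

```python
def pair(x, y):
    """The unordered pair {x, y}; it collapses to {x} when x == y."""
    return frozenset({x, y})

def ordered_pair(x, y):
    """The ordered pair <x, y>, modeled as {{x}, {x, y}}."""
    return frozenset({pair(x, x), pair(x, y)})

zero = frozenset()        # the empty set
one = frozenset({zero})   # the set {0}

unordered_same = pair(zero, one) == pair(one, zero)
ordered_same = ordered_pair(zero, one) == ordered_pair(one, zero)
diagonal = ordered_pair(zero, zero)   # equals {{0}}
```

In this model the unordered pair ignores the order of its arguments, while the ordered pair does not.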

And then, using existential quantification and a conjunction,

Union

$\forall x.\ \ \exists u.\forall z.{\Big (}{\big (}\exists (y\in x).z\in y{\big )}\to z\in u{\Big )}$

saying that for any set $x$ , there is at least one set $u$ , which holds all the members $z$ , of $x$ 's members $y$ . The minimal such set is the union.

The two axioms are commonly formulated stronger, in terms of " $\leftrightarrow$ " instead of just " $\to$ ", although this is technically redundant in the context of ${\mathsf {BCST}}$ : As the Separation axiom below is formulated with " $\leftrightarrow$ ", for statements $\exists t.\forall z.\phi (z)\to z\in t$ the equivalence can be derived, given the theory allows for separation using $\phi$ . In cases where $\phi$ is an existential statement, like here in the union axiom, there is also another formulation using a universal quantifier.

Also using bounded Separation, the two axioms just stated together imply the existence of a binary union of two classes $a$ and $b$ , when they have been established to be sets, denoted by $\bigcup \{a,b\}$ or $a\cup b$ . For a fixed set $z$ , to validate membership $z\in a\cup b$ in the union of two given sets $y=a$ and $y=b$ , one needs to validate the $z\in y$ part of the axiom, which can be done by validating the disjunction of the predicates defining the sets $a$ and $b$ , for $z$ . In terms of the associated sets, it is done by validating the disjunction $z\in a\lor z\in b$ .
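The construction of the binary union from Pairing and Union can be sketched on finite sets as follows. This assumes `frozenset` as a model of sets; `big_union` and `binary_union` are hypothetical names introduced for the illustration.

```python
def big_union(x):
    """The union of x: all members z of the members y of x."""
    return frozenset(z for y in x for z in y)

def binary_union(a, b):
    """a ∪ b, obtained as the union of the pair {a, b}."""
    return big_union(frozenset({a, b}))

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

a = frozenset({zero, one})
b = frozenset({one, two})
u = binary_union(a, b)

# Membership in the union is validated by the disjunction of memberships.
agrees = all((z in u) == (z in a or z in b) for z in (zero, one, two))
```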

The union and other set forming notations are also used for classes. For instance, the proposition $z\in A\land z\notin C$ is written $z\in A\setminus C$ . Let now $B\subset A$ . Given $z\in A$ , the decidability of membership in $B$ , i.e. the potentially independent statement $z\in B\lor z\notin B$ , can also be expressed as $z\in B\cup (A\setminus B)$ . But, as for any excluded middle statement, the double-negation of the latter holds: That union isn't not inhabited by $z$ . This goes to show that partitioning is also a more involved notion, constructively.

Set existence

The property that is false for any set corresponds to the empty class, which is denoted by $\{\}$ or zero, $0$ . That the empty class is a set readily follows from other existence axioms, such as the Axiom of Infinity below. But if, e.g., one is explicitly interested in excluding infinite sets in one's study, one may at this point adopt the

Axiom of empty set:

$\exists x.\forall y.\,\neg (y\in x)$

Introduction of the symbol $\{\}$ (as abbreviating notation for expressions involving characterizing properties) is justified as uniqueness for this set can be proven. As $y\in \{\}$ is false for any $y$ , the axiom then reads $\exists x.x\simeq \{\}$ .

Write $1$ for $S0$ , which equals $\{\{\}\}$ , i.e. $\{0\}$ . Likewise, write $2$ for $S1$ , which equals $\{\{\},\{\{\}\}\}$ , i.e. $\{0,1\}$ . A simple and provenly false proposition then is, for example, $\{\}\in \{\}$ , corresponding to $0<0$ in the standard arithmetic model. Again, here symbols such as $\{\}$ are treated as convenient notation and any proposition really translates to an expression using only " $\in$ " and logical symbols, including quantifiers. Accompanied by a metamathematical analysis that the capabilities of the new theories are equivalent in an effective manner, formal extensions by symbols such as $0$ may also be considered.

More generally, for a set $x$ , define the successor set $Sx$ as $x\cup \{x\}$ . The interplay of the successor operation with the membership relation has a recursive clause, in the sense that $(y\in Sx)\leftrightarrow (y\in x\lor y=x)$ . By reflexivity of equality, $x\in Sx$ , and in particular $Sx$ is always inhabited.
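The successor operation and its recursive membership clause can be tried out on the first few numerals. The sketch below again assumes `frozenset` as a finite model of sets, and the name `successor` is chosen for this example.

```python
def successor(x):
    """The successor set Sx, defined as x ∪ {x}."""
    return x | frozenset({x})

zero = frozenset()
one = successor(zero)
two = successor(one)

candidates = (zero, one, two)

# Recursive clause: y ∈ Sx holds exactly when y ∈ x or y = x.
clause_holds = all(
    (y in successor(x)) == (y in x or y == x)
    for x in candidates
    for y in candidates
)

# By reflexivity of equality, x ∈ Sx, so Sx is always inhabited.
always_inhabited = all(x in successor(x) for x in candidates)
```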

BCST

The following makes use of axiom schemas, i.e. axioms for some collection of predicates. Some of the stated axiom schemas shall allow for any collection of set parameters as well (meaning any particular named variables $v_{0},v_{1},\dots ,v_{n}$ ). That is, instantiations of the schema are permitted in which the predicate (some particular $\phi$ ) also depends on a number of further set variables and the statement of the axiom is understood with corresponding extra outer universal closures (as in $\forall v_{0}.\forall v_{1}.\cdots \forall v_{n}.$ ).

Separation

Basic constructive set theory ${\mathsf {BCST}}$ consists of several axioms also part of standard set theory, except the so called "full" Separation axiom is weakened. Beyond the four axioms above, it postulates Predicative Separation as well as the Replacement schema.

Axiom schema of predicative separation: For any bounded predicate $\phi$ , with parameters and with set variable $y$ not free in it,

$\forall y.\,\exists s.\forall x.{\big (}x\in s\,\leftrightarrow \,(x\in y\land \phi (x)){\big )}$

This axiom amounts to postulating the existence of a set $s$ obtained by the intersection of any set $y$ and any predicatively described class $\{x\mid \phi (x)\}$ . For any $z$ proven to be a set, when the predicate is taken as $\phi (x):=x\in z$ , one obtains the binary intersection of sets and writes $s=y\cap z$ . Intersection corresponds to conjunction in a way analogous to how union corresponds to disjunction.

When the predicate is taken as the negation $\phi (x):=x\notin z$ , one obtains the difference principle, granting existence of any set $y\setminus z$ . Note that sets like $y\setminus y$ or $\{x\in y\mid \neg (x=x)\}$ are always empty. So, as noted, from Separation and the existence of at least one set (e.g. Infinity below) will follow the existence of the empty set $\{\}$ (also denoted $0$ ). Within this conservative context of ${\mathsf {BCST}}$ , the Predicative Separation schema is actually equivalent to Empty Set plus the existence of the binary intersection for any two sets. The latter variant of axiomatization does not make use of a formula schema.
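On finite sets, Separation behaves like filtering. The sketch below assumes `frozenset` as the model of sets; `separate`, `intersection` and `difference` are hypothetical names, and the predicates passed in are ordinary Python functions and hence decidable.

```python
def separate(y, phi):
    """The set {x in y | phi(x)}, obtained by filtering y."""
    return frozenset(x for x in y if phi(x))

def intersection(y, z):
    """y ∩ z, using the predicate phi(x) := x ∈ z."""
    return separate(y, lambda x: x in z)

def difference(y, z):
    """y \\ z, using the negated predicate phi(x) := x ∉ z."""
    return separate(y, lambda x: x not in z)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

y = frozenset({zero, one, two})
z = frozenset({one, two})

common = intersection(y, z)
rest = difference(y, z)
empty_by_difference = difference(y, y)
empty_by_predicate = separate(y, lambda x: x != x)
```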

Predicative Separation is a schema that takes into account syntactic aspects of set defining predicates, up to provable equivalence. The permitted formulas are denoted by $\Delta _{0}$ , the lowest level in the set theoretical Lévy hierarchy.^[13] General predicates in set theory are never syntactically restricted in such a way and so, in practice, generic subclasses of sets are still part of the mathematical language. As the scope of subclasses that are provably sets is sensitive to what sets already exist, this scope is expanded when further set existence postulates are added.

A class with at most one element is called a subsingleton. For a proposition $P$ , a recurring trope in the constructive analysis of set theory is to view the predicate $x=0\land P$ as the subsingleton $B:=\{x\in 1\mid P\}$ , which is a subclass of the second ordinal $1:=S0=\{0\}$ . If it is provable that $P$ holds, or $\neg P$ , or $\neg \neg P$ , then $B$ is inhabited, or empty (uninhabited), or non-empty (not uninhabited), respectively. Clearly, $P$ is equivalent to both the proposition $0\in B$ , and also $B=1$ . Likewise, $\neg P$ is equivalent to $B=0$ and, equivalently, also $\neg (0\in B)$ . So, here, $B$ being detachable from $1$ exactly means $P\lor \neg P$ . In the model of the naturals, if $B$ is a number, $0\in B$ also expresses that $0$ is smaller than $B$ . The union that is part of the successor operation definition above may be used to express the excluded middle statement as $0\in SB$ . In words, $P$ is decidable if and only if the successor of $B$ is larger than the smallest ordinal $0$ . The proposition $P$ is decided either way through establishing how $0$ is smaller: By $0$ already being smaller than $B$ , or by $0$ being $SB$ 's direct predecessor. Yet another way to express excluded middle for $P$ is as the existence of a least number member of the inhabited class $b:=B\cup \{1\}$ .
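For a decided proposition, the correspondences listed above can be verified mechanically. The following sketch models $P$ by a Python boolean, which is an assumption that makes $P$ decidable from the outset, so it only illustrates the two classical cases and cannot capture an undecided $P$. The names `truth_value` and `successor` are hypothetical.

```python
ZERO = frozenset()
ONE = frozenset({ZERO})

def truth_value(p):
    """The subsingleton B = {x in 1 | P}, for a decided proposition p."""
    return frozenset(x for x in ONE if p)

def successor(x):
    """The successor set Sx = x ∪ {x}."""
    return x | frozenset({x})

checks = []
for p in (True, False):
    B = truth_value(p)
    checks.append((
        (ZERO in B) == p,        # P is equivalent to 0 ∈ B
        (B == ONE) == p,         # P is equivalent to B = 1
        (B == ZERO) == (not p),  # ¬P is equivalent to B = 0
        ZERO in successor(B),    # P ∨ ¬P expressed as 0 ∈ SB
    ))
```

In both cases $0\in SB$ holds, either because $0$ is already a member of $B$ or because $B$ itself is $0$.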

If one's separation axiom allows for separation with $P$ , then $B$ is a subset, which may be called the truth value associated with $P$ . Two truth values can be proven equal, as sets, by proving an equivalence. In terms of this terminology, the collection of truth values can a priori be understood to be rich. Unsurprisingly, decidable propositions have one of a binary set of truth values. The excluded middle disjunction for that $P$ is then also implied by the global statement $\forall b.(0\in b)\lor (0\notin b)$ .

No universal set

When using the informal class terminology, any set is also considered a class. At the same time, there do arise so called proper classes that can have no extension as a set. When in a theory there is a proof of $\neg \exists x.A\subset x$ , then $A$ must be proper. (When taking up the perspective of ${\mathsf {ZF}}$ on sets, a theory which has full Separation, proper classes are generally thought of as those that are "too big" to be a set. More technically, they are subclasses of the cumulative hierarchy that extend beyond any ordinal bound.)

By a remark in the section on merging sets, a set cannot consistently be ruled out to be a member of a class of the form $A\cup \{x\mid x\notin A\}$ . A constructive proof that it is in that class contains information. Now if $A$ is a set, then the class $\{x\mid x\notin A\}$ is provably proper. The following demonstrates this in the special case when $A$ is empty, i.e. when the right side is the universal class. Being a negative result, it reads as in the classical theory.

The following holds for any relation $E$ . It gives a purely logical condition such that two terms $s$ and $y$ cannot be $E$ -related to one another.

${\big (}\forall x.xEs\leftrightarrow (xEy\land \neg xEx){\big )}\to \neg (yEs\lor sEs\lor sEy)$

Most important here is the rejection of the final disjunct, $\neg sEy$ . The expression $\neg (x\in x)$ does not involve unbounded quantification and is thus allowed in Separation. Russell's construction in turn shows that $\{x\in y\mid x\notin x\}\notin y$ . So for any set $y$ , Predicative Separation alone implies that there exists a set which is not a member of $y$ . In particular, no universal set can exist in this theory.
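The purely logical condition on the relation $E$ can be checked in a proof assistant without any appeal to excluded middle. The following is a sketch in Lean 4, intended to rely only on the core library. The theorem name `no_relation` and the carrier type `α` are assumptions made for this illustration.

```lean
theorem no_relation {α : Type} (E : α → α → Prop) (s y : α)
    (h : ∀ x, E x s ↔ (E x y ∧ ¬ E x x)) :
    ¬ (E y s ∨ E s s ∨ E s y) :=
  -- First, s is not E-related to itself: s E s would give ¬ (s E s).
  have hss : ¬ E s s := fun hs => ((h s).mp hs).2 hs
  fun hd => hd.elim
    -- y E s would give both y E y and ¬ (y E y).
    (fun hys => ((h y).mp hys).2 ((h y).mp hys).1)
    (fun hd' => hd'.elim
      (fun hs => hss hs)
      -- s E y together with ¬ (s E s) would give s E s.
      (fun hsy => hss ((h s).mpr ⟨hsy, hss⟩)))
```

Each case derives a contradiction directly from the hypothesis, which reflects that the statement is a negative result and reads the same in intuitionistic and classical logic.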

In a theory further adopting the axiom of regularity, like ${\mathsf {ZF}}$ , provenly $x\in x$ is false for any set $x$ . There, this then means that the subset $\{x\in y\mid x\notin x\}$ is equal to $y$ itself, and that the class $\{x\mid x\in x\}$ is the empty set.
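Sets built from Python's `frozenset` are necessarily well-founded, since an immutable set cannot be constructed so as to contain itself. Under that modeling assumption, the following sketch illustrates both observations: the Russell subset of $y$ coincides with $y$ , and it is not a member of $y$ . The name `russell_subset` is hypothetical.

```python
def russell_subset(y):
    """The subset {x in y | x not in x} obtained by bounded Separation."""
    return frozenset(x for x in y if x not in x)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

y = frozenset({zero, one, two})
r = russell_subset(y)

equals_y = (r == y)        # no member of y is a member of itself
outside_y = (r not in y)   # the Russell subset is not a member of y
```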

For any $E$ and $y$ , the special case $s=y$ in the formula above gives

$\neg {\big (}\forall x.xEy\leftrightarrow \neg xEx{\big )}$

This already implies that no set $y$ equals the subclass $\{x\mid x\notin x\}$ of the universal class, i.e. that subclass is a proper one as well. But even in ${\mathsf {ZF}}$ without Regularity it is consistent for there to be a proper class of singletons which each contain exactly themselves.

As an aside, in a theory with stratification like Intuitionistic New Foundations, the syntactic expression $x\in x$ may be disallowed in Separation. In turn, the above proof of negation of the existence of a universal set cannot be performed, in that theory.

Predicativity

The axiom schema of Predicative Separation is also called $\Delta _{0}$ -Separation or Bounded Separation, as in Separation for set-bounded quantifiers only. (Warning note: The Lévy hierarchy nomenclature is in analogy to $\Delta _{0}^{0}$ in the arithmetical hierarchy, albeit comparison can be subtle: The arithmetic classification is sometimes expressed not syntactically but in terms of subclasses of the naturals. Also, the bottom level of the arithmetical hierarchy has several common definitions, some not allowing the use of some total functions. A similar distinction is not relevant on the level $\Sigma _{1}^{0}$ or higher. Finally note that a $\Delta _{0}$ classification of a formula may be expressed up to equivalence in the theory.)

The schema is also the way in which Mac Lane weakens a system close to Zermelo set theory ${\mathsf {Z}}$ , for mathematical foundations related to topos theory. It is also used in the study of absoluteness, and there part of the formulation of Kripke-Platek set theory.

The restriction in the axiom is also gatekeeping impredicative definitions: Existence should at best not be claimed for objects that are not explicitly describable, or whose definition involves themselves or reference to a proper class, such as when a property to be checked involves a universal quantifier. So in a constructive theory without Axiom of power set, when $R$ denotes some 2-ary predicate, one should not generally expect a subclass $s$ of $y$ to be a set, in case that it is defined, for example, as in

$\{x\in y\mid \forall t.{\big (}(t\subset y)\to R(x,t){\big )}\}$ ,

or via similar definitions involving any quantification over the sets $t\subset y$ . Note that if this subclass $s$ of $y$ is provenly a set, then this subset itself is also in the unbounded scope of set variable $t$ . In other words, as the subclass property $s\subset y$ is fulfilled, this exact set $s$ , defined using the expression $R(x,s)$ , would play a role in its own characterization.

While predicative Separation leads to fewer given class definitions being sets, it may be emphasized that many class definitions that are classically equivalent are not so when restricting oneself to the weaker logic. Due to the potential undecidability of general predicates, the notion of subset and subclass is automatically more elaborate in constructive set theories than in classical ones. So in this way one has obtained a broader theory. This remains true if full Separation is adopted, such as in the theory ${\mathsf {IZF}}$ , which however spoils the existence property as well as the standard type theoretical interpretations, and in this way spoils a bottom-up view of constructive sets. As an aside, as subtyping is not a necessary feature of constructive type theory, constructive set theory can be said to differ considerably from that framework.

Replacement

Next consider the

Axiom schema of Replacement: For any predicate $\phi$ with set variable $r$ not free in it,

$\forall d.\ \ \forall (x\in d).\exists !y.\phi (x,y)\to \exists r.\forall y.{\big (}y\in r\leftrightarrow \exists (x\in d).\phi (x,y){\big )}$

It is granting existence, as sets, of the range of function-like predicates, obtained via their domains. In the above formulation, the predicate is not restricted akin to the Separation schema, but this axiom already involves an existential quantifier in the antecedent. Of course, weaker schemas could be considered as well.

Via Replacement, the existence of any pair $\{x,y\}$ also follows from that of any other particular pair, such as $\{0,1\}=2=SS0$ . But as the binary union used in $S$ already made use of the Pairing axiom, this approach then necessitates postulating the existence of $2$ over that of $0$ . In a theory with the impredicative Powerset axiom, the existence of $2\subset {\mathcal {P}}{\mathcal {P}}0$ can also be demonstrated using Separation.

With the Replacement schema, the theory outlined thus far proves that the equivalence classes or indexed sums are sets. In particular, the Cartesian product, holding all pairs of elements of two sets, is a set. In turn, for any fixed number (in the metatheory), the corresponding product expression, say $x\times x\times x\times x$ , can be constructed as a set. The axiomatic requirements for sets recursively defined in the language are discussed further below. A set $x$ is discrete, i.e. equality of elements inside a set $x$ is decidable, if the corresponding relation as a subset of $x\times x$ is decidable.
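For finite sets, the way Replacement and Union together yield the Cartesian product can be sketched as follows. Here `frozenset` models sets, Python functions stand in for function-like predicates, and the names `replacement`, `big_union`, `ordered_pair` and `cartesian_product` are assumptions made for this example.

```python
def replacement(d, f):
    """The image {f(x) | x in d} of the set d under the rule f."""
    return frozenset(f(x) for x in d)

def big_union(x):
    """The union of x: all members of the members of x."""
    return frozenset(z for y in x for z in y)

def ordered_pair(x, y):
    """The ordered pair <x, y>, modeled as {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def cartesian_product(a, b):
    """a × b as the union over x in a of the sets {<x, y> | y in b}."""
    rows = replacement(
        a, lambda x: replacement(b, lambda y: ordered_pair(x, y))
    )
    return big_union(rows)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

product = cartesian_product(two, three)
```

Replacement is applied twice, once per factor, and the outer union flattens the resulting family of rows into a single set of pairs.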

Replacement is relevant for function comprehension and can be seen as a form of comprehension more generally. Only when assuming ${\mathrm {PEM} }$ does Replacement already imply full Separation. In ${\mathsf {ZF}}$ , Replacement is mostly important to prove the existence of sets of high rank, namely via instances of the axiom schema where $\phi (x,y)$ relates relatively small sets $x$ to bigger ones, $y$ .

Constructive set theories commonly have the Axiom schema of Replacement, sometimes restricted to bounded formulas. However, when other axioms are dropped, this schema is actually often strengthened - not beyond ${\mathsf {ZF}}$ , but instead merely to gain back some provability strength. Such stronger axioms exist that do not spoil the strong existence properties of a theory, as discussed further below.

If $i_{X}$ is provenly a function on $X$ and it is equipped with a codomain $Y$ (all discussed in detail below), then the image of $i_{X}$ is a subset of $Y$ . In other approaches to the set concept, the notion of subsets is defined in terms of "operations", in this fashion.

Hereditarily finite sets

Counterparts of the elements of the class of hereditarily finite sets $H_{\aleph _{0}}$ can be implemented in any common programming language. The axioms discussed above abstract from common operations on the set data type: Pairing and Union are related to nesting and flattening, or, taken together, concatenation. Replacement is related to comprehension and Separation is then related to the often simpler filtering. Replacement together with Set Induction (introduced below) suffices to axiomatize $H_{\aleph _{0}}$ constructively and that theory is also studied without Infinity.

A sort of blend between pairing and union, an axiom more readily related to the successor is the Axiom of adjunction.^[14]^[15] Such principles are relevant for the standard modeling of individual von Neumann ordinals. Axiom formulations also exist that pair Union and Replacement in one. While postulating Replacement is not a necessity in the design of a weak constructive set theory that is bi-interpretable with Heyting arithmetic ${\mathsf {HA}}$ , some form of induction is. For comparison, consider the very weak classical theory called General set theory that interprets the class of natural numbers and their arithmetic via just Extensionality, Adjunction and full Separation.
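To illustrate how adjunction relates to pairing and to the successor, here is a small sketch on hereditarily finite sets, with `frozenset` as the assumed model. It takes adjunction to be the operation sending $x$ and $y$ to $x\cup \{y\}$ , and the names `adjoin`, `pair` and `successor` are hypothetical.

```python
EMPTY = frozenset()

def adjoin(x, y):
    """Adjunction: the set x together with the one extra element y."""
    return x | frozenset({y})

def pair(a, b):
    """The pair {a, b}, built by adjoining twice to the empty set."""
    return adjoin(adjoin(EMPTY, a), b)

def successor(x):
    """The successor Sx, obtained by adjoining x to itself."""
    return adjoin(x, x)

one = successor(EMPTY)
two = successor(one)
three = successor(two)
```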

The discussion now proceeds with axioms granting existence of objects which, in different but related form, are also found in dependent type theories, namely products and the collection of natural numbers as a completed set. Infinite sets are particularly handy to reason about operations applied to sequences defined on unbounded index domains, say the formal differentiation of a generating function or the addition of two Cauchy sequences.

ECST

For some fixed predicate $I$ and a set $a$ , the statement $I(a)\land {\big (}\forall y.I(y)\to a\subset y{\big )}$ expresses that $a$ is the smallest (in the sense of " $\subset$ ") among all sets $y$ for which $I(y)$ holds true, and that it is always a subset of such $y$ . The aim of the axiom of infinity is to eventually obtain a unique smallest inductive set.

In the context of common set theory axioms, one statement of infinitude is to state that a class is inhabited and also includes a chain of membership (or alternatively a chain of supersets). That is,

${\big (}\exists z.z\in A{\big )}\land \forall (x\in A).\exists (s\in A).x\in s$ .

More concretely, denote by $\mathrm {Ind} _{A}$ the inductive property,

$(0\in A)\land \forall (x\in A).Sx\in A$ .

In terms of a predicate $Q$ underlying the class so that $\forall x.(x\in A)\leftrightarrow Q(x)$ , the latter translates to $Q(0)\land \forall x.{\big (}Q(x)\to Q(Sx){\big )}$ .

Write $\bigcap B$ for the general intersection $\{x\mid \forall (y\in B).x\in y\}$ . (A variant of this definition may be considered which requires $\cap B\subset \cup B$ , but we only use this notion for the following auxiliary definition.)

One commonly defines a class $\omega =\bigcap \{y\mid \mathrm {Ind} _{y}\}$ , the intersection of all inductive sets. (Variants of this treatment may work in terms of a formula that depends on a set parameter $w$ so that $\omega \subset w$ .) The class $\omega$ exactly holds all $x$ fulfilling the unbounded property $\forall y.\mathrm {Ind} _{y}\to x\in y$

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]