# Measure and Measurable sets in $\mathbb R^d$
- In its abstract form, measure theory does not need Euclidean geometry, but the intuition gained from concrete geometry is very useful, which is why measure theory and Lebesgue integration are usually introduced this way. The measure of a set is connected to its volume (or its length in one dimension).
- Once one understands the nature of the real line, perhaps by looking at peculiar sets such as the Cantor set, one realizes that the notion of *length* cannot be used to assign a *measure* to arbitrary sets, because these sets can be too complicated. The notion of cardinality is also useless: two subsets of $\mathbb R$ with the same cardinality need not have the same measure. For instance, $A=[0,1]$ and $B=[0,2]$ are in one-to-one correspondence through the mapping $x\mapsto 2x$, but their lengths are different. So a set can be disassembled into an *uncountably* infinite number of parts and reassembled to form a set of twice its length.
- It is not just the infinite number of parts that causes the issue: look up the *Banach-Tarski paradox* to see that this can happen even when one decomposes a set into a finite number of parts. The breakdown of intuition is due to the pathological nature of these parts, which are constructed using the [[Axiom of choice]]. Such pathological sets almost never come up in practice, and hence measure theory ignores them.
- Therefore, along with the development of *measure*, as a by-product, we will discover that we simply cannot assign a measure to some sets. But to our great relief, we will see that we can measure a very large and general class of sets, including almost everything you may encounter in any application. We call the sets in this collection *measurable*.
## Necessary properties of a measure
- The goal is to assign a sensible notion of volume to all kinds of sets in any space, but we’ll start with $\mathbb R^d$. For finite spaces, it is very easy to use the cardinality as the notion of volume, but for sets with infinitely many elements the issue is a bit more nuanced. Therefore, we first identify the key properties that any reasonable notion of volume must satisfy. Some of the properties will turn out to be redundant, but we’ll remove them later. To start with, if we call $\mu: 2^{\mathbb R^d}\to \mathbb R_+$ a *measure*, we will require the following from it:
1. For intervals (cubes for $d>1$), it should agree with the notion of length (physical volume for $d>1$).
2. Positivity $\mu \geq 0$ and $\mu(\emptyset)=0$.
3. It must be *additive*: $\mu(A\cup B)=\mu(A)+\mu(B)$ for all $A,B$ such that $A\cap B=\emptyset$. We will see that it is not sufficient for this to hold only for finite unions; we will need it to hold for countable unions of pairwise disjoint sets as well.
4. *Monotonicity*: If $C\subseteq D$ then $\mu(C)\leq \mu(D)$. Monotonicity follows immediately from *additivity* and *positivity*: let $A=D\setminus C$ and $B=C$, so that $\mu(D)=\mu(D\setminus C)+\mu(C)\geq \mu(C)$.
Now let’s try to build up the notion of measure using familiar, simple sets like intervals (cubes).
## Measuring the Elementary Sets
A *rectangle* is a subset of $\mathbb R^d$ formed as the Cartesian product of $d$ closed and bounded intervals in $\mathbb R$. A rectangle is a *cube* if all the intervals have the same length.[^1]
> [!question]- Define *Elementary sets* of $\mathbb R^d$ in the context of measurability.
> A subset of $\mathbb R^d$ is elementary if it can be expressed as a finite union of cubes, where a cube is a product of closed intervals of equal length.
If a set can be expressed as a finite union of cubes, it can also be expressed as a finite union of almost disjoint rectangles (rectangles whose interiors are disjoint), so one can measure it by summing the volumes of these rectangles.
> [!question] Are the following sets elementary? Any open set in $\mathbb R$? Any open ball in $\mathbb R^d$?
> No. In general, an open set in $\mathbb R$ is a countable union of disjoint open intervals. An open ball in $\mathbb R^2$ cannot even be expressed as a countable union of disjoint open cubes. [^2]
Given a general set $A$, which may not be elementary, we may still be able to find elementary sets $A^-$ and $A^+$ that approximate it from inside and outside, i.e., $A^-\subseteq A \subseteq A^+$, and use a number between the volumes of these elementary sets as the measure of $A$. But there are many such pairs, so we instead use $\sup_{A^-\subseteq A}\mu(A^-)$ and $\inf_{A^+\supseteq A} \mu(A^+)$, which are called the *Jordan inner* and *Jordan outer* measures of $A$, respectively. For many sets, an open ball in $\mathbb R^2$ for example, the inner and outer Jordan measures agree; such sets are called *Jordan measurable*. Intuitively, many regularly shaped sets are Jordan measurable. But many sets are not: sets with a lot of holes (where all the volume is close to the boundary), unbounded sets with many pieces, sets with recursive structure, and so on.
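
The inner and outer approximations above can be tried numerically. The following is a minimal sketch, assuming we restrict the elementary sets to unions of grid squares of side $1/n$ (a special case of the definition, chosen here only for illustration); the function name `jordan_bounds_unit_disk` is hypothetical. It brackets the area of the open unit disk in $\mathbb R^2$ between the total area of the squares contained in the disk and the total area of the squares that meet it.

```python
import math


def jordan_bounds_unit_disk(n: int) -> tuple[float, float]:
    """Inner and outer area estimates for the open unit disk.

    Uses closed grid squares [i/n, (i+1)/n] x [j/n, (j+1)/n].
    All distances are scaled by n so the comparisons are exact integers.
    """
    inner = 0  # squares entirely contained in the disk
    outer = 0  # squares that intersect the disk
    for i in range(-n, n):
        for j in range(-n, n):
            # Squared distance from the origin to the farthest corner.
            far_sq = max(i * i, (i + 1) ** 2) + max(j * j, (j + 1) ** 2)
            # Squared distance from the origin to the nearest point of the square.
            near_x = 0 if i <= 0 <= i + 1 else min(abs(i), abs(i + 1))
            near_y = 0 if j <= 0 <= j + 1 else min(abs(j), abs(j + 1))
            near_sq = near_x * near_x + near_y * near_y
            if far_sq < n * n:
                inner += 1
            if near_sq < n * n:
                outer += 1
    return inner / n**2, outer / n**2


if __name__ == "__main__":
    for n in (10, 50, 100):
        lo, hi = jordan_bounds_unit_disk(n)
        print(f"n={n}: {lo:.4f} <= pi <= {hi:.4f}")
```

The two bounds squeeze towards $\pi$ as $n$ grows, which is what Jordan measurability of the disk predicts. Running the same idea on the rationals in $[0,1]$ would give an inner bound of 0 and an outer bound of 1 for every $n$.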
> [!question]- Give representative examples of sets that are not Jordan Measurable.
See [[Jordan Measurability]] for further discussion.
**TODO: Start by solving the exercises in the section on Jordan measurability. Understand which useful sets are not Jordan measurable, e.g., the set of rationals in an interval, some compact sets, countable unions or intersections of Jordan measurable sets.**
## Measuring using countable coverings
Intuitively, many sets fail to be Jordan measurable because we only allow finite unions. What if we define a notion of a limiting set? Suppose $(A_n)$ is an increasing sequence of sets, i.e., $A_n\subseteq A_{n+1}$; then $\lim_{n\to \infty} A_n \coloneqq \bigcup_n A_n = \{a \,|\, \exists n : a \in A_n \}$. But what is the measure of $\lim_n A_n$? You may have guessed it already! We will again use $\inf$ and $\sup$, but this time we will allow countable unions of cubes in the covering. Even the sets that are not Jordan measurable can be handled this way.
This is what is defined to be the *outer* or *exterior* measure of a general set. It is non-negative, rotation and translation invariant, and countably sub-additive. It is even additive for most sets, though not for all. The sets on which the exterior measure fails to be additive are highly pathological and unlikely to be encountered in any practical use case. One can use Vitali’s construction, which depends on the [[Axiom of choice]], to construct such non-measurable sets. Since such sets are unlikely to appear in practice, one can build a fairly useful theory without considering them. The sets on which the exterior measure is additive are called *(Lebesgue) measurable sets*, or simply *measurable sets* of $\mathbb R^d$.
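
For reference, the definition described above can be written out as follows. This is the standard formulation (as in SS); the symbols $m_*$ for the exterior measure and $|Q_j|$ for the volume of a cube are notation assumed here, not taken from the rest of this note.

$$
m_*(E) \;=\; \inf\left\{\, \sum_{j=1}^{\infty} |Q_j| \;:\; E\subseteq \bigcup_{j=1}^{\infty} Q_j,\ \ Q_j \text{ closed cubes} \right\},
\qquad
m_*\Big(\bigcup_{j=1}^{\infty} E_j\Big) \;\le\; \sum_{j=1}^{\infty} m_*(E_j).
$$

A worked example of what the countable coverings buy us: enumerate the rationals in $[0,1]$ as $q_1, q_2, \dots$ and cover $q_j$ by a closed interval of length $\varepsilon/2^j$. Then

$$
m_*\big(\mathbb Q\cap[0,1]\big) \;\le\; \sum_{j=1}^{\infty} \frac{\varepsilon}{2^j} \;=\; \varepsilon \quad \text{for every } \varepsilon>0, \qquad\text{so}\qquad m_*\big(\mathbb Q\cap[0,1]\big)=0,
$$

whereas any *finite* cover of $\mathbb Q\cap[0,1]$ by closed intervals must cover all of $[0,1]$, so the Jordan outer measure of the same set is 1.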
**TODO: Give examples from $\mathbb R^d$ to show the following results:**
$\mu$ is a set function and let:
- p1 be *positivity*
- p2 be *finite additivity*
- p3 be *countable additivity*
- p4 be *continuity*
- c1 be *algebra*
- c2 be *sigma algebra*
1. Is the collection of elementary sets an algebra?
2. Is the set of measurable sets a sigma algebra?
3. Give an example to show that p1, p2, c1 neither implies p3, c1 nor p4, c1
4. Give an example to show that p1, p3, c2 implies p4, c2 as well as p4, c1 (thm 1.2.7 in Ash)
5. Give an example to show that p1, p2, p3, c1 implies p2, c1 as well as p3, c2 (thm 1.2.8 in Ash)
#### Abstract axiomatic measure theory
At this point, a question arises: how do we know that we can approximate any set arbitrarily closely using almost disjoint cubes? First we will show that any open set (in the standard topology), as well as any closed set, can be approximated using countably many almost disjoint cubes. The collection of open and closed sets in the standard topology is fairly big; however, there are still a lot of sets that are neither. For general sets we define something called the *outer measure* or *exterior measure*.
We will then show that the set of
# Essential measure theory for probability
When working with probability, it is sufficient to start with an axiomatic definition of a $\sigma$-algebra and to call its elements *measurable sets*. By doing so, one frees oneself from the details of constructing the collection of all measurable sets from primitive sets of the space. Probability theory begins with the assumption that one starts with a measurable space $(E,\mathcal E)$, where $\mathcal E$ is a $\sigma$-algebra on $E$.
> [!question]- Define $\sigma$-algebra?
> (Def:: sigma algebra) A non-empty collection of subsets closed under complements and countable unions. In other words, given a set $E$, a non-empty collection $\mathcal E$ of subsets of $E$ is called a $\sigma$-algebra if
> 1. $A\in \mathcal E \implies E\setminus A\in \mathcal E$ (closed under complements)
> 2. $A_i\in \mathcal E$ for all $i\in \mathbb N \implies \bigcup_{i\in \mathbb N} A_i \in \mathcal E$ (closed under countable unions)
^0e49f8
<!--ID: 1704398237078-->
> [!question]- Show that $\sigma$-algebra is closed under countable intersections.
> $\bigcap_{i} A_i = \left((\bigcap_i A_i)^c\right)^c$
> $=\left(\bigcup_i A_i^c\right)^c$ (by De Morgan’s law)
> Because each $A_i$ is in the $\sigma$-algebra, so is each $A_i^c$, and so is their countable union, and finally its complement.
> <!--ID: 1704398123755-->
> [!question]- Does closure under finite intersection imply closure under countable intersections?
> No! Consider the collection of all open sets of $\mathbb R$. We know from the definition of the standard topology that this collection is closed under finite intersections but not under countable intersections: for example, $\bigcap_{n}(-1/n,1/n)=\{0\}$ is not open. So, in general, the requirement of closure under countable intersections is stronger.
> <!--ID: 1704398237082-->
> [!question]- Show that the intersection of an arbitrary family of sigma algebras is again a sigma algebra.
> Just apply the definition.
> Let $(\sigma_i)$ denote a family of sigma algebras on a space $E$. We just need to show that $\sigma=\bigcap_i \sigma_i$ is a sigma algebra.
> $A\in \sigma\implies A\in\sigma_i$ for all $i$ $\implies A^c\in \sigma_i$ for all $i$ $\implies A^c\in \sigma$. The same argument applies to countable unions.
> [!question]- Define sigma algebra generated by a collection of sets.
> Given a collection of subsets $\mathcal C$ of $E$, let $\Gamma$ denote the set of all $\sigma$-algebras on $E$ that contain $\mathcal C$; $\Gamma$ is non-empty because the power set of $E$ belongs to it. Then from the previous question we know that $\sigma \mathcal C=\bigcap_{\lambda\in \Gamma} \lambda$ is a $\sigma$-algebra. Moreover, it contains $\mathcal C$. $\sigma \mathcal C$ is called the $\sigma$-algebra generated by $\mathcal C$. ^293e06
> <!--ID: 1711333389424-->
> [!question]- Show that $\sigma \mathcal C$ is the smallest $\sigma$-algebra containing $\mathcal C$.
> Suppose $\mathcal G$ is a $\sigma$-algebra containing $\mathcal C$ with $\mathcal G\subseteq \sigma \mathcal C$. By the definition of $\sigma \mathcal C$ (it is the intersection of all $\sigma$-algebras containing $\mathcal C$, one of which is $\mathcal G$), $\sigma \mathcal C \subseteq \mathcal G$. Hence they are one and the same.
At this point, a probability textbook like *Probability and Stochastics* by Erhan Cinlar goes into p-systems and d-systems, but analysis books typically do not mention them. I wondered why and found [this](https://mathoverflow.net/questions/32288/why-pi-systems-and-dynkin-lambda-systems-on-the-relative-merits-of-approaches-i) page. It seems that p-system and d-system are names given to collections of sets that obey certain properties, and the $\pi$-$\lambda$ theorem makes many proofs easier. Analysis texts might instead prove such things bottom up.
Why do we care about the *smallest* sigma algebra containing a particular collection of subsets? This may be because the observer can only *observe* the outcome on that collection of sets. We go up to a sigma algebra because that is necessary for the machinery of probability to be useful, but we don’t want to go any more granular. Consider an experiment of two coin tosses in which an observer can only see the result of the second one.
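
The coin-toss example can be made concrete on a finite sample space, where closure under countable unions reduces to closure under finite unions. The following is a minimal sketch; the function name `generate_sigma_algebra` and the encoding of outcomes as strings such as `"HT"` (first toss, then second toss) are assumptions made for illustration.

```python
from itertools import combinations


def generate_sigma_algebra(space, collection):
    """Smallest family of subsets of a FINITE `space` that contains
    `collection` and is closed under complements and unions."""
    space = frozenset(space)
    family = {frozenset(), space} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complements.
        for a in current:
            complement = space - a
            if complement not in family:
                family.add(complement)
                changed = True
        # Close under pairwise (hence finite) unions.
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family


# Two coin tosses; the observer only sees the second toss.
omega = {"HH", "HT", "TH", "TT"}
second_is_heads = {"HH", "TH"}
observable = generate_sigma_algebra(omega, [second_is_heads])

if __name__ == "__main__":
    for event in sorted(observable, key=lambda s: (len(s), sorted(s))):
        print(sorted(event))
```

The generated family has only four events: the empty set, "second toss is heads", "second toss is tails", and the whole space. An event such as `{"HH"}` is not in it, which matches the idea that the observer cannot distinguish outcomes that differ only in the first toss.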
> [!question]- What is Borel sigma algebra?
> If $E$ is a topological space (with topology $\tau$), then the sigma algebra generated by the collection of all open subsets of $E$ is called the (def:: Borel sigma algebra) on $E$ and is denoted by $\mathcal B_{\tau}(E)$, or by $\mathcal B_E$ if the topology is clear from context.
> [!question] Limit of an increasing sequence of sets
> By $A_n \uparrow A$, we denote an increasing sequence of sets converging to $A$, meaning that for $i\in\mathbb N$ the sets $A_i$ satisfy $A_i \subseteq A_{i+1}$ and $A\triangleq\bigcup_i A_i$. See [these notes on convergence of sequences of sets for more details](onenote:https://d.docs.live.net/725c2b64870f4322/Documents/Stats%20605%20Probability%20theory/MIT%206.436J.one#Prerequisites&section-id=%7B25E4F1A0-5501-D648-BADE-4C9E80A8C37B%7D&page-id=%7B39648B71-A9C8-B941-A79C-78F37ACBDA0E%7D&end) ([Web view](https://onedrive.live.com/view.aspx?resid=725C2B64870F4322%211365&id=documents&wd=target%28MIT%206.436J.one%7C25E4F1A0-5501-D648-BADE-4C9E80A8C37B%2FPrerequisites%7C39648B71-A9C8-B941-A79C-78F37ACBDA0E%2F%29)) ^4a7be7
> [!question]- What is p-system or $\pi$-system?
> A collection of subsets is called a (def:: pi-system) if it is *non-empty* and is closed under finite intersections.
Example of a p-system:
1. Let $E=\{0,1\}^K$ and let $\pi_i: E \to \{0,1\}$ denote the canonical projection maps, i.e., for $e=(e_1, \dots, e_K) \in E$, $\pi_i(e) = e_i$. For $f\in E$ define $C_f = \{e\in E\,|\, \pi_i(e)=1 ~\text{for all}~ i ~\text{with}~ \pi_i(f)=1 \}$. Then the collection $\{C_f \,|\, f\in E\}$ forms a p-system, since $C_f\cap C_g=C_{f\vee g}$, where $f\vee g$ is the coordinate-wise maximum. In essence, for a particular $f\in E$, the set $C_f\subseteq E$ contains all elements $e\in E$ that agree with $f$ on the indices where $f$ takes the value 1. The sets of the collection overlap, and apart from $C_{(0,\dots,0)}=E$ the largest ones are those that correspond to an $f$ with exactly one index equal to 1.
2. A simpler example is $\{\emptyset, \{1\}, \{1,2\}, \{1,3\}\}$ for $E=\{1, 2, 3\}$.
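
Both examples above can be checked mechanically. The following is a small sketch, assuming $K=3$ for the first example; the helper names `is_pi_system` and `C` are hypothetical and chosen to mirror the notation $C_f$.

```python
from itertools import combinations, product


def is_pi_system(collection):
    """True if the collection is non-empty and closed under pairwise
    (hence finite) intersections."""
    sets = {frozenset(s) for s in collection}
    if not sets:
        return False
    return all((a & b) in sets for a, b in combinations(sets, 2))


# Example 2: a collection of subsets of E = {1, 2, 3}.
example2 = [set(), {1}, {1, 2}, {1, 3}]

# Example 1 with K = 3: E = {0,1}^K, and C(f) is the set of all e that
# have a 1 at every index where f has a 1.
K = 3
E = list(product((0, 1), repeat=K))


def C(f):
    return frozenset(e for e in E if all(e[i] == 1 for i in range(K) if f[i] == 1))


example1 = [C(f) for f in E]

if __name__ == "__main__":
    print(is_pi_system(example2))            # closed under intersections
    print(is_pi_system(example1))            # closed under intersections
    print(is_pi_system([{1, 2}, {2, 3}]))    # {2} is missing
```

The last call shows a collection that is not a p-system: the intersection `{2}` of its two members is not in the collection.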
> [!definition]- Def: d-system or $\lambda$-system
> A collection of subsets $\mathcal D$ of $E$ is called a (def:: d-system) if
> 1. $E\in\mathcal D$
> 2. $A\supseteq B, ~A,B\in \mathcal D\implies A\setminus B\in \mathcal D$ (closed under subtraction of subsets)
> 3. $(A_i)\subseteq\mathcal D, ~~ A_i \uparrow A\implies A\in\mathcal D$ (closed under increasing unions)
> [!question]- pi-system and d-system $\Leftrightarrow$ sigma al:
> **Statement**: A collection of subsets is a $\sigma$-algebra iff it is both a pi-system and d-system.
> **Proof**: The necessity is obvious. Following is the proof for sufficiency (assume $\mathcal G$ is the collection):
> 1. Closure under complements is implied by it being a d-system (1.+2.): $A^c=E\setminus A$ with $E\supseteq A$.
> 2. Closure under *finite* unions is implied by p-system (closed under finite intersections) and closure under complements: $A\cup B=(A^c\cap B^c)^c$.
> 3. Let $(B_i)\subseteq \mathcal G$. Let $A_i=\bigcup_{j=1}^i B_j,~~i\in \mathbb N$. Then $(A_i)$ is an [[#^4a7be7\|increasing sequence of sets]] with each $A_i \in \mathcal G$ due to 2. Since $\mathcal G$ is a d-system, $B=\bigcup_i B_i =\bigcup_i A_i =A\in \mathcal G$, implying closure under countable unions. ^817190
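
To see that neither property alone is enough, it helps to have a d-system that is not a p-system. A common example is the collection of even-sized subsets of a four-element set. The sketch below checks this and also verifies the equivalence stated above by brute force over every family of subsets of a three-element set. All function names are hypothetical. On a finite space the increasing-union condition holds automatically (an increasing sequence in a finite family is eventually constant), so only conditions 1 and 2 of a d-system are tested.

```python
from itertools import combinations


def subsets(space):
    """All subsets of a finite set, as frozensets."""
    items = sorted(space)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_pi_system(family):
    return bool(family) and all((a & b) in family for a in family for b in family)


def is_d_system(space, family):
    # Condition 3 (increasing unions) is automatic on a finite space.
    return space in family and all(
        (a - b) in family for a in family for b in family if b <= a
    )


def is_sigma_algebra(space, family):
    return (
        space in family
        and all((space - a) in family for a in family)
        and all((a | b) in family for a in family for b in family)
    )


def theorem_holds_on(space):
    """Check 'sigma algebra iff pi-system and d-system' for every family."""
    all_subsets = subsets(space)
    for r in range(len(all_subsets) + 1):
        for chosen in combinations(all_subsets, r):
            family = set(chosen)
            both = is_pi_system(family) and is_d_system(space, family)
            if is_sigma_algebra(space, family) != both:
                return False
    return True


E = frozenset({1, 2, 3, 4})
even_sets = {s for s in subsets(E) if len(s) % 2 == 0}

if __name__ == "__main__":
    print(is_d_system(E, even_sets))      # a d-system ...
    print(is_pi_system(even_sets))        # ... but {1,2} & {2,3} = {2} is missing
    print(theorem_holds_on(frozenset({1, 2, 3})))
```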
**Lemma (projection of a d-system on a set is a d-system)**: Let $\mathcal D$ be a d-system on $E$. Fix any $D\in \mathcal D$ and let $\hat{\mathcal D} =\{A\in \mathcal D \,|\, A \cap D \in \mathcal D\}$. Then $\hat{\mathcal D}$ is a d-system.
1. $E\in\mathcal D$ and $E\cap D=D\in \mathcal D\implies E\in \hat{\mathcal D}$
2. Let $F\supseteq G$ in $\hat{\mathcal D}$. Let $F' = F\cap D$ and $G'=G\cap D$. Since $F'\supseteq G'$ and, by definition of $\hat{\mathcal D}$, $F',G'\in\mathcal D$, we have $F'\setminus G' \in \mathcal D$. Now, $F'\setminus G'=(F\cap D) \setminus G=(F\setminus G)\cap D$, so $(F\setminus G)\cap D\in\mathcal D$. Since $F,G \in{\mathcal D}$, we also have $F\setminus G \in \mathcal D$, which implies $F\setminus G \in \hat{\mathcal D}$.
3. Let $(A_i) \subseteq \hat{\mathcal D}$ with $A_{i+1}\supseteq A_i$, and let $A=\bigcup_i A_i$; then $A\in\mathcal D$ because $\mathcal D$ is a d-system. Since $D\cap A_i\in\mathcal D$ and $D\cap A_i$ is increasing in $i$, $D\cap A = \bigcup_i (D\cap A_i)\in\mathcal D$, which implies $A\in\hat{\mathcal D}$.
> [!question] Monotone class theorem
> **Statement**: If a d-system contains a pi-system, then it also contains the sigma algebra generated by that pi-system.
> **Proof:** Let $\mathcal D$ be the smallest d-system that contains the pi-system $\mathcal C$. By definition of $\sigma \mathcal C$, it is sufficient to show that $\mathcal D$ is a sigma algebra, which, by the [[#^817190\|pi-d system theorem]], means that it is sufficient to show that $\mathcal D$ is a pi-system! **Complete this proof!** ^0d2f95
Note that we can also prove monotone class theorem without using pi-d systems. The [[Plan for studying probability \| MIT course notes lecture 1]] does that.
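
A sketch of how the step left open in the proof above (showing that $\mathcal D$ is a pi-system) is usually completed with the lemma on $\hat{\mathcal D}$. This is the standard two-step argument rather than anything specific to this note, and the names $\mathcal D_B$ and $\mathcal D_A$ are introduced here only for illustration. It uses that the smallest d-system containing $\mathcal C$ exists, because an intersection of d-systems is again a d-system.

1. Fix $B\in\mathcal C$ and let $\mathcal D_B=\{A\in\mathcal D \,|\, A\cap B\in\mathcal D\}$. Since $B\in\mathcal C\subseteq \mathcal D$, the lemma says $\mathcal D_B$ is a d-system. Since $\mathcal C$ is a pi-system, $A\cap B\in\mathcal C\subseteq\mathcal D$ for every $A\in\mathcal C$, so $\mathcal C\subseteq \mathcal D_B$. By minimality of $\mathcal D$, $\mathcal D_B=\mathcal D$.
2. Fix $A\in\mathcal D$ and let $\mathcal D_A=\{B\in\mathcal D \,|\, A\cap B\in\mathcal D\}$. By the lemma, $\mathcal D_A$ is a d-system, and by step 1, $\mathcal C\subseteq\mathcal D_A$. By minimality again, $\mathcal D_A=\mathcal D$.

Step 2 says that $A\cap B\in\mathcal D$ for all $A,B\in\mathcal D$, so $\mathcal D$ is a pi-system and therefore a sigma algebra. For any d-system $\mathcal D'$ containing $\mathcal C$ this gives

$$
\mathcal C \;\subseteq\; \sigma\mathcal C \;\subseteq\; \mathcal D \;\subseteq\; \mathcal D'.
$$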
**TODO: Apply monotone class theorem (both versions) on a collection of sets in $\mathbb R^d$ to obtain the sigma algebra, ie. all measurable sets in $\mathbb R^d$.**
### Showing that a collection is a sigma algebra
Following are the ways:
1. Use the definition
2. Use the [[#^817190\|pi-d system theorem]].
3. Use [[#^0d2f95\|d-system containing pi-system]]
### Borel sigma algebra on real line
With the standard topology on $\mathbb R$, the Borel sigma algebra is the one that is generated by all open sets of $\mathbb R$. We also know that any open set of $\mathbb R$ can be expressed as a countable union of open intervals (see [[Real Analysis by Stein and Shakarchi\|SS]] page 6 for proof).
**Lemma (generated sigma algebra is a subset of sigma algebra containing the collection)**: Let $\mathcal C$ be a collection of subsets of $E$ and $\mathcal G$ a sigma algebra on $E$ such that $\mathcal C\subseteq \mathcal G$. Then $\sigma \mathcal C\subseteq \mathcal G$.
The proof follows directly from the definition of the [[#^293e06\|generated sigma algebra]], which is the smallest sigma algebra that contains $\mathcal C$.
**Prop (collection of intervals generate the borel sigma algebra on real line)**: Let
1. $\mathcal C_1 = \{(a,b)|a,b\in \mathbb R, a<b\}$
2. $\mathcal C_2 = \{[a,b]|a,b\in\mathbb R, a\leq b\}$
3. $\mathcal C_3=\{(-\infty, a)|a\in\mathbb R\}$
4. $\mathcal C_4 =\{(a,\infty)|a\in\mathbb R\}$
Then $\mathcal B(\mathbb R)=\sigma \mathcal C_i,~~i\in\{1,2,3,4\}$.
Let $\mathcal C$ be the collection of all open sets of $\mathbb R$, which also means that by definition $\mathcal B(\mathbb R)=\sigma \mathcal C$. Then we know that any element of $\mathcal C$ can be expressed as a countable union of open intervals. Hence, $\mathcal C\subseteq \sigma \mathcal C_1$, which implies $\sigma \mathcal C\subseteq \sigma\mathcal C_1$. But since $\mathcal C_1\subseteq \mathcal C$, $\sigma\mathcal C_1\subseteq \sigma \mathcal C$. Hence $\mathcal B(\mathbb R)=\sigma \mathcal C=\sigma \mathcal C_1$.
Now $(-\infty,a)=\bigcup_{n\in\mathbb N}(a-n, a)$. Hence $\mathcal C_3\subseteq \sigma\mathcal C_1$. Moreover, $(a,b) =\bigcup_n[(-\infty,a+1/n)^c\cap(-\infty,b)]$. Hence $\mathcal C_1\subseteq \sigma\mathcal C_3$. This implies $\sigma \mathcal C_1=\sigma \mathcal C_3$.
Similar arguments can be used to prove the rest.
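
For completeness, here is one possible set of identities for the two remaining cases. They are a sketch in the same style as the argument for $\mathcal C_3$, not taken from a particular reference. In the second identity, $N$ is any integer with $2/N\le b-a$, so that the closed intervals are non-empty.

$$
[a,b]=\bigcap_{n\in\mathbb N}\left(a-\tfrac1n,\; b+\tfrac1n\right)
\ \Rightarrow\ \mathcal C_2\subseteq\sigma\mathcal C_1,
\qquad
(a,b)=\bigcup_{n\ge N}\left[a+\tfrac1n,\; b-\tfrac1n\right]
\ \Rightarrow\ \mathcal C_1\subseteq\sigma\mathcal C_2 .
$$

$$
(a,\infty)=\bigcup_{n\in\mathbb N}(a,\,a+n)
\ \Rightarrow\ \mathcal C_4\subseteq\sigma\mathcal C_1,
\qquad
(a,b)=(a,\infty)\cap\Big(\bigcap_{n\in\mathbb N}\left(b-\tfrac1n,\infty\right)\Big)^{c}
\ \Rightarrow\ \mathcal C_1\subseteq\sigma\mathcal C_4 .
$$

In each case the lemma above turns the two inclusions into $\sigma\mathcal C_i=\sigma\mathcal C_1=\mathcal B(\mathbb R)$.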
### Product spaces
# References
1. [[Real Analysis by Stein and Shakarchi\|Real Analysis Measure Theory, Integration and Hilbert Spaces]]
2. *See [[Measure Integration and Real Analysis\|MIRA]], [[Real Analysis by Stein and Shakarchi\|SS]], [[Introduction to measure theory by Terrence Tao]] for a detailed discussion.*
[^1]: [[Real Analysis by Stein and Shakarchi \| Real Analysis by Stein and Shakarchi (pg 3)]]
[^2]: [[Real Analysis by Stein and Shakarchi\|SS (pg 7)]]