# Notes for my lecture on multiple recurrence theorem for weakly mixing systems – Part 2

Now we can start to prove the multiple recurrence theorem in the weak mixing case. Again the material is from Furstenberg’s book ‘Recurrence in Ergodic Theory and Combinatorial Number Theory’ which, very unfortunately, is hard to find a copy of.

Definition: A sequence $(x_n) \subseteq X$ converges in density to $x$ if there exists $Z \subseteq \mathbb{N}$ of density $0$ such that for every neighborhood $U$ of $x$ there is $N \in \mathbb{N}$ with $\{ x_n \ | \ n \geq N$ and $n \notin Z \} \subseteq U$.

In this case we write $(x_n) \rightarrow_D \ x$.
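To get a feel for the definition, here is a minimal sketch with a made-up sequence (the functions `x` and `exceptional_density` are my own illustration, not from the book): $x_n = 1$ on perfect squares and $x_n = 1/n$ elsewhere. It does not converge in the usual sense, but it converges in density to $0$ because the perfect squares form a density $0$ set $Z$.

```python
import math

def x(n):
    """Hypothetical sequence: 1 on perfect squares, 1/n elsewhere."""
    r = math.isqrt(n)
    return 1.0 if r * r == n else 1.0 / n

def exceptional_density(N):
    """Proportion of n in {1, ..., N} lying in Z = {perfect squares}."""
    return math.isqrt(N) / N  # isqrt(N) counts the perfect squares up to N

# The proportion of exceptional indices shrinks like 1/sqrt(N).
for N in (10**2, 10**4, 10**6):
    print(N, exceptional_density(N))
```

Outside $Z$ the terms are $1/n$, which tend to $0$, while along $Z$ the sequence stays at $1$.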

Theorem: For $(X, \mathcal{B}, \mu, T)$ weakly mixing,

then $\forall f_0, f_1, \cdots, f_k \in L^\infty(X)$, we have

$\int f_0(x) f_1(T^n(x)) f_2(T^{2n}(x)) \cdots f_k(T^{kn}(x)) \ d \mu$

$\rightarrow_D \int f_0 \ d \mu \int f_1 \ d \mu \cdots \int f_k \ d \mu$ as $n \rightarrow \infty$

In particular, for any $A \in \mathcal{B}$ with $\mu(A)>0$, taking $f_0 = f_1 = \cdots = f_k = \chi_A$ gives
$\mu(A \cap T^{-n}(A) \cap \cdots \cap T^{-kn}(A))$ $\rightarrow_D \mu(A)^{k+1} > 0$.
Hence we may pick $N \in \mathbb{N}$ for which
$\mu(A \cap T^{-N}(A) \cap \cdots \cap T^{-kN}(A)) > 0$.

This establishes the multiple recurrence theorem.

To prove the theorem we need the following:

Lemma 1: Let $(f_n)$ be a bounded sequence in a Hilbert space $\mathcal{H}$. If for each $m$ we have $\langle f_{n+m}, f_n \rangle \rightarrow_D a_m$ as $n \rightarrow \infty$, and $a_m \rightarrow_D 0$ as $m \rightarrow \infty$, then $(f_n)$ converges weakly in density to $\overline{0}$.

In order to prove Lemma 1, we need:

Lemma 2: Let $\{ R_q \ | \ q \in Q \}$ be a family of density $1$ subsets of $\mathbb{N}$, indexed by a density $1$ set $Q \subseteq \mathbb{N}$. Then for all $S \subseteq \mathbb{N}$ of positive upper density and all $k \geq 1$, there exist $\{ n_1, n_2, \cdots, n_k \} \subseteq S$, $n_1 < n_2 < \cdots < n_k$, such that whenever $i<j$, $n_j - n_i \in Q$ and $n_i \in R_{n_j-n_i}$.

Proof of Lemma 2: We argue by induction on $k$. For $k=1$ there is no pair $i<j$, so the statement is vacuous.

For the induction step, suppose that for some $k \geq 1$ there exist $S_k \subseteq S$ of positive upper density and integers $m_1< \cdots < m_k$ such that $(S_k+m_1) \cup (S_k+m_2) \cup \cdots \cup (S_k+m_k) \subseteq S$ and for all $j>i$, $m_j - m_i \in Q$ and $S_k+m_i \subseteq R_{m_j-m_i}$.

For $k+1$, we shall find $S_{k+1} \subseteq S_k$ with positive upper density and $m_{k+1}>m_k$ such that $S_{k+1}+m_{k+1} \subseteq S$ and for all $1 \leq i \leq k$, $m_{k+1} - m_i \in Q$ and $S_{k+1}+m_i \subseteq R_{m_{k+1}-m_i}$.

Let $S_k^* = \{n \ | \ (S_k+n) \cap S_k$ has positive upper density $\}$.

Claim: $S_k^*$ has positive upper density.

Since $\overline{D}(S_k) = \epsilon >0$, let $N = \lfloor 1/ \epsilon \rfloor + 1$.

Then there are at most $N-1$ translates of $S_k$ that pairwise intersect in sets of density $0$: along a sequence of intervals realizing the upper density, $M$ such translates cover a proportion $M \epsilon \leq 1$.

Let $M < N$ be the largest number of such translates, and let $p_1 < \cdots < p_M$ be numbers for which $(S_k+p_i) \cap (S_k+p_j)$ has density $0$ whenever $i \neq j$,
i.e. $(S_k+(p_j-p_i)) \cap S_k$ has density $0$.

Therefore, by maximality of $M$, for any $p>p_M$, $(S_k+p-p_i) \cap S_k$ has positive upper density for some $i$. Hence $p-p_i \in S_k^*$. So $S_k^*$ is syndetic with gaps bounded by $2 \cdot p_M$, and hence has positive upper density.

Pick $\displaystyle m_{k+1} \in S_k^* \cap \bigcap_{i=1}^k(Q+m_i)$ with $m_{k+1} > m_k$; this is possible because the intersection of a set of positive upper density with finitely many sets of density $1$ is infinite.

(Hence $m_{k+1}-m_i \in Q$ for each $1 \leq i \leq k$)

Let $\displaystyle S_{k+1} = (S_k - m_{k+1}) \cap S_k \cap \bigcap_{i=1}^k (R_{m_{k+1} - m_i}-m_i)$.

Since $(S_k - m_{k+1}) \cap S_k$ has positive upper density and $\bigcap_{i=1}^k (R_{m_{k+1} - m_i}-m_i)$ has density $1$, $S_{k+1}$ has positive upper density.

$S_{k+1}$ and $m_{k+1}$ satisfy the desired properties by construction. This completes the induction.

Proof of Lemma 1:

Suppose not. Then there exist $\epsilon > 0$ and $f \neq \overline{0}$ such that

$S = \{ n \ | \ \langle f_n, f \rangle > \epsilon \}$ has positive upper density (replacing $f$ by $-f$ if necessary).

Let $\delta = \frac{\epsilon^2}{2||f||^2}$. Since $a_m \rightarrow_D 0$, the set $Q = \{m \ | \ a_m < \delta/2 \}$ has density $1$.

For each $q \in Q$, since $\langle f_{n+q}, f_n \rangle \rightarrow_D a_q < \delta/2$, the set $R_q = \{ n \ | \ \langle f_{n+q}, f_n \rangle < \delta \}$ has density $1$.

Applying Lemma 2 to $Q, \{R_q \}, S$, we get:

For all $k \geq 1$ there exist $\{ n_1, n_2, \cdots, n_k \} \subseteq S$, $n_1 < n_2 < \cdots < n_k$, such that whenever $i<j$, $n_j - n_i \in Q$ and $n_i \in R_{n_j-n_i}$.

i) $n_i \in S \ \Leftrightarrow \ \langle f_{n_i}, f \rangle > \epsilon$

ii) $n_i \in R_{n_j-n_i} \ \Leftrightarrow \ \langle f_{n_i}, f_{n_j} \rangle < \delta$

Set $g_i = f_{n_i} - \epsilon \cdot \frac{f}{||f||^2}$. Hence, using i) and ii),

$\forall \ 1 \leq i < j \leq k$, $\langle g_i, g_j \rangle = \langle f_{n_i} - \epsilon \frac{f}{||f||^2}, f_{n_j} - \epsilon \frac{f}{||f||^2}\rangle$ $< \delta - 2\cdot \frac{\epsilon^2}{||f||^2} + \frac{\epsilon^2}{||f||^2} = \delta - \frac{\epsilon^2}{||f||^2}= -\delta$.

On the other hand, since $(f_n)$ is bounded in $\mathcal{H}$, the $g_i$ are also bounded, with a bound independent of $k$. Suppose $||g_i||^2< M$ for all $i$;
then we have
$\displaystyle 0 \leq || \sum_{i=1}^k g_i ||^2 = \sum_{i=1}^k ||g_i ||^2 + 2 \cdot \sum_{i < j} \langle g_i, g_j \rangle$ $\leq kM - k(k-1) \delta$

For large $k$, $kM - k(k-1) \delta<0$, a contradiction.
Hence $S$ must have density $0$.
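The counting at the end of this proof says that $k$ vectors with $||g_i||^2 \leq M$ and $\langle g_i, g_j \rangle \leq -\delta$ for $i \neq j$ force $0 \leq kM - k(k-1)\delta$, i.e. $k \leq 1 + M/\delta$. As a sanity check (my own illustration, with helper names `simplex_vectors`, `dot` and `max_count` made up for it), the vertices of a regular simplex centered at the origin attain this bound exactly:

```python
from itertools import combinations

def simplex_vectors(k):
    """k vectors in R^k: the standard basis vectors minus their mean.

    Each has squared norm 1 - 1/k and each pair has inner product -1/k.
    """
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)]
            for i in range(k)]

def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def max_count(M, delta):
    """Largest k allowed by the inequality 0 <= k*M - k*(k-1)*delta."""
    return 1.0 + M / delta

k = 5
g = simplex_vectors(k)
M = max(dot(v, v) for v in g)                           # largest squared norm
delta = -max(dot(u, v) for u, v in combinations(g, 2))  # pairwise products are <= -delta
print(k, M, delta, max_count(M, delta))
```

Here the vectors sum to zero, so the inequality is an equality and no sixth vector with the same bounds can be added.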

Proof of the theorem:
By corollary 2 of the theorem in part 1, since $T$ is weak mixing, $T^m$ is weak mixing for all $m \neq 0$.
We proceed by induction on $l$. For $l=1$, the statement is implied by our lemma 2 in part 1.

Suppose the theorem holds for $l \in \mathbb{N}$, let $f_0, f_1, \cdots, f_{l+1} \in L^\infty(X)$,

Let $C = \int f_{l+1} \ d \mu, \ f'_{l+1}(x) = f_{l+1}(x) - C$.

By induction hypothesis, $\int f_0(x) f_1(T^n(x)) f_2(T^{2n}(x)) \cdots f_l(T^{ln}(x)) \cdot C \ d \mu$

$\rightarrow_D \int f_0 \ d \mu \int f_1 \ d \mu \cdots \int f_l \ d \mu \cdot C$ as $n \rightarrow \infty$

Hence it suffices to show $\int f_0(x) f_1(T^n(x)) f_2(T^{2n}(x)) \cdots f_l(T^{ln}(x)) \cdot$
$f'_{l+1}(T^{(l+1)n}(x)) \ d \mu \rightarrow_D 0$

So from now on we may assume $\int f_{l+1} \ d\mu = 0$.

For all $n \in \mathbb{N}$, set $g_n (x)= f_1 \circ T^n(x) \cdot f_2 \circ T^{2n}(x) \cdots f_{l+1} \circ T^{(l+1)n}(x)$

For each $m \in \mathbb{N}$ and each $1 \leq i \leq l+1$, let $F^{(m)}_i (x)= f_i(x) \cdot f_i(T^{im}(x))$

$\langle g_{n+m}, g_n \rangle = \int (f_1(T^{n+m}(x)) \cdots f_{l+1}(T^{(l+1)(n+m)}(x)))$ $\cdot (f_1(T^n(x)) \cdots f_{l+1}(T^{(l+1)n}(x))) \ d\mu$
$= \int F^{(m)}_1(T^n(x)) \cdots F^{(m)}_{l+1}(T^{(l+1)n}(x)) \ d \mu$

Since $T^{(l+1)n}$ is measure preserving, we substitute $y = T^{(l+1)n}(x)$,

$= \int F^{(m)}_{l+1}(y) \cdot F^{(m)}_1(T^{-ln}(y)) \cdots F^{(m)}_l(T^{-n}(y)) \ d \mu$

Applying the induction hypothesis to the weakly mixing transformation $T^{-1}$, after re-enumerating the $F^{(m)}_i$, we get

$\langle g_{n+m}, g_n \rangle \rightarrow_D ( \int F^{(m)}_1 \ d\mu) \cdots (\int F^{(m)}_{l+1} \ d\mu)$ as $n \rightarrow \infty$.

$\int F^{(m)}_{l+1} \ d\mu = \int f_{l+1} \cdot f_{l+1} \circ T^{(l+1)m} \ d\mu$

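By Theorem 2 in part 1, applied to the weakly mixing transformation $T^{l+1}$ and using $\int f_{l+1} \ d\mu = 0$, we have $\int F^{(m)}_{l+1} \ d\mu \rightarrow_D 0$ as $m \rightarrow \infty$. Since the other factors $\int F^{(m)}_i \ d\mu$ are bounded by $||f_i||_\infty^2$, the whole product converges in density to $0$ as $m \rightarrow \infty$.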

We are now able to apply Lemma 1 to $(g_n)$, which gives $(g_n) \rightarrow_D \overline{0}$ under the weak topology,

i.e. for any $f_0$, we have $\int f_0(x) g_n(x) \ d \mu \rightarrow_D 0$.

This establishes the induction step.

Remark: This works for any commuting family of weakly mixing transformations, i.e. if $G$ is a commutative group of measure preserving transformations in which all non-identity elements are weakly mixing, and $T_1, \cdots, T_k$ are distinct non-identity elements of $G$, then $\int f_0(x) f_1(T_1^n(x)) f_2(T_2^n(x)) \cdots f_k(T_k^n(x)) \ d \mu$ $\rightarrow_D \int f_0 \ d \mu \int f_1 \ d \mu \cdots \int f_k \ d \mu$ as $n \rightarrow \infty$.

# Probability of leading N digits of 2^n

Okay, so there was this puzzle which came up in the ergodic seminar a while ago:

What’s the probability for the leading digit of $2^N$ being $k \in \{1,2, \cdots, 9 \}$ as $N \rightarrow \infty$?

It’s a cute classical question in ergodic theory, the answer is $\log_{10}(k+1) - \log_{10}(k)$.

Proof: (all logs are taken in base $10$)
Given a natural number $N$, let $\log(N) = k+\alpha$ where $k \in \mathbb{Z}, \ \alpha \in [0, 1)$, since $N = 10^{\log(N)} = 10^{k+\alpha} = 10^k \cdot 10^\alpha$, $1 \leq 10^\alpha < 10$, we see that the first digit of $N$ is the integer part of $10^\alpha$.

The first digit of $2^n$ is the integer part of $10^{ \log(2^n) \mod{1} } = 10^{ n \cdot \log(2) \mod{1}}$.

For $k \in \{1, 2, \cdots, 9 \}$, leading digit of $2^n$ is $k$ iff $k \leq 10^{ n \cdot \log(2) \mod{1}} < k+1$ iff $\log(k) \leq n \cdot \log(2) \mod{1} < \log(k+1)$.

Let $\alpha = \log(2)$, which is irrational, and let $\varphi: S^1 \rightarrow S^1$ be rotation by $\alpha$ ($S^1$ is considered as $\mathbb{R} / \mathbb{Z}$, $\varphi(x) = x+\alpha$). All orbits of $\varphi$ are uniformly distributed, i.e. for any interval $A \subseteq S^1$ and any $x \in S^1$, $\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N \chi_A(\varphi^n(x)) = m(A)$, where $m$ is Lebesgue measure.

In particular we have $\displaystyle \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N \chi_{[\log(k), \log(k+1))}(\varphi^n(0))$ $= m([\log(k), \log(k+1))) = \log(k+1) - \log(k)$

Therefore the limiting probability of first digit of $2^n$ being $k$ is $\log(k+1) - \log(k)$.
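A quick numerical sanity check of this answer, as a rough sketch (the function name `leading_digit_counts` and the cutoff `N = 2000` are my own choices): count the leading digits of $2^n$ for $n = 1, \cdots, N$ with exact integer arithmetic and compare the frequencies with $\log(k+1) - \log(k)$.

```python
import math
from collections import Counter

def leading_digit_counts(N):
    """Count the leading decimal digits of 2^n for n = 1, ..., N (exact integers)."""
    counts = Counter()
    power = 1
    for _ in range(N):
        power *= 2
        counts[int(str(power)[0])] += 1
    return counts

N = 2000
counts = leading_digit_counts(N)
for k in range(1, 10):
    # empirical frequency next to the predicted limit log10(k+1) - log10(k)
    print(k, counts[k] / N, math.log10(k + 1) - math.log10(k))
```

The frequencies are already close to the predicted values for moderate $N$, which is what one expects for a rotation by a badly approximable-looking angle like $\log(2)$, though the code itself proves nothing about the limit.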

To generalize, Pengfei asks: given some two-digit number $K$, what’s the probability of the first two digits being $K$?

The natural thing to do is take base $100$, however one soon figures out there is a problem, since we don’t really want to count “$0x$” as the first two digits when the number of digits is odd.

I found the following trick handy:

When the number of digits is odd, we may consider the orbit of $\log_{100}(10) = 1/2$ under the rotation by $\log_{100}(2)$. This gives the first digit in base $100$ of $2^n \cdot 10$, which has an even number of digits precisely when $2^n$ has an odd number of digits, and whose first two digits are the same as those of $2^n$. Since this orbit is also uniformly distributed, the probability that $2^n$ has an odd number of digits and its first two digits are $K \ (10 \leq K < 100)$ is $\log_{100}(K+1) - \log_{100}(K)$.

Applying the usual procedure to the orbit of $0$ in base $100$ shows that the probability that $2^n$ has an even number of digits and its first two digits are $K$ is also $\log_{100}(K+1) - \log_{100}(K)$.

Hence the actual probability of $2^n$ starting with $K$ is just the sum of the two that’s $2 \cdot (\log_{100}(K+1) - \log_{100}(K))$.
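Note that $2 \cdot (\log_{100}(K+1) - \log_{100}(K))$ is the same number as $\log_{10}(K+1) - \log_{10}(K)$, since $\log_{100}(x) = \log_{10}(x)/2$. Here is a small sketch checking both this identity and the empirical frequency; the function names and the choices `K = 10`, `N = 3000` are just for illustration.

```python
import math

def first_two_digits_frequency(K, N):
    """Fraction of n in 1..N for which 2^n has at least two digits and starts with K."""
    hits = 0
    power = 1
    for _ in range(N):
        power *= 2
        s = str(power)
        if len(s) >= 2 and int(s[:2]) == K:
            hits += 1
    return hits / N

def predicted(K):
    """The sum of the two base-100 contributions: 2 * (log_100(K+1) - log_100(K))."""
    return 2 * (math.log(K + 1, 100) - math.log(K, 100))

K, N = 10, 3000
print(first_two_digits_frequency(K, N), predicted(K))
print(predicted(K), math.log10(K + 1) - math.log10(K))
```

The first line compares experiment with prediction; the second compares the base $100$ expression with its base $10$ form.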

The same works for finding the distribution of the first $d$ digits: considering the number of digits mod $d$, we sum the probability $\log_{10^d}(K+1) - \log_{10^d}(K)$ a total of $d$ times, so for each $10^{d-1} \leq K < 10^d$ the limiting probability is $d \cdot (\log_{10^d}(K+1) - \log_{10^d}(K))$.

Remark: One can first calculate the probability of $2^n$ having an odd number of digits. This is the proportion of the orbit of $0$ under rotation by $\log_{100}(2)$ lying inside the interval $[0, \log_{100}(10))$, which is $[0, \frac{1}{2})$. The limiting probability is $1/2$ (which makes sense, since it says that about half of the time the number of digits is odd).

In general, the probability that the number of digits is congruent to $k \mod{d}$ is $1/d$ for each $k$.
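The claim about the number of digits modulo a fixed modulus is also easy to test numerically. The following is only a sketch (the function name and the cutoff $3000$ are my own choices): it tabulates the residues of the digit count of $2^n$ and compares them with the uniform distribution.

```python
def digit_count_residue_frequencies(modulus, N):
    """Frequencies of (number of decimal digits of 2^n) mod `modulus`, for n = 1..N."""
    counts = [0] * modulus
    power = 1
    for _ in range(N):
        power *= 2
        counts[len(str(power)) % modulus] += 1
    return [c / N for c in counts]

# Each residue class should appear with frequency close to 1/modulus.
print(digit_count_residue_frequencies(2, 3000))
print(digit_count_residue_frequencies(3, 3000))
```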

For some reason, professor Kra was interested in figuring out the distribution of the ‘middle’ digit…which I’m not exactly sure how one would define it.

# Notes for my lecture on multiple recurrence theorem for weakly mixing systems – Part 1

So… it’s finally my turn to lecture in the ergodic theory seminar! (Background: our goal is to go through the ergodic proof of Szemerédi’s theorem as in Furstenberg’s book.) My part is to begin the discussion of weak mixing and prove the multiple recurrence theorem in the weak mixing case. The weak mixing assumption will later be removed (hence the theorem is in fact true for any ergodic system), which then proves Szemerédi’s theorem via the correspondence principle discussed in the last lecture.

Given two measure preserving systems $(X_1, \mathcal{B}_1, \mu_1, T_1)$ and $(X_2, \mathcal{B}_2, \mu_2, T_2)$, we denote the product system by $(X_1 \times X_2, \mathcal{B}_1 \times \mathcal{B}_2, \mu_1 \times \mu_2, T_1 \times T_2)$ where $\mathcal{B}_1 \times \mathcal{B}_2$ is the smallest $\sigma$-algebra on $X_1 \times X_2$ including all products of measurable sets.

Definition: A m.p.s. $(X, \mathcal{B}, \mu, T)$ is weakly mixing if $(X \times X, \mathcal{B} \times \mathcal{B}, \mu \times \mu, T \times T)$ is ergodic.

Note that weak mixing $\Rightarrow$ ergodic,

since for a non-ergodic system we may take any invariant set of intermediate measure (strictly between $0$ and $1$) $\times$ the whole space to produce an invariant set of intermediate measure for the product system.

For any $A, B \in \mathcal{B}$, let $N(A, B) = \{ n \in \mathbb{N} \ | \ \mu(A \cap T^{-n}(B)) > 0 \}$.

Ergodic $\Leftrightarrow$ for all $A,B$ with positive measure, $N(A,B) \neq \phi$

Weakly mixing $\Rightarrow$ for all $A,B,C,D$ with positive measure, $N(A \times C, B \times D) \neq \phi$.

Since $n \in N(A \times C, B \times D)$

$\Leftrightarrow \ \mu \times \mu((A \times C) \cap (T \times T)^{-n}(B \times D)) > 0$

$\Leftrightarrow \ \mu \times \mu((A \cap T^{-n}(B)) \times (C \cap T^{-n}(D))) > 0$

$\Leftrightarrow \ \mu(A \cap T^{-n}(B)) > 0$ and $\mu(C \cap T^{-n}(D)) > 0$

$\Leftrightarrow \ n \in N(A,B)$ and $n \in N(C,D)$

Hence $T$ is weakly mixing $\Rightarrow$ for all $A,B,C,D$ with positive measure, $N(A,B) \cap N(C,D) \neq \phi$. We’ll see later that this is in fact $\Leftrightarrow$ but let’s say $\Rightarrow$ for now.
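To see that this condition really is stronger than ergodicity, here is a sketch with an irrational rotation of the circle, which is ergodic but not weakly mixing. Everything specific in it (the angle `ALPHA`, the arcs of length $0.1$, the range of $n$) is my own choice for illustration. With $A = B = C = [0, 0.1)$ and $D = [0.5, 0.6)$, both $N(A,B)$ and $N(C,D)$ are non-empty but they never meet.

```python
import math

ALPHA = math.sqrt(2) - 1  # assumed irrational rotation angle on R/Z

def circle_distance(x, y):
    """Distance between x and y on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def in_N(a, b, length, n):
    """True if A = [a, a+length) and T^{-n}(B), B = [b, b+length), overlap in positive measure.

    T is rotation by ALPHA, so T^{-n}(B) is the arc starting at b - n*ALPHA.
    Two arcs of equal length < 1/2 overlap in positive measure exactly when
    their starting points are closer than `length` on the circle.
    """
    return circle_distance(a, b - n * ALPHA) < length

length = 0.1
N_AB = {n for n in range(1, 2001) if in_N(0.0, 0.0, length, n)}  # A = B = [0, 0.1)
N_CD = {n for n in range(1, 2001) if in_N(0.0, 0.5, length, n)}  # C = [0, 0.1), D = [0.5, 0.6)
print(len(N_AB), len(N_CD), len(N_AB & N_CD))
```

The reason is simple: $n \in N(A,B)$ needs $n\alpha$ to be within $0.1$ of $0$ on the circle, while $n \in N(C,D)$ needs it within $0.1$ of $1/2$.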

As a toy model for the later results, let’s look at the proof of the following weak version of the ergodic theorem:

Theorem 1: Let $(X, \mathcal{B}, \mu, T)$ be ergodic m.p.s., $f, g \in L^2(X)$ then

$\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int f \cdot g \circ T^n \ d\mu = \int f \ d\mu \cdot \int g \ d\mu$.

Proof: Let $\mathcal{U}: f \mapsto f \circ T$; since $T$ preserves $\mu$, $\mathcal{U}$ is an isometry of $L^2(X)$ (unitary when $T$ is invertible).

Hence $\{ \frac{1}{N+1} \sum_{n=0}^N g \circ T^n \ | \ N \in \mathbb{N} \} \subseteq \overline{B( \overline{0}, ||g||)}$

Any weak limit point of the above set is $T$-invariant, hence ergodicity implies it must be a constant function.

Suppose $\frac{1}{N_i+1} \sum_{n=0}^{N_i} g \circ T^n$ converges weakly to the constant function $c$ as $i \rightarrow \infty$;

then, pairing with the constant function $1$, we have $c = \int c \ d\mu = \lim_{i \rightarrow \infty} \frac{1}{N_i+1} \sum_{n=0}^{N_i} \int g \circ T^n \ d\mu = \int g \ d \mu$

Therefore the set has only one limit point under the weak topology.

Since closed balls in a Hilbert space are weakly compact, $\frac{1}{N+1} \sum_{n=0}^N g \circ T^n$ converges weakly to the constant function $\int g \ d \mu$.

Therefore $\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int f \cdot g \circ T^n \ d\mu$

$= \int f \cdot ( \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N g \circ T^n) \ d\mu$ (where the limit is the weak limit in $L^2$)

$= \int f \cdot (\int g \ d\mu) d\mu = \int f \ d\mu \cdot \int g \ d\mu$.
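As a concrete instance (my own example, not from the book), take the rotation $T(x) = x + \alpha$ on $\mathbb{R}/\mathbb{Z}$ with $\alpha$ irrational and $f = g = \cos(2 \pi x)$. Then $\int f \cdot g \circ T^n \ d\mu = \frac{1}{2} \cos(2 \pi n \alpha)$, which does not converge, but its averages tend to $0 = \int f \ d\mu \cdot \int g \ d\mu$. A rough numerical sketch, with the angle `ALPHA` chosen arbitrarily:

```python
import math

ALPHA = math.sqrt(2) - 1  # assumed irrational rotation angle

def correlation(n):
    """Closed form of the integral of cos(2*pi*x) * cos(2*pi*(x + n*ALPHA)) over [0, 1)."""
    return 0.5 * math.cos(2 * math.pi * n * ALPHA)

def cesaro_average(N):
    """(1/(N+1)) * sum of correlation(n) for n = 0, ..., N."""
    return sum(correlation(n) for n in range(N + 1)) / (N + 1)

# The individual correlations keep oscillating, the averages shrink.
for N in (10, 100, 1000):
    print(N, correlation(N), cesaro_average(N))
```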

Next, we apply the above theorem on the product system and prove the following:

Theorem 2: For $(X, \mathcal{B}, \mu, T)$ weakly mixing,

$\lim_{N \rightarrow \infty} \frac{1}{1+N} \sum_{n=0}^N (\int f \cdot (g \circ T^n) \ d \mu - \int f \ d \mu \int g \ d \mu)^2$

$= 0$

Proof: For $f_1, f_2: X \rightarrow \mathbb{R}$, let $f_1 \otimes f_2: X \times X \rightarrow \mathbb{R}$ where $f_1 \otimes f_2 (x_1, x_2) = f_1(x_1) f_2(x_2)$

By theorem 1, we have

$\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N(\int f \cdot g \circ T^n \ d \mu)^2$

$= \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int (f \otimes f) \cdot ((g \otimes g) \circ (T \times T)^n) \ d \mu \times \mu$

$= (\int f \otimes f \ d\mu \times \mu) \cdot (\int g \otimes g \ d\mu \times \mu)$

$= (\int f \ d\mu)^2 (\int g \ d\mu)^2 \ \ \ \ ( \star )$

Set $a_n = \int f \cdot (g \circ T^n) \ d \mu, \ a = \int f \ d \mu \int g \ d \mu$ hence by theorem 1, we have

$\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N a_n = a$;

By $( \star )$, we have

$\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N a_n^2 = a^2$

Hence $\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N (a_n-a)^2$

$= \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N (a_n^2 - 2a \cdot a_n + a^2)$

$= \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N a_n^2 - 2a \cdot \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N a_n + a^2$

$= a^2 - 2 a \cdot a + a^2 = 0$

This establishes theorem 2.
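A short worked step, included as a sketch, on how a mean-square statement of this type gives convergence in density. With $a_n$ and $a$ as above, fix $\epsilon > 0$ and let $E_\epsilon = \{ n \ | \ |a_n - a| \geq \epsilon \}$ (the name $E_\epsilon$ is introduced only for this remark). Since each $n \in E_\epsilon$ contributes at least $\epsilon^2$ to the sum,

$\displaystyle \frac{|E_\epsilon \cap \{0, \cdots, N\}|}{N+1} \leq \frac{1}{\epsilon^2} \cdot \frac{1}{N+1} \sum_{n=0}^N (a_n - a)^2 \rightarrow 0$ as $N \rightarrow \infty$,

so each $E_\epsilon$ has density $0$. A standard diagonal argument over $\epsilon = 1, 1/2, 1/3, \cdots$ then produces a single density $0$ set outside of which $a_n \rightarrow a$.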

We now prove that the following definition of weak mixing is equivalent to the original definition.

Theorem 3: $(X, \mathcal{B}, \mu, T)$ weakly mixing iff for all $(Y, \mathcal{D}, \nu, S)$ ergodic, $(X \times Y, \mathcal{B} \times \mathcal{D}, \mu \times \nu, T \times S)$ is ergodic.

Proof: “$\Leftarrow$” is obvious: if $(X, \mathcal{B}, \mu, T)$ has the property that its product with any ergodic system is ergodic, then $(X, \mathcal{B}, \mu, T)$ is ergodic, since we can take its product with the one point system.
Taking the product of the system with itself then shows that $(X \times X, \mathcal{B} \times \mathcal{B}, \mu \times \mu, T \times T)$ is ergodic, which is the definition of being weakly mixing.

“$\Rightarrow$” Suppose $(X, \mathcal{B}, \mu, T)$ weakly mixing.

$T \times S$ is ergodic iff all invariant functions are constant a.e.

For any $g_1, g_2 \in L^2(X), \ h_1, h_2 \in L^2(Y)$, let $C = \int g_1 \ d \mu$, let $g_1' = g_1-C$; hence $\int g_1' \ d \mu = 0$.

$\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int g_1 \cdot (g_2 \circ T^n) \ d \mu \cdot \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

$= \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int C \cdot (g_2 \circ T^n) \ d \mu \cdot \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

$+ \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int g_1' \cdot (g_2 \circ T^n) \ d \mu \cdot \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

Since $\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int C \cdot (g_2 \circ T^n) \ d \mu \cdot \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

$= C \cdot \int g_2 \ d \mu \cdot \lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

By theorem 1, since $S$ is ergodic on $Y$,

$=\int g_1 \ d \mu \cdot \int g_2 \ d \mu \cdot \int h_1 \ d \nu \cdot \int h_2 \ d \nu$

On the other hand, let $a_n = \int g_1' \cdot (g_2 \circ T^n) \ d \mu, \ b_n = \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

By theorem 2, since $T$ is weak mixing $\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N (a_n - 0 \cdot \int g_2 \ d \mu)^2 = 0$ hence $\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N a_n^2 = 0 \ \ \ (\ast)$

By the Cauchy-Schwarz inequality, $(\sum_{n=0}^N a_n \cdot b_n)^2 \leq ( \sum_{n=0}^N a_n^2) \cdot ( \sum_{n=0}^N b_n^2)$ (by direct computation: subtract the left from the right and obtain a sum of squares).

Therefore $(\frac{1}{N+1} \sum_{n=0}^N a_n \cdot b_n)^2$

$\leq (\frac{1}{N+1} \sum_{n=0}^N a_n^2) \cdot (\frac{1}{N+1} \sum_{n=0}^N b_n^2)$,

which approaches $0$ as $N \rightarrow \infty$ by $(\ast)$, because $|b_n| \leq ||h_1|| \cdot ||h_2||$ keeps the second factor bounded.

Therefore, $\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int g_1' \cdot (g_2 \circ T^n) \ d \mu \cdot \int h_1 \cdot (h_2 \circ S^n) \ d \nu = 0$

Combining the two parts we get

$\lim_{N \rightarrow \infty} \frac{1}{N+1} \sum_{n=0}^N \int g_1 \cdot (g_2 \circ T^n) \ d \mu \cdot \int h_1 \cdot (h_2 \circ S^n) \ d \nu$

$= \int g_1 \ d \mu \cdot \int g_2 \ d \mu \cdot \int h_1 \ d \nu \cdot \int h_2 \ d \nu$.

Linear combinations of functions of the form $f(x, y) = g(x)h(y)$ are dense in $L^2(X \times Y)$ (in particular this set includes all characteristic functions of product sets and hence all simple functions whose basic sets are product sets).

We have thus shown that for a dense set of $f \in L^2(X \times Y)$ the sequence of functions $\frac{1}{N+1}\sum_{n=0}^N f(T^n(x), S^n(y))$ converges weakly to the constant function $\int f \ d \mu \times \nu$. (Since the averages are bounded in norm, it suffices to check convergence on a dense set of functionals in the dual space.)

Hence for any $f \in L^2(X \times Y)$, the averages converge weakly to the constant function $\int f \ d \mu \times \nu$.

For any $T \times S$-invariant function the averages equal the function itself, hence all invariant functions are constant a.e., and we obtain ergodicity of the product system.

This establishes the theorem.

# Length spectrum

I sat through a talk given by Jared Wunsch about a week ago in which he mentioned a certain real valued function (defined for a fixed compact Riemannian manifold) being smooth on all of $\mathbb{R}^+$ except for those points where there is a closed geodesic of that length on the manifold. So at the end I asked the question ‘How many lengths can there be?’ as I am curious about whether ‘being smooth outside those lengths’ is a strong statement for all manifolds (or for generic manifolds).

Later on I found this question is quite cool so I went on and thought a bit more about it.

Turns out this ‘set of lengths of closed geodesics’ is called the length spectrum of the manifold.

Without much difficulty, I constructed surfaces with length spectrum containing a sequence of accumulating points or a (measure zero) Cantor set. (by taking the surface of revolutions of graphs of real valued functions with certain properties)

Note that the length spectrum itself need not be closed, as one can easily construct examples where a sequence of closed geodesics accumulates to a parametrized curve that goes along a closed geodesic twice. However, since there can’t be a sequence of lengths approaching $0$ (because, for example, the injectivity radius is bounded away from $0$ by compactness), we may throw in all integer multiples of the lengths of closed geodesics; in each finite interval this is merely taking a union of finitely many scaled copies of the length spectrum (hence essentially does not change the size of the set). The resulting set of ‘generalized lengths of closed geodesics’ is closed.

I wish to show that the set of generalized lengths of closed geodesics is both of measure $0$ and nowhere dense (hence meager in the Baire category sense) by applying Sard’s theorem in an appropriately defined setting.

I’ll try to do this sometime soon, to be continued~