# Stable isoperimetric inequality

Eric Carlen from Rutgers gave a colloquium last week in which he bought up some curious questions and facts regarding the ‘stability’ of standard geometric inequalities such as the isoperimetric and Brunn-Minkowski inequality. To prevent myself from forgetting it, I’m dropping a short note on this matter here. Interestingly I was unable to locate any reference to this nor did I take any notes, hence this post is completely based on my recollection of a lunch five days ago.

–Many thanks to Marco Barchies, serval very high-quality references are located now. It turns out that starting with Fusco-Maggi-Pratelli ’06  which contains a full proof of the sharp bound, there has been a collective progress on shorter/different proofs and variations of the theorem made. See comments below!

As we all know, for sets in $\mathbb{R}^n$, the isoperimetric inequality is sharp only when the set is a round ball. Now what if it’s ‘almost sharp’? Do we always have to have a set that’s ‘close’ to a round sphere? What’s the appropriate sense of ‘closeness’ to use?

One might first attempt to use the Hausdorff distance:

$D(S, B_1(\bar{0})) = \inf_{t \in \mathbb{R}^n}\{D_H(S+t, B_1(\bar{0}))$.

However, we can easily see that, in dimension $3$ or higher, a ball of radius slightly small than $1$ with a long and thin finger sticking out would have volume $1$, surface volume $\varepsilon$ larger than that of the unit ball, but huge Hausdorff distance:

In the plane, however it’s a classical theorem that any region $S$ of area $\pi$ and perimeter $m_1(\partial S) \leq 2\pi +\varepsilon$ as $D(S, B_1(\bar{0})) \leq f(\varepsilon)$ where $f(\varepsilon) \rightarrow 0$ as $\varepsilon \rightarrow 0$ (well, that $f$ is because I forgot the exact bound, but should be linear in $\varepsilon$).

So what distance should we consider in higher dimensions? Turns out the nature thing is the $L^1$ norm:

$D(S, B_1(\bar{0})) = \inf_{t\in \mathbb{R}^n} \mbox{vol}((S+t)\Delta B_1(\bar{0}))$

where $\Delta$ is the symmetric difference.

First we can see that this clearly solves our problem with the thin finger:

To simplify notation, let’s normalize our set $S \subseteq \mathbb{R}^n$ to have volume 1. Let $B_n$ denote the ball with n-dimensional volume 1 in $\mathbb{R}^n$ (note: not the unit ball). $p_n = \mbox{vol}_{n-1}(\partial B_n)$ be the ($n-1$ dimensional) measure of the boundary of $B_n$.

Now we have a relation $D(S, B_n)^2 \leq C_n (\mbox{vol}_{n-1}(\partial S) - p_n)$

As said in the talk (and I can’t find any source to verify), there was a result in the 90’s that $D(S, B_n)^4 \leq C_n (\mbox{vol}_{n-1}(\partial S) - p_n)$ and the square one is fairly recent. The sharp constant $C_n$ is still unknown (not that I care much about the actual constant).

At the first glance I find the square quite curious (I thought it should depend on the dimension maybe like $n/(n-1)$ or something, since we are comparing some n-dimensional volume with (n-1) dimensional volume), let’s see why we should expect square here:

Take the simplest case, if we have a n-dimensional unit cube $C_n$, how does the left and right hand side change when we perturbe it to a rectangle with small eccentricity?

As we can see, $D(R_\varepsilon, C_n)$ is roughly $p_n \varepsilon$. The new boundary consists of two faces with measure $1+\varepsilon$, two faces of measure $1-\varepsilon$ and $2 \times (n-2)$ faces with volume $1-\epsilon^2$. Hence the linear term cancels out and we are left with a change in the order of $\varepsilon^2$! (well, actually to keep the volume 1, we need to have $1-\varepsilon/(1+\varepsilon)$ instead of $1-\varepsilon$, but it would still give $\varepsilon^2$)

It’s not hard to see that ellipses with small eccentricity behaves like rectangles.

Hence the square here is actually sharp. One can check that no matter how many of the $n$ side-length you perturbe, as long as the volume stay the same (up to $O(\varepsilon^2)$) the linear term of the change in boundary measure always cancels out.

There is an analoge of this stability theorem for the Brunn-Minkowski inequality, i.e. Given two sets of volume $V_1, V_2$, if the sum set has volume only a little bit larger than that of two round balls with those volumes, are the sets $L^1$ close to round balls? I believe it’s said this is only known under some restrictions on the sets (such as convex), which is strange to me since non-convex sets would only make the inequality worse (meaning the sum set has larger volume), don’t they?

I just can’t think of what could possibly go wrong for non-convex sets…(Really hope to find some reference on that!)

Anyways, speaking of sum sets, the following question recently caught my imagination (pointed out to me by Percy Wong, thank him~ and I shall quote him ‘this might sound like probability, but it’s really geometry!’):

Given a set $T \subseteq l^2$ (or $\mathbb{R}^n$), we define two quantities:

$G(T)=\mathbb{E}(\sup_{p \in T} \Sigma p_i g_i)$ and

$B(T) = \mathbb{E}(\sup_{p \in T} \Sigma p_i \epsilon_i)$

where $\mathbb{E}$ is the expected value, $\{g_i\}$ are independent random variables with a standard normal distribution (mean 0, variance 1) and $\{\epsilon_i\}$ are independent Bernoulli random variables.

Question: Given any $T$, can we always find $T'$ such that

$T \subseteq T' + B_{l^1}(\bar{0}, c B(T))$ and

$G(T') \leq c B(T)$

To find out more about the question, see Chapter 4 of this book. By the way, I should mention that there is a $5000 prize for this :-P # Ultrametrics and the nonlinear Dvoretzky problem Hi guys~ The school year here at Princeton is finally (gradually) starting. So I’m back to this blog :-P In this past week before anything has started, Assaf Naor came here and gave a couple rather interesting talks, which I decided to make a note about. For the complete content of the talk please refer to this paper by Naor and Mendel. As usual, I make no attempt on covering what was written on the paper nor what was presented in the talk, just some random pieces and bits which I found interesting. A long time ago, Grothendieck conjectured what is now known as Dvoretzky’s theorem: Theorem: For any $D>1$, for any $k \in \mathbb{N}$, there exists $N$ depending only on $k, D$ such that any normed vector space of dimension $N$ or larger has a k-dimensional linear subspace that embeds (via a linear transformation) with distorsion at most $D$ into the Hilbert space. i.e. this says in the case where $D$ is close to $1$ (let’s write $D = 1+\varepsilon$) for any given k, *any* norm on a vector space of sufficiently high dimension will have a $k$ dimensional subspace that looks almost Eculidean (meaning unit ball is round up to a multiple $1+\varepsilon$). Well, that’s pretty cool, but linear. As we all know, general metric spaces can be much worse than normed vector spaces. So here comes a nonlinear version of the above, posted by Terrence Tao: Nonlinear Dvoretzky problem: Given $\kappa>0, D>1$, does there exist a $\alpha$ so that every metric space of Hausdorff dimension $\geq \alpha$ has a subset $S$ of Hausdorff dimension $\kappa$ that embeds into the Hilbert space with distorsion $D$? Indeed, we should expect this to be a lot more general and much harder than the linear version, since we know literally nothing about the space except for having large Hausdorff dimension and as we know metric spaces can be pretty bizarre. That why we should find the this new result of theirs amazing: Theorem 1 (Mendel-Naor): There is a constant $C$ such that for any $\lambda \in (0,1)$, every compact metric space with dimension $\alpha$ has a closed subset $S$ of dimension at least $(1-\lambda) \alpha$ that admits an embedding into Hilbert space with distorsion $C/ \lambda$. Note that, in the original linear problem, $N = N(k, D)$ of course blows up to infinity whenever $k \rightarrow \infty$ or $D \rightarrow 1$. (whenever we need a huge dimensional space with fixed ‘flatness’ or a fixed dimension but ‘super-flat’ subspace). That is, when we are looking for subspaces inside a random space with fixed (large) dimension $N$, the larger dimension we need, the less flat (more distorsion) we can expect it to be. i.e. $k \rightarrow N$ necessarily forces $D \rightarrow \infty$ and $D \rightarrow 1$ forces $k \rightarrow 0$. We can see that this nonlinear theorem is great when we need large dimensional subspaces (when $\lambda$ is close to $0$), but not so great when we want small distorsion (it does not apply when distorsion is smaller than $C$, and blows up when it gets close to $C$). In the original statement of the problem this gives not only that a large $\alpha$ exists, but $\alpha(\kappa, D) \leq \frac{D}{D-C} \kappa$! In fact, when we are allowing relatively large distorsion (compare to constant $C$) this theorem is sharp: Theorem 2 (Mendel-Naor): There exists constant $C'$ and a family of metric spaces $\{ X_\alpha \ | \ \alpha \in \mathbb{R}^+ \}$ such that for any small $\varepsilon$, no subset of $X_\alpha$ with dimension $\geq (1-\varepsilon)\alpha$ embeds into Hilbert space with distorsion $\leq C'/\varepsilon$. This construction makes use of our all-time favorite: expander graphs! (see pervious post for definitions) So what if $D$ is small? Turns out, unlike in the linear case when $D<2$, there is no $\alpha(\kappa, D)$, in the paper they also produced spaces $X_\alpha$ for each $\alpha$ where the only subset that embeds with distorsion < $2$ are of Hausdorff dimension 0! In his words, the proof of theorem 1 (i.e. the paper) takes five hours to explain, hence I will not do that. But among many neat things appeared in the proof, instead of embedding into the Hilbert space directly they embedded it into an ultrametric space with distorsion $D$, and then make use of the fact that any such space sits in the Hilbert space isometrically. Definition: An ultrametric on space $X$ is a metric with the additional property that for any three points $x, y, z \in X$, we have the so-called strong triangle inequality, i.e. $d(x, y) \leq \max \{ d(x, z), d(y,z) \}$ If one pause and think a bit, this is a quite weird condition: for example, all triangles in an ultrametric space are isosceles with the two same length edge both longer than the third edge; every point is the center of the ball, etc. Exercise: Come up with an example of an ultrametric that’s not the discrete metric, on an infinite space. Not so easy, hum? My guess: the first one you came up with is the space of all words in 2-element alphabet $\Sigma_2$, with the dictionary metric (two points are distance $2^{-n}$ apart where $n$ is the first digit they differ), right? In fact in some sense all ultrametrics look like that. (i.e. they are represented by an infinite tree with weights assigned to vertices, in the above case a 2-regular tree with equal weights on same level) Topologically our $\Sigma_2$ is a Cantor set and sits isometrically in $l_2$. It’s not hard to see in fact all such space embeds isometrically in $l_2$. I just want to say that this operation of first construct an ultrametric space according to our given metric space, embed, and then embed the ultrametric metric space into the Hilbert space somehow reminds me of when we want to construct a measure satisfying a certain property, we first construct it on a Cantor set, then divide up the space and use the values on parts of the Cantor set for parts of the space…On this vein the whole problem gets translated to producing a tree that gives the ultrametric and work with the tree (by no means simple, though). Finally, let’s see a cute statement which can be proved by applying this result: Urbanski’s conjecture: Any compact metric space of Hausdorff dimension > $n$ can be mapped surjectively [thanks to Tushar Das for pointing out the typo] onto the unit ball $B_1^n$ by a Lipschitz map. (note strictly larger is important for our theorem to apply…since we need to subtract an $\varepsilon$) and hence another cute statement that’s still not known: Question: Does any compact metric space with positive Hausdorff $n$-dimensional measure maps Lipschitzly onto $B_1^n$? # Stabilization of Heegaard splittings In the last lecture of a course on Heegaard splittings, professor Gabai sketched an example due to Hass-Thompson-Thurston of two genus $g$ Heegaard splittings of a $3$-manifold that requires at least $g$ stabilization to make them equivalent. The argument is, in my opinion, very metric-geometric. The connection is so striking (to me) so that I feel necessary to give a brief sketch of it here. (Side note: This has been a wonderful class! Although I constantly ask stupid questions and appear to be confused from time to time. But in fact it has been very interesting! I should have talked more about it on this blog…Oh well~) The following note is mostly based on professor Gabai’s lecture, I looked up some details in the original paper ( Hass-Thompson-Thurston ’09 ). Recall: (well, I understand that I have not talked about Heegaard splittings and stabilizations here before, hence I’ll *try to* give a one minute definition) A Heegaard splitting of a 3-manifold $M$ is a decomposition of the manifold as a union of two handlebodies intersecting at the boundary surface. The genus of the Heegaard splitting is the genus of the boundary surface. All smooth closed 3-manifolds has Heegaard splitting due the mere existence of a triangulation ( by thicken the 1-skeleton of the triangulation one gets a handlebody perhaps of huge genus, it’s easy exercise to see its complement is also a handlebody). However it is of interest to find what’s the minimal genus of a Heegaard splitting of a given manifold. Two Heegaard splittings are said to be equivlent if there is an isotopy of the manifold sending one splitting to the other (with boundary gluing map commuting, of course). A stabilization of a Heegaard splitting $(H_1, H_2, S)$ is a surgery on $S$ that adds genus (i.e. cut out two discs in $S$ and glue in a handle). Stabilization will increase the genus of the splitting by $1$) Let $M$ be any closed hyperbolic $3$-manifold that fibres over the circle. (i.e. $M$ is $F_g \times [0,1]$ with the two ends identified by some diffeomorphism $f: F_g \rightarrow F_g$, $g\geq 2$): Let $M'_k$ be the $k$ fold cover of $M$ along $S^1$ (i.e. glue together$k$copies of $F_g \times I$ all via the map $f$: Let $M_k$ be the manifold obtained by cut open $M'_k$ along$F_g$and glue in two handlebodies $H_1, H_2$ at the ends: Since $M$ is hyperbolic, $M'_k$ is hyperbolic. In fact, for any $\varepsilon > 0$ we can choose a large enough $k$ so that $M_k$ can be equipped with a metric having curvature bounded between $1-\varepsilon$ and $1+\varepsilon$ everywhere. ( I’m obviously no in this, however, intuitively it’s believable because once the hyperbolic part $M'_k$ is super large, one should be able to make the metric in $M'_k$ slightly less hyperbolic to make room for fitting in an almost hyperbolic metric at the ends $H_1, H_2$). For details on this please refer to the original paper. :-P Now there comes our Heegaard splittings of $M_k$! Let $k = 2n$, let $H_L$ be the union of handlebody $H_1$ together with the first $n$ copies of $M$, $H_R$ be $H_2$ with the last $n$ copies of $M$. $H_L, H_R$ are genus $g$ handlebodies shearing a common surface $S$ in the ‘middle’ of $M_k$: Claim: The Heegaard splitting $\mathcal{H}_1 = H_L \cup H_R$ and $\mathcal{H}_2 = H_L \cup H_R$ cannot be made equivalent by less than $g$ stabilizations. In other words, first of all one can not isotope this splitting upside down. Furthermore, adding handles make it easier to turn the new higher genus splitting upside down, but in this particular case we cannot get away with adding anything less than $g$ many handles. Okay, not comes the punchline: How would one possible prove such thing? Well, as one might have imagined, why did we want to make this manifold close to hyperbolic? Yes, minimal surfaces! Let’s see…Suppose we have a common stabilization of genus $2g-1$. That would mean that we can sweep through the manifold by a surface of genus (at most) $2g-1$, with $1$-skeletons at time $0, 1$. Now comes what professor Gabai calls the ‘harmonic magic’: there is a theorem similar to that of Pitts-Rubinstein Ingredient #1: (roughly thm 6.1 from the paper) For manifolds with curvature close to $-1$ everywhere, for any given genus $g$ Heegaard splitting $\mathcal{H}$, one can isotope the sweep-out so that each surface in the sweep-out having area $< 5 \pi (g-1)$. I do not know exactly how is this proved. The idea is perhaps try to shrink each surface to a ‘minimal surface’, perhaps creating some singularities harmless in the process. The ides of the whole arguement is that if we can isotope the Heegaard splittings, we can isotope the whole sweep-out while making the time-$t$ sweep-out harmonic for each $t$. In particular, at each time there is (at least) a surface in the sweep-out family that divides the volume of $M'_n$ in half. Furthermore, the time 1 half-volume-surface is roughly same as the time 0 surface with two sides switched. We shall see that the surfaces does not have enough genus or volume to do that. (As we can see, there is a family of genus $2g$ surface, all having volume less than some constant independent of $n$ that does this; Also if we have ni restriction on area, then even a genus $g$ surface can be turned.) Ingredient #2: For any constant $K$, there is $n$ large enough so no surfaces of genus $ and area $ inside the middle fibred manifold with boundary $M'_n$ can divide the volume of $M'_n$ in half. The prove of this is partially based on our all-time favorite: the isoperimetric inequality: Each Riemannian metric $\lambda$ on a closed surface has a linear isoperimetric inequality for 1-chains bounding 2-chains, i.e. any homologically trivial 1-chain $c$ bounds a $2$ chain $z$ where $\mbox{Vol}_2(z) \leq K_\lambda \mbox{Vol}_1(c)$. Fitting things together: Suppose there is (as described above) a family of genus $2g-1$ surfaces, each dividing the volume of $M_{2n}$ in half and flips the two sides of the surface as time goes from $0$ to $1$. By ingredient #1, since the family is by construction surfaces from (different) sweep-outs by ‘minimal surfaces’, we have $\mbox{Vol}_2(S_t) < 5 \pi (2g-2)$ for all $t$. Now if we take the two component separated by $S_t$ and intersect them with the left-most $n$ copies of $M$ in $M'_{2n}$ (call it $M'_L$), at some $t$, $S_t$ must also divide the volume of $M'_L$ in half. Since $S_t$ divides both $M'_2n$ and $M'_L$ in half, it must do so also in $M'_R$. But $S_t$ is of genus $2g-1$! So one of $S_t \cap M'_L$ and $S_t \cap M'_R$ has genus $< g$! (say it's $M'_L$) Apply ingredient #2, take $K = 5 \pi (2g-2)$, there is $n$ large enough so that $S_t \cap M'_L$, which has area less than $K$ and genus less than $g$, cannot possibly divide $M'_L$ in half. Contradiction. # Graph, girth and expanders In the book “Elementary number theory, group theory and Ramanujan graphs“, Sarnak et. al. gave an elementary construction of expander graphs. We decided to go through the construction in the small seminar and I am recently assigned to give a talk about the girth estimate of such graphs. Given graph (finite and undirected) $G$, we will denote the set of vertices by $V(G)$ and the set of edges $E(G) \subseteq V(G)^2$. The graph is assumed to be equipped with the standard metric where each edge has length $1$. The Cheeger constant (or isoperimetric constant of a graph, see this pervious post) is defined to be: $\displaystyle h(G) = \inf_{S\subseteq V_G} \frac{|\partial S|}{\min\{|S|, |S^c|\}}$ Here the notation $\partial S$ denote the set of edges connecting an element in $S$ to an element outside of $S$. Note that this is indeed like our usual isoperimetric inequalities since it’s the smallest possible ratio between size of the boundary and size of the set it encloses. In other words, this calculates the most efficient way of using small boundary to enclose areas as large as possible. It’s of interest to find graphs with large Cheeger constant (since small Cheeger constant is easy to make: take two large graphs and connect them with a single edge). It’s also intuitive that as the number of edges going out from each vertice become large, the Cheeger constant will become large. Hence it make sense to restrict ourselves to graphs where there are exactly $k$ edges shearing each vertex, those are called $k$-regular graphs. If you play around a little bit, you will find that it’s not easy to build large k-regular graphs with Cheeger constant larger than a fixed number, say, $1/10$. Definition: A sequence of k-regular graphs $(G_i)$ where $|V_{G_i}| \rightarrow \infty$ is said to be an expander family if there exists constant $c>0$ where $h(G_i) \geq c$ for all $i$. By random methods due to Erdos, we can prove that expander families exist. However an explicit construction is much harder. Definition: The girth of $G$ is the smallest non-trivial cycle contained in $G$. (Doesn’t this smell like systole? :-P) In the case of trees, since it does not contain any non-trivial cycle, define the girth to be infinity. The book constructs for us, given pair $p, q$ of primes where $p$ is large (but fixed) and $q \geq p^8$, a graph $(p+1)$-regular graph$X^{p,q}$ with $\displaystyle h(X^{p,q}) \geq \frac{1}{2}(p+1 - p^{5/6 + \varepsilon} - p^{1/6-\varepsilon})$ where $0 < \varepsilon < 1/6$. Note that the bound is strictly positive and independent of $q$. Giving us for each $p$, $(X^{p,q})$ as q runs through primes larger than $p^8$ is a $(p+1)$-regular expander family. In fact, this constructs for us an infinite family of expander families: a $(k+1)$-regular one for each prime $k$ and the uniform bound on Cheeger constant gets larger as $k$ becomes larger. One of the crucial step in proving this is to bound the girth of the graph $X^{p,q}$, i.e. they showed that $g(X^{p,q}) \geq 2 \log_p(q)$ and if the quadratic reciprocity $(\frac{p}{q}) = -1$ then $g(X^{p,q}) \geq 4 \log_p(q) - \log_p(4)$. This is what I am going to do in this post. Let $\mathbb H ( \mathbb Z)$ be the set of quaternions with $\mathbb Z$ coefficient, i.e. $\mathbb H ( \mathbb Z) = \{ a+bi+cj+dk \ | \ a,b,c,d \in \mathbb Z \}$ Fix odd prime $p$, let $\Lambda' = \{ \alpha \in \mathbb H(\mathbb Z) \ | \ \alpha \equiv 1 (mod 2) \}$ $\cup \ \{\alpha \in \mathbb H(\mathbb Z) \ | \ N(\alpha) = p^n \ \mbox{and} \ \alpha \equiv i+j+k (mod 2) \}$ where the norm $N$ on $\mathbb H(\mathbb Z)$ is the usual $N(a+bi+cj+dk) = a^2 + b^2 +c^2+d^2$. $\Lambda'$ consists of points with only odd first coordinate or points lying on spheres of radius $\sqrt{p^n}$ and having only even first coordinate. One can easily check $\Lambda'$ is closed under multiplication. Define equivalence relation $\sim$ on $\Lambda'$ by $\alpha \sim \beta$ if there exists $m, n \in \mathbb{N}$ s.t. $p^m \alpha = \pm p^n \beta$. Let $\Lambda = \Lambda' / \sim$, let $Q: \Lambda' \rightarrow \Lambda$ be the quotient map. Since we know $\alpha_1 \sim \alpha_2, \beta_1 \sim \beta_2 \Rightarrow \alpha_1\beta_1 \sim \alpha_2\beta_2$, $\Lambda$ carries an induced multiplication with unit. In elementary number theory, we know that the equation $a^2+b^2+c^2+d^2 = p$ has exactly $8(p+1)$ integer solutions. Hence the sphere of radius $p$ in $\mathbb H(\mathbb Z)$ contain $8(p+1)$ points. In each $4$-tuple $(a,b,c,d)$ exactly one is of a different parity from the rest, depending on whether $p\equiv1$ or $3 (mod 4)$. Restricting to solutions where the first coordinate is non-negative, having different parity from the rest (in case the first coordinate is $0$, we pick one of the two solutions $\alpha, -\alpha$ to be canonical), this way we obtain exactly $p+1$ solutions. Let $S'_p = \{ \alpha_1, \bar{\alpha_1}, \cdots, \alpha_k, \bar{\alpha_k}, \beta_1, \cdots, \beta_l \}$ be this set of $p+1$ points on the sphere. Note that the $\beta$s represent the solutions where the first coordinate is exactly $0$. Check that $S'_p$ generates $\Lambda'$. We have $\alpha_i \bar{\alpha_i} = p$ and $-\beta_j^2 = p$. By definition $S'_p \subseteq \Lambda'$ and $Q$ is injective on $S'_p$. Let $S_p = Q(S'_p)$. Consider the Cayley graph $\mathcal G (\Lambda, S_p)$, this is a $(p+1)$-regular graph. Since $S_p$ generares $\Lambda$, $\mathcal G (\Lambda, S_p)$ is connected. Claim: $\mathcal G (\Lambda, S_p)$ is a tree. Suppose not, let $(v_0, v_1, \cdots, v_k=v_0)$ a non-trivial cycle. $k \geq 2$. Since $\mathcal G$ is a Cayley graph, we may assume $v_0 = e$. Hence $v_1 = \gamma_1, \ v_2 = \gamma_1\gamma_2, \cdots, v_k = \gamma_1 \cdots \gamma_k$, for some $\gamma_1, \cdots, \gamma_k \in S_p$. Since $v_{i-1} \neq v_{i+1}$ for all $1\leq i \leq k-1$, the word $\gamma_1, \cdots, \gamma_k$ cannot contain either $\alpha_i\bar{\alpha_i}$ or $\beta_i^2$, hence cannot be further reduced. $\gamma_1, \cdots, \gamma_k = e$ in $\Lambda$ means for some $m, n$ we have $\pm p^n \gamma_1, \cdots, \gamma_k = p^m$. Since every word in $\Lambda'$ with norm $N(\alpha) = p^k$ must have a unique factorization $\alpha = \pm p^r w_m$ where $w_m$ is a reduces word of length $m$ in $S'_p$ and $2r+m = k$. Contradiction. Establishes the claim. Now we reduce the group $\mathbb H (\mathbb Z)$ mod $q$: $\pi_q: \mathbb H (\mathbb Z) \rightarrow \mathbb H (\mathbb{F}_q)$ One can check that $\pi_q(\Lambda') = \mathbb H (\mathbb{F}_q)^\times$. Let $Z_q = \{ \alpha \in \mathbb H (\mathbb{F}_q)^\times \ | \ \alpha = \bar{\alpha} \}$, $Z_q < \mathbb H (\mathbb{F}_q)^\times$ is a central subgroup. For $\alpha, beta \in \Lambda'$, $\alpha \sim \beta \Rightarrow \pi_q(\alpha)^{-1}\pi_q(\beta) \in Z_q$. Which means we have well defined homomorphism $\Pi_q: \Lambda \rightarrow \mathbb H (\mathbb{F}_q)^\times / Z_p$. Let $T_{p,q} = \Pi_q(S_p)$, if $q > 2\sqrt{q}$ we have $\Pi_q$ is injective on $S_p$ and hence$latex $| T_{p,q} | = p+1$.

Now we are ready to define our expanding family:

$X^{p,q} = \mathcal{G}( \Pi_q(\Lambda), T_{p,q})$.

Since $S_p$ generates $\Lambda$, $T_{p,q}$ generates $\Pi_q(\Lambda)$. Hence $X^{p,q}$ is $(p+1)$-regular and connected.

Theorem 1: $g(X^{p,q}) \geq 2 \log_p(q)$

Let $(e, v_1, \cdots, v_k=e)$ be a cycle in $X^{p,q}$, there is $t_1, \cdots, t_k \in T_{p,q}$ such that $v_i = t_1 t_2 \cdots t_k$ for $1 \leq i \leq k$.

Let $\gamma_i = \Pi_q^{-1}(t_i), \ \gamma_i \in S_p$, $\alpha = a_0 + a_1 i+a_2 j +a_3 k = \gamma_1 \cdots \gamma_k \in \Lambda$. Note that from the above arguement we know $\alpha$ is a reduced word, hence $\alpha \neq e_{\Lambda}$. In particular, this implies $a_1, a_2, a_3$ cannot all be $0$.

Also, since $\alpha$ is reduced, $\displaystyle N(\alpha) = \Pi_{i=1}^k N(\gamma_i) = p^k$.

By Lemma, since $\Pi_q(\alpha) = 1$, $\alpha \in \mbox{ker}(\Pi_q)$ hence $q$ divide $a_1, a_2, a_3$, we conclude

$N(\alpha) = a_0^2 + a_1^2 +a_2^2 +a_3^2 \geq q^2$

We deduce $p^k \geq q^2$ hence $k \geq \log_p(q)$ for all cycle. i.e. $g(X^{p,q}) \geq \log_p(q)$.

Theorem 2: If $q$ does not divide $p$ and $p$ is not a square mod $q$ (i.e. $(\frac{p}{q}) = -1$), then $g(X^{p,q}) \geq 4 \log_p(q) - \log_p(4)$.

For any cycle of length $k$ as above, we have $N(\alpha) = p^k \equiv a_0^2 (\mbox{mod} \ q)$, i.e. $(\frac{p^k}{q}) = 1$.

Since $(\frac{p^k}{q}) = ((\frac{p}{q})^k$, we have $(-1)^k = 1, \ k$ is even. Let $k = 2l$.

Note that $p^k \equiv a_0^2 (\mbox{mod} \ q^2)$, we also have

$p^{2l} \equiv a_0^2 (\mbox{mod} \ q^2)$

Hence $p^l \equiv a_0 (\mbox{mod} \ q^2)$.

Since $a_0^2 \leq p^{2l}$, $|a_0| \leq p^l$.

If $2l < 4 \log_p(q) - \log_p(4)$ we will have $p^l < q^2/2$. Then $|p^l\pm a_0| \leq 2|p^l| < q^2$.

But we know that $p^l \equiv a_0 (\mbox{mod} \ q^2)$, one of $p^l\pm a_0$ must be divisible by $q^2$, hence $0$.

Conclude $p^l = \pm a_0$, $N(\alpha) = p^k = a_0^2$, hence $a_1=a_2=a_3=0$. Contradiction.

# Isoperimetric inequality on groups

Back in high school, we have all learned and loved the isoperimetric inequality: “If one wants to enclose a region of area $\pi$ on the plane, one needs a rope of length at least $2 \pi$.” or equivalently, given $U \subseteq \mathbb{R}^2$ bounded open, we always have

$\ell(\partial U) \geq 2 \sqrt{\pi} \sqrt{\mbox{Area}(U)}$

Of course this generalizes to $\mathbb{R}^n$:

Theorem: Given any open set $U \subseteq \mathbb{R}^n$, we have

$\mbox{vol}_{n-1}\partial (U) \geq n\cdot \omega_n^{1/n} \mbox{vol}_n(U)^{\frac{n-1}{n}}$

Here $\omega_n$ is the volume of the unit n-ball. Note that the inequality is sharp for balls in $\mathbb{R}^n$.

One nature question to ask should be: for which other kind of spaces do we have such an inequality. i.e. when can we lower bound the volume of the boundary by in terms of the volume of the open set? If such inequality exists, how does the lower bound depend on the volume of the set?

I recently learned to produce such bounds on groups:
Let $G$ be a discrete group, fix a set of generators and let $C_G$ be its Cayley graph, equipped with the word metric $d$.

Let $N(R) = |B_R(e)|$ be the cardinality of the ball of radius $R$ around the identity.

For any set $U \subseteq G$, we define $\partial U = \{g \in G \ | \ d(g, U) = 1 \}$ i.e. points that’s not in $U$ but there is an edge in the Cayley graph connecting it to some point in $U$.

Theorem:For any group with the property that $N(R) \geq c_n R^n$, then for any set $U \subseteq G$ with $|U| \leq \frac{1}{2}|G|$,

$|\partial U| \geq c_n |U|^{\frac{n-1}{n}}$.

i.e. If the volume of balls grow (w.r.t. radius) as fast as what we have in $\mathbb{R}^n$, then we can lower bound the size of boundary any open set in terms of its volume just like what we have in $\mathbb{R}^n$.

Proof: We make use of the fact that right multiplications by elements of the group are isometries on Cayley graph.

Let $R = (\frac{2}{c_n}|U|)^\frac{1}{n}$, so we have $|B_R(e)| \geq 2|U|$.

For each element $g \in B_R(e)$, we look at how many elements of $U$ went outside of $U$ i.e. $| Ug \backslash U|$. (Here the idea being the size of the boundary can be lower bounded in terms of the length of the translation vector and the volume shifted outside the set. Hence we are interested in finding an element that’s not too far away from the identity but shifts a large volume of $U$ outside of $U$.)

The average value of $|Ug \backslash U|$ as $g$ varies in the ball $B_R(e)$ is:

$\displaystyle \frac{1}{|B_R(e)|} \sum_{g\in B_R(e)} |Ug \backslash U|$

The sum part is counting the elements of $U$ that’s translated outside $U$ by $g$ then sum over all $g \in B_R(e)$, this is same as first fixing any $u \in U$, count how many $g$ sends $u$ outside $U$, and sum over all $u \in U$ ( In fact this is just Fubini, where we exchange the order of two summations ). “how mant $g \in B_R(e)$ sends $u$ outside of $U$” is merely $|uB_R(e) \backslash U|$.

Hence the average is

$\displaystyle \frac{1}{|B_R(e)|} \sum_{u\in U}|uB_R(e) \backslash U|$.

But we know $|B_R(e)| \geq 2 \cdot |U|$, hence $|uB_R(e) \backslash U|$ is at least $\frac{1}{2}|B_R(e)|$.

Hence

$\displaystyle \frac{1}{|B_R(e)|} \sum_{u\in U}|uB_R(e) \backslash U|$

$\geq \frac{1}{|B_R(e)|} \cdot |U| \cdot \frac{1}{2} |B_R(e)| = \frac{1}{2} |U|$

Now we can pick some $g \in B_R(e)$ where $|Ug \backslash U| \geq \frac{1}{2}|U|$ (at least as large as average).

Since $g$ has norm at most $R$, we can find a word $g_1 g_2 \cdots g_k = g, \ k \leq R$.

For any $ug \in (Ug \backslash U)$, since $u \in U$ and $ug \notin U$, there must be some $1\leq i\leq k$ s.t. $u(g_1\cdots g_{i-1}) \in U$ and $u(g_1\cdots g_i) \notin U$.

Hence $u g_1 \cdots g_i \in \partial U$, $ug \in \partial U \cdot g_{i+1} \cdots g_k$.

We deduce that $\displaystyle Ug \backslash U \subseteq \partial U \cup \partial U g_k \cup \cdots \cup \partial U g_2 \cdots g_k$ i.e. a union of $k$ copies of $\partial U$.

Hence $R |\partial U| \geq k |\partial U| \geq |Ug \backslash U| \geq \frac{1}{2}|U|$

$|\partial U| \geq \frac{|U|}{2R}$

Since $|B_R(e)| \geq c_n R^n$, we have $R \leq c_n |B_R(e)|^{\frac{1}{n}}$. $R = (\frac{2}{c_n}|U|)^\frac{1}{n} = c_n |U|^\frac{1}{n}$, hence we have

$|\partial U| \geq c_n |U|^\frac{n-1}{n}$

Establishes the claim.

Remark: The same argument carries through as long as we have a lower bound on the volume of balls, not necessarily polynomial. For example, on expander graphs, the volume of balls grow exponentially: $B_R(e) \geq c \cdot e^R$, then trancing through the argument we get bound

$|\partial U| \geq c \cdot \frac{|U|}{\log{|U|}}$

Which is a very strong isoperimetric inequality. However in fact the sharp bound for expanders is $|\partial U| \geq c \cdot |U|$. But to get that one needs to use more information of the expander than merely the volume of balls.

On the same vein, we can also prove a version on Lie groups:

Theorem:Let $G$ be a Lie group, $g$ be a left invariant metric on $G$. If $\mbox{vol}_n(B_R) \geq c_n R^n$ then for any open set $U$ with no more than half of the volume of $G$,

$\mbox{vol}_{n-1}(\partial U) \geq c_n \mbox{vol}_n(U)^\frac{n-1}{n}$.

Note that to satisfy the condition $\mbox{vol}_n(B_R) \geq c_n R^n$, our Lie group must be at least n-dimensional since if not the condition would fail for small balls. $n$ might be strictly larger than the dimension of the manifold depending on how ‘neigatively curved’ the manifold is in large scale.

Sketch of proof: As above, take a ball twice the size of the set $U$ around the identity, say it’s $B_R(e)$. Now we consider all left translates of the set $U$ by element in $B_R(e)$. In average an element shifts at least half of $U$ outside of $U$. Pick element $g$ where $\mbox{vol}(gU \backslash U)$ is above average.

Let $\gamma: [0, ||g||]$ be a unit speed geodesic connecting $e$ to $g$. Consider the union of left translates of $\partial U$ by elements in $\gamma([0, ||g||])$, this must contain all of $gU \backslash U$ since for any $gu \notin U$ the segment $\gamma([0,||g||]) \cdot u$ must cross the boundary of $U$, i.e. there is $c \in [0,||g||]$ where $\gamma(c) u \in \partial U$, hence

$g\cdot u = \gamma(||g||) \cdot u = \gamma(||g||-c)\gamma(c) u \in \gamma(||g||-c) \cdot \partial U$

But since the geodesic has derivative $1$, the n-dimensional volume of the union of those translates is at most $\mbox{vol}_{n-1}(\partial U) \cdot ||g||$.

We have $\mbox{vol}_{n-1}(\partial U) \cdot R \geq \mbox{vol}_n(gU \backslash U) \geq \mbox{vol}_n(U)/2$

Now since we have growth condition

$2\mbox{vol}_n(U) = \mbox{vol}_n(B_R) \geq c_n R^n$

i.e. $R \leq c_n \mbox{vol}_n(U)^\frac{1}{n}$.

Conbine the two inequalities we obtain

$\mbox{vol}_{n-1}(\partial U) \geq c_n \mbox{vol}_n(U)^\frac{n-1}{n}$.

Establishes the theorem.

Remark: In general, this argument produces a lower bound on the size of the boundary in terms of the volume of the set as long as:
1. There is a way to ‘continuously translate’ the set by elements in the space.
2. We have a lower bound on the growth of balls in terms of radius.

The key observation is that the translated set minus the original set is always contained in a ‘flattened cylinder’ of the boundary in direction of the translate, which then has volume controlled by the boundary size and the length of the translating element. Because of this, the constant is almost never strict as the difference (translate subtract original) can never be the whole cylinder (in case of a ball, this difference is a bit more than half of the cylinder).