A survey on ergodicity of Anosov diffeomorphisms

This is in part a preparation for my 25-minute talk at a workshop here at Princeton next week. (I’ve never given a short talk before…I’m super nervous about this >.<) In this little survey post I wish to list some background and historical results which might appear in the talk.

Let me post the (tentative) abstract first:

——————————————————

Title: Volume preserving extensions and ergodicity of Anosov diffeomorphisms

Abstract: Given a C^1 self-diffeomorphism of a compact subset of \mathbb{R}^n, Whitney’s extension theorem tells us exactly when it admits a C^1 extension to \mathbb{R}^n. How about volume preserving extensions?

It is a classical result that any volume preserving Anosov diffeomorphism of regularity C^{1+\varepsilon} is ergodic. The question is open for C^1. In 1975 Rufus Bowen constructed a (non-volume-preserving) Anosov map on the 2-torus with an invariant Cantor set of positive measure. Various attempts have been made to make the construction volume preserving.

By studying the above extension problem we conclude, in particular, that the Bowen-type mapping on a Cantor set of positive measure can never be extended to a volume preserving map on the torus. This is joint work with Charles Pugh and Amie Wilkinson.

——————————————————

A diffeomorphism f: M \rightarrow M is said to be Anosov if there is a splitting of the tangent bundle TM = E^u \oplus E^s that’s invariant under Df, such that vectors in E^u are uniformly expanded and vectors in E^s are uniformly contracted by Df.
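To make the definition concrete, here is a small sketch using the standard linear example on the 2-torus, the map induced by the matrix [[2, 1], [1, 1]] (often called the cat map). This example is my choice for illustration and is not used elsewhere in this post. For a linear map the splitting E^u \oplus E^s is just the pair of eigenlines, so the code computes eigenvalues and eigenvectors of a 2x2 matrix and lets us check that one direction is expanded and the other contracted. The function name hyperbolic_splitting_2x2 is hypothetical.

```python
import math

def hyperbolic_splitting_2x2(a, b, c, d):
    """Eigen-data of [[a, b], [c, d]], assuming distinct real eigenvalues and b != 0.

    Returns ((lam_plus, e_plus), (lam_minus, e_minus)) with lam_plus > lam_minus.
    """
    if b == 0:
        raise ValueError("this sketch assumes b != 0")
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc <= 0:
        raise ValueError("no pair of distinct real eigenvalues")
    lam_plus = (tr + math.sqrt(disc)) / 2
    lam_minus = (tr - math.sqrt(disc)) / 2
    # For b != 0, the vector (b, lam - a) is an eigenvector for the eigenvalue lam.
    return (lam_plus, (b, lam_plus - a)), (lam_minus, (b, lam_minus - a))

# Illustrative choice: the matrix [[2, 1], [1, 1]] acting on the 2-torus.
(lam_u, e_u), (lam_s, e_s) = hyperbolic_splitting_2x2(2, 1, 1, 1)
print("expanding direction:", e_u, "rate", lam_u)
print("contracting direction:", e_s, "rate", lam_s)
```

For this matrix the determinant is 1, so the product of the two rates is 1 (the map preserves area), with one rate above 1 and the other strictly between 0 and 1.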

In his thesis, Anosov gave an argument that proves:

Theorem: (Anosov ’67) Any volume preserving Anosov diffeomorphism of a compact manifold with regularity C^2 or higher is ergodic.

This result was later generalized to Anosov diffeos with regularity C^{1+\varepsilon}, i.e. C^1 with an \varepsilon-Hölder condition on the derivative.

It is a curious open question whether this is true for maps that are only C^1.

The methods for proving ergodicity for maps of higher regularity, which rely on the stable and unstable foliations being absolutely continuous, certainly do not carry over to the C^1 case:

In 1975, Rufus Bowen gave the first example of an Anosov map that’s only C^1, with non-absolutely continuous stable and unstable foliations. In fact his example is a modification of the classical Smale horseshoe on the two-torus; it is not volume preserving, but it has an invariant Cantor set of positive Lebesgue measure.

A simple observation is that the Bowen map is in fact volume preserving on the Cantor set. Ever since then, it’s been of interest to extend Bowen’s example to the complement of the Cantor set in order to obtain a volume preserving Anosov diffeo that’s not ergodic.

In 1980, Robinson and Young extended the Bowen example to a C^1 Anosov diffeomorphism that preserves a measure that’s absolutely continuous with respect to the Lebesgue measure.

In a recent paper, Artur Avila showed:

Theorem: (Avila ’10) C^\infty volume preserving diffeomorphisms are C^1 dense in C^1 volume preserving diffeomorphisms.

Together with other facts about Anosov diffeomorphisms, this implies that the generic C^1 volume preserving Anosov diffeomorphism is ergodic, making the question of whether such an example exists even more curious.

In light of this problem, we study the much more elementary question:

Question: Given a compact set K \subseteq \mathbb{R}^2 and a self-map f: K \rightarrow K, when can the map f be extended to an area-preserving C^1 diffeomorphism F: \mathbb{R}^2 \rightarrow \mathbb{R}^2?

Of course, a necessary condition for such an extension to exist is that f extends to a C^1 diffeomorphism F (perhaps not volume preserving) and that DF has determinant 1 on K. Whitney’s extension theorem gives a necessary and sufficient criterion for this.

Hence the unknown part of our question is just:

Question: Given K \subseteq \mathbb{R}^2 and F \in \mbox{Diff}^1(\mathbb{R}^2) with \det(DF_p) = 1 for all p \in K, when is there a G \in \mbox{Diff}^1_\omega(\mathbb{R}^2) with G|_K = F|_K?

There are trivial restrictions on K, e.g. if K separates \mathbb{R}^2 and F switches complementary components of different volume, then F|_K can never have a volume preserving extension.

A positive result along these lines would be the following slight modification of Moser’s theorem:

Theorem: Any C^{r+1} diffeomorphism on S^1 can be extended to a C^r area-preserving diffeomorphism on the unit disc D.

For more details see this previous post.

Applying methods of generating functions and Whitney’s extension theorem, as in this paper, we can in fact get rid of the loss of one derivative, i.e.

Theorem: (Bonatti, Crovisier, Wilkinson ’08) Any C^1 diffeo on the circle can be extended to a volume-preserving C^1 diffeo on the disc.

With the above theorem, should we expect the condition of only switching complementary components of the same volume to be also sufficient?

No. As seen in the previous post, restricting to the case where F only permutes complementary components of the same volume is not enough. In the example, K does not separate the plane, f: K \rightarrow K can be C^1 extended, the extension preserves volume on K, and yet it’s impossible to find an extension preserving the volume on the complement of K.

The problem here is that there are ‘almost enclosed regions’ with different volume that are being switched. One might hope that such extensions exist at least for Cantor sets (such as in the Bowen case); however, this is still not the case.

Theorem: For any product Cantor set C = C_1 \times C_2 of positive measure, the horseshoe map h: C \rightarrow C does not extend to a Hölder continuous map preserving area on the torus.

Hence in particular we get that no volume preserving extension of the Bowen map is possible (not even a Hölder continuous one).

Recurrence and genericity – a translation from French

To commemorate passing the French exam earlier this week (without knowing any French) and also to test this program ‘latex to wordpress‘, I decided to post my French-translation assignment here.

Last year, I went to Paris and heard a French talk by Crovisier. Strangely enough, although I couldn’t understand a single word he said, just by looking at the slides and pictures, I liked the talk. That’s why, when asked the question ‘so are there any French papers you wanted to look at?’, I immediately came up with this one, which the talk was based on.

Here is a translation of selected parts (selected according to my interest) of section 1.2 of the paper `Récurrence et Généricité‘ ( Inventiones Mathematicae 158 (2004), 33-104 ) by C. Bonatti and S. Crovisier, in which they prove a connecting lemma for pseudo-orbits.

Interestingly, just in this short section they referred to two results I have discussed in earlier posts of this blog: Conley’s fundamental theorem of dynamical systems and the closing lemma. In any case, I think it’s a cool piece of work to look at! Enjoy~ (Unfortunately, if one wants to see the rest of the paper, one has to read French >.<)

Precise statements of results

1. Statement of the connecting lemma for pseudo-orbits

In all the following work we consider a compact manifold {M} equipped with an arbitrary Riemannian metric and sometimes also with a volume form {\omega} (unrelated to the metric). We write {\mbox{Diff}^1(M)} for the set of diffeomorphisms of class {C^1} on {M} with the {C^1} topology and {\mbox{Diff}^1_\omega(M) \subset \mbox{Diff}^1(M)} for the subset preserving the volume form {\omega}.

Recall that, in any complete metric space, a set is said to be residual if it contains a countable intersection of open and dense sets. A property is said to be generic if it is satisfied on a residual set. By slight abuse of language, we use the term generic diffeomorphisms: the phrase ‘generic diffeomorphisms satisfy property P‘ means that property P is generic.

Let f \in \mbox{Diff}^1(M) be a diffeomorphism of M. For all \varepsilon>0, an \varepsilon-pseudo-orbit of f is a sequence (finite or infinite) of points (x_i) such that for all i, d(x_{i+1},f(x_i)) < \varepsilon. We define the following binary relations for pairs of points (x,y) on M:

– For all \varepsilon > 0, we write x \dashv_\varepsilon y if there exists an \varepsilon-pseudo-orbit (x_0, x_1, \cdots, x_k) where x_0 = x and x_k = y for some k \geq 1.

– We write {x \dashv y} if {x \dashv_\varepsilon y} for all {\varepsilon>0}. We sometimes write {x \dashv_f y} to specify the dynamical system in consideration.

– We write {x \prec y} (or {x \prec_f y}) if for all neighborhoods {U, V} of {x} and {y}, respectively, there exists {n \geq 1} such that {f^n(U)} intersects {V}.
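As a quick illustration of the first definition (an aside of mine, not part of the paper), the following sketch checks whether a finite sequence is an \varepsilon-pseudo-orbit. The circle rotation by 1/4 and the sample points are assumptions chosen only so that the numbers are easy to follow.

```python
def circle_dist(x, y):
    """Distance on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def rotation(x):
    """Rotation of the circle by a quarter turn (an illustrative choice)."""
    return (x + 0.25) % 1.0

def is_pseudo_orbit(points, f, dist, eps):
    """True if d(x_{i+1}, f(x_i)) < eps for every consecutive pair."""
    return all(dist(points[i + 1], f(points[i])) < eps
               for i in range(len(points) - 1))

# Each jump misses the true orbit of the rotation by about 0.01.
chain = [0.0, 0.26, 0.5, 0.76]
print(is_pseudo_orbit(chain, rotation, circle_dist, eps=0.02))
print(is_pseudo_orbit(chain, rotation, circle_dist, eps=0.005))
```

With these choices the sequence is a 0.02-pseudo-orbit but not a 0.005-pseudo-orbit, which is the sense in which x \dashv_\varepsilon y depends on \varepsilon.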

Here are a few elementary properties of these relations.

1. The relations {\dashv} and {\dashv_\varepsilon} are, by construction, transitive. The chain recurrent set {\mathcal{R}(f)} is the set of points {x} in {M} such that {x \dashv x}.

2. The relation {x \prec y} is not a priori transitive. The non-wandering set {\Omega(f)} is the set of points {x} in {M} such that {x \prec x}.

Marie-Claude Arnaud has shown in [Ar] that the relation {\prec} is transitive for generic diffeomorphisms. By using similar methods we show:

Theorem 1: There exists a residual set {\mathcal{G}} in {\mbox{Diff}^1(M)} (or in {\mbox{Diff}^1_\omega(M)}) such that for all diffeomorphisms {f} in {\mathcal{G}} and all pairs of points {(x, y)} in {M} we have:

\displaystyle x \dashv_f y \Longleftrightarrow x \prec_f y.

This theorem is a consequence of the following general perturbation result:

Theorem 2: Let {f} be a diffeomorphism on a compact manifold {M}, satisfying one of the following two hypotheses:

1. all periodic orbits of {f} are hyperbolic,

2. {M} is a compact surface and all periodic orbits are either hyperbolic or elliptic with irrational rotation number (the derivative has complex eigenvalues, all of modulus {1}, which are not roots of unity).

Let {\mathcal{U}} be a {C^1}-neighborhood of {f} in {\mbox{Diff}^1(M)} (or in {\mbox{Diff}^1_\omega(M)}, if {f} preserves the volume form {\omega}). Then for all pairs of points {(x,y)} in {M} such that {x \dashv y}, there exists a diffeomorphism {g} in {\mathcal{U}} and an integer {n>0} such that {g^n(x) = y}.

Remark: In Theorem 2 above, if the diffeomorphism {f} is of class {C^r} with {r \in (\mathbb{N} \backslash \{0\})\cup \{ \infty \}}, then the {C^1}-perturbation {g} can also be chosen in class {C^r}. Indeed the diffeomorphism {g} is obtained thanks to a finite number of {C^1}-perturbations given by the connecting lemma (Theorem 2.1), and each of these perturbations is itself of class {C^r}.

Here are a few consequences of these results:

Corollary: There exists a residual set {\mathcal{G}} in {\mbox{Diff}^1(M)} such that for all diffeomorphisms {f} in {\mathcal{G}}, the chain recurrent set {\mathcal{R}(f)} coincides with the non-wandering set {\Omega(f)}.

Corollary: Suppose {M} is connected. Then there exists a residual set {\mathcal{G}} in {\mbox{Diff}^1(M)} such that if {f \in \mathcal{G}} satisfies {\Omega(f) = M} then it is transitive. Furthermore, {M} is the unique homoclinic class for {f}.

For a volume preserving diffeomorphism {f}, the set {\Omega(f)} always coincides with the whole manifold {M}. We therefore find the analogue of this corollary in the conservative case (see section 1.2.4).

2. Dynamical decomposition of generic diffeomorphisms into elementary pieces

Consider the symmetrized relation {\vdash\dashv } of {\dashv} defined by {x \vdash\dashv y} if {x \dashv y} and {y\dashv x}. This relation then induces an equivalence relation on {\mathcal{R}(f)}, where the equivalence classes are called chain recurrence classes.
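To get a feeling for these equivalence classes, here is a finite caricature (my own aside, not from the paper). It fixes a single \varepsilon, restricts pseudo-orbits to a finite sample of points, and groups the sample points x with x \dashv_\varepsilon x into classes under the symmetrized relation. The true relation {\dashv} requires all \varepsilon>0 and all points of {M}, so this is only a sketch under those assumptions; the function name and the test map x \mapsto x^2 on [0,1] (which is not a diffeomorphism of a closed manifold) are hypothetical choices.

```python
def eps_chain_classes(points, f, dist, eps):
    """Classes of the symmetrized eps-chain relation on a finite sample.

    Pseudo-orbits are only allowed to visit the sample points.
    """
    n = len(points)
    # reach[i][j]: one eps-jump from points[i] to points[j] is allowed.
    reach = [[dist(f(points[i]), points[j]) < eps for j in range(n)]
             for i in range(n)]
    # Warshall's algorithm: close the relation under concatenation of chains.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    recurrent = [i for i in range(n) if reach[i][i]]
    classes, seen = [], set()
    for i in recurrent:
        if i in seen:
            continue
        cls = [j for j in recurrent if reach[i][j] and reach[j][i]]
        seen.update(cls)
        classes.append([points[j] for j in cls])
    return classes

sample = [0.0, 0.5, 1.0]
square = lambda x: x * x
gap = lambda x, y: abs(x - y)
print(eps_chain_classes(sample, square, gap, eps=0.3))
print(eps_chain_classes(sample, square, gap, eps=0.2))
```

In this toy example the point 0.5 is chain recurrent at scale 0.3 but drops out at scale 0.2, leaving only the two fixed points, which matches the idea that the chain recurrent set is what survives for every \varepsilon.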

We say a compact {f}-invariant set {\Lambda} is weakly transitive if for all {x, y \in \Lambda}, we have {x \prec y}. A set {\Lambda} is maximally weakly transitive if it is maximal under the partial order {\subseteq} among the collection of weakly transitive sets.

Since the closure of an increasing union of weakly transitive sets is weakly transitive, Zorn’s lemma implies any weakly transitive set is contained in a maximally weakly transitive set. In the case where the relation {\prec_f} is transitive (which is a generic property), the maximally weakly transitive sets are the equivalence classes of the symmetrized relation induced by {\prec} on the set {\Omega(f)}. Hence we obtain, for generic diffeomorphisms:

Corollary: There exists a residual set {\mathcal{G}} in {\mbox{Diff}^1(M)} such that for all {f \in \mathcal{G}} the chain recurrence classes are exactly the maximally weakly transitive sets of {f}.

The result of Conley (see posts on fundamental theorem of dynamical systems) on the decomposition of {\mathcal{R}(f)} into chain recurrence classes will therefore apply (for generic diffeomorphisms) to the decomposition of {\Omega(f)} into maximally weakly transitive sets.

{\cdots}

3. Chain recurrence classes and periodic orbits

Recall that since the establishment of the closing lemma by C. Pugh (see the closing lemma post), it is known that periodic points are dense in {\Omega(f)} for generic diffeomorphisms. We would like to use these periodic orbits to better understand the dynamics of chain recurrence classes.

Recall that the homoclinic class {H(p, f)} of a hyperbolic periodic point {p} is the closure of all transversal intersection points of its stable and unstable manifolds. This set is by construction transitive. As we have seen in section 1.2, the results of [CMP] imply that, for generic diffeomorphisms, any homoclinic class is maximally weakly transitive. By applying corollary 1.4, we see that:

Remark: For generic diffeomorphisms homoclinic classes are also chain recurrence classes.

However, for generic diffeomorphisms, there are chain recurrence classes which are not homoclinic classes and therefore contain no periodic orbit. We call such a chain recurrence class with no periodic points an aperiodic class.

Corollary: There exists a residual set {\mathcal{G}} in {\mbox{Diff}^1(M)} such that for all {f \in \mathcal{G}}, any connected component with empty interior of {\Omega(f) = \mathcal{R}(f)} is periodic and its orbit is a homoclinic class.

The closing lemma of Pugh and Remark 5 show:

Remark: For generic {f}, any isolated chain recurrence class in {\mathcal{R}(f)} is a homoclinic class. In particular this applies to classes that are topological attractors or repellers.

{\cdots}

For non-isolated classes, a recent work (see [Cr]) specifies how a chain recurrence class is approximated by periodic orbits:

Theorem: There exists a residual set {\mathcal{G}} in {\mbox{Diff}^1(M)} such that for all {f \in \mathcal{G}}, all maximally weakly transitive sets of {f} are Hausdorff limits of sequences of periodic orbits.

More generally, chain recurrence classes satisfy the upper semi-continuity property: if {(x_n) \subseteq \mathcal{R}(f)} is a sequence of points converging to a point {x} then for large enough {n}, the class of {x_n} is contained in an arbitrarily small neighborhood of the class of {x}.

C^1 vs. C^1 volume preserving

One of the things I’ve always been interested in is, for a given compact set say in \mathbb{R}^n, what maps defined on the set into \mathbb{R}^n can be extended to a volume preserving map (of certain regularity) on a larger set (for example, some open set containing the original set).

The analogous extension question without requiring the extended map to be volume preserving is answered by the famous Whitney’s extension theorem. It gives a beautiful necessary and sufficient condition for when the map has a C^r extension. See this previous post for more details.

A simple case of this type of question was discussed in my earlier Moser’s theorem post:

Question: Given a diffeomorphism on the circle, when can we extend it to a volume preserving diffeomorphism on the disc?

In the post, we showed that any C^r diffeomorphism on the circle can be extended to a C^{r-1} volume preserving diffeomorphism on the disc. Some time later Amie Wilkinson pointed out to me that, by using generating function methods, one can in fact avoid losing a derivative and extend it to a C^r volume preserving diffeomorphism.

Anyways, so we know the answer for the circle; what about sets that look very different from the circle? Is it true that whenever we can C^r extend the map, we can also do it in a volume-preserving way? (Of course we need to rule out trivial cases, such as when the map is already not volume-preserving on the original set or the map sends, say, a larger circle to a smaller circle.)

Question: Is it true that for any compact set K \subseteq \mathbb{R}^n with connected complement and any function f: K \rightarrow \mathbb{R}^n satisfying the Whitney condition with all candidate derivatives having determinant 1, one can always extend f to a volume preserving F: \mathbb{R}^n \rightarrow \mathbb{R}^n?

Note: requiring the set to have connected complement is to avoid the ‘larger circle to smaller circle’ case, and if some candidate derivative does not have determinant 1 then the extended map cannot possibly be volume preserving near the point.

After thinking about this for a little bit, we (me, Charles and Amie) came up with the following simple example where the map admits a C^1 extension but no volume preserving C^1 extension.

Example: Let K \subset \mathbb{R}^2 be the countable union of segments:

K = \{0, 1, 1/2, 1/3, \cdots \} \times [0,1]

As shown below:

Define f: K \rightarrow K to be the map that sends the vertical segment above 1/n to the vertical segment above 1/(n+1), preserves the y-coordinate and fixes the segment \{0\} \times [0,1]:

Claim: f can be extended to a C^1 map F: \mathbb{R}^2 \rightarrow \mathbb{R}^2.

Proof: Define g: \mathbb{R} \rightarrow \mathbb{R} s.t.

1) g is the identity on \mathbb{R}^{\leq 0}

2) g(x) = x-1/2 for x>1

3) g: 1/n \mapsto 1/(n+1)

4) g is increasing and differentiable on each [1/n, 1/(n-1)] with derivative no less than (1-1/n)(n^2-n)/(n^2+n) and the one sided derivative at the endpoints being 1.

It’s easy to check such g exists and is continuous:

Since \lim_{n \rightarrow \infty}  (1-1/n)(n^2-n)/(n^2+n) = 1, we deduce g is continuously differentiable with derivative 1 at 0.

Let F = g \otimes \mbox{id}, F: \mathbb{R}^2 \rightarrow \mathbb{R}^2 is a C^1 extension of f.

This establishes the claim.
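As a sanity check on condition 4) in the proof, here is a short sketch in exact rational arithmetic. It compares the lower bound on g' with the average slope that g is forced to have on [1/n, 1/(n-1)] (since g maps this interval onto [1/(n+1), 1/n]). The function names are mine; the check only confirms that the bound is compatible with the prescribed values and tends to 1, it does not construct g.

```python
from fractions import Fraction

def mean_slope(n):
    """Average slope of g on [1/n, 1/(n-1)], which g maps onto [1/(n+1), 1/n]."""
    source = Fraction(1, n - 1) - Fraction(1, n)
    target = Fraction(1, n) - Fraction(1, n + 1)
    return target / source

def lower_bound(n):
    """The bound (1 - 1/n)(n^2 - n)/(n^2 + n) required of g' in condition 4)."""
    return (1 - Fraction(1, n)) * Fraction(n * n - n, n * n + n)

for n in (2, 10, 100, 1000):
    print(n, float(lower_bound(n)), float(mean_slope(n)))
```

The average slope simplifies to (n-1)/(n+1), which sits strictly above the required lower bound for every n \geq 2, and both tend to 1 as n grows.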

Hence the pair (K, f) satisfies the Whitney condition for extending to a C^1 map. Furthermore, since the F above has derivative equal to the identity matrix at all points of K, the determinants of the candidate derivatives are uniformly 1. In other words, this example satisfies all conditions in the question.

Claim: f cannot be extended to a C^1 volume preserving diffeomorphism of the plane.

Proof: The idea here is to look at rectangles with sides on the set K. If F preserves area, they have to go to regions enclosing the same area as the original rectangles. We then apply the isoperimetric inequality to deduce that the images of some edges of the rectangle would need to be very long, hence at some point on the edge the derivative of F would need to be large.

Suppose such an extension F exists, and consider the rectangle R_n = [1/n, 1/(n-1)] \times [0,1]. We have

m_2(R_n) = 1/(n^2-n)

m_2(R_n) - m_2(R_{n+1})

=1/(n^2-n)-1/(n^2+n)=2/(n^3-n)

Hence in order for F(R_n) to have the same area as R_n, the image of the two segments

s_{n,0} = [1/n, 1/(n-1)] \times \{ 0\} and

s_{n,1}= [1/n, 1/(n-1)] \times \{ 1\}

would need to enclose an area of 2/(n^3-n) \sim n^{-3} outside of the rectangle R_{n+1}.
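The area bookkeeping above is easy to confirm with exact fractions. The sketch below (with function names of my choosing) recomputes m_2(R_n) and the gap m_2(R_n) - m_2(R_{n+1}) directly from the endpoints of the intervals.

```python
from fractions import Fraction

def area_R(n):
    """Area of R_n = [1/n, 1/(n-1)] x [0, 1]."""
    return Fraction(1, n - 1) - Fraction(1, n)

def area_gap(n):
    """Extra area that the image of R_n must pick up outside R_{n+1}."""
    return area_R(n) - area_R(n + 1)

for n in (2, 3, 10):
    print(n, area_R(n), area_gap(n))
```

For instance, R_2 has area 1/2 and R_3 has area 1/6, so the gap for n = 2 is 1/3, in agreement with 2/(n^3-n).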

By the isoperimetric inequality, the sum of the lengths of the two curves must be at least \sim n^{-3/2}, while the total length of the original segments is 2/(n^2-n) \sim n^{-2}.

Hence somewhere on the segments F needs to have derivative having norm at least

\ell(F(s_{n,0} \cup s_{n,1})) / \ell(s_{n,0} \cup s_{n,1})

\sim n^{-3/2}/n^{-2} = n^{1/2}

We deduce that there exists a sequence of points (p_n) converging to either (0,0) or (0,1) where

|| F'(p_n) || \sim n^{1/2} \rightarrow \infty.

Hence F cannot be C^1 at the limit point of (p_n).
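The scaling in this proof can be checked numerically. In the sketch below I drop the constant in the isoperimetric inequality and simply use the square root of the extra area as the order of magnitude of the forced curve length, so the numbers are only meaningful up to a constant factor. The function names are hypothetical.

```python
import math

def needed_length(n):
    """Order of magnitude (sqrt of the extra area) of the length forced on the image curves."""
    extra_area = 2 / (n**3 - n)
    return math.sqrt(extra_area)

def original_length(n):
    """Total length of the two horizontal edges s_{n,0} and s_{n,1}."""
    return 2 / (n**2 - n)

def stretch_ratio(n):
    """Lower bound, up to a constant, on the norm of DF somewhere on the edges."""
    return needed_length(n) / original_length(n)

for n in (10, 100, 1000, 10000):
    print(n, stretch_ratio(n), stretch_ratio(n) / math.sqrt(n))
```

The last column should settle near a constant, reflecting growth of order n^{1/2}; equivalently, multiplying n by 4 should roughly double the ratio.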

Remark: In fact we have shown the stronger statement that no volume preserving Lipschitz extension can exist, and obtained an upper bound of 1/2 on the best possible Hölder exponent.

From this we know the answer to the above question is negative, i.e. not every C^1 extendable map can be extended in a volume preserving fashion. It would be very interesting to give criteria for which maps on which sets can be extended. By applying the same methods we are also able to produce an example where the set K is a Cantor set in the plane.

On C^1 closing lemma

Let f: M \rightarrow M be a diffeomorphism. A point p is non-wandering if for every neighborhood U of p, there is an increasing sequence (n_k) \subseteq \mathbb{N} with U \cap f^{n_k}(U) \neq \emptyset. We write \mathcal{NW}(f) for the set of non-wandering points.

Closing lemma: For any diffeomorphism f: M \rightarrow M, any p \in \mathcal{NW}(f) and any \varepsilon>0, there exists a diffeomorphism g s.t. ||f-g||_{C^1} < \varepsilon and g^N(p) = p for some N \in \mathbb{N}.

Suppose p \in \mathcal{NW}(f) and \overline{\mathcal{O}(p)} is compact. Then for any \varepsilon>0, there exist x_0 \in B(p, \varepsilon) and k \in \mathbb{N} s.t. f^k(x_0) \in B(p, \varepsilon).

First we apply a selection process to pick an appropriate almost-orbit for the closing. Set x_i = f^i(x_0), \ 0 \leq i \leq k.

If there exists 0 < j < k where

\min \{ d(x_0, x_j), d(x_j, x_k) \} < \sqrt{\frac{2}{3}}d(x_0, x_k)

then we replace the original finite sequence by (x_0, x_1, \cdots, x_j) or (x_j, \cdots, x_k), according to which of the two distances is the small one. Iterate the above process. Since the sequence is at least one term shorter after each shortening, the process stops in finite time. We obtain a final sequence (p_0, \cdots, p_n) s.t. for all 0 < i < n,

\min \{ d(p_0, p_i), d(p_i, p_n) \} \geq \sqrt{\frac{2}{3}}d(p_0, p_n).
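The selection process can be sketched as follows. This is my reading of the procedure: when an interior point x_j is close to x_0 we keep (x_0, \cdots, x_j), and when it is close to x_k we keep (x_j, \cdots, x_k), so that the new endpoints are closer together than the old ones. The function name select_segment and the sample points are hypothetical.

```python
import math

RATIO = math.sqrt(2 / 3)

def select_segment(points, dist):
    """Shorten (x_0, ..., x_k) until no interior point is close to an endpoint."""
    seq = list(points)
    while True:
        d_end = dist(seq[0], seq[-1])
        for j in range(1, len(seq) - 1):
            if dist(seq[0], seq[j]) < RATIO * d_end:
                seq = seq[: j + 1]   # keep (x_0, ..., x_j)
                break
            if dist(seq[j], seq[-1]) < RATIO * d_end:
                seq = seq[j:]        # keep (x_j, ..., x_k)
                break
        else:
            # No interior point violates the condition: we are done.
            return seq

orbit_piece = [(0.0, 0.0), (5.0, 0.0), (0.1, 0.0), (1.0, 0.0)]
print(select_segment(orbit_piece, math.dist))
```

Each pass through the loop either returns or strictly shortens the sequence, which is the termination argument given in the text. In the sample, the third point is within \sqrt{2/3} of the endpoint distance from x_0, so the last point is dropped and the remaining three points satisfy the final inequality.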

The process is applied at most k times. Since x_0, x_k \in B(p, \varepsilon), after the first shortening the new endpoint x_{i_1} satisfies d(p, x_{i_1}) \leq \max \{d(p, x_0), d(p, x_k) \} + \sqrt{\frac{2}{3}}d(x_0, x_k) \leq \varepsilon +  2 \sqrt{\frac{2}{3}} \varepsilon.

i.e. both the initial and final terms of the sequence are within distance (\frac{1}{2}+ \sqrt{\frac{2}{3}}) 2 \varepsilon of p. Along the same lines, after the i-th shortening, the distance from the initial and final terms of the sequence to p is at most (\frac{1}{2} + \sqrt{\frac{2}{3}} + (\sqrt{\frac{2}{3}})^2 + \cdots + (\sqrt{\frac{2}{3}})^i) 2 \varepsilon. Hence for the final sequence p_0, p_n \in B(p, (1+2 \sqrt{\frac{2}{3}}/(1-\sqrt{\frac{2}{3}})) \varepsilon) \subseteq B(p, 10 \varepsilon).
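The constant 10 comes from summing a geometric series with ratio \sqrt{2/3}, and it is a fairly tight fit. The sketch below evaluates the partial sums and the limit; the names endpoint_bound and limit are mine.

```python
import math

R = math.sqrt(2 / 3)

def endpoint_bound(i):
    """Coefficient c_i with d(p, endpoints) <= c_i * eps after i shortenings."""
    return 2 * (0.5 + sum(R**m for m in range(1, i + 1)))

# Value of the full geometric series: 1 + 2R/(1 - R).
limit = 1 + 2 * R / (1 - R)

for i in (0, 1, 5, 20, 50):
    print(i, endpoint_bound(i))
print("limit:", limit)
```

Rationalizing the denominator gives the closed form 5 + 2\sqrt{6}, roughly 9.9, so the ball B(p, 10 \varepsilon) does contain the endpoints no matter how many shortenings occur.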

There is a rectangle R \subseteq M where p_0, p_n \in \sqrt{\frac{3}{4}}R (i.e. R shrunk by a factor of \sqrt{\frac{3}{4}} w.r.t. the center) and for all 0 < i < n, \ p_i \notin R.

Next, we perturb f in R, i.e. find h: M \rightarrow M with ||h - id||_{C^1} < \delta and h|_{M \backslash R} = id. Hence ||h \circ f - f ||_{C^1} < \delta.

Suppose R = I_1 \times I_2, where L_1, L_2 are the lengths of I_1, I_2 and L_1 < L_2.
By the mean value theorem, for all x \in M, \ d(x, h(x)) < \delta L_1.
On the other hand, since p_0 \in \sqrt{\frac{3}{4}}R, it's at least \frac{1}{2}(1-\sqrt{\frac{3}{4}})L_1 away from the boundary of R. Hence there exists a bump function h satisfying the above condition and d(p_0, h(p_0)) > \frac{\delta}{8}(1-\sqrt{\frac{3}{4}})L_1.

Hence in order to move a point by a distance L_1, we need about 1/ \delta such bump functions, to move a distance L_2, we need about \frac{L_2}{\delta L_1} bumps.

For simplicity, we now suppose M is a surface. By starting with an \varepsilon (and hence R) very small, we have for all 0 \leq i \leq N+M (the integers N and M are chosen below), \ f^i(R) is contained in a small neighborhood of p_i. Hence on R, f^i is C^1 close to the linear map x \mapsto p_i + Df^i(p_0)(x-p_0). Hence mod some details we may reduce to the case where f is linear in a neighborhood of \mathcal{O}(p_0).

By choosing an appropriate coordinate system in R, we can have f preserving the horizontal and vertical foliations, with the horizontal vectors eventually growing more rapidly than the vertical vectors.

It turns out to be possible to choose R to be long and thin such that for all i \leq 40 / \delta, f^i(R) has height greater than width. (Note that M = \lfloor 40/ \delta \rfloor bumps will be able to move the point by a distance equal to the width of the original rectangle R.) Since horizontal vectors eventually grow more rapidly than the vertical vectors, there exists N s.t. for all N \leq i \leq N+M, f^i(R) has width greater than its height.
For small enough \varepsilon, the boxes f^i(R) are disjoint for 0 \leq i \leq N+40/ \delta. Construct h to be the identity outside of

\displaystyle \bigsqcup_{i=0}^M f^i(R) \sqcup \bigsqcup_{i=N}^{N + M} f^i(R)

For the first M boxes, we let h preserve the horizontal foliation and move along the width so that g = h \circ f has the property that g^M(p_n) lies on the same vertical fiber as f^M(p_0).

On the boxes f^{N+i}(R), \ 0 \leq i \leq M, we let h push along the vertical direction so that

g^{N+M}(p_n) = f^{N+M}(p_0)

Since iterates of the rectangle are disjoint, for N+M \leq i \leq n, \ h(p_i) = p_i, g(p_i) = f(p_i).

Hence g^n(p_n) = g^{n-(N+M)} \circ g^{N+M}(p_n) = g^{n-(N+M)} f^{N+M}(p_0) = g^{n-(N+M)} (p_{N+M}) = p_n.

Therefore we have obtained a periodic point p_n of g.

Since p_n \in B(p, 10 \varepsilon), we may further perturb g to move p_n to p. This takes care of the linear case on surfaces.