Two-sided bounds for degenerate processes with densities supported in subsets of R^N

We obtain two-sided bounds for the density of stochastic processes satisfying a weak Hörmander condition. In particular we consider the cases when the support of the density is not the whole space and when the density has various asymptotic regimes depending on the starting/final points considered (these regimes are also related to the number of brackets needed to span the space in Hörmander's theorem). The proofs of our lower bounds are based on Harnack inequalities for positive solutions of PDEs, whereas the upper bounds derive from the probabilistic representation of the density given by the Malliavin calculus.


Introduction
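We present a methodology to derive two-sided bounds for the density of some $\mathbb{R}^N$-valued degenerate processes of the form
\[ X_t = x + \sum_{i=1}^n \int_0^t Y_i(X_s) \circ dW_s^i + \int_0^t Y_0(X_s)\, ds, \tag{1.1} \]
where the $(Y_i)_{i \in [\![0,n]\!]}$ are smooth vector fields defined on $\mathbb{R}^N$ and $((W_t^i)_{t \ge 0})_{i \in [\![1,n]\!]}$ stand for $n$ standard one-dimensional independent Brownian motions defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ satisfying the usual conditions. Also, $\circ\, dW_t$ denotes the Stratonovich integral. The above stochastic differential equation is associated to the Kolmogorov operator
\[ \mathcal{L} = \frac{1}{2} \sum_{i=1}^n Y_i^2 + Z, \qquad Z := Y_0 - \partial_t. \tag{1.2} \]
We assume that the Hörmander condition holds:

[H] $\mathrm{Rank}(\mathrm{Lie}\{Y_1, \cdots, Y_n, Z\}(x)) = N + 1$, $\forall x \in \mathbb{R}^N$.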
We will particularly focus on processes satisfying a weak Hörmander condition, that is $\mathrm{Rank}(\mathrm{Lie}\{Y_1, \cdots, Y_n, -\partial_t\}(x)) < N + 1$, $\forall x \in \mathbb{R}^N$. This means that the first order vector field $Y_0$ (or equivalently the drift term of the SDE) is needed to span all the directions.
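As leading examples we have in mind processes of the form
\[ X_t^i = x_i + W_t^i, \ \forall i \in [\![1,n]\!], \qquad X_t^{n+1} = x_{n+1} + \int_0^t |X_s^{1,n}|^k\, ds, \tag{1.3} \]
where $X_s^{1,n} = (X_s^1, \cdots, X_s^n)$ (and correspondingly for every $x \in \mathbb{R}^{n+1}$, $x_{1,n} := (x_1, \cdots, x_n)$), $k$ is any even positive integer and $|\cdot|$ denotes the Euclidean norm of $\mathbb{R}^n$. Note that we only consider even exponents in (1.3) in order to keep $Y_0$ smooth. Our approach also applies to
\[ X_t^i = x_i + W_t^i, \ \forall i \in [\![1,n]\!], \qquad X_t^{n+1} = x_{n+1} + \int_0^t \sum_{i=1}^n (X_s^i)^k\, ds, \tag{1.4} \]
for any given positive integer $k$.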
It is easily seen that the above class of processes satisfies the weak Hörmander condition. Also, for equation (1.3), the density $p(t, x, \cdot)$ of $X_t$ is supported on $\mathbb{R}^n \times (x_{n+1}, +\infty)$ for any $t > 0$. Analogously, for equation (1.4), the support of $p(t, x, \cdot)$ is $\mathbb{R}^{n+1}$ when $k$ is odd and $\mathbb{R}^n \times (x_{n+1}, +\infty)$ when $k$ is even.
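The bracket count behind the weak Hörmander condition can be made explicit in the simplest case. The following is a sketch for $n = 1$ (so $N = 2$), under the assumption that the vector fields read off from the examples are $Y_1 = \partial_{x_1}$ and $Y_0 = x_1^k \partial_{x_2}$; this identification is our reading of the examples rather than a formula quoted from the text.

```latex
% Sketch for n = 1, assuming Y_1 = \partial_{x_1}, Y_0 = x_1^k \partial_{x_2}
% and Z = Y_0 - \partial_t. Since Y_1 commutes with \partial_t, [Y_1, Z] = [Y_1, Y_0].
\begin{align*}
[Y_1, Y_0] &= k\, x_1^{k-1}\, \partial_{x_2}, \\
\mathrm{ad}_{Y_1}^{j}\, Y_0 &= \frac{k!}{(k-j)!}\, x_1^{k-j}\, \partial_{x_2},
  \qquad j \in [\![1,k]\!], \\
\mathrm{ad}_{Y_1}^{k}\, Y_0 &= k!\, \partial_{x_2}.
\end{align*}
% Y_1 and -\partial_t alone only generate \partial_{x_1} and \partial_t
% (rank 2 < N + 1 = 3), so the drift is needed to span \partial_{x_2}.
% Where x_1 \neq 0 a single bracket [Y_1, Y_0] already gives \partial_{x_2};
% on \{x_1 = 0\} all brackets of order j < k vanish and exactly k are needed.
```

This is consistent with the dichotomy discussed later between points where one bracket suffices and points where $k$ brackets are required.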
Let us now briefly recall some known results concerning these two examples. First of all, for $k = 1$, equation (1.4) defines a Gaussian process. The explicit expression of the density goes back to Kolmogorov [25] and writes for all $t > 0$, $x, \xi \in \mathbb{R}^{n+1}$: We already observe the two time scales associated respectively to the Brownian motion (of order $t^{1/2}$) and to its integral (of order $t^{3/2}$) which give the global diagonal decay of order $t^{n/2+3/2}$. The additional term $\frac{x_1 + \xi_1}{2}\, t$ in the above estimate is due to the transport of the initial condition by the unbounded drift. We also refer to the works of Cinti and Polidoro [17] and Delarue and Menozzi [19] for similar estimates in the more general framework of variable coefficients, including nonlinear drift terms with linear growth.
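To make the two time scales concrete, here is a worked computation of the Gaussian density in the scalar case $n = 1$, $k = 1$, that is for $X_t^1 = x_1 + W_t$, $X_t^2 = x_2 + \int_0^t X_s^1\, ds$. It is a sketch derived directly from the mean and covariance of the pair; the normalisation used in the general-$n$ formula (1.5) of the text may be written differently.

```latex
% Sketch for n = 1, k = 1. The pair (X_t^1, X_t^2) is Gaussian with
\[
  \text{mean } \bigl(x_1,\; x_2 + x_1 t\bigr), \qquad
  \Sigma_t = \begin{pmatrix} t & t^2/2 \\ t^2/2 & t^3/3 \end{pmatrix},
  \qquad \det \Sigma_t = \frac{t^4}{12}.
\]
% Inverting \Sigma_t and completing the square in the second variable gives
\[
  p(t, x, \xi) = \frac{\sqrt{3}}{\pi t^2}
  \exp\!\left( - \frac{|\xi_1 - x_1|^2}{2t}
  - \frac{6}{t^3} \Bigl| \xi_2 - x_2 - \frac{x_1 + \xi_1}{2}\, t \Bigr|^2 \right).
\]
% The prefactor t^{-2} = t^{-(n/2 + 3/2)} for n = 1 combines the scales
% t^{1/2} (Brownian motion) and t^{3/2} (its time integral).
```

The transport term $\frac{x_1+\xi_1}{2}\, t$ appears when the cross term of the quadratic form is absorbed by recentring the second component.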
For equation (1.3) and k = 2, n = 1, a representation of the density of X t has been obtained from the seminal works of Kac on the Laplace transform of the integral of the square of the Brownian motion [23]. We can refer to the monograph of Borodin and Salminen [10] for an explicit expression in terms of special functions. We can also mention the work of Tolmatz [39] concerning the distribution function of the square of the Brownian bridge already characterized in the early work of Smirnov [36]. Anyhow, all these explicit representations are very much linked to Liouville type problems and this approach can hardly be extended to higher dimensions for the underlying Brownian motion. Also, it seems difficult from the expressions of [10] to derive explicit quantitative bounds on the density.
Some related examples have been addressed by Ben Arous and Léandre [5], who obtained asymptotic expansions for the density on the diagonal for the process $X_t^1 = x_1 + W_t^1$, $X_t^2 = x_2 + \int_0^t (X_s^1)^m\, dW_s^2 + \int_0^t (X_s^1)^k\, ds$. Various asymptotic regimes are deduced depending on $m$ and $k$. However, the strong Hörmander condition is really required in their approach, i.e. the stochastic integral is needed in $X^2$.
From the applicative point of view, equations with quadratic growth naturally appear in some turbulence models, see e.g. the chapter concerning the dyadic model in Flandoli [20]. This model is derived from the formulation of the Euler equations on the torus in Fourier series after a simplification consisting in considering a nearest neighbour interaction in the wave space. This operation leads to consider an infinite system of differential equations whose coefficients have quadratic growth. In order to obtain some uniqueness properties, a Brownian noise is usually added on each component. In the current work, we investigate from a quantitative viewpoint what can be said for a drastic reduction of this simplified model, that is when considering 2 equations only, when the noise only acts on one component and is transmitted through the system thanks to the (weak) Hörmander condition.
Our approach to derive two-sided estimates for the above examples is the following. The lower bounds are obtained using local Harnack estimates for positive solutions of $\mathcal{L} u = 0$ with $\mathcal{L}$ defined in (1.2). Once the Harnack inequality is established, the lower bound for $p(t, x, \xi)$ is derived by applying it recursively along a suitable path joining $x$ to $\xi$ in time $t$. The set of points of the path to which the Harnack inequality is applied is commonly called a Harnack chain. For $k = 1$ in (1.4) the path can be chosen as the solution to the deterministic controllability problem associated to (1.4), that is taking the points of the Harnack chain along the path $\gamma$ where and $\omega : L^2([0, t]) \to \mathbb{R}^n$ achieves the minimum of $\int_0^t |\omega(s)|^2\, ds$, see e.g. Boscain and Polidoro [11], Carciola et al. [13] and Delarue and Menozzi [19].
In the more general case $k > 1$ it is known that uniqueness fails for the associated control problem, i.e. when $\gamma_{n+1}'(s) = \sum_{i=1}^n (\gamma_i(s))^k$ in the above equation (see e.g. Trélat [40]). Therefore, there is not a single natural choice for the path $\gamma$. Actually, we will consider suitable paths in order to derive homogeneous two-sided bounds. After the statement of our main results, we will see in Remark 2.3 that the paths we consider allow us to obtain a cost similar to the one found in [40] for the abnormal extremals of the value function associated to the control problem.
In any case, the crucial point in this approach is to obtain a Harnack inequality invariant w.r.t. scale and translation. Introducing for all $(m, x) \in \mathbb{N}^* \times \mathbb{R}^N$, $V_m(x) := \{[Y_{i_1}, [Y_{i_2}, \cdots, [Y_{i_{m-1}}, Y_{i_m}] \cdots ]](x)\}_{(i_1, \cdots, i_m) \in [\![0,n]\!]^m}$, the above invariance properties imply that $\dim(\mathrm{Span}\{V_m(x)\})$ does not depend on $x$ for any $m$. This property fails for $k > 1$ since we need exactly $k$ brackets to span the space at $x = (0_{1,n}, x_{n+1})$ and exactly one bracket elsewhere. Hence, we need to consider a lifting procedure of $\mathcal{L}$ in (1.2) introduced by Rothschild and Stein [35] (see also Bonfiglioli and Lanconelli [6]). Our strategy then consists in obtaining an invariant Harnack inequality for the lifted operator $\widetilde{\mathcal{L}}$. We then conclude by applying the previous Harnack inequality to $\mathcal{L}$-harmonic functions (which are also $\widetilde{\mathcal{L}}$-harmonic). A first attempt to carry out the whole procedure to derive a lower bound for (1.4) and odd $k$ can be found in Cinti and Polidoro [16].
Concerning the upper bounds, we rely on the representation of the density $p$ obtained by the Malliavin calculus. We refer to Nualart [33] for a comprehensive treatment of this subject. The main issues then consist in controlling the tails of the random variables at hand and the $L^p$ norm of the Malliavin covariance matrix for $p \ge 1$. The tails can be controlled thanks to some fine properties of the Brownian motion or bridge and its local time. The behavior of the Malliavin covariance matrix has to be carefully analyzed by introducing a dichotomy between the case in which the final and starting points of the Brownian motion in (1.3)-(1.4) are close to zero w.r.t. the characteristic time-scale, i.e. $|x_{1,n}| \vee |\xi_{1,n}| \le K t^{1/2}$ for a given $K > 0$, which means that the non-degenerate component is in diagonal regime, and the complementary set. In the first case, we will see that the characteristic time scales of the systems (1.3), (1.4) and the probabilistic approach to the proof of Hörmander's theorem, see e.g. Norris [31], lead to the expected bound on the Malliavin covariance matrix, whereas in the second case a more subtle analysis is required in order to derive a diagonal behavior of the density similar to the Gaussian case (1.5). Intuitively, when the magnitude of either the starting or the final point of the Brownian motion is above the characteristic time-scale, only one bracket is needed to span the space and the Gaussian regime prevails in small time.
Note that our procedure can be split into two steps. In the first one, purely PDE methods provide us with lower bounds for the density $p$. In this part, useful information about its asymptotic behavior in various regimes is obtained by elementary arguments. Once the lower bounds have been established, we rely on some ad hoc tools of the Malliavin calculus to prove the analogous upper bounds. However, to improve the readability of our work, we reverse the order of exposition: we first prove the upper bounds, as well as the diagonal ones, by probabilistic methods, then we prove the lower bounds by PDE arguments.
The article is organized as follows. We state our main results in Section 2. We then recall some basic facts of Malliavin calculus in Section 3 and obtain the upper bounds as well as a diagonal lower bound in Gaussian regime in Section 4. In Section 5, we recall some aspects of abstract potential theory needed to derive the invariant Harnack inequality. We also give a geometric characterization of the set where the inequality holds. Section 6 is devoted to the proof of the lower bounds.

Main Results
Before giving the precise statement of our bounds for the density $p$ of $X$ in (1.3) or (1.4), we give some remarks. In the sequel $p(t, x, \cdot)$ stands for the density of any stochastic process $X$ at time $t$ starting from $x$. It is well known that, if the vector fields $Y_1, \ldots, Y_n$ (note that the drift term $Y_0$ does not appear) satisfy the Hörmander condition, then the following two-sided bound holds: Here and in the sequel, for measurable functions $g : \mathbb{R}^{+*} \times \mathbb{R}^n \to \mathbb{R}$, $h : \mathbb{R}^{+*} \times \mathbb{R}^{2n} \to \mathbb{R}$, the above notation $p(t, x, \xi) \asymp \frac{1}{g(t,x)} \exp(-h(t, x, \xi))$ means that there exists a constant $C \ge 1$ s.t.
Moreover, in (2.1), $d_Y$ denotes the Carnot-Carathéodory distance associated to $Y_1, \ldots, Y_n$, and $B_Y(x, r)$ is the relevant metric ball, with center at $x$ and radius $r$. On the other hand (1.5) shows that, when the drift term $Y_0$ is needed to check the Hörmander condition, the density $p$ of the process $X$ does not satisfy (2.1). In this article we prove that, when considering processes (1.3) and (1.4) with $k > 1$, different asymptotic behaviors appear as $|x| \to +\infty$.
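For orientation, the two-sided notation and the classical bound it refers to can be written out as follows. This is a sketch based on the standard form of the estimate under the strong Hörmander condition, with $g(t,x) = |B_Y(x, t^{1/2})|$ and $h(t,x,\xi) = d_Y^2(x,\xi)/t$; the exact placement of the constant $C$ in the text's definition is an assumption.

```latex
% Assumed meaning of the two-sided notation (C >= 1 independent of t, x, xi):
\[
  \frac{C^{-1}}{g(t,x)} \exp\bigl(-C\, h(t,x,\xi)\bigr)
  \;\le\; p(t,x,\xi) \;\le\;
  \frac{C}{g(t,x)} \exp\bigl(-C^{-1} h(t,x,\xi)\bigr).
\]
% Standard form of the bound under the strong Hormander condition:
\[
  p(t,x,\xi) \asymp \frac{1}{|B_Y(x, t^{1/2})|}
  \exp\!\left( - \frac{d_Y^2(x,\xi)}{t} \right).
\]
```

In the Gaussian example with $k = 1$, the transport term inside the exponential is what prevents a bound of this purely metric form.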
To be more specific, we first remark that a behavior similar to (1.5) can also be observed for equations (1.3) and (1.4). Conditioning w.r.t. the non-degenerate component we get ) is the usual Gaussian density. For the sake of simplicity, we next focus on the case $n = 1$ and $k = 2$ so that (1.3) and (1.4) coincide. Moreover we assume $x_1 = \xi_1$. This leads to estimate the density of: Thus, when $|x_1|$ is sufficiently big w.r.t. the characteristic time scale $t^{1/2}$, the Gaussian random variable dominates in terms of fluctuation order w.r.t. the other random contribution, whose variance behaves as $O(t^4)$ (the previous identity in law is derived from Itô's formula and the differential dynamics of the Brownian bridge).

If we additionally assume that $|\xi_2 - x_2 - t x_1^2| \le C |x_1| t^{3/2}$, for some constant $C := C(n = 1, k = 2)$ to be specified later on, that is the deviation from the deterministic system deriving from (1.3), obtained by dropping the Brownian contribution, has the same order as the standard deviation of $G$, we actually find: When $|\xi_2 - x_2 - t x_1^2| > C |x_1| t^{3/2}$, that is when the deviation from the deterministic system exceeds a certain constant times the standard deviation, the term $\int_0^t (W_s^{0,t})^2\, ds$ in (2.3) is no longer negligible and we obtain the following heavy-tailed estimate: The diagonal contribution of the degenerate component corresponds to the intrinsic scale of order $t^2$ of the term $\int_0^t (W_s^{0,t})^2\, ds$. In particular, if $x_{1,n} = 0_{1,n}$ this is the only random variable involved. The off-diagonal bound can be explained by the fact that $\int_0^t (W_s^{0,t})^2\, ds$ belongs to the Wiener chaos of order 2. The tails of the distribution function for such random variables can be characterized, see e.g. Janson [22], and are homogeneous to the non-Gaussian term in the above estimate.
Observe also that the density $p$ is supported on the half space $\{\xi \in \mathbb{R}^2 : \xi_2 > x_2\}$. We also obtain an asymptotic behavior for the density close to the boundary. Precisely, for $0 < \xi_2 - x_2$ sufficiently small w.r.t. the characteristic time-scale $t^2$ of $\int_0^t (W_s^{0,t})^2\, ds$, that is when the deviations of the degenerate component have the same magnitude as those of the highest order random contribution, then .
We summarize the above remarks with the assertion that processes of the form (1.3) or (1.4) do not have a single regime for $k > 1$. The precise statements of the previous density bounds are formulated for general $n$ and $k$ in the following Theorem 2.1.

and respectively. Then, from the PDE point of view, Theorem 2.1 provides us with estimates analogous to those due to Nash, Aronson and Serrin for uniformly parabolic operators.
We next give some comments about our main result. As already pointed out, processes of the form (1.3) or (1.4) no longer have a single regime for $k > 1$. Let us nevertheless specify that when 1, n]], C ≥ 1, then expanding $Y_t$ as in (2.3), we find that all the terms have the same order and thus a global estimate of type (2.4) (resp. of type (2.5)) holds for the upper bound (resp. lower bound) in both cases (1.3) and (1.4). Observe also that in this case (2.4) and (2.5) give the same global diagonal decay of order $t^{(k+n)/2+1}$.

Remark 2.3
As already mentioned in the introduction, for $k = 2$, $n = 1$, we observe from (2.6) that the off-diagonal bound is homogeneous to the asymptotic expansion of the value function associated to the control problem at its abnormal extremals, see Example 4.2 in [40]. The optimal cost is asymptotically equivalent to $\frac{1}{4} \frac{\xi_1^4}{\xi_2}$ when $x = (0, 0)$ as $\xi$ is close to $(0, 0)$.
We then get from (2.6) that there exist c := c(n, k), C := C(n, k, T ) . This estimate can be compared to the exponential decay on the diagonal proved by Ben Arous and Léandre in [5, Theorem 1.1].

Introduction
Introduced at the end of the 70s by Malliavin [30], [29], the stochastic calculus of variations, now known as Malliavin calculus, turned out to be a very fruitful tool. It makes it possible to give probabilistic proofs of the celebrated Hörmander theorem, see e.g. Stroock [37] or Norris [31]. It also provides a quite natural way to derive density estimates for degenerate diffusion processes. The most striking achievement in this direction is the series of papers by Kusuoka and Stroock [26], [27], [28]. However, in those works the authors always considered "strong" Hörmander conditions, that is the underlying space is assumed to be spanned by brackets involving only the vector fields of the diffusive part. We also point out that because of the non-uniqueness associated to the deterministic control problem, the strategy of [19] relying on a stochastic control representation of the density breaks down. For the systems handled in [19], we refer to Bally and Kohatsu-Higa [2] for a Malliavin calculus approach. The Malliavin calculus remains the most robust probabilistic approach to density estimates in the degenerate setting.
We now briefly state some facts and notations concerning the Malliavin calculus that are needed to prove our results. We refer to the monograph of Nualart [32], from which we borrow the notations, or Chapter 5 in Ikeda and Watanabe [21], for further details.

Operators of the Malliavin Calculus
Let us consider an $n$-dimensional Brownian motion $W$ on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ and a given $T > 0$. Define for $h \in L^2(\mathbb{R}^+, \mathbb{R}^n)$, $W(h) = \int_0^T \langle h(s), dW_s \rangle$. We denote by $\mathcal{S}$ the space of simple functionals of the Brownian motion $W$, that is the subspace of $L^2(\Omega, \mathcal{F}, \mathbb{P})$ consisting of real valued random variables $F$ having the form $F = f(W(h_1), \cdots, W(h_m))$ for some $m \in \mathbb{N}$, $h_i \in L^2(\mathbb{R}^+, \mathbb{R}^n)$, and where $f : \mathbb{R}^m \to \mathbb{R}$ stands for a smooth function with polynomial growth.

Malliavin Derivative.
For $F \in \mathcal{S}$, we define the Malliavin derivative $(D_t F)_{t \in [0,T]}$ as the $\mathbb{R}^n$-valued (non adapted) process For any $q \ge 1$, the operator $D : \mathcal{S} \to L^q(\Omega, L^2(0, T))$ is closable. We denote its domain by $\mathbb{D}^{1,q}$, which is actually the completion of $\mathcal{S}$ w.r.t. the norm Writing $D_t^j F$ for the $j$-th component of $D_t F$, we define the $k$-th order derivative as the random vector on $[0, T]^k \times \Omega$ with coordinates: We then denote by $\mathbb{D}^{N,q}$ the completion of $\mathcal{S}$ w.r.t. the norm In the sequel we agree to denote for all $q \ge 1$, $\|F\|_q := E[|F|^q]^{1/q}$.
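For readers who want the definitions at hand, the following sketch records the usual form of the derivative on simple functionals and of the first-order Sobolev norm. It follows the standard conventions of Nualart's monograph, from which the text says it borrows its notation; that the displayed formulas of the text coincide with these is an assumption.

```latex
% Standard definitions, for F = f(W(h_1), ..., W(h_m)) a simple functional:
\[
  D_t F = \sum_{i=1}^m \partial_{x_i} f\bigl(W(h_1), \cdots, W(h_m)\bigr)\, h_i(t),
  \qquad t \in [0,T],
\]
\[
  \|F\|_{1,q} = \Bigl( E\bigl[|F|^q\bigr]
    + E\bigl[\|DF\|_{L^2(0,T)}^q\bigr] \Bigr)^{1/q}.
\]
% Two elementary instances with n = 1:
%   F = W(h)   gives  D_t F = h(t);
%   F = W_T^2  gives  D_t F = 2 W_T \mathbf{1}_{[0,T]}(t).
```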

Skorohod Integral.
We denote by $\mathcal{P}$ the space of simple processes, that is the subspace of for some $m \in \mathbb{N}$, where the $(F_i)_{i \in [\![1,m]\!]}$ are smooth real valued functions with polynomial growth, $\forall i \in [\![1,m]\!]$, $h_i \in L^2([0, T], \mathbb{R}^n)$, so that in particular $F_i(W(h_1), \cdots, W(h_m)) \in \mathcal{S}$.
Observe also that, with the previous definition of the Malliavin derivative, for $F \in \mathcal{S}$ we have $(D_s F)_{s \in [0,T]} \in \mathcal{P}$. For $u \in \mathcal{P}$ we define the Skorohod integral so that in particular $\delta(u) \in \mathcal{S}$. The Skorohod integral is also closable. Its domain writes

Ornstein Uhlenbeck operator.
To state the main tool used in our proofs, i.e. the integration by parts formula in full generality, we need to introduce a last operator. Namely, the Ornstein-Uhlenbeck operator $L$ which for $F \in \mathcal{S}$ writes: This operator is also closable and $\mathbb{D}^\infty$ is included in its domain $\mathrm{Dom}(L)$.
Integration by parts.
Proposition 3.1 (Integration by parts: first version) Let $F \in \mathbb{D}^{1,2}$, $u \in \mathrm{Dom}(\delta)$, then the following identity holds: that is the Skorohod integral $\delta$ is the adjoint of the Malliavin derivative $D$. As a consequence, for $F, G \in \mathrm{Dom}(L)$ we have i.e. $L$ is self-adjoint.
These relations can be easily checked for F, G ∈ S, u ∈ P, and extended to the indicated domains thanks to the closability.
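The two relations of Proposition 3.1 are usually written as below. This is a sketch in the standard convention $L = -\delta D$; the sign convention for the Ornstein-Uhlenbeck operator used in the text is an assumption here, and the self-adjointness statement does not depend on it.

```latex
% Duality between D and \delta, for F in D^{1,2} and u in Dom(\delta):
\[
  E\bigl[\langle DF, u \rangle_{L^2(0,T)}\bigr] = E\bigl[F\, \delta(u)\bigr].
\]
% Self-adjointness of L, for F, G in Dom(L); with the convention L = -\delta D
% both sides equal -E[<DF, DG>_{L^2(0,T)}]:
\[
  E\bigl[F\, LG\bigr] = E\bigl[G\, LF\bigr]
  = - E\bigl[\langle DF, DG \rangle_{L^2(0,T)}\bigr].
\]
```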
We now state a theorem that provides a decomposition of real-valued square-integrable random variables in terms of series of multiple integrals. where for all $m \in \mathbb{N}$, $f_m$ is a symmetric function in $L^2([0, T]^m, (\mathbb{R}^n)^{\otimes m})$ and We refer to Theorem 1.1.2 in Nualart [32] for a proof. The computation of Malliavin derivatives is quite simple for multiple integrals. Indeed, As a consequence, for a random variable $F$ having a decomposition as in (3.1), we have that it belongs to $\mathbb{D}^{1,2}$ if and only if Therefore, when a random variable is smooth in the Malliavin sense, i.e. belongs to $\mathbb{D}^\infty$, the Stroock formula, see [38], provides a representation for the functions $(f_m)_{m \in \mathbb{N}}$ in the chaotic expansion in terms of Malliavin derivatives.
For square integrable processes, a result analogous to Lemma 3.1 also holds.
There exists a sequence of deterministic functions $(g_m)_{m \in \mathbb{N}^*}$ s.t.

Representation of densities through Malliavin calculus
For $F = (F_1, \cdots, F_N) \in (\mathbb{D}^\infty)^N$, we define the Malliavin covariance matrix $\gamma_F$ by Let us now introduce the non-degeneracy condition

[ND] We say that the random vector $F = (F_1, \cdots, F_N)$ satisfies the non-degeneracy condition if $\gamma_F$ is a.s. invertible and $\det(\gamma_F)^{-1} \in \cap_{q \ge 1} L^q(\Omega)$. In the sequel, we denote the inverse of the Malliavin matrix by $\Gamma_F := \gamma_F^{-1}$.

This non-degeneracy condition guarantees the existence of a smooth density, i.e. $C^\infty$, for the random variable $F$, see e.g. Corollary 2.1.2 in [32] or Theorem 9.3 in [21].
The following Proposition will be crucial in the derivation of an explicit representation of the density.
Also, for all $q > 1$, and every multi-index $\alpha$, there exist $(C, q_0, q_1, q_2, r_1, r_2)$ only depending on For the first part of the proposition we refer to Section V-9 of [21]. Concerning equation (3.3), it can be directly derived from the Meyer inequalities on $\|LF\|_q$ and the explicit definition of $H$, see also Proposition 2.4 in Bally and Talay [3]. A crucial consequence of the integration by parts formula is the following representation for the density.
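In the scalar case, which is the one relevant for the degenerate component later on, the density representation and the way it produces bounds can be sketched as follows. The weight $H$ below is written in the standard form $\delta(DF/\gamma_F)$; whether this matches the exact weight defined in the text's Proposition 3.3 is an assumption.

```latex
% Scalar F in D^\infty satisfying [ND], with \gamma_F = \|DF\|_{L^2(0,T)}^2:
\[
  p_F(y) = E\bigl[\mathbf{1}_{\{F > y\}}\, H\bigr],
  \qquad H := \delta\!\left( \frac{DF}{\gamma_F} \right).
\]
% Since E[H] = 0 (H is a Skorohod integral), one also has
\[
  p_F(y) = - E\bigl[\mathbf{1}_{\{F \le y\}}\, H\bigr],
\]
% and the Cauchy-Schwarz inequality gives the two tail-type bounds
\[
  p_F(y) \le \|H\|_2 \; \min\Bigl( P[F > y]^{1/2},\; P[F \le y]^{1/2} \Bigr).
\]
```

This is why the later sections reduce the upper bounds to two ingredients: $L^q$ controls of the weight, and deviation estimates for $P[F > y]$ or $P[F \le y]$.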

Strategy and usual Brownian controls
We here concentrate on the particular case of the process (1.3) (indeed the estimates concerning (1.4) can be derived in a similar way). Since condition [H] is satisfied, assumption [ND] is fulfilled. It then follows from Theorem 2.3.2 in [32] that the process $(X_s)_{s \ge 0}$ admits a smooth density $p(t, x, \cdot)$ at time $t > 0$. Our goal is to derive quantitative estimates on this density, emphasizing as well that we have different regimes depending on the starting/final points.
To do that, we condition w.r.t. the non-degenerate Brownian component for which we explicitly know the density. For all $(t, x, \xi) \in \mathbb{R}^{+*} \times (\mathbb{R}^{n+1})^2$ we have: We then focus on the conditional density which agrees with the one of a smooth functional, in the Malliavin sense, of the Brownian bridge. Precisely: The estimation of $p_{Y_t}$ is the core of the probabilistic part of the current work.
We recall, see e.g. [34], two ways to realize the standard $n$-dimensional Brownian bridge from a standard Brownian motion of $\mathbb{R}^n$. Namely, if $(W_t)_{t \ge 0}$ denotes a standard $n$-dimensional Brownian motion then To recover the framework of Section 3.2, in order to deal with functionals of the Brownian increments, it is easier to consider the realization of the Brownian bridge given by (4.3).
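Two classical realizations of the Brownian bridge from $0$ to $0$ on $[0,t]$ are recalled in the sketch below, together with a variance check. Which of the two carries the label (4.2) and which (4.3) in the text is an assumption; the second one is an adapted stochastic integral of the Brownian increments, which seems to fit the description of (4.3).

```latex
% Two standard realizations of the bridge (W_s^{0,t})_{s \in [0,t]}:
\[
  W_s^{0,t} = W_s - \frac{s}{t}\, W_t,
  \qquad
  W_s^{0,t} = (t - s) \int_0^s \frac{dW_u}{t - u}.
\]
% Both are centered Gaussian with the same variance in each coordinate:
\[
  s - \frac{s^2}{t} = \frac{s(t-s)}{t},
  \qquad
  (t-s)^2 \int_0^s \frac{du}{(t-u)^2}
  = (t-s)^2 \left( \frac{1}{t-s} - \frac{1}{t} \right) = \frac{s(t-s)}{t}.
\]
% The bridge from x_{1,n} to \xi_{1,n} is then obtained by adding the line
% x_{1,n} + (\xi_{1,n} - x_{1,n})\, s/t.
```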
For the sake of completeness, we recall some well known results concerning the Brownian motion and Brownian bridge.
Moreover, there exists $c := c(n) \ge 1$, s.t. for all $\zeta \ge 0$, and $0 \le \tau \le t$,

Proof. The first inequality is a simple consequence of the Brownian scaling. The second one can be derived from convexity inequalities and Lévy's identity that we now recall (see e.g. Chapter 6 in [34]). Let $(B_t)_{t \ge 0}$ be a standard scalar Brownian motion. Then: The third inequality follows from the first two and the representation (4.2). Finally, the deviation estimates follow from (4.4) as well. These deviation estimates can also be seen as special cases of Bernstein's inequality, see e.g. [34] p. 153.

Some preliminary estimates on the Malliavin derivative and covariance matrix
We now give the expressions of the Malliavin derivative and covariance matrix of the scalar random variable Y t defined in (4.1) and some associated controls.

Lemma 4.2 (Malliavin Derivative and some associated bounds) Let us set
Considering the realization (4.3) of the Brownian bridge, the Malliavin derivative of $Y_t$ (seen as a column vector) and the "covariance" matrix (that is in our case a scalar) write for all $s \in [0, t]$: Introduce now There exists $C := C(k, n) \ge 1$ s.t. for all $\tau \in [0, t]$: Also, for all $q \ge 1$, there exists $C(k, n, q)$ s.t.
when $|x_{1,n}| \vee |\xi_{1,n}| \ge K t^{1/2}$. For $K := K(k, n, q)$ large enough, the term $M_t$ (corresponding to the Malliavin covariance matrix of a Gaussian contribution) dominates the remainder. This intuitively explains the Gaussian regime appearing in ii) of Theorem 2.1.
Proof. Assertion (4.6) directly follows from the chain rule (see e.g. Proposition 1.2.3 in [32]) and the identity $D_s W^{0,t}$ . Concerning (4.10), we only prove the claim for $\tau = 0$ for notational simplicity. Usual computations involving convexity inequalities yield that there exists C : On the other hand, to prove that a lower bound of the same order also holds for $M_t$ one has to be a little more careful. W.l.o.g. we can assume that $|\xi_{1,n}| \ge |x_{1,n}|$. Indeed, because of the symmetry of the Brownian bridge and its reversibility in time (see Remark 4.1), if $|\xi_{1,n}| < |x_{1,n}|$ we can perform the computations w.r.t. the Brownian bridge (W Observe now that for s ≥ n 1/2 (4.14) Now, for s ≥ t (4.15) Equation (4.10) thus follows from (4.14), (4.15) and (4.13).
On the other hand, from (4.10) and the previous convexity inequality for R we get: . Equation (4.12) then follows from Proposition 4.1. Concerning the Ornstein-Uhlenbeck operator we will rely on the chaos expansion techniques introduced in Section 3.3.

"Gaussian" regime
In this section we assume that $|x_{1,n}| \vee |\xi_{1,n}| \ge K t^{1/2}$, for $K := K(n, d)$ sufficiently large. That is, we suppose that the starting or the final point of the non-degenerate component has greater norm than the characteristic time-scale $t^{1/2}$. In this case, we show below that the dominating term in the Malliavin derivative is the one associated to the non-random term $M_D^1$ in (4.7). This term corresponds to the Malliavin derivative of a Gaussian process. This justifies the terminology "Gaussian" regime.
In order to give precise asymptotics on the density of $Y_t$, the crucial step consists in controlling the norm of $\Gamma_{Y_t}$

Proof. As in Lemma 4.2, we assume, without loss of generality, that $|\xi_{1,n}| \ge |x_{1,n}|$. To give the $L^q$ estimates of the Malliavin derivative we recall the definition of $M_t$ given in (4.9), and we use the following partition: Equation (4.10) in Lemma 4.2 provides us with a useful bound for $M_t$. We next give estimates of $\mathbb{P}[\gamma_{Y_t} \le \frac{M_t}{4m}]$, $m \ge 1$, in the spirit of Bally [1].
Let us now turn to the lower bound for $\|\Gamma_{Y_t}\|_{L^p(\mathbb{P})}$. Write: Therefore, for $|\xi_{1,n}| \ge K t^{1/2}$ and $K$ large enough, we get $E[\Gamma_{Y_t}^q] \ge \frac{1}{2 (3 M_t)^q}$, which thanks to (4.10) completes the proof.

Controls of the weight for the integration by parts.
From Proposition 3.3 and Corollary 3.4, we derive using the chain rule for the last but one identity. We have the following L q (P), q ≥ 1, bounds for the random variable H t .
we get for all given q ≥ 1, using Lemma 4.4 for the last but one inequality. Now, from equations (4.6), (4.8), using the notations of Lemma 4.4, On the one hand equation (4.10) in Lemma 4.2 readily gives M . On the other hand, equation (4.12) of the same Lemma yields In order to get a bound for . Equation (4.6) and the chain rule yield that for all C 2 := C 2 (n, k, q, K) using (4.21) for the last inequality.
With the notations of equations (4.6), (4.8) we set for all for simplicity.

Non Gaussian regime
We now consider the case $|x_{1,n}| \vee |\xi_{1,n}| \le K t^{1/2}$, which corresponds to a diagonal regime of the non-degenerate component w.r.t. the characteristic time scale. It turns out that the characteristic time-scale of the density $p_{Y_t}(\xi_{n+1} - x_{n+1})$ is $t^{1+k/2}$. Indeed, we have the following result.
Proof. For t > 0 write: . From Corollary 3.4 (Malliavin representation of the densities), we obtain: Now, as a consequence of the Brownian scaling we get ( Recalling that | x 1,n t 1/2 | ∨ | ξ 1,n t 1/2 | ≤ K we derive that the usual techniques used to prove the non degeneracy of the Malliavin covariance matrix under Hörmander's condition (see e.g. Norris [31] or Nualart [32]) yield that there exists C q := C q (n, k, K) ∈ R + * s.t. H

Off-diagonal bounds
From the Malliavin representation of the density given by (4.19), to derive off-diagonal bounds on the density, it remains to give estimates on $\mathbb{P}[Y_t > \xi_{n+1} - x_{n+1}]$.
(ii) If $|x_{1,n}| \vee |\xi_{1,n}| \le K t^{1/2}$ for the same previous $K$,

Proof. We only prove point (i); the second point can be derived in a similar way. According to (4.5), we first decompose $Y_t$ as On the other hand, we have: (4.31) From Proposition 4.1 one gets that there exists $C_3 := C_3(k, n) \ge 1$ s.t.

Auxiliary deviation estimates
Still from the Malliavin representation of the density given by (4.19), when $\xi_{n+1} - x_{n+1}$ is small, that is when for the degenerate component the starting and final points are close, we have to give estimates on $\mathbb{P}[Y_t \le \xi_{n+1} - x_{n+1}]$ (small and moderate deviations).
where L z 1/2 stands for the local time at level z and time 1/2 for the scalar process The last equality in (4.36) is a consequence of the scaling properties of the local time. From Tanaka's formula for semimartingales L Denoting with a slight abuse of notation ( 1] , we have the following differential dynamics for X u : where (B u ) u∈[0,1] is a standard scalar Brownian motion. Therefore, from equation (4.36) and the usual differential dynamics for the Brownian bridge: ].

Final derivation of the upper-bounds in the various regimes
In this section we put together our previous estimates in order to derive the upper bounds of Theorem 2.1 in the various regimes.
Remark 4.6 The above result means that the Gaussian regime holds if the final point $\xi_{n+1}$ of the degenerate component has the same order as the "mean" transport term $m_t(x, \xi) := x_{n+1} + \frac{2^{k-1}}{k+1} (|x_{1,n}|^k + |\xi_{1,n}|^k)\, t$ (moderate deviations). A similar lower bound holds true, see Lemma 4.7.
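The constant $\frac{2^{k-1}}{k+1}$ in the transport term can be recovered from an elementary convexity bound along the straight line joining $x_{1,n}$ to $\xi_{1,n}$. The computation below is a sketch of that observation; reading $m_t$ as an upper bound on the deterministic transport along this line is our interpretation, not a statement taken from the text.

```latex
% Convexity: |a + b|^k \le 2^{k-1} (|a|^k + |b|^k) for k \ge 1. Hence, for u in [0,1],
\[
  \bigl| (1-u)\, x_{1,n} + u\, \xi_{1,n} \bigr|^k
  \le 2^{k-1} \bigl( (1-u)^k |x_{1,n}|^k + u^k |\xi_{1,n}|^k \bigr).
\]
% Integrating over s = u t in [0,t], with \int_0^1 u^k du = \int_0^1 (1-u)^k du = 1/(k+1):
\[
  \int_0^t \Bigl| x_{1,n} + (\xi_{1,n} - x_{1,n}) \frac{s}{t} \Bigr|^k ds
  \le \frac{2^{k-1}}{k+1} \bigl( |x_{1,n}|^k + |\xi_{1,n}|^k \bigr)\, t
  = m_t(x,\xi) - x_{n+1}.
\]
```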

Derivation of the heavy-tailed upper bounds
We here assume and C is as in the previous paragraph.
Hence, up to a modification of C, the control given by (4.38) holds for all off-diagonal cases.

Gaussian lower bound on the compact sets of the metric
We conclude this section with a proof of a lower bound for the density on the compact sets of the metric associated to the Gaussian regime in Theorem 2.1. A similar feature already appears in the appendix of [19].
Lemma 4.7 Assume that $|x_{1,n}| \vee |\xi_{1,n}| \ge K t^{1/2}$, $K \ge K_0 := K_0(n, k)$ and that for a given

Remark 4.8 The condition in the Lemma means that the deviation $\xi_{n+1} - x_{n+1}$ has exactly the same order as the transport term $t (|x_{1,n}|^k + |\xi_{1,n}|^k)$, up to a negligible fluctuation corresponding to the variance of the Gaussian contribution in $Y_t$.

Potential Theory and PDEs
In this section we are interested in proving Harnack inequalities for non-negative solutions to for every positive solution $u$ to $\mathcal{L} u = 0$. We say that a set $\{z_0, z_1, \ldots, z_k\} \subset O$ is a Harnack chain of length $k$ if $u(z_j) \le C_j\, u(z_{j-1})$, for $j = 1, \ldots, k$, for every positive solution $u$ of $\mathcal{L} u = 0$, so that we get $u(z_k) \le \prod_{j=1}^k C_j\, u(z_0)$. In order to construct Harnack chains, and to have an explicit lower bound for the densities considered in this article, we will prove invariant Harnack inequalities w.r.t. a suitable Lie group structure. By exploiting the properties of homogeneity and translation invariance of the Lie group, we will find Harnack chains with the property that every $C_j$ in (5.3) agrees with the constant $C_K$ in (5.2). As a consequence we find $u(z_k) \le C_K^k\, u(z_0)$, and the bound will depend only on the length of the Harnack chain connecting $z_0$ to $z_k$.
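The way a Harnack chain turns into an exponential lower bound can be sketched in two lines. Here $C_K > 1$ is the invariant Harnack constant of the text, while the function $\ell$ counting the number of links is a hypothetical name introduced only for this illustration.

```latex
% Chaining u(z_j) \le C_K u(z_{j-1}) for j = 1, ..., k:
\[
  u(z_k) \le C_K^{\,k}\, u(z_0)
  \quad \Longleftrightarrow \quad
  u(z_0) \ge \exp\bigl( - k \log C_K \bigr)\, u(z_k).
\]
% If the end points can be joined by a chain of length k = \ell(z_0, z_k)
% (\ell hypothetical), the positive solution u is bounded from below at z_0 by
\[
  u(z_0) \ge u(z_k)\, \exp\bigl( - \ell(z_0, z_k) \log C_K \bigr),
\]
% so a lower bound with an explicit exponent follows from counting the links.
```

This explains why the choice of path matters: a shorter chain gives a better exponent in the resulting bound.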
Let us now recall some basic notations concerning homogeneous Lie groups (we refer to the monograph [7] by Bonfiglioli, Lanconelli and Uguzzoni for an exhaustive treatment). Let $\circ$ be a given group law on $\mathbb{R}^{N+1}$ and suppose that the map $(z, \zeta) \mapsto \zeta^{-1} \circ z$ is smooth. Then $G = (\mathbb{R}^{N+1}, \circ)$ is called a Lie group. Moreover, $G$ is said to be homogeneous if there exists a family of dilations $(\delta_\lambda)_{\lambda > 0}$ which defines an automorphism of the group, i.e., $\delta_\lambda(z \circ \zeta) = (\delta_\lambda z) \circ (\delta_\lambda \zeta)$, for all $z, \zeta \in \mathbb{R}^{N+1}$ and $\lambda > 0$.
We also make the following assumption.
[L] $\mathcal{L}$ is Lie-invariant with respect to the Lie group $G = (\mathbb{R}^{N+1}, \circ, (\delta_\lambda)_{\lambda > 0})$, i.e. i) $Y_1, \ldots, Y_n$ and $Z$ are left-invariant with respect to the composition law of $G$, i.e.
To illustrate Property [L] we recall the Lie group structure of the Kolmogorov operator corresponding to k = 1 in (1.4).

Example 5.1 (Kolmogorov operators)
Clearly, $\mathcal{L}$ can be written as in It is known that the composition law $\circ$ is always a sum with respect to the $t$ variable (see Proposition 10.2 in [24]). Moreover, the family $(\delta_\lambda)_{\lambda > 0}$ acts on $\mathbb{R}^{N+1}$ as follows: where $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_N) \in \mathbb{N}^N$ is a multi-index. The natural number $Q = \sum_{k=1}^N \sigma_k + 2$ is called the homogeneous dimension of $G$ with respect to $\delta_\lambda$. We shall assume that $Q \ge 3$. Observe that the diagonal decay of the heat kernel on the homogeneous Lie group is given by the characteristic time scale $t^{-(Q-2)/2}$. For the above example we have $Q = n + 3 + 2$, matching the diagonal exponent in (1.5): $(Q - 2)/2 = (n + 3)/2$.
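For the Kolmogorov example, the count of the homogeneous dimension can be spelled out as follows. The explicit form of the dilation is the standard one for this operator and is an assumption as far as the text's own display is concerned.

```latex
% Kolmogorov example: k = 1 in (1.4), N = n + 1, assumed dilation
\[
  \delta_\lambda (x_1, \ldots, x_n, x_{n+1}, t)
  = (\lambda x_1, \ldots, \lambda x_n, \lambda^3 x_{n+1}, \lambda^2 t),
  \qquad \sigma = (\underbrace{1, \ldots, 1}_{n}, 3).
\]
% Homogeneous dimension and diagonal exponent:
\[
  Q = \sum_{k=1}^{N} \sigma_k + 2 = n + 3 + 2,
  \qquad \frac{Q - 2}{2} = \frac{n+3}{2} = \frac{n}{2} + \frac{3}{2}.
\]
% The weights 1 and 3 mirror the scales t^{1/2} (Brownian motion) and
% t^{3/2} (its time integral) once time is given weight 2.
```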
Write the operator $\mathcal{L}$ as follows

for suitable smooth coefficients $a_{i,j}$ and $b_j$ depending only on the vector fields $Y_0, \dots, Y_n$. As $n < N$, $\mathcal{L}$ is strictly degenerate, since $\operatorname{rank}(A(x)) \le n$ at every $x$ (here $A(x) := (a_{i,j}(x))_{i,j \in [[1,n]]}$). In Example 5.1 we see that $\operatorname{rank}(A)$ never vanishes. We say that $\mathcal{L}$ is not totally degenerate if for every

for any non-negative solution $u$ to $\mathcal{L}u = 0$ in $O$. Here $C_K$ is a positive constant depending on $O$, $K$, $z_0$ and on $\mathcal{L}$.
Our Theorem 5.2 improves on Bony's result in that it gives an explicit geometric description of the set $K$ in (5.7). Also, it is more general than the one in [15],

Potential Theory
For the first part of the section, we assume $\mathcal{L}$ to be a general abstract parabolic differential operator satisfying

Let $V$ be a bounded open subset of $\mathbb{R}^{N+1}$ with Lipschitz-continuous boundary. We say that $V$ is $\mathcal{L}$-regular if, for every $z_0 \in \partial V$, there exists a neighborhood $U$ of $z_0$ and a smooth function $w : U \to \mathbb{R}$ satisfying

Note that the function $\psi(x,t) = \frac{1}{2} + \frac{1}{\pi} \arctan t$ verifies $0 \le \psi \le 1$, $\mathcal{L}\psi < 0$ in $\mathbb{R}^{N+1}$.
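The two properties of $\psi$ can be checked directly. The sketch below assumes that $\mathcal{L}$ has the form $\sum a_{i,j}\,\partial_{x_i x_j} + \sum b_j\,\partial_{x_j} - \partial_t$, so that the time variable enters only through $-\partial_t$; this form is an assumption made for the computation.

```latex
% Assumption: L = sum a_{ij} d_{x_i x_j} + sum b_j d_{x_j} - d_t,
% so every spatial derivative of psi(x,t) = 1/2 + (1/pi) arctan t vanishes.
\begin{align*}
-\tfrac{\pi}{2} < \arctan t < \tfrac{\pi}{2}
  \quad &\Longrightarrow \quad 0 < \psi(x,t) < 1, \\
\mathcal{L}\psi(x,t) = -\partial_t \psi(x,t)
  &= -\frac{1}{\pi\,(1+t^2)} < 0
  \qquad \text{for every } (x,t) \in \mathbb{R}^{N+1}.
\end{align*}
```

Since the inequality $\mathcal{L}\psi < 0$ is strict at every point, $\psi$ is a bounded strictly superharmonic function on the whole space under this assumption.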
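Moreover, $H^V_\varphi \ge 0$ whenever $\varphi \ge 0$ (see Bauer [4] and Constantinescu and Cornea [18]). Hence, if $V$ is $\mathcal{L}$-regular, for every fixed $z \in V$ the map $\varphi \mapsto H^V_\varphi(z)$ defines a positive linear functional on $C(\partial V, \mathbb{R})$. Thus, the Riesz representation theorem implies that there exists a Radon measure $\mu^V_z$, supported in $\partial V$, such that

$$H^V_\varphi(z) = \int_{\partial V} \varphi(\zeta)\, d\mu^V_z(\zeta), \quad \text{for every } \varphi \in C(\partial V, \mathbb{R}). \tag{5.10}$$

We will refer to $\mu^V_z$ as the $\mathcal{L}$-harmonic measure defined with respect to $V$ and $z$.

- the family $\mathcal{S}(\mathbb{R}^{N+1})$ separates the points of $\mathbb{R}^{N+1}$, i.e., for every $z, \zeta \in \mathbb{R}^{N+1}$, $z \ne \zeta$, there exists $u \in \mathcal{S}(\mathbb{R}^{N+1})$ such that $u(z) \ne u(\zeta)$.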
This last separation property is proved in Lemma 5.5. We will in fact show a stronger result: the family $\mathcal{S}^+(\mathbb{R}^{N+1}) \cap C(\mathbb{R}^{N+1})$ separates the points of $\mathbb{R}^{N+1}$. A harmonic space $(\mathbb{R}^{N+1}, \mathcal{H})$ satisfying this property is said to be a B-harmonic space.
In order to prove the separation property we use a fundamental solution $\Gamma$ of $\mathcal{L}$. To prove the existence of a fundamental solution we rely on condition [H], which we assume to be in force throughout the paper. We recall that a fundamental solution is a function $\Gamma$ with the following properties:
i) the map $(z, \zeta) \mapsto \Gamma(z, \zeta)$ is defined, non-negative and smooth away from the set $\{(z, \zeta) \in \mathbb{R}^{N+1} \times \mathbb{R}^{N+1} : z = \zeta\}$;
ii) for any $z \in \mathbb{R}^{N+1}$, $\Gamma(\cdot, z)$ and $\Gamma(z, \cdot)$ are locally integrable;
iii) for every $\phi \in C^\infty_0(\mathbb{R}^{N+1})$ and $z \in \mathbb{R}^{N+1}$ we have
iv) $\mathcal{L}\Gamma(\cdot, \zeta) = -\delta_\zeta$ (Dirac measure supported at $\zeta$);
v) if we define $\Gamma^*(z, \zeta) := \Gamma(\zeta, z)$, then $\Gamma^*$ is the fundamental solution for the formal adjoint $\mathcal{L}^*$ of $\mathcal{L}$, satisfying the dual statements of iii), iv);
vi) $\Gamma(x, t, \xi, \tau) = 0$ if $t < \tau$.

Remark 5.4
Assumption [H] implies the existence of a smooth density $p(t, \xi, x)\,dx := \mathbb{P}_\xi[X_t \in dx]$, $t > 0$, for the process $(X_t)_{t \ge 0}$ associated to $\mathcal{L}$; see e.g. Stroock [37] or Nualart [32]. Actually,

is a fundamental solution for $\mathcal{L}$ in the above sense. Indeed $p$ satisfies the Kolmogorov equation $\mathcal{L}p = 0$ in $\mathbb{R}^{N+1} \setminus \{(\xi, \tau)\}$. We refer to Bonfiglioli and Lanconelli [6] for a purely analytic proof of existence of fundamental solutions for operators satisfying [H], [L].
If condition [L] holds, then we also have:

We next prove the separation property for $\mathcal{L}$ by adapting the argument in [14, Proposition 7.1].
In the case $t_1 = t_2$, $x_1 \ne x_2$, we consider the sequence

$$O_n(z_2) = \bigl\{\zeta \in \mathbb{R}^{N+1} : \Gamma(z_2, \zeta) > n^{Q-2}\bigr\}, \quad n \in \mathbb{N}. \tag{5.13}$$

We note that $O_n(z_2)$ shrinks to $\{z_2\}$ as $n \to \infty$, by property viii) of the fundamental solution. For any $\varphi_n \in C^\infty_0(O_n(z_2))$ such that $\int \varphi_n = 1$ and $\varphi_n \ge 0$, we define $u_{\varphi_n}$ as in (5.12). Then $u_{\varphi_n}$ is a smooth non-negative function in $\mathbb{R}^{N+1}$ satisfying $\mathcal{L}u_{\varphi_n} \le 0$, and so $u_{\varphi_n}$ is $\mathcal{L}$-superharmonic. It holds

where $C$ is a positive real constant independent of $n$. This ends the proof.
We summarize the above facts in the following

A remarkable feature of a B-harmonic space is that the Wiener resolutivity theorem holds (see [4,18]). In order to state it, we introduce some additional notation. We recall that if $O \subset \mathbb{R}^{N+1}$ is a bounded open set, then an extended real function $f$ :

We say that $H^O_f$ is the generalized solution in the sense of Perron-Wiener-Brelot to the problem $u \in \mathcal{H}(O)$, $u = f$ on $\partial O$.
The Wiener resolutivity theorem yields that any $f \in C(\partial O, \mathbb{R})$ is resolutive. The map

We call $\mu^O_z$ the $\mathcal{L}$-harmonic measure relative to $O$ and $z$, and when $O$ is $\mathcal{L}$-regular this definition coincides with the one in (5.10). Finally, a point $\zeta \in \partial O$ is called $\mathcal{L}$-regular for $O$ if $\lim_{O \ni z \to \zeta} H^O_f(z) = f(\zeta)$ for every $f \in C(\partial O, \mathbb{R})$.

In order to prove Theorem 5.2 we give the following

Harnack inequalities
Let $V \subset \overline{V} \subset O$ be an $\mathcal{L}$-regular neighborhood of $z_1$ with $z \notin V$. Arguing as above, we can find $t_2 \in\, ]t_1, T[$ such that $\gamma([t_1, t_2[) \subset V$ and $z_2 = \gamma(t_2) \in \partial V$. Consider any neighborhood $W$ of $z_2$ such that $W \subset O \setminus O_{z_0}$. Let $\varphi \in C(\partial V)$ be any non-negative function, supported in $W \cap \partial V$, and such that $\varphi(z_2) > 0$. Recalling that the harmonic function $H^V_\varphi$ is non-negative, we aim to show that

$$H^V_\varphi(z_1) > 0. \tag{5.18}$$

By contradiction, suppose that $H^V_\varphi$ vanishes at $z_1$. Then $H^V_\varphi$ attains its minimum value at $z_1$, so Bony's minimum principle implies $H^V_\varphi \equiv 0$ in $\gamma([t_1, t_2[)$. As a consequence, since $H^V_\varphi$ satisfies (5.9), lim

On the other hand, by the choice of $\varphi$

This contradicts (5.19) and proves (5.18). By using the representation (5.10) of $H^V_\varphi$ in terms of the $\mathcal{L}$-harmonic measure, (5.18) reads as follows

On the other hand, $z_1$ belongs to the absorbent set $O_{z_0}$, so that $\mu^V_{z_1}(\partial V \setminus O_{z_0}) = 0$. But this contradicts (5.20), since $W \subseteq O \setminus O_{z_0}$. This completes the proof.
Proof of Theorem 5.2. It is a plain consequence of Proposition 5.7 and Lemma 5.8.

As the following proposition shows, we are able to give a complete characterization of the set $O_{z_0}$ if $A_{z_0}$ is an absorbent set as well.
Proof. Since $u$ is continuous and non-negative,

is a closed subset of $O$. Let $z \in A_{z_0}$, and let $V \subset \overline{V} \subset O$ be an $\mathcal{L}$-regular neighborhood of $z$.
Hence $A_{z_0}$ is an absorbent set. The last statement plainly follows from Proposition 5.9.

Lifting and Harnack inequalities
We first consider the PDE (2.7) for $k = 2$. Note that, in this case, it is equivalent to (2.8), and reads as follows

It is homogeneous with respect to the following dilation

$$\delta_\lambda(x, t) = \bigl(\lambda x_{1,n}, \lambda^4 x_{n+1}, \lambda^2 t\bigr). \tag{5.22}$$

Even if $\mathcal{L}$ does not satisfy [L]-i), it has a fundamental solution $\Gamma$ which shares several properties of the usual heat kernels. We remark that, since $\mathcal{L}$ does not satisfy the controllability condition [C], the support of $\Gamma$ is strictly contained in the half space $t < \tau$.
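The homogeneity with respect to (5.22) can be verified term by term. The sketch below assumes that, for $k = 2$, the operator has the form $\mathcal{L} = c\,\Delta_{x_{1,n}} + |x_{1,n}|^2 \partial_{x_{n+1}} - \partial_t$ with a constant $c > 0$; this explicit form and the constant $c$ are assumptions made for the computation.

```latex
% Assumption: L = c * Laplacian in x_{1,n} + |x_{1,n}|^2 d_{x_{n+1}} - d_t.
% Write u_lambda := u o delta_lambda, delta_lambda as in (5.22).
\begin{align*}
\Delta_{x_{1,n}} u_\lambda
  &= \lambda^{2}\, (\Delta_{x_{1,n}} u) \circ \delta_\lambda, \\
|x_{1,n}|^{2}\, \partial_{x_{n+1}} u_\lambda
  &= \lambda^{4}\, |x_{1,n}|^{2}\, (\partial_{x_{n+1}} u) \circ \delta_\lambda
   = \lambda^{2}\, \bigl(|x_{1,n}|^{2}\, \partial_{x_{n+1}} u\bigr) \circ \delta_\lambda, \\
\partial_t u_\lambda
  &= \lambda^{2}\, (\partial_t u) \circ \delta_\lambda, \\
\text{hence}\quad
\mathcal{L}(u \circ \delta_\lambda)
  &= \lambda^{2}\, (\mathcal{L} u) \circ \delta_\lambda .
\end{align*}
```

The middle line uses $|\lambda x_{1,n}|^2 = \lambda^2 |x_{1,n}|^2$, which is what forces the weight $\lambda^4$ on $x_{n+1}$. In particular $u \circ \delta_\lambda$ solves $\mathcal{L}v = 0$ whenever $u$ does.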
Clearly, $\gamma = \gamma_1 + \gamma_2 + \dots + \gamma_5$ is an $\mathcal{L}$-admissible curve of $\mathbb{R}^{2n+2}$ connecting $(0, 0, 0)$ to $(x, y, t)$. Next we prove that, for sufficiently small $\varepsilon$, the trajectory $\gamma$ is contained in $O$. To this aim, as the set $O$ is convex and the paths $\gamma_1, \gamma_2, \dots, \gamma_5$ are segments, we only need to show that the end-points of $\gamma_1, \gamma_2, \gamma_3, \gamma_4$ belong to $O$. The inequalities $-1 < \frac{|y| - s_\varepsilon(1-\varepsilon)}{-t - s_\varepsilon} < 1$ directly follow from the definition of $s_\varepsilon$, for sufficiently small positive $\varepsilon$. The other inequalities are a plain consequence of the fact that $0 < s_\varepsilon < -t < 1$, as previously noticed. Since $A_{z_0}$ is the closure of the set of the points that can be reached by an $\mathcal{L}$-admissible path, we get

This concludes the proof of (5.28).
To complete the proof, by Proposition 5.10 it is sufficient to find a non-negative solution $v$ of $\mathcal{L}v = 0$ such that $v \equiv 0$ in $A_{z_0}$ and $v > 0$ in $O \setminus A_{z_0}$. Let $\varphi$ be any function in $C(\partial O)$ such that $\varphi \equiv 0$ in $\partial O \cap A_{z_0}$ and $\varphi > 0$ in $\partial O \setminus A_{z_0}$. Then the Perron-Wiener-Brelot solution $v := H^O_\varphi$ of the Cauchy-Dirichlet problem $\mathcal{L}v = 0$ in $O$, $v = \varphi$ on $\partial O$, is non-negative. Next we prove that $v > 0$ in $O \setminus A_{z_0}$. By contradiction, let $(x, y, t) \in O \setminus A_{z_0}$ be such that $v(x, y, t) = 0$. Then $(x, y, t)$ is a minimum for $v$, so that from Bony's minimum principle [9, Théorème 3.2] it follows that $v(\tilde{x}_{1,n}, x_{n+1}, y, t) = \varphi(\tilde{x}_{1,n}, x_{n+1}, y, t) = 0$ for every $\tilde{x}_{1,n} \in \partial(]-1, 1[^n)$. Since every point $(\tilde{x}_{1,n}, x_{n+1}, y, t)$ is regular for the Dirichlet problem and belongs to $\partial O \setminus A_{z_0}$, we find a contradiction with our assumption on $\varphi$. Suppose now that there exists $(x, y, t) \in A_{z_0}$ such that $v(x, y, t) > 0$. Since every point of the set

Hence there exists a point $(x, y, t) \in A_{z_0}$ such that $v(x, y, t) = \max_{A_{z_0}} v > 0$. By Bony's minimum principle we have $v(\tilde{x}_{1,n}, x_{n+1}, y, t) = \varphi(\tilde{x}_{1,n}, x_{n+1}, y, t) > 0$ for any $\tilde{x}_{1,n} \in \partial(]-1, 1[^n)$, and this fact contradicts our assumption on $\varphi$.
Next we introduce some notation to state a Harnack inequality which is invariant with respect to the group law $\circ$ defined in (5.24) and the dilation $\delta_r$ introduced in (5.25). Consider the box $Q_r = ]-r, r[^n \times ]-r^4, r^4[ \times ]-r^3, r^3[^n \times ]-r^2, 0]$, and note that $Q_r = \delta_r Q_1$. For every compact set $K \subseteq Q_1$, for any positive $r$ and for any $z_0 \in \mathbb{R}^{2n+2}$ we denote by

Corollary 5.12 For every compact set $K \subseteq \{(x, y, t) \in Q_1 \mid 0 < x_{n+1} < -t,\ |y|^2 < -t\,x_{n+1}\}$, $r > 0$ and $z_0 \in \mathbb{R}^{2n+2}$ there exists a positive constant $C_K$, depending only on $\mathcal{L}$ and $K$, such that

$$\sup_{z_0 \circ \delta_r K} v \le C_K\, v(z_0)$$

for every non-negative solution $v$ of $\mathcal{L}v = 0$ on any open set containing $Q_r(z_0)$.
Proof. Consider the function $w(z) = v(z_0 \circ \delta_r z)$. By the invariance with respect to $\delta_r$ and $\circ$, we have $\mathcal{L}w = 0$ in $Q_1$. Aiming to apply Theorem 5.2, we consider the open set $O$ defined in (5.27), and we note that $O \cap \{t < 0\} \subset Q_1$. Then $w$ is defined as a continuous function on $\partial O \cap \{t < 0\}$. We extend $w$ to a continuous function on $\partial O$, and we solve the boundary value problem $\mathcal{L}\tilde{w} = 0$ in $Q_1$, with $\tilde{w} = w$ on $\partial O$. Then we apply Theorem 5.2 and Lemma 5.11, and we get $\sup_K \tilde{w} \le C_K\, \tilde{w}(0, 0, 0)$. By the comparison principle we have $\tilde{w} = w$ in $O \cap \{t \le 0\}$, and the claim plainly follows from the inclusion $K \subset O \cap \{t < 0\}$.
We are now ready to build a Harnack chain for (5.21) by using the following set

which is a compact subset of $\{(x, y, t) \in Q_1 \mid 0 < x_{n+1} < -t,\ |y|^2 < -t\,x_{n+1}\}$. Before doing that for $k = 2$ only, we extend the above procedure to equations (2.7) and (2.8) for $k > 2$.

Remark 5.13
For every $(\xi, \tau) \in \mathbb{R}^{n+2}$, $t > 0$ and for any constant vector $\omega \in \mathbb{R}^n$, we have

We next focus on the attainable set $A_{z_0}$ of the unit cylinder with respect to the point $z_0 = (0, 0, 0)$. Here $|x_{1,n}|$ and $|y|$ denote, respectively, the Euclidean norm of the vectors $x_{1,n} \in \mathbb{R}^n$ and $y \in \mathbb{R}^{(k-1)n}$. Unlike the case $k = 2$, when $k > 2$ we are not able to give a complete characterization of the sets $A_{z_0}$ and $O_{z_0}$ as we did in Lemma 5.11. We will consider instead the differential of the end point map related to (5.34) to find some interior points of $A_{z_0}$. With obvious meaning of the notation, we set $(x(T), y(T), t(T)) = \gamma(T)$, we note that $t(T) = t - T$, and we define

We refer to the classical literature (see e.g. [12, Theorem 3.2.6]) for the differentiability properties of $E$. We next show that the differential $DE(\omega)$ of $E$, computed at some given $\omega \in L^2([0, T])$, is surjective. Hence $E(\omega)$ is an interior point of $A_{z_0}$, so that we can apply Theorem 5.2.
Lemma 5.14 Let $w$ be any given vector of $\mathbb{R}^n$ such that $w_j \ne 0$ for every $j \in [[1, n]]$. Consider the solution $\gamma$ to the problem (5.34), with $\omega \equiv w$. Then $DE(\omega)$ is surjective.
Proof. By the invariance of the vector fields $Y_i$, $i \in [[1, n]]$, and $Z$ with respect to the homogeneous Lie group $\mathbb{G}$, it is not restrictive to assume $(x, y, t) = (0, 0, 0)$ and $T = 1$. To prove our claim, we compute

where $o(h)$ vanishes as $h$ goes to zero. This proves (5.44). Analogously,

(5.46) in the case of (1.4). Note that for all $i \in [[1, k]]$ one has

We next obtain, as a corollary, a Harnack inequality which is invariant with respect to the Lie group $\mathbb{G} = (\mathbb{R}^{kn+2}, \circ, (\delta_\lambda)_{\lambda>0})$. For every compact subset $K$ of the unit cylinder $O$ defined in (5.40), any positive $r$ and any $z_0 = (x_0, y_0, t_0) \in \mathbb{R}^{kn+2}$ we set

that is, the group law $\circ$ is additive with respect to the $(n+1)$-th component. In some sense this allows one to move in the direction of the vector field $\underbrace{[\partial_{x_1}, [\partial_{x_1}, \cdots, [\partial_{x_1}, Z] \cdots ]]}_{k \text{ times}} = k!\,\partial_{x_{n+1}}$.
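The iterated commutator identity can be checked by induction on the number of brackets. The sketch below assumes that $Z$ depends on $x_1$ only through a term $x_1^k\,\partial_{x_{n+1}}$, the remaining terms of $Z$ (including $-\partial_t$) commuting with $\partial_{x_1}$; this structure of $Z$ is an assumption made for the computation.

```latex
% Assumption: Z = x_1^k d_{x_{n+1}} + (terms commuting with d_{x_1}).
% For a smooth f = f(x_1): [d_{x_1}, f d_{x_{n+1}}] = f' d_{x_{n+1}}.
\begin{align*}
[\partial_{x_1}, Z]
  &= k\, x_1^{k-1}\, \partial_{x_{n+1}}, \\
[\partial_{x_1}, [\partial_{x_1}, Z]]
  &= k(k-1)\, x_1^{k-2}\, \partial_{x_{n+1}}, \\
\underbrace{[\partial_{x_1}, [\cdots, [\partial_{x_1}, Z] \cdots ]]}_{j \text{ times}}
  &= \frac{k!}{(k-j)!}\, x_1^{k-j}\, \partial_{x_{n+1}},
  \qquad 1 \le j \le k, \\
\text{so that for } j = k: \quad
\underbrace{[\partial_{x_1}, [\cdots, [\partial_{x_1}, Z] \cdots ]]}_{k \text{ times}}
  &= k!\, \partial_{x_{n+1}} .
\end{align*}
```

For $j < k$ the bracket still vanishes at $x_1 = 0$, so under this assumption exactly $k$ brackets are needed to recover the direction $\partial_{x_{n+1}}$ at such points.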
The proof of the two lemmas is based on the lifting procedure introduced in Section 5 and on the construction of a finite sequence of cylinders contained in the lifted domain of the solution. Specifically, we find points along the trajectory of the integral path introduced in (5.34). The bounds depend on the length of the Harnack chain, which in turn depends on the slope of the trajectory. Asymptotic lower bounds are proved in Propositions 6.3 and 6.4.

Lemma 6.1 Let $\mathcal{L}$ be the operator defined in (2.7) or in (2.8), and let $T_1, \tau, t, T_2$ be such that $T_1 < \tau < t < T_2$ and $t - \tau \le \tau - T_1$. Let $\gamma : [0, t - \tau] \to \mathbb{R}^{n+1} \times ]T_1, T_2[$ be a path satisfying

$$\gamma'(s) = \sum_{j=1}^{n} \omega_j Y_j(\gamma(s)) + Z(\gamma(s)), \quad \gamma(0) = (x, t), \quad \gamma(t - \tau) = (\xi, \tau).$$

Then there exists a positive constant $C$, only depending on $\mathcal{L}$, such that:

for every non-negative solution $u$ to $\mathcal{L}u = 0$ in $\mathbb{R}^{n+1} \times ]T_1, T_2[$.
We omit the other details of the proof of (i) and (ii).
The proof of (iii) follows from the same argument used in the proof of (i). Note that, as $k$ is odd, the function

defined for any $b \in \mathbb{R}$ is surjective, so in this case Lemma 6.1 is sufficient to conclude the proof.