An inexact Newton method for solving complementarity problems in hydrodynamic lubrication

We present an iterative procedure based on a damped inexact Newton iteration for solving linear complementarity problems. We introduce the method in the framework of a popular problem arising in mechanical engineering: the analysis of cavitation in lubricated contacts. In this context, we show how the perturbation and the damping parameter are chosen in our method and we prove the global convergence of the entire procedure. A Fortran implementation of the method is finally analyzed. First, we validate the procedure and analyze all its components, performing also a comparison with a recently proposed technique based on the Fischer–Burmeister–Newton iteration. Then, we solve a 2D problem and provide some insights on an efficient implementation of the method exploiting routines of the Lapack and of the PETSc packages for the solution of inner linear systems.


Introduction
Linear Complementarity Problems (LCPs) began to be extensively studied in the mid-1960s (e.g. see [17]) and they immediately attracted much interest for both their mathematical properties and their applications. Indeed, on the mathematical side, the LCP unifies the formulation of linear programming, quadratic programming and bimatrix game problems, leading to important discoveries in all these fields [18, p. 3]. Numerous applications also exist, ranging from engineering to economics and computer science.
For instance, an interesting problem that often arises in mechanical engineering and that can be formulated as a complementarity problem is the analysis of cavitation in lubricated contacts. For a comprehensive review of earlier studies on this problem, the reader may refer to [14]. Other relevant contributions include [16,19,20,35,37]. Further studies, such as [33], have considered the case of cavitation in elastohydrodynamic lubrication, while, more recently, the effects of fluid compressibility, piezoviscosity and non-Newtonian fluid behavior have also been analyzed [11].
The aim of this paper is to propose an iterative procedure based on a Damped Inexact Newton (DIN) iteration for solving LCPs. Approaches of this kind are characterized by the presence of a perturbation, which prevents the algorithm from stalling, and of a damping parameter, which ensures the global convergence of the procedure. The choice of these two parameters is, thus, of fundamental importance and it is the distinctive feature of different damped inexact Newton methods. Therefore, we show how the perturbation and the damping parameters are chosen in our procedure and provide a proof of the global convergence of the method. We also analyze how different perturbations affect the efficiency of the method itself.
Although the procedure we propose is general, it seems appropriate to present it with special regard to a specific problem, which, in our case, is the aforementioned cavitation in hydrodynamic lubrication. This idea follows [15], where a procedure based on an inexact Newton iteration with Armijo backtracking condition has been used for solving practical problems concerning oxygen diffusion and combustion. In particular, we refer to the model of cavitation in hydrodynamic lubrication presented in [29], where the authors reformulated the problem so that pressure and a variable related to density are complementary in the entire domain [1].
We also compare our method with a similar procedure based on the Fischer-Burmeister-Newton (FBN) iteration. Algorithms of this kind are efficient and have been recently applied to lubrication problems in [45]. However, we show that the DIN-based algorithm better enforces the non-negativity conditions imposed by the complementarity.
Lastly, we also analyze the efficiency of the method. In particular, we make some considerations on how efficiency can be improved and compare different inner solvers, all implemented through the Lapack [4] or the PETSc [7,8] packages.
The paper is structured as follows. In Sect. 2 we outline the problem of our concern, which is described by the Reynolds equation. For a more detailed description of this equation in bearing systems, the reader is referred to [32, Chap. 6]. We then rewrite the problem in a complementarity formulation, present its discretized form and introduce the general layout of our method applied to the resulting nonlinear system.
In Sect. 3 we analyze the method. We prove that the conditions for it to be a damped inexact Newton method hold and we provide a proof of its convergence. In this regard, we also show how the perturbation and the damping parameter are chosen and which criteria they must satisfy. Finally, we briefly present the FBN method, which we later compare with the DIN iteration.
Sections 4, 5 and 6 are devoted to numerical experiments. First, in Sect. 4 we provide the numerical data which define the test problems and the setting of the algorithm. We also summarize the configurations of lubricated bearings which are considered afterwards. Then, in Sect. 5 we validate and analyze the procedure by a Fortran implementation of the DIN method. We consider the behavior of the method in different situations (e.g. changing the starting vectors) and when some parts of the algorithm are removed, so as to better show the role of each component. We also compare the results obtained by the DIN method with those computed by the FBN iteration. Finally, Sect. 6 presents some insights on a faster implementation and provides a comparison of various linear solvers called through Lapack or PETSc. Lastly, we present the solution of a 2D example.
Conclusions are then given in Sect. 7.
Layout of the algorithm applied to a complementarity problem

Description of the problem
The problem of our interest is described by the Reynolds equation (1886), which governs the pressure distribution of the thin fluid films typical of bearing systems. When the thickness of the film is constant in time, the equation is stationary and it reads

    ∇ · ( (ρ h³)/(12μ) ∇p ) = (U/2) ∂(ρh)/∂x,   (1)

where:
- the unknowns are the pressure p = p(x, y) and the density ρ = ρ(x, y);
- h = h(x, y) > 0 is the thickness of the film;
- μ > 0 is the viscosity of the fluid;
- U > 0 is the velocity of the fluid.
Cavitation consists in cavities forming in the fluid when pressure gets low. Equation (1) remains valid also in this case, but pressure and density behave differently where cavitation does and does not occur. Indeed, in the zone where cavitation does not occur (called the active region), p is always greater than or equal to zero, while the density ρ equals the density of the liquid, ρ₀, implying ρ₀ − ρ = 0. On the other hand, in the cavitated region (also called the inactive region), ρ is not equal to ρ₀, since the fluid is now made of liquid, vapor and gas. Hence ρ ≤ ρ₀, or, equivalently, ρ₀ − ρ ≥ 0. Simultaneously, we have p = 0. Thus we can write the complementarity condition

    p (ρ₀ − ρ) = 0,   p ≥ 0,   ρ₀ − ρ ≥ 0.   (2)

Following [29], we then pose r = (ρ₀ − ρ)/ρ₀ and write (1) as

    ∇ · ( (h³)/(12μ) ∇p ) = (U/2) ∂(h(1 − r))/∂x,   (3)

subject, in cavitation, to the complementarity condition pr = 0, with p ≥ 0, r ≥ 0.

We now discretize the domain by a grid of n inner points and approximate the derivatives of Eq. (3) by a finite difference scheme [44, Chap. 6] or by a finite element method [43, §14.5]. In this paper we use the box discretization in [44, pp. 196-199] and discretize the derivative in r by backward finite difference quotients. It is worth noticing that this discretization method works directly on the self-adjoint equation. Therefore, it is sufficient to require the continuity of h³ ∂p/∂x and of h³ ∂p/∂y, while the partial derivatives of p can be piecewise continuous.
After discretization, we obtain the algebraic system

    A p + B r = c,   p ≥ 0,   r ≥ 0,   pᵀr = 0,   (4)

where 0 ∈ Rⁿ denotes the null vector, A ∈ Rⁿˣⁿ, B ∈ Rⁿˣⁿ and c, p, r ∈ Rⁿ. This is the formulation given in [18, p. 30] for the generalized complementarity problem. More specifically, considering, for simplicity, a 1D problem, A is the tridiagonal matrix arising from the box discretization of the self-adjoint term, and it is easy to verify that A is irreducible [44, p. 18]. The matrix B, arising from the backward differences, is lower bidiagonal: it is diagonally dominant by columns (strictly in the last column) and it has negative diagonal elements and non-negative off-diagonal elements. Moreover, it is nonsingular. In the 2D case, similar considerations apply, with A a block-tridiagonal matrix (with diagonal blocks that are tridiagonal and sub- and super-diagonal blocks that are diagonal) and B block-diagonal (with lower-bidiagonal blocks).
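To make the structure of A and B concrete, the following Python sketch assembles the two matrices for the 1D case with the properties stated above (tridiagonal A from the box scheme, lower-bidiagonal B from the backward differences). The paper's code is in Fortran; this snippet, including the scaling of the entries, is only illustrative.

```python
import numpy as np

def assemble_1d(h, dx, mu=1.0, U=1.0):
    """Illustrative 1D assembly of A (tridiagonal, box scheme on the
    self-adjoint term) and B (lower bidiagonal, backward differences).

    h holds the film thickness at the two boundary nodes and at the
    n inner nodes (length n + 2); the entry scalings are assumptions,
    not the paper's exact coefficients.
    """
    n = len(h) - 2
    # midpoint values of h^3/(12 mu) for the diffusion term
    h3 = ((h[:-1] + h[1:]) / 2.0) ** 3 / (12.0 * mu)
    A = np.zeros((n, n))
    B = np.zeros((n, n))
    for i in range(n):
        A[i, i] = -(h3[i] + h3[i + 1]) / dx**2
        if i > 0:
            A[i, i - 1] = h3[i] / dx**2
            B[i, i - 1] = U * h[i] / (2.0 * dx)   # non-negative off-diagonal
        if i < n - 1:
            A[i, i + 1] = h3[i + 1] / dx**2
        B[i, i] = -U * h[i + 1] / (2.0 * dx)      # negative diagonal
    return A, B
```

With this layout one can check directly that B is diagonally dominant by columns, strictly so in the last column, as stated above.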

The damped inexact Newton iteration
Problem (4) is the LCP that we use to present our damped inexact Newton iteration. In this regard, first we write problem (4) as a system of 2n nonlinear algebraic equations with a restriction on the sign of the components of p and of r:

    F(p, r) = ( F₁(p, r), P R e )ᵀ = 0,   p ≥ 0,   r ≥ 0,   (8)

where F₁(p, r) ∈ Rⁿ is Ap + Br − c, e ∈ Rⁿ is the unity vector and P, R ∈ Rⁿˣⁿ are diagonal matrices whose non-zero entries are the elements of p and of r, respectively. Then, we write the Jacobian matrix

    F′(p, r) = [ A  B ; R  P ]   (10)

and set the Newton iteration to solve (4): chosen an initial iterate (p^(0), r^(0)) sufficiently close to the solution, we define

    (p^(k+1), r^(k+1)) = (p^(k), r^(k)) + (Δp^(k), Δr^(k)),   (11)

where (Δp^(k), Δr^(k)) is the solution of the linear system

    F′(p^(k), r^(k)) (Δp^(k), Δr^(k))ᵀ = −F(p^(k), r^(k)).   (12)

A typical problem with this method is that the iteration on P R e = 0 can lead the algorithm to stall. This has been observed, for example, in [25] with regard to the application of the Newton method to the Karush-Kuhn-Tucker conditions of a nonlinear programming problem. We can see this easily by considering a single ith equation, n < i ≤ 2n, of system (12):

    r_i^(k) Δp_i^(k) + p_i^(k) Δr_i^(k) = −p_i^(k) r_i^(k):

if a component p_i^(k) (or r_i^(k)) vanishes, the corresponding update vanishes too, so that the component remains zero in all the successive iterates. To solve this problem, we perturb the complementarity equations in order to force the iterates sufficiently far from the boundary of the nonnegative orthant p ≥ 0, r ≥ 0. Calling ρ̂ a positive scalar, the perturbed system becomes

    F(p, r) = ρ̂ ẽ,   ẽ = (0, e)ᵀ,   p ≥ 0,   r ≥ 0.   (14)

Therefore, now we have to solve the perturbed Newton equation

    F′(p^(k), r^(k)) (Δp^(k), Δr^(k))ᵀ = −F(p^(k), r^(k)) + ρ̂_k ẽ   (16)

at each iteration, with ρ̂_k → 0 for k → ∞. In this way, Eq. (16) replaces (12) and, together with Eq. (11), it generates a sequence {p^(k), r^(k)} which strictly satisfies the non-negativity conditions in (14). Equation (8) is instead satisfied in the limit. Finally, we complete the method by introducing a damping parameter α_k, 0 < α_k < 1, which enforces convergence, as described in Sect. 3. Denoting the solution of (16) by (Δp^(k), Δr^(k)), Eq. (11) is therefore replaced by

    (p^(k+1), r^(k+1)) = (p^(k), r^(k)) + α_k (Δp^(k), Δr^(k)).   (17)
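A single perturbed, damped Newton step can be sketched as follows. Python is used purely for illustration (the paper's implementation is in Fortran), and a dense direct solve stands in for the inner solver:

```python
import numpy as np

def din_step(A, B, c, p, r, rho, alpha):
    """One damped, perturbed Newton step for F(p, r) = 0.

    Solves F'(p, r) (dp, dr) = -F(p, r) + rho * (0, e), with
    F = (A p + B r - c, p * r) and the Jacobian [[A, B], [R, P]],
    then damps the update by alpha.
    """
    n = len(p)
    F = np.concatenate([A @ p + B @ r - c, p * r])
    J = np.block([[A, B], [np.diag(r), np.diag(p)]])
    rhs = -F + rho * np.concatenate([np.zeros(n), np.ones(n)])
    d = np.linalg.solve(J, rhs)
    return p + alpha * d[:n], r + alpha * d[n:]
```

Since the first n equations are linear, their residual is reduced exactly by the factor 1 − α at each damped step, while the perturbation ρ̂ acts only on the complementarity rows.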

Analysis of the method
For a positive initial guess (p^(0), r^(0)) and for k = 0, 1, ..., Eqs. (16) and (17) define our damped inexact Newton method, provided that:
1. the vector (Δp^(k), Δr^(k)), solution of (16), is a descent direction for the merit function Φ(p, r) = ||F(p, r)||², where ||·|| denotes the Euclidean norm;
2. α_k guarantees the reduction of the merit function at each iteration, while the components of p^(k) and r^(k) remain positive for all k = 0, 1, .... This means defining a path-following condition and a backtracking condition which α_k must satisfy.
To satisfy these conditions, we act on the choice of the perturbation ρ̂_k and of the damping parameter α_k.

Choice of ρ̂_k and descent condition
With a suitable choice of the perturbation, we can interpret Eq. (16) as the kth iteration of an inexact Newton method [21,24,27]. Let us set

    ρ̂_k = σ_k μ_k,   (18)

where:
- σ_k is the forcing term, 0 < σ_min ≤ σ_k ≤ σ_max < 1;
- μ_k is a perturbation parameter which satisfies

    μ_k ||ẽ|| ≤ ||F(p^(k), r^(k))||.   (19)

If we isolate ρ̂_k ẽ in (16), take the norm of both sides of the equation and replace ρ̂_k with its definition in (18), applying condition (19) we get

    ||F(p^(k), r^(k)) + F′(p^(k), r^(k)) (Δp^(k), Δr^(k))ᵀ|| = σ_k μ_k ||ẽ|| ≤ σ_k ||F(p^(k), r^(k))||.   (20)

Thus, σ_k μ_k ẽ has the meaning of a residual. Therefore, we call (20) the residual condition of the inexact Newton method with forcing term σ_k. We must then choose a μ_k for which the residual condition is satisfied. In this regard, in [23] it is suggested to choose μ_k in the interval

    μ_k^(1) ≤ μ_k ≤ μ_k^(2).   (21)

With this choice, the residual condition is satisfied and the following theorem holds.
Proof First let us determine how μ_k must be chosen for (22) to be satisfied. In this regard, let us substitute

∇Φ(p, r) = 2 F′(p, r)ᵀ F(p, r)
in (22). By (16) and by F(p, r)ᵀ ẽ = eᵀ R P e = pᵀr, after a few simple algebraic steps we can write the descent condition explicitly. It follows that condition (22) is satisfied if μ_k fulfils the bound (24). We now have to prove that μ_k^(1) and μ_k^(2) satisfy (24) and that inequality (21) holds. To this aim, in the following we use the chain of inequalities (e.g., see [30, pp. 278-279])

    ||P R e||₁ ≥ ||P R e|| ≥ (1/√n) ||P R e||₁

(where ||·||₁ indicates the l₁-norm), together with pᵀr = ||P R e||₁, which holds by the positivity of the iterates. This proves that (22) is satisfied for μ_k^(1) and for μ_k^(2). Finally, we also have μ_k^(1) ≤ μ_k^(2), which proves that (21) holds as well.

Solution of the perturbed Newton equation
We now have to solve the perturbed Newton equation (16). This can be done exactly by a direct solver or approximately by an iterative one. This last option is useful for reducing the computational cost of the procedure when the Jacobian matrix is of large dimensions. Moreover, the tolerance of the iterative solver can be set adaptively, so to require less inner iterations when p (k) and r (k) are still far from the solution.
In order to do so, we need to introduce a new parameter δ_k, which is needed to define adaptively the stopping condition of the iterative solver, but which also affects all the following analysis. Every consideration involving δ_k, however, applies also to the case of a direct inner solver: in that case, we simply set δ_k = 0.

Choice of α k
After solving the perturbed Newton equation, we need to compute the new iterate using (17). The damping parameter α k is used to enforce the global convergence of the method and its choice is performed by verifying three conditions: feasibility, centrality and backtracking.

The feasibility condition
The feasibility condition determines the initial value of α, which will then be passed to the centrality conditions. For k = 0, 1, ..., the feasibility condition reads

    α = min( 1, min_{i: Δp_i^(k) < 0} ( −p_i^(k) / Δp_i^(k) ), min_{i: Δr_i^(k) < 0} ( −r_i^(k) / Δr_i^(k) ) ),   (27)

where Δp^(k) and Δr^(k) are the approximate solutions of (16) with residual condition (19). The value of α computed by (27) guarantees the feasibility of the new iterate.
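As an illustration, the feasibility step can be computed with a standard fraction-to-the-boundary rule; the safety factor tau below is an assumption, not a parameter from the paper:

```python
import numpy as np

def feasibility_alpha(p, r, dp, dr, tau=0.9995):
    """Largest damped step alpha <= 1 keeping p + alpha*dp and
    r + alpha*dr strictly positive (fraction-to-the-boundary rule)."""
    steps = [1.0]
    for v, dv in ((p, dp), (r, dr)):
        neg = dv < 0
        if np.any(neg):
            # only components moving toward the boundary limit the step
            steps.append(tau * np.min(-v[neg] / dv[neg]))
    return min(steps)
```

The returned value then serves as the starting point for the centrality conditions described next.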

The centrality conditions
The centrality conditions force the iterates p^(k) and r^(k) to adhere to the central path, keeping them sufficiently far from the boundary of the nonnegative orthant. In this regard, following [25], let us define the functions

    φ^(k)(α) = min_i [ (p^(k) + αΔp^(k))_i (r^(k) + αΔr^(k))_i ] − γ_k τ₁ (p^(k) + αΔp^(k))ᵀ(r^(k) + αΔr^(k)) / n,

    ψ^(k)(α) = (p^(k) + αΔp^(k))ᵀ(r^(k) + αΔr^(k)) − γ_k τ₂ ||F(p^(k) + αΔp^(k), r^(k) + αΔr^(k))||,

where γ_k ∈ [0.5, 1) and τ₁, τ₂ are computed from the initial iterate, with p^(0) > 0 and r^(0) > 0. Imposing also suitable upper bounds on τ₁ and τ₂ (Eq. (30)), it is possible to prove (see [28]) the existence of a value of α satisfying φ^(k)(α) ≥ 0 and ψ^(k)(α) ≥ 0. These two inequalities are the centrality conditions and we denote by α̂_k an α which satisfies them both. In practice, α̂_k is usually computed recursively by dividing α (provided by the feasibility condition) by a factor larger than 1 until both the centrality conditions are satisfied.

We now need to prove the boundedness of the sequences {p^(k)} and {r^(k)}, which is required also to prove that α̂_k is bounded away from zero. To do so, given ε ≥ 0, let us define the level set

    Ω(ε) = { (p, r) : ε ≤ Φ(p, r) ≤ Φ(p^(0), r^(0)) },

with p^(0) > 0 and r^(0) > 0. This last condition is required by Eq. (30). The following theorem holds.
Indeed, if this were not the case, we would have pᵀr > 0 and min_{1≤i≤n} (P R e)_i > 0, contradicting the hypothesis. Therefore, P R e = 0 is not possible. The sequences {p_i^(k)} are then bounded away from zero, proving proposition 1.
Let us then consider the properties of A and of B in Sect. 2.1 and the positivity of p_j > 0 and r_j > 0, j = 1, ..., n (which makes P and R nonsingular). We can prove that A − B P⁻¹ R is nonsingular. Indeed, the matrix −B is a column diagonally dominant matrix with positive diagonal elements and non-positive off-diagonal elements. These properties are preserved when −B is multiplied on the right by diagonal matrices with positive diagonal entries, such as P⁻¹ and R. It follows that A − B P⁻¹ R is the sum of two column diagonally dominant matrices with positive diagonal entries and non-positive off-diagonal entries. Moreover, strict diagonal dominance holds for at least one column index. Hence, A − B P⁻¹ R is nonsingular: it is irreducibly diagonally dominant and (by the sign of its elements) it is also a nonsingular M-matrix.
Considering a matrix like the Jacobian F′(p, r) in (10), the nonsingularity of S = A − B P⁻¹ R and of P implies that the inverse of the Jacobian is (e.g. see [10, p. 108])

    F′(p, r)⁻¹ = [ S⁻¹,  −S⁻¹ B P⁻¹ ;  −P⁻¹ R S⁻¹,  P⁻¹ + P⁻¹ R S⁻¹ B P⁻¹ ].

Since every sub-matrix is bounded, so is F′(p, r)⁻¹. Then, {F′(p^(k), r^(k))⁻¹} is uniformly bounded in Ω(ε), ε > 0, proving proposition 2.
Finally, let us consider that, by the definition of r̃^(k) in (25), the residual r̃^(k) is bounded in Ω(ε), ε > 0, for every k ≥ 0. Moreover, the inequalities in (19) and in (26) pose similar conditions on σ_k μ_k ẽ and on r̃^(k), respectively. Using these relationships and recalling that σ_k + δ_k ≤ σ_max + δ_max < 1 in Ω(ε), ε > 0, we obtain the boundedness of the sequence of the steps {(Δp^(k), Δr^(k))}, completing the proof.
Using this result, it is possible to prove that α̂_k is bounded away from zero as well. In this regard, see [28, Theorem 6].
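The recursive reduction of α under the centrality conditions can be sketched as follows, with φ^(k) and ψ^(k) passed in as callables; the reduction factor is illustrative:

```python
def centrality_alpha(alpha, phi, psi, shrink=2.0, max_reductions=50):
    """Divide alpha (provided by the feasibility condition) by a factor
    larger than 1 until both centrality conditions phi(alpha) >= 0 and
    psi(alpha) >= 0 hold."""
    for _ in range(max_reductions):
        if phi(alpha) >= 0.0 and psi(alpha) >= 0.0:
            return alpha
        alpha /= shrink
    raise RuntimeError("centrality conditions could not be satisfied")
```

Since α̂_k is bounded away from zero, the loop terminates after a finite number of reductions in the setting analyzed above.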

The backtracking condition
The value of α̂_k computed by the feasibility and by the centrality conditions could still be too large to assure a reduction of the merit function Φ(p, r). Thus, we introduce the backtracking condition. In this regard, we use the Inexact Newton Backtracking (INB) algorithm, a line search strategy described in [24] that recursively reduces α̂_k through the relationship α_k = θᵗ α̂_k, with θ ∈ (0, 1) and t a nonnegative integer which is gradually increased. The procedure stops as soon as α_k satisfies

    ||F(p^(k) + α_k Δp^(k), r^(k) + α_k Δr^(k))|| ≤ [1 − β α_k (1 − (σ_k + δ_k))] ||F(p^(k), r^(k))||,   (32)

with β ∈ (0, 1). The proof that t is finite, and, therefore, that α_k remains bounded away from zero, parallels that in [28, pp. 364-366].
Moreover, it is also possible to show (see [12, p. 9]) that the damped inexact Newton method has super-linear local convergence.
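A minimal sketch of the INB reduction, assuming the acceptance test of condition (32) with σ_k = δ_k = 0; the constants β and θ are illustrative:

```python
import numpy as np

def backtrack(alpha, p, r, dp, dr, F, beta=1e-4, theta=0.5, max_t=30):
    """Shrink alpha by theta until the step gives a sufficient
    reduction of ||F||, as in the INB line search."""
    f0 = np.linalg.norm(F(p, r))
    for _ in range(max_t):
        f1 = np.linalg.norm(F(p + alpha * dp, r + alpha * dr))
        if f1 <= (1.0 - beta * alpha) * f0:
            return alpha
        alpha *= theta
    raise RuntimeError("backtracking failed")
```

In practice, as the experiments in Sect. 5 show, this condition is rarely triggered for the analyzed problems, so the value passed in by the centrality conditions is usually returned unchanged.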

A note on the Fischer-Burmeister-Newton iteration
As mentioned earlier, complementarity problems connected with hydrodynamic lubrication have been recently solved in [45] using a solution algorithm based on the Fischer-Burmeister-Newton (FBN) method, which was introduced in [26].
In the FBN approach, F(p, r) in (8) is replaced by F̃(p, r), obtained by substituting each complementarity equation p_i r_i = 0 with φ(p_i, r_i) = 0, where φ is the Fischer-Burmeister function

    φ(a, b) = a + b − √(a² + b²),

and the Newton method is then applied to the nonlinear system F̃(p, r) = 0. The definition of the method then follows (11) and (12) with F̃ instead of F. Here we call (Δp^(k), Δr^(k)) the solution of the linear system (12) applied to F̃(p, r) = 0.
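The componentwise Fischer-Burmeister function is standard and can be sketched directly:

```python
import numpy as np

def fischer_burmeister(a, b):
    """phi(a, b) = a + b - sqrt(a^2 + b^2): it vanishes if and only if
    a >= 0, b >= 0 and a*b = 0, so replacing each equation p_i r_i = 0
    by phi(p_i, r_i) = 0 encodes the complementarity conditions."""
    return a + b - np.sqrt(a * a + b * b)
```

Note that, unlike the bilinear rows P R e, this reformulation does not by itself keep the iterates inside the nonnegative orthant, which is the behavior compared in the experiments below.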
It is easy to show that the FBN method also suffers from the problem that, if a component of p^(k) or of r^(k) reaches the boundary of the feasible region, it sticks to it in all the successive iterations.
In order to ensure global convergence, we assume the non-singularity of the Jacobian of F̃(p^(k), r^(k)) and we consider two approaches:
1. Introduce a damping parameter α_k satisfying the Armijo rule [6]

    Φ̃(p^(k) + α_k Δp^(k), r^(k) + α_k Δr^(k)) ≤ Φ̃(p^(k), r^(k)) + β̃ α_k ∇Φ̃(p^(k), r^(k))ᵀ (Δp^(k), Δr^(k)),   (36)

with β̃ ∈ (0, 1), e.g., β̃ = 10⁻⁴. In this case, the global convergence can be proved by showing that (Δp^(k), Δr^(k)) is a descent direction for the merit function Φ̃(p, r) = ||F̃(p, r)||².
2. Use the backtracking condition (32) with σ_k = 0. In this case, the method becomes an inexact Newton method. Therefore, one has to prove that the inequalities (33) and (34) are satisfied. In this regard, defined the compact level set of Φ̃ associated with the initial iterate, it is easy to prove the boundedness of the sequence {F̃′(p^(k), r^(k))⁻¹} by taking into account the non-singularity of the Jacobian matrix. We can then make the same considerations as for the DIN method and, setting ξ_k = 1 − β̃ α_k (1 − δ_k), we can show that both the convergence inequalities (33) and (34) are satisfied with F̃ and F̃′ instead of F and F′. It is possible to show that this method converges also for δ_k = 0 (that is, for a direct inner solver) by following the convergence theorem in [34, §6.3].

Numerical experiments
We now introduce the numerical experiments, performed using a Fortran implementation of the method. In the following, we define the analyzed problems and the numerical parameters used in the experiments.

Setting of the algorithm
The parameters of the solver are reported in Table 1, where the listed values of τ₁ and τ₂ are the upper bounds required in (30) and θ is the factor multiplying α_k until the feasibility, centrality and backtracking conditions are satisfied.
Other parameters depend on the type of problem we are solving. Indeed, in 1D cases with coarser discretizations, we use a direct inner solver (Gaussian elimination) so as to analyze the behavior of the DIN algorithm itself, without having to deal with the additional residual introduced by an iterative inner solver, which would require further remarks. We therefore set σ_k = 0.01 ||F(p^(k), r^(k))||; σ_max = 0.9; δ_k = 0.
The same applies to the more efficient direct solvers used in Sect. 6. When we employ iterative solvers, on the other hand, δ_k + σ_k < 1 and (31) must hold. We thus set σ_k = 0.01 ||F(p^(k), r^(k))||; σ_max = 0.9; when not otherwise specified, δ̂ = 1/n. Since σ_k can become very small, we also impose a lower bound tol_min on the tolerance of the inner solver at the kth iteration. Finally, the Newton iteration is stopped when the stopping criteria are satisfied, with ε₁ = ε₂ = 10⁻⁸ when not otherwise specified. The maximum number of DIN iterations is set at 500, while the maximum number of inner iterations is set at 10,000.

Numerical data of the test problems
We consider both homogeneous and non-homogeneous Dirichlet boundary conditions on p. This describes the two physical situations that can occur: in the first case, we are starting in a cavitated situation, while in the second the entire phenomenon happens inside the domain. We also modify h(x), changing the shape of the film. In particular, we reproduce the Convergent-Divergent (C-D) and the Divergent-Convergent (D-C) schematics (see [29]). We can reproduce the C-D schematics by defining h(x) as a convex function attaining h_max at x = a and at x = b and h_min at x = (a + b)/2. In this regard, we define h(x) as a parabola h(x) = ax² + bx + c or as a sinusoid. The D-C schematics is reproduced analogously with concave functions. In the following, when we speak of divergent and convergent parts of the domain, we thus refer to the increasing or decreasing thickness of the film, respectively.
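As an example, a sinusoidal C-D profile matching the description above (h_max at the endpoints, h_min at the midpoint) can be written as follows; the exact formula used in the paper may differ:

```python
import numpy as np

def h_sinusoid(x, a, b, hmin, hmax):
    """Convergent-divergent sinusoidal film thickness:
    h(a) = h(b) = hmax and h((a + b)/2) = hmin."""
    return hmin + 0.5 * (hmax - hmin) * (1.0 + np.cos(2.0 * np.pi * (x - a) / (b - a)))
```

The D-C profile is obtained analogously by exchanging the roles of h_min and h_max.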
Depending on the boundary conditions and on the configuration of the film, we define the following three test problems, embedding the cases most commonly analyzed in the literature:

Validation of the method
In this subsection, we present the results obtained by solving Problem 1, Problem 2 and Problem 3 with the numerical data in (38) (scaled so that the result for p is in megapascals, as commonly done in the literature) by the DIN method. The film thickness h(x) is here described by a sinusoid as in Sect. 4.2 and the domain is discretized by n = 100 inner points. Figure 1 provides the plots of the pressure for all these cases, along with the plots of r and of h(x).
The C-D configuration with homogeneous Dirichlet conditions on p is the simplest case, whose solution is provided by several articles in the literature (e.g. see [29], where also [13,40] are cited). The trend of the pressure computed with our code is the same as in these works, as can be seen in Fig. 1a. Moreover, we also see that the complementarity condition is always respected.
The same can be said for Problem 2 (Fig. 1b) and for Problem 3 (Fig. 1c). This latter situation is probably the most interesting one: indeed, as stated in [29], some formulations (like those in [33] and in [40]) fail to give correct results for Problem 3, since they assume that cavitation can occur only in the divergent parts of the profile. This leads to some inaccuracies at the boundary of Problem 2 as well. On the contrary, the cavitation model adopted here leads to physically realistic solutions: in Fig. 1c, for instance, cavitation occurs in the converging part of the film, as in [29]. All the computed profiles are, thus, in accordance with the most recent results found in the literature.
To further assess the validity of our approach, we also formulate h(x) in a different way. As mentioned in Sect. 4.2, we consider h(x) described by a parabola, as well. Considering, for example, Problem 1, in Fig. 2 we compare the results obtained for a parabolic and a sinusoidal h(x).
Qualitatively, the pressure profiles are very similar. However, as can be intuitively expected, a less abrupt decrease in thickness leads to lower peaks of pressure. The behavior is the same for Problem 2 and Problem 3.
Finally, in Table 2 we report the number of DIN, centrality and backtracking iterations, together with the values of the stopping criteria c₁ and c₂ (defined as in (37)) for the three test problems. In Table 3 we instead report the results obtained by solving Problem 1 with various discretizations, using GMRES as inner solver with the parameters in Sect. 4 and tol_min = 10⁻¹². Here the implementation of GMRES follows [31, p. 45], with Givens rotations and re-orthogonalization. We remark that we are not concerned here with the efficiency of the method, but only with its validation with direct and iterative solvers. Efficient implementations are instead studied in Sect. 6.

Role of the conditions on α k
We now analyze what happens when we remove some conditions (feasibility, centrality or backtracking) in the choice of the step-size α_k. We also consider what happens if the reduction of Argaez, Tapia and Velazquez is introduced. The results of this analysis are reported in Table 4, where the different conditions on α_k are marked in the following way:
- F marks the presence of the feasibility condition;
- C marks the presence of the centrality condition;
- B marks the presence of the backtracking condition;
- R marks the presence of the Argaez, Tapia and Velazquez reduction.
Finally, the star * next to the value of the stopping conditions c₁ and c₂ indicates that the algorithm converged, but to a wrong solution. We notice that the reduction of Argaez, Tapia and Velazquez does indeed reduce the number of iterations: the centrality conditions are no longer triggered in any of the considered problems, and the number of DIN iterations is almost halved.
Although both centrality and backtracking are, in general, needed for the algorithm to converge, we notice that backtracking is never triggered in the analyzed cases. Therefore, the solution does not change when we remove it. Similarly, Table 4 shows not only that the algorithm converges when we remove the centrality conditions, but also that α_k still does not trigger backtracking. Therefore, α_k is never reduced after the feasibility condition. This results in a larger step, which explains the smaller number of iterations. However, we remark that this applies only to the analyzed problems: it does not mean that, in general, disregarding the centrality conditions leads to higher efficiency or to faster convergence. Both centrality and backtracking are required for ensuring convergence, as shown in Sect. 3.
If we also remove the feasibility condition, the centrality conditions become fundamental in order to compute the correct solution. Otherwise, the backtracking condition still ensures that the algorithm converges, but the computed solution is not correct. Indeed, the solution becomes unstable in the parts of p and of r close to zero, where large oscillations and negative values appear. The reductions performed by the backtracking condition alone are, therefore, not sufficient, since the adherence to the central path is no longer guaranteed.
On the other hand, removing the feasibility condition while preserving the centrality conditions does not significantly affect the results, but only increases the number of required centrality iterations. This could be expected from the analysis of the algorithm: indeed, an α satisfying the centrality conditions also satisfies feasibility, the latter being introduced only in order to pass a "better", already feasible α to the centrality conditions.

Role of the perturbation parameter μ_k and of the starting vectors p^(0) and r^(0)
In Table 1 we chose μ_k equal to the upper bound μ_k^(2), thus obtaining a larger perturbation. Arguably, this helps avoid stagnation when we are close to the boundary. To verify this, let us examine what happens when we change μ_k and the starting vectors p^(0) and r^(0). The results are reported in Table 5, where Maxit indicates that the maximum number of DIN iterations has been reached before convergence. We notice that the convergence is never faster for μ_k = μ_k^(1). On the contrary, it gets much slower as the initial iterates p^(0) and r^(0) are reduced. Indeed, when the initial iterates are large, the difference is negligible, especially for Problem 2 and for Problem 3, for which the results are indistinguishable in the two cases. However, for p^(0) = r^(0) = 10⁻¹⁵, the algorithm does not converge if the perturbation parameter is too small: we are indeed so near to the boundary of the non-negative orthant that several centrality iterations are triggered, leading the algorithm to stall. This is particularly evident for Problem 3.
Another effective choice is to set the perturbation parameter adaptively, switching between μ_k^(1) and μ_k^(2) according to a suitable condition. In the considered cases, this leads exactly to the same results computed choosing μ_k = μ_k^(2). In other cases, however, the accuracy of the solution can increase if we choose μ_k = μ_k^(1) when possible.

Comparison with the Fischer-Burmeister-Newton method
Finally, we compare the DIN method with the FBN method. As mentioned before, the FBN iteration has been used in [45] for the solution of lubrication problems, where it is remarked that this algorithm is very efficient. However, it has to be noted that it also accepts iterates p^(k) and r^(k) with negative components: indeed, there is nothing preventing this, while in the DIN method the non-negativity conditions are enforced by the perturbation ρ̂_k. We now show that this difference is actually visible in the numerical experiments.
Solving the one-dimensional problems by the FBN method, we get the results in Table 6, where FBN-A denotes the FBN method with the Armijo condition (36) and FBN-I denotes the FBN method with the inexact Newton backtracking condition (32) with F̃ instead of F and σ_k = δ_k = 0. In all cases, we set n = 100.
We see that both FBN methods converge in fewer iterations than the DIN method. This is a good measure of the computational cost as well: indeed, as better described in the next section, the solution of the linear systems arising at each Newton iteration is arguably the most expensive part of the procedure. However, the faster convergence of the FBN iterations is likely linked to the specific problems we analyzed: for these particular problems, similar and even better performances were obtained in Table 4 with the DIN method when only the feasibility and backtracking conditions were applied.
Moreover, the conditions p ≥ 0, r ≥ 0 are not always respected. This is shown in Table 7, which reports the values of some components of p at points where p should be equal to zero. The table refers to Problem 1, but the same applies to the other analyzed problems as well. We see that p computed with DIN is bounded away from zero: its components never get smaller than 10⁻²⁶, ensuring that the non-negativity conditions of the complementarity are satisfied. On the other hand, the FBN implementations oscillate around zero, allowing both positive and negative values for the components of the vector. This violation is below machine precision here and hence it is not a concern for the analyzed problems and for the chosen tolerance. Nonetheless, it is interesting to notice the effect of the absence of a mechanism that strongly enforces the non-negativity of the iterates. Moreover, if we choose ε₁ = ε₂ = 10⁻⁴ (as done for the DIN method in the next section), the FBN iterations present a few values of the order of −10⁻⁹. The DIN iteration is, thus, more stable and the solutions it computes approach zero smoothly and without oscillations, as in Table 7.

An efficient implementation and a 2D example
In the previous section, we validated the DIN method and analyzed the procedure in different situations. However, we were not concerned with the efficiency of the implementation. This is evidenced, for instance, by the large number of linear iterations required in Table 3, which, in turn, greatly affects computational times. In this section, we instead focus on an efficient solution of the problem. The heart of a fast implementation of the DIN method lies in the efficient solution of the inner linear system arising at each DIN iteration. Indeed, this is arguably the most onerous part of the procedure, since the computation of the Jacobian is cheap (see (10)) and the choice of an appropriate step-length α_k is performed by algebraic operations which are not computationally expensive. We also relax the outer tolerances to ε₁ = ε₂ = 10⁻⁴, which is enough to compute accurate results. Indeed, with this choice, in the analyzed cases the norm of F(p, r) at convergence does not exceed 10⁻⁶ (and it usually lies in the range 10⁻⁹ to 10⁻¹¹), while c₂ < ε₂ = 10⁻⁴ is sufficient to ensure that the solution does not change rapidly between two successive iterations.
We consider different kinds of inner solvers:
- direct methods for full matrices;
- direct methods for sparse matrices;
- preconditioned iterative methods.
In order to guarantee an efficient implementation and to enhance reproducibility, we employ well-known software libraries. In particular, the direct solver for full matrices is the LU solver of the Lapack library. Regarding sparse direct solvers, we consider SuperLU [22,36] and MUMPS [2,3], called through the PETSc suite. Finally, we use preconditioners and iterative solvers of the PETSc package. In this last case, the inner systems are solved inexactly and the tolerance of the linear solver is chosen as in Sect. 4 with tol_min = min{ε1, ε2} = 10^-4. We always used left preconditioning, which is also the default setting for most Krylov solvers in PETSc.
In the following, we consider Problem 1 (a 1D example) and Problem 4 (a 2D example). In the tables, the columns titled time report computational times (in seconds) for running the entire algorithm in a Unix environment on a laptop with a dual-core 2.7 GHz Intel Core i5 ("Broadwell") processor.
Starting from the 1D case, Table 8 reports the results obtained with the LU factorization. When n is small, the solver works efficiently, but computational times rapidly increase with the order of the Jacobian matrix. Exploiting the sparsity of the Jacobian thus becomes crucial for large problems.
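The dense approach can be sketched as follows. This is not the paper's Fortran code: it is a minimal Python stand-in using SciPy's wrappers around the same Lapack LU routines (dgetrf/dgetrs), with a random well-conditioned matrix playing the role of the actual Jacobian (10).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve  # wraps Lapack dgetrf/dgetrs

# Illustrative dense stand-in for the Jacobian of an inner Newton system.
rng = np.random.default_rng(0)
n = 200
J = np.eye(n) + 0.1 * rng.standard_normal((n, n))
rhs = rng.standard_normal(n)

# Factor once, then solve. With dense storage the factorization costs
# O(n^3), which explains why timings grow rapidly with the problem order.
lu, piv = lu_factor(J)
step = lu_solve((lu, piv), rhs)

assert np.allclose(J @ step, rhs)
```

Since the factorization ignores the sparsity of the Jacobian entirely, every zero entry is stored and operated on, which is what makes the dense route uncompetitive as n grows.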
To this end, we can use direct solvers for sparse matrices (Table 9) or iterative solvers (Table 10). We see that direct solvers for sparse matrices behave much better than those for full matrices and remain efficient even with thousands of variables.
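A sparse direct solve can be sketched in the same way. Again this is an illustrative stand-in, not the paper's implementation: SciPy's splu is backed by the same SuperLU library mentioned above, and a simple tridiagonal matrix plays the role of the sparse Jacobian block.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu  # SuperLU-backed sparse LU

# Tridiagonal stand-in for the sparse block A of the 1D Jacobian.
n = 5000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
rhs = np.ones(n)

# The factorization exploits sparsity: for a banded matrix like this one,
# fill-in stays bounded and the cost is far below the dense O(n^3).
lu = splu(A)
x = lu.solve(rhs)

assert np.allclose(A @ x, rhs)
```

For problems of this structure, the factorization can also be reused across several inner solves as long as the Jacobian does not change.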
Regarding iterative solvers, Table 10 reports the results obtained with GMRES or BiCG-STAB with an ILU preconditioner. Other common preconditioners (such as Jacobi, SOR, etc.) did not behave well, with the sole exception of the block-Jacobi preconditioner. ILU preconditioning, on the other hand, consistently reduces the number of linear iterations, as Table 10 shows, and computational times approach those of direct methods for sparse matrices.
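The ILU-preconditioned Krylov solve can be sketched as follows, assuming SciPy's gmres and spilu in place of the PETSc solvers actually used in the paper; the incomplete factorization is wrapped in a LinearOperator and passed as the preconditioner M.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, spilu, LinearOperator

# Tridiagonal stand-in for the sparse Jacobian block.
n = 5000
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
rhs = np.ones(n)

# Incomplete LU used as preconditioner, mirroring PETSc's -pc_type ilu.
ilu = spilu(A, drop_tol=1e-4)
M = LinearOperator((n, n), matvec=ilu.solve)

x, info = gmres(A, rhs, M=M)
assert info == 0  # 0 means the Krylov iteration converged
assert np.linalg.norm(A @ x - rhs) <= 1e-4 * np.linalg.norm(rhs)
```

Because the ILU factors of a tridiagonal matrix are nearly exact, the preconditioned iteration converges in very few steps here; on the actual Jacobians the drop tolerance trades setup cost against iteration count.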
Passing to the 2D case, the order of the Jacobian matrix rapidly increases with n: each block of the Jacobian is of order n^2. Moreover, A is block-tridiagonal and sparser (although each row has more nonzero elements). Therefore, we now only consider direct solvers for sparse matrices and iterative solvers. The results obtained using SuperLU and ILU-preconditioned GMRES are reported in Table 11, while Fig. 3 shows the solution computed by ILU-preconditioned GMRES with n = 100. We notice that the iterative solver becomes more competitive as n increases: computational times are practically equal when n = 100. For yet larger systems, it is also advisable to employ multigrid preconditioners, in order to make the number of linear iterations more independent of the order of the problem.

Lastly, we conclude with some remarks on saddle-point (e.g. see [9]) and augmentation-based preconditioners (e.g. see [38]), which are commonly used for systems similar to those arising at each DIN iteration. Since these preconditioners are not natively implemented in PETSc, a direct comparison with the results reported above would be inconsistent. It nonetheless seems worthwhile to comment on their implementation for the problems of our concern.
Starting from augmentation-based preconditioners, they are generally applied to systems whose (2,2) block is null or symmetric negative (semi)definite. Recalling the form of the Jacobian (10), we can put ourselves in this situation by multiplying both sides of the Newton system by the matrix Ĩ = diag(I, −I), where I ∈ R^{n×n} is the identity matrix of order n. In the application of the preconditioner, a positive definite weight matrix W must then be chosen. A common choice is the (2,2) block of the coefficient matrix with its sign changed, that is, W = P. This, however, can cause numerical difficulties as some diagonal elements of P approach zero, especially when we want to solve the complementarity problem with a small tolerance. In this case, it is therefore advisable to use other diagonal weight matrices, such as the identity matrix.

Passing to saddle-point preconditioners, if ‖P‖ is small enough, the problem preserves the characteristics of a generalized saddle-point problem, and the preconditioners analyzed, for instance, in [42] can be of interest. Here, the (1,1) block of the coefficient matrix is first split as A = F − E. Then, considering the Jacobian (10), the saddle-point preconditioner with exact Schur complement S = P − R F^{-1} B can be applied, where P, R are diagonal and B is lower bidiagonal. It is worth noticing that the Schur complement can be computed exactly when A is split by a Jacobi or Gauss-Seidel splitting. Indeed, if F is the diagonal of A (Jacobi splitting), P − R F^{-1} B is lower bidiagonal and its inverse is readily available. In the case of the Gauss-Seidel splitting, since A is tridiagonal, F is lower bidiagonal; hence, P − R F^{-1} B is lower triangular and can be inverted by forward substitution. Alternatively, preconditioners with an approximate Schur complement [42] can be used as well.
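The structural claim about the Schur complement under the Jacobi splitting can be checked numerically. The sketch below builds random matrices with the block structure described above (P, R diagonal, B lower bidiagonal, A tridiagonal with F = diag(A)) and verifies that P − R F^{-1} B is indeed lower bidiagonal; the particular entries are arbitrary and only the sparsity pattern matters.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8

# Diagonal blocks P, R and a lower-bidiagonal B, as in the Jacobian (10).
P = np.diag(rng.uniform(1.0, 2.0, n))
R = np.diag(rng.uniform(1.0, 2.0, n))
B = np.diag(rng.uniform(1.0, 2.0, n)) + np.diag(rng.uniform(0.1, 1.0, n - 1), -1)

# Tridiagonal A and its Jacobi splitting A = F - E with F = diag(A).
A = (np.diag(rng.uniform(2.0, 3.0, n))
     + np.diag(rng.uniform(0.1, 1.0, n - 1), -1)
     + np.diag(rng.uniform(0.1, 1.0, n - 1), 1))
F = np.diag(np.diag(A))

# Exact Schur complement for this splitting.
S = P - R @ np.linalg.inv(F) @ B

# Lower bidiagonal: everything above the diagonal and below the
# first subdiagonal must vanish, so S is cheap to invert.
assert np.allclose(np.triu(S, 1), 0.0)
assert np.allclose(np.tril(S, -2), 0.0)
```

Since F and R are diagonal, R F^{-1} B inherits the lower-bidiagonal pattern of B, and subtracting it from the diagonal P cannot introduce entries outside that pattern; the same reasoning with a lower-bidiagonal F gives the lower-triangular result for the Gauss-Seidel splitting.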

Conclusions
We introduced a DIN algorithm for solving complementarity problems and presented it with special regard to LCPs arising in hydrodynamic lubrication. We proved the global convergence of the proposed method and demonstrated its applicability to the problems of our concern. In the numerical experiments we also analyzed the components of the algorithm, highlighting, for example, the advantage of choosing a larger perturbation parameter μ_k. We also compared the DIN and the FBN iterations, confirming the ability of the DIN method to strongly enforce the non-negativity conditions of the complementarity problem, whereas the FBN also accepts negative values in the solution vectors (although within machine precision). Lastly, we highlighted the importance of solving the inner linear systems efficiently in order to obtain a fast implementation of the method. In particular, we compared the effect of different linear solvers, all efficiently implemented through the Lapack and PETSc packages, and concluded our analysis by solving a 2D example.