Open Access

On iterative computation of fixed points and optimization

Fixed Point Theory and Applications 2015, 2015:128

https://doi.org/10.1186/s13663-015-0372-8

Received: 29 April 2015

Accepted: 30 June 2015

Published: 25 July 2015

Abstract

In this paper, a semi-local convergence analysis of the Gauss-Newton method for convex composite optimization is presented using the concept of quasi-regularity in order to approximate fixed points in optimization. Our convergence analysis is presented first under the L-average Lipschitz and then under generalized convex majorant conditions. The results extend the applicability of the Gauss-Newton method under the same computational cost as in earlier studies such as Li and Ng (SIAM J. Optim. 18:613-642, 2007), Moldovan and Pellegrini (J. Optim. Theory Appl. 142:147-163, 2009), Moldovan and Pellegrini (J. Optim. Theory Appl. 142:165-183, 2009), Wang (Math. Comput. 68:169-186, 1999) and Wang (IMA J. Numer. Anal. 20:123-134, 2000).

Keywords

fixed point; Gauss-Newton method; majorizing sequences; convex composite optimization; semi-local convergence

MSC

47H10; 47J05; 47J25; 65G99; 49M15; 41A29

1 Introduction

In this paper, we are concerned with the convex composite optimization problem. Many problems in mathematical programming, such as convex inclusion problems, minimax problems, penalization methods, goal programming problems, and constrained optimization problems, can be formulated as composite optimization problems (see, for example, [16]).

Recently, in the elegant study by Li and Ng [7], the notion of quasi-regularity for \(x_{0} \in \mathbb{R}^{l}\) with respect to the inclusion problem was used. This notion generalizes the case of regularity studied in the seminal paper by Burke and Ferris [3], as well as the case when \(d \longrightarrow F'(x_{0}) d - \mathcal {C}\) is surjective, a condition introduced by Robinson in [8, 9] (see also [1, 10, 11]).

In this paper, we present a convergence analysis of the Gauss-Newton method (GNM) (see the method (GNA) in Section 2). In [7], the convergence of the method (GNA) is based on the generalized Lipschitz conditions introduced by Wang [12, 13] (made precise in Section 2). In [11], we presented a convergence analysis for the method (GNM) in the setting of Banach spaces finer than the one in [12–16], with the advantages \((\mathcal{A})\): tighter error estimates on the distances involved and at least as precise information on the location of the solution. These advantages were obtained (under the same computational cost) using the same or weaker hypotheses. Here, we provide the same advantages \((\mathcal{A})\) for the method (GNA).

The rest of the study is organized as follows: Section 2 contains the notions of generalized Lipschitz conditions and the majorizing sequences for the method (GNA). In order for us to make the paper as self-contained as possible, the notion of quasi-regularity is re-introduced (see, for example, [7]) in Section 3. Semi-local convergence analysis of the method (GNA) using L-average conditions is presented in Section 4. In Section 5, some convex majorant conditions are used for the semi-local convergence of the method (GNA).

2 Generalized Lipschitz conditions and majorizing sequences

The purpose of this paper is to study the convex composite optimization problem:
$$ \min_{x \in \mathbb{R}^{l}} \phi(x):= h \bigl(F(x) \bigr), $$
(2.1)
where \(h : \mathbb{R}^{m} \rightarrow \mathbb{R}\) is a convex operator, \(F : \mathbb{R}^{l} \rightarrow \mathbb{R}^{m}\) is a Fréchet-differentiable operator and \(m , l \in \mathbb{N}^{\star}\).

The study of the problem (2.1) is important: on the one hand, it provides a unified framework for the development and analysis of algorithmic methods and, on the other hand, it is a powerful tool for the study of first- and second-order optimality conditions in constrained optimization (see, for example, [17]).

We assume that the minimum \(h_{\min}\) of the function h is attained. The problem (2.1) is related to the following:
$$ F(x) \in \mathcal {C}, $$
(2.2)
where
$$ \mathcal {C}= \operatorname{argmin} h $$
(2.3)
is the set of all minimum points of h.

Semi-local convergence analyses for the Gauss-Newton method (GNM) have been presented using the following popular algorithm (see, for example, [1, 7, 17]):

Algorithm

(GNA): \((\xi, \Delta, x_{0})\)

Let \(\xi\in[1, \infty[\), \(\Delta\in\,] 0, \infty]\) and, for each \(x \in \mathbb{R}^{l}\), define \(\mathcal {D}_{\Delta}(x)\) by
$$\begin{aligned} \mathcal {D}_{\Delta}(x) =& \bigl\{ d\in \mathbb{R}^{l} : \| d \| \leq\Delta, h \bigl(F(x)+ F'(x) d \bigr) \leq h \bigl(F(x) + F' (x) d' \bigr) \\ &{} \mbox{for all } d'\in \mathbb{R}^{l} \mbox{ with } \bigl\| d' \bigr\| \leq\Delta \bigr\} . \end{aligned}$$
(2.4)

Let also \(x_{0} \in \mathbb{R}^{l}\) be given. Having \(x_{0}, x_{1}, \ldots, x_{k}\) (\(k \geq0\)), determine \(x_{k+1}\) by the following.

If \(0 \in \mathcal {D}_{\Delta}(x_{k})\), then STOP;

If \(0 \notin \mathcal {D}_{\Delta}(x_{k})\), choose \(d_{k}\) such that \(d_{k} \in \mathcal {D}_{\Delta}(x_{k})\) and
$$ \| d_{k} \|\leq\xi d \bigl(0, \mathcal {D}_{\Delta}(x_{k}) \bigr). $$
(2.5)

Then set \(x_{k+1} = x_{k} + d_{k}\).

Here, \(d(x, W)\) denotes the distance from x to W in the finite dimensional Banach space containing W. Note that the set \(\mathcal {D}_{\Delta}(x)\) (\(x \in \mathbb{R}^{l}\)) is nonempty and is the solution set of the following convex optimization problem:
$$ \min_{d \in \mathbb{R}^{l} , \| d\|\leq\Delta} h \bigl(F(x)+ F'(x) d \bigr), $$
(2.6)
which can be solved by well-known methods such as the subgradient, cutting plane, or bundle methods (see, for example, [18, 19]).
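As a concrete illustration (ours, not from the paper), the following minimal Python sketch runs the algorithm (GNA) in the scalar case \(l=m=1\) with \(h(y)=|y|\) and \(F(x)=x^{2}-2\); here \(\operatorname{argmin} h=\{0\}\), so (2.2) reads \(F(x)=0\), and the subproblem (2.6) has a closed-form solution. The function name `gna` and all numerical values are illustrative choices.

```python
# Minimal sketch of algorithm (GNA), not from the paper: scalar case l = m = 1
# with h(y) = |y| and F(x) = x**2 - 2 (illustrative choices). Here
# argmin h = {0}, so (2.2) reads F(x) = 0, and the subproblem (2.6), namely
# minimizing |F(x) + F'(x) d| over |d| <= Delta, has a closed-form solution.

def gna(x0, delta=1.0, tol=1e-12, max_iter=50):
    F = lambda x: x * x - 2.0
    dF = lambda x: 2.0 * x
    x = x0
    for _ in range(max_iter):
        # Exact minimizer of |F(x) + F'(x) d| projected onto [-Delta, Delta];
        # taking the exact minimizer corresponds to xi = 1 in (2.5).
        d = max(-delta, min(delta, -F(x) / dF(x)))
        if abs(d) <= tol:   # 0 is (numerically) in D_Delta(x_k): STOP
            break
        x = x + d           # x_{k+1} = x_k + d_k
    return x

root = gna(2.0)
print(root)  # ≈ sqrt(2)
```

With Δ finite the first steps are damped by the projection onto \([-\Delta, \Delta]\), exactly as (2.4) prescribes; for Δ = ∞ the iteration is plain Newton's method for \(F(x)=0\).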

Notice that, in the special case when \(l=m\) and \(F(x)=H(x)-x\), the results obtained in this paper can be used to iteratively compute fixed points of the operator \(H:\mathbb {R}^{m} \rightarrow \mathbb {R}^{m}\). Therefore, the results obtained in this paper are useful in fixed point theory and its applications in optimization.
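For instance, taking \(H=\cos\), the reduction \(F(x)=H(x)-x\) with \(\mathcal {C}=\{0\}\) turns the fixed point problem \(H(x)=x\) into the equation \(F(x)=0\), which (GNA) with \(h=|\cdot|\) and \(\Delta=\infty\) solves as a Newton-type iteration. A hedged sketch (the helper `fixed_point` is our illustration, not part of the paper):

```python
import math

# Illustration (ours, not from the paper): computing the fixed point of
# H = cos via the reduction F(x) = H(x) - x with C = {0}; (GNA) with
# h = |.| and Delta = infinity then becomes a Newton-type iteration.
def fixed_point(H, dH, x0, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        F, dF = H(x) - x, dH(x) - 1.0   # F and F' for F(x) = H(x) - x
        d = -F / dF                      # Newton-type correction d_k
        x += d
        if abs(d) <= tol:
            break
    return x

x_star = fixed_point(math.cos, lambda x: -math.sin(x), x0=1.0)
print(x_star)  # ≈ 0.739085, the fixed point of cos
```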

Let \(U(x,r)\) denote the open ball in \(\mathbb {R}^{l}\) (or \(\mathbb {R}^{m}\)) centered at x and of radius \(r>0\). By \(\overline{U} (x,r)\) we denote its closure. Let W be a closed convex subset of \(\mathbb {R}^{l}\) (or \(\mathbb {R}^{m}\)). The negative polar of W denoted by \(W ^{\circleddash}\) is defined as
$$ W^{\circleddash} =\bigl\{z : \langle z, w\rangle \leq0 \mbox{ for each } w \in W\bigr\} . $$
(2.7)

We need the following notion of the generalized Lipschitz condition due to Wang in [12, 13] (see also [7]). From now on, \(L : [0, \infty[\longrightarrow\,]0, \infty[\) (or \(L_{0}\)) denotes a nondecreasing and absolutely continuous function. Moreover, η and α denote given positive numbers.

Definition 2.1

Let \(\mathcal {Y}\) be a Banach space and let \(x_{0} \in \mathbb{R}^{l}\). Let \(G : \mathbb{R}^{l} \longrightarrow \mathcal {Y}\). Then, G is said to satisfy:

(1) the center \(L_{0}\)-average condition on \(U (x_{0} , r)\) if
$$ \bigl\| G(x) - G(x_{0} ) \bigr\| \leq \int_{0}^{ \| x- x_{0} \|} L _{0} (u)\,du $$
(2.8)
for all \(x \in U(x_{0} , r)\);
(2) the L-average Lipschitz condition on \(U (x_{0} , r)\) if
$$ \bigl\| G(x) - G(y ) \bigr\| \leq \int_{\| y - x_{0} \|}^{\| x- y \|+ \| y - x_{0} \|} L (u)\,du $$
(2.9)
for all \(x,y \in U(x_{0} , r)\) with \(\| x- y \|+ \| y - x_{0} \|\leq r\).

Remark 2.2

It follows from (2.8) and (2.9) that, if G satisfies the L-average Lipschitz condition, then it satisfies the center L-average condition (i.e., (2.8) with \(L_{0}=L\)), but not necessarily vice versa. The inequality
$$ L_{0} (u) \leq L (u) $$
(2.10)
for each \(u \in[0 , r]\) holds in general, and \({L}/{L_{0}}\) can be arbitrarily large (see [1, 2, 10]).

Definition 2.3

Define a majorizing function \(\psi_{\alpha}\) on \([0, + \infty)\) by
$$ \psi_{\alpha}(t)= \eta- t + \alpha \int _{0} ^{t} L(u) (t-u)\,du $$
(2.11)
for each \(t \geq0\) and a majorizing sequence \(\{ t_{\alpha, n } \}\) by
$$ t_{\alpha, {0}} =0 ,\qquad t_{\alpha, {n+1}} = t_{\alpha, {n}} - \frac {\psi_{\alpha}( t_{\alpha, {n}})}{ \psi' _{\alpha}( t_{\alpha, {n}})} $$
(2.12)
for each \(n\geq0\). The sequence \(\{ t_{\alpha, n } \}\) was used in [7] as a majorizing sequence for \(\{x_{n} \}\) generated by the algorithm (GNA).
The sequence \(\{ t_{\alpha, n } \}\) can also be written, equivalently, with \(t_{\alpha, 1 } =\eta\), as
$$ t_{\alpha, {n+1}} = t_{\alpha, {n}} - \frac { \gamma_{\alpha, {n}} }{ \psi' _{\alpha}( t_{\alpha, {n}}) }, $$
(2.13)
where
$$\begin{aligned} \gamma_{ \alpha, {n}} =&\alpha \int_{0} ^{1} \int_{t_{\alpha, {n-1}}}^{t_{\alpha, {n-1}} + \theta (t_{\alpha , {n}} - t_{\alpha, {n-1}} )} L(u)\,du \,d\theta ( t_{\alpha, {n}} - t_{\alpha, {n-1}} ) \\ =&\alpha \int_{0}^{t_{\alpha, {n}} - t_{\alpha, {n-1}}} L(t_{\alpha, {n-1}} + u) (t_{\alpha, {n}} -t_{\alpha, {n-1}} - u )\,du \end{aligned}$$
(2.14)
since (see (4.20) in [7])
$$ \psi_{\alpha}(t_{\alpha, {n}} ) =\gamma_{\alpha, {n}} $$
(2.15)
for each \(n\geq1\).
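For the constant kernel \(L(u)\equiv L\), (2.11) reduces to the Kantorovich polynomial \(\psi_{\alpha}(t)=\eta-t+\frac{1}{2}\alpha L t^{2}\), whose small zero is \(r_{\alpha}^{\star}=(1-\sqrt{1-2\alpha L\eta})/(\alpha L)\) whenever \(\eta\leq b_{\alpha}=1/(2\alpha L)\). The following sketch (our illustration, with arbitrary sample values) iterates (2.12) and checks the monotone convergence asserted in Lemma 2.4:

```python
import math

# Majorizing sequence (2.12) for the constant kernel L(u) = L (sample values):
# psi_alpha(t) = eta - t + alpha*L*t**2/2 is the Kantorovich polynomial, with
# small zero r_star provided eta <= b_alpha = 1/(2*alpha*L).
def majorizing_sequence(eta, alpha, L, n_steps=25):
    psi = lambda t: eta - t + 0.5 * alpha * L * t * t
    dpsi = lambda t: -1.0 + alpha * L * t
    t, seq = 0.0, [0.0]
    for _ in range(n_steps):
        t = t - psi(t) / dpsi(t)   # Newton step (2.12)
        seq.append(t)
    return seq

eta, alpha, L = 0.2, 1.0, 2.0      # eta <= b_alpha = 0.25, so Lemma 2.4 applies
seq = majorizing_sequence(eta, alpha, L)
r_star = (1 - math.sqrt(1 - 2 * alpha * L * eta)) / (alpha * L)
print(seq[:4], r_star)
```

The sequence increases monotonically from \(t_{\alpha,1}=\eta\) to \(r_{\alpha}^{\star}\), as Lemma 2.4(3) states.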
From now on, we show how our convergence analysis for the algorithm (GNA) is finer than the one in [7]. Define a supplementary majorizing function \(\psi_{\alpha, 0}\) on \([0, + \infty)\) by
$$ \psi_{\alpha, 0} (t)=\eta- t +\alpha \int_{0} ^{t}L _{0} (u) (t-u)\,du $$
(2.16)
for each \(t \geq0\) and the corresponding majorizing sequence \(\{ s _{\alpha, n } \}\) by
$$ s _{\alpha, {0}} =0 ,\qquad s _{\alpha, {1}} = \eta,\qquad s_{\alpha, {n+1}} = s _{\alpha, {n}} - \frac{ \beta_{\alpha, {n}}}{ \psi' _{\alpha, 0}( s _{\alpha, {n}})} $$
(2.17)
for each \(n\geq1\), where \(\beta_{\alpha, {n}}\) is defined as \(\gamma_{\alpha, {n}}\) with \(s_{\alpha, {n-1}}\), \(s_{\alpha, {n}}\) replacing \(t_{\alpha, {n-1}}\), \(t_{\alpha, {n}}\), respectively.

The results concerning \(\{ t_{\alpha, n} \}\) are already in the literature (see, for example, [1, 7, 11]), whereas the corresponding ones for the sequence \(\{ s_{\alpha, n} \}\) can be derived in an analogous way by simply using \(\psi' _{\alpha, 0}\) instead of \(\psi' _{\alpha}\).

First, we need some auxiliary results for the properties of functions \(\psi_{\alpha}\), \(\psi_{\alpha, 0}\) and the relationship between sequences \(\{ s_{\alpha, n } \}\) and \(\{ t_{\alpha, n } \}\). The proofs of the next four lemmas involving the \(\psi_{\alpha}\) function can be found in [7], whereas the proofs for the function \(\psi_{\alpha, 0 }\) are analogously obtained by simply replacing L by \(L_{0}\).

Let \(r_{\alpha}>0\), \(b_{\alpha}>0\), \(r_{\alpha, 0} >0 \), and \(b_{\alpha, 0} >0\) be such that
$$ \alpha \int_{0} ^{r_{\alpha}} L (u)\,du =1,\qquad b_{\alpha}= \alpha \int_{0} ^{r_{\alpha}} L (u) u\,du, $$
(2.18)
and
$$ \alpha \int_{0} ^{r_{\alpha, 0}} L _{0} (u)\,du =1,\qquad b_{\alpha, 0} = \alpha \int _{0} ^{r_{\alpha, 0}} L _{0} (u) u \,du . $$
(2.19)
Clearly, we have
$$ b_{\alpha}< r_{\alpha}$$
(2.20)
and
$$ b_{\alpha, 0} < r _{\alpha, 0} . $$
(2.21)
In view of (2.10), (2.18), and (2.19), we get
$$ r_{\alpha}\leq r_{\alpha, 0} $$
(2.22)
and
$$ b_{\alpha}\leq b_{\alpha, 0} . $$
(2.23)

Lemma 2.4

Suppose that \(0 < \eta\leq b_{\alpha}\). Then \(b_{\alpha}< r_{\alpha}\) and the following assertions hold:

(1) \(\psi_{\alpha}\) is strictly decreasing on \([0, r_{\alpha}]\) and strictly increasing on \([r_{\alpha}, \infty)\) with \(\psi_{\alpha}(\eta) >0\), \(\psi_{\alpha}(r_{\alpha}) = \eta- b_{\alpha}\leq0\), \(\psi_{\alpha}(+ \infty) \geq\eta>0\);

(2) \(\psi_{\alpha, 0}\) is strictly decreasing on \([0, r_{\alpha, 0}]\) and strictly increasing on \([r_{\alpha, 0} , \infty)\) with \(\psi_{\alpha, 0} (\eta) >0\), \(\psi_{\alpha, 0}(r_{\alpha, 0}) = \eta- b_{\alpha, 0} \leq0\), \(\psi_{ \alpha, 0} (+ \infty) \geq\eta>0\).

Moreover, if \(\eta< b_{\alpha}\), then \(\psi_{\alpha}\) has two zeros, denoted by \(r _{\alpha}^{\star} \) and \(r _{\alpha}^{\star\star} \), such that
$$ \eta< r _{\alpha}^{\star} < \frac{r_{\alpha}}{b_{\alpha}} \eta< r_{\alpha}< r _{\alpha}^{\star\star} $$
(2.24)
and, if \(\eta=b_{\alpha}\), then \(\psi_{\alpha}\) has a unique zero \(r _{\alpha}^{\star} =r_{\alpha}\) in \((\eta, \infty)\);
similarly, if \(\eta< b_{\alpha,0}\), then \(\psi_{\alpha,0}\) has two zeros, denoted by \(r _{\alpha, 0}^{\star} \) and \(r _{\alpha, 0}^{\star\star} \), such that
$$\begin{aligned}& \eta< r _{\alpha, 0}^{\star} < \frac{r_{ \alpha, 0} }{b_{\alpha, 0} } \eta< r_{\alpha, 0} < r _{\alpha, 0}^{\star\star} , \\& r _{\alpha, 0}^{\star} \leq r _{\alpha}^{\star}, \end{aligned}$$
(2.25)
$$\begin{aligned}& r _{\alpha, 0}^{\star\star} \leq r _{\alpha}^{\star\star}, \end{aligned}$$
(2.26)
and, if \(\eta=b_{\alpha,0} \), then \(\psi_{\alpha,0}\) has a unique zero \(r _{\alpha, 0}^{\star} =r_{\alpha, 0} \) in \((\eta, \infty)\);

(3) \(\{ t _{\alpha, n} \}\) is strictly monotonically increasing and converges to \(r _{\alpha}^{\star} \);

(4) \(\{ s _{\alpha, n} \}\) is strictly monotonically increasing and converges to its unique least upper bound \(s _{\alpha }^{\star} \leq r_{\alpha, 0}^{\star}\);

(5) The convergence of \(\{ t _{\alpha, n} \}\) is quadratic if \(\eta< b_{\alpha}\) and linear if \(\eta= b_{\alpha}\).

Lemma 2.5

Let \(r_{\alpha}\), \(r_{\alpha,0}\), \(b_{\alpha}\), \(b_{\alpha,0}\), \(\psi _{\alpha}\), \(\psi_{\alpha,0}\) be as defined above. Let \(\overline{\alpha} > \alpha\). Then the following assertions hold:
(1) The functions \(\alpha\rightarrow r_{\alpha}\), \(\alpha\rightarrow r_{\alpha, 0}\), \(\alpha\rightarrow b_{\alpha}\), \(\alpha\rightarrow b_{\alpha, 0}\) are strictly decreasing on \((0, \infty)\);

(2) \(\psi_{\alpha}< \psi_{\overline{\alpha}}\) and \(\psi_{\alpha, 0} < \psi_{\overline{\alpha} , 0}\) on \((0, \infty)\);

(3) the function \(\alpha\rightarrow r _{\alpha}^{\star}\) is strictly increasing on \(I(\eta)\), where
$$I(\eta) = \{ \alpha>0 :\eta\leq b_{\alpha}\}; $$

(4) the function \(\alpha\rightarrow r _{\alpha, 0}^{\star}\) is strictly increasing on \(I(\eta)\).

Lemma 2.6

Let \(0 \leq\lambda< \infty\). Define the functions
$$ \chi(t) = \frac{1}{t ^{2}} \int_{0} ^{t} L(\lambda+u) (t-u )\,du $$
(2.27)
for all \(t \geq0\) and
$$ \chi_{0} (t) = \frac{1}{t ^{2}} \int _{0} ^{t} L _{0}(\lambda+u) (t-u )\,du $$
(2.28)
for all \(t \geq0\). Then the functions χ and \(\chi_{0}\) are increasing on \([0 , \infty)\).

Lemma 2.7

Define the function
$$g_{\alpha}(t)= \frac{\psi_{\alpha}(t)}{\psi_{\alpha} ' (t)} $$
for all \(t \in[0, r _{\alpha}^{\star} )\). Suppose that \(0 < \eta\leq b_{\alpha}\). Then the function \(g_{\alpha}\) is increasing on \([0, r _{\alpha}^{\star} )\).

Next, we show that the sequence \(\{ s _{\alpha, n} \}\) is tighter than \(\{ t _{\alpha, n} \}\).

Lemma 2.8

Suppose that the hypotheses of Lemma  2.4 hold and the sequences \(\{ s _{\alpha, n} \}\), \(\{ t _{\alpha, n} \}\) are well defined for each \(n\geq0\). Then the following assertions hold: for all \(n\geq0\),
$$\begin{aligned}& s_{\alpha, {n}} \leq t _{\alpha, {n}} , \end{aligned}$$
(2.29)
$$\begin{aligned}& s_{\alpha, {n+1}} - s_{\alpha, {n}} \leq t_{\alpha, {n+1}} - t _{\alpha, {n}} , \end{aligned}$$
(2.30)
and
$$ s_{\alpha} ^{\star}= \lim_{n\rightarrow\infty} s_{\alpha, n} \leq r _{\alpha} ^{\star}= t _{\alpha} ^{\star}= \lim_{n\rightarrow\infty} t _{\alpha, n} . $$
(2.31)

Moreover, if the inequality in (2.10) is strict, then the inequalities (2.29) and (2.30) are strict for all \(n >1\). Furthermore, the convergence of \(\{ s _{\alpha, n} \}\) is quadratic if \(\eta< b_{\alpha}\) and linear if \(L_{0}=L\) and \(\eta= b_{\alpha}\).

Proof

First, we show, using induction, that (2.29) and (2.30) are satisfied for each \(n\geq0\). These estimates hold true for \(n=0,1\) since \(s _{\alpha, 0}=t _{\alpha, 0}=0\) and \(s _{\alpha, 1}=t _{\alpha, 1}=\eta\). Using (2.10), (2.13), and (2.17) for \(n=1\), we have
$$s _{\alpha, {2}} = s _{\alpha, {1}} - \frac{ \beta_{\alpha, {1}}}{ \psi' _{\alpha, 0}( s _{\alpha, {1}})} \leq t _{\alpha, {1}} - \frac{\gamma_{\alpha, {1}}}{\psi' _{\alpha}( t _{\alpha, {1}})} =t _{\alpha, {2}} $$
and
$$s _{\alpha, {2}} - s _{\alpha, {1}} = - \frac { \beta_{\alpha, {1}}}{ \psi' _{\alpha, 0}( s _{\alpha, {1}})} \leq - \frac{\gamma_{\alpha, {1}}}{\psi' _{\alpha}( t _{\alpha, {1}})} =t _{\alpha, {2}} - t _{\alpha, {1}} $$
since
$$ - \psi' _{\alpha, 0}(s) \geq-\psi' _{\alpha}(t) $$
(2.32)
for each \(s \leq t\). Hence the estimate (2.29) holds true for \(n=0,1,2\) and (2.30) holds true for \(n=0,1\). Suppose that
$$s_{\alpha, {m}} \leq t _{\alpha, {m}} $$
for each \(m=0,1,2, \ldots, k+1\) and
$$s_{\alpha, {m+1}} - s_{\alpha, {m}} \leq t_{\alpha, {m+1}} - t _{\alpha, {m}} $$
for each \(m=0,1,2, \ldots, k\). Then we have
$$s _{\alpha, {k+2}} = s _{\alpha, {k+1}} - \frac { \beta_{\alpha, {k+1}}}{ \psi' _{\alpha, 0}( s _{\alpha, {k+1}})} \leq t _{\alpha, {k+1}} - \frac { \gamma_{\alpha, {k+1}}}{ \psi' _{\alpha}( t _{\alpha, {k+1}})} = t _{\alpha, {k+2}} $$
and
$$s _{\alpha, {k+2}} - s _{\alpha, {k+1}} = - \frac { \beta_{\alpha, {k+1}}}{ \psi' _{\alpha, 0}( s _{\alpha, {k+1}})} \leq - \frac { \gamma_{\alpha, {k+1}}}{ \psi' _{\alpha}( t _{\alpha, {k+1}})} = t _{\alpha, {k+2}} - t _{\alpha, {k+1}} . $$
The induction for (2.29) and (2.30) is complete.

Finally, the estimate (2.31) follows from (2.29) by letting \(n\rightarrow\infty\). The convergence order part for the sequence \(\{ s_{\alpha, n} \}\) follows from (2.30) and Lemma 2.4(5). This completes the proof. □

Remark 2.9

If \(L_{0} =L\), the results in Lemmas 2.4-2.8 reduce to the corresponding ones in [7]. Otherwise (i.e., if \(L_{0} < L\)), our results constitute an improvement (see also (2.22)-(2.26)).
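The improvement can also be seen numerically by running both majorizing sequences with constant kernels \(L_{0}<L\). The sketch below (our illustration; sample values only, with \(\alpha=1\)) generates \(\{t_{\alpha,n}\}\) from (2.12) and \(\{s_{\alpha,n}\}\) from (2.17) and exhibits the inequality (2.29):

```python
# Numerical illustration (ours): majorizing sequences {t_n} from (2.12) and
# {s_n} from (2.17) for constant kernels L0 < L, with alpha = 1 and sample
# values of eta; the run exhibits s_n <= t_n, i.e., the inequality (2.29).
def sequences(eta, alpha, L, L0, n_steps=10):
    psi = lambda t: eta - t + 0.5 * alpha * L * t * t      # psi_alpha
    dpsi = lambda t: -1.0 + alpha * L * t                  # psi_alpha'
    dpsi0 = lambda t: -1.0 + alpha * L0 * t                # psi_{alpha,0}'
    t, s = [0.0, eta], [0.0, eta]
    for n in range(1, n_steps):
        t.append(t[n] - psi(t[n]) / dpsi(t[n]))            # Newton step (2.12)
        beta = 0.5 * alpha * L * (s[n] - s[n - 1]) ** 2    # gamma-type term with s in place of t
        s.append(s[n] - beta / dpsi0(s[n]))                # step (2.17)
    return t, s

t, s = sequences(eta=0.2, alpha=1.0, L=2.0, L0=1.0)
print(t[2], s[2])  # s_2 < t_2: the hybrid sequence is tighter
```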

3 Background on regularities

In order for us to make the study as self-contained as possible, we mention some concepts and results on regularities which can be found in [7] (see also [1, 10, 12, 15, 20–22]).

For a set-valued mapping \(T : \mathbb{R}^{l} \rightrightarrows\mathbb {R}^{m}\) and for a set A in \(\mathbb{R}^{l}\) or \(\mathbb{R}^{m} \), we denote by
$$\begin{aligned}& D(T)= \bigl\{ x\in\mathbb{R}^{l} : Tx \neq\emptyset \bigr\} ,\qquad R(T) = \bigcup_{x \in D(T) } Tx , \\& T^{-1} y = \bigl\{ x \in\mathbb{R}^{l} : y \in T x \bigr\} , \qquad \| A\|= \inf_{a\in A} \| a\|. \end{aligned}$$
Consider the inclusion
$$ F(x) \in C , $$
(3.1)
where C is a closed convex set in \(\mathbb{R}^{m}\). Let \(x\in\mathbb {R}^{l} \) and
$$ \mathcal {D}(x)= \bigl\{ d \in\mathbb{R}^{l} : F(x) + F'(x) d \in C \bigr\} . $$
(3.2)

Definition 3.1

Let \(x_{0} \in\mathbb{R}^{l}\).

(1) \(x_{0}\) is called a quasi-regular point of the inclusion (3.1) if there exist \(R \in\,]0, +\infty[\) and an increasing positive function β on \([0,R[\) such that
$$ \mathcal {D}(x) \neq\emptyset,\quad d \bigl(0, \mathcal {D}(x) \bigr) \leq \beta \bigl(\| x - x_{0} \| \bigr) d \bigl(F(x) , C \bigr) $$
(3.3)
for all \(x \in U(x_{0} , R)\). Here, \(\beta(\| x - x_{0} \|)\) serves as an 'error bound' measuring how far the origin is from the solution set \(\mathcal {D}(x)\) of the linearized inclusion.
(2) \(x_{0}\) is called a regular point of the inclusion (3.1) if
$$ \operatorname{ker} \bigl(F'(x_{0})^{T} \bigr) \cap \bigl(C - F(x_{0}) \bigr)^{\circleddash}= \{ 0 \} . $$
(3.4)

Proposition 3.2

(see [3])

Let \(x_{0}\) be a regular point of (3.1). Then there are constants \(R>0\) and \(\beta>0\) such that (3.3) holds for R and \(\beta( \cdot) = \beta\). Therefore, \(x_{0}\) is a quasi-regular point with the quasi-regular radius \(R_{x_{0}} \geq R\) and the quasi-regular bound function \(\beta_{x_{0}} \leq\beta\) on \([0, R]\).

Remark 3.3

(1) \(\mathcal {D}(x) \) can be considered as the solution set of the linearized problem associated to (3.1)
$$ F(x)+ F'(x) d \in C . $$
(3.5)
(2) If C defined in (3.1) is the set of all minimum points of h and there exists \(d_{0} \in \mathcal {D}(x) \) with \(\| d_{0} \|\leq\Delta\), then \(d_{0} \in \mathcal {D}_{\Delta}(x) \) and, for each \(d \in\mathbb{R}^{l}\), we have the following equivalence:
$$ d \in \mathcal {D}_{\Delta}(x) \quad\Longleftrightarrow\quad d \in \mathcal {D}(x) \quad\Longleftrightarrow\quad d \in \mathcal {D}_{\infty}(x) . $$
(3.6)
(3) Let \(R_{x_{0}}\) denote the supremum of R such that (3.3) holds for some function β defined in Definition 3.1. Let \(R \in[0, R _{x_{0}}]\) and let \(\mathcal{B} _{R} ( x_{0} )\) denote the set of functions β defined on \([0, R)\) such that (3.3) holds. Define
$$ \beta_{x_{0}} (t)= \inf \bigl\{ \beta(t) : \beta\in \mathcal{B} _{R_{x_{0}}} ( x_{0} ) \bigr\} $$
(3.7)
for each \(t \in[0, R_{x_{0}})\). Every function \(\beta\in\mathcal{B} _{R} ( x_{0} )\) with \(\lim_{t \rightarrow R^{-}} \beta(t) < + \infty \) can be extended to an element of \(\mathcal{B} _{R_{x_{0}}} ( x_{0} ) \), and we have
$$ \beta_{x_{0}} (t)= \inf \bigl\{ \beta(t) : \beta\in \mathcal{B} _{R} ( x_{0} ) \bigr\} $$
(3.8)
for each \(t \in[0, R)\). Here, \(R_{x_{0}}\) and \(\beta_{x_{0}}\) are called the quasi-regular radius and the quasi-regular function of the quasi-regular point \(x_{0}\), respectively.

Definition 3.4

(1) A set-valued mapping \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) is said to be convex if the following items hold:
(a) \(Tx+ T y \subseteq T (x+y)\) for all \(x, y \in\mathbb{R}^{l}\);

(b) \(T (\lambda x)=\lambda Tx\) for all \(\lambda>0\) and \(x\in\mathbb{R}^{l}\);

(c) \(0 \in T0\).
(2) Let \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) be a convex set-valued mapping. The norm of T is defined by
$$\| T\| = \sup_{x\in D(T)} \bigl\{ \| Tx \| : \| x \| \leq1 \bigr\} . $$
If \(\| T \|< \infty\), we say that T is normed.
(3) For two convex set-valued mappings T and \(S : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\), addition and multiplication are defined by
$$(T+S)x=Tx+Sx,\qquad (\lambda T)x=\lambda(Tx) $$
for all \(x\in\mathbb{R}^{l} \) and \(\lambda\in\mathbb{R}\), respectively.
(4) Let C be a closed convex set in \(\mathbb{R}^{m}\) and let \(x \in\mathbb{R}^{l}\). We define the set-valued mapping \(T_{x}\) by
$$ T_{x} d= F'(x)d - C $$
(3.9)
for all \(d \in\mathbb{R}^{l}\) and its inverse by
$$ T_{x}^{-1} y= \bigl\{ d\in \mathbb{R}^{l} : F'(x) d \in y + C \bigr\} $$
(3.10)
for all \(y \in\mathbb{R}^{m}\).
Note that, if C is a cone, then \(T_{x}\) is convex. For any \(x_{0} \in \mathbb{R}^{l}\), if the Robinson condition (see [8, 9]),
$$ T_{x_{0}} \mbox{ carries } \mathbb{R}^{l} \mbox{ onto } \mathbb{R}^{m} , $$
(3.11)
is satisfied, then \(D(T_{x})= \mathbb{R}^{l} \) for each \(x\in\mathbb {R}^{l}\) and \(D(T_{x_{0}}^{-1} ) = \mathbb{R}^{m}\).

Remark 3.5

Let \(T : \mathbb{R}^{l} \rightrightarrows\mathbb{R}^{m}\) be a mapping.
(1) T is convex if and only if the graph \(Gr(T)\) is a convex cone in \(\mathbb{R}^{l} \times\mathbb{R}^{m}\).

(2) If T is convex, then \(T^{-1} \) is convex from \(\mathbb {R}^{m}\) to \(\mathbb{R}^{l}\).

Lemma 3.6

(see [8])

Let C be a closed convex cone in \(\mathbb{R}^{m}\). Suppose that \(x_{0} \in\mathbb{R}^{l}\) satisfies the Robinson condition (3.11). Then we have the following assertions:

(1) \(T_{x_{0}}^{-1}\) is normed.

(2) If S is a linear operator from \(\mathbb{R}^{l}\) to \(\mathbb {R}^{m}\) such that \(\| T_{x_{0}}^{-1}\| \| S\| < 1\), then the convex set-valued mapping \(\bar{T} =T_{x_{0}} + S \) carries \(\mathbb{R}^{l}\) onto \(\mathbb{R}^{m}\). Furthermore, \(\bar{T}^{-1}\) is normed and
$$\bigl\| \bar{T}^{-1} \bigr\| \leq \frac{\| T_{x_{0}}^{-1} \|}{1 - \| T_{x_{0}}^{-1}\| \| S\|} . $$

The following proposition shows that the condition (3.11) implies that \(x_{0}\) is a regular point of (3.1). Using the center \(L_{0}\)-average condition, we also estimate in Proposition 3.7 the quasi-regular bound function. The proof is analogous to that of the corresponding result in [7], simply using \(L_{0}\) instead of L.

Proposition 3.7

Let C be a closed convex cone in \(\mathbb{R}^{m}\), \(x_{0} \in\mathbb {R}^{l}\), and define \(T_{x_{0}}\) as in (3.9). Suppose that \(x_{0}\) satisfies the Robinson condition (3.11). Then we have the following assertions:

(1) \(x_{0}\) is a regular point of (3.1).

(2) Suppose that \(F'\) satisfies the center \(L_{0}\)-average condition (2.8) on \(U(x_{0}, R)\) for some \(R >0\). Let \(\beta_{0} = \| T_{x_{0}}^{-1}\|\) and let \(R_{\beta_{0}}\) be such that
$$ \beta_{0} \int_{0} ^{R_{\beta_{0}}} L _{0}(u)\,du =1. $$
(3.12)
Then the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\) satisfy \(R_{x_{0}} \geq\min\{R, R_{\beta_{0}} \}\) and
$$ \beta_{x_{0}} (t)\leq\frac{\beta_{0}}{1 - \beta_{0} \int_{0} ^{t} L _{0}(u)\,du} $$
(3.13)
for each \(0 \leq t < \min\{R, R_{\beta_{0}} \}\).
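For a constant kernel \(L_{0}(u)\equiv L_{0}\), the radius (3.12) and the bound (3.13) take the explicit forms \(R_{\beta_{0}}=1/(\beta_{0}L_{0})\) and \(\beta_{0}/(1-\beta_{0}L_{0}t)\). A quick sketch with hypothetical values of \(\beta_{0}\) and \(L_{0}\) (our illustration only):

```python
# Sketch (ours) of the bound (3.13) for a constant kernel L0(u) = L0, with
# hypothetical values for beta_0 = ||T_{x0}^{-1}|| and L0.
beta0, L0 = 2.0, 0.5
R_beta0 = 1.0 / (beta0 * L0)   # solves beta0 * int_0^R L0 du = 1, i.e. (3.12)

def quasi_regular_bound(t):
    # Right-hand side of (3.13); valid only for 0 <= t < R_beta0.
    assert 0 <= t < R_beta0
    return beta0 / (1.0 - beta0 * L0 * t)

print(quasi_regular_bound(0.0), quasi_regular_bound(0.5))
```

The bound equals \(\beta_{0}\) at \(t=0\) and blows up as \(t \rightarrow R_{\beta_{0}}^{-}\), which is why the convergence theory restricts attention to \(\min\{R, R_{\beta_{0}}\}\).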

Remark 3.8

If \(L_{0} = L\), Proposition 3.7 reduces to the corresponding one in [7]. Otherwise, it constitutes an improvement (see (2.20)-(2.26)).

4 Semi-local convergence analysis for (GNA)

Assume that the set \(\mathcal {C}\) satisfies (2.3). Let \(x_{0} \in \mathbb {R}^{l}\) be a quasi-regular point of the inclusion (2.2) with the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\) (see (3.7)). Let \(\xi\in[1, +\infty)\) and let
$$ \eta= \xi \beta_{x_{0}} (0) d \bigl(F(x_{0}), \mathcal {C}\bigr) . $$
(4.1)
For all \(R \in(0, R_{x_{0}}]\), we define
$$ \alpha_{0} (R) =\sup \biggl\{ \frac{\xi \beta_{x_{0}} (t) }{\xi \beta _{x_{0}} (t) \int_{0} ^{t} L_{0} (s)\,ds +1} : \eta \leq t < R \biggr\} . $$
(4.2)

Theorem 4.1

Let \(\xi\in[1, +\infty)\) and \(\Delta\in(0, +\infty] \). Let \(x_{0} \in \mathbb {R}^{l}\) be a quasi-regular point of the inclusion (2.2) with the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\). Let \(\eta>0\) and \(\alpha_{0} (R)\) be given by (4.1) and (4.2), respectively. Let \(0 < R < R_{x_{0}}\), let \(\alpha \geq\alpha_{0} (R)\) be a positive constant, and let \(b_{\alpha}\), \(r_{\alpha}\) be as defined in (2.18). Let \(\{ s_{\alpha,n} \}\) (\(n \geq0\)) and \(s_{\alpha}^{\star}\) be given by (2.17) and (2.31), respectively. Suppose that \(F'\) satisfies the L-average Lipschitz condition (2.9) and the center \(L_{0}\)-average condition (2.8) on \(U(x_{0} ,s_{\alpha}^{\star})\). Suppose that
$$ \eta\leq\min\{ b_{\alpha}, \Delta\},\qquad s_{\alpha}^{\star}\leq R. $$
(4.3)
Then the sequence \(\{ x_{n} \}\) generated by (GNA) is well defined, remains in \(\overline{U}(x _{0}, s_{\alpha}^{\star})\) for all \(n \geq0\) and converges to some \(x^{\star}\) such that \(F(x^{\star}) \in \mathcal {C}\). Moreover, the following estimates hold: for each \(n\geq1\),
$$\begin{aligned}& \| x_{n} - x_{n-1} \| \leq s_{\alpha, {n}} -s_{\alpha, {n-1}}, \end{aligned}$$
(4.4)
$$\begin{aligned}& \| x_{n+1} - x_{n} \| \leq( s_{\alpha, {n+1}} - s_{\alpha, {n}} ) \biggl(\frac{\| x_{n} - x_{n-1} \|}{ s_{\alpha, {n}} - s_{\alpha, {n-1}}} \biggr)^{2} , \end{aligned}$$
(4.5)
$$\begin{aligned}& F(x_{n}) + F'(x_{n}) (x_{n+1} - x_{n} ) \in \mathcal {C}, \end{aligned}$$
(4.6)
and
$$ \bigl\| x_{n-1} - x^{\star}\bigr\| \leq s_{\alpha}^{\star}- s_{\alpha, {n-1}} . $$
(4.7)

Proof

By (4.3), (4.4), and Lemma 2.4, we have
$$ \eta\leq s_{\alpha, {n}} < s_{\alpha} ^{\star}\leq R \leq R_{x_{0}}. $$
(4.8)
Using the quasi-regularity property of \(x_{0}\), we have
$$ \mathcal {D}(x) \neq\emptyset,\quad d \bigl(0, \mathcal {D}(x) \bigr) \leq \beta_{x_{0}} \bigl(\| x - x_{0} \| \bigr) d \bigl(F(x) , \mathcal {C}\bigr) $$
(4.9)
for all \(x \in U(x_{0} , R)\).
First, we prove that the following assertion holds.
(\(\mathcal{T}\)): 

If (4.4) holds for all \(n \leq k\), then (4.5) and (4.6) hold for all \(n \leq k\).

Denote by \(x_{k}^{\theta}= \theta x_{k} + (1- \theta) x_{k-1} \) for all \(\theta\in[0,1]\). Using (4.8), we have
$$x_{k}^{\theta}\in U \bigl(x_{0} , s_{\alpha}^{\star}\bigr) \subseteq U(x_{0} , R) $$
for all \(\theta\in[0,1]\). Hence, for \(x=x_{k}\), (4.9) holds, i.e.,
$$ \mathcal {D}(x_{k}) \neq\emptyset,\quad d \bigl(0, \mathcal {D}(x_{k}) \bigr) \leq \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \| \bigr) d \bigl(F(x_{k}) , \mathcal {C}\bigr). $$
(4.10)
We have also
$$ \| x_{k} - x_{0} \| \leq \sum _{i=1}^{k} \| x_{i} - x_{i-1} \| \leq \sum_{i=1}^{k} ( s_{\alpha, i} - s_{\alpha, i-1} ) =s_{\alpha, k} $$
(4.11)
and
$$ \| x_{k-1} - x_{0} \| \leq s_{\alpha, k-1} \leq s_{\alpha, k}. $$
(4.12)
Now, we prove that
$$ \xi d \bigl(0, \mathcal {D}(x_{k}) \bigr) \leq ( s_{\alpha, {k+1}} - s_{\alpha, {k}} ) \biggl(\frac{\| x_{k} - x_{k-1} \|}{ s_{\alpha, {k}} - s_{\alpha, {k-1}}} \biggr)^{2} \leq s_{\alpha, {k}+1} -s_{\alpha, {k}}. $$
(4.13)
We show the first inequality in (4.13). We set \(A_{k} = \| x _{k-1}- x_{0} \|\) and \(B_{k}=\| x_{k} - x_{k-1} \|\). We have the following identity:
$$ \int_{0} ^{1} \int _{A_{k} }^{A_{k} + \theta B_{k} } L (u)\,du \,d\theta = \int _{0 }^{B_{k} } L (A_{k} +u ) \biggl(1- \frac{u}{B_{k} } \biggr)\,du . $$
(4.14)
Then, by the L-average condition on \(U(x_{0}, s_{\alpha}^{\star})\), (4.6) for \(n=k-1\) and (4.10)-(4.14), we get
$$\begin{aligned} &\xi d\bigl(0, \mathcal {D}(x_{k})\bigr) \\ &\quad\leq\xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \| \bigr) d \bigl(F(x_{k}) , \mathcal {C}\bigr) \\ &\quad\leq\xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \| \bigr) \bigl\| F(x_{k}) - F(x_{k-1}) - F'(x_{k-1}) (x_{k} - x_{k-1} ) \bigr\| \\ &\quad\leq\xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \| \bigr) \int_{0 }^{ 1 } \bigl\| \bigl( F' \bigl(x_{k}^{\theta}\bigr) - F'(x_{k-1}) \bigr) (x_{k} - x_{k-1} )\,d\theta \bigr\| \\ &\quad\leq\xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \| \bigr) \int_{0} ^{1} \int_{A_{k} }^{A_{k} + \theta B_{k} } L (u)\,du B_{k} \,d\theta \\ &\quad\leq\xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \| \bigr) \int_{0 }^{ B_{k}} L (A_{k} +u ) (B_{k} -u )\,du \\ &\quad\leq\xi \beta_{x_{0}} (s_{\alpha, k} ) \int _{0 }^{ B_{k}} L (s_{\alpha, {k-1}} +u ) (B_{k} -u )\,du . \end{aligned}$$
(4.15)
For simplicity, we denote \(\Xi_{\alpha, k}:=s_{\alpha, {k}} - s_{\alpha, {k-1}}\). By (4.4) for \(n=k\) and Lemma 2.6, we have in turn
$$ \frac{\int_{0 }^{ B_{k} } L (s_{\alpha, {k-1}} +u ) (B_{k} -u )\,du}{B_{k} ^{2} } \leq\frac{\int_{0 }^{\Xi_{\alpha, k} } L (s_{\alpha, {k-1}} +u ) (\Xi_{\alpha, k} -u )\,du}{ \Xi_{\alpha, k} ^{2} }. $$
(4.16)
Thus we deduce that
$$ \xi d \bigl(0, \mathcal {D}(x_{k}) \bigr) \leq\xi \beta_{x_{0}} (s_{\alpha, k} ) \biggl(\int_{0 }^{\Xi_{\alpha, k} } L (s_{\alpha, {k-1}} +u ) (\Xi_{\alpha, k} -u )\,du \biggr) \biggl( \frac{B_{k} }{\Xi_{\alpha, k}} \biggr)^{2}. $$
(4.17)
Using (4.2) and (4.8), we obtain
$$ \frac {\xi \beta_{x_{0}} (s_{\alpha, k })}{ \alpha_{0} (R)} \leq \biggl( 1- \alpha_{0} (R) \int _{0} ^{s_{\alpha, k }} L_{0} (u)\,du \biggr)^{-1} . $$
(4.18)
Note that \(\alpha\geq\alpha_{0} (R)\). Hence, by (4.18) and (2.16), we have
$$ \frac {\xi \beta_{x_{0}} (s_{\alpha, k })}{ \alpha} \leq \biggl( 1- \alpha \int _{0} ^{s_{\alpha, k }} L_{0} (u)\,du \biggr)^{-1} = - \bigl(\psi' _{\alpha, 0} (s_{\alpha, k }) \bigr) ^{-1} . $$
(4.19)
By (2.17) and (4.17)-(4.19), we deduce that the first inequality in (4.13) holds. The second inequality of (4.13) follows from (4.4). Moreover, by (4.3), Lemma 2.7, and Lemma 2.8, we have
$$\begin{aligned} {\Xi_{\alpha, {k+1}}} =&- \psi'_{\alpha, 0} (s_{\alpha, k})^{-1} \beta_{\alpha, k} \leq- \psi'_{\alpha}(t_{\alpha, 0})^{-1} \psi_{\alpha}(t_{\alpha, 0}) =\eta\leq\Delta. \end{aligned}$$
Hence (4.13) implies that \(d(0, \mathcal {D}(x_{k})) \leq\Delta\) and there exists \(d_{0} \in \mathbb {R}^{l}\) with \(\| d_{0} \|\leq\Delta\) such that \(F(x_{k}) + F'(x_{k}) d_{0} \in \mathcal {C}\). By Remark 3.3, we have
$$\mathcal {D}_{\Delta}(x_{k})= \bigl\{ d \in \mathbb {R}^{l} : \| d \| \leq\Delta \mbox{ and } F(x_{k}) + F'(x_{k})\,d \in \mathcal {C}\bigr\} $$
and
$$d \bigl(0, \mathcal {D}_{\Delta}(x_{k}) \bigr) = d \bigl(0, \mathcal {D}(x_{k}) \bigr) . $$
We deduce that (4.6) holds for \(n=k\) since \(d_{k} = x_{k+1} - x_{k} \in \mathcal {D}(x_{k})\). We also have
$$\| x_{k+1} - x_{k} \| \leq \xi d \bigl(0, \mathcal {D}_{\Delta}(x_{k}) \bigr) =\xi d \bigl(0, \mathcal {D}(x_{k}) \bigr) . $$
Hence (3.7) holds for \(n=k\) and the assertion (\(\mathcal{T} \)) holds. It follows from (4.4) that \(\{ x_{k} \}\) is a Cauchy sequence in a Banach space and as such it converges to some \(x^{\star}\in\overline{U} (x_{0} , s_{\alpha}^{\star}) \) (since \(\overline{U} (x_{0} , s_{\alpha}^{\star})\) is a closed set).
Now we use mathematical induction to prove that (4.4), (4.5), and (4.6) hold. By (4.1), (4.3), and (4.9), it follows that \(\mathcal {D}(x_{0} ) \neq\emptyset\) and
$$\xi d \bigl(0, \mathcal {D}(x_{0}) \bigr) \leq\xi \beta_{x_{0}} (0) d \bigl(F(x_{0}) , \mathcal {C}\bigr) =\eta\leq\Delta. $$
We also have
$$\| x_{1} - x_{0} \|= \| d_{0} \| \leq\xi d \bigl(0, \mathcal {D}_{\Delta}(x_{0}) \bigr) \leq \xi\beta_{x_{0}} (0) d \bigl(F(x_{0}) , \mathcal {C}\bigr) =\eta={\Xi_{\alpha, 0}} $$
and (4.4) holds for \(n=1\). By an induction argument, we get
$$\| x_{k+1} - x_{k} \| \leq {\Xi_{\alpha, {k+1}}} \biggl( \frac{\| x_{k} - x_{k-1} \|}{ {\Xi_{\alpha, k}} } \biggr)^{2} \leq {\Xi_{\alpha, k+1}} . $$
This completes the induction and thereby the proof. □

Remark 4.2

(1) If \(L=L_{0}\), then Theorem 4.1 reduces to the corresponding results in [7]. Otherwise, in view of (2.29)-(2.31), our results constitute an improvement. The remaining results of [7] are improved as well, since they are corollaries of Theorem 4.1; we leave the details to the motivated reader.

(2) In view of the proof of our Theorem 4.1, we see that the sequence \(\{ r_{\alpha, n} \}\) given by
$$\begin{aligned} &r_{\alpha, 0} = 0,\qquad r_{\alpha, 1} = \eta, \\ &r_{\alpha, 2} = r_{\alpha, 1 } - \frac { \alpha \int_{0 }^{r_{\alpha, 1} - r_{\alpha, 0} } L _{0} (r _{\alpha, 0} +u ) (r _{\alpha, {1}} - r_{\alpha, {0}} -u )\,du}{\psi' _{\alpha, 0} (r_{\alpha, 1})} , \\ &r_{\alpha, n+1} = r_{\alpha, n } -\frac{ \alpha \int_{0 }^{r_{\alpha, n} - r_{\alpha, {n-1} } } L (r _{\alpha, n-1} +u ) (r_{\alpha, {n}} - r_{\alpha, {n-1}} -u )\,du}{\psi' _{\alpha, 0} (r_{\alpha, n})} \end{aligned}$$
(4.20)
for each \(n\geq2\) is also a majorizing sequence for the method (GNA). Following the proof of Lemma 2.8 and under the hypotheses of Theorem 4.1, we get
$$\begin{aligned}& r_{\alpha, {n}} \leq s_{\alpha, {n}}\leq t_{\alpha, {n}} , \end{aligned}$$
(4.21)
$$\begin{aligned}& r_{\alpha, {n+1}} -r_{\alpha, {n}} \leq s_{\alpha, {n+1}}-s_{\alpha, {n}} \leq t_{\alpha, {n+1}}-t_{\alpha, {n}}, \end{aligned}$$
(4.22)
and
$$ r_{\alpha}^{\star}= \lim_{n \longrightarrow\infty} r _{\alpha, {n}} \leq s_{\alpha}^{\star}\leq t _{\alpha}^{\star}. $$
(4.23)
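For constant functions \(L(u)=L\), \(L_{0}(u)=L_{0}\) and \(\alpha=1\), the recursion (4.20) and the comparisons (4.21)-(4.22) can be checked numerically against the classical Kantorovich majorizing sequence. The following sketch is only an illustration under these simplifying assumptions; the function names and the sample values \(L=1\), \(L_{0}=0.5\), \(\eta=0.3\) are ours, not taken from the text.

```python
# Illustrative sketch (constant L, L0, alpha = 1): compare the sequence
# (4.20) with the classical Kantorovich majorizing sequence.  In this
# constant case psi'_{1,0}(t) = -(1 - L0*t) and the integral in (4.20)
# evaluates to L*(r_n - r_{n-1})**2 / 2 (with L0 instead of L for r_2).

def kantorovich_t(L, eta, n_steps):
    """Classical majorant: t_{n+1} = t_n + L*(t_n - t_{n-1})^2 / (2*(1 - L*t_n))."""
    t_prev, t = 0.0, eta
    seq = [t_prev, t]
    for _ in range(n_steps):
        t_next = t + L * (t - t_prev) ** 2 / (2.0 * (1.0 - L * t))
        t_prev, t = t, t_next
        seq.append(t)
    return seq

def r_sequence(L, L0, eta, n_steps):
    """The sequence (4.20): the r_2 step uses L0, later steps use L."""
    r_prev, r = 0.0, eta
    seq = [r_prev, r]
    for k in range(n_steps):
        numerator = (L0 if k == 0 else L) * (r - r_prev) ** 2 / 2.0
        r_next = r + numerator / (1.0 - L0 * r)
        r_prev, r = r, r_next
        seq.append(r)
    return seq

L, L0, eta = 1.0, 0.5, 0.3          # h = L*eta = 0.3 <= 1/2
t_seq = kantorovich_t(L, eta, 4)
r_seq = r_sequence(L, L0, eta, 4)

# (4.21)-(4.22): r_n <= t_n and r-increments never exceed t-increments
assert all(r <= t + 1e-12 for r, t in zip(r_seq, t_seq))
assert all(r_seq[i + 1] - r_seq[i] <= t_seq[i + 1] - t_seq[i] + 1e-12
           for i in range(len(r_seq) - 1))
```

Both sequences increase monotonically and stay below the smallest zero \(1-\sqrt{1-2L\eta}\approx 0.3675\) of the quadratic majorant, and the r-sequence settles noticeably earlier, which is the point of this remark.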
Hence \(\{ r _{\alpha, {n}} \}\) and \(\{ s_{\alpha, {n}} \}\) are tighter majorizing sequences for \(\{ x_{n} \}\) than the sequence \(\{ t _{\alpha, {n}} \}\) used by Li and Ng in [7]. The sequences \(\{ r _{\alpha, {n}} \}\) and \(\{ s _{\alpha, {n}} \}\) can converge under hypotheses weaker than the ones given in Theorem 4.1. Such conditions have already been given by us for more general functions ψ and in the more general setting of Banach spaces, as in [1, 2, 10, 11, 23]. Therefore, here, we only refer to the popular Kantorovich case as an illustration. Choose \(\alpha=1\), \(L(u)=L\), and \(L_{0} (u)= L_{0}\) for all \(u \geq0\). Then the sequence \(\{ t _{\alpha, {n}} \}\) converges under the Newton-Kantorovich hypothesis, famous for its simplicity and clarity (see [1, 24]),
$$ h = L \eta\leq\frac{1}{2} . $$
(4.24)
The sequence \(\{ s_{\alpha, {n}} \}\) converges provided that (see, for example, [23])
$$ h_{1} = {L}_{1} \eta\leq\frac{1}{2} , $$
(4.25)
where
$${L}_{1} = \frac{1}{8} \bigl(L + 4 L_{0} + \bigl(L^{2} + 8 L_{0} L \bigr)^{1/2} \bigr) $$
and the sequence \(\{ r _{\alpha, {n}} \}\) converges if (see, for example, [23])
$$ h_{2} = L_{2} \eta\leq\frac{1}{2} , $$
(4.26)
where
$$L_{2}=\frac{1}{8} \bigl(4 L_{0} + \bigl(L L_{0} + 8 L_{0} ^{2} \bigr)^{1/2} + (L_{0} L)^{1/2} \bigr) . $$
It follows from (4.24)-(4.26) that
$$ h \leq\frac{1}{2}\quad \Longrightarrow\quad h_{1} \leq \frac{1}{2} \quad\Longrightarrow\quad h_{2} \leq\frac{1}{2} , $$
(4.27)
but not vice versa unless \(L_{0}=L\). Moreover, we get
$$\frac{h_{1}}{h}\longrightarrow\frac{1}{4} ,\qquad \frac{ h_{2}}{h} \longrightarrow0,\qquad \frac{ h_{2}}{ h_{1}}\longrightarrow0 $$
as \(\frac{L_{0}}{L}\longrightarrow0\).
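The implications (4.27) and the limiting ratios are straightforward to verify numerically. The sketch below is our illustration (the helper names are ours): it evaluates \(L_{1}\) and \(L_{2}\), confirms \(L_{2}\leq L_{1}\leq L\), which is exactly (4.27), and checks the behavior \(h_{1}/h\longrightarrow 1/4\) and \(h_{2}/h\longrightarrow 0\) as \(L_{0}/L\longrightarrow 0\).

```python
# Check (4.24)-(4.27): since h = L*eta, h1 = L1*eta, h2 = L2*eta, the chain
# h <= 1/2  =>  h1 <= 1/2  =>  h2 <= 1/2  is equivalent to L2 <= L1 <= L.
from math import sqrt

def L1(L, L0):
    return (L + 4.0 * L0 + sqrt(L ** 2 + 8.0 * L0 * L)) / 8.0

def L2(L, L0):
    return (4.0 * L0 + sqrt(L * L0 + 8.0 * L0 ** 2) + sqrt(L0 * L)) / 8.0

L = 1.0
for L0 in (1.0, 0.5, 0.1, 1e-9):
    assert L2(L, L0) <= L1(L, L0) <= L + 1e-12

# limiting ratios as L0/L -> 0: h1/h -> 1/4 and h2/h -> 0
assert abs(L1(L, 1e-9) / L - 0.25) < 1e-6
assert L2(L, 1e-9) / L < 1e-4
```

For \(L_{0}=L\) both constants collapse to \(L\), recovering the classical condition (4.24), consistent with "but not vice versa unless \(L_{0}=L\)".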
(3) There are cases when the sufficient convergence conditions developed in the preceding work are not satisfied. Then one can use the modified Gauss-Newton method (MGNM). In this case, the majorizing sequence proposed in [7] is given by
$$ q_{\alpha, {0}} =0 ,\qquad q_{\alpha, {n+1}} = q _{\alpha, {n}} - \frac{\psi_{\alpha}( q_{\alpha, {n}})}{\psi' _{\alpha}(0)} $$
(4.28)
for each \(n\geq0\). This sequence clearly converges under the hypotheses of Theorem 4.1, so that the estimates (4.4)-(4.7) hold with the sequence \(\{ q_{\alpha, {n}} \}\) replacing \(\{ s_{\alpha, {n}} \}\). However, according to the proof of Theorem 4.1, the hypotheses on \(\psi_{\alpha, 0}\) can replace the corresponding ones on \(\psi _{\alpha}\). Moreover, the majorizing sequence is given by
$$ p_{\alpha, {0}} =0 , \qquad p_{\alpha, {n+1}} = p_{\alpha, {n}} - \frac {\psi_{\alpha}( p_{\alpha, {n}})}{ \psi' _{\alpha, 0} (0)} $$
(4.29)
for each \(n\geq0\). Furthermore, we have
$$ \psi_{\alpha, 0} (s) \leq \psi_{\alpha} (s) $$
(4.30)
for each \(s \geq0\). Hence clearly it follows that, for each \(n\geq0\),
$$\begin{aligned}& p_{\alpha, {n}} \leq q_{\alpha, {n}} , \end{aligned}$$
(4.31)
$$\begin{aligned}& p_{\alpha, {n+1}}-p_{\alpha, {n}}\leq q_{\alpha, {n+1}}-q_{\alpha, {n}}, \end{aligned}$$
(4.32)
and
$$ p_{\alpha}^{\star}= \lim_{n \longrightarrow\infty} p _{\alpha, {n}} \leq q_{\alpha}^{\star}= \lim _{n \longrightarrow\infty} q _{\alpha, {n}} . $$
(4.33)
(Notice also the advantages of (2.20)-(2.26).)
In the special case when the functions \(L_{0}\) and L are constants and \(\alpha=1\), the conditions on the function \(\psi_{\alpha}\) reduce to (4.24), whereas those on \(\psi_{\alpha, 0}\) reduce to
$$ h_{0} = L_{0} \eta\leq\frac{1}{2} . $$
(4.34)
Notice that
$$ \frac{h_{0}}{h}\longrightarrow0 $$
(4.35)
as \(\frac{L_{0}}{L}\longrightarrow0\). Therefore, one can use (MGNM) as a predictor until a certain iterate \(x_{N}\) for which the sufficient convergence conditions for (GNM) are satisfied. Then \(x_{N}\) is used as the starting iterate for (GNM), which is faster than (MGNM). Such an approach was used by the author in [25].
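The predictor-corrector strategy can be illustrated at the level of majorizing sequences. In the sketch below (an illustration with assumed constants, not the implementation of [25]), a fixed-slope recursion mimicking (4.28) with \(\psi'_{\alpha}(0)=-1\) converges only linearly to the smallest zero of the quadratic majorant \(\psi(t)=\eta-t+\frac{L}{2}t^{2}\), whereas the Newton-type recursion behind (GNM) converges quadratically, so far fewer of the faster steps are needed.

```python
# Compare iteration counts on the quadratic majorant psi(t) = eta - t + L*t^2/2
# (psi(0) = eta > 0, psi'(0) = -1): fixed-slope (MGNM-type) steps versus
# Newton-type (GNM-type) steps.  Sample constants are assumptions of ours.

def psi(t, L, eta):
    return eta - t + 0.5 * L * t * t

def count_steps(step, L, eta, tol=1e-12, cap=10_000):
    t, n = 0.0, 0
    while psi(t, L, eta) > tol and n < cap:
        t = step(t, L, eta)
        n += 1
    return n

newton_step = lambda t, L, eta: t - psi(t, L, eta) / (L * t - 1.0)
fixed_step = lambda t, L, eta: t + psi(t, L, eta)   # dividing by psi'(0) = -1

L, eta = 1.0, 0.45                   # h = L*eta = 0.45 <= 1/2
n_fixed = count_steps(fixed_step, L, eta)
n_newton = count_steps(newton_step, L, eta)
assert n_newton < n_fixed            # quadratic beats linear convergence
```

In this run the Newton-type recursion needs only a handful of steps while the fixed-slope one needs dozens, which is why the modified method serves only as a cheap predictor.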

5 General majorant conditions

In this section, we provide a semilocal convergence analysis for (GNA) using more general majorant conditions than (2.8) and (2.9).

Definition 5.1

Let \(\mathcal {Y}\) be a Banach space, \(x_{0} \in \mathbb{R}^{l}\), and \(\alpha>0\). Let \(G : \mathbb{R}^{l} \longrightarrow \mathcal {Y}\) and \(f_{\alpha}: [0,r) \longrightarrow(-\infty, + \infty)\) be continuously differentiable. Then G is said to satisfy:

(1) the center-majorant condition on \(U (x_{0} , r)\) if
$$ \bigl\| G(x) - G(x_{0} ) \bigr\| \leq\alpha^{-1} \bigl(f'_{\alpha}\bigl(\| x - x_{0} \| \bigr) - f' _{\alpha}(0) \bigr) $$
(5.1)
for all \(x \in U(x_{0} , r)\);
(2) the majorant condition on \(U (x_{0} , r)\) if
$$ \bigl\| G(x) - G(y ) \bigr\| \leq \alpha^{-1} \bigl(f'_{\alpha}\bigl(\| x - y \|+ \| y - x_{0} \| \bigr) - f' _{\alpha}\bigl(\| y - x_{0} \| \bigr) \bigr) $$
(5.2)
for all \(x,y \in U(x_{0} , r)\) with \(\| x- y \|+ \| y - x_{0} \|\leq r\).
Clearly, the conditions (5.1) and (5.2) generalize the conditions (2.8) and (2.9) of [20], respectively (see also [1, 2, 10, 11, 23, 25]), which correspond to the choices \(G=F'\) and \(\alpha=1\). Define the majorizing sequence \(\{ t_{\alpha, n} \}\) by
$$ t_{\alpha, 0} = 0,\qquad t_{\alpha, n+1} = t_{\alpha, n} - \frac {f_{\alpha}(t_{\alpha, n} )}{ f'_{\alpha}(t_{\alpha, n} )} . $$
(5.3)
Moreover, as in (4.2) and for \(R>0\), define (implicitly):
$$ \alpha_{0} (R): = \sup_{\xi\leq t < R} - \frac{\eta \beta_{x_{0}} (t)}{f'_{\alpha_{0} (R)} (t)} . $$
(5.4)

Next, we provide sufficient conditions for the convergence of the sequence \(\{ t_{\alpha, n} \}\) corresponding to the ones given in Lemma 2.4.

Lemma 5.2

(see, for example, [2, 10, 20])

Let \(r>0\), \(\alpha>0\), and \(f_{\alpha}: [0,r) \longrightarrow (-\infty, + \infty)\) be continuously differentiable. Suppose that:

(1) \(f_{\alpha}(0) >0\), \(f'_{\alpha}(0) = -1\);

(2) \(f'_{\alpha}\) is convex and strictly increasing;

(3) the equation \(f_{\alpha}(t)=0\) has positive zeros. Denote by \(r _{\alpha}^{\star} \) the smallest zero. Define \(r _{\alpha}^{\star\star} \) by
$$ r _{\alpha}^{\star\star} =\sup \bigl\{ t \in\bigl[r _{\alpha}^{\star} , r\bigr) : f_{\alpha}(t) \leq0 \bigr\} . $$
(5.5)
Then the sequence \(\{ t_{\alpha, n} \}\) is strictly increasing and converges to \(r _{\alpha}^{\star} \). Moreover, the following estimates hold:
$$ r _{\alpha}^{\star} - t_{\alpha, n } \leq \frac{D^{-} f'_{\alpha}(r _{\alpha}^{\star})}{ -2 f'_{\alpha}(r _{\alpha}^{\star})} \bigl(r _{\alpha}^{\star} - t_{\alpha, n-1} \bigr)^{2} $$
(5.6)
for each \(n\geq1\), where \(D^{-} f'_{\alpha}\) denotes the left directional derivative of \(f'_{\alpha}\) (see, for example, [1, 2, 10, 20]).
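Lemma 5.2 is easy to check numerically for the quadratic majorant \(f_{\alpha}(t)=\eta-t+\frac{L}{2}t^{2}\), which satisfies (1)-(3) (here \(f'_{\alpha}\) is linear, so \(D^{-}f'_{\alpha}\equiv L\)). The sketch below uses sample values of ours (\(L=1\), \(\eta=0.3\)) and verifies the monotone convergence of (5.3) to the smallest zero together with the estimate (5.6).

```python
# Check Lemma 5.2 for f(t) = eta - t + L*t^2/2: the Newton sequence (5.3)
# increases monotonically to the smallest zero r_star, and the error
# obeys (5.6) with D^- f'(r_star) = L.
from math import sqrt

L, eta = 1.0, 0.3
f = lambda t: eta - t + 0.5 * L * t * t
df = lambda t: L * t - 1.0

r_star = (1.0 - sqrt(1.0 - 2.0 * L * eta)) / L   # smallest positive zero

t_n, seq = 0.0, [0.0]
for _ in range(5):
    t_n = t_n - f(t_n) / df(t_n)                 # the recursion (5.3)
    seq.append(t_n)

for a, b in zip(seq, seq[1:]):
    assert a < b <= r_star + 1e-12               # strictly increasing, below r_star
    # the a-priori bound (5.6):
    assert r_star - b <= L / (-2.0 * df(r_star)) * (r_star - a) ** 2 + 1e-12
```

With these values \(r_{\alpha}^{\star}=1-\sqrt{0.4}\approx 0.3675\), and five steps already reach it to machine precision, reflecting the quadratic rate in (5.6).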

Now, we show the following semilocal convergence result for the method (GNA) using the generalized majorant conditions (5.1) and (5.2).

Theorem 5.3

Let \(\xi\in[1, +\infty)\) and \(\Delta\in(0, +\infty] \). Let \(x_{0} \in \mathbb {R}^{l}\) be a quasi-regular point of (2.3) with the quasi-regular radius \(R_{x_{0}}\) and the quasi-regular bound function \(\beta_{x_{0}}\). Let \(\eta>0\) and \(\alpha_{0} (R)\) be given by (4.1) and (5.4), respectively. Let \(0 < R < R_{x_{0}}\), let \(\alpha\geq\alpha _{0} (R)\) be a positive constant, and let \(r_{\alpha}^{\star}\), \(r_{\alpha}^{\star\star}\) be as defined in Lemma 5.2. Suppose that \(F'\) satisfies the majorant condition on \(U(x_{0} ,r_{\alpha}^{\star})\), and the conditions
$$ \eta\leq\min \bigl\{ r_{\alpha}^{\star}, \Delta \bigr\} ,\qquad r_{\alpha}^{\star}\leq R $$
(5.7)
hold. Then the sequence \(\{ x_{n} \}\) generated by (GNA) is well defined, remains in \(\overline{U}(x _{0}, r _{\alpha}^{\star})\) for all \(n \geq0\) and converges to some \(x^{\star}\) such that \(F(x^{\star}) \in \mathcal {C}\). Moreover, the following estimates hold: for each \(n\geq1\),
$$\begin{aligned}& \| x_{n} - x_{n-1} \| \leq t _{\alpha, {n}} -t _{\alpha, {n-1}}, \end{aligned}$$
(5.8)
$$\begin{aligned}& \| x_{n+1} - x_{n} \| \leq( t _{\alpha, {n+1}} -t _{\alpha, {n}} ) \biggl(\frac {\| x_{n} - x_{n-1} \|}{ t_{\alpha, {n}} -t _{\alpha, {n-1}}} \biggr)^{2} , \end{aligned}$$
(5.9)
$$\begin{aligned}& F(x_{n}) + F'(x_{n}) (x_{n+1} - x_{n} ) \in \mathcal {C}, \end{aligned}$$
(5.10)
and
$$ \bigl\| x_{n-1} - x^{\star}\bigr\| \leq r _{\alpha}^{\star}-t _{\alpha, {n-1}} , $$
(5.11)
where the sequence \(\{ t_{\alpha, n } \}\) is given by (5.3).

Proof

We use the same notations as in Theorem 4.1. We follow the proof of Theorem 4.1 until (4.13). Then, using (4.10), (5.2) (for \(G=F'\)), (5.3), (5.4), and the hypothesis \(\alpha\geq\alpha_{0} (R)\), we get in turn
$$\begin{aligned} &\xi d\bigl(0, \mathcal {D}(x_{k})\bigr) \\ &\quad\leq\xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \|\bigr) d \bigl(F(x_{k}) , \mathcal {C}\bigr) \\ &\quad\leq \xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \|\bigr) \bigl\| F(x_{k}) - F(x_{k-1}) - F'(x_{k-1}) (x_{k} - x_{k-1} ) \bigr\| \\ &\quad\leq \xi \beta_{x_{0}} \bigl(\| x_{k} - x_{0} \|\bigr) \int_{0 }^{ 1 }\bigl\| \bigl( F' \bigl(x_{k}^{\theta}\bigr) - F' (x_{k-1}) \bigr) (x_{k} - x_{k-1} ) \bigr\| \,d\theta \\ &\quad\leq \xi \frac{\beta_{x_{0}} (t _{\alpha, k} )}{\alpha_{0} (R)} \int_{0 }^{ 1 } \bigl( f'_{\alpha}\bigl(t_{\alpha, k }^{\theta}\bigr) - f' _{\alpha}(t_{\alpha, k-1 })\bigr) (t_{\alpha, k } - t_{\alpha, k-1 } )\,d\theta \\ &\quad\leq \xi \frac{\beta_{x_{0}} (t _{\alpha, k} )}{\alpha} \int_{0 }^{ 1 } \bigl( f'_{\alpha}\bigl(t_{\alpha, k }^{\theta}\bigr) - f' _{\alpha}(t_{\alpha, k-1 })\bigr) (t_{\alpha, k } - t_{\alpha, k-1 } )\,d\theta \\ &\quad\leq - f'_{\alpha}(t_{\alpha, k})^{-1} \bigl( f_{\alpha}(t_{\alpha, k }) - f _{\alpha}(t_{\alpha, k-1 }) - f ' _{\alpha}(t_{\alpha, k-1 }) (t_{\alpha, k } - t_{\alpha, k-1 } ) \bigr) \\ &\quad=- f'_{\alpha}(t_{\alpha, k})^{-1} f_{\alpha}(t_{\alpha, k }) = t_{\alpha, k+1} - t_{\alpha, k} , \end{aligned}$$
(5.12)
where \(t_{\alpha, k }^{\theta}=\theta t_{\alpha, k } + (1-\theta) t_{\alpha, k-1 } \) and \(x_{k}^{\theta}=\theta x_{k } + (1-\theta) x_{k-1 }\) for all \(\theta\in[0,1]\). The rest follows as in the proof of Theorem 4.1. This completes the proof. □

Remark 5.4

In view of the condition (5.2), there exists \(f_{\alpha,0} : [0,r) \longrightarrow(-\infty, + \infty)\) continuously differentiable such that
$$ \bigl\| G(x) - G(x_{0} ) \bigr\| \leq\alpha^{-1} \bigl(f'_{\alpha, 0} \bigl(\| x - x_{0} \| \bigr) - f' _{\alpha, 0} (0 ) \bigr) $$
(5.13)
for all \(x \in U(x_{0} , r)\) and \(r \leq R\). Moreover,
$$ f'_{\alpha, 0} (t) \leq f'_{\alpha} (t) $$
(5.14)
for all \(t \in[0, r]\) holds in general and \(\frac{f'_{\alpha}}{f'_{\alpha, 0}}\) can be arbitrarily large (see, for example, [1, 2, 10, 11, 23, 25]). These observations motivate us to introduce the tighter majorizing sequences \(\{ r_{\alpha, n } \}\), \(\{ s_{\alpha, n } \}\) by
$$\begin{aligned} &r_{\alpha, 0} = 0,\qquad r_{\alpha, 1} = \eta= -\frac{f_{\alpha}(0)}{ f' _{\alpha}(0) } , \\ &r _{\alpha, 2} = r _{\alpha, 1 } - \frac { \alpha (f_{\alpha, 0} ( r _{\alpha, 1} ) - f_{\alpha, 0} (r _{\alpha, 0} ) - f '_{\alpha, 0} (r _{\alpha, 0} ) (r_{\alpha, 1} - r _{\alpha, 0}) )}{f ' _{\alpha, 0} (r _{\alpha, 1})} , \\ &r _{\alpha, n+1} = r _{\alpha, n } - \frac{ \int_{0 }^{ 1 } ( f'_{\alpha}(r _{\alpha, n }^{\theta}) - f' _{\alpha}(r _{\alpha, n-1 })) (r _{\alpha, n } - r _{\alpha, n-1 } )\,d\theta}{f ' _{\alpha, 0} (r _{\alpha, n})} \end{aligned}$$
(5.15)
for each \(n\geq2\) and
$$ \begin{aligned} &s_{\alpha, 0} = 0,\qquad s_{\alpha, 1} = r_{\alpha, 1} , \\ &s _{\alpha, n+1} = s _{\alpha, n } - \frac{ \int_{0 }^{ 1 } ( f'_{\alpha}(s _{\alpha, n }^{\theta}) - f' _{\alpha}(s _{\alpha, n-1 })) (s _{\alpha, n } - s _{\alpha, n-1 } )\,d\theta}{f ' _{\alpha, 0} (s _{\alpha, n})} \end{aligned} $$
(5.16)
for each \(n\geq1\).

6 Conclusion

Using a combination of average and center-average type conditions, we presented a semilocal convergence analysis for the method (GNA) to approximate a solution or a fixed point of a convex composite optimization problem in the setting of finite dimensional spaces. Our analysis extends the applicability of the method (GNA) under the same computational cost as in earlier studies, such as [4, 5, 7, 12, 13, 2635].

Declarations

Acknowledgements

The second author was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2014R1A2A2A01002100).

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
Department of Mathematical Sciences, Cameron University
(2)
Department of Mathematics Education and RINS, Gyeongsang National University
(3)
Department of Mathematics, King Abdulaziz University
(4)
Laboratoire de Mathématiques et Applications, Poitiers University

References

  1. Argyros, IK: Convergence and Applications of Newton-Type Iterations. Springer, New York (2009)
  2. Argyros, IK, Hilout, S: Computational Methods in Nonlinear Analysis: Efficient Algorithms, Fixed Point Theory and Applications. World Scientific, Singapore (2013)
  3. Burke, JV, Ferris, MC: A Gauss-Newton method for convex composite optimization. Math. Program., Ser. A 71, 179-194 (1995)
  4. Moldovan, A, Pellegrini, L: On regularity for constrained extremum problems, I. Sufficient optimality conditions. J. Optim. Theory Appl. 142, 147-163 (2009)
  5. Moldovan, A, Pellegrini, L: On regularity for constrained extremum problems, II. Necessary optimality conditions. J. Optim. Theory Appl. 142, 165-183 (2009)
  6. Rockafellar, RT: Convex Analysis. Princeton Mathematical Series, vol. 28. Princeton University Press, Princeton (1970)
  7. Li, C, Ng, KF: Majorizing functions and convergence of the Gauss-Newton method for convex composite optimization. SIAM J. Optim. 18, 613-642 (2007)
  8. Robinson, SM: Extension of Newton’s method to nonlinear functions with values in a cone. Numer. Math. 19, 341-347 (1972)
  9. Robinson, SM: Stability theory for systems of inequalities, I. Linear systems. SIAM J. Numer. Anal. 12, 754-769 (1975)
  10. Argyros, IK, Cho, YJ, Hilout, S: Numerical Methods for Equations and Its Applications. CRC Press, New York (2012)
  11. Argyros, IK, Hilout, S: Extending the applicability of the Gauss-Newton method under average Lipschitz-type conditions. Numer. Algorithms 58, 23-52 (2011)
  12. Wang, XH: Convergence of Newton’s method and inverse function theorem in Banach space. Math. Comput. 68, 169-186 (1999)
  13. Wang, XH: Convergence of Newton’s method and uniqueness of the solution of equations in Banach space. IMA J. Numer. Anal. 20, 123-134 (2000)
  14. Xu, XB, Li, C: Convergence of Newton’s method for systems of equations with constant rank derivatives. J. Comput. Math. 25, 705-718 (2007)
  15. Xu, XB, Li, C: Convergence criterion of Newton’s method for singular systems with constant rank derivatives. J. Math. Anal. Appl. 345, 689-701 (2008)
  16. Zabrejko, PP, Nguen, DF: The majorant method in the theory of Newton-Kantorovich approximations and the Pták error estimates. Numer. Funct. Anal. Optim. 9, 671-684 (1987)
  17. Häußler, WM: A Kantorovich-type convergence analysis for the Gauss-Newton method. Numer. Math. 48, 119-125 (1986)
  18. Hiriart-Urruty, JB, Lemaréchal, C: Convex Analysis and Minimization Algorithms I: Fundamentals, vol. 305. Springer, Berlin (1993)
  19. Hiriart-Urruty, JB, Lemaréchal, C: Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods, vol. 306. Springer, Berlin (1993)
  20. Ferreira, OP, Svaiter, BF: Kantorovich’s majorants principle for Newton’s method. Comput. Optim. Appl. 42, 213-229 (2009)
  21. Li, C, Wang, XH: On convergence of the Gauss-Newton method for convex composite optimization. Math. Program., Ser. A 91, 349-356 (2002)
  22. Ng, KF, Zheng, XY: Characterizations of error bounds for convex multifunctions on Banach spaces. Math. Oper. Res. 29, 45-63 (2004)
  23. Argyros, IK, Hilout, S: Weaker conditions for the convergence of Newton’s method. J. Complex. 28, 364-387 (2012)
  24. Kantorovich, LV, Akilov, GP: Functional Analysis. Pergamon Press, Oxford (1982)
  25. Argyros, IK: Approximating solutions of equations using Newton’s method with a modified Newton’s method iterate as a starting point. Rev. Anal. Numér. Théor. Approx. 36, 123-138 (2007)
  26. Argyros, IK, Cho, YJ, George, S: On the ‘Terra incognita’ for the Newton-Kantorovich method. J. Korean Math. Soc. 51, 251-266 (2014)
  27. Argyros, IK, Cho, YJ, Khattri, SK: On a new semilocal convergence analysis for the Jarratt method. J. Inequal. Appl. 2013, 194 (2013)
  28. Argyros, IK, Cho, YJ, Ren, HM: Convergence of Halley’s method for operators with the bounded second derivative in Banach spaces. J. Inequal. Appl. 2013, 260 (2013)
  29. Argyros, IK, Hilout, S: Local convergence analysis for a certain class of inexact methods. J. Nonlinear Sci. Appl. 1, 244-253 (2008)
  30. Argyros, IK, Hilout, S: Local convergence analysis of inexact Newton-like methods. J. Nonlinear Sci. Appl. 2, 11-18 (2009)
  31. Argyros, IK, Hilout, S: Multipoint iterative processes of efficiency index higher than Newton’s method. J. Nonlinear Sci. Appl. 2, 195-203 (2009)
  32. Sahu, DR, Cho, YJ, Agarwal, RP, Argyros, IK: Accessibility of solutions of operator equations by Newton-like method. J. Complex. 31, 637-657 (2015)
  33. Li, C, Hu, N, Wang, J: Convergence behavior of Gauss-Newton’s method and extensions to the Smale point estimate theory. J. Complex. 26, 268-295 (2010)
  34. Li, C, Zhang, WH, Jin, XQ: Convergence and uniqueness properties of Gauss-Newton’s method. Comput. Math. Appl. 47, 1057-1067 (2004)
  35. Chen, X, Yamamoto, T: Convergence domains of certain iterative methods for solving nonlinear equations. Numer. Funct. Anal. Optim. 10, 37-48 (1989)

Copyright

© Argyros et al. 2015