
Mathematical programming for the sum of two convex functions with applications to lasso problem, split feasibility problems, and image deblurring problem

Abstract

In this paper, two iteration processes are used to find the solutions of the mathematical programming problem for the sum of two convex functions. In an infinite dimensional Hilbert space, we establish two strong convergence theorems for this problem. As applications of our results, we give strong convergence theorems for the split feasibility problem with a modified CQ method, a strong convergence theorem for the lasso problem, and strong convergence theorems for mathematical programming with a modified proximal point algorithm and a modified gradient-projection method in an infinite dimensional Hilbert space. We also apply our result on the lasso problem to the image deblurring problem. Some numerical examples are given to demonstrate our results. The main result of this paper provides a unified study of many types of optimization problems. The algorithms we use to solve these problems differ from those in the literature. Some results of this paper are original, and some improve, extend, and unify comparable existing results in the literature.

1 Introduction

Let \(\mathbb{R}\) be the set of real numbers, H and \(H_{1}\) be (real) Hilbert spaces with inner product \(\langle\cdot,\cdot\rangle\) and norm \(\|\cdot\|\). Let C and Q be nonempty, closed, convex subsets of H and \(H_{1}\), respectively. Let \(\Gamma_{0}(H)\) be the space of all proper lower semicontinuous convex functions from H to \((-\infty ,\infty]\). In this paper, we consider the following minimization problem:

$$ \mathop{\arg\min}_{x\in H} h_{1}(x)+g_{1}(x) , $$
(1.1)

where \(h_{1},g_{1}\in\Gamma_{0}(H)\), and \(g_{1}:H\rightarrow \mathbb{R}\) is a Fréchet differentiable function.

Let A be an \(m\times n\) real matrix, \(x\in\mathbb{R}^{n}\), \(b\in\mathbb{R}^{m}\), let \(\gamma\geq0\) be a regularization parameter, and let \(t\geq0\). Tibshirani [1] studied the following minimization problem:

$$ \min_{x}\frac{1}{2}\|Ax-b \|^{2}_{2} \quad \mbox{subject to } \|x\| _{1}\leq t , $$
(1.2)

where \(\|x\|_{1}=\sum_{i=1}^{n}|x_{i}|\) for \(x=(x_{1},x_{2},\ldots,x_{n})\in\mathbb{R}^{n}\), and \(\|y\|^{2}_{2}=\sum_{i=1}^{m}(y_{i})^{2}\) for \(y=(y_{1},y_{2},\ldots, y_{m})\in\mathbb{R}^{m}\). Problem (1.2) is called the lasso, an abbreviation of ‘least absolute shrinkage and selection operator’. This is a special case of problem (1.1). Besides, we know that problem (1.2) is equivalent to the following problem:

$$ \min_{x}\frac{1}{2}\|Ax-b \|^{2}_{2}+\gamma\|x\|_{1} . $$
(1.3)

Therefore, problems (1.2) and (1.3) are special cases of problem (1.1).

Due to the involvement of the \(\ell_{1}\) norm, which promotes the sparsity observed in many real world problems arising from image/signal processing, statistical regression, machine learning, and so on, the lasso has received much attention (see Combettes and Wajs [2], Xu [3], and Wang and Xu [4]).

Let \(f:H\rightarrow (-\infty,\infty]\) be proper and let C be a nonempty subset of \(\operatorname{dom}(f)\). Then f is said to be uniformly convex on C if

$$f\bigl(\alpha x+(1-\alpha)y+\alpha(1-\alpha)p\bigl(\Vert x-y\Vert \bigr) \bigr)\leq\alpha f(x)+(1-\alpha)f(y) $$

for all \(x,y \in C\) and all \(\alpha\in(0,1)\), where \(p:[0,\infty )\rightarrow [0,\infty]\) is an increasing function that vanishes only at 0.

Let \(g\in\Gamma_{0}(H)\) and \(\lambda\in(0,\infty)\). The proximal operator of g is defined by

$$\operatorname{prox}_{g}x=\mathop{\arg\min}_{v\in H}\biggl(g(v)+ \frac{1}{2}\|v-x\|^{2}\biggr),\quad x\in H. $$

The proximal operator of \(g\in\Gamma_{0}(H)\) of order \(\lambda\in (0,\infty)\) is defined as the proximal operator of λg, that is,

$$\operatorname{prox}_{\lambda g}x=\mathop{\arg\min}_{v\in H}\biggl(g(v)+ \frac{1}{2\lambda} \Vert v-x\Vert ^{2}\biggr),\quad x\in H. $$
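
With the definition above, the proximal operator often has a simple closed form. For example, the proximal operator of \(g=\|\cdot\|_{1}\) on \(\mathbb{R}^{n}\) is componentwise soft-thresholding, and that of a quadratic is affine. The following short Python sketch (not part of the original paper; all names and data are our own choices) illustrates both:

```python
import numpy as np

def prox_l1(x, lam):
    """Proximal operator of lam*||.||_1: componentwise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_sq(x, lam, a):
    """Proximal operator of g(v) = 0.5*||v - a||^2 with parameter lam (closed form)."""
    return (x + lam * a) / (1.0 + lam)

x = np.array([1.5, -0.2, 0.7])
print(prox_l1(x, 0.5))                 # [ 1.  -0.   0.2]
print(prox_sq(x, 2.0, np.zeros(3)))    # x / 3
```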

The following are important results on the solution of problem (1.1).

Theorem 1.1

(Douglas-Rachford-algorithm) [5]

Let f and g be functions in \(\Gamma_{0}(H)\) such that \((\partial f+\partial g)^{-1}0\neq\emptyset\). Let \(\{\lambda_{n}\}_{n\in \mathbb{N}}\) be a sequence in \([0,2]\) such that \(\sum_{n\in\mathbb{N}}\lambda_{n}(2-\lambda_{n})=+\infty\). Let \(\gamma\in(0,\infty)\), and \(x_{0}\in H\). Set

$$ \left \{ \textstyle\begin{array}{l} y_{n}=\operatorname{prox}_{\gamma g} x_{n}, \\ z_{n}=\operatorname{prox}_{\gamma f}(2y_{n}-x_{n}), \\ x_{n+1}=x_{n}+\lambda_{n}(z_{n}-y_{n}),\quad n\in\mathbb{N}. \end{array}\displaystyle \right . $$

Then there exists \(x\in H\) such that the following hold:

  1. (i)

    \(\operatorname{prox}_{\gamma g} x\in\arg\min_{x\in H}(f+g)(x)\);

  2. (ii)

    \(\{y_{n}-z_{n}\}_{n\in\mathbb{N}}\) converges strongly to 0;

  3. (iii)

    \(\{x_{n}\}_{n\in\mathbb{N}}\) converges weakly to x;

  4. (iv)

    \(\{y_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge weakly to \(\operatorname{prox}_{\gamma g}x\);

  5. (v)

    suppose that one of the following holds:

    1. (a)

      f is uniformly convex on every nonempty subset of \(\operatorname{dom}\partial f\);

    2. (b)

      g is uniformly convex on every nonempty bounded subset of \(\operatorname{dom}\partial g\).

Then \(\{y_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge strongly to \(\operatorname{prox}_{\gamma g}x\), which is the unique minimizer of \(f+g\).
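
To illustrate Theorem 1.1 concretely, the following Python sketch (our own, not from [5]) runs the Douglas-Rachford recursion on \(\mathbb{R}^{2}\) with f the indicator function of the box \([-1,1]^{2}\) and \(g(x)=\frac{1}{2}\|x-c\|^{2}\), both of whose proximal operators are available in closed form; the choices of c, γ, and \(\lambda_{n}\) are ours.

```python
import numpy as np

# Douglas-Rachford recursion of Theorem 1.1 with f = indicator of [-1,1]^2
# (prox = projection onto the box) and g(x) = 0.5*||x - c||^2.
c = np.array([2.0, -3.0])
gamma, lam = 1.0, 1.0                                # lambda_n = 1 in [0,2]
prox_gf = lambda x: np.clip(x, -1.0, 1.0)            # prox_{gamma f}
prox_gg = lambda x: (x + gamma * c) / (1.0 + gamma)  # prox_{gamma g}

x = np.zeros(2)
for n in range(200):
    y = prox_gg(x)
    z = prox_gf(2 * y - x)
    x = x + lam * (z - y)
print(prox_gg(x))   # approx argmin(f+g): the projection of c onto the box, [1, -1]
```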

Theorem 1.2

(Forward-backward algorithm) [6]

Let \(f\in\Gamma_{0}(H)\), let \(g:H\rightarrow \mathbb{R}\) be convex and differentiable with a \(\frac {1}{\beta}\)-Lipschitz continuous gradient for some \(\beta\in (0,\infty)\), let \(\gamma\in(0,2\beta)\), and set \(\delta=\min\{ 1,\frac{\beta}{\gamma}\}+\frac{1}{2}\). Furthermore, let \(\{\lambda_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((0,2\delta]\) such that \(\sum_{n\in\mathbb{N}}\lambda_{n}(\delta-\lambda_{n})=+\infty\). Suppose that \(\arg\min (f+g)\neq\emptyset\) and let

$$ \left \{ \textstyle\begin{array}{l} y_{n}=x_{n}-\gamma\nabla g(x_{n}), \\ x_{n+1}=x_{n}+\lambda_{n}(\operatorname{prox}_{\gamma f}y_{n}-x_{n}), \quad n\in\mathbb{N}. \end{array}\displaystyle \right . $$

Then the following hold:

  1. (i)

    \((x_{n})_{n\in\mathbb{N}}\) converges weakly to a point in \(\arg\min_{x\in H}(f+g)(x)\);

  2. (ii)

    suppose that \(\inf_{n\in\mathbb{N}}\lambda_{n}\in (0,\infty)\) and one of the following holds:

    1. (a)

      f is uniformly convex on every nonempty bounded subset of \(\operatorname{dom}\partial f\);

    2. (b)

      g is uniformly convex on every nonempty bounded subset of H.

Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges strongly to the unique minimizer of \(f+g\).
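
As an illustration of Theorem 1.2, the following Python sketch (ours, not from [6]) applies the forward-backward recursion to \(f=\mu\|\cdot\|_{1}\) and \(g(x)=\frac{1}{2}\|Ax-b\|^{2}_{2}\) on \(\mathbb{R}^{5}\), i.e., to a small lasso-type instance; the random data and the choices of γ and \(\lambda_{n}=1\) are ours.

```python
import numpy as np

# Forward-backward recursion of Theorem 1.2 for f = mu*||.||_1 and
# g(x) = 0.5*||Ax - b||^2; grad g is (1/beta)-Lipschitz with 1/beta = ||A^T A||.
rng = np.random.default_rng(0)
A, b, mu = rng.standard_normal((20, 5)), rng.standard_normal(20), 0.1
beta = 1.0 / np.linalg.norm(A.T @ A, 2)
gamma, lam = beta, 1.0                      # gamma in (0, 2*beta)

prox_f = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t * mu, 0.0)
grad_g = lambda x: A.T @ (A @ x - b)

x = np.zeros(5)
for n in range(500):
    y = x - gamma * grad_g(x)
    x = x + lam * (prox_f(y, gamma) - x)
print(x)   # approximate minimizer of f + g
```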

Theorem 1.3

(Tseng’s algorithm) [7]

Let D be a nonempty subset of H, let \(f\in\Gamma _{0}(H)\) be such that \(\operatorname{dom}\partial f\subset D\), and let \(g\in\Gamma _{0}(H)\) be Gâteaux differentiable on D. Suppose that C is a nonempty, closed, convex subset of D such that \(C\cap\arg \min_{x\in H}(f+g)(x)\neq\emptyset\), and that ∇g is \(\frac{1}{\beta}\)-Lipschitz continuous relative to \(C\cup \operatorname{dom} \partial f\) for some \(\beta\in(0,\infty)\). Let \(x_{0}\in C\) and \(\gamma\in(0,\beta)\), and set

$$ \left \{ \textstyle\begin{array}{l} y_{n}=x_{n}-\gamma\nabla g(x_{n}), \\ z_{n}=\operatorname{prox}_{\gamma f}y_{n}, \\ r_{n}=z_{n}-\gamma\nabla g(z_{n}), \\ x_{n+1}=P_{C}(x_{n}-y_{n}+r_{n}),\quad n\in\mathbb{N}. \end{array}\displaystyle \right . $$

Then

  1. (a)

    \(\{x_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge weakly to a point in \(C\cap\arg\min_{x\in H}(f+g)(x)\);

  2. (b)

    suppose that f or g is uniformly convex on every nonempty subset of \(\operatorname{dom}\partial f\).

Then \(\{x_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge strongly to a point in \(C\cap\arg\min_{x\in H}(f+g)(x)\).
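
A small numerical sketch of Tseng's recursion in Theorem 1.3 (ours, not from [7]) is given below, with C the nonnegative orthant of \(\mathbb{R}^{3}\), \(f=\|\cdot\|_{1}\), and \(g(x)=\frac{1}{2}\|x-c\|^{2}\); the data c and the choice of γ are ours.

```python
import numpy as np

# Tseng's recursion of Theorem 1.3: C = nonnegative orthant, f = ||.||_1,
# g(x) = 0.5*||x - c||^2 (so grad g is 1-Lipschitz and beta = 1).
c = np.array([2.0, -1.0, 0.5])
grad_g = lambda x: x - c
prox_f = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
P_C = lambda x: np.maximum(x, 0.0)
gamma = 0.5                                 # gamma in (0, beta)

x = np.zeros(3)
for n in range(300):
    y = x - gamma * grad_g(x)
    z = prox_f(y, gamma)
    r = z - gamma * grad_g(z)
    x = P_C(x - y + r)
print(x)   # approx point of C ∩ argmin(f+g); here [1, 0, 0]
```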

Combettes and Wajs [2] used the proximal gradient method to generate a sequence \(\{x_{n}\}\) by the following algorithm: \(x_{0}\in H \) is chosen arbitrarily, and

$$x_{n+1}=\operatorname{prox}_{\lambda_{n}g}(I-\lambda_{n}{ \nabla f})x_{n}. $$

Combettes and Wajs [2] showed that \(\{x_{n}\}\) converges weakly to a solution of the minimization problem (1.1) under suitable conditions. In 2014, Xu [3] gave an iteration process and proved weak convergence theorems for the solution of problem (1.1). Next, Wang and Xu [4] studied problem (1.1) by the following two types of iteration processes:

  1. (a)

    \(x_{n+1}:=\operatorname{prox}_{\lambda_{n}g}((1-\gamma_{n})x_{n}-\delta _{n}\nabla f(x_{n}))\) for all \(n\in\mathbb{N}\);

  2. (b)

    \(x_{n+1}:=(I-\gamma_{n}\operatorname{prox}_{\lambda_{n}g}(I-\delta _{n}\nabla f))x_{n}\) for all \(n\in\mathbb{N}\).

For these two iteration processes, Wang and Xu [4] proved that they converge strongly to a solution of the problem (1.1) under suitable conditions.

Let I be the identity mapping on H, and let \(f_{1}: C\times C\rightarrow \mathbb{R}\) be a bifunction. Let \(g_{1}:H\rightarrow (-\infty ,\infty]\) be a proper convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on \(\operatorname{int}(\operatorname{dom}(g_{1}))\), \(C \subset \operatorname{int}(\operatorname{dom}(g_{1}))\), and let \(h_{1}:H\rightarrow (-\infty,\infty]\) be a proper convex lower semicontinuous function. Let \(P_{C}\) be the metric projection of H onto C. Throughout this paper, we use these notations unless specified otherwise.

Motivated by the results of the above problems, in this paper, we introduce the following iterations to study problem (1.1).

Iteration (I)

Let \(x_{1}\in C \) be chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}),\quad n\in\mathbb{N}, \end{array}\displaystyle \right . $$

where \(f_{1}(x,y)=\langle y-x, \nabla g_{1}(x)\rangle+h_{1}(y)-h_{1}(x)\).

Iteration (II)

Let \(x_{1}\in C \) be chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{r}^{\partial{h_{1}}}(I-r\nabla g_{1})x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}), \quad n\in\mathbb{N}. \end{array}\displaystyle \right . $$

Then we establish two strong convergence theorems without the uniform convexity assumption on the functions we consider. Our results improve those of Combettes and Wajs [2], Xu [3], the Douglas-Rachford algorithm [5], Theorem 1.2, and Tseng's algorithm [7]. Our results are also different from those of Wang and Xu [4].

We also apply our results to study the following problems.

  1. (AP1)

    Split feasibility problem:

    $$ \text{Find } \bar{x}\in H \text{ such that } \bar{x}\in C \text{ and }A\bar{x}\in Q, $$
    (SFP)

    where \(A:H\rightarrow H_{1}\) is a linear and bounded operator.

In 1994, the split feasibility problem (SFP) in finite dimensional Hilbert spaces was first introduced by Censor and Elfving [8] for modeling inverse problems which arise from phase retrievals and in medical image reconstruction. Since then, the split feasibility problem (SFP) has received much attention due to its applications in signal processing, image reconstruction, with particular progress in intensity-modulated radiation therapy, approximation theory, control theory, biomedical engineering, communications, and geophysics. For examples, one can refer to [8–13] and related literature.

In 2002, Byrne [9] first introduced the so-called CQ algorithm which generates a sequence \(\{x_{n}\}\) by the following recursive procedure:

$$x_{n+1}=P_{C} \bigl(x_{n}-\rho_{n} A^{*}(I-P_{Q})Ax_{n}\bigr), $$

where the stepsize \(\rho_{n}\) is chosen in the interval \((0, 2/\|A\|^{2})\), and \(P_{C}\) and \(P_{Q}\) are the metric projections onto \(C\subseteq \mathbb{R}^{n}\) and \(Q\subseteq \mathbb{R}^{m}\), respectively. Byrne [9] used the CQ iteration method to study the split feasibility problem in finite dimensional spaces, but in an infinite dimensional Hilbert space the CQ algorithm may converge only weakly to a solution of the split feasibility problem [14]. Hence, some modified CQ algorithms have been introduced.
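
For concreteness, the following Python sketch (ours, not from [9]) runs the CQ recursion on a small feasible instance with \(C=[0,1]^{2}\), \(Q=[2,3]\times[2,3]\), and \(A=\operatorname{diag}(3,4)\); the data and the constant stepsize are our own choices.

```python
import numpy as np

# Byrne's CQ recursion x_{n+1} = P_C(x_n - rho*A^T(I - P_Q)Ax_n) for the SFP.
A = np.array([[3.0, 0.0], [0.0, 4.0]])
P_C = lambda x: np.clip(x, 0.0, 1.0)
P_Q = lambda y: np.clip(y, 2.0, 3.0)
rho = 1.0 / np.linalg.norm(A, 2) ** 2     # stepsize in (0, 2/||A||^2)

x = np.zeros(2)
for n in range(200):
    x = P_C(x - rho * A.T @ (A @ x - P_Q(A @ x)))
print(x, A @ x)   # x in C with Ax in Q; here roughly x = [2/3, 1/2], Ax = [2, 2]
```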

In this paper, we give a new iteration and a modified CQ method to study the (SFP), and we establish strong convergence theorems for the solution of this problem. Our results are different from Theorem 5.1 of Xu [14] and improve other results of Xu [14].

  1. (AP2)

    Lasso problem:

    $$ \mathop{\arg\min}_{x\in\mathbb{R}^{n}}\frac{1}{2}\|Ax-b\|^{2}_{2}+ \gamma\|x\|_{1}. $$

We give two iterations to study the lasso problem, and we establish two strong convergence theorems of the lasso problem.

  1. (AP3)

    Mathematical programming for convex function:

    $$ \mathop{\arg\min}_{x\in H}h_{1}(x),\mbox{where }h_{1}\in\Gamma_{0}(H). $$

Rockafellar [15] used a proximal point algorithm and proved a weak convergence theorem for the solution of this problem. We establish a modified proximal point algorithm and prove a strong convergence theorem for this problem. Our result improves the results given by Rockafellar [15].

  1. (AP4)

    Mathematical programming for convex Fréchet differentiable function:

    $$ \mbox{Find } \mathop{\arg\min}_{x\in H}g_{1}(x) , \mbox{where } g_{1}:H\rightarrow \mathbb {R} \mbox{ is a Fr\'{e}chet differentiable function}. $$

Recently, Xu [16] studied weak and strong convergence theorems for this problem with various types of relaxed gradient-projection iterations. He also used the viscosity nature of the gradient-projection method and the regularized method to establish strong convergence theorems for this problem.

A special case of one of our iterations is a modified gradient-projection algorithm. We use this modified gradient-projection algorithm to establish a strong convergence theorem for problem (AP4), and our results improve recent results given by Xu in [16].

In this paper, we apply a recent result of Yu and Lin [17] to find the solution of the mathematical programming problem for the sum of two convex functions, and then we apply our results on this problem to study the above problems. We establish strong convergence theorems for these problems and apply our result on the lasso problem to the image deblurring problem. Some numerical examples are given to demonstrate our results. The main result of this paper gives a unified study of many types of optimization problems. The algorithms we use to solve these problems differ from those in the literature. Some results of this paper are original, and some improve, extend, and unify comparable existing results in the literature.

2 Preliminaries

Throughout this paper, we denote the strong convergence of \(\{x_{n}\}\) to \(x\in H\) by \(x_{n}\rightarrow x\). Let \(T:C\rightarrow H\) be a mapping, and let \(\operatorname{Fix}(T):=\{x\in C: Tx=x\}\) denote the set of fixed points of T. Thus:

  1. (i)

    T is called nonexpansive if \(\|Tx-Ty\|\leq\|x-y\|\) for all \(x,y\in C\).

  2. (ii)

    T is strongly monotone if there exists \(\bar{\gamma}> 0\) such that \(\langle x-y, Tx-Ty\rangle\geq\bar{\gamma}\|x-y\|^{2}\) for all \(x, y\in C\).

  3. (iii)

    T is Lipschitz continuous if there exists \(L > 0\) such that \(\|Tx-Ty\|\leq L\|x-y\|\) for all \(x,y\in C\).

  4. (iv)

    Let \(\alpha>0\). Then T is α-inverse-strongly monotone if \(\langle x-y,Tx-Ty\rangle \geq\alpha\|Tx-Ty\|^{2}\) for all \(x,y\in C\). We say that T is α-ism if T is α-inverse-strongly monotone.

  5. (v)

    T is firmly nonexpansive if

    $$\|Tx-Ty\|^{2}\leq\|x-y\|^{2}-\bigl\Vert (I-T)x-(I-T)y\bigr\Vert ^{2}\quad \text{for every }x,y\in C, $$

    that is,

    $$\|Tx-Ty\|^{2}\leq\langle x-y,Tx-Ty\rangle\quad \text{for every }x,y\in C. $$

Let \(B:H\multimap H\) be a multivalued mapping. The effective domain of B is denoted by \(D(B)\), that is, \(D(B) = \{x\in H : Bx\neq\emptyset \}\). Thus:

  1. (i)

    B is a monotone operator on H if \(\langle x-y,u-v\rangle \geq0\) for all \(x, y\in D(B)\), \(u\in Bx\) and \(v\in By\).

  2. (ii)

    B is a maximal monotone operator on H if B is a monotone operator on H and its graph is not properly contained in the graph of any other monotone operator on H.

Lemma 2.1

[18]

Let \(G_{1}:H\multimap H\) be maximal monotone. Let \(J_{r}^{G_{1}}\) be the resolvent of \(G_{1}\) defined by \(J_{r}^{G_{1}}=(I+r G_{1})^{-1}\) for each \(r>0\). Then the following hold:

  1. (i)

    For each \(r>0\), \(J_{r}^{G_{1}}\) is single-valued and firmly nonexpansive.

  2. (ii)

    \(\mathcal{D}(J_{r}^{G_{1}})=H\) and \(\operatorname{Fix}(J_{r}^{G_{1}})=\{ x\in \mathcal{D}(G_{1}):0\in G_{1}x\}\).

Let C be a nonempty, closed, convex subset of a real Hilbert space H. Let \(g:C\times C\rightarrow \mathbb{R}\) be a function. The Ky Fan inequality problem [19] is to find \(z\in C\) such that

$$ g(z,y)\geq0\quad \text{for each }y\in C. $$
(EP)

The solution set of the Ky Fan inequality problem (EP) is denoted by \(\operatorname{KF}(C,g)\).

For solving the Ky Fan inequality problem, let us assume that the bifunction \(g:C\times C\rightarrow \mathbb{R}\) satisfies the following conditions:

  1. (A1)

    \(g(x,x)=0\) for each \(x\in C\);

  2. (A2)

    g is monotone, i.e., \(g(x, y) +g(y, x)\leq0\) for any \(x,y\in C\);

  3. (A3)

    for each \(x,y,z\in C\), \(\limsup_{t\downarrow0}g(tz+(1-t)x,y)\leq g(x,y)\);

  4. (A4)

    for each \(x\in C\), the scalar function \(y\rightarrow g(x,y)\) is convex and lower semicontinuous.

We have the following result from Blum and Oettli [20].

Lemma 2.2

[20]

Let \(g: C\times C\rightarrow \mathbb {R}\) be a bifunction which satisfies conditions (A1)-(A4). Then for each \(r>0\) and each \(x\in H\), there exists \(z\in C\) such that

$$g(z,y)+\frac{1}{r}\langle y-z,z-x\rangle\geq0 $$

for all \(y\in C\).

In 2005, Combettes and Hirstoaga [21] established the following important properties of the resolvent operator.

Lemma 2.3

[21]

Let \(g:C\times C\rightarrow \mathbb{R}\) be a function satisfying conditions (A1)-(A4). For \(r>0\), define \(T_{r}^{g}:H\rightarrow C\) by

$$T_{r}^{g}x= \biggl\{ z\in C:g(z,y)+\frac{1}{r}\langle y-z,z-x\rangle \geq 0, \forall y\in C \biggr\} $$

for all \(x\in H\). Then the following hold:

  1. (i)

    \(T_{r}^{g}\) is single-valued;

  2. (ii)

    \(T_{r}^{g}\) is firmly nonexpansive, that is, \(\|T_{r}^{g}x-T_{r}^{g}y\|^{2}\leq\langle x-y,T_{r}^{g}x-T_{r}^{g}y\rangle\) for all \(x,y\in H\);

  3. (iii)

    \(\{x\in H: T_{r}^{g}x=x\}=\{x\in C: g(x,y)\geq0, \forall y\in C\}\);

  4. (iv)

    \(\{x\in C: g(x,y)\geq0, \forall y\in C\}\) is a closed and convex subset of C.

We call such \(T_{r}^{g}\) the resolvent of g for \(r>0\).

Takahashi et al. [22] gave the following lemma.

Lemma 2.4

[22]

Let \(g: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying the conditions (A1)-(A4). Define \(A_{g}\) as follows:

$$ A_{g}x=\left \{ \textstyle\begin{array}{l@{\quad}l} \{z\in H: g(x,y)\geq\langle y-x, z\rangle, \forall y\in C\} & \textit{if } x\in C, \\ \emptyset& \textit{if } x\notin C. \end{array}\displaystyle \right . $$
(L4.2)

Then \(\operatorname{KF}(C,g)=A^{-1}_{g}0\) and \(A_{g}\) is a maximal monotone operator with the domain of \(A_{g}\subset C\). Furthermore, for any \(x\in H\) and \(r>0\), the resolvent \(T_{r}^{g}\) of g coincides with the resolvent of \(A_{g}\), i.e., \(T_{r}^{g}x=(I+rA_{g})^{-1}x\).

Let \(f:H\rightarrow (-\infty,\infty]\) be proper. The subdifferential ∂f of f is the set valued operator defined by \(\partial f(x)= \{u\in H: f(y)\geq f(x)+\langle y-x,u\rangle\text{ for all } y\in H\}\).

Let \(x\in H\). Then f is said to be subdifferentiable at x if \(\partial f(x)\neq\emptyset\). The elements of \(\partial f(x)\) are called the subgradients of f at x.

The directional derivative of f at x in the direction y is

$$f'(x,y)= \lim_{\alpha\downarrow0}\frac {f(x+\alpha y)-f(x)}{\alpha}, $$

provided that the limit exists in \([-\infty,\infty]\).

Let \(x\in \operatorname{dom} f\) and suppose that \(y\mapsto f'(x,y)\) is linear and continuous on H; then f is said to be Gâteaux differentiable at x. By the Riesz representation theorem, there exists a unique vector \(\nabla f(x)\in H\) such that \(f'(x,y)=\langle y,\nabla f(x)\rangle\) for all \(y\in H\).

Let \(x\in H\), let \(\mu(x)\) denote the family of all neighborhoods of x, let \(C\in\mu(x)\), and let \(f:C\rightarrow (-\infty,\infty]\). Then f is said to be Fréchet differentiable at x if there exists an operator \(\nabla f(x)\in B(H,\mathbb{R})\), called the Fréchet derivative of f at x, such that

$$\lim_{0\neq\|y\|\rightarrow 0}\frac {|f(x+y)-f(x)-\langle y,\nabla f(x)\rangle|}{\|y\|}=0. $$

Further, if f is Fréchet differentiable at x, then f is Gâteaux differentiable at x.

Let C be a nonempty, closed, convex subset of H. The indicator function \(\iota_{C}\) defined by

$$ \iota_{C}x=\left \{ \textstyle\begin{array}{l@{\quad}l} 0, &x\in C, \\ +\infty, &x\notin C, \end{array}\displaystyle \right . $$

is a proper lower semicontinuous convex function and its subdifferential \(\partial\iota_{C}\) defined by

$$ \partial\iota_{C}x=\bigl\{ z\in H:\langle y-x,z\rangle \leq \iota_{C}(y)-\iota _{C}(x), \forall y\in H\bigr\} $$

is a maximal monotone operator (see Lemma 2.8). Furthermore, we also define the normal cone \(N_{C}u\) of C at u as follows:

$$N_{C}u=\bigl\{ z\in H: \langle z, v-u\rangle\leq0, \forall v\in C\bigr\} . $$

We can define the resolvent \(J_{\lambda}^{\partial i_{C}}\) of \(\partial i_{C}\) for \(\lambda>0\), i.e.,

$$J_{\lambda}^{\partial i_{C}}x=(I+\lambda\partial i_{C})^{-1}x $$

for all \(x\in H\). Since

$$\begin{aligned} \partial i_{C}x = &\bigl\{ z\in H: i_{C}x+\langle z,y-x\rangle\leq i_{C}y, \forall y\in H\bigr\} \\ = & \bigl\{ z\in H: \langle z,y-x\rangle\leq0, \forall y\in C\bigr\} \\ = & N_{C}x \end{aligned}$$

for all \(x\in C\), we have

$$\begin{aligned} u=J_{\lambda}^{\partial i_{C}}x \quad \Leftrightarrow\quad &x\in u+\lambda\partial i_{C}u \\ \quad \Leftrightarrow \quad &x-u\in\lambda N_{C}u \\ \quad \Leftrightarrow \quad & \langle x-u,y-u\rangle\leq0, \quad \forall y\in C \\ \quad \Leftrightarrow\quad &u=P_{C}x. \end{aligned}$$

For details see [22].

Lemma 2.5

[4]

Let \(g\in \Gamma_{0}(H)\) and \(\lambda\in(0,\infty)\). Thus:

  1. (i)

    If C is a nonempty, closed, convex subset of H and \(g=i_{C}\) is the indicator function of C, then the proximal operator \(\operatorname{prox}_{\lambda g}=P_{C}\) for all \(\lambda\in(0,\infty)\), where \(P_{C}\) is the metric projection operator from H to C.

  2. (ii)

    \(\operatorname{prox}_{\lambda g}\) is firmly nonexpansive.

  3. (iii)

    \(\operatorname{prox}_{\lambda g}=(I+\lambda\partial g)^{-1}=J_{\lambda }^{\partial g}\).

Lemma 2.6

[3]

Let \(f,g\in\Gamma _{0}(H)\). Let \(x^{*}\in H\) and \(\lambda\in(0,\infty)\). Assume that f is finite valued and Fréchet differentiable function on H with Fréchet derivative ∇f. Then \(x^{*}\) is a solution to the problem \(\arg\min_{x\in H}f(x)+g(x)\) if and only if \(x^{*}=\operatorname{prox}_{\lambda g}(I-\lambda\nabla f)x^{*}\).

Lemma 2.7

[23]

Let \(C\subset H\) be nonempty, closed, convex subset, let \(A:H\rightarrow H\), and let \(f:H\rightarrow \mathbb{R}\) be convex and Fréchet differentiable. Let A be the Fréchet derivative of f. Then \(\operatorname{VI}(C, A)=\arg\min_{x\in C}f(x)\).

Lemma 2.8

[6]

Let \(f\in \Gamma_{0}(H)\), then ∂f is maximal monotone.

Lemma 2.9

[6]

Let f and g be functions in \(\Gamma_{0}(H)\) such that one of the following holds:

  1. (i)

    \(\operatorname{dom} f\cap \operatorname{int} (\operatorname{dom} g)\neq\emptyset\);

  2. (ii)

    \(\operatorname{dom} g=H\).

Then \(\partial(f+g)=\partial f+\partial g\).

A mapping \(T_{\alpha}:H\rightarrow H\) is said to be averaged if \(T_{\alpha}=(1-\alpha)I +\alpha T\), where \(\alpha\in(0, 1)\) and \(T: H\rightarrow H \) is nonexpansive. In this case, we say that \(T_{\alpha}\) is α-averaged. Clearly, a firmly nonexpansive mapping is \(\frac{1}{2}\)-averaged.

Lemma 2.10

[24]

Let \(T:H\rightarrow H\) be a mapping. Then the following hold:

  1. (i)

    T is nonexpansive if and only if the complement \((I-T)\) is \(1/2\)-ism;

  2. (ii)

    if S is υ-ism, then for \(\gamma>0\), γS is \(\upsilon/\gamma\)-ism;

  3. (iii)

    S is averaged if and only if the complement \(I-S\) is \(\upsilon\)-ism for some \(\upsilon> 1/2\);

  4. (iv)

    if S and T are both averaged, then the product (composite) ST is averaged;

  5. (v)

    if the mappings \(\{T_{i}\}_{i=1}^{n}\) are averaged and have a common fixed point, then \(\bigcap_{i=1}^{n}\operatorname{Fix}(T_{i}) = \operatorname{Fix}(T_{1}\cdots T_{n})\).

Lemma 2.11

[6]

Let \(f:H\rightarrow (-\infty,\infty]\) be proper and convex. Suppose that f is Gâteaux differentiable at x. Then \(\partial f(x)=\{\nabla f(x)\}\).

3 Common solution of variational inequality problem, fixed point, and Ky Fan inequalities problem

For each \(i=1,2\), let \(F_{i}:C\rightarrow H\) be a \(\kappa _{i}\)-inverse-strongly monotone mapping of C into H with \(\kappa_{i}>0\). For each \(i=1,2\), let \(G_{i}\) be a maximal monotone mapping on H such that the domain of \(G_{i}\) is included in C and define the set \(G_{i}^{-1}0\) as \(G_{i}^{-1}0= \{x\in H: 0\in G_{i}x\}\). Let \(J_{\lambda}^{G_{1}}=(I+\lambda G_{1})^{-1}\) and \(J_{r}^{G_{2}}=(I+r G_{2})^{-1}\) for each \(n\in\mathbb{N}\), \(\lambda >0\) and \(r>0\). Let \(\{\theta_{n}\}\subset H\) be a sequence. Let V be a \(\bar{\gamma}\)-strongly monotone and L-Lipschitz continuous operator with \(\bar{\gamma}>0\) and \(L > 0\). Let \(T : C\rightarrow H\) be a nonexpansive mapping. Throughout this paper, we use these notations and assumptions unless specified otherwise.

In this paper, we say conditions (D) hold if the following conditions are satisfied:

  1. (i)

    \(0<\liminf_{n\rightarrow\infty}\alpha_{n}\leq\limsup_{n\rightarrow\infty}\alpha_{n}<1\);

  2. (ii)

    \(\lim_{n\rightarrow \infty}\beta_{n}=0\), and \(\sum_{n=1}^{\infty}\beta_{n}=\infty\);

  3. (iii)

    \(\lim_{n\rightarrow\infty}\theta_{n}=0\).

The following strong convergence theorem is needed in this paper.

Theorem 3.1

[17]

Suppose that \(\Omega_{1}=\operatorname{Fix}(T)\cap \operatorname{Fix}(J_{\lambda }^{G_{1}}(I-\lambda F_{1}))\cap \operatorname{Fix}(J_{r}^{G_{2}}(I-rF_{2}))\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily,

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{G_{1}}(I-\lambda F_{1})J_{r}^{G_{2}}(I-rF_{2})x_{n}, \\ s_{n}=T y_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)s_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\{\lambda, r\}\subset(0,\infty)\), \(\{ \alpha_{n}\}\subset(0,1)\), and \(\{\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold and \(0<\lambda<2\kappa_{1}\) and \(0< r<2\kappa_{2}\). Then

$$\lim_{n\rightarrow \infty}x_{n}=\bar{x} ,$$

where

$$\bar{x}=P_{\operatorname{Fix}(T)\cap \operatorname{Fix}(J_{\lambda}^{G_{1}}(I-\lambda F_{1}))\cap \operatorname{Fix}(J_{r}^{G_{2}}(I-rF_{2}))}(\bar{x}-V\bar{x}). $$

This point \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in \operatorname{Fix}(T) \cap \operatorname{Fix}\bigl(J_{\lambda}^{G_{1}}(I-\lambda F_{1})\bigr)\cap \operatorname{Fix}\bigl(J_{r}^{G_{2}}(I-rF_{2}) \bigr). $$

For each \(i=1,2\), let \(f_{i}: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying conditions (A1)-(A4). An iteration is used to find common solutions of a variational inequality problem, Ky Fan inequalities problems, and a fixed point set of a mapping:

$$ \left \{ \textstyle\begin{array}{l} \text{Find }\bar{x}\in H \text{ such that }\bar{x} \in \operatorname{Fix}(T)\cap \operatorname{KF}(C,f_{1})\cap \operatorname{KF}(C,f_{2}) \text{ and} \\ \langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in \operatorname{Fix}(T)\cap \operatorname{KF}(C,f_{1})\cap \operatorname{KF}(C,f_{2}). \end{array}\displaystyle \right . $$

Theorem 3.2

For each \(i=1,2\), let \(f_{i}: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying conditions (A1)-(A4), and let \(A_{f_{i}}\) be defined as (L4.2) in Lemma  2.4. Suppose that \(\Omega_{2}: =\operatorname{Fix}(T)\cap \operatorname{KF}(C,f_{1})\cap \operatorname{KF}(C,f_{2})\neq \emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}J_{r}^{A_{f_{2}}}x_{n}, \\ s_{n}=T y_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)s_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\{\lambda,r\}\subset(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{2}}(\bar{x}-V\bar{x})\). This point \(\bar{x} \in \Omega_{2}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Omega_{2}. $$

Proof

For each \(i=1,2\), by Lemma 2.4, we know that \(A_{f_{i}}\) is a maximal monotone operator with the domain of \(A_{f_{i}}\subset C\) and \(\operatorname{KF}(C,f_{i})=A^{-1}_{f_{i}}0\). For each \(i=1,2\), let \(F_{i}=0\), and \(G_{i}=A_{f_{i}}\) in Theorem 3.1. By Lemma 2.1(ii), we have, for each \(i=1,2\),

$$\operatorname{Fix}\bigl(J_{\lambda}^{G_{i}}(I-\lambda F_{i})\bigr)={G_{i}}^{-1}0=A^{-1}_{f_{i}}0= \operatorname{KF}(C,f_{i}). $$

This implies that \(\Omega_{1} =\Omega_{2}\). By Theorem 3.1, \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{2}}(\bar{x}-V\bar{x})\). This point \(\bar{x}\in \Omega_{2}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Omega_{2}. $$

Thus,

$$\bar{x}\in \operatorname{Fix}(T)\cap \operatorname{KF}( C,f_{1})\cap \operatorname{KF}(C,f_{2}) $$

and

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in \operatorname{Fix}(T) \cap \operatorname{KF}( C,f_{1})\cap \operatorname{KF}(C,f_{2}). $$

Therefore, the proof is completed. □

As a simple consequence of Theorem 3.2, we study the common solution of the Ky Fan inequalities problems.

Theorem 3.3

Let \(f_{1}: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying conditions (A1)-(A4) and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma  2.4. Suppose that \(\Omega_{3}: = \operatorname{KF}(C,f_{1})\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\{\lambda,r\}\subset(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{3}}(\bar{x}-V\bar{x})\). This point \(\bar{x} \in \operatorname{KF}(C,f_{1})\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in\Omega_{3}. $$

Proof

Let \(I|_{C}\) and \(i_{C}\) be the restriction of the identity mapping to C and the indicator function of C, respectively, and let \(T=I|_{C}\) and \(f_{2}=i_{C}\) in Theorem 3.2. Then Theorem 3.3 follows from Theorem 3.2. □

Theorem 3.4

Let \(\Omega_{4}: =\operatorname{Fix}(T)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} s_{n}=T P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(1-\beta _{n})s_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{4}}(\bar{x}-V\bar{x})\). This point \(\bar{x} \in \operatorname{Fix} (T)\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in\Omega_{4}. $$

Proof

For each \(i=1,2\), let \(f_{i}=i_{C}\), \(A_{f_{i}}=\partial i_{C}\) in Theorem 3.2, where \(i_{C}\) is the indicator function of C. Then \(\operatorname{KF}(C,f_{i})=C\) and \(J_{r}^{A_{f_{i}}}=P_{C}\). Therefore, Theorem 3.4 follows immediately from Theorem 3.2. □

4 Mathematical programming for the sum of two convex functions

In the following theorem, an iteration is used to find the solution of the optimization problem for the sum of two convex functions:

$$ \mathop{\arg\min}_{y\in C}(g_{2}+h_{2}) (y). $$

Theorem 4.1

Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H, and \(h_{1}:H\rightarrow (-\infty,\infty]\) be a proper convex lower semicontinuous function. Let \(f_{1}(x,y)=\langle y-x,\nabla g_{1}x\rangle+h_{1}(y)-h_{1}(x)\) for all \(x,y\in H\) and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma  2.4. Take \(\mu\in \mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Pi_{1} :=H-\operatorname{VI}(C,\nabla g_{1},h_{1})\neq\emptyset\), where

$$H-\operatorname{VI}(C,\nabla g_{1},h_{1})=\bigl\{ x\in C: \langle y-x,\nabla g_{1}x\rangle+h_{1}(y)-h_{1}(x) \geq0 \textit{ for all } y\in C\bigr\} . $$

A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\lambda\in(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Pi_{1}. $$

Proof

Since \(\nabla g_{1}\) is the Fréchet derivative of the convex function \(g_{1}\), it follows from Corollary 17.33 of [6] that \(\nabla g_{1}\) is continuous on C. By Proposition 17.10 of [6], \(\nabla g_{1}\) is monotone on C. Hence, \(\nabla g_{1}\) is bounded on any line segment of C. By Proposition 17.2 of [6],

$$ \bigl\langle y-x,\nabla g_{1}(x)\bigr\rangle \leq g_{1}(y)-g_{1}(x)\quad \text{for all } x,y\in C. $$
(4.1)

Since \(h_{1}:C\rightarrow \mathbb{R}\) is a proper convex lower semicontinuous function, it is easy to see that for each \(x,y,z\in C\),

$$\begin{aligned}& \limsup_{t\downarrow0}f_{1}\bigl(tz+(1-t)x,y\bigr) \\& \quad \leq \limsup_{t\downarrow0}\bigl\langle y-\bigl( tz+(1-t)x\bigr), \nabla g_{1}\bigl(tz+(1-t)x\bigr)\bigr\rangle + \limsup _{t\downarrow0}\bigl(h_{1}(y)-h_{1}\bigl(tz+(1-t)x \bigr)\bigr) \\& \quad = \limsup_{t\downarrow0} \bigl(\bigl\langle y-tz-(1-t)x-(y-x), \nabla g_{1}\bigl(tz+(1-t)x\bigr)\bigr\rangle +\bigl\langle (y-x),\nabla g_{1}\bigl(tz+(1-t)x\bigr)\bigr\rangle \bigr) \\& \qquad {}+ \limsup_{t\downarrow 0}\bigl(h_{1}(y)-h_{1} \bigl(tz+(1-t)x\bigr)\bigr) \\& \quad \leq \langle y-x,\nabla g_{1}x\rangle+h_{1}(y)-h_{1}(x) \\& \quad = f_{1}(x,y). \end{aligned}$$

This shows that condition (A3) is satisfied. It is easy to see that \(f_{1}\) also satisfies conditions (A1), (A2), and (A4). We see \(\operatorname{KF}(C, f_{1})=H-\operatorname{VI}(C, \nabla g_{1},h_{1})\) and \(\Pi_{1}=\Omega_{3}\neq\emptyset\). By Theorem 3.3, \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{3}}(\bar{x}-V\bar{x})\), \(\bar{x}\in \operatorname{KF}(C,f_{1})\). This point \(\bar{x}\in\Omega_{3}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Omega_{3}. $$

By \(\Omega_{3}=\Pi_{1}\) and \(\bar{x}\in \operatorname{KF}(C,f_{1}) \), we have

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Pi_{1} $$

and

$$ \bigl\langle y-\bar{x},\nabla g_{1}(\bar{x})\bigr\rangle +h_{1}(y)-h_{1}(\bar{x})\geq0\quad \text{for all } y \in C. $$
(4.2)

By (4.1) and (4.2), we have

$$g_{1}(y)+h_{1}(y)-g_{1}(\bar{x})-h_{1}( \bar{x})\geq\bigl\langle y-\bar{x},\nabla g_{1}(\bar{x})\bigr\rangle +h_{1}(y)-h_{1}( \bar{x})\geq0 $$

for all \(y\in C\). Then \(\bar{x}\in\arg\min_{y\in C}(g_{1}+h_{1})(y)\). □

Example 4.1

Let \(h_{1}(x)=x^{2}\), \(g_{1}(x)=x^{2}+2x+1\), \(C=[-1,1]\), \(H=\mathbb{R}\), \(\lambda=1\), \(V=I\), \(\alpha_{n}=\frac{1}{2}\) for all \(n\in \mathbb{N}\), \(\beta_{n}=\frac{1}{1\text{,}000n}\), \(\theta_{n}=0\), and \(f_{1}(x,y)=\langle y-x,\nabla g_{1}(x)\rangle+h_{1}(y)-h_{1}(x)\). Then \(f_{1}(x,y)=(y-x)(2x+y+x+2)\), which implies that \(f_{1}(-\frac{1}{2},y)=(y+\frac{1}{2})^{2}\geq0\) for all \(y\in[-1,1]\), so \(-\frac{1}{2}\in H-\operatorname{VI}([-1,1],\nabla g_{1},h_{1})\neq\emptyset\).

We also see \(A_{f_{1}}(-1)=(-\infty,-2]\), \(A_{f_{1}}(1)=[6,\infty)\), and \(A_{f_{1}}(x)=4x+2\) if \(x\in(-1,1)\). Let \(y_{n}=J_{1}^{A_{f_{1}}}P_{C}x_{n}\), \(x_{n+1}=\frac{1}{2}x_{n}+\frac{1}{2}(1-\frac{1}{1\text{,}000n})y_{n}\).

It is easy to see that \(P_{C}x_{n}=5y_{n}+2\) and \(y_{n}=\frac{1}{5}(P_{C}x_{n}-2)\).

Hence \(x_{n+1}=\frac{1}{2}x_{n}+\frac{1}{10}(1-\frac{1}{1\text{,}000n})(P_{C}x_{n}-2)\). It is easy to see all the conditions of Theorem 4.1 are satisfied.

Let \(x_{1}=0\); then \(x_{2}=-0.1998\), \(x_{3}=-0.31977001\), \(x_{4}=-0.39177967\), \(x_{5}=-0.435008\), … , and we see \(\lim_{n\rightarrow \infty} x_{n}=\bar{x}=-\frac {1}{2}\in\arg\min_{x\in [-1,1]}(g_{1}(x)+h_{1}(x))\).
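
The computation in Example 4.1 can be reproduced with a few lines of Python (our own sketch; the stopping index is arbitrary):

```python
# Recursion of Example 4.1: x_{n+1} = x_n/2 + (1/10)(1 - 1/(1000 n))(P_C x_n - 2),
# with P_C the projection onto [-1,1] and x_1 = 0; it approaches -1/2.
P_C = lambda x: max(-1.0, min(1.0, x))

x = 0.0
for n in range(1, 200):
    x = 0.5 * x + 0.1 * (1 - 1 / (1000 * n)) * (P_C(x) - 2)
print(x)   # close to -0.5, the minimizer of g_1 + h_1 on [-1,1]
```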

Next, an iteration is used to find the solution of the following optimization problem for the convex differentiable function:

$$ \mathop{\arg\min}_{y\in C}g_{1}(y) . $$

Corollary 4.1

Let \(g_{1}:H\rightarrow \mathbb{R}\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\). Let \(f_{1}(x,y)=\langle y-x,\nabla g_{1}x \rangle\) for all \(x,y\in H\) and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma  2.4. Suppose that \(\Pi_{1,1} :=\arg\min_{y\in C}g_{1}(y)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\lambda\in(0,\infty)\) and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{1,1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in\Pi_{1,1}. $$

Proof

By Lemma 2.7, we know that \(\operatorname{VI}(C,\nabla g_{1})=\arg\min_{y\in C}g_{1}(y)\). Therefore, Corollary 4.1 follows immediately from Theorem 4.1 by letting \(h_{1}=0\). □

Next, another iteration is used to find the solution of the following optimization problem for a convex function:

$$ \mathop{\arg\min}_{z\in C}h_{1}(z). $$

Corollary 4.2

Let \(f_{1}(x,y)=h_{1}(y)-h_{1}(x)\) for all \(x,y\in C\), and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma  2.4. Suppose that \(\Pi_{1,2} :=\arg\min_{y\in C}h_{1}(y)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\lambda\in(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{1,2}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in\Pi_{1,2}. $$

Proof

Put \(g_{1}=0\) in Theorem 4.1. Then Corollary 4.2 follows from Theorem 4.1. □

In the following theorem, an iteration is used to find the solution of the following optimization problem for the sum of two convex functions:

$$ \mathop{\arg\min}_{y\in H}g_{1}(y)+h_{1}(y) . $$

Theorem 4.2

Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H, and let \(h_{1}:H\rightarrow (-\infty,\infty]\) be a proper convex lower semicontinuous function. Suppose that \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\) and \(\Pi_{2} :=\arg\min_{x\in H} (g_{1}+h_{1})(x)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in H \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=\operatorname{prox}_{r h_{1}}(I-r\nabla g_{1}) x_{n}=J_{r}^{\partial {h_{1}}}(I-r\nabla g_{1}) x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(r\in(0,\frac{2}{L_{1}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{2}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in\Pi_{2}. $$

We give two different proofs for this theorem.

Proof I

Put \(C=H\), \(G_{1}=\partial h_{1}\), \(F_{1}=\nabla g_{1}\), \(T=I|_{C}\), \(F_{2}=0\), \(G_{2}=\partial i_{H}\) in Theorem 3.1, where \(I|_{C}\) is the restriction of I to C and \(i_{C}\) is the indicator function of C. Since \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\), it follows from Corollary 10 of [25] that \(\nabla g_{1} \) is \(\frac{1}{L_{1}}\)-inverse-strongly monotone. Since \(h_{1}\) is a proper convex lower semicontinuous function, it follows from Lemma 2.8 that \(\partial h_{1}\) is a set valued maximal monotone mapping. By Lemma 2.11, \(\partial g_{1}=\{\nabla g_{1}\}\). It follows from \(\operatorname{dom}(h_{1})\cap \operatorname{int} (\operatorname{dom}(g_{1}))\neq\emptyset\), \(\operatorname{dom}(g_{1})=H\), and Lemma 2.9 that

$$\partial(h_{1}+g_{1}) (x)=\partial h_{1}(x)+ \partial g_{1}(x)=\partial h_{1}(x)+\nabla g_{1}(x). $$

Hence,

$$\begin{aligned} \Pi_{2} = &\mathop{\arg\min}_{y\in H}(g_{1}+h_{1}) (y)=\bigl\{ x\in H: x\in (\partial h_{1}+\nabla g_{1})^{-1}0 \bigr\} \\ = &\operatorname{Fix}(I|_{C})\cap \operatorname{Fix} \bigl(J_{1}^{\partial_{h_{1}}}(I-\nabla g_{1})\bigr)\cap \operatorname{Fix}(P_{C}) \\ = &\operatorname{Fix}(T)\cap \operatorname{Fix}\bigl(J_{1}^{\partial_{h_{1}}}(I- \nabla g_{1})\bigr)\cap \operatorname{Fix}\bigl(J_{1}^{\partial i_{H}} \bigr) \\ = &\Omega_{1}. \end{aligned}$$

Therefore, we get the conclusion of Theorem 4.2 from Theorem 3.1. □

Proof II

Let \(C=H\), \(T=\operatorname{prox}_{r h_{1}}(I-r \nabla g_{1})\) in Theorem 3.4. Since \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\), it follows from Corollary 10 of [25] that \(\nabla g_{1}\) is \(\frac{1}{L_{1}}\)-inverse-strongly monotone. By Lemma 2.10, \(r\nabla g_{1}\) is \(\frac{1}{rL_{1}}\)-ism and \((I-r \nabla g_{1})\) is averaged. Since \(\partial h_{1}\) is maximal monotone, it follows from Lemma 2.5 that \(\operatorname{prox}_{rh_{1}}=J_{r}^{\partial h_{1}}\) is firmly nonexpansive. Hence \(\operatorname{prox}_{rh_{1}}\) is \(\frac{1}{2}\)-averaged. Then by Lemma 2.10, T is averaged and nonexpansive. We have

$$\begin{aligned} \operatorname{Fix} (T) = &\operatorname{Fix} \bigl(J_{r}^{\partial h_{1}}(I-r \nabla g_{1})P_{H}\bigr) \\ = &\operatorname{Fix} \bigl(\operatorname{prox}_{r h_{1}}(I-r \nabla g_{1})\bigr)\cap \operatorname{Fix}(P_{H}) \\ = &\mathop{\arg\min}_{x\in H} g_{1}(x)+h_{1}(x)\cap H \\ = &\mathop{\arg\min}_{x\in C} g_{1}(x)+h_{1}(x). \end{aligned}$$

Hence, \(\Pi_{2}=\Omega_{4}\), and we get the conclusion of Theorem 4.2 from Theorem 3.4. □

Remark 4.1

  1. (a)

    The iterations in Theorems 4.1 and 4.2 are different.

  2. (b)

    Theorem 1.1, Theorem 1.2, Theorem 1.3, and the results given by Combettes and Wajs [2], and Xu [3] are weak convergence theorems for the problem:

    $$ \mathop{\arg\min}_{x\in H} g_{1}(x)+h_{1}(x). $$

Further, Theorem 1.1, Theorem 1.2, and Theorem 1.3 give strong convergence theorems for this problem under the uniform convexity assumption on \(h_{1}\) or \(g_{1}\). Therefore, Theorem 4.2 is different from these results. Besides, Theorem 4.2 is also different from the result given by Wang and Xu [4] and related algorithms in the literature.

Example 4.2

Let \(h_{1}(x)=x^{2}\), \(g_{1}(x)=x^{2}+2x+1\), \(H=\mathbb{R}\), \(r=1\), \(V=I\), \(\alpha_{n}=\frac{1}{2}\) for all \(n\in \mathbb{N}\), \(\beta_{n}=\frac{1}{1\text{,}000n}\), and \(\theta_{n}=0\) for all \(n\in\mathbb{N}\). Then \(\partial h_{1}(x)=\{2x\}\) and \(\nabla g_{1}(x)=2x+2\). We see that \(-\frac{1}{2}\in\arg\min_{x\in\mathbb{R}}(g_{1}(x)+h_{1}(x))\neq\emptyset\); let

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{1}^{\partial h_{1}}(I- \nabla g_{1})x_{n}=(I+\partial h_{1})^{-1}(I-\nabla g_{1})x_{n}, \\ x_{n+1}=\frac{1}{2}x_{n}+\frac{1}{2}(1-\frac{1}{1\text{,}000n})y_{n} \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\).

We see all conditions of Theorem 4.2 are satisfied.

We obtain \((I-\nabla g_{1})x_{n}=(I+\partial h_{1})y_{n}=y_{n}+2y_{n}=3y_{n}=x_{n}-2x_{n}-2\).

From this we obtain \(y_{n}=\frac{-(x_{n}+2)}{3}\) and

$$\begin{aligned} x_{n+1} =&\frac{1}{2}x_{n}+\frac{1}{2}\biggl(1- \frac{1}{1\text{,}000n}\biggr)y_{n} \\ =&\frac{1}{2}x_{n}- \frac{1}{2}\biggl(1-\frac{1}{1\text{,}000n}\biggr)\frac{(x_{n}+2)}{3} \\ =& \frac{1}{2}x_{n}-\biggl(1-\frac{1}{1\text{,}000n}\biggr) \frac{(x_{n}+2)}{6}. \end{aligned}$$

Let \(x_{1}=1\); then \(x_{2}=0.0005\), \(x_{3}=-0.33075\), \(x_{4}=-0.4434906\), \(x_{5}=-0.4810987\), \(x_{6}=-0.493649\), \(x_{7}=-0.4978412\), \(x_{8}=-0.4992446\), \(x_{9}=-0.4997169\), \(x_{10}=-0.49987785\), \(x_{11}=-0.49993425\), \(x_{12}=-0.4999553\), \(x_{13}=-0.4999642\), … . From the relation

$$x_{n+1} =\frac{1}{2}x_{n}-\biggl(1-\frac{1}{1\text{,}000n} \biggr)\frac{x_{n}+2}{6} $$

and \(x_{1}=1\), it is easy to see that the sequence \(\{x_{n}\}_{n\in\mathbb{N}}\) is nonincreasing for all \(n\geq m\) for some \(m\in\mathbb{N}\), and bounded. Hence \(\lim_{n\rightarrow \infty}x_{n}\) exists. Let \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\). From the relation

$$x_{n+1} =\frac{1}{2}x_{n}-\biggl(1-\frac{1}{1\text{,}000n} \biggr)\frac{(x_{n}+2)}{6}, $$

we see that \(\bar{x}=\frac{1}{2}\bar{x}-\frac{\bar{x}+2}{6}\). Therefore \(\bar{x}=-\frac{1}{2}\in\arg\min_{x\in H}g_{1}(x)+h_{1}(x)\).
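Again, the recursion of Example 4.2 is easy to reproduce numerically (our own sketch; the stopping index is arbitrary):

```python
# Recursion of Example 4.2: x_{n+1} = x_n/2 - (1 - 1/(1000 n))(x_n + 2)/6, x_1 = 1.
x = 1.0
for n in range(1, 100):
    x = 0.5 * x - (1 - 1 / (1000 * n)) * (x + 2) / 6
print(x)   # close to -0.5, the minimizer of g_1 + h_1 on R
```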

In the following corollary, an iteration is used to find the solution of the following optimization problem:

$$ \mbox{Find } \bar{x}\in\mathop{\arg\min}_{y\in H}h_{1}(y) . $$

Corollary 4.3

Let \(h_{1}:H\rightarrow (-\infty ,\infty]\) be a proper convex lower semicontinuous function. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Pi_{2,1} := \arg\min_{y\in H}h_{1}(y)\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in H \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\alpha}^{\partial h_{1}}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\infty)\) and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{2,1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0, \quad \forall q\in\Pi_{2,1}. $$

Remark 4.2

In 1976, Rockafellar [15] proved the following in the Hilbert space setting: If \(h_{1}\) is a proper convex lower semicontinuous function on H, the solution set \(\arg\min_{y\in H}h_{1}(y)\) is nonempty and \(\liminf_{n\rightarrow \infty}\beta_{n}>0\). Let

$$ x_{n+1}=\mathop{\arg\min}_{y\in H}\biggl\{ h_{1}(y)+\frac{1}{2\beta_{n}}\|y-x_{n}\|^{2}\biggr\} =\operatorname{prox}_{\beta _{n}h_{1}}x_{n}=J_{\beta_{n}}^{\partial h_{1}}x_{n}, \quad n\in\mathbb{N}, $$
(4.3)

then \(\{x_{n}\}\) converges weakly to a minimizer of \(h_{1}\). We see that Corollary 4.3 gives a different iteration which converges strongly to the solution of the following problem: Find \(\bar{x}\in \arg\min_{y\in H}h_{1}(y)\).
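
For instance, the following Python sketch (ours) runs the modified proximal point iteration of Corollary 4.3 for \(h_{1}(x)=|x-3|\) on \(\mathbb{R}\), taking \(V=I\), \(\theta_{n}=0\), \(\alpha_{n}=\frac{1}{2}\), \(\beta_{n}=\frac{1}{n+1}\), and resolvent parameter \(\alpha=1\); all of these concrete choices are ours.

```python
# Modified proximal point iteration of Corollary 4.3 for h_1(x) = |x - 3|:
#   y_n = J_alpha^{dh_1}(x_n),  x_{n+1} = a_n*x_n + (1-a_n)*(1-b_n)*y_n.
def resolvent(x, lam, c=3.0):
    """J_lam^{dh_1} for h_1 = |. - c|: soft-thresholding around c."""
    d = x - c
    return c + max(abs(d) - lam, 0.0) * (1.0 if d >= 0 else -1.0)

x = 10.0
for n in range(1, 500):
    y = resolvent(x, 1.0)
    x = 0.5 * x + 0.5 * (1 - 1 / (n + 1)) * y
print(x)   # close to 3 = P_{argmin h_1}(0), the limit predicted by Corollary 4.3
```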

Next, a modified gradient-projection algorithm is used to find the solution of the following mathematical program:

$$ \mbox{Find } \bar{x}\in\mathop{\arg\min}_{y\in C}g_{1}(y) . $$

Theorem 4.3

Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\) and \(\Pi_{3} := \arg\min_{y\in C}g_{1}(y)\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=P_{C}(I-\alpha\nabla g_{1})x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{L_{1}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{3}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Pi_{3}. $$

Proof

Let \(h_{1}=i_{C}\), where \(i_{C}\) denotes the indicator function of C. From Lemma 2.5, \(\operatorname{prox}_{\lambda h_{1}}=P_{C}\), and

$$\mathop{\arg\min}_{x\in H}\bigl(h_{1}(x)+g_{1}(x) \bigr)=\mathop{\arg\min}_{x\in C} g_{1}(x), $$

Theorem 4.3 follows immediately from Theorem 4.2. □
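
To illustrate the recursion of Theorem 4.3, the following Python sketch (ours) minimizes \(g_{1}(x)=\frac{1}{2}\|x-c\|^{2}\) over the box \(C=[0,1]^{2}\), taking \(V=I\), \(\theta_{n}=0\), \(\alpha_{n}=\frac{1}{2}\), \(\beta_{n}=\frac{1}{n+1}\), and \(\alpha=1\) (here \(\nabla g_{1}\) is 1-Lipschitz, so \(\alpha\in(0,2)\)); all concrete choices are ours.

```python
import numpy as np

# Modified gradient-projection recursion of Theorem 4.3:
#   y_n = P_C(x_n - alpha*grad g_1(x_n)),  x_{n+1} = a_n*x_n + (1-a_n)*(1-b_n)*y_n.
c = np.array([2.0, -0.5])
grad_g1 = lambda x: x - c
P_C = lambda x: np.clip(x, 0.0, 1.0)

x = np.array([0.5, 0.5])
for n in range(1, 500):
    y = P_C(x - 1.0 * grad_g1(x))
    x = 0.5 * x + 0.5 * (1 - 1 / (n + 1)) * y
print(x)   # close to P_C(c) = [1, 0], the minimizer of g_1 over C
```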

Remark 4.3

We know an iteration, defined by

$$x_{n+1}=P_{C}(I-\alpha_{n} \nabla g_{1})x_{n},\quad n\in\mathbb{N}, $$

is called a gradient-projection algorithm, where \(\nabla g_{1}\) is Lipschitz continuous. In 2011, Xu [16] used the gradient-projection algorithm and the relaxed gradient-projection algorithm and studied the problem

$$ \mbox{Find }\bar{x}\in\mathop{\arg\min}_{y\in C}g_{1}(y) , $$

and gave weak convergence theorems. Xu also used the viscosity nature of the gradient-projection algorithms and regularized algorithm to study strong convergence theorems for this problem [16]. In Theorem 4.3, we establish a strong convergence theorem for this problem by a different modified gradient-projection algorithm and a different approach.

In the end of this section, an iteration is used to find the solution of the following optimization problem:

$$ \mbox{Find } \bar{x}\in\arg\min_{y\in H}g_{1}(y) . $$

Corollary 4.4

Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H. Take \(\mu\in \mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\) and \(\Pi_{3,1} := \arg\min_{y\in H}g_{1}(y)\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in H\) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=(I-\alpha\nabla g_{1})x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{L_{1}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{3,1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x},q-\bar{x}\rangle\geq0,\quad \forall q\in\Pi_{3,1}. $$

Proof

Let \(h_{1}=i_{H}\). By Lemma 2.5, we know that \(\operatorname{prox}_{\lambda h_{1}}=J_{\lambda}^{\partial i_{H}}= P_{H}=I\), and

$$\mathop{\arg\min}_{x\in H} \bigl(h_{1}(x)+g_{1}(x) \bigr)=\mathop{\arg\min}_{x\in H} g_{1}(x). $$

Hence, Corollary 4.4 follows immediately from Theorem 4.3. □

5 Split feasibility problems and lasso problems

In the following theorem, a modified Byrne CQ iteration is used to find the solution of the following split feasibility problem: Find \(\bar{x}\in C\) such that \(A\bar{x}\in Q\).

Theorem 5.1

Let \(A:H\rightarrow H_{1}\) be a bounded linear operator, \(A^{*}\) be the adjoint of A. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Delta_{1}:=\{x\in C: Ax\in Q\}\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=P_{C}(I-\alpha A^{*}(A-P_{Q}A))x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x}-\theta,q-\bar{x}\rangle\geq0,\quad \forall q\in \Delta_{1}. $$

Proof

Let \(g_{1}(x)=\frac{\|Ax-P_{Q}Ax\|^{2}_{2}}{2}\). It is easy to see that \(g_{1}(x)=\min_{y\in Q}\frac{1}{2}\|Ax-y\|^{2}\) is a convex function. Then for any \(v\in H_{1}\), we have

$$\begin{aligned}& 2g_{1}(x+v)-2g_{1}(x)- 2\bigl\langle Av,(I-P_{Q})Ax\bigr\rangle \\& \quad = \bigl\langle (I-P_{Q})A(x+v),(I-P_{Q})A(x+v) \bigr\rangle -\bigl\langle (I-P_{Q})Ax,(I-P_{Q})Ax\bigr\rangle \\& \qquad {}-2\bigl\langle Av,(I-P_{Q})Ax\bigr\rangle \\& \quad = \bigl\langle Ax+Av-P_{Q}Ax+\bigl(P_{Q}Ax-P_{Q}A(x+v) \bigr),Ax+Av-P_{Q}Ax \\& \qquad {} +\bigl(P_{Q}Ax-P_{Q}A(x+v)\bigr)\bigr\rangle - \bigl\langle (I-P_{Q})Ax,(I-P_{Q})Ax\bigr\rangle -2\bigl\langle Av,(I-P_{Q})Ax\bigr\rangle \\& \quad = \langle Av,Av\rangle+\bigl\langle P_{Q}Ax-P_{Q}A(x+v),P_{Q}Ax-P_{Q}A(x+v) \bigr\rangle \\& \qquad {} +2\bigl\langle (I-P_{Q})Ax,P_{Q}Ax-P_{Q}A(x+v) \bigr\rangle +2\bigl\langle Av,P_{Q}Ax-P_{Q}A(x+v)\bigr\rangle . \end{aligned}$$
(5.1)

\(P_{Q}\) is a self-adjoint operator and \(P_{Q}^{2}=P_{Q}\), therefore, we have

$$\begin{aligned}& \bigl\vert \bigl\langle (I-P_{Q})Ax,P_{Q}Ax-P_{Q}A(x+v) \bigr\rangle \bigr\vert \\& \quad = \bigl\vert \bigl\langle (I-P_{Q})Ax,P_{Q}Ax \bigr\rangle -\bigl\langle (I-P_{Q})Ax,P_{Q}A(x+v)\bigr\rangle \bigr\vert \\& \quad = \bigl\langle P_{Q}(I-P_{Q})Ax,Ax\bigr\rangle - \bigl\langle P_{Q}(I-P_{Q})Ax,A(x+v)\bigr\rangle =0. \end{aligned}$$
(5.2)

Since \(P_{Q}:H\rightarrow Q\) is a nonexpansive mapping, we have

$$\begin{aligned}& \bigl\vert \bigl\langle P_{Q}Ax-P_{Q}A(x+v),P_{Q}Ax-P_{Q}A(x+v) \bigr\rangle \bigr\vert \\& \quad = \bigl\Vert P_{Q}Ax-P_{Q}A(x+v) \bigr\Vert ^{2}_{2} \\& \quad \leq \bigl\Vert Ax-(Ax+Av)\bigr\Vert ^{2}_{2} \\& \quad = \|Av\|^{2}_{2} \\& \quad \leq \|A\|^{2}_{2}\cdot\|v\|^{2}_{2} \end{aligned}$$
(5.3)

and

$$\begin{aligned}& 2\bigl\vert \bigl\langle Av,P_{Q}Ax-P_{Q}A(x+v) \bigr\rangle \bigr\vert \\& \quad \leq 2\bigl\vert \|Av\|_{2}\bigl\Vert P_{Q}Ax-P_{Q}A(x+v)\bigr\Vert _{2}\bigr\vert \\& \quad \leq 2\|Av\|_{2}\bigl\Vert Ax-A(x+v)\bigr\Vert _{2} \\& \quad \leq 2\|Av\|_{2}\|Av\|_{2} \\& \quad \leq 2\|A\|^{2}_{2}\cdot\|v\|^{2}_{2}. \end{aligned}$$
(5.4)

We also see that

$$ \langle Av,Av\rangle\leq \|A\|^{2}_{2}\cdot\|v \|^{2}_{2}. $$
(5.5)

By (5.1), (5.2), (5.3), (5.4), and (5.5), we have

$$\begin{aligned} \begin{aligned}[b] &\lim_{v\rightarrow 0}\frac{ |g_{1}(x+v)-g_{1}(x)-\langle v,A^{*}(I-P_{Q})Ax\rangle|}{\|v\|_{2}} \\ &\quad = \lim_{v\rightarrow 0}\frac{ |2g_{1}(x+v)-2g_{1}(x)-2\langle v,A^{*}(I-P_{Q})Ax\rangle|}{\|2v\| _{2}} \\ &\quad \leq \lim_{v\rightarrow 0}2\|A\|^{2}_{2}\|v \|_{2} \\ &\quad = 0. \end{aligned} \end{aligned}$$
(5.6)

This shows that \(g_{1}\) is Fréchet differentiable with Fréchet derivative \(\nabla g_{1}=A^{*}(A-P_{Q}A)\). Since A is a bounded operator and \(P_{Q}\) is a firmly nonexpansive mapping,

$$\begin{aligned} \bigl\Vert \nabla g_{1}(x)-\nabla g_{1}(y)\bigr\Vert =&\bigl\Vert A^{*}(A-P_{Q}A)x-A^{*}(A-P_{Q}A)y \bigr\Vert \\ \leq &\|A\|\bigl\Vert (I-P_{Q})Ax-(I-P_{Q})Ay\bigr\Vert \\ \leq &\|A\|\|Ax-Ay\| \\ \leq &\|A\|^{2}\cdot\|x-y\|. \end{aligned}$$
(5.7)

This shows that \(\nabla g_{1}\) is a Lipschitz function with Lipschitz constant \(\|A\|^{2}\). By the assumption \(\Delta_{1}\neq \emptyset\), we know that \(\Delta_{1}=\Pi_{3}\). Then we get the conclusion of Theorem 5.1 from Theorem 4.3. □
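
The derivative formula \(\nabla g_{1}=A^{*}(A-P_{Q}A)\) obtained above can also be checked numerically on a toy instance. In the sketch below, Q is taken to be a unit box so that \(P_{Q}\) is a simple clipping; the matrix, the set Q, and the perturbation size are assumptions made only for this check.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
proj_Q = lambda z: np.clip(z, -1.0, 1.0)          # toy Q: the unit box (assumption)

def g1(x):
    # g_1(x) = 0.5 * ||Ax - P_Q(Ax)||^2
    r = A @ x - proj_Q(A @ x)
    return 0.5 * float(r @ r)

x = rng.standard_normal(6)
grad = A.T @ (A @ x - proj_Q(A @ x))              # claimed Frechet derivative at x

# Compare with a central finite-difference approximation of the gradient.
eps = 1e-6
fd = np.array([(g1(x + eps * e) - g1(x - eps * e)) / (2 * eps) for e in np.eye(6)])
print(np.max(np.abs(grad - fd)))                  # expected to be small, of order eps
```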

Remark 5.1

In 2010, Xu [14] used various algorithms to establish weak convergence theorems for the split feasibility problem in infinite dimensional Hilbert spaces (see Theorems 3.1, 3.3, 3.4, 4.1, and 5.7 of [14]). Xu [14] also established a strong convergence theorem for this problem in an infinite dimensional Hilbert space (see Theorem 5.5 of [14]). Theorem 5.1 gives an algorithm that converges strongly to a solution of the split feasibility problem, and this result improves Byrne’s CQ algorithm [9] and the results given by Xu [14].

By Theorem 5.1, we get the following result for the split feasibility problem.

Corollary 5.1

Let \(A:H\rightarrow H_{1}\) be a bounded linear operator, \(A^{*}\) be the adjoint of A. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<2\). Suppose that \(\Delta_{1} :=\{x\in C:Ax\in Q\} \neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=P_{C}(I-\alpha A^{*}(A-P_{Q}A))x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})((1-\beta_{n})y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{1}}0\).

Proof

Let \(\theta_{n}=\theta=0\) for all \(n\in\mathbb{N}\) and \(V=I\) in Theorem 5.1; then Corollary 5.1 follows from Theorem 5.1. □
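
In terms of the sketch given after the statement of Theorem 5.1, Corollary 5.1 amounts to choosing \(\theta_{n}=\theta=0\) and \(V=I\), so that the second step reduces to \(x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(1-\beta_{n})y_{n}\). A toy call (with assumed choices of C, Q, and A) might look as follows.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
x_bar = cq_iteration(A,
                     proj_C=lambda z: np.maximum(z, 0.0),     # toy C: nonnegative orthant
                     proj_Q=lambda z: np.clip(z, -1.0, 1.0),  # toy Q: unit box
                     V=lambda z: z,                           # V = I
                     theta=np.zeros(3),                       # theta = 0
                     x1=rng.standard_normal(3))
```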

Next, an iteration is used to find the solution of the following lasso problem:

$$ \mbox{Find } \bar{x}\in\mathop{\arg\min}_{x\in\mathbb{R}^{n}}\biggl(\frac{1}{2} \|Ax-b\|^{2}_{2}+\gamma\|x\|_{1}\biggr) . $$

Theorem 5.2

Let A be an \(m\times n\) real matrix, \(x\in\mathbb{R}^{n}\), \(b\in \mathbb{R}^{m}\), and \(\gamma\geq0\) be a regularization parameter. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Delta_{2} := \arg\min_{x\in\mathbb{R}^{n}}(\frac{1}{2}\|Ax-b\|^{2}_{2}+\gamma\|x\|_{1})\neq\emptyset\). A sequence \(\{x_{n}\}\subset\mathbb{R}^{n}\) is defined as follows: \(x_{1}\in \mathbb{R}^{n} \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\alpha}^{\partial\gamma\|\cdot\|_{1} }(x_{n}-\alpha A^{*}(Ax_{n}-b)), \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{2}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x}-\theta,q-\bar{x}\rangle\geq0,\quad \forall q\in \Delta_{2}. $$
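
Since \(J_{\alpha}^{\partial\gamma\|\cdot\|_{1}}\) is the proximal map of \(\gamma\|\cdot\|_{1}\) with parameter α, that is, componentwise soft-thresholding at level \(\alpha\gamma\), the scheme above admits the following finite-dimensional sketch. As before, V, \(\theta_{n}=\theta\), and the sequences \(\alpha_{n},\beta_{n}\) are illustrative placeholders only, and conditions (D) are not verified here.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal map of t * ||.||_1, i.e. componentwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_iteration(A, b, gamma, x1, n_iter=1000, V=None, theta=None):
    """Sketch of the scheme of Theorem 5.2 (placeholder parameter choices)."""
    V = (lambda z: z) if V is None else V
    x = np.asarray(x1, dtype=float)
    theta = np.zeros_like(x) if theta is None else theta
    alpha = 1.0 / np.linalg.norm(A, 2) ** 2          # alpha in (0, 2 / ||A||^2)
    for n in range(1, n_iter + 1):
        alpha_n, beta_n = 0.5, 1.0 / (n + 1)         # illustrative choices in (0, 1)
        y = soft_threshold(x - alpha * (A.T @ (A @ x - b)), alpha * gamma)   # y_n
        x = alpha_n * x + (1 - alpha_n) * (beta_n * theta + (y - beta_n * V(y)))
    return x
```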

Proof

Let \(g_{1}(x)=\frac{1}{2}\|Ax-b\|^{2}_{2}\), and \(h_{1}(x)=\gamma\|x\|_{1}\). For each \(v\in\mathbb{R}^{n}\),

$$\begin{aligned}& \lim_{v\rightarrow 0}\frac{ |g_{1}(x+v)-g_{1}(x)-\langle v,A^{*}(Ax-b)\rangle|}{\|v\|_{2}} \\& \quad = \lim_{v\rightarrow 0}\frac{|\langle A(x+v)-b,A(x+v)-b\rangle-\langle Ax-b,Ax-b\rangle-2\langle v,A^{*}(Ax-b)\rangle|}{2\|v\|_{2}} \\& \quad = 0. \end{aligned}$$
(5.8)

Then \(\nabla g_{1}(x)=A^{*}(Ax-b)\), and \(h_{1}\) is a proper convex and lower semicontinuous function on \(\mathbb{R}^{n}\). Hence,

$$\bigl\Vert \nabla g_{1}(x)-\nabla g_{1}(y)\bigr\Vert = \bigl\Vert A^{*}(Ax-b)-A^{*}(Ay-b)\bigr\Vert _{2}\leq \|A\|^{2}\|x-y\| $$

and \(\nabla g_{1}\) is a Lipschitz function with Lipschitz constant \(\| A\|^{2}\). Therefore, Theorem 5.2 follows from Theorem 4.2. □

The following is a special case of Theorem 5.2.

Corollary 5.2

Let A be an \(m\times n\) real matrix, \(x\in\mathbb{R}^{n}\), \(b\in\mathbb{R}^{m}\), and \(\gamma\geq0\) be a regularization parameter. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<2\). Suppose that \(\Delta_{2} := \arg\min_{x\in\mathbb{R}^{n}}(\frac{1}{2}\|Ax-b\|^{2}_{2}+\gamma\|x\|_{1})\neq\emptyset\). A sequence \(\{x_{n}\}\subset\mathbb{R}^{n}\) is defined as follows: \(x_{1}\in\mathbb{R}^{n}\) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\alpha}^{\partial\gamma\|\cdot\|_{1} }(x_{n}-\alpha A^{*}(Ax_{n}-b)), \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})((1-\beta_{n})y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{2}}(0)\).

Proof

Let \(\theta_{n}=\theta=0\) for all \(n\in\mathbb{N}\) and \(V=I\) in Theorem 5.2; then Corollary 5.2 follows from Theorem 5.2. □
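
With the sketch given after Theorem 5.2, Corollary 5.2 corresponds to calling it with the defaults \(V=I\) and \(\theta=0\). The toy data below (a small sparse recovery instance) are assumptions made only to illustrate the call.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[:3] = [1.0, -2.0, 0.5]                      # sparse ground truth (assumption)
b = A @ x_true
x_bar = lasso_iteration(A, b, gamma=0.1, x1=np.zeros(50))   # V = I, theta = 0 by default
```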

Applying Corollary 4.1, an iteration is used to find the solution of the split feasibility problem: find \(\bar{x}\in C\) such that \(A\bar{x}\in Q\).

Theorem 5.3

Let \(A:H\rightarrow H_{1}\) be a bounded linear operator, \(A^{*}\) be the adjoint of A. Let \(A_{f_{1}}\) be defined as (L4.2) in Lemma  2.4. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Pi_{13}:=\{x\in C: Ax\in Q\}\neq\emptyset\). Let \(f_{1}(x,y)=\langle y-x, A^{*}(A-P_{Q}A)x\rangle\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and

$$ \left \{ \textstyle\begin{array}{l} y_{n}=J_{\lambda}^{A_{f_{1}}}P_{C}x_{n}, \\ x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(\beta_{n}\theta_{n}+(I-\beta _{n}V)y_{n}) \end{array}\displaystyle \right . $$

for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{13}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:

$$\langle V\bar{x}-\theta,q-\bar{x}\rangle\geq0,\quad \forall q\in \Pi_{13}. $$

Proof

Let \(g_{1}(x)=\frac{\|Ax-P_{Q}Ax\|^{2}_{2}}{2}\). As in the proof of Theorem 5.1, \(g_{1}\) is Fréchet differentiable with Fréchet derivative \(\nabla g_{1}=A^{*}(A-P_{Q}A)\), and \(\nabla g_{1}\) is a Lipschitz function with Lipschitz constant \(\|A\|^{2}\). Applying Corollary 4.1 and following the same argument as in the proof of Theorem 5.1, we can prove Theorem 5.3. □

6 Image deblurring problem

This section focuses on the image deblurring problem, which has received considerable attention in recent years. Many novel algorithms have been proposed for this problem based on different deblurring models; for example, see [26]. Here, we consider the image deblurring problem by means of Corollary 5.2.

All pixels of the original images described in the examples were first scaled into the range between 0 and 1.

The image was degraded by a Gaussian blur of size \(9\times9\) with standard deviation 4 (applied by the MATLAB functions imfilter and fspecial), followed by additive zero-mean white Gaussian noise with standard deviation \(10^{-3}\). The original, blurred, and deblurred images are given in Figures 1, 2, and 3, respectively.
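
The degradation model described above can be reproduced outside MATLAB as well. The sketch below (in Python, as an assumed substitute for fspecial and imfilter) builds a \(9\times9\) Gaussian kernel with standard deviation 4, blurs a \([0,1]\)-scaled image, and adds zero-mean Gaussian noise with standard deviation \(10^{-3}\); the boundary handling and the random seed are assumptions. The observed image b, with the blur viewed as a linear operator A acting on the vectorized image, can then be fed to the iteration of Corollary 5.2.

```python
import numpy as np
from scipy import ndimage

def gaussian_kernel(size=9, sigma=4.0):
    """9x9 Gaussian kernel with standard deviation 4 (analogous to fspecial('gaussian', 9, 4))."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def degrade(image, noise_std=1e-3, seed=0):
    """Blur a [0, 1]-scaled image and add zero-mean white Gaussian noise."""
    blurred = ndimage.convolve(image, gaussian_kernel(), mode='constant', cval=0.0)
    rng = np.random.default_rng(seed)
    return blurred + noise_std * rng.standard_normal(image.shape)
```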

Figure 1. The original image.

Figure 2. The blurred image.

Figure 3. The deblurred image.

Remark 6.1

There are many fast algorithms for the image deblurring problem in the literature. Here, we show that this problem can also be treated by means of Corollary 5.2.

7 Conclusion and remarks

In this paper, we apply a recent fixed point theorem in [17] to study mathematical programming for the sum of two convex functions, mathematical programming for a convex function, the split feasibility problem, and the lasso problem. We establish strong convergence theorems as regards these problems. The study of such problems will lead to many other applications in science, nonlinear analysis, and statistics.

References

  1. Tibshirani, R: Regression shrinkage and selection for the lasso. J. R. Stat. Soc., Ser. B 58, 267-288 (1996)

  2. Combettes, PL, Wajs, R: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168-1200 (2005)

  3. Xu, HK: Properties and iterative methods for the lasso and its variants. Chin. Ann. Math., Ser. B 35, 501-518 (2014)

  4. Wang, Y, Xu, HK: Strong convergence for the proximal gradient methods. J. Nonlinear Convex Anal. 15, 581-593 (2014)

  5. Douglas, J, Rachford, HH: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421-439 (1956)

  6. Bauschke, HH, Combettes, PL: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)

  7. Tseng, P: Further applications of a splitting algorithm to decomposition in variational inequalities and convex programming. Math. Program., Ser. B 48, 249-263 (1990)

  8. Censor, Y, Elfving, T: A multiprojection algorithm using Bregman projection in a product space. Numer. Algorithms 8, 221-239 (1994)

  9. Byrne, C: Iterative oblique projection onto convex sets and the split feasibility problem. Inverse Probl. 18, 441-453 (2002)

  10. Byrne, C: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20, 103-120 (2004)

  11. Censor, Y, Bortfeld, T, Martin, B, Trofimov, A: A unified approach for inversion problems in intensity-modulated radiation therapy. Phys. Med. Biol. 51, 2353-2365 (2006)

  12. López, G, Martín-Márquez, V, Xu, HK: Iterative algorithms for the multiple-sets split feasibility problem. In: Censor, Y, Jiang, M, Wang, G (eds.) Biomedical Mathematics: Promising Directions in Imaging, Therapy Planning and Inverse Problems, pp. 243-279. Medical Physics Publishing, Madison (2010)

  13. Stark, H: Image Recovery: Theory and Applications. Academic Press, San Diego (1987)

  14. Xu, HK: Iterative methods for the split feasibility problem in infinite-dimensional Hilbert spaces. Inverse Probl. 26, 105018 (2010)

  15. Rockafellar, RT: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877-898 (1976)

  16. Xu, HK: Averaged mappings and the gradient projection algorithm. J. Optim. Theory Appl. 150, 360-378 (2011)

  17. Yu, ZT, Lin, LJ: Hierarchical problems with applications to mathematical programming with multiple sets split feasibility constraints. Fixed Point Theory Appl. 2013, 283 (2013)

  18. Takahashi, W: Nonlinear Functional Analysis: Fixed Point Theory and Its Applications. Yokohama Publishers, Yokohama (2000)

  19. Fan, K: A minimax inequality and applications. In: Shisha, O (ed.) Inequalities III, pp. 103-113. Academic Press, San Diego (1972)

  20. Blum, E, Oettli, W: From optimization and variational inequalities to equilibrium problems. Math. Stud. 63, 123-146 (1994)

  21. Combettes, PL, Hirstoaga, SA: Equilibrium programming in Hilbert spaces. J. Nonlinear Convex Anal. 6, 117-136 (2005)

  22. Takahashi, S, Takahashi, W, Toyoda, M: Strong convergence theorems for maximal monotone operators with nonlinear mappings in Hilbert spaces. J. Optim. Theory Appl. 147, 27-41 (2010)

  23. Ekeland, I, Temam, R: Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976)

  24. Combettes, PL: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53(5-6), 475-504 (2004)

  25. Baillon, JB, Haddad, G: Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Isr. J. Math. 26, 137-150 (1977)

  26. Chambolle, A: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89-97 (2004)


Acknowledgements

Prof. CS Chuang was supported by the National Science Council of the Republic of China.

Author information

Corresponding author

Correspondence to Lai-Jiu Lin.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

All authors contributed equally to this work. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Cite this article

Chuang, C.S., Yu, ZT. & Lin, LJ. Mathematical programming for the sum of two convex functions with applications to lasso problem, split feasibility problems, and image deblurring problem. Fixed Point Theory Appl 2015, 143 (2015). https://doi.org/10.1186/s13663-015-0388-0
