Mathematical programming for the sum of two convex functions with applications to lasso problem, split feasibility problems, and image deblurring problem
Fixed Point Theory and Applications volume 2015, Article number: 143 (2015)
Abstract
In this paper, two iteration processes are used to solve the mathematical programming problem for the sum of two convex functions. In infinite dimensional Hilbert spaces, we establish two strong convergence theorems for this problem. As applications of our results, we give strong convergence theorems for the split feasibility problem with a modified CQ method, a strong convergence theorem for the lasso problem, and strong convergence theorems for mathematical programming with a modified proximal point algorithm and a modified gradient-projection method in infinite dimensional Hilbert spaces. We also apply our result on the lasso problem to the image deblurring problem. Some numerical examples are given to demonstrate our results. The main result of this paper entails a unified study of many types of optimization problems. Our algorithms for solving these problems differ from those in the literature. Some results of this paper are original, and others improve, extend, and unify comparable existing results in the literature.
Introduction
Let \(\mathbb{R}\) be the set of real numbers, and let H and \(H_{1}\) be (real) Hilbert spaces with inner product \(\langle\cdot,\cdot\rangle\) and norm \(\|\cdot\|\). Let C and Q be nonempty, closed, convex subsets of H and \(H_{1}\), respectively. Let \(\Gamma_{0}(H)\) be the space of all proper lower semicontinuous convex functions from H to \((-\infty ,\infty]\). In this paper, we consider the following minimization problem:
$$ \mathop{\arg\min}_{x\in H}\bigl(h_{1}(x)+g_{1}(x)\bigr), $$(1.1)
where \(h_{1},g_{1}\in\Gamma_{0}(H)\), and \(g_{1}:H\rightarrow \mathbb {R}\) is a Fréchet differentiable function.
Let A be an \(m\times n\) real matrix, \(x\in\mathbb{R}^{n}\), \(b\in\mathbb{R}^{m}\), \(\gamma\geq0\) be a regularization parameter and \(t\geq0\). Tibshirani [1] studied the following minimization problem:
$$ \mathop{\arg\min}_{x\in\mathbb{R}^{n}}\frac{1}{2}\|Ax-b\|^{2}_{2}\quad\text{subject to }\|x\|_{1}\leq t, $$(1.2)
where \(\|x\|_{1}=\sum_{i=1}^{n}|x_{i}|\) for \(x=(x_{1},x_{2},\ldots,x_{n})\in\mathbb{R}^{n}\), and \(\|y\|^{2}_{2}=\sum_{i=1}^{m}(y_{i})^{2}\) for \(y=(y_{1},y_{2},\ldots, y_{m})\in\mathbb{R}^{m}\). Problem (1.2) is called lasso, an abbreviation of ‘the least absolute shrinkage and selection operator’. This is a special case of problem (1.1). Besides, we know that problem (1.2) is equivalent to the following problem:
$$ \mathop{\arg\min}_{x\in\mathbb{R}^{n}}\frac{1}{2}\|Ax-b\|^{2}_{2}+\gamma\|x\|_{1}. $$(1.3)
Therefore, problems (1.2) and (1.3) are special cases of problem (1.1).
Due to the involvement of the \(\ell_{1}\) norm, which promotes the sparsity phenomena of many real world problems arising from image/signal processing, statistical regression, machine learning and so on, the lasso has received much attention (see Combettes and Wajs [2], Xu [3], and Wang and Xu [4]).
Let \(f:H\rightarrow (-\infty,\infty]\) be proper and let C be a nonempty subset of \(\operatorname{dom}(f)\). Then f is said to be uniformly convex on C if
$$ f\bigl(\alpha x+(1-\alpha)y\bigr)+\alpha(1-\alpha)p\bigl(\|x-y\|\bigr)\leq\alpha f(x)+(1-\alpha)f(y) $$
for all \(x,y \in C\) and for all \(\alpha\in(0,1)\), where \(p:[0,\infty )\rightarrow [0,\infty]\) is an increasing function that vanishes only at 0.
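Let \(g\in\Gamma_{0}(H)\) and \(\lambda\in(0,\infty)\). The proximal operator of g is defined by
$$ \operatorname{prox}_{g}(x):=\mathop{\arg\min}_{v\in H} \biggl(g(v)+\frac{1}{2}\|v-x\|^{2}\biggr),\quad x\in H. $$
The proximal operator of \(g\in\Gamma_{0}(H)\) of order \(\lambda\in (0,\infty)\) is defined as the proximal operator of λg, that is,
$$ \operatorname{prox}_{\lambda g}(x):=\mathop{\arg\min}_{v\in H} \biggl(g(v)+\frac{1}{2\lambda}\|v-x\|^{2}\biggr),\quad x\in H. $$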
The following results are important results on the solution of the problem (1.1).
Theorem 1.1
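(Douglas-Rachford algorithm) [5]
Let f and g be functions in \(\Gamma_{0}(H)\) such that \((\partial f+\partial g)^{-1}0\neq\emptyset\). Let \(\{\lambda_{n}\}_{n\in \mathbb{N}}\) be a sequence in \([0,2]\) such that \(\sum_{n\in\mathbb{N}}\lambda_{n}(2-\lambda_{n})=+\infty\). Let \(\gamma\in(0,\infty)\), and \(x_{0}\in H\). Set
$$ \textstyle\begin{cases} y_{n}=\operatorname{prox}_{\gamma g}x_{n}, \\ z_{n}=\operatorname{prox}_{\gamma f}(2y_{n}-x_{n}), \\ x_{n+1}=x_{n}+\lambda_{n}(z_{n}-y_{n}), \quad n\in\mathbb{N}. \end{cases} $$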
Then there exists \(x\in H\) such that the following hold:

(i)
\(\operatorname{prox}_{\gamma g} x\in\arg\min_{x\in H}(f+g)(x)\);

(ii)
\(\{y_{n}-z_{n}\}_{n\in\mathbb{N}}\) converges strongly to 0;

(iii)
\(\{x_{n}\}_{n\in\mathbb{N}}\) converges weakly to x;

(iv)
\(\{y_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge weakly to \(\operatorname{prox}_{\gamma g}x\);

(v)
suppose that one of the following holds:

(a)
f is uniformly convex on every nonempty subset of \(\operatorname{dom}\partial f\);

(b)
g is uniformly convex on every nonempty bounded subset of \(\operatorname{dom}\partial g\).

Then \(\{y_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge strongly to \(\operatorname{prox}_{\gamma g}x\), which is the unique minimizer of \(f+g\).
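As an illustration of Theorem 1.1, the following minimal Python sketch runs the Douglas-Rachford recursion in its standard form, \(y_{n}=\operatorname{prox}_{\gamma g}x_{n}\), \(z_{n}=\operatorname{prox}_{\gamma f}(2y_{n}-x_{n})\), \(x_{n+1}=x_{n}+\lambda_{n}(z_{n}-y_{n})\), on \(H=\mathbb{R}\). It is not taken from [5]: the test functions \(f(x)=|x|\) and \(g(x)=\frac{1}{2}(x-3)^{2}\), the function names, and the parameter values are assumptions chosen so that the minimizer \(x=2\) can be checked by hand.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau*|.| on the real line."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def prox_quadratic(x, gamma):
    """Proximal operator of gamma*g for g(x) = 0.5*(x - 3)^2."""
    return (x + 3.0 * gamma) / (1.0 + gamma)


def douglas_rachford(prox_f, prox_g, x0, gamma=1.0, relax=1.0, iters=100):
    """Run the Douglas-Rachford recursion and return the last (x, y, z)."""
    x = x0
    y = z = x0
    for _ in range(iters):
        y = prox_g(x, gamma)            # y_n = prox_{gamma g} x_n
        z = prox_f(2.0 * y - x, gamma)  # z_n = prox_{gamma f}(2 y_n - x_n)
        x = x + relax * (z - y)         # x_{n+1} = x_n + lambda_n (z_n - y_n)
    return x, y, z


x, y, z = douglas_rachford(soft_threshold, prox_quadratic, x0=0.0)
print(y, z)  # both approach the minimizer of |x| + 0.5*(x - 3)^2
```

Note that the governing sequence \(\{x_{n}\}\) need not converge to the minimizer itself; as in items (i) and (iv), it is \(y_{n}=\operatorname{prox}_{\gamma g}x_{n}\) that approaches it.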
Theorem 1.2
(Forward-backward algorithm) [6]
Let \(f\in\Gamma_{0}(H)\), let \(g:H\rightarrow \mathbb{R}\) be convex and differentiable with a \(\frac {1}{\beta}\)-Lipschitz continuous gradient for some \(\beta\in (0,\infty)\), let \(\gamma\in(0,2\beta)\), and set \(\delta=\min\{ 1,\frac{\beta}{\gamma}\}+\frac{1}{2}\). Furthermore, let \(\{\lambda_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((0,2\delta]\) such that \(\sum_{n\in\mathbb{N}}\lambda_{n}(\delta-\lambda_{n})=+\infty\). Suppose that \(\arg\min (f+g)\neq\emptyset\) and let
$$ \textstyle\begin{cases} y_{n}=x_{n}-\gamma\nabla g(x_{n}), \\ x_{n+1}=x_{n}+\lambda_{n}(\operatorname{prox}_{\gamma f}y_{n}-x_{n}), \quad n\in\mathbb{N}. \end{cases} $$
Then the following hold:

(i)
\((x_{n})_{n\in\mathbb{N}}\) converges weakly to a point in \(\arg\min_{x\in H}(f+g)(x)\);

(ii)
suppose that \(\inf_{n\in\mathbb{N}}\lambda_{n}\in (0,\infty)\) and one of the following holds:

(a)
f is uniformly convex on every nonempty bounded subset of \(\operatorname{dom}\partial f\);

(b)
g is uniformly convex on every nonempty bounded subset of H.

Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges strongly to the unique minimizer of \(f+g\).
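The forward-backward scheme of Theorem 1.2 can be sketched for the lasso objective \(\frac{1}{2}\|Ax-b\|^{2}_{2}+\mu\|x\|_{1}\) as follows, assuming the usual form \(y_{n}=x_{n}-\gamma\nabla g(x_{n})\), \(x_{n+1}=x_{n}+\lambda_{n}(\operatorname{prox}_{\gamma f}y_{n}-x_{n})\). The \(2\times2\) data, the regularization weight μ, the step γ, and all function names are hypothetical choices made for this example; for these data the optimality conditions give the minimizer \((0,1.92)\).

```python
def soft_threshold(v, tau):
    """Componentwise proximal operator of tau * ||.||_1."""
    return [max(abs(t) - tau, 0.0) * (1.0 if t > 0 else -1.0) for t in v]


def grad_least_squares(A, b, x):
    """Gradient A^T (A x - b) of g(x) = 0.5 * ||A x - b||^2."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return [sum(A[i][j] * residual[i] for i in range(len(A)))
            for j in range(len(x))]


def forward_backward(A, b, mu, gamma, relax=1.0, iters=500):
    """Relaxed forward-backward iteration for 0.5*||Ax-b||^2 + mu*||x||_1."""
    x = [0.0] * len(A[0])
    for _ in range(iters):
        grad = grad_least_squares(A, b, x)
        y = [xj - gamma * gj for xj, gj in zip(x, grad)]      # forward step
        p = soft_threshold(y, gamma * mu)                     # backward step
        x = [xj + relax * (pj - xj) for xj, pj in zip(x, p)]  # relaxation
    return x


A = [[1.0, 0.5], [0.5, 1.0]]
b = [1.0, 2.0]
mu, gamma = 0.1, 0.4  # gamma < 2*beta, where 1/beta = ||A||^2 = 2.25 here
x_star = forward_backward(A, b, mu, gamma)
print(x_star)  # approaches (0, 1.92)
```

In finite dimensions weak and strong convergence coincide, so this toy run cannot exhibit the gap between items (i) and (ii); it only shows the mechanics of one forward (gradient) step followed by one backward (proximal) step.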
Theorem 1.3
(Tseng’s algorithm) [7]
Let D be a nonempty subset of H, let \(f\in\Gamma _{0}(H)\) be such that \(\operatorname{dom}\partial f\subset D\), and let \(g\in\Gamma _{0}(H)\) be Gâteaux differentiable on D. Suppose that C is a nonempty, closed, convex subset of D such that \(C\cap\arg \min_{x\in H}(f+g)(x)\neq\emptyset\), and that ∇g is \(\frac{1}{\beta}\)-Lipschitz continuous relative to \(C\cup \operatorname{dom} \partial f\) for some \(\beta\in(0,\infty)\). Let \(x_{0}\in C\) and \(\gamma\in(0,\beta)\), and set
$$ \textstyle\begin{cases} y_{n}=x_{n}-\gamma\nabla g(x_{n}), \\ z_{n}=\operatorname{prox}_{\gamma f}y_{n}, \\ r_{n}=z_{n}-\gamma\nabla g(z_{n}), \\ x_{n+1}=P_{C}(x_{n}-y_{n}+r_{n}), \quad n\in\mathbb{N}. \end{cases} $$
Then

(a)
\(\{x_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge weakly to a point in \(C\cap\arg\min_{x\in H}(f+g)(x)\);

(b)
suppose that f or g is uniformly convex on every nonempty subset of \(\operatorname{dom}\partial f\).
Then \(\{x_{n}\}_{n\in\mathbb{N}}\) and \(\{z_{n}\}_{n\in\mathbb{N}}\) converge strongly to a point in \(C\cap\arg\min_{x\in H}(f+g)(x)\).
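Tseng's scheme adds a second gradient evaluation and a projection to the forward-backward step. A minimal Python sketch, assuming the standard form \(y_{n}=x_{n}-\gamma\nabla g(x_{n})\), \(z_{n}=\operatorname{prox}_{\gamma f}y_{n}\), \(r_{n}=z_{n}-\gamma\nabla g(z_{n})\), \(x_{n+1}=P_{C}(x_{n}-y_{n}+r_{n})\), is given below. The scalar test problem \(f(x)=|x|\), \(g(x)=\frac{1}{2}(x-3)^{2}\), \(C=[-10,10]\), the step \(\gamma=0.5\), and the function names are assumptions made for this example only.

```python
def soft_threshold(x, tau):
    """Proximal operator of tau*|.| on the real line."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def project_interval(x, lo=-10.0, hi=10.0):
    """Metric projection of a real number onto [lo, hi]."""
    return min(max(x, lo), hi)


def tseng(grad_g, prox_f, project_C, x0, gamma, iters=200):
    """Forward-backward-forward iteration; returns the last (x, z)."""
    x = z = x0
    for _ in range(iters):
        y = x - gamma * grad_g(x)   # forward step at x_n
        z = prox_f(y, gamma)        # backward (proximal) step
        r = z - gamma * grad_g(z)   # second forward step at z_n
        x = project_C(x - y + r)    # correction and projection onto C
    return x, z


def grad_g(x):
    """Gradient of g(x) = 0.5*(x - 3)^2, which is 1-Lipschitz (beta = 1)."""
    return x - 3.0


x_last, z_last = tseng(grad_g, soft_threshold, project_interval,
                       x0=0.0, gamma=0.5)
print(x_last, z_last)  # both approach the minimizer x = 2
```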
Combettes and Wajs [2] used the proximal gradient method to generate a sequence \(\{x_{n}\}\) by the following algorithm: \(x_{0}\in H \) is chosen arbitrarily, and
Combettes and Wajs [2] showed that \(\{x_{n}\}\) converges weakly to a solution of the minimization problem (1.1) under suitable conditions. In 2014, Xu [3] gave an iteration process and proved weak convergence theorems for the solution of problem (1.1). Next, Wang and Xu [4] studied problem (1.1) by the following two types of iteration processes:

(a)
\(x_{n+1}:=\operatorname{prox}_{\lambda_{n}g}((1-\gamma_{n})x_{n}-\delta _{n}\nabla f(x_{n}))\) for all \(n\in\mathbb{N}\);

(b)
\(x_{n+1}:=(I-\gamma_{n}\operatorname{prox}_{\lambda_{n}g}(I-\delta _{n}\nabla f))x_{n}\) for all \(n\in\mathbb{N}\).
For these two iteration processes, Wang and Xu [4] proved that they converge strongly to a solution of the problem (1.1) under suitable conditions.
Let I be the identity function of H, and let \(f_{1}: C\times C\rightarrow \mathbb{R}\) be a bifunction. Let \(g_{1}:H\rightarrow (-\infty ,\infty]\) be a proper convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on \(\operatorname{int}(\operatorname{dom}(g_{1}))\), \(C \subset \operatorname{int}(\operatorname{dom}(g_{1}))\), and \(h_{1}:H\rightarrow (-\infty,\infty]\) be a proper convex lower semicontinuous function. Let \(P_{C}\) be the metric projection of H into C. Throughout this paper, we use these notations unless specified otherwise.
Motivated by the results of the above problems, in this paper, we introduce the following iterations to study problem (1.1).
Iteration (I)
Let \(x_{1}\in C \) be chosen arbitrarily, and
where \(f_{1}(x,y)=\langle y-x, \nabla g_{1}(x)\rangle+h_{1}(y)-h_{1}(x)\).
Iteration (II)
Let \(x_{1}\in C \) be chosen arbitrarily, and
Then we establish two strong convergence theorems without a uniform convexity assumption on the functions we consider. Our results improve those of Combettes and Wajs [2], Xu [3], Douglas-Rachford [5], Theorem 1.2, and Tseng [7]. Our results also differ from those of Wang and Xu [4].
We also apply our results to study the following problems.

(AP1)
Split feasibility problem:
$$ \text{Find } \bar{x}\in H \text{ such that } \bar{x}\in C \text{ and }A\bar{x}\in Q, $$(SFP) where \(A:H\rightarrow H_{1}\) is a bounded linear operator.
In 1994, the split feasibility problem (SFP) in finite dimensional Hilbert spaces was first introduced by Censor and Elfving [8] for modeling inverse problems which arise from phase retrievals and in medical image reconstruction. Since then, the split feasibility problem (SFP) has received much attention due to its applications in signal processing and image reconstruction, with particular progress in intensity-modulated radiation therapy, approximation theory, control theory, biomedical engineering, communications, and geophysics. For examples, one can refer to [8–13] and the related literature.
In 2002, Byrne [9] first introduced the so-called CQ algorithm which generates a sequence \(\{x_{n}\}\) by the following recursive procedure:
$$ x_{n+1}=P_{C}\bigl(x_{n}-\rho_{n}A^{\top}(I-P_{Q})Ax_{n}\bigr), $$
where the stepsize \(\rho_{n}\) is chosen in the interval \((0, 2/\|A\|^{2})\), and \(P_{C}\) and \(P_{Q}\) are the metric projections onto \(C\subseteq \mathbb{R}^{n}\) and \(Q\subseteq \mathbb{R}^{m}\), respectively. Byrne [9] used the CQ iteration method to study the split feasibility problem in finite dimensional spaces, but in infinite dimensional Hilbert spaces the CQ algorithm may fail to converge strongly for the split feasibility problem [14]. Hence, some modified CQ algorithms have been introduced.
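The CQ recursion can be sketched in a few lines of Python, assuming Byrne's update in the form \(x_{n+1}=P_{C}(x_{n}-\rho A^{\top}(I-P_{Q})Ax_{n})\) with a constant stepsize. The diagonal matrix A, the boxes C and Q, the stepsize \(\rho=0.1<2/\|A\|^{2}=2/9\), and the helper names are hypothetical choices for this example; they are not taken from [9].

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def project_box(v, lower, upper):
    """Metric projection onto the box [lower_1, upper_1] x ... ."""
    return [min(max(x, lo), hi) for x, lo, hi in zip(v, lower, upper)]


def cq_algorithm(A, proj_C, proj_Q, x0, rho, iters=200):
    """CQ iteration x <- P_C(x - rho * A^T (I - P_Q) A x)."""
    At = transpose(A)
    x = list(x0)
    for _ in range(iters):
        Ax = matvec(A, x)
        residual = [a - q for a, q in zip(Ax, proj_Q(Ax))]  # (I - P_Q) A x
        step = matvec(At, residual)                         # A^T (I - P_Q) A x
        x = proj_C([xi - rho * si for xi, si in zip(x, step)])
    return x


A = [[2.0, 0.0], [0.0, 3.0]]


def proj_C(v):
    return project_box(v, [0.0, 0.0], [2.0, 2.0])   # C = [0, 2]^2


def proj_Q(v):
    return project_box(v, [1.0, 3.0], [2.0, 4.5])   # Q = [1, 2] x [3, 4.5]


x_bar = cq_algorithm(A, proj_C, proj_Q, x0=[0.0, 0.0], rho=0.1)
print(x_bar, matvec(A, x_bar))  # x_bar lies in C and A x_bar lies in Q
```

Only the two projections and multiplications by A and its transpose are needed; no matrix inverse is computed.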
In this paper, we give a new iteration and a modified CQ method to study (SFP), and we establish two strong convergence theorems for the solution of this problem. Our results are different from Theorem 5.1 of Xu [14] and improve other results of Xu [14].

(AP2)
Lasso problem:
$$ \mathop{\arg\min}_{x\in\mathbb{R}^{n}}\frac{1}{2}\|Ax-b\|^{2}_{2}+ \gamma\|x\|_{1}. $$
We give two iterations to study the lasso problem, and we establish two strong convergence theorems for the lasso problem.

(AP3)
Mathematical programming for convex function:
$$ \mathop{\arg\min}_{x\in H}h_{1}(x),\mbox{where }h_{1}\in\Gamma_{0}(H). $$
Rockafellar [15] used a proximal point algorithm and proved a weak convergence theorem for the solution of this problem. We establish a modified proximal point algorithm and prove a strong convergence theorem for this problem. Our result improves the results given by Rockafellar [15].
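For orientation, the classical (unmodified) proximal point iteration \(x_{n+1}=\operatorname{prox}_{\lambda h_{1}}(x_{n})\) can be sketched as follows. This is not the modified algorithm of this paper; the test function \(h_{1}(x)=|x-1|\), the parameter \(\lambda=0.5\), the starting point, and the function names are assumptions made for this example.

```python
def prox_abs_shifted(x, lam, center=1.0):
    """Proximal operator of lam*|. - center| on the real line."""
    d = x - center
    if d > lam:
        return center + d - lam
    if d < -lam:
        return center + d + lam
    return center


def proximal_point(prox, x0, lam=0.5, iters=20):
    """Classical proximal point iteration; returns all iterates."""
    history = [x0]
    for _ in range(iters):
        history.append(prox(history[-1], lam))
    return history


history = proximal_point(prox_abs_shifted, x0=5.0)
print(history[:10])  # moves toward the minimizer x = 1 in steps of size lam
```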

(AP4)
Mathematical programming for convex Fréchet differentiable function:
$$ \mbox{Find } \mathop{\arg\min}_{x\in H}g_{1}(x) , \mbox{where } g_{1}:H\rightarrow \mathbb {R} \mbox{ is a Fr\'{e}chet differentiable function}. $$
Recently, Xu [16] studied weak and strong convergence theorems for this problem with various types of relaxed gradient-projection iterations. He also used the viscosity nature of the gradient-projection method and the regularized method to study strong convergence theorems for this problem.
A special case of one of our iterations is a modified gradient-projection algorithm. We use this modified gradient-projection algorithm to establish a strong convergence theorem for problem (AP4), and our results improve recent results given by Xu in [16].
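The basic gradient-projection step \(x_{n+1}=P_{C}(x_{n}-\gamma\nabla g_{1}(x_{n}))\), on which such methods are built, can be sketched as below. This is the plain scheme, not the modified algorithm of this paper; the quadratic \(g_{1}(x)=\frac{1}{2}\|x-c\|^{2}\) with \(c=(2,-1)\), the box \(C=[0,1]^{2}\), the step \(\gamma=0.5\), and the function names are assumptions for this example, chosen so that the constrained minimizer \(P_{C}c=(1,0)\) is known.

```python
def project_box(v, lower, upper):
    """Metric projection onto the box [lower_1, upper_1] x ... ."""
    return [min(max(x, lo), hi) for x, lo, hi in zip(v, lower, upper)]


def gradient_projection(grad, project, x0, gamma, iters=100):
    """Plain gradient-projection iteration x <- P_C(x - gamma * grad(x))."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = project([xi - gamma * gi for xi, gi in zip(x, g)])
    return x


c = [2.0, -1.0]


def grad_g1(x):
    """Gradient of g1(x) = 0.5 * ||x - c||^2."""
    return [xi - ci for xi, ci in zip(x, c)]


def project_C(v):
    return project_box(v, [0.0, 0.0], [1.0, 1.0])


x_bar = gradient_projection(grad_g1, project_C, [0.5, 0.5], gamma=0.5)
print(x_bar)  # the projection of c onto C
```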
In this paper, we apply a recent result of Yu and Lin [17] to find the solution of the mathematical programming for two convex functions, and then we apply our results on mathematical programming for two convex functions to study the above problems. We establish strong convergence theorems for these problems and apply our result on the lasso problem to the image deblurring problem. Some numerical examples are given to demonstrate our results. The main result of this paper gives a unified study of many types of optimization problems. Our algorithms for solving these problems differ from those in the literature. Some results of this paper are original, and others improve, extend, and unify comparable existing results in the literature.
Preliminaries
Throughout this paper, we denote the strong convergence of \(\{x_{n}\}\) to \(x\in H\) by \(x_{n}\rightarrow x\). Let \(T:C\rightarrow H\) be a mapping, and let \(\operatorname{Fix}(T):=\{x\in C: Tx=x\}\) denote the set of fixed points of T. Thus:

(i)
T is called nonexpansive if \(\|Tx-Ty\|\leq\|x-y\|\) for all \(x,y\in C\).

(ii)
T is strongly monotone if there exists \(\bar{\gamma}> 0\) such that \(\langle x-y, Tx-Ty\rangle\geq\bar{\gamma}\|x-y\|^{2}\) for all \(x, y\in C\).

(iii)
T is Lipschitz continuous if there exists \(L > 0\) such that \(\|Tx-Ty\|\leq L\|x-y\|\) for all \(x,y\in C\).

(iv)
Let \(\alpha>0\). Then T is α-inverse-strongly monotone if \(\langle x-y,Tx-Ty\rangle \geq\alpha\|Tx-Ty\|^{2}\) for all \(x,y\in C\). We say that T is α-ism if T is α-inverse-strongly monotone.

(v)
T is firmly nonexpansive if
$$\|Tx-Ty\|^{2}\leq\|x-y\|^{2}-\bigl\Vert (I-T)x-(I-T)y\bigr\Vert ^{2}\quad \text{for every }x,y\in C, $$ that is,
$$\|Tx-Ty\|^{2}\leq\langle x-y,Tx-Ty\rangle\quad \text{for every }x,y\in C. $$
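For the reader's convenience, the equivalence of the two inequalities in (v) can be verified directly; in the following sketch the shorthand \(u=x-y\) and \(v=Tx-Ty\) is introduced only for this computation. Expanding the last norm gives
$$ \bigl\Vert (I-T)x-(I-T)y\bigr\Vert ^{2}=\|u-v\|^{2}=\|u\|^{2}-2\langle u,v\rangle+\|v\|^{2}, $$
so that
$$ \|v\|^{2}\leq\|u\|^{2}-\|u-v\|^{2} \quad\Longleftrightarrow\quad \|v\|^{2}\leq2\langle u,v\rangle-\|v\|^{2} \quad\Longleftrightarrow\quad \|v\|^{2}\leq\langle u,v\rangle. $$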
Let \(B:H\multimap H\) be a multivalued mapping. The effective domain of B is denoted by \(D(B)\), that is, \(D(B) = \{x\in H : Bx\neq\emptyset \}\). Thus:

(i)
B is a monotone operator on H if \(\langle x-y,u-v\rangle \geq0\) for all \(x, y\in D(B)\), \(u\in Bx\) and \(v\in By\).

(ii)
B is a maximal monotone operator on H if B is a monotone operator on H and its graph is not properly contained in the graph of any other monotone operator on H.
Lemma 2.1
[18]
Let \(G_{1}:H\multimap H\) be maximal monotone. Let \(J_{r}^{G_{1}}\) be the resolvent of \(G_{1}\) defined by \(J_{r}^{G_{1}}=(I+r G_{1})^{-1}\) for each \(r>0\). Then the following hold:

(i)
For each \(r>0\), \(J_{r}^{G_{1}}\) is single-valued and firmly nonexpansive.

(ii)
\(\mathcal{D}(J_{r}^{G_{1}})=H\) and \(\operatorname{Fix}(J_{r}^{G_{1}})=\{ x\in \mathcal{D}(G_{1}):0\in G_{1}x\}\).
Let C be a nonempty, closed, convex subset of a real Hilbert space H. Let \(g:C\times C\rightarrow \mathbb{R}\) be a function. The Ky Fan inequality problem [19] is to find \(z\in C\) such that
$$ g(z,y)\geq0\quad\text{for all } y\in C. $$(KF)
The solution set of the Ky Fan inequality problem (KF) is denoted by \(\operatorname{KF}(C,g)\).
For solving the Ky Fan inequalities problem, let us assume that the bifunction \(g:C\times C\rightarrow \mathbb{R}\) satisfies the following conditions:

(A1)
\(g(x,x)=0\) for each \(x\in C\);

(A2)
g is monotone, i.e., \(g(x, y) +g(y, x)\leq0\) for any \(x,y\in C\);

(A3)
for each \(x,y,z\in C\), \(\limsup_{t\downarrow0}g(tz+(1-t)x,y)\leq g(x,y)\);

(A4)
for each \(x\in C\), the scalar function \(y\rightarrow g(x,y)\) is convex and lower semicontinuous.
We have the following result from Blum and Oettli [20].
Lemma 2.2
[20]
Let \(g: C\times C\rightarrow \mathbb {R}\) be a bifunction which satisfies conditions (A1)-(A4). Then for each \(r>0\) and each \(x\in H\), there exists \(z\in C\) such that
$$ g(z,y)+\frac{1}{r}\langle y-z,z-x\rangle\geq0 $$
for all \(y\in C\).
In 2005, Combettes and Hirstoaga [21] established the following important properties of resolvent operator.
Lemma 2.3
[21]
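Let \(g:C\times C\rightarrow \mathbb{R}\) be a function satisfying conditions (A1)-(A4). For \(r>0\), define \(T_{r}^{g}:H\rightarrow C\) by
$$ T_{r}^{g}x:= \biggl\{ z\in C: g(z,y)+\frac{1}{r}\langle y-z,z-x\rangle\geq0, \forall y\in C \biggr\} $$
for all \(x\in H\). Then the following hold: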

(i)
\(T_{r}^{g}\) is single-valued;

(ii)
\(T_{r}^{g}\) is firmly nonexpansive, that is, \(\|T_{r}^{g}x-T_{r}^{g}y\|^{2}\leq\langle x-y,T_{r}^{g}x-T_{r}^{g}y\rangle\) for all \(x,y\in H\);

(iii)
\(\{x\in H: T_{r}^{g}x=x\}=\{x\in C: g(x,y)\geq0, \forall y\in C\}\);

(iv)
\(\{x\in C: g(x,y)\geq0, \forall y\in C\}\) is a closed and convex subset of C.
We call such \(T_{r}^{g}\) the resolvent of g for \(r>0\).
Takahashi et al. [22] gave the following lemma.
Lemma 2.4
[22]
Let \(g: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying the conditions (A1)-(A4). Define \(A_{g}\) as follows:
$$ A_{g}x:= \textstyle\begin{cases} \{z\in H: g(x,y)\geq\langle y-x,z\rangle, \forall y\in C\} & \text{if } x\in C, \\ \emptyset & \text{otherwise}. \end{cases} $$(L4.2)
Then \(\operatorname{KF}(C,g)=A^{-1}_{g}0\) and \(A_{g}\) is a maximal monotone operator with the domain of \(A_{g}\subset C\). Furthermore, for any \(x\in H\) and \(r>0\), the resolvent \(T_{r}^{g}\) of g coincides with the resolvent of \(A_{g}\), i.e., \(T_{r}^{g}x=(I+rA_{g})^{-1}x\).
Let \(f:H\rightarrow (-\infty,\infty]\) be proper. The subdifferential ∂f of f is the set-valued operator defined by \(\partial f(x)= \{u\in H: f(y)\geq f(x)+\langle y-x,u\rangle\text{ for all } y\in H\}\).
Let \(x\in H\). Then f is subdifferentiable at x if \(\partial f(x)\neq\emptyset\). The elements of \(\partial f(x)\) are called the subgradients of f at x.
The directional derivative of f at x in the direction y is
$$ f'(x,y):=\lim_{t\downarrow0}\frac{f(x+ty)-f(x)}{t}, $$
provided that the limit exists in \([-\infty,\infty]\).
Let \(x\in \operatorname{dom} f\) and suppose that \(f'(x,y)\) is linear and continuous, then f is said to be Gâteaux differentiable at x. By the Riesz representation theorem, there exists a unique vector \(\nabla f(x)\in H\) such that \(f'(x,y)=\langle y,\nabla f(x)\rangle\) for all \(y\in H\).
Let \(x\in H\), let \(\mu(x)\) denote the family of all neighborhoods of x, let \(H_{1}\) be a Hilbert space, let \(C\in\mu(x)\) and let \(f:C\rightarrow (-\infty,\infty]\). Then f is said to be Fréchet differentiable at x if there exists an operator \(\nabla f(x)\in B(H,\mathbb{R})\), called the Fréchet derivative of f at x, such that
$$ \lim_{0\neq\|y\|\rightarrow0}\frac{|f(x+y)-f(x)-\nabla f(x)y|}{\|y\|}=0. $$
Further, if f is Fréchet differentiable at x, then f is Gâteaux differentiable at x.
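Let C be a nonempty, closed, convex subset of H. The indicator function \(i_{C}\) defined by
$$ i_{C}x= \textstyle\begin{cases} 0, & x\in C, \\ \infty, & x\notin C, \end{cases} $$
is a proper lower semicontinuous convex function and its subdifferential \(\partial i_{C}\) defined by
$$ \partial i_{C}x=\bigl\{ z\in H:\langle y-x,z\rangle\leq i_{C}y-i_{C}x, \forall y\in H\bigr\} $$
is a maximal monotone operator (see Lemma 2.8). Furthermore, we also define the normal cone \(N_{C}u\) of C at u as follows:
$$ N_{C}u=\bigl\{ z\in H:\langle z,v-u\rangle\leq0, \forall v\in C\bigr\} . $$
We can define the resolvent \(J_{\lambda}^{\partial i_{C}}\) of \(\partial i_{C}\) for \(\lambda>0\), i.e.,
$$ J_{\lambda}^{\partial i_{C}}x=(I+\lambda\partial i_{C})^{-1}x $$
for all \(x\in H\). Since
$$ \partial i_{C}x=\bigl\{ z\in H: i_{C}x+\langle z,y-x\rangle\leq i_{C}y, \forall y\in H\bigr\} =\bigl\{ z\in H:\langle z,y-x\rangle\leq0, \forall y\in C\bigr\} =N_{C}x $$
for all \(x\in C\), we have
$$ u=J_{\lambda}^{\partial i_{C}}x \quad\Leftrightarrow\quad x\in u+\lambda\partial i_{C}u \quad\Leftrightarrow\quad x-u\in\lambda N_{C}u \quad\Leftrightarrow\quad \langle x-u,y-u\rangle\leq0, \forall y\in C \quad\Leftrightarrow\quad u=P_{C}x. $$
For details see [22].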
Lemma 2.5
[4]
Let \(g\in \Gamma_{0}(H)\) and \(\lambda\in(0,\infty)\). Thus:

(i)
If C is a nonempty, closed, convex subset of H and \(g=i_{C}\) is the indicator function of C, then the proximal operator \(\operatorname{prox}_{\lambda g}=P_{C}\) for all \(\lambda\in(0,\infty)\), where \(P_{C}\) is the metric projection operator from H to C.

(ii)
\(\operatorname{prox}_{\lambda g}\) is firmly nonexpansive.

(iii)
\(\operatorname{prox}_{\lambda g}=(I+\lambda\partial g)^{-1}=J_{\lambda }^{\partial g}\).
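Items (i) and (ii) of Lemma 2.5 can be probed numerically on \(H=\mathbb{R}\). The sketch below samples random pairs and tests the inequality \(|Tx-Ty|^{2}\leq(x-y)(Tx-Ty)\) for the projection onto \([-1,1]\) (the proximal operator of an indicator function) and for soft thresholding (the proximal operator of \(\lambda|\cdot|\)); the map \(x\mapsto-x\) is included as a nonexpansive map that is not firmly nonexpansive. The interval, the threshold, the sample size, and the function names are assumptions for this example, and a sampled check is of course evidence only, not a proof.

```python
import random


def project_interval(x, lo=-1.0, hi=1.0):
    """prox of the indicator of [lo, hi], i.e. the metric projection."""
    return min(max(x, lo), hi)


def soft_threshold(x, tau=0.5):
    """prox of tau*|.| on the real line."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def is_firmly_nonexpansive(T, pairs, tol=1e-12):
    """Check |Tx - Ty|^2 <= (x - y)*(Tx - Ty) on the given sample pairs."""
    return all((T(x) - T(y)) ** 2 <= (x - y) * (T(x) - T(y)) + tol
               for x, y in pairs)


random.seed(0)
pairs = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]

print(is_firmly_nonexpansive(project_interval, pairs))
print(is_firmly_nonexpansive(soft_threshold, pairs))
print(is_firmly_nonexpansive(lambda x: -x, pairs))
```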
Lemma 2.6
[3]
Let \(f,g\in\Gamma _{0}(H)\). Let \(x^{*}\in H\) and \(\lambda\in(0,\infty)\). Assume that f is a finite valued and Fréchet differentiable function on H with Fréchet derivative ∇f. Then \(x^{*}\) is a solution to the problem \(\arg\min_{x\in H}f(x)+g(x)\) if and only if \(x^{*}=\operatorname{prox}_{\lambda g}(I-\lambda\nabla f)x^{*}\).
Lemma 2.7
[23]
Let \(C\subset H\) be nonempty, closed, convex subset, let \(A:H\rightarrow H\), and let \(f:H\rightarrow \mathbb{R}\) be convex and Fréchet differentiable. Let A be the Fréchet derivative of f. Then \(\operatorname{VI}(C, A)=\arg\min_{x\in C}f(x)\).
Lemma 2.8
[6]
Let \(f\in \Gamma_{0}(H)\). Then ∂f is maximal monotone.
Lemma 2.9
[6]
Let f and g be functions in \(\Gamma_{0}(H)\) such that one of the following holds:

(i)
\(\operatorname{dom} f\cap \operatorname{int} (\operatorname{dom} g)\neq\emptyset\);

(ii)
\(\operatorname{dom} g=H\).
Then \(\partial(f+g)=\partial f+\partial g\).
A mapping \(T_{\alpha}:H\rightarrow H\) is said to be averaged if \(T_{\alpha}=(1-\alpha)I +\alpha T\), where \(\alpha\in(0, 1)\) and \(T: H\rightarrow H \) is nonexpansive. In this case, we say that \(T_{\alpha}\) is α-averaged. Clearly, a firmly nonexpansive mapping is \(\frac{1}{2}\)-averaged.
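The last claim can be checked in one line; the following sketch uses the shorthand \(N=2T-I\), \(u=x-y\), and \(v=Tx-Ty\), which is introduced only here. If T is firmly nonexpansive, then \(\|v\|^{2}\leq\langle u,v\rangle\), and hence
$$ \|Nx-Ny\|^{2}=\|2v-u\|^{2}=4\|v\|^{2}-4\langle u,v\rangle+\|u\|^{2}\leq\|u\|^{2}, $$
so N is nonexpansive and \(T=(1-\frac{1}{2})I+\frac{1}{2}N\) has the required form.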
Lemma 2.10
[24]
Let \(T:H\rightarrow H\) be a mapping. Then the following hold:

(i)
T is nonexpansive if and only if the complement \((I-T)\) is \(1/2\)-ism;

(ii)
if S is υ-ism, then for \(\gamma>0\), γS is \(\upsilon/\gamma\)-ism;

(iii)
S is averaged if and only if the complement \(I-S\) is υ-ism for some \(\upsilon> 1/2\);

(iv)
if S and T are both averaged, then the product (composite) ST is averaged;

(v)
if the mappings \(\{T_{i}\}_{i=1}^{n}\) are averaged and have a common fixed point, then \(\bigcap_{i=1}^{n}\operatorname{Fix}(T_{i}) = \operatorname{Fix}(T_{1}\cdots T_{n})\).
Lemma 2.11
[6]
Let \(f:H\rightarrow (-\infty,\infty]\) be proper and convex. Suppose that f is Gâteaux differentiable at x. Then \(\partial f(x)=\{\nabla f(x)\}\).
Common solution of variational inequality problem, fixed point, and Ky Fan inequalities problem
For each \(i=1,2\), let \(F_{i}:C\rightarrow H\) be a \(\kappa _{i}\)-inverse-strongly monotone mapping of C into H with \(\kappa_{i}>0\). For each \(i=1,2\), let \(G_{i}\) be a maximal monotone mapping on H such that the domain of \(G_{i}\) is included in C and define the set \(G_{i}^{-1}0\) as \(G_{i}^{-1}0= \{x\in H: 0\in G_{i}x\}\). Let \(J_{\lambda}^{G_{1}}=(I+\lambda G_{1})^{-1}\) and \(J_{r}^{G_{2}}=(I+r G_{2})^{-1}\) for each \(n\in\mathbb{N}\), \(\lambda >0\) and \(r>0\). Let \(\{\theta_{n}\}\subset H\) be a sequence. Let V be a \(\bar{\gamma}\)-strongly monotone and L-Lipschitz continuous operator with \(\bar{\gamma}>0\) and \(L > 0\). Let \(T : C\rightarrow H\) be a nonexpansive mapping. Throughout this paper, we use these notations and assumptions unless specified otherwise.
In this paper, we say conditions (D) hold if the following conditions are satisfied:

(i)
\(0<\liminf_{n\rightarrow\infty}\alpha_{n}\leq\limsup_{n\rightarrow\infty}\alpha_{n}<1\);

(ii)
\(\lim_{n\rightarrow \infty}\beta_{n}=0\), and \(\sum_{n=1}^{\infty}\beta_{n}=\infty\);

(iii)
\(\lim_{n\rightarrow\infty}\theta_{n}=0\).
The following strong convergence theorem is needed in this paper.
Theorem 3.1
[17]
Suppose that \(\Omega_{1}=\operatorname{Fix}(T)\cap \operatorname{Fix}(J_{\lambda }^{G_{1}}(I-\lambda F_{1}))\cap \operatorname{Fix}(J_{r}^{G_{2}}(I-rF_{2}))\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily,
for each \(n\in\mathbb{N}\), \(\{\lambda, r\}\subset(0,\infty)\), \(\{ \alpha_{n}\}\subset(0,1)\), and \(\{\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold and \(0<\lambda<2\kappa_{1}\) and \(0< r<2\kappa_{2}\). Then
where
This point \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
For each \(i=1,2\), let \(f_{i}: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying conditions (A1)-(A4). An iteration is used to find common solutions of a variational inequality problem, Ky Fan inequalities problems, and a fixed point set of a mapping:
Theorem 3.2
For each \(i=1,2\), let \(f_{i}: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying conditions (A1)-(A4), and let \(A_{f_{i}}\) be defined as (L4.2) in Lemma 2.4. Suppose that \(\Omega_{2}: =\operatorname{Fix}(T)\cap \operatorname{KF}(C,f_{1})\cap \operatorname{KF}(C,f_{2})\neq \emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\{\lambda,r\}\subset(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{2}}(\bar{x}-V\bar{x})\). This point \(\bar{x} \in \Omega_{2}\) is also the unique solution to the hierarchical variational inequality:
Proof
For each \(i=1,2\), by Lemma 2.4, we know that \(A_{f_{i}}\) is a maximal monotone operator with the domain of \(A_{f_{i}}\subset C\) and \(\operatorname{KF}(C,f_{i})=A^{-1}_{f_{i}}0\). For each \(i=1,2\), let \(F_{i}=0\), and \(G_{i}=A_{f_{i}}\) in Theorem 3.1. By Lemma 2.1(ii), we have, for each \(i=1,2\),
This implies that \(\Omega_{1} =\Omega_{2}\). By Theorem 3.1, \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{2}}(\bar{x}-V\bar{x})\). This point \(\bar{x}\in \Omega_{2}\) is also the unique solution to the hierarchical variational inequality:
Thus,
and
Therefore, the proof is completed. □
As a simple consequence of Theorem 3.2, we study the common solution of the Ky Fan inequalities problems.
Theorem 3.3
Let \(f_{1}: C\times C\rightarrow \mathbb{R}\) be a bifunction satisfying conditions (A1)-(A4) and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma 2.4. Suppose that \(\Omega_{3}: = \operatorname{KF}(C,f_{1})\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\{\lambda,r\}\subset(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{3}}(\bar{x}-V\bar{x})\). This point \(\bar{x} \in \operatorname{KF}(C,f_{1})\) is also the unique solution to the hierarchical variational inequality:
Proof
Let \(I_{C}\) and \(i_{C}\) be the restriction of the identity function to C and the indicator function of C, respectively, and let \(T=I_{C}\), \(f_{2}=i_{C}\) in Theorem 3.2. Then Theorem 3.3 follows from Theorem 3.2. □
Theorem 3.4
Let \(\Omega_{4}: =\operatorname{Fix}(T)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H_{1}\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{4}}(\bar{x}-V\bar{x})\). This point \(\bar{x} \in \operatorname{Fix} (T)\) is also the unique solution to the hierarchical variational inequality:
Proof
For each \(i=1,2\), let \(f_{i}=i_{C}\), \(A_{f_{i}}=\partial i_{C}\) in Theorem 3.2, where \(i_{C}\) is the indicator function of C. Then \(\operatorname{KF}(C,f_{i})=C\) and \(J_{r}^{A_{f_{i}}}=P_{C}\). Therefore, Theorem 3.4 follows immediately from Theorem 3.2. □
Mathematical programming for the sum of two convex functions
In the following theorem, an iteration is used to find the solution of the optimization problem for the sum of two convex functions:
Theorem 4.1
Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H, and \(h_{1}:H\rightarrow (-\infty,\infty]\) be a proper convex lower semicontinuous function. Let \(f_{1}(x,y)=\langle y-x,\nabla g_{1}x\rangle+h_{1}(y)-h_{1}(x)\) for all \(x,y\in H\) and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma 2.4. Take \(\mu\in \mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Pi_{1} :=H\operatorname{VI}(C,\nabla g_{1},h_{1})\neq\emptyset\), where
$$ H\operatorname{VI}(C,\nabla g_{1},h_{1}):=\bigl\{ x\in C:\langle y-x,\nabla g_{1}x\rangle+h_{1}(y)-h_{1}(x)\geq0, \forall y\in C\bigr\} . $$
A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\lambda\in(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is the unique solution to the hierarchical variational inequality:
Proof
Since \(\nabla g_{1}\) is the Fréchet derivative of the convex function \(g_{1}\), it follows from Corollary 17.33 of [6] that \(\nabla g_{1}\) is continuous on C. By Proposition 17.10 of [6], \(\nabla g_{1}\) is monotone on C. Hence, \(\nabla g_{1}\) is bounded on any line segment of C. By Proposition 17.2 of [6],
Since \(h_{1}:C\rightarrow \mathbb{R}\) is a proper convex lower semicontinuous function, it is easy to see that for each \(x,y,z\in C\),
This shows that condition (A3) is satisfied. It is easy to see that \(f_{1}\) also satisfies conditions (A1), (A2), and (A4). We see \(\operatorname{KF}(C, f_{1})=H\operatorname{VI}(C, \nabla g_{1},h_{1})\) and \(\Pi_{1}=\Omega_{3}\neq\emptyset\). By Theorem 3.3, \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Omega_{3}}(\bar{x}-V\bar{x})\), \(\bar{x}\in \operatorname{KF}(C,f_{1})\). This point \(\bar{x}\in\Omega_{3}\) is also the unique solution to the hierarchical variational inequality:
By \(\Omega_{3}=\Pi_{1}\) and \(\bar{x}\in \operatorname{KF}(C,f_{1}) \), we have
and
for all \(y\in C\). Then \(\bar{x}\in\arg\min_{y\in C}(g_{1}+h_{1})(y)\). □
Example 4.1
Let \(h_{1}(x)=x^{2}\), \(g_{1}(x)=x^{2}+2x+1\), \(C=[-1,1]\), \(H=\mathbb{R}\), \(\lambda=1\), \(V=I\), \(\alpha_{n}=\frac{1}{2}\) for all \(n\in \mathbb{N}\), \(\beta_{n}=\frac{1}{1\text{,}000n}\), \(\theta_{n}=0\), and \(f_{1}(x,y)=\langle y-x,\nabla g_{1}(x)\rangle+h_{1}(y)-h_{1}(x)\). Then \(f_{1}(x,y)=(y-x)(2x+y+x+2)\), which implies that \(f_{1}(-\frac{1}{2},y)=(y+\frac{1}{2})^{2}\geq0\) for all \(y\in[-1,1]\), so \(-\frac{1}{2}\in H\operatorname{VI}([-1,1],\nabla g_{1},h_{1})\neq\emptyset\).
We also see \(A_{f_{1}}(-1)=(-\infty,-2]\), \(A_{f_{1}}(1)=[6,\infty)\), and \(A_{f_{1}}(x)=4x+2\) if \(x\in(-1,1)\). Let \(y_{n}=J_{1}^{A_{f_{1}}}P_{C}x_{n}\), \(x_{n+1}=\frac{1}{2}x_{n}+\frac{1}{2}(1-\frac{1}{1\text{,}000n})y_{n}\).
It is easy to see that \(P_{C}x_{n}=5y_{n}+2\) and \(y_{n}=\frac{1}{5}(P_{C}x_{n}-2)\).
Hence \(x_{n+1}=\frac{1}{2}x_{n}+\frac{1}{10}(1-\frac{1}{1\text{,}000n})(P_{C}x_{n}-2)\). It is easy to see that all the conditions of Theorem 4.1 are satisfied.
Let \(x_{1}=0\), then \(x_{2}=-0.1998\), \(x_{3}=-0.31977001\), \(x_{4}=-0.39177967\), \(x_{5}=-0.435008\), … , and we see \(\lim_{n\rightarrow \infty} x_{n}=\bar{x}=-\frac {1}{2}\in\arg\min_{x\in [-1,1]}(g_{1}(x)+h_{1}(x))\).
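The recursion of Example 4.1 is easy to reproduce. The following Python sketch implements \(x_{n+1}=\frac{1}{2}x_{n}+\frac{1}{2}(1-\beta_{n})y_{n}\) with \(y_{n}=\frac{1}{5}(P_{C}x_{n}-2)\), \(\beta_{n}=\frac{1}{1000n}\), and \(C=[-1,1]\), starting from \(x_{1}=0\); the function names and the number of steps are our own choices for this illustration.

```python
def project_interval(x, lo=-1.0, hi=1.0):
    """Metric projection P_C onto C = [lo, hi]."""
    return min(max(x, lo), hi)


def example_4_1(x1=0.0, steps=5):
    """Return [x_1, ..., x_steps] for the recursion of Example 4.1."""
    xs = [x1]
    for n in range(1, steps):
        beta_n = 1.0 / (1000.0 * n)
        y_n = (project_interval(xs[-1]) - 2.0) / 5.0   # y_n = J_1 P_C x_n
        xs.append(0.5 * xs[-1] + 0.5 * (1.0 - beta_n) * y_n)
    return xs


print(example_4_1(steps=5))        # the first five iterates x_1, ..., x_5
print(example_4_1(steps=5000)[-1])  # close to the minimizer -1/2
```

Since \(g_{1}(x)+h_{1}(x)=2x^{2}+2x+1\) is minimized at \(x=-\frac{1}{2}\), which lies in the interior of C, the projection is inactive along the whole trajectory started at \(x_{1}=0\).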
Next, an iteration is used to find the solution of the following optimization problem for the convex differentiable function:
Corollary 4.1
Let \(g_{1}:H\rightarrow \mathbb{R}\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\). Let \(f_{1}(x,y)=\langle y-x,\nabla g_{1}x \rangle\) for all \(x,y\in H\) and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma 2.4. Suppose that \(\Pi_{1,1} :=\arg\min_{y\in C}g_{1}(y)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H_{1}\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\lambda\in(0,\infty)\) and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{1,1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Proof
By Lemma 2.7, we know that \(\operatorname{VI}(C,\nabla g_{1})=\arg\min_{y\in C}g_{1}(y)\). Therefore, Corollary 4.1 follows immediately from Theorem 4.1 by letting \(h_{1}=0\). □
Next, another iteration is used to find the solution of the following optimization problem for a convex function:
Corollary 4.2
Let \(f_{1}(x,y)=h_{1}(y)-h_{1}(x)\), for all \(x,y\in C\), and let \(A_{f_{1}}\) be defined as (L4.2) in Lemma 2.4. Suppose that \(\Pi_{1,2} :=\arg\min_{y\in C}h_{1}(y)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H_{1}\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\lambda\in(0,\infty)\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{1,2}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Proof
Put \(g_{1}=0\) in Theorem 4.1. Then Corollary 4.2 follows from Theorem 4.1. □
In the following theorem, an iteration is used to find the solution of the following optimization problem for the sum of two convex functions:
Theorem 4.2
Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H, and \(h_{1}:H\rightarrow (-\infty,\infty]\) be a proper convex lower semicontinuous function. Suppose that \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\) and \(\Pi_{2} :=\arg\min_{x\in H} (g_{1}+h_{1})(x)\neq\emptyset\). Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in H \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(r\in(0,\frac{2}{L_{1}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{2}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
We give two different proofs for this theorem.
Proof I
Put \(C=H\), \(G_{1}=\partial h_{1}\), \(F_{1}=\nabla g_{1}\), \(T=I_{C}\), \(F_{2}=0\), \(G_{2}=\partial i_{H}\) in Theorem 3.1, where \(I_{C}\) is the restriction of I to C and \(i_{C}\) is the indicator function of C. Since \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\), it follows from Corollary 10 of [25] that \(\nabla g_{1} \) is \(\frac{1}{L_{1}}\)-inverse strongly monotone. Since \(h_{1}\) is a proper convex lower semicontinuous function, it follows from Lemma 2.8 that \(\partial h_{1}\) is a set-valued maximal monotone mapping. By Lemma 2.11, \(\partial g_{1}=\{\nabla g_{1}\}\). It follows from \(\operatorname{dom}(h_{1})\cap \operatorname{int} (\operatorname{dom}(g_{1}))\neq\emptyset\), \(\operatorname{dom}(g_{1})=H\), and Lemma 2.9 that
Hence,
Therefore, we get the conclusion of Theorem 4.2 from Theorem 3.1. □
Proof II
Let \(C=H\), \(T=\operatorname{prox}_{r h_{1}}(I-r \nabla g_{1})\) in Theorem 3.4. Since \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\), it follows from Corollary 10 of [25] that \(\nabla g_{1}\) is \(\frac{1}{L_{1}}\)-inverse strongly monotone. By Lemma 2.10, \(r\nabla g_{1}\) is \(\frac{1}{rL_{1}}\)-ism and \((I-r \nabla g_{1})\) is averaged. Since \(\partial h_{1}\) is maximal monotone, it follows from Lemma 2.5 that \(\operatorname{prox}_{rh_{1}}=J_{r}^{\partial h_{1}}\) is firmly nonexpansive. Hence \(\operatorname{prox}_{rh_{1}}\) is \(\frac{1}{2}\)-averaged. Then by Lemma 2.10, T is averaged and nonexpansive. We have
Hence, \(\Pi_{2}=\Omega_{4}\), and we get the conclusion of Theorem 4.2 from Theorem 3.4. □
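The link between minimizers of \(g_{1}+h_{1}\) and fixed points of the forward-backward map \(T=\operatorname{prox}_{rh_{1}}(I-r\nabla g_{1})\) can be written out as a short chain of equivalences. This is a sketch of the standard argument for any \(r>0\), assuming the sum rule \(\partial(g_{1}+h_{1})=\nabla g_{1}+\partial h_{1}\) and the identity \(\operatorname{prox}_{rh_{1}}=(I+r\partial h_{1})^{-1}\):

$$
\begin{aligned}
x=\operatorname{prox}_{rh_{1}}(x-r\nabla g_{1}x)
&\iff x-r\nabla g_{1}x\in x+r\,\partial h_{1}(x)\\
&\iff 0\in\nabla g_{1}x+\partial h_{1}(x)\\
&\iff 0\in\partial(g_{1}+h_{1})(x)\\
&\iff x\in\mathop{\arg\min}_{y\in H}(g_{1}+h_{1})(y).
\end{aligned}
$$

In particular, the value of \(r\) affects the averagedness of T but not its fixed point set.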
Remark 4.1
 (a)

(b)
Theorem 1.1, Theorem 1.2, Theorem 1.3, and the results given by Combettes and Wajs [2], and Xu [3] are weak convergence theorems for the problem:
$$ \mathop{\arg\min}_{x\in H} g_{1}(x)+h_{1}(x). $$
Further, Theorem 1.1, Theorem 1.2, and Theorem 1.3 give strong convergence theorems for this problem under a uniform convexity assumption on \(h_{1}\) or \(g_{1}\). Therefore, Theorem 4.2 is different from these results. Besides, Theorem 4.2 is also different from the result given by Wang and Xu [4] and related algorithms in the literature.
Example 4.2
Let \(h_{1}(x)=x^{2}\), \(g_{1}(x)=x^{2}+2x+1\), \(H=\mathbb{R}\), \(\alpha=1\), \(V=I\), \(\alpha_{n}=\frac{1}{2}\) for all \(n\in \mathbb{N}\), \(\beta_{n}=\frac{1}{1000n}\), \(C=[-1,1]\), and \(\theta_{n}=0\) for all \(n\in\mathbb{N}\). Then \(\partial h_{1}(x)=\{2x\}\) and \(\nabla g_{1}(x)=2x+2\). We see that \(-\frac{1}{2}\in\arg\min_{x\in\mathbb{R}}g_{1}(x)+h_{1}(x)\), so the solution set is nonempty. Let
for each \(n\in\mathbb{N}\).
We see all conditions of Theorem 4.2 are satisfied.
We obtain \((I-\nabla g_{1})x_{n}=(I+\partial h_{1})y_{n}=y_{n}+2y_{n}=3y_{n}=x_{n}-2x_{n}-2\).
From this we obtain \(y_{n}=-\frac{x_{n}+2}{3}\) and
Let \(x_{1}=1\), then \(x_{2}=0.0005\), \(x_{3}=-0.33075\), \(x_{4}=-0.4434906\), \(x_{5}=-0.4810987\), \(x_{6}=-0.493649\), \(x_{7}=-0.4978412\), \(x_{8}=-0.4992446\), \(x_{9}=-0.4997169\), \(x_{10}=-0.49987785\), \(x_{11}=-0.49993425\), \(x_{12}=-0.4999553\), \(x_{13}=-0.4999642\), … . From the relation
and \(x_{1}=1\), it is easy to see that the sequence \(\{x_{n}\}_{n\in\mathbb{N}}\) is bounded and nonincreasing for all \(n\geq m\), for some \(m\in\mathbb{N}\). Hence \(\lim_{n\rightarrow \infty}x_{n}\) exists. Let \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\). From the relation
we see that \(\bar{x}=\frac{1}{2}\bar{x}-\frac{\bar{x}+2}{6}\). Therefore \(\bar{x}=-\frac{1}{2}\in\arg\min_{x\in H}g_{1}(x)+h_{1}(x)\).
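The iterates of Example 4.2 can be checked numerically. With \(h_{1}(x)=x^{2}\), \(g_{1}(x)=x^{2}+2x+1\) and \(r=1\), solving \(3y_{n}=x_{n}-(2x_{n}+2)\) gives \(y_{n}=-\frac{x_{n}+2}{3}\). The sketch below assumes the update has the form \(x_{n+1}=\alpha_{n}x_{n}+(1-\alpha_{n})(1-\beta_{n})y_{n}\) with \(\alpha_{n}=\frac{1}{2}\) and \(\beta_{n}=\frac{1}{1000n}\); this form is inferred from the limit relation and the listed iterates rather than quoted from the displayed recurrence, and the function name is illustrative.

```python
def example_4_2_iterates(x1=1.0, steps=13):
    """Return [x_1, ..., x_steps] for the recurrence assumed for Example 4.2.

    Assumed form (an inference, not a quotation):
        x_{n+1} = a_n * x_n + (1 - a_n) * (1 - b_n) * y_n,
    with a_n = 1/2, b_n = 1/(1000 n), V = I, theta_n = 0 and r = 1.
    """
    xs = [x1]
    for n in range(1, steps):
        x = xs[-1]
        # (I + dh1) y = (I - grad g1) x  =>  3 y = x - (2 x + 2)
        y = -(x + 2.0) / 3.0
        alpha_n = 0.5
        beta_n = 1.0 / (1000.0 * n)
        xs.append(alpha_n * x + (1.0 - alpha_n) * (1.0 - beta_n) * y)
    return xs


if __name__ == "__main__":
    for n, x in enumerate(example_4_2_iterates(), start=1):
        print(f"x_{n} = {x:.7f}")
```

Under these assumptions the second iterate is 0.0005 and the sequence approaches \(-\frac{1}{2}\), the minimizer of \(g_{1}+h_{1}=2x^{2}+2x+1\).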
In the following corollary, an iteration is used to find the solution of the following optimization problem:
Corollary 4.3
Let \(h_{1}:H\rightarrow (-\infty ,\infty]\) be a proper convex lower semicontinuous function. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Pi_{2,1} := \arg\min_{y\in H}h_{1}(y)\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in H \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\infty)\) and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{2,1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Remark 4.2
In 1976, Rockafellar [15] proved the following in the Hilbert space setting: Suppose that \(h_{1}\) is a proper convex lower semicontinuous function on H, that the solution set \(\arg\min_{y\in H}h_{1}(y)\) is nonempty, and that \(\liminf_{n\rightarrow \infty}\beta_{n}>0\). Let
then \(\{x_{n}\}\) converges weakly to a minimizer of \(h_{1}\). We see that Corollary 4.3 gives a different iteration which converges strongly to the solution of the following problem: Find \(\bar{x}\in \arg\min_{y\in H}h_{1}(y)\).
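For comparison, the classical proximal point iteration can be sketched in a few lines. The sketch assumes the iteration has the usual form \(x_{n+1}=\operatorname{prox}_{\beta_{n}h_{1}}(x_{n})=J_{\beta_{n}}^{\partial h_{1}}x_{n}\), and it uses \(h_{1}(x)=|x|\) on \(\mathbb{R}\) purely as an illustration; neither the test function nor the helper names come from the paper.

```python
def prox_abs(x, beta):
    """Proximal map of beta * |.| on the real line (soft-thresholding)."""
    if x > beta:
        return x - beta
    if x < -beta:
        return x + beta
    return 0.0


def proximal_point(x1, betas, prox):
    """Classical proximal point iteration x_{n+1} = prox(x_n, beta_n).

    `betas` is the sequence of step parameters (bounded away from zero);
    `prox(x, beta)` evaluates the proximal map of beta * h1 at x.
    """
    xs = [x1]
    for beta in betas:
        xs.append(prox(xs[-1], beta))
    return xs


if __name__ == "__main__":
    # Illustrative run: h1 = |.|, constant beta_n = 0.5, start at x_1 = 3.
    print(proximal_point(3.0, [0.5] * 8, prox_abs))
```

In this finite-dimensional illustration weak and strong convergence coincide, so the sketch shows only the mechanics of the iteration, not the distinction drawn in the remark.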
Next, a modified gradient-projection algorithm is used to find the solution of the following mathematical program:
Theorem 4.3
Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\) and \(\Pi_{3} := \arg\min_{y\in C}g_{1}(y)\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{L_{1}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{3}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is the unique solution to the hierarchical variational inequality:
Proof
Let \(h_{1}=i_{C}\), where \(i_{C}\) denotes the indicator function of C. From Lemma 2.5, \(\operatorname{prox}_{\lambda h_{1}}=P_{C}\), and
Theorem 4.3 follows immediately from Theorem 4.2. □
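The reduction from the proximal step to a projection can be made explicit. As a sketch of the standard computation, for any \(\lambda>0\) and \(x\in H\), with \(i_{C}\) the indicator function of the nonempty closed convex set C:

$$
\operatorname{prox}_{\lambda i_{C}}(x)
=\mathop{\arg\min}_{y\in H}\Bigl\{i_{C}(y)+\frac{1}{2\lambda}\|y-x\|^{2}\Bigr\}
=\mathop{\arg\min}_{y\in C}\|y-x\|^{2}
=P_{C}x.
$$

Consequently the forward-backward step \(\operatorname{prox}_{\lambda i_{C}}(x-\lambda\nabla g_{1}x)\) becomes the gradient-projection step \(P_{C}(x-\lambda\nabla g_{1}x)\), and \(\arg\min_{y\in H}(g_{1}+i_{C})(y)=\arg\min_{y\in C}g_{1}(y)\).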
Remark 4.3
Recall that the iteration defined by
is called a gradient-projection algorithm, where \(\nabla g_{1}\) is Lipschitz continuous. In 2011, Xu [16] used the gradient-projection algorithm and the relaxed gradient-projection algorithm to study the problem
and gave weak convergence theorems. Xu also used the viscosity nature of the gradient-projection algorithms and a regularized algorithm to obtain strong convergence theorems for this problem [16]. In Theorem 4.3, we establish a strong convergence theorem for this problem by a different modified gradient-projection algorithm and a different approach.
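A minimal sketch of the basic (unmodified) gradient-projection step \(x_{n+1}=P_{C}(x_{n}-\alpha\nabla g_{1}(x_{n}))\) follows. It is not the modified algorithm of Theorem 4.3, whose additional terms are omitted here. The test problem (\(g_{1}(x)=\frac{1}{2}\|x-p\|^{2}\) over a box) and all function names are illustrative assumptions.

```python
def project_box(v, lo, hi):
    """Metric projection of v onto the box [lo, hi]^n."""
    return [min(max(vi, lo), hi) for vi in v]


def gradient_projection(grad, project, x1, alpha, steps):
    """Iterate x_{n+1} = P_C(x_n - alpha * grad(x_n)).

    `grad` returns the gradient of g1, `project` is P_C, and alpha should
    lie in (0, 2 / L1), where L1 is a Lipschitz constant of the gradient.
    """
    x = list(x1)
    for _ in range(steps):
        g = grad(x)
        x = project([xi - alpha * gi for xi, gi in zip(x, g)])
    return x


if __name__ == "__main__":
    # Illustrative problem: g1(x) = 0.5 * ||x - p||^2 with p = (2, -0.5),
    # C = [-1, 1]^2, so L1 = 1 and the minimizer over C is (1, -0.5).
    p = [2.0, -0.5]
    grad = lambda x: [xi - pi for xi, pi in zip(x, p)]
    project = lambda v: project_box(v, -1.0, 1.0)
    print(gradient_projection(grad, project, [0.0, 0.0], 0.5, 60))
```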
At the end of this section, an iteration is used to find the solution of the following optimization problem:
Corollary 4.4
Let \(g_{1}:H\rightarrow (-\infty,\infty)\) be a convex Fréchet differentiable function with Fréchet derivative \(\nabla g_{1}\) on H. Take \(\mu\in \mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\nabla g_{1}\) is Lipschitz continuous with Lipschitz constant \(L_{1}\) and \(\Pi_{3,1} := \arg\min_{y\in H}g_{1}(y)\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in H\) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{L_{1}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{3,1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Proof
Let \(h_{1}=i_{H}\). By Lemma 2.5, we know that \(\operatorname{prox}_{\lambda h_{1}}=J_{\lambda}^{\partial i_{H}}= P_{H}=I\), and
Hence, Corollary 4.4 follows immediately from Theorem 4.3. □
Split feasibility problems and lasso problems
In the following theorem, a modified Byrne CQ iteration is used to find the solution of the following split feasibility problem: Find \(\bar{x}\in C\) such that \(A\bar{x}\in Q\).
Theorem 5.1
Let \(A:H\rightarrow H_{1}\) be a bounded linear operator, \(A^{*}\) be the adjoint of A. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Delta_{1}:=\{x\in C: Ax\in Q\}\neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{1}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Proof
Let \(g_{1}(x)=\frac{\|Ax-P_{Q}Ax\|^{2}_{2}}{2}\). It is easy to see that \(g_{1}(x)=\min_{y\in Q}\frac{1}{2}\|Ax-y\|^{2}\) is a convex function. Then for any \(v\in H_{1}\), we have
\(P_{Q}\) is a self-adjoint operator and \(P_{Q}^{2}=P_{Q}\); therefore, we have
Since \(P_{Q}:H_{1}\rightarrow Q\) is a nonexpansive mapping, we have
and
We also see that
By (5.1), (5.2), (5.3), (5.4), and (5.5), we have
This shows that \(g_{1}\) is Fréchet differentiable with Fréchet derivative \(\nabla g_{1}=A^{*}(A-P_{Q}A)\). Since A is a bounded operator and \(P_{Q}\) is a firmly nonexpansive mapping,
This shows that \(\nabla g_{1}\) is a Lipschitz function with Lipschitz constant \(\|A\|^{2}\). By the assumption \(\Delta_{1}\neq \emptyset\), we know that \(\Delta_{1}=\Pi_{3}\). Then we get the conclusion of Theorem 5.1 from Theorem 4.3. □
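With \(\nabla g_{1}(x)=A^{*}(Ax-P_{Q}Ax)\), Byrne's basic CQ step reads \(x_{n+1}=P_{C}(x_{n}-\alpha A^{*}(Ax_{n}-P_{Q}Ax_{n}))\). The sketch below implements this basic step only, not the modified iteration of Theorem 5.1. It assumes C and Q are boxes in \(\mathbb{R}^{2}\) and picks \(\alpha=1/\|A\|_{F}^{2}\), which lies in \((0,\frac{2}{\|A\|^{2}})\) because the Frobenius norm bounds the operator norm; the matrix, the sets, and the helper names are all illustrative.

```python
def matvec(A, x):
    """Return A x for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y (the adjoint applied to y)."""
    return [sum(A[i][j] * y[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def project_box(v, lo, hi):
    """Projection onto the box with componentwise bounds lo <= v <= hi."""
    return [min(max(vi, l), h) for vi, l, h in zip(v, lo, hi)]


def cq_iterate(A, x1, proj_C, proj_Q, steps):
    """Basic CQ iteration x <- P_C(x - alpha * A^T (A x - P_Q(A x)))."""
    fro2 = sum(a * a for row in A for a in row)  # ||A||_F^2 >= ||A||^2
    alpha = 1.0 / fro2
    x = list(x1)
    for _ in range(steps):
        Ax = matvec(A, x)
        residual = [u - q for u, q in zip(Ax, proj_Q(Ax))]
        grad = rmatvec(A, residual)  # gradient of 0.5 * dist(Ax, Q)^2
        x = proj_C([xi - alpha * gi for xi, gi in zip(x, grad)])
    return x


if __name__ == "__main__":
    A = [[2.0, 0.0], [1.0, 1.0]]
    proj_C = lambda v: project_box(v, [0.0, 0.0], [1.0, 1.0])
    proj_Q = lambda v: project_box(v, [1.0, 0.5], [2.0, 1.0])
    x = cq_iterate(A, [0.0, 0.0], proj_C, proj_Q, 200)
    print(x, matvec(A, x))
```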
Remark 5.1
In 2010, Xu [14] used various algorithms to establish weak convergence theorems in infinite-dimensional Hilbert spaces for the split feasibility problem (see Theorems 3.1, 3.3, 3.4, 4.1 and 5.7 of [14]). Xu [14] also established a strong convergence theorem for this problem in infinite-dimensional Hilbert spaces (see Theorem 5.5 of [14]). Theorem 5.1 gives an algorithm that converges strongly to a solution of the split feasibility problem, and this result improves Byrne’s CQ algorithm [9] and the results given by Xu [14].
By Theorem 5.1, we get the following result for the split feasibility problem.
Corollary 5.1
Let \(A:H\rightarrow H_{1}\) be a bounded linear operator, \(A^{*}\) be the adjoint of A. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<2\). Suppose that \(\Delta_{1} :=\{x\in C:Ax\in Q\} \neq\emptyset\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{1}}(0)\).
Proof
Let \(\theta_{n}=\theta=0\) for all \(n\in \mathbb{N}\) and \(V=I\) in Theorem 5.1, then Corollary 5.1 follows from Theorem 5.1. □
Next, an iteration is used to find the solution of the following lasso problem:
Theorem 5.2
Let A be an \(m\times n\) real matrix, \(x\in\mathbb{R}^{n}\), \(b\in \mathbb{R}^{m}\), and \(\gamma\geq0\) be a regularization parameter. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Delta_{2} := \arg\min_{x\in\mathbb{R}^{n}}\frac{1}{2}\|Ax-b\|^{2}_{2}+\gamma\|x\|_{1}\neq\emptyset\). A sequence \(\{x_{n}\}\subset \mathbb{R}^{n}\) is defined as follows: \(x_{1}\in \mathbb{R}^{n} \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{2}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Proof
Let \(g_{1}(x)=\frac{1}{2}\|Ax-b\|^{2}_{2}\), and \(h_{1}(x)=\gamma\|x\|_{1}\). For each \(v\in\mathbb{R}^{n}\),
Then \(\nabla g_{1}(x)=A^{*}(Ax-b)\), and \(h_{1}\) is a proper convex and lower semicontinuous function on \(\mathbb{R}^{n}\). Hence,
and \(\nabla g_{1}\) is a Lipschitz function with Lipschitz constant \(\|A\|^{2}\). Therefore, Theorem 5.2 follows from Theorem 4.2. □
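For the lasso, the two ingredients above combine into the plain proximal gradient step \(x_{n+1}=\operatorname{prox}_{\alpha\gamma\|\cdot\|_{1}}(x_{n}-\alpha A^{*}(Ax_{n}-b))\), where the proximal map of \(t\|\cdot\|_{1}\) is componentwise soft-thresholding. The sketch below shows this basic step only, not the modified iteration of Theorem 5.2. The step size \(\alpha=1/\|A\|_{F}^{2}\) is an assumption chosen because it lies in \((0,\frac{2}{\|A\|^{2}})\) without computing the operator norm, and the function names are illustrative.

```python
def soft_threshold(v, t):
    """Componentwise proximal map of t * ||.||_1."""
    out = []
    for vi in v:
        if vi > t:
            out.append(vi - t)
        elif vi < -t:
            out.append(vi + t)
        else:
            out.append(0.0)
    return out


def proximal_gradient_lasso(A, b, gamma, x1, steps):
    """Iterate x <- soft_threshold(x - alpha * A^T (A x - b), alpha * gamma)."""
    m, n = len(A), len(A[0])
    alpha = 1.0 / sum(a * a for row in A for a in row)  # 1 / ||A||_F^2
    x = list(x1)
    for _ in range(steps):
        # residual = A x - b
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        # grad = A^T residual, the gradient of 0.5 * ||A x - b||^2
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        x = soft_threshold([xj - alpha * gj for xj, gj in zip(x, grad)],
                           alpha * gamma)
    return x


if __name__ == "__main__":
    # With A = I the lasso solution is soft_threshold(b, gamma).
    A = [[1.0, 0.0], [0.0, 1.0]]
    print(proximal_gradient_lasso(A, [3.0, -0.2], 1.0, [0.0, 0.0], 100))
```

When A is the identity the minimizer is known in closed form, which makes this a convenient sanity check for the step size and the thresholding level.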
The following is a special case of Theorem 5.2.
Corollary 5.2
Let A be an \(m\times n\) real matrix, \(x\in\mathbb{R}^{n}\), \(b\in \mathbb{R}^{m}\), and \(\gamma\geq0\) be a regularization parameter. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<2\). Suppose that \(\Delta_{2} := \arg\min_{x\in\mathbb{R}^{n}}\frac{1}{2}\|Ax-b\|^{2}_{2}+\gamma\|x\|_{1}\neq\emptyset\). A sequence \(\{x_{n}\}\subset\mathbb{R}^{n}\) is defined as follows: \(x_{1}\in\mathbb{R}^{n}\) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Delta_{2}}(0)\).
Proof
Let \(\theta_{n}=\theta=0\) for all \(n\in\mathbb{N}\) and \(V=I\) in Theorem 5.2, then Corollary 5.2 follows from Theorem 5.2. □
Applying Corollary 4.1, we use an iteration to find a solution of the split feasibility problem: Find \(\bar{x}\in C\) such that \(A\bar{x}\in Q\).
Theorem 5.3
Let \(A:H\rightarrow H_{1}\) be a bounded linear operator, \(A^{*}\) be the adjoint of A. Let \(A_{f_{1}}\) be defined as (L4.2) in Lemma 2.4. Take \(\mu\in\mathbb{R}\) such that \(0<\mu<\frac{2\bar{\gamma}}{L^{2}}\). Suppose that \(\Pi_{13}:=\{x\in C: Ax\in Q\}\neq\emptyset\). Let \(f_{1}(x,y)=\langle y-x, A^{*}(A-P_{Q}A)x\rangle\). A sequence \(\{x_{n}\}\subset H\) is defined as follows: \(x_{1}\in C \) is chosen arbitrarily, and
for each \(n\in\mathbb{N}\), \(\alpha\in(0,\frac{2}{\|A\|^{2}})\), and \(\{\alpha_{n},\beta_{n}\}\subset(0,1)\). Assume that conditions (D) hold. Then \(\lim_{n\rightarrow \infty}x_{n}=\bar{x}\), where \(\bar{x}=P_{\Pi_{13}}(\bar{x}-V\bar{x})\). Further, \(\bar{x}\) is also the unique solution to the hierarchical variational inequality:
Proof
Let \(g_{1}(x)=\frac{\|Ax-P_{Q}Ax\|^{2}_{2}}{2}\). The proof of Theorem 5.1 shows that \(g_{1}\) is Fréchet differentiable with Fréchet derivative \(\nabla g_{1}=A^{*}(A-P_{Q}A)\), and that \(\nabla g_{1}\) is a Lipschitz function with Lipschitz constant \(\|A\|^{2}\). Applying Corollary 4.1 and following the same argument as in Theorem 5.1, we can prove Theorem 5.3. □
Image deblurring problem
This section focuses on the image deblurring problem, which has received a lot of attention in recent years. Researchers have proposed many algorithms for this problem based on different deblurring models; for example, see [26]. Now, by Corollary 5.2, we can consider the image deblurring problem.
All pixels of the original images described in the examples were first scaled into the range between 0 and 1.
The image went through a Gaussian blur of size \(9\times9\) and standard deviation 4 (applied by the MATLAB functions imfilter and fspecial) followed by additive zero-mean white Gaussian noise with standard deviation \(10^{-3}\). The original and observed images are given in Figures 1-3.
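The degradation model can be sketched without MATLAB. The following Python sketch builds a normalized \(9\times9\) Gaussian kernel with standard deviation 4, applies it with zero padding at the border, and adds zero-mean Gaussian noise with standard deviation \(10^{-3}\). The zero-padding choice and all function names are assumptions made for illustration; the boundary option actually used in the experiments is not stated in the text.

```python
import math
import random


def gaussian_kernel(size=9, sigma=4.0):
    """Normalized size x size Gaussian kernel (entries sum to 1)."""
    half = (size - 1) / 2.0
    k = [[math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2.0 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def blur_zero_padded(image, kernel):
    """Filter `image` (list of rows) with `kernel`, zero outside the image."""
    rows, cols = len(image), len(image[0])
    size = len(kernel)
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(size):
                rr = r + i - half
                if rr < 0 or rr >= rows:
                    continue  # rows outside the image contribute zero
                for j in range(size):
                    cc = c + j - half
                    if 0 <= cc < cols:
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def degrade(image, noise_std=1e-3, seed=0):
    """Blur an image scaled to [0, 1], then add white Gaussian noise."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    blurred = blur_zero_padded(image, gaussian_kernel())
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in blurred]


if __name__ == "__main__":
    flat = [[1.0] * 16 for _ in range(16)]  # synthetic all-white image
    observed = degrade(flat)
    print(observed[8][8], observed[0][0])
```

In the lasso formulation of Corollary 5.2, the blur plays the role of the matrix A and the observed image plays the role of b.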
Remark 6.1
In the literature, we may observe that there are many fast algorithms for the image deblurring problem. Here, we show that we can also consider this problem by Corollary 5.2.
Conclusion and remarks
In this paper, we apply a recent fixed point theorem in [17] to study mathematical programming for the sum of two convex functions, mathematical programming for a convex function, the split feasibility problem, and the lasso problem. We establish strong convergence theorems for these problems. The study of such problems will give many other applications in science, nonlinear analysis, and statistics.
References
 1.
Tibshirani, R: Regression shrinkage and selection via the lasso. J. R. Stat. Soc., Ser. B 58, 267-288 (1996)
 2.
Combettes, PL, Wajs, VR: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168-1200 (2005)
 3.
Xu, HK: Properties and iterative methods for the lasso and its variants. Chin. Ann. Math., Ser. B 35, 501-518 (2014)
 4.
Wang, Y, Xu, HK: Strong convergence for the proximal gradient methods. J. Nonlinear Convex Anal. 15, 581-593 (2014)
 5.
Douglas, J, Rachford, HH: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421-439 (1956)
 6.
Bauschke, HH, Combettes, PL: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
 7.
Tseng, P: Further applications of a splitting algorithm to decomposition in variational inequalities and convex programming. Math. Program., Ser. B 48, 249-263 (1990)
 8.
Censor, Y, Elfving, T: A multiprojection algorithm using Bregman projection in a product space. Numer. Algorithms 8, 221-239 (1994)
 9.
Byrne, C: Iterative oblique projection onto convex sets and the split feasibility problem. Inverse Probl. 18, 441-453 (2002)
 10.
Byrne, C: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20, 103-120 (2004)
 11.
Censor, Y, Bortfeld, T, Martin, B, Trofimov, A: A unified approach for inversion problems in intensity-modulated radiation therapy. Phys. Med. Biol. 51, 2353-2365 (2003)
 12.
López, G, Martín-Márquez, V, Xu, HK: Iterative algorithms for the multiple-sets split feasibility problem. In: Censor, Y, Jiang, M, Wang, G (eds.) Biomedical Mathematics: Promising Directions in Imaging, Therapy Planning and Inverse Problems, pp. 243-279. Medical Physics Publishing, Madison (2010)
 13.
Stark, H: Image Recovery: Theory and Applications. Academic Press, San Diego (1987)
 14.
Xu, HK: Iterative methods for the split feasibility problem in infinite-dimensional Hilbert spaces. Inverse Probl. 26, 105018 (2010)
 15.
Rockafellar, RT: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14, 877-898 (1976)
 16.
Xu, HK: Averaged mappings and the gradient-projection algorithm. J. Optim. Theory Appl. 150, 360-378 (2011)
 17.
Yu, ZT, Lin, LJ: Hierarchical problems with applications to mathematical programming with multiple sets split feasibility constraints. Fixed Point Theory Appl. 2013, 283 (2013)
 18.
Takahashi, W: Nonlinear Functional Analysis: Fixed Point Theory and Its Applications. Yokohama Publishers, Yokohama (2000)
 19.
Fan, K: A minimax inequality and applications. In: Shisha, O (ed.) Inequalities III, pp. 103-113. Academic Press, San Diego (1972)
 20.
Blum, E, Oettli, W: From optimization and variational inequalities to equilibrium problems. Math. Stud. 63, 123-146 (1994)
 21.
Combettes, PL, Hirstoaga, SA: Equilibrium programming in Hilbert spaces. J. Nonlinear Convex Anal. 6, 117-136 (2005)
 22.
Takahashi, S, Takahashi, W, Toyoda, M: Strong convergence theorems for maximal monotone operators with nonlinear mappings in Hilbert spaces. J. Optim. Theory Appl. 147, 27-41 (2010)
 23.
Ekeland, I, Temam, R: Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976)
 24.
Combettes, PL: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53(5-6), 475-504 (2004)
 25.
Baillon, JB, Haddad, G: Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Isr. J. Math. 26, 137-150 (1977)
 26.
Chambolle, A: An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89-97 (2004)
Acknowledgements
Prof. CS Chuang was supported by the National Science Council of the Republic of China.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All authors contributed equally to this work. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Chuang, C.S., Yu, Z. & Lin, L. Mathematical programming for the sum of two convex functions with applications to lasso problem, split feasibility problems, and image deblurring problem. Fixed Point Theory Appl 2015, 143 (2015). https://doi.org/10.1186/s13663-015-0388-0
MSC
 90C33
 90C34
 90C59
Keywords
 lasso problem
 mathematical programming for the sum of two functions
 split feasibility problem
 gradient-projection algorithm
 proximal point algorithm