An improved fast iterative shrinkage thresholding algorithm with an error for image deblurring problem
Fixed Point Theory and Algorithms for Sciences and Engineering volume 2021, Article number: 18 (2021)
Abstract
In this paper, we introduce a new iterative forward-backward splitting method with an error for solving the variational inclusion problem of the sum of two monotone operators in real Hilbert spaces. We suggest and analyze this method under some mild appropriate conditions imposed on the parameters such that another strong convergence theorem for this problem is obtained. We also apply our main result to improve the fast iterative shrinkage thresholding algorithm (IFISTA) with an error for solving the image deblurring problem. Finally, we provide numerical experiments to illustrate the convergence behavior and to show that the inertial technique gives IFISTA fast processing with high performance and fast convergence with good performance.
1 Introduction
Let C be a nonempty closed convex subset of a real Hilbert space H. The variational inclusion problem is a fundamental problem in optimization theory that can be applied in many areas of science and applied science, engineering, economics, and medicine [1–9], as well as in image processing, machine learning, and modeling intensity-modulated radiation therapy treatment planning [10–15]. It is to find \(x^{*} \in H\) such that
$$ 0 \in Ax^{*}+Bx^{*}, $$
(1.1)
where \(A:H \rightarrow H\) is an operator and \(B: D(B) \subset H \rightarrow 2^{H}\) is a set-valued operator.
To solve the variational inclusion problem (1.1) via fixed point theory, we define the mapping \(J_{r}^{A,B} : H\rightarrow D(B)\) as follows:
$$ J_{r}^{A,B} = J_{r}^{B}(I-rA), $$
where \(J_{r}^{B} = (I+rB)^{-1}\) is the resolvent operator of B for \(r>0\). For \(x \in H\), we see that
$$ x = J_{r}^{A,B}x \quad \Leftrightarrow \quad x-rAx \in x+rBx \quad \Leftrightarrow \quad 0 \in Ax+Bx, $$
which shows that the fixed point set of \(J_{r}^{A,B}\) coincides with the solution set \((A+B)^{-1}(0)\). This suggests the following iteration process: \(x_{1} \in C\) and
$$ x_{n+1} = J_{r_{n}}^{B}(x_{n}-r_{n}Ax_{n}),\quad n \in \mathbb{N}, $$
where \(\{r_{n}\} \subset (0,\infty )\) and \(D(B) \subset C\). This method is called the forward-backward splitting algorithm [16, 17].
In applications, we always let \(A=\nabla F\) and \(B = \partial G\), where \(F:H \rightarrow \mathbb{R}\) is a convex and differentiable function and \(G:H \rightarrow \mathbb{R}\cup \{+\infty \}\) is a convex and lower semi-continuous function, ∇F is the gradient of F, assumed L-Lipschitz continuous, and ∂G is the subdifferential of G, which is defined by
$$ \partial G(x) = \bigl\{ u \in H : G(y) \geq G(x)+\langle u,y-x \rangle , \forall y \in H \bigr\} . $$
Then problem (1.1) reduces to the following convex minimization problem:
$$ \min_{x \in H} F(x)+G(x). $$
(1.2)
Recall that the proximity operator \(\text{prox}_{G}\) of G is defined for all \(x \in H\) as follows:
$$ \text{prox}_{G}(x) = \mathop{\operatorname{argmin}}_{y \in H} \biggl\{ G(y)+\frac{1}{2} \Vert y-x \Vert ^{2} \biggr\} . $$
For \(x \in H\) and \(r>0\), we see that
$$ y = \text{prox}_{rG}(x) \quad \Leftrightarrow \quad 0 \in r\partial G(y)+y-x \quad \Leftrightarrow \quad x \in y+r\partial G(y) \quad \Leftrightarrow \quad y = (I+r\partial G)^{-1}x. $$
Therefore,
$$ \text{prox}_{rG} = (I+r\partial G)^{-1} = J_{r}^{\partial G}. $$
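For instance, when \(G = \lambda \|\cdot \|_{1}\), the proximity operator is the componentwise soft-thresholding map, and the minimization in the definition above can be checked numerically. The following is a minimal NumPy sketch (Python here purely for illustration; the paper's experiments use MATLAB):

```python
import numpy as np

def prox_l1(x, t):
    """Proximity operator of t*||.||_1: componentwise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# Check the prox definition numerically on one coordinate:
# prox_{tG}(x) minimizes |y| + (1/(2t))*(y - x)^2 for G(y) = |y|.
x, t = 2.5, 1.0
grid = np.linspace(-5, 5, 100001)
objective = np.abs(grid) + (grid - x) ** 2 / (2 * t)
y_star = grid[np.argmin(objective)]
assert abs(y_star - prox_l1(np.array([x]), t)[0]) < 1e-3
```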
Many researchers have proposed and analyzed the iterative shrinkage thresholding algorithms for solving the convex minimization problem (1.2) under a few specific conditions as follows.
In the weak convergence theorems, Lions and Mercier [16] first introduced the forward-backward splitting (FBS) algorithm:
$$ x_{n+1} = \text{prox}_{\lambda _{n} G}\bigl(x_{n}-\lambda _{n}\nabla F(x_{n})\bigr),\quad n \in \mathbb{N}, $$
where \(x_{1} \in H\) and \(\{ \lambda _{n} \} \subset (0,2/L)\). Later, Moudafi and Oliny [18] introduced the iterative forward-backward splitting (IFBS) algorithm:
$$ \textstyle\begin{cases} y_{n} = x_{n}+\theta _{n}(x_{n}-x_{n-1}), \\ x_{n+1} = \text{prox}_{\lambda _{n} G}\bigl(y_{n}-\lambda _{n}\nabla F(x_{n})\bigr), \end{cases} $$
where \(x_{0}, x_{1} \in H\), \(\{\theta _{n}\} \subset [0,a] \subset [0,1)\), \(\{ \lambda _{n}\} \subset [b,c] \subset (0,2/L)\) such that \(\sum_{n=1}^{\infty }\theta _{n} \|x_{n}-x_{n-1}\|^{2} < \infty \). In our research, we focus attention on the inertial parameter \(\theta _{n}\), which controls the momentum of \(x_{n} - x_{n-1}\) in the fast iterative shrinkage thresholding algorithm (FISTA) of Beck and Teboulle [19] as follows:
$$ \textstyle\begin{cases} x_{n} = \text{prox}_{\frac{1}{L} G}\bigl(y_{n}-\frac{1}{L}\nabla F(y_{n})\bigr), \\ t_{n+1} = \frac{1+\sqrt{1+4t_{n}^{2}}}{2}, \\ \theta _{n} = \frac{t_{n}-1}{t_{n+1}}, \\ y_{n+1} = x_{n}+\theta _{n}(x_{n}-x_{n-1}), \end{cases} $$
where \(y_{1} = x_{0} \in H\) and \(t_{1}=1\). In FISTA, we observe that \(y_{n}\) is known before \(x_{n}\), where the sequence \(\{x_{n}\}\) converges weakly to the solution of the convex minimization problem (1.2). Recently, Hanjing and Suantai [20] introduced the forward-backward modified W-algorithm (FBMWA) as follows:
for all \(n \in \mathbb{N}\), where \(x_{0},x_{1} \in H\) and \(\{\alpha _{n}\} \subset [0,a] \subset [0,1)\), \(\{\beta _{n}\} \subset [0,1]\), \(\{\gamma _{n}\} \subset [b,c] \subset (0,1)\), and \(\{ \theta _{n} \} \subset [0,\infty ) \) such that \(\sum_{n=1}^{\infty }\theta _{n} < \infty \), and \(\{\lambda _{n}\} \subset (0,2/L)\) such that \(\lambda _{n} \rightarrow \lambda \in (0,2/L)\) as \(n \rightarrow \infty \). In the same way, Padcharoen and Kumam [21] introduced the forward-backward modified MM-algorithm (FBMMMA) as follows:
where \(x_{0},x_{1} \in H\) and \(\{\alpha _{n}\},\{\beta _{n}\},\{\gamma _{n}\} \subset [0,1]\) such that \(\alpha _{n}+\beta _{n} \in [0,1]\) and \(\{ \theta _{n} \} \subset (0,1)\) such that \(\sum_{n=1}^{\infty }\theta _{n} < \infty \), and \(\{\lambda _{n}\} \subset (0,2/L)\) such that \(\lambda _{n} \rightarrow \lambda \in (0,2/L)\) as \(n \rightarrow \infty \). Other weak convergence theorems of all those algorithms are obtained.
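Among the weak-convergence schemes above, FISTA is the reference point for our later experiments. Its standard Beck–Teboulle form for the objective \(\|Ax-b\|_{2}^{2}+\lambda \|x\|_{1}\) (the gradient and Lipschitz constant used in Sect. 4) can be sketched in NumPy as follows; this is an illustrative sketch with a fixed step \(1/L\), not the exact pseudocode of any algorithm above:

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: prox of t*||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista(A, b, lam, n_iter=1000):
    """Beck-Teboulle FISTA for min ||Ax-b||_2^2 + lam*||x||_1,
    with gradient 2*A^T(Ax-b) and step 1/L, where L = 2*||A||_2^2."""
    L = 2 * np.linalg.norm(A, 2) ** 2
    x = y = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        x_new = soft(y - (2.0 / L) * A.T @ (A @ y - b), lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x)   # inertial (momentum) step
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
x_true = np.zeros(10); x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true                      # noiseless toy data
x_hat = fista(A, b, lam=1e-3)
assert np.linalg.norm(x_hat - x_true) < 1e-2
```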
In the strong convergence theorems, Verma and Shukla [22] introduced a new accelerated proximal gradient algorithm (NAGA) as follows:
where \(x_{0},x_{1} \in H\), \(\{\alpha _{n}\}, \{\theta _{n}\} \subset (0,1]\), and \(\{\lambda _{n}\} \subset (0,2/L)\). They proved that the sequence \(\{x_{n}\}\) of NAGA converges strongly under the condition \(\frac{\|x_{n}-x_{n-1}\|}{\theta _{n}} \rightarrow 0\) as \(n \rightarrow \infty \). How should the parameter \(\theta _{n}\) be chosen? We leave this for the reader to verify. In their proof, we observe that NAGA still holds under the conditions \(\alpha _{n} \rightarrow 0\) and \(\frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1}\| \rightarrow 0\) as \(n\rightarrow \infty \), and the parameter \(\theta _{n}\) can be chosen as
where \(\{\omega _{n}\}\) is a positive sequence such that \(\omega _{n} = o(\alpha _{n})\). Cholamjiak et al. [23] introduced the strong convergence theorem for the inclusion problem (SCTIP) by letting \(S=I\), \(A=\nabla F\), and \(B=\partial G\) as follows:
where \(x_{0},x_{1} \in C\) and f is a contraction of C into itself, and \(\{\alpha _{n}\},\{\beta _{n} \} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0,2/L)\), and \(\{\theta _{n} \} \subset [0,\theta ]\) such that \(\theta \in [0,1)\). They proved that the sequence \(\{x_{n}\}\) of SCTIP converges strongly under the following conditions:
- (C1) \(\lim_{n\rightarrow \infty } \alpha _{n} = 0\) and \(\sum_{n=1}^{\infty }\alpha _{n} = \infty \),
- (C2) \(\liminf_{n\rightarrow \infty } \beta _{n} (1-\beta _{n}) > 0\),
- (C3) \(0 < \liminf_{n\rightarrow \infty } \lambda _{n} \leq \limsup_{n \rightarrow \infty } \lambda _{n} < 2/L\),
- (C4) \(\lim_{n\rightarrow \infty } \frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1} \| = 0\).
Moreover, many researchers have proposed and analyzed the iterative forward-backward scheme with a variable step size, which does not depend on the Lipschitz constant of the operator \(A=\nabla F\) (see also [24, 25]).
In our research, we consider the forward-backward splitting method with an error as follows: \(x_{1} \in C\) and
where \(\{\lambda _{n}\} \subset (0,\infty )\), \(\{e_{n}\} \subset H\), \(D(B) \subset C\), and \(J_{\lambda _{n}}^{B} = (I+\lambda _{n} B)^{-1}\). We introduce a new iterative forward-backward splitting method with an error for solving the variational inclusion problem (1.1) as follows:
where \(x_{0},x_{1} \in C\) and f is a contraction of C into itself, and \(\{\alpha _{n}\} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0,2/L)\), \(\{e_{n} \} \subset H\), and \(\{\theta _{n}\} \subset [0,\theta ]\) such that \(\theta \in [0,1)\). Moreover, it can be applied to improve the fast iterative shrinkage thresholding algorithm (IFISTA) with an error for solving the convex minimization problem (1.2) by letting \(A=\nabla F\) and \(B=\partial G\) as follows:
This yields a self-adaptive scheme with fast convergence properties under some mild conditions compared to the existing algorithms in the literature. The outline of our research is as follows: in Sect. 2, we give some well-known definitions and lemmas, which are used in Sect. 3 to prove the strong convergence theorem of IFISTA for solving the variational inclusion problem (1.1); we apply this result in Sect. 4 to the image deblurring problem, a special case of the convex minimization problem (1.2); and in Sect. 5, we provide numerical experiments to illustrate the fast processing with high performance and the fast convergence with good performance of IFISTA by the inertial technique.
2 Preliminaries
Let C be a nonempty closed convex subset of a real Hilbert space H. We will use the following notation: → denotes strong convergence, ⇀ denotes weak convergence,
$$ \omega _{w}(x_{n}) = \bigl\{ x : x_{n_{k}} \rightharpoonup x \text{ for some subsequence } \{x_{n_{k}}\} \text{ of } \{x_{n}\} \bigr\} $$
denotes the weak limit set of \(\{x_{n}\}\), and \(\text{Fix}(T) = \{x:x=Tx \}\) denotes the fixed point set of the mapping T.
Recall that the metric projection \(P_{C}: H \rightarrow C\) is defined as follows: for each \(x \in H\), \(P_{C} x\) is the unique point in C satisfying
$$ \Vert x-P_{C}x \Vert = \min_{y \in C} \Vert x-y \Vert . $$
The operator \(T:H\rightarrow H\) is called:
- (i) monotone if
$$ \langle x-y,Tx-Ty \rangle \geq 0,\quad \forall x,y \in H, $$
- (ii) L-Lipschitzian with \(L>0\) if
$$ \Vert Tx-Ty \Vert \leq L \Vert x-y \Vert ,\quad \forall x,y \in H, $$
- (iii) k-contraction if it is k-Lipschitzian with \(k \in (0,1)\),
- (iv) nonexpansive if it is 1-Lipschitzian,
- (v) firmly nonexpansive if
$$ \Vert Tx-Ty \Vert ^{2} \leq \Vert x-y \Vert ^{2} - \bigl\Vert (I-T)x-(I-T)y \bigr\Vert ^{2},\quad \forall x,y \in H, $$
- (vi) α-strongly monotone with \(\alpha > 0\) if
$$ \langle Tx-Ty,x-y \rangle \geq \alpha \Vert x-y \Vert ^{2},\quad \forall x,y \in H, $$
- (vii) α-inverse strongly monotone with \(\alpha > 0\) if
$$ \langle Tx-Ty,x-y \rangle \geq \alpha \Vert Tx-Ty \Vert ^{2},\quad \forall x,y \in H. $$
Let B be a mapping of H into \(2^{H}\). The domain and the range of B are denoted by \(D(B) = \{x\in H : Bx \neq \emptyset \}\) and \(R(B) = \bigcup \{Bx:x \in D(B) \}\), respectively. The inverse of B, denoted by \(B^{-1}\), is defined by \(x\in B^{-1}y\) if and only if \(y\in Bx\). A multi-valued mapping B is said to be a monotone operator on H if \(\langle x-y,u-v\rangle \geq 0\) for all \(x,y \in D(B)\), \(u \in Bx\), and \(v \in By\). A monotone operator B on H is said to be maximal if its graph is not strictly contained in the graph of any other monotone operator on H. For a maximal monotone operator B on H and \(r>0\), we define the single-valued resolvent operator \(J_{r}^{B}:H\rightarrow D(B)\) by \(J_{r}^{B}=(I+rB)^{-1}\). It is well known that \(J_{r}^{B}\) is firmly nonexpansive and \(\text{Fix}(J_{r}^{B})=B^{-1}(0)\).
We collect together some known lemmas which are the main tools in proving our result.
Lemma 2.1
([26])
Let C be a nonempty closed convex subset of a real Hilbert space H. Then:
- (i) \(\|x \pm y\|^{2} = \|x\|^{2} \pm 2 \langle x,y \rangle + \|y\|^{2}\), \(\forall x,y\in H\),
- (ii) \(\|\lambda x+(1-\lambda )y\|^{2} = \lambda \|x\|^{2}+(1-\lambda )\|y \|^{2}-\lambda (1-\lambda )\|x-y\|^{2}\), \(\forall x,y\in H\), \(\lambda \in \mathbb{R}\),
- (iii) \(z=P_{C}x \Leftrightarrow \langle x-z,z -y \rangle \geq 0\), \(\forall x\in H\), \(y \in C\),
- (iv) \(z=P_{C}x \Leftrightarrow \|x-z \|^{2} \leq \|x-y\|^{2} - \| y-z \|^{2}\), \(\forall x\in H\), \(y \in C\),
- (v) \(\| P_{C} x - P_{C} y\|^{2} \leq \langle x-y,P_{C} x - P_{C} y \rangle \), \(\forall x,y\in H\).
Lemma 2.2
([27])
Let H and K be two real Hilbert spaces, and let \(T:K \rightarrow K\) be a firmly nonexpansive mapping such that \(\|(I-T)x\|\) is a convex function from K to \(\overline{\mathbb{R}}=[-\infty ,+\infty ]\). Let \(A:H\rightarrow K\) be a bounded linear operator and \(f(x) = \frac{1}{2}\|(I-T)Ax\|^{2} \) for all \(x\in H\). Then:
- (i) f is convex and differentiable,
- (ii) \(\nabla f(x) = A^{*}(I-T)Ax \) for all \(x\in H\), where \(A^{*}\) denotes the adjoint of A,
- (iii) f is weakly lower semi-continuous on H,
- (iv) ∇f is \(\|A\|^{2}\)-Lipschitzian.
Lemma 2.3
([27])
Let H be a real Hilbert space and \(T: H\rightarrow H\) be an operator. The following statements are equivalent:
- (i) T is firmly nonexpansive,
- (ii) \(\|Tx-Ty\|^{2} \leq \langle x-y,Tx-Ty \rangle \), \(\forall x,y \in H\),
- (iii) \(I-T\) is firmly nonexpansive.
Lemma 2.4
([28])
Let C be a nonempty closed convex subset of a real Hilbert space H. Let the mapping \(A:C\rightarrow H\) be α-inverse strongly monotone and \(r>0\) be a constant. Then we have
$$ \bigl\Vert (I-rA)x-(I-rA)y \bigr\Vert ^{2} \leq \Vert x-y \Vert ^{2} - r(2\alpha -r) \Vert Ax-Ay \Vert ^{2} $$
for all \(x,y \in C\). In particular, if \(0< r\leq 2\alpha \), then \(I-rA\) is nonexpansive.
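Lemma 2.4 can be checked numerically for a gradient-type operator \(A(x) = Mx\) with M symmetric positive semidefinite, which is \(\frac{1}{\|M\|}\)-inverse strongly monotone. The following NumPy sketch is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
C = rng.standard_normal((8, 5))
M = C.T @ C                              # symmetric PSD, so A(x) = Mx is ism
alpha = 1.0 / np.linalg.norm(M, 2)       # A is alpha-inverse strongly monotone

r = 1.9 * alpha                          # any r in (0, 2*alpha]
for _ in range(100):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    lhs = np.linalg.norm((x - r * M @ x) - (y - r * M @ y))
    rhs_sq = (np.linalg.norm(x - y) ** 2
              - r * (2 * alpha - r) * np.linalg.norm(M @ (x - y)) ** 2)
    assert lhs ** 2 <= rhs_sq + 1e-9                 # the Lemma 2.4 inequality
    assert lhs <= np.linalg.norm(x - y) + 1e-12      # I - rA is nonexpansive
```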
Lemma 2.5
([29] (Demiclosedness principle))
Let C be a nonempty closed convex subset of a real Hilbert space H, and let \(S:C \rightarrow C\) be a nonexpansive mapping with \(\textit{Fix}(S)\neq \emptyset \). If the sequence \(\{x_{n}\}\subset C\) converges weakly to x and the sequence \(\{(I-S)x_{n}\}\) converges strongly to y, then \((I-S)x = y\); in particular, if \(y=0\), then \(x\in \textit{Fix}(S)\).
Lemma 2.6
([30])
Let \(\{a_{n}\}\) and \(\{c_{n}\}\) be sequences of nonnegative real numbers such that
$$ a_{n+1} \leq (1-\delta _{n})a_{n}+b_{n}+c_{n},\quad n \geq 0, $$
where \(\{\delta _{n} \}\) is a sequence in \((0,1)\) and \(\{b_{n}\}\) is a real sequence. Assume that \(\sum_{n=0}^{\infty }c_{n} < \infty \). Then the following results hold:
- (i) if \(b_{n} \leq \delta _{n} M\) for some \(M\geq 0\), then \(\{a_{n}\}\) is a bounded sequence,
- (ii) if \(\sum_{n=0}^{\infty }\delta _{n} = \infty \) and \(\limsup_{n\rightarrow \infty } b_{n}/\delta _{n} \leq 0\), then \(\lim_{n\rightarrow \infty }a_{n}=0\).
Lemma 2.7
([31])
Assume that \(\{s_{n}\}\) is a sequence of nonnegative real numbers such that
$$ s_{n+1} \leq (1-\gamma _{n})s_{n}+\gamma _{n}\delta _{n}+\rho _{n},\quad n \geq 0, $$
and
$$ s_{n+1} \leq s_{n}-\eta _{n}+\rho _{n},\quad n \geq 0, $$
where \(\{\gamma _{n}\}\) is a sequence in \((0,1)\), \(\{\eta _{n}\}\) is a sequence of nonnegative real numbers, and \(\{\delta _{n}\}\), \(\{\rho _{n}\}\) are real sequences such that
- (i) \(\sum_{n=0}^{\infty }\gamma _{n} = \infty \),
- (ii) \(\lim_{n\rightarrow \infty } \rho _{n} = 0\),
- (iii) if \(\lim_{k\rightarrow \infty } \eta _{n_{k}} = 0\), then \(\limsup_{k\rightarrow \infty } \delta _{n_{k}} \leq 0\) for any subsequence \(\{n_{k}\}\) of \(\{n\}\).
Then \(\lim_{n\rightarrow \infty } s_{n} = 0\).
3 Main result
Theorem 3.1
Let C be a nonempty closed convex subset of a real Hilbert space H. Let A be an α-inverse strongly monotone mapping of H into itself and B be a maximal monotone operator on H such that the domain of B is included in C, and assume that \((A+B)^{-1}(0)\) is nonempty. Let \(J_{\lambda }^{B}=(I+\lambda B)^{-1}\) be the resolvent of B for \(\lambda > 0\) and f be a k-contraction mapping of C into itself. Let \(x_{0},x_{1} \in C\) and \(\{x_{n}\} \subset C\) be a sequence generated by
for all \(n \in \mathbb{N}\), where \(\{\alpha _{n}\} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0,2\alpha )\), \(\{e_{n}\} \subset H\), and \(\{\theta _{n}\} \subset [0,\theta ]\) such that \(\theta \in [0,1)\) satisfy the following conditions:
- (C1) \(\lim_{n\rightarrow \infty } \alpha _{n} = 0\) and \(\sum_{n=1}^{\infty }\alpha _{n} = \infty \),
- (C2) \(0< a\leq \lambda _{n} \leq b < 2\alpha \) for some \(a,b>0\),
- (C3) \(\lim_{n\rightarrow \infty }\frac{\|e_{n}\|}{\alpha _{n}}=0\),
- (C4) \(\sum_{n=1}^{\infty }\|e_{n}\| < \infty \) and \(\lim_{n\rightarrow \infty } \frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1} \| = 0\).
Then the sequence \(\{x_{n}\}\) converges strongly to a point \(x^{*} \in (A+B)^{-1}(0)\) where \(x^{*} = P_{(A+B)^{-1}(0)} f(x^{*})\).
Proof
Picking \(z\in (A+B)^{-1}(0)\) and fixing \(n \in \mathbb{N}\), it follows that \(z=J_{\lambda _{n}}^{B} (z-\lambda _{n} Az)\). Firstly, we will show that \(\{x_{n}\}\), \(\{y_{n}\}\), and \(\{z_{n}\}\) are bounded. Since
therefore, by nonexpansiveness of \(J_{\lambda _{n}}^{B}\) and \(I-\lambda _{n} A\), we have
It follows by the same arguments again that
So, by condition (C4) and putting \(M = \frac{1}{1-k} ( \|f(z)-z\|+ \sup_{n\in \mathbb{N}} \frac{\theta _{n}}{\alpha _{n}} \|x_{n}-x_{n-1}\| ) \geq 0\) in Lemma 2.6 (i), we conclude that the sequence \(\{\|x_{n}-z\|\}\) is bounded. That is, the sequence \(\{x_{n}\}\) is bounded, and so is \(\{z_{n}\}\). Moreover, by condition (C4), \(\sum_{n=1}^{\infty }\|e_{n}\| < \infty \) implies \(\lim_{n\rightarrow \infty } \|e_{n}\| =0\), that is, \(\lim_{n \rightarrow \infty } e_{n} =0\); it follows that the sequence \(\{y_{n}\}\) is also bounded.
Since \(P_{(A+B)^{-1}(0)} f\) is a k-contraction on C, by Banach's contraction principle there exists a unique element \(x^{*} \in C\) such that \(x^{*} = P_{(A+B)^{-1}(0)} f(x^{*})\), that is, \(x^{*} \in (A+B)^{-1}(0)\); it follows that \(x^{*}=J_{\lambda _{n}}^{B} (x^{*}-\lambda _{n} Ax^{*})\). Now we will show that \(x_{n} \rightarrow x^{*}\) as \(n\rightarrow \infty \). On the other hand, we have
This implies that
It follows by (3.1), Lemma 2.4, and the firm nonexpansiveness of \(J_{\lambda _{n}}^{B}\) that
We also have
This implies that
Hence, by (3.1), (3.2), (3.3), the nonexpansiveness of \(J_{\lambda _{n}}^{B}\), and \(I-\lambda _{n} A\), we obtain
It follows that
and
which are of the forms
and
respectively, where \(s_{n}=\|x_{n}-x^{*}\|^{2}\), \(\gamma _{n} = \frac{\alpha _{n} (1-k)}{1+\alpha _{n} (1-k)}\), \(\delta _{n} = \frac{2}{1-k} \frac{\theta _{n}}{\alpha _{n}} \| x_{n}-x_{n-1}\| \|z_{n}-x^{*} \| + \frac{4}{1-k} \frac{\|e_{n}\|}{\alpha _{n}} \|y_{n}-x^{*}\| + \frac{2}{1-k}\frac{\|e_{n}\|}{\alpha _{n}}\|e_{n}\| +\frac{2}{1-k} \langle f(x^{*})-x^{*} ,y_{n}-x^{*} \rangle +\frac{2}{1-k} \frac{\|e_{n}\|}{\alpha _{n}} \|(z_{n}-\lambda _{n} Az_{n})-(x^{*}- \lambda _{n} Ax^{*})\|\), \(\eta _{n} = \lambda _{n} (2\alpha -\lambda _{n}) \|Az_{n}-Ax^{*}\|^{2} +\|(I-J_{\lambda _{n}}^{B})(z_{n}-\lambda _{n} Az_{n}+e_{n})-(I-J_{ \lambda _{n}}^{B})(x^{*}-\lambda _{n} Ax^{*})\|^{2} \) and \(\rho _{n} =2\alpha _{n} \frac{\theta _{n}}{\alpha _{n}} \| x_{n}-x_{n-1} \| \|z_{n}-x^{*} \|+2\alpha _{n} \frac{\|e_{n}\|}{\alpha _{n}}\|y_{n}-x^{*} \|+2\|e_{n}\|^{2} +2\alpha _{n} \|f(x^{*})-x^{*}\| \|y_{n}-x^{*}\| + 2 \|(z_{n}-\lambda _{n} Az_{n})-(x^{*}-\lambda _{n} Ax^{*})\| \|e_{n}\| \). Therefore, using conditions (C1), (C3), and (C4), we can check that all those sequences satisfy conditions (i) and (ii) in Lemma 2.7. To complete the proof, we verify that condition (iii) in Lemma 2.7 is satisfied. Let \(\lim_{i\rightarrow \infty }\eta _{n_{i}} = 0\). Then, by condition (C2), we have
and
It follows by conditions (C2), (C4) and (3.4) that
Consider a subsequence \(\{x_{n_{i}}\}\) of \(\{x_{n}\}\). As \(\{x_{n}\}\) is bounded, so is \(\{x_{n_{i}}\}\); hence there exists a subsequence \(\{x_{n_{i_{j}}}\}\) of \(\{x_{n_{i}}\}\) which converges weakly to \(x \in C\). Without loss of generality, we may assume that \(x_{n_{i}} \rightharpoonup x\) as \(i\rightarrow \infty \). On the other hand, by conditions (C1) and (C4), we have
It follows that \(z_{n_{i}} \rightharpoonup x\) as \(i\rightarrow \infty \). Hence, by (3.5) and the demiclosedness at zero in Lemma 2.5, we obtain \(x \in \text{Fix}(J_{\lambda _{n_{i}}}^{B} (I-\lambda _{n_{i}}A))\), that is, \(x \in (A+B)^{-1}(0)\). Since
then, by (3.5) and conditions (C1) and (C4), we obtain
This implies that \(y_{n_{i}} \rightharpoonup x\) as \(i\rightarrow \infty \). Therefore, by Lemma 2.1(iii), we obtain
It follows by conditions (C1), (C3), and (C4) that \(\limsup_{i\rightarrow \infty } \delta _{n_{i}} \leq 0\). So, by Lemma 2.7, we conclude that \(x_{n} \rightarrow x^{*}\) as \(n\rightarrow \infty \). This completes the proof. □
Remark 3.2
Indeed, the parameter \(\theta _{n}\) can be chosen as follows:
or
where \(N \in \mathbb{N}\) and \(\{\omega _{n}\}\) is a positive sequence such that \(\omega _{n} = o(\alpha _{n})\).
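A sketch of one standard choice of \(\theta _{n}\) satisfying condition (C4), assuming for illustration \(\alpha _{n} = \frac{1}{n+1}\) and \(\omega _{n} = \frac{1}{(n+1)^{2}} = o(\alpha _{n})\); this is illustrative and need not match the paper's exact formulas:

```python
import numpy as np

def inertial_theta(x_n, x_prev, n, theta=0.9):
    """One common rule (an assumption here, not necessarily the paper's exact
    formula): theta_n = min(theta, omega_n / ||x_n - x_{n-1}||), with
    alpha_n = 1/(n+1) and omega_n = 1/(n+1)^2 = o(alpha_n), which forces
    (theta_n/alpha_n)*||x_n - x_{n-1}|| <= omega_n/alpha_n -> 0, i.e. (C4)."""
    alpha_n = 1.0 / (n + 1)
    omega_n = 1.0 / (n + 1) ** 2
    diff = np.linalg.norm(x_n - x_prev)
    theta_n = theta if diff == 0 else min(theta, omega_n / diff)
    return theta_n, alpha_n

# (C4) check: the ratio (theta_n/alpha_n)*||x_n - x_{n-1}|| vanishes.
x_prev, x_n = np.array([1.0, 2.0]), np.array([1.5, 2.5])
ratios = []
for n in range(1, 10001, 1000):
    theta_n, alpha_n = inertial_theta(x_n, x_prev, n)
    ratios.append(theta_n / alpha_n * np.linalg.norm(x_n - x_prev))
assert ratios[-1] < ratios[0]
assert ratios[-1] < 1e-3
```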
4 IFISTA
Let \(F:H \rightarrow \mathbb{R}\) be a convex and differentiable function and \(G:H \rightarrow \mathbb{R}\cup \{+\infty \}\) be a convex and lower semi-continuous function such that the gradient ∇F is L-Lipschitz continuous and ∂G is the subdifferential of G. It is well known that if ∇F is L-Lipschitz continuous, then it is \(\frac{1}{L}\)-inverse strongly monotone [32]. Moreover, ∂G is maximal monotone [33]. Putting \(A=\nabla F\), \(B=\partial G\), and \(\alpha = \frac{1}{L}\) into Theorem 3.1, we obtain the following result.
Theorem 4.1
Let H be a real Hilbert space. Let \(F: H\rightarrow \mathbb{R}\) be a convex and differentiable function with L-Lipschitz continuous gradient ∇F and \(G: H \rightarrow \mathbb{R}\cup \{+\infty \}\) be a convex and lower semi-continuous function. Let f be a k-contraction mapping of H into itself, and assume that \((\nabla F+\partial G)^{-1}(0)\) is nonempty. Let \(x_{0},x_{1} \in H\) and \(\{x_{n}\} \subset H\) be a sequence generated by
for all \(n \in \mathbb{N}\), where \(\{\alpha _{n}\} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0, \frac{2}{L})\), \(\{e_{n}\} \subset H\), and \(\{\theta _{n}\} \subset [0,\theta ]\) such that \(\theta \in [0,1)\) satisfy the following conditions:
- (C1) \(\lim_{n\rightarrow \infty } \alpha _{n} = 0\) and \(\sum_{n=1}^{\infty }\alpha _{n} = \infty \),
- (C2) \(0< a\leq \lambda _{n} \leq b < \frac{2}{L}\) for some \(a,b>0\),
- (C3) \(\lim_{n\rightarrow \infty }\frac{\|e_{n}\|}{\alpha _{n}}=0\),
- (C4) \(\sum_{n=1}^{\infty }\|e_{n}\| < \infty \) and \(\lim_{n\rightarrow \infty } \frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1} \| = 0\),
then the sequence \(\{x_{n}\}\) converges strongly to a point \(x^{*} \in (\nabla F+\partial G)^{-1}(0)\), where \(x^{*} = P_{(\nabla F+\partial G)^{-1}(0)} f(x^{*})\).
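The ingredients of Theorem 4.1 can be combined into a runnable NumPy sketch of a generic inertial viscosity forward-backward iteration (inertial term \(\theta _{n}\), error \(e_{n}\), contraction f, step \(\lambda _{n}\)), here with \(W = I\) for simplicity. This is an assumption-laden illustration, not the paper's exact Algorithm 1:

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: prox of t*||.||_1 (the case W = I)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def inertial_viscosity_fb(A, b, lam, f, n_iter=5000):
    """A generic sketch combining the ingredients of Theorem 4.1
    (inertia theta_n, error e_n, contraction f); the exact update of the
    paper's Algorithm 1 may differ."""
    L = 2 * np.linalg.norm(A, 2) ** 2
    lam_n = 1.0 / L                        # a choice in (0, 2/L): condition (C2)
    x_prev = x = np.zeros(A.shape[1])
    for n in range(1, n_iter + 1):
        alpha_n = 1.0 / (n + 1)            # condition (C1)
        e_n = np.zeros_like(x)             # summable errors: (C3)-(C4); zero here
        d = np.linalg.norm(x - x_prev)
        theta_n = 0.5 if d == 0 else min(0.5, 1.0 / ((n + 1) ** 2 * d))  # (C4)
        z = x + theta_n * (x - x_prev)                         # inertial step
        y = soft(z - lam_n * 2 * A.T @ (A @ z - b) + e_n, lam_n * lam)  # FB step
        x_prev, x = x, alpha_n * f(x) + (1 - alpha_n) * y      # viscosity step
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((15, 8))
x_true = np.zeros(8); x_true[0] = 1.0
b = A @ x_true
x_hat = inertial_viscosity_fb(A, b, lam=1e-3, f=lambda u: u / 5)
assert np.linalg.norm(A @ x_hat - b) < 0.2
```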
We focus on image restoration using the fixed point optimization algorithm in Theorem 4.1. The image deblurring problem is of the form
$$ b = Ax+\varepsilon , $$
(4.1)
where \(A \in \mathbb{R}^{m \times n}\) represents a known blurring operator (which is called the point spread function: PSF), \(b \in \mathbb{R}^{m}\) is a known observed blurred (and additive noisy) image, \(\varepsilon \in \mathbb{R}^{m}\) is an unknown additive white Gaussian noise, and \(x \in \mathbb{R}^{n}\) is an unknown signal/image to be restored (estimated). Both b and x are formed by stacking the columns of their corresponding two-dimensional images.
In order to solve problem (4.1), we introduce the least absolute shrinkage and selection operator (LASSO) of Tibshirani [34] for solving the following minimization problem:
$$ \min_{x \in \mathbb{R}^{n}} \Vert Ax-b \Vert _{2}^{2}+\lambda \Vert Wx \Vert _{1}, $$
(4.2)
where \(\lambda > 0\) is a regularization parameter and \(W: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) represents the orthogonal or tight frame wavelet synthesis. This is a special case of the convex minimization problem (1.2) with \(F(x) = \|Ax-b\|_{2}^{2}\) and \(G(x) = \lambda \|Wx\|_{1}\), where \(\|x\|_{1} = \sum_{i=1}^{n} |x_{i}|\) and \(\|x\|_{2} = \sqrt{\sum_{i=1}^{n} |x_{i}|^{2}}\) for all \(x = (x_{1},x_{2},\ldots,x_{n})^{T} \in \mathbb{R}^{n}\). It is well known from Lemma 2.2, by putting \(T(Ax) = P_{\mathbb{R}^{m}} Ax = b\), that \(\nabla F(x) = 2A^{T} (Ax-b)\) and that ∇F is L-Lipschitzian with \(L=2\|A\|^{2}\), where \(A^{T}\) stands for the transpose of A and \(\|A\|\) is the largest singular value of A (i.e., the square root of the largest eigenvalue of the matrix \(A^{T} A\)), also known as the spectral norm \(\|A\|_{2}\).
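As a quick check of these formulas, the following NumPy sketch (illustrative only) computes \(\nabla F(x) = 2A^{T}(Ax-b)\) and verifies empirically that \(L = 2\|A\|_{2}^{2}\) is a Lipschitz constant of ∇F:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 12))
b = rng.standard_normal(30)

grad_F = lambda x: 2 * A.T @ (A @ x - b)   # gradient of F(x) = ||Ax - b||_2^2
L = 2 * np.linalg.norm(A, 2) ** 2          # 2 * (largest singular value)^2

# Empirically, ||grad F(x) - grad F(y)|| <= L * ||x - y||.
for _ in range(200):
    x, y = rng.standard_normal(12), rng.standard_normal(12)
    assert (np.linalg.norm(grad_F(x) - grad_F(y))
            <= L * np.linalg.norm(x - y) + 1e-9)
```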
In this image deblurring case using Theorem 4.1, if the blurring operator A (the PSF array) is smaller than the observed blurred image b and the restored image x, then it is padded by padPSF in MATLAB to embed its array into the matrix \(A_{\text{big}} \in \mathbb{R}^{m \times n}\), followed by a transformation to the signal matrix \(A_{\text{sig}} \in \mathbb{R}^{m \times n}\) for calculating the matrix \(A_{\text{eig}} = (a^{\text{eig}}_{ij})\in \mathbb{R}^{m \times n}\) of eigenvalues of the signal matrix \(A_{\text{sig}}\) using the discrete fast Fourier transform (FFT) or the discrete cosine transform (DCT). That is,
$$ L = 2 \Vert A_{\text{eig}} \Vert _{\text{max}}^{2} = 2\Bigl(\max_{ij} \bigl\vert a^{\text{eig}}_{ij} \bigr\vert \Bigr)^{2}. $$
We set \(m=n\), and the process of the gradient ∇F always maps the signal x to 2 times the signal \(A^{T} (Ax-b)\), where x, \(A^{T}\), A, and b are in the form of the signal transformation FFT or DCT. That is,
where \(A_{\text{eig}}^{T} = A_{\text{eig}}^{-1}\) such that \(A_{\text{eig}}^{T}\) and \(A_{\text{eig}}^{-1}\) stand for the transpose and the inverse signal transform of the eigenvalues matrix \(A_{\text{eig}}\), respectively.
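This eigenvalue-domain computation can be sketched with a circular-convolution blur, where the FFT diagonalizes A. The following NumPy sketch is illustrative only (the paper uses the DCT with symmetric boundary conditions, which differs in detail):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 32
psf = np.zeros((n, n))
psf[:3, :3] = 1.0 / 9.0                   # a small 3x3 box blur, zero-padded

A_eig = np.fft.fft2(psf)                  # eigenvalues of the circulant blur
x = rng.standard_normal((n, n))
b = np.real(np.fft.ifft2(A_eig * np.fft.fft2(x)))   # b = A x (circular conv.)

def grad_F(x):
    """2*A^T(Ax - b), computed entirely in the Fourier domain, where A^T
    corresponds to the conjugate eigenvalues."""
    r = A_eig * np.fft.fft2(x) - np.fft.fft2(b)
    return 2 * np.real(np.fft.ifft2(np.conj(A_eig) * r))

L = 2 * np.max(np.abs(A_eig)) ** 2        # Lipschitz constant 2*||A_eig||_max^2
assert np.linalg.norm(grad_F(x)) < 1e-8   # x solves Ax = b exactly here
```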
By [35] and the reference therein, for all \(u=(u_{1},u_{2},\ldots,u_{m})^{T} \in \mathbb{R}^{m}\) and for each \(n \in \mathbb{N}\), we have
such that \(v = (v_{1},v_{2},\ldots,v_{m})^{T} \in \mathbb{R}^{m}\), where \(v_{i} = \text{sign}((Wu)_{i})\max \{|(Wu)_{i}|-\lambda _{n} \lambda ,0 \}\) for all \(i=1,2,\ldots,m\). When the process of \(\text{prox}_{\lambda _{n} G}\) has finished, it returns \(W^{-1}(\text{prox}_{\lambda _{n} G}(u))\), where \(W^{-1}\) stands for the inverse wavelet synthesis such that \(W^{-1}(\cdot ) = W^{T}(\cdot )\), before continuing other processes. That is,
and
for all \(n \in \mathbb{N}\).
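The wavelet-domain soft-thresholding step above can be sketched as follows. Assuming an orthogonal W (here a one-level 2D Haar transform for illustration, whereas the paper uses a three-stage Haar W), \(\text{prox}_{\lambda _{n} G}\) for \(G = \lambda \|W\cdot \|_{1}\) is \(W^{T}\) applied to the soft-thresholded coefficients:

```python
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def haar2(x):
    """One-level 2D orthonormal Haar analysis (the W here), even-sized arrays."""
    h = (x[0::2] + x[1::2]) / np.sqrt(2)
    g = (x[0::2] - x[1::2]) / np.sqrt(2)
    rows = np.vstack([h, g])
    h2 = (rows[:, 0::2] + rows[:, 1::2]) / np.sqrt(2)
    g2 = (rows[:, 0::2] - rows[:, 1::2]) / np.sqrt(2)
    return np.hstack([h2, g2])

def ihaar2(c):
    """Inverse (= transpose) of haar2, since W is orthonormal."""
    m = c.shape[1] // 2
    h2, g2 = c[:, :m], c[:, m:]
    rows = np.empty_like(c)
    rows[:, 0::2] = (h2 + g2) / np.sqrt(2)
    rows[:, 1::2] = (h2 - g2) / np.sqrt(2)
    k = c.shape[0] // 2
    h, g = rows[:k], rows[k:]
    x = np.empty_like(rows)
    x[0::2] = (h + g) / np.sqrt(2)
    x[1::2] = (h - g) / np.sqrt(2)
    return x

def prox_G(u, t, lam):
    """prox of G(u) = lam*||W u||_1 for orthogonal W: W^T soft(W u, t*lam)."""
    return ihaar2(soft(haar2(u), t * lam))

u = np.random.default_rng(4).standard_normal((8, 8))
assert np.allclose(ihaar2(haar2(u)), u)                            # W^T W = I
assert np.allclose(np.linalg.norm(haar2(u)), np.linalg.norm(u))    # orthonormal
```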
In the next section, we present IFISTA in Algorithm 1, an improvement of the fast iterative shrinkage thresholding algorithm [19], implemented with the same programming techniques [36].
5 Applications and numerical examples
In this section, we illustrate the performance of IFISTA compared with IFBS, FISTA, FBMWA, FBMMMA, NAGA, and SCTIP for solving the image deblurring problem (4.1) through the LASSO problem (4.2) with \(\lambda = 10^{-4}\). We implemented them in MATLAB R2019a and ran them on a personal laptop with an Intel(R) Core(TM) i5-8250U CPU @ 1.80 GHz and 8 GB RAM. We use the following quality measures of image restoration (larger values are better).
Let \(x, x_{n} \in \mathbb{R}^{M \times N}\) represent the original image and the estimated image at the \(n\)th iteration, respectively.
- (1) To measure how strong the signal is relative to the noise, we use the signal-to-noise ratio (SNR) of the images x and \(x_{n}\), which is defined (in decibels, dB) by
$$ \text{SNR}(x,x_{n}) = 10\log _{10} \frac{ \Vert x_{n} \Vert _{2}^{2}}{ \Vert x-x_{n} \Vert _{2}^{2}}. $$
- (2) To measure the signal peak, we use the peak signal-to-noise ratio (PSNR) of the images x and \(x_{n}\), which is defined (in decibels, dB) by
$$ \text{PSNR}(x,x_{n}) = 10\log _{10} \frac{\text{MAX}^{2}}{\text{MSE}(x,x_{n})} = 10\log _{10} \frac{\text{MAX}^{2}}{\frac{1}{cMN} \Vert x-x_{n} \Vert _{2}^{2}}, $$
where MAX is the maximum possible pixel value of the m-bit image class, i.e., MAX = \(2^{m}-1\) (for instance, MAX = 255 for an 8-bit image and MAX = 65535 for a 16-bit image), and \(\text{MSE}(x,x_{n})\) is the mean squared error (MSE) of the images x and \(x_{n}\), defined by \(\text{MSE}(x,x_{n}) = \frac{1}{cMN}\|x-x_{n}\|_{2}^{2}\), where x and \(x_{n}\) are c-channel images (for instance, \(c=1\) for a gray or monochrome image, \(c = 3\) for an RGB color image, and \(c=4\) for a CMYK color image).
Similarly, we use the improvement in signal-to-noise ratio (ISNR) of the images x, \(x_{n}\), and b, where the image \(b\in \mathbb{R}^{M \times N}\) represents the observed blurred (and additive noisy) c-channel image, which is defined (in decibels, dB) by
$$\begin{aligned} \text{ISNR}(x,x_{n},b) =& \text{PSNR}(x,x_{n})-\text{PSNR}(x,b) \\ =& 10\log _{10} \frac{\text{MAX}^{2}}{\frac{1}{cMN} \Vert x-x_{n} \Vert _{2}^{2}} - 10\log _{10} \frac{\text{MAX}^{2}}{\frac{1}{cMN} \Vert x-b \Vert _{2}^{2}} \\ =& 10\log _{10} \frac{ \Vert x-b \Vert _{2}^{2}}{ \Vert x-x_{n} \Vert _{2}^{2}}. \end{aligned}$$
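These three measures translate directly into code. A NumPy sketch for double-class images in \([0,1]\) (so MAX = 1), illustrative only:

```python
import numpy as np

def snr(x, xn):
    """Signal-to-noise ratio in dB."""
    return 10 * np.log10(np.sum(xn ** 2) / np.sum((x - xn) ** 2))

def psnr(x, xn, max_val=1.0):
    """Peak SNR in dB; max_val = 1.0 for double-class images in [0,1]."""
    mse = np.mean((x - xn) ** 2)
    return 10 * np.log10(max_val ** 2 / mse)

def isnr(x, xn, b):
    """Improvement in SNR in dB: PSNR(x, xn) - PSNR(x, b)."""
    return 10 * np.log10(np.sum((x - b) ** 2) / np.sum((x - xn) ** 2))

x = np.array([[0.2, 0.4], [0.6, 0.8]])   # toy "original" image
b = x + 0.10                             # toy blurred/noisy observation
xn = x + 0.01                            # toy restored estimate
assert isnr(x, xn, b) > 0                # restoration improved over b
assert abs(isnr(x, xn, b) - (psnr(x, xn) - psnr(x, b))) < 1e-9
```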
For comparison, we consider the standard test images Cameraman, Woman, Pirate, and Living room downloaded from [37], each a monochrome image of \(512 \times 512\) pixels, representing the original images \(x \in \mathbb{R}^{512 \times 512}\). We converted them to the double class type by im2double(imread('image_name')) in MATLAB, and we also show their 2D three-stage Haar wavelet transforms \(Wx \in \mathbb{R}^{512 \times 512}\) as in Fig. 1.
The original images went through a Gaussian blur of size \(9 \times 9\) with standard deviation 4 as a point spread function (PSF), representing the blurring operator A, generated by fspecial or psfGauss in MATLAB and applied via imfilter in MATLAB (computed by mirror-reflecting the array across the array border, i.e., symmetric boundary conditions), followed by additive zero-mean white Gaussian noise with standard deviation \(10^{-3}\), giving the observed blurred and noisy image \(b \in \mathbb{R}^{512 \times 512}\) as in Fig. 2. The PSF A was padded by padPSF in MATLAB to embed its array into the matrix \(A_{\text{big}} \in \mathbb{R}^{512 \times 512}\), which was transformed to a signal matrix \(A_{\text{sig}} \in \mathbb{R}^{512 \times 512}\) for calculating the matrix \(A_{\text{eig}} = (a^{\text{eig}}_{ij})\in \mathbb{R}^{512 \times 512}\) of eigenvalues of the signal matrix \(A_{\text{sig}}\) using the discrete cosine transform (DCT). That is, \(L = 2\|A_{\text{eig}}\|_{\text{max}}^{2} = 2(\max_{ij} |a^{\text{eig}}_{ij}|)^{2} \).
In compared algorithms, all parameters have been set to their high performance. For each \(n \in \mathbb{N}\), we set
and, following [35], we tested the best choices of the parameter \(\lambda _{n}\) for fast convergence as in Table 1 (see also Tables 1–4 of Examples 4.3, 4.5, 4.8, and 4.10 in [35], respectively), where \(M = \frac{1}{L}\), so that the gradient ∇F is \(\frac{1}{M}\)-Lipschitz continuous; it follows that the high-performance setting is
and the error sequence \(\{e_{n}\}\subset \mathbb{R}^{512\times 512}\) such that
We also set \(f(x) = \frac{x}{5}\) for all \(x \in \mathbb{R}^{512\times 512}\) and choose the initial points \(x_{0} = x_{1} = b\) for all algorithms (except for FISTA, where \(y_{1} = x_{0} = b\)). Quoting from [38], we can use max-norm regularization, which constrains the norm of the vector of incoming weights at each hidden unit to be bounded by a constant c; max-norm regularization was used for weights in both convolutional and fully connected layers. Since \(L = 2\|A_{\text{eig}}\|_{\text{max}}^{2}\), we can use SNR and ISNR, both quality measures of image restoration (larger is better), to find each hidden estimated image before convergence within the first 1 to 100 iterations, showing the high performance of each compared algorithm. That is, we find the hidden estimated images \(x^{*}\) and \(y^{*}\) such that
it is better if \(x^{*} = y^{*}\) (which means that both hidden estimated images are obtained at the same iteration), as shown in Table 2. Moreover, we also show the relative error, which is defined by
where tol denotes a prescribed tolerance value of each compared algorithm within the first 1000 iterations under the constants SNR and ISNR as in Table 3; their convergence behavior is shown in Fig. 3 and Fig. 4.
To evaluate the algorithms, we use arithmetic means over the test images: \(\overline{n}\), \(\overline{\text{SNR}}\), \(\overline{\text{ISNR}}\), \(\overline{\text{CPU}}\), and \(\overline{\text{tol}}\) denote the means of n, SNR, ISNR, CPU time, and tol, respectively.
From the results of each algorithm within the first 1 to 100 iterations in Table 2, we see that IFBS, FISTA, and IFISTA have SNR and ISNR values close to the others (except for SCTIP), while their values of n and CPU time differ greatly from the others. We evaluate the algorithms in Table 4 by the criteria \(\text{CPU} \leq \overline{\text{CPU}}\), \(\text{SNR} \geq \overline{\text{SNR}}\), \(\text{ISNR} \geq \overline{\text{ISNR}}\), and \(n \leq \overline{n}\), respectively. All evaluations of IFISTA are satisfied, while only the CPU time, SNR, and ISNR evaluations of both IFBS and FISTA are satisfied, with the SNR and ISNR of IFBS greater than those of FISTA. We therefore conclude that IFISTA, IFBS, and FISTA take the 1st, 2nd, and 3rd places, respectively, among the top three in fast processing with high performance for the compared image deblurring, as in Fig. 5, Fig. 6, Fig. 7, and Fig. 8.
From the results of each algorithm at the first 1000 iterations in Table 3, we see that the SNR and ISNR values of all algorithms differ greatly. We evaluate the algorithms in Table 4 by the criteria \(\text{CPU} \leq \overline{\text{CPU}}\), \(\text{SNR} \geq \overline{\text{SNR}}\), \(\text{ISNR} \geq \overline{\text{ISNR}}\), \(n \leq \overline{n}\), and \(\text{tol} \leq \overline{\text{tol}}\), respectively. All evaluations of IFBS, NAGA, and IFISTA are satisfied, with the SNR and ISNR of NAGA greater than those of both IFBS and IFISTA, and those of IFBS greater than those of IFISTA. We therefore conclude that NAGA, IFBS, and IFISTA take the 1st, 2nd, and 3rd places, respectively, among the top three in fast convergence with good performance for the compared image deblurring, as in Fig. 9, Fig. 10, Fig. 11, and Fig. 12.
6 Conclusion
A new iterative forward-backward splitting method with an error is obtained in our main result. It can be applied to improve the fast iterative shrinkage thresholding algorithm (IFISTA) with an error for solving the image deblurring problem. Using the same programming techniques [36] and with all parameters set for high performance, we obtain the following results.
1. In terms of fast processing with high performance for the compared image deblurring, IFISTA, IFBS, and FISTA take the 1st, 2nd, and 3rd places, respectively, and all three outperform FBMWA, FBMMMA, NAGA, and SCTIP.
2. In terms of fast convergence with good performance for the compared image deblurring, NAGA, IFBS, and IFISTA take the 1st, 2nd, and 3rd places, respectively, and all three outperform FISTA, FBMWA, FBMMMA, and SCTIP.
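For reference, the classical FISTA scheme of Beck and Teboulle [19], on which IFISTA builds, can be sketched as follows. This is a minimal generic sketch of the forward-backward step with Nesterov-type inertia; the operators and parameters are placeholders, not the exact algorithm of this paper:

```python
import numpy as np

def fista(grad_f, prox_g, x0, L, n_iter=100):
    # Classical FISTA: a forward-backward (proximal gradient) step
    # applied at an inertially extrapolated point y.  L is a Lipschitz
    # constant of grad_f; prox_g is the proximal map of the nonsmooth
    # term (soft-thresholding for an l1 regularizer).
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_next = prox_g(y - grad_f(y) / L)            # forward-backward step
        t_next = (1 + np.sqrt(1 + 4 * t**2)) / 2      # inertial parameter
        y = x_next + ((t - 1) / t_next) * (x_next - x)  # extrapolation
        x, t = x_next, t_next
    return x
```

For example, minimizing \(\frac{1}{2}(x-3)^{2} + |x|\) with `grad_f = lambda y: y - 3` and soft-thresholding as `prox_g` drives the iterates to the minimizer \(x = 2\).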
Availability of data and materials
Not applicable.
References
Bauschke, H.H.: The approximation of fixed points of compositions of nonexpansive mappings in Hilbert space. J. Math. Anal. Appl. 202, 150–159 (1996)
Chidume, C.E., Bashir, A.: Convergence of path and iterative method for families of nonexpansive mappings. Appl. Anal. 67, 117–129 (2008)
Halpern, B.: Fixed points of nonexpansive maps. Bull. Am. Math. Soc. 73, 957–961 (1967)
Ishikawa, S.: Fixed points by a new iteration method. Proc. Am. Math. Soc. 44, 147–150 (1974)
Klen, R., Manojlovic, V., Simic, S., Vuorinen, M.: Bernoulli inequality and hypergeometric functions. Proc. Am. Math. Soc. 142, 559–573 (2014)
Kunze, H., La Torre, D., Mendivil, F., Vrscay, E.R.: Generalized fractal transforms and self-similar objects in cone metric spaces. Comput. Math. Appl. 64, 1761–1769 (2012)
Mann, W.R.: Mean value methods in iteration. Proc. Am. Math. Soc. 4, 506–510 (1953)
Radenovic, S., Rhoades, B.E.: Fixed point theorem for two non-self mappings in cone metric spaces. Comput. Math. Appl. 57, 1701–1707 (2009)
Todorcevic, V.: Harmonic Quasiconformal Mappings and Hyperbolic Type Metrics. Springer, Basel (2019)
Byrne, C.: Iterative oblique projection onto convex subsets and the split feasibility problem. Inverse Probl. 18, 441–453 (2002)
Byrne, C.: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20, 103–120 (2004)
Combettes, P.L., Wajs, V.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)
Censor, Y., Bortfeld, T., Martin, B., Trofimov, A.: A unified approach for inversion problems in intensity-modulated radiation therapy. Phys. Med. Biol. 51, 2353–2365 (2006)
Censor, Y., Elfving, T., Kopf, N., Bortfeld, T.: The multiple set split feasibility problem and its applications. Inverse Probl. 21, 2071–2084 (2005)
Censor, Y., Motova, A., Segal, A.: Perturbed projections and subgradient projections for the multiple-sets feasibility problem. J. Math. Anal. Appl. 327, 1244–1256 (2007)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
Passty, G.B.: Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J. Math. Anal. Appl. 72, 383–390 (1979)
Moudafi, A., Oliny, M.: Convergence of a splitting inertial proximal method for monotone operators. J. Comput. Appl. Math. 155, 447–454 (2003)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Hanjing, A., Suantai, S.: A fast image restoration algorithm based on a fixed point and optimization method. Mathematics 8, 378 (2020)
Padcharoen, A., Kumam, P.: Fixed point optimization method for image restoration. Thai J. Math. 18(3), 1581–1596 (2020)
Verma, M., Shukla, K.K.: A new accelerated proximal gradient technique for regularized multitask learning framework. Pattern Recognit. Lett. 95, 98–103 (2017)
Cholamjiak, P., Kesornprom, S., Pholasa, N.: Weak and strong convergence theorems for the inclusion problem and the fixed-point problem of nonexpansive mappings. Mathematics 7, 167 (2019)
Abubakar, J., Kumam, P., Ibrahim, A.H., Padcharoen, A.: Relaxed inertial Tseng’s type method for solving the inclusion problem with application to image restoration. Mathematics 8, 818 (2020)
Luo, Y.: An inertial splitting algorithm for solving inclusion problems and its applications to compressed sensing. J. Appl. Numer. Optim. 2(3), 279–295 (2020)
Takahashi, W.: Introduction to Nonlinear and Convex Analysis. Yokohama Publishers, Yokohama (2009)
Tang, J.F., Chang, S.S., Yuan, F.: A strong convergence theorem for equilibrium problems and split feasibility problems in Hilbert spaces. Fixed Point Theory Appl. 2014, 36 (2014)
Nadezhkina, N., Takahashi, W.: Weak convergence theorem by an extragradient method for nonexpansive mappings and monotone mappings. J. Optim. Theory Appl. 128, 191–201 (2006)
Goebel, K., Kirk, W.A.: Topics in Metric Fixed Point Theory. Cambridge Studies in Advanced Mathematics, vol. 28. Cambridge University Press, Cambridge (1990)
Takahashi, W., Xu, H.-K.: Iterative algorithms for nonlinear operators. J. Lond. Math. Soc. 66, 240–256 (2002)
He, S., Yang, C.: Solving the variational inequality problem defined on intersection of finite level sets. Abstr. Appl. Anal. 2013, Article ID 942315 (2013)
Baillon, J.B., Haddad, G.: Quelques proprietes des operateurs angle-bornes et cycliquement monotones. Isr. J. Math. 26, 137–150 (1977)
Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pac. J. Math. 33, 209–216 (1970)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc., Ser. B, Stat. Methodol. 58, 267–288 (1996)
Tianchai, P.: The zeros of monotone operators for the variational inclusion problem in Hilbert spaces. J. Inequal. Appl. 2021, 126 (2021)
Guide to the MATLAB code for wavelet-based deblurring with FISTA, Available online: https://docplayer.net/128436542-Guide-to-the-matlab-code-for-wavelet-based-deblurring-with-fista.html (accessed on 1 June 2021)
Image Databases, Available online: http://www.imageprocessingplace.com/downloads_V3/root_downloads/image_databases/standard_test_images.zip (accessed on 1 June 2021)
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
Acknowledgements
The author would like to thank the Faculty of Science, Maejo University for its financial support.
Funding
This research was supported by Faculty of Science, Maejo University.
Author information
Contributions
All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The author declares that he has no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Tianchai, P. An improved fast iterative shrinkage thresholding algorithm with an error for image deblurring problem. Fixed Point Theory Algorithms Sci Eng 2021, 18 (2021). https://doi.org/10.1186/s13663-021-00703-6