
An improved fast iterative shrinkage thresholding algorithm with an error for image deblurring problem

Abstract

In this paper, we introduce a new iterative forward-backward splitting method with an error for solving the variational inclusion problem of the sum of two monotone operators in real Hilbert spaces. We suggest and analyze this method under some mild conditions imposed on the parameters and obtain a strong convergence theorem for this problem. We also apply our main result to obtain an improved fast iterative shrinkage thresholding algorithm (IFISTA) with an error for solving the image deblurring problem. Finally, we provide numerical experiments to illustrate the convergence behavior and to show the effectiveness of the inertial technique in delivering the fast processing with high performance and the fast convergence with good performance of IFISTA.

1 Introduction

Let C be a nonempty closed convex subset of a real Hilbert space H. The variational inclusion problem is a fundamental problem in optimization theory with applications in many areas of science, engineering, economics, and medicine [1–9], as well as in image processing, machine learning, and the modeling of intensity-modulated radiation therapy treatment planning [10–15]. It is to find \(x^{*} \in H\) such that

$$ 0 \in Ax^{*}+Bx^{*}, $$
(1.1)

where \(A:H \rightarrow H\) is an operator and \(B: D(B) \subset H \rightarrow 2^{H}\) is a set-valued operator.

To solve the variational inclusion problem (1.1) via fixed point theory, we define the mapping \(J_{r}^{A,B} : H\rightarrow D(B)\) as follows:

$$ J_{r}^{A,B} = (I+rB)^{-1}(I-rA) = J_{r}^{B} (I-rA), $$

where \(J_{r}^{B} = (I+rB)^{-1}\) is the resolvent operator of B for \(r>0\). For \(x \in H\), we see that

$$\begin{aligned} J_{r}^{A,B} (x) = x \quad \Leftrightarrow\quad & x = (I+rB)^{-1}(x-rAx) \\ \quad \Leftrightarrow\quad & x-rAx \in x+r Bx \\ \quad \Leftrightarrow\quad & 0 \in Ax+Bx, \end{aligned}$$

which shows that the fixed point set of \(J_{r}^{A,B}\) coincides with the solution set \((A+B)^{-1}(0)\). This suggests the following iteration process: \(x_{1} \in C\) and

$$ x_{n+1} = \underbrace{(I+r_{n} B)^{-1}}_{\text{backward step}} \underbrace{(I-r_{n} A)}_{\text{forward step}}x_{n} = J_{r_{n}}^{A,B} (x_{n}), \quad \forall n \in \mathbb{N}, $$

where \(\{r_{n}\} \subset (0,\infty )\) and \(D(B) \subset C\). This method is called a forward-backward splitting algorithm [16, 17].
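
For illustration, a minimal MATLAB sketch of one forward-backward step is given below; the operator A and the resolvent \(J_{r}^{B}\) are assumed to be supplied as function handles (the names are placeholders, not code from this paper):

```matlab
% One forward-backward step: x_{n+1} = J_r^B((I - r*A) x_n).
% A        : function handle for the single-valued operator A   (placeholder)
% resolvJB : function handle for the resolvent (I + r*B)^{-1}   (placeholder)
function x_next = forward_backward_step(x, r, A, resolvJB)
    y = x - r * A(x);         % forward (explicit) step with the operator A
    x_next = resolvJB(y, r);  % backward (implicit) step via the resolvent of B
end
```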

In applications, we typically let \(A=\nabla F\) and \(B = \partial G\), where \(F:H \rightarrow \mathbb{R}\) is a convex and differentiable function whose gradient ∇F is L-Lipschitz continuous, and \(G:H \rightarrow \mathbb{R}\cup \{+\infty \}\) is a convex and lower semi-continuous function whose subdifferential ∂G is defined by

$$ \partial G(x) = \bigl\{ z\in H : \langle y-x,z \rangle + G(x) \leq G(y), \forall y \in H\bigr\} . $$

Then problem (1.1) is reduced to the following convex minimization problem:

$$ F\bigl(x^{*}\bigr)+G\bigl(x^{*}\bigr) = \min_{x\in H} \bigl\lbrace F(x)+G(x) \bigr\rbrace \quad \Leftrightarrow \quad 0 \in \nabla F\bigl(x^{*}\bigr)+\partial G \bigl(x^{*}\bigr). $$
(1.2)

Recall that the proximity operator \(\text{prox}_{G}\) of G is defined for all \(x \in H\) as follows:

$$ \text{prox}_{G}(x) = \underset{y \in H}{\mathrm{Argmin}} \biggl\lbrace G(y)+ \frac{1}{2} \Vert y-x \Vert _{2}^{2} \biggr\rbrace . $$

For \(x \in H\) and \(r>0\), we see that

$$\begin{aligned} z = \text{prox}_{rG} (x) \quad \Leftrightarrow \quad & 0 \in r\partial G(z)+z-x \\ \quad \Leftrightarrow\quad & x \in (I+r\partial G) (z) \\ \quad \Leftrightarrow\quad & z = (I+r\partial G)^{-1}(x) = J_{r}^{\partial G}(x). \end{aligned}$$

Therefore,

$$\begin{aligned} x^{*} \in \underset{x \in H}{\mathrm{Argmin}} \bigl\lbrace F(x)+G(x) \bigr\rbrace \quad \Leftrightarrow\quad & 0 \in \nabla F\bigl(x^{*} \bigr)+\partial G\bigl(x^{*}\bigr) \\ \quad \Leftrightarrow \quad & x^{*} = J_{r}^{\nabla F,\partial G} \bigl(x^{*}\bigr) \\ \quad \Leftrightarrow\quad & x^{*} = J_{r}^{\partial G} (I-r \nabla F)x^{*} = \text{prox}_{rG}(I-r \nabla F)x^{*}. \end{aligned}$$

Many researchers have proposed and analyzed iterative shrinkage thresholding algorithms for solving the convex minimization problem (1.2) under various conditions, as follows.

Among the weak convergence results, Lions and Mercier [16] first introduced the forward-backward splitting (FBS) algorithm:

$$\begin{aligned} x_{n+1} = \text{prox}_{\lambda _{n} G}\bigl(x_{n} - \lambda _{n} \nabla F(x_{n})\bigr), \quad \forall n \in \mathbb{N}, \end{aligned}$$

where \(x_{1} \in H\) and \(\{ \lambda _{n} \} \subset (0,2/L)\). Later, Moudafi and Oliny [18] introduced the iterative forward-backward splitting (IFBS) algorithm:

$$\begin{aligned} \textstyle\begin{cases} y_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ x_{n+1} = \text{prox}_{\lambda _{n} G}(y_{n}-\lambda _{n} \nabla F(x_{n})), \quad \forall n \in \mathbb{N,} \end{cases}\displaystyle \end{aligned}$$

where \(x_{0}, x_{1} \in H\), \(\{\theta _{n}\} \subset [0,a] \subset [0,1)\), \(\{ \lambda _{n}\} \subset [b,c] \subset (0,2/L)\) such that \(\sum_{n=1}^{\infty }\theta _{n} \|x_{n}-x_{n-1}\|^{2} < \infty \). In our research, we focus attention on the inertial parameter \(\theta _{n}\) which controls the momentum of \(x_{n} - x_{n-1}\) in the fast iterative shrinkage thresholding algorithm (FISTA) of Beck and Teboulle [19] as follows:

$$\begin{aligned} \textstyle\begin{cases} x_{n} = \text{prox}_{\frac{1}{L} G}(y_{n}-\frac{1}{L} \nabla F(y_{n})), \\ t_{n+1} = \frac{1+\sqrt{1+4t_{n}^{2}}}{2},\qquad \theta _{n} = \frac{t_{n}-1}{t_{n+1}}, \\ y_{n+1} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \quad \forall n \in \mathbb{N}, \end{cases}\displaystyle \end{aligned}$$

where \(y_{1} = x_{0} \in H\) and \(t_{1}=1\). In FISTA, observe that \(y_{n}\) is computed before \(x_{n}\), and the sequence \(\{x_{n}\}\) converges weakly to a solution of the convex minimization problem (1.2). Recently, Hanjing and Suantai [20] introduced the forward-backward modified W-algorithm (FBMWA) as follows:

$$\begin{aligned} \textstyle\begin{cases} w_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ z_{n} = (1-\gamma _{n})w_{n}+\gamma _{n} \text{prox}_{\lambda _{n} G}(w_{n}- \lambda _{n} \nabla F(w_{n})), \\ y_{n} = (1-\beta _{n})\text{prox}_{\lambda _{n} G}(w_{n}-\lambda _{n} \nabla F(w_{n}))+\beta _{n} \text{prox}_{\lambda _{n} G}(z_{n}- \lambda _{n} \nabla F(z_{n})), \\ x_{n+1} = (1-\alpha _{n})\text{prox}_{\lambda _{n} G}(z_{n}-\lambda _{n} \nabla F(z_{n}))+\alpha _{n} \text{prox}_{\lambda _{n} G}(y_{n}- \lambda _{n} \nabla F(y_{n})), \end{cases}\displaystyle \end{aligned}$$

for all \(n \in \mathbb{N}\), where \(x_{0},x_{1} \in H\) and \(\{\alpha _{n}\} \subset [0,a] \subset [0,1)\), \(\{\beta _{n}\} \subset [0,1]\), \(\{\gamma _{n}\} \subset [b,c] \subset (0,1)\), and \(\{ \theta _{n} \} \subset [0,\infty ) \) such that \(\sum_{n=1}^{\infty }\theta _{n} < \infty \), and \(\{\lambda _{n}\} \subset (0,2/L)\) such that \(\lambda _{n} \rightarrow \lambda \in (0,2/L)\) as \(n \rightarrow \infty \). In the same way, Padcharoen and Kuman [21] introduced the forward-backward modified MM-algorithm (FBMMMA) as follows:

$$\begin{aligned} \textstyle\begin{cases} w_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ z_{n} = (1-\gamma _{n})w_{n}+\gamma _{n} \text{prox}_{\lambda _{n} G}(w_{n}- \lambda _{n} \nabla F(w_{n})), \\ y_{n} = (1-\alpha _{n} -\beta _{n})z_{n}+\alpha _{n} \text{prox}_{ \lambda _{n} G}(z_{n}-\lambda _{n} \nabla F(z_{n})) \\ \hphantom{y_{n} =}{} +\beta _{n} \text{prox}_{\lambda _{n} G}(w_{n}-\lambda _{n} \nabla F(w_{n})) , \\ x_{n+1} = \text{prox}_{\lambda _{n} G}(y_{n}-\lambda _{n} \nabla F(y_{n})), \quad \forall n \in \mathbb{N}, \end{cases}\displaystyle \end{aligned}$$

where \(x_{0},x_{1} \in H\) and \(\{\alpha _{n}\},\{\beta _{n}\},\{\gamma _{n}\} \subset [0,1]\) such that \(\alpha _{n}+\beta _{n} \in [0,1]\) and \(\{ \theta _{n} \} \subset (0,1)\) such that \(\sum_{n=1}^{\infty }\theta _{n} < \infty \), and \(\{\lambda _{n}\} \subset (0,2/L)\) such that \(\lambda _{n} \rightarrow \lambda \in (0,2/L)\) as \(n \rightarrow \infty \). Weak convergence theorems were obtained for all of these algorithms.

Among the strong convergence results, Verma and Shukla [22] introduced a new accelerated proximal gradient algorithm (NAGA) as follows:

$$\begin{aligned} \textstyle\begin{cases} z_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ y_{n} = (1-\alpha _{n})z_{n}+\alpha _{n} \text{prox}_{\lambda _{n} G}(z_{n}- \lambda _{n} \nabla F(z_{n})), \\ x_{n+1} = \text{prox}_{\lambda _{n} G}(y_{n}-\lambda _{n} \nabla F(y_{n})), \quad \forall n \in \mathbb{N,} \end{cases}\displaystyle \end{aligned}$$

where \(x_{0},x_{1} \in H\), \(\{\alpha _{n}\}, \{\theta _{n}\} \subset (0,1]\), and \(\{\lambda _{n}\} \subset (0,2/L)\). They proved that the sequence \(\{x_{n}\}\) of NAGA converges strongly under the condition \(\frac{\|x_{n}-x_{n-1}\|}{\theta _{n}} \rightarrow 0\) as \(n \rightarrow \infty \). A natural question is how the parameter \(\theta _{n}\) should be chosen; we leave the verification to the reader. From their proof, we observe that NAGA still holds under the conditions \(\alpha _{n} \rightarrow 0\) and \(\frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1}\| \rightarrow 0\) as \(n\rightarrow \infty \), and the parameter \(\theta _{n}\) can then be chosen as

$$\begin{aligned} \theta _{n} = \textstyle\begin{cases} \min \{ \frac{\omega _{n}}{ \Vert x_{n}-x_{n-1} \Vert }, \alpha _{n} \} & \text{if } x_{n} \neq x_{n-1}, \\ \alpha _{n} & \text{otherwise}, \end{cases}\displaystyle \end{aligned}$$

where \(\{\omega _{n}\}\) is a positive sequence such that \(\omega _{n} = o(\alpha _{n})\). Cholamjiak et al. [23] introduced a strong convergence theorem for the inclusion problem (SCTIP); taking \(S=I\), \(A=\nabla F\), and \(B=\partial G\), their algorithm reads as follows:

$$\begin{aligned} \textstyle\begin{cases} z_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ y_{n} = \alpha _{n} f(x_{n})+(1-\alpha _{n}) \text{prox}_{\lambda _{n} G}(z_{n}-\lambda _{n} \nabla F(z_{n})), \\ x_{n+1} = \beta _{n} x_{n} +(1-\beta _{n})y_{n}, \quad \forall n \in \mathbb{N}, \end{cases}\displaystyle \end{aligned}$$

where \(x_{0},x_{1} \in C\) and f is a contraction of C into itself, and \(\{\alpha _{n}\},\{\beta _{n} \} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0,2/L)\), and \(\{\theta _{n} \} \subset [0,\theta ]\) such that \(\theta \in [0,1)\). They proved that the sequence \(\{x_{n}\}\) of SCTIP converges strongly under the following conditions:

  1. (C1)

    \(\lim_{n\rightarrow \infty } \alpha _{n} = 0\) and \(\sum_{n=1}^{\infty }\alpha _{n} = \infty \),

  2. (C2)

    \(\liminf_{n\rightarrow \infty } \beta _{n} (1-\beta _{n}) > 0\),

  3. (C3)

    \(0 < \liminf_{n\rightarrow \infty } \lambda _{n} \leq \limsup_{n \rightarrow \infty } \lambda _{n} < 2/L\),

  4. (C4)

    \(\lim_{n\rightarrow \infty } \frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1} \| = 0\).

Moreover, many researchers have proposed and analyzed the iterative forward-backward scheme with a variable step size, which does not depend on the Lipschitz constant of the operator \(A=\nabla F\) (see also [24, 25]).

In our research, we consider the forward-backward splitting method with an error as follows: \(x_{1} \in C\) and

$$ x_{n+1} = \underbrace{(I+\lambda _{n} B)^{-1}}_{\text{backward step}} \underbrace{\bigl((I-\lambda _{n} A)x_{n} +e_{n}\bigr)}_{ \text{forward step with an error} } = J_{\lambda _{n}}^{B} \bigl((I- \lambda _{n} A)x_{n} +e_{n}\bigr) , \quad \forall n \in \mathbb{N}, $$

where \(\{\lambda _{n}\} \subset (0,\infty )\), \(\{e_{n}\} \subset H\), \(D(B) \subset C\), and \(J_{\lambda _{n}}^{B} = (I+\lambda _{n} B)^{-1}\). We introduce a new iterative forward-backward splitting method with an error for solving the variational inclusion problem (1.1) as follows:

$$\begin{aligned} \textstyle\begin{cases} z_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ y_{n} = \alpha _{n} f(z_{n})+(1-\alpha _{n})J_{\lambda _{n}}^{B} (z_{n}- \lambda _{n} Az_{n}+e_{n}), \\ x_{n+1} = J_{\lambda _{n}}^{B}(y_{n}-\lambda _{n} Ay_{n}+e_{n}), \quad \forall n \in \mathbb{N,} \end{cases}\displaystyle \end{aligned}$$

where \(x_{0},x_{1} \in C\) and f is a contraction of C into itself, and \(\{\alpha _{n}\} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0,2/L)\), \(\{e_{n} \} \subset H\), and \(\{\theta _{n}\} \subset [0,\theta ]\) such that \(\theta \in [0,1)\). Moreover, it can be applied to obtain an improved fast iterative shrinkage thresholding algorithm (IFISTA) with an error for solving the convex minimization problem (1.2) by letting \(A=\nabla F\) and \(B=\partial G\) as follows:

$$\begin{aligned} \textstyle\begin{cases} z_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ y_{n} = \alpha _{n} f(z_{n})+(1-\alpha _{n})\text{prox}_{\lambda _{n} G} (z_{n}-\lambda _{n} \nabla F(z_{n})+e_{n}), \\ x_{n+1} = \text{prox}_{\lambda _{n} G}(y_{n}-\lambda _{n} \nabla F(y_{n})+e_{n}), \quad \forall n \in \mathbb{N} \end{cases}\displaystyle \end{aligned}$$

which yields a self-adaptive scheme with fast convergence properties under some mild conditions when compared to the existing algorithms in the literature. The outline of our research is as follows: in Sect. 2, we give some well-known definitions and lemmas, which are used in Sect. 3 to prove the strong convergence theorem of IFISTA for solving the variational inclusion problem (1.1); in Sect. 4, we apply this result to the image deblurring problem, which is a special case of the convex minimization problem (1.2); and in Sect. 5, we provide numerical experiments to illustrate the fast processing with high performance and the fast convergence with good performance of IFISTA achieved by the inertial technique.

2 Preliminaries

Let C be a nonempty closed convex subset of a real Hilbert space H. We use the following notation: → denotes strong convergence, ⇀ denotes weak convergence,

$$ \omega _{w}(x_{n}) = \bigl\{ x: \exists \{x_{n_{k}} \} \subset \{x_{n}\} \text{ such that } x_{n_{k}} \rightharpoonup x \bigr\} $$

to denote the weak limit set of \(\{x_{n}\}\), and \(\text{Fix}(T) = \{x:x=Tx \}\) to denote the fixed point set of the mapping T.

Recall that the metric projection \(P_{C}: H \rightarrow C\) is defined as follows: for each \(x \in H\), \(P_{C} x\) is the unique point in C satisfying

$$ \Vert x-P_{C} x \Vert = \inf \bigl\{ \Vert x-y \Vert :y\in C\bigr\} . $$

The operator \(T:H\rightarrow H\) is called:

  1. (i)

    monotone if

    $$ \langle x-y,Tx-Ty \rangle \geq 0,\quad \forall x,y \in H, $$
  2. (ii)

    L-Lipschitzian with \(L>0\) if

    $$ \Vert Tx-Ty \Vert \leq L \Vert x-y \Vert ,\quad \forall x,y \in H, $$
  3. (iii)

    k-contraction if it is k-Lipschitzian with \(k \in (0,1)\),

  4. (iv)

    nonexpansive if it is 1-Lipschitzian,

  5. (v)

    firmly nonexpansive if

    $$ \Vert Tx-Ty \Vert ^{2} \leq \Vert x-y \Vert ^{2} - \bigl\Vert (I-T)x-(I-T)y \bigr\Vert ^{2},\quad \forall x,y \in H, $$
  6. (vi)

    α-strongly monotone with \(\alpha > 0\) if

    $$ \langle Tx-Ty,x-y \rangle \geq \alpha \Vert x-y \Vert ^{2},\quad \forall x,y \in H, $$
  7. (vii)

    α-inverse strongly monotone with \(\alpha > 0\) if

    $$ \langle Tx-Ty,x-y \rangle \geq \alpha \Vert Tx-Ty \Vert ^{2},\quad \forall x,y \in H. $$

Let B be a mapping of H into \(2^{H}\). The domain and the range of B are denoted by \(D(B) = \{x\in H : Bx \neq \emptyset \}\) and \(R(B) = \bigcup \{Bx:x \in D(B) \}\), respectively. The inverse of B, denoted by \(B^{-1}\), is defined by \(x\in B^{-1}y\) if and only if \(y\in Bx\). A multi-valued mapping B is said to be a monotone operator on H if \(\langle x-y,u-v\rangle \geq 0\) for all \(x,y \in D(B)\), \(u \in Bx\), and \(v \in By\). A monotone operator B on H is said to be maximal if its graph is not strictly contained in the graph of any other monotone operator on H. For a maximal monotone operator B on H and \(r>0\), we define the single-valued resolvent operator \(J_{r}^{B}:H\rightarrow D(B)\) by \(J_{r}^{B}=(I+rB)^{-1}\). It is well known that \(J_{r}^{B}\) is firmly nonexpansive and \(\text{Fix}(J_{r}^{B})=B^{-1}(0)\).
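
For instance (a standard example, not stated explicitly here), if \(B=\partial \iota _{C}\) is the subdifferential of the indicator function of C, then \(J_{r}^{B}=P_{C}\) for every \(r>0\); thus the resolvent generalizes the metric projection.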

We collect together some known lemmas which are the main tools in proving our result.

Lemma 2.1

([26])

Let C be a nonempty closed convex subset of a real Hilbert space H. Then:

  1. (i)

    \(\|x \pm y\|^{2} = \|x\|^{2} \pm 2 \langle x,y \rangle + \|y\|^{2}\), \(\forall x,y\in H\),

  2. (ii)

    \(\|\lambda x+(1-\lambda )y\|^{2} = \lambda \|x\|^{2}+(1-\lambda )\|y \|^{2}-\lambda (1-\lambda )\|x-y\|^{2}\), \(\forall x,y\in H\), \(\lambda \in \mathbb{R}\),

  3. (iii)

    \(z=P_{C}x \Leftrightarrow \langle x-z,z -y \rangle \geq 0\), \(\forall x\in H\), \(y \in C\),

  4. (iv)

    \(z=P_{C}x \Leftrightarrow \|x-z \|^{2} \leq \|x-y\|^{2} - \| y-z \|^{2}\), \(\forall x\in H\), \(y \in C\),

  5. (v)

    \(\| P_{C} x - P_{C} y\|^{2} \leq \langle x-y,P_{C} x - P_{C} y \rangle \), \(\forall x,y\in H\).

Lemma 2.2

([27])

Let H and K be two real Hilbert spaces, and let \(T:K \rightarrow K\) be a firmly nonexpansive mapping such that \(\|(I-T)x\|\) is a convex function from K to \(\overline{\mathbb{R}}=[-\infty ,+\infty ]\). Let \(A:H\rightarrow K\) be a bounded linear operator and \(f(x) = \frac{1}{2}\|(I-T)Ax\|^{2} \) for all \(x\in H\). Then:

  1. (i)

f is convex and differentiable,

  2. (ii)

\(\nabla f(x) = A^{*}(I-T)Ax \) for all \(x\in H\), where \(A^{*}\) denotes the adjoint of A,

  3. (iii)

    f is weakly lower semi-continuous on H,

  4. (iv)

    ∇f is \(\|A\|^{2}\)-Lipschitzian.

Lemma 2.3

([27])

Let H be a real Hilbert space and \(T: H\rightarrow H\) be an operator. The following statements are equivalent:

  1. (i)

    T is firmly nonexpansive,

  2. (ii)

    \(\|Tx-Ty\|^{2} \leq \langle x-y,Tx-Ty \rangle \), \(\forall x,y \in H\),

  3. (iii)

    \(I-T\) is firmly nonexpansive.

Lemma 2.4

([28])

Let C be a nonempty closed convex subset of a real Hilbert space H. Let the mapping \(A:C\rightarrow H\) be α-inverse strongly monotone and \(r>0\) be a constant. Then we have

$$ \bigl\Vert (I-rA)x-(I-rA)y \bigr\Vert ^{2} \leq \Vert x-y \Vert ^{2}-r(2\alpha -r) \Vert Ax-Ay \Vert ^{2} $$

for all \(x,y \in C\). In particular, if \(0< r\leq 2\alpha \), then \(I-rA\) is nonexpansive.

Lemma 2.5

([29] (Demiclosedness principle))

Let C be a nonempty closed convex subset of a real Hilbert space H, and let \(S:C \rightarrow C\) be a nonexpansive mapping with \(\textit{Fix}(S)\neq \emptyset \). If the sequence \(\{x_{n}\}\subset C\) converges weakly to x and the sequence \(\{(I-S)x_{n}\}\) converges strongly to y, then \((I-S)x = y\); in particular, if \(y=0\), then \(x\in \textit{Fix}(S)\).

Lemma 2.6

([30])

Let \(\{a_{n}\}\) and \(\{c_{n}\}\) be sequences of nonnegative real numbers such that

$$ a_{n+1} \leq (1-\delta _{n})a_{n}+b_{n}+c_{n}, \quad \forall n=0,1,2,\ldots, $$

where \(\{\delta _{n} \}\) is a sequence in \((0,1)\) and \(\{b_{n}\}\) is a real sequence. Assume that \(\sum_{n=0}^{\infty }c_{n} < \infty \). Then the following results hold:

  1. (i)

    if \(b_{n} \leq \delta _{n} M\) for some \(M\geq 0\), then \(\{a_{n}\}\) is a bounded sequence,

  2. (ii)

    if \(\sum_{n=0}^{\infty }\delta _{n} = \infty \) and \(\limsup_{n\rightarrow \infty } b_{n}/\delta _{n} \leq 0\), then \(\lim_{n\rightarrow \infty }a_{n}=0\).

Lemma 2.7

([31])

Assume that \(\{s_{n}\}\) is a sequence of nonnegative real numbers such that

$$ s_{n+1} \leq (1-\gamma _{n})s_{n} + \gamma _{n} \delta _{n}, \quad \forall n=0,1,2,\ldots $$

and

$$ s_{n+1} \leq s_{n} - \eta _{n} +\rho _{n}, \quad \forall n=0,1,2,\ldots, $$

where \(\{\gamma _{n}\}\) is a sequence in \((0,1)\), \(\{\eta _{n}\}\) is a sequence of nonnegative real numbers, and \(\{\delta _{n}\}\), \(\{\rho _{n}\}\) are real sequences such that

  1. (i)

    \(\sum_{n=0}^{\infty }\gamma _{n} = \infty \),

  2. (ii)

    \(\lim_{n\rightarrow \infty } \rho _{n} = 0\),

  3. (iii)

    if \(\lim_{k\rightarrow \infty } \eta _{n_{k}} = 0\), then \(\limsup_{k\rightarrow \infty } \delta _{n_{k}} \leq 0\) for any subsequence \(\{n_{k}\}\) of \(\{n\}\).

Then \(\lim_{n\rightarrow \infty } s_{n} = 0\).

3 Main result

Theorem 3.1

Let C be a nonempty closed convex subset of a real Hilbert space H. Let A be an α-inverse strongly monotone mapping of H into itself and B be a maximal monotone operator on H such that the domain of B is included in C, and assume that \((A+B)^{-1}(0)\) is nonempty. Let \(J_{\lambda }^{B}=(I+\lambda B)^{-1}\) be the resolvent of B for \(\lambda > 0\) and f be a k-contraction mapping of C into itself. Let \(x_{0},x_{1} \in C\) and \(\{x_{n}\} \subset C\) be a sequence generated by

$$\begin{aligned} \textstyle\begin{cases} z_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ y_{n} = \alpha _{n} f(z_{n})+(1-\alpha _{n})J_{\lambda _{n}}^{B} (z_{n}- \lambda _{n} Az_{n}+e_{n}), \\ x_{n+1} = J_{\lambda _{n}}^{B}(y_{n}-\lambda _{n} Ay_{n}+e_{n}), \end{cases}\displaystyle \end{aligned}$$

for all \(n \in \mathbb{N}\), where \(\{\alpha _{n}\} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0,2\alpha )\), \(\{e_{n}\} \subset H\), and \(\{\theta _{n}\} \subset [0,\theta ]\) such that \(\theta \in [0,1)\) satisfy the following conditions:

  1. (C1)

    \(\lim_{n\rightarrow \infty } \alpha _{n} = 0\) and \(\sum_{n=1}^{\infty }\alpha _{n} = \infty \),

  2. (C2)

    \(0< a\leq \lambda _{n} \leq b < 2\alpha \) for some \(a,b>0\),

  3. (C3)

    \(\lim_{n\rightarrow \infty }\frac{\|e_{n}\|}{\alpha _{n}}=0\),

  4. (C4)

    \(\sum_{n=1}^{\infty }\|e_{n}\| < \infty \) and \(\lim_{n\rightarrow \infty } \frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1} \| = 0\).

Then the sequence \(\{x_{n}\}\) converges strongly to a point \(x^{*} \in (A+B)^{-1}(0)\) where \(x^{*} = P_{(A+B)^{-1}(0)} f(x^{*})\).

Proof

Picking \(z\in (A+B)^{-1}(0)\) and fixing \(n \in \mathbb{N}\), it follows that \(z=J_{\lambda _{n}}^{B} (z-\lambda _{n} Az)\). Firstly, we will show that \(\{x_{n}\}\), \(\{y_{n}\}\), and \(\{z_{n}\}\) are bounded. Since

$$\begin{aligned} \Vert z_{n}-z \Vert \leq \Vert x_{n}-z \Vert + \theta _{n} \Vert x_{n}-x_{n-1} \Vert , \end{aligned}$$

it follows from the nonexpansiveness of \(J_{\lambda _{n}}^{B}\) and \(I-\lambda _{n} A\) that

$$\begin{aligned} \Vert y_{n}-z \Vert =& \bigl\Vert \alpha _{n} \bigl(f(z_{n})-z\bigr)+(1-\alpha _{n}) \bigl(J_{ \lambda _{n}}^{B}(z_{n}-\lambda _{n} Az_{n}+e_{n})-z\bigr) \bigr\Vert \\ \leq & \alpha _{n} \bigl( \bigl\Vert f(z_{n})-f(z) \bigr\Vert + \bigl\Vert f(z)-z \bigr\Vert \bigr) \\ &{} +(1-\alpha _{n}) \bigl\Vert (z_{n}-\lambda _{n} Az_{n}+e_{n})-(z-\lambda _{n} Az) \bigr\Vert \\ \leq & \alpha _{n} \bigl(k \Vert z_{n}-z \Vert + \bigl\Vert f(z)-z \bigr\Vert \bigr) +(1-\alpha _{n}) \bigl( \Vert z_{n}-z \Vert + \Vert e_{n} \Vert \bigr) \\ \leq & \bigl(1-\alpha _{n}(1-k)\bigr) \Vert z_{n}-z \Vert +\alpha _{n} \bigl\Vert f(z)-z \bigr\Vert + \Vert e_{n} \Vert \\ \leq & \bigl(1-\alpha _{n}(1-k)\bigr) \Vert x_{n}-z \Vert +\theta _{n} \Vert x_{n}-x_{n-1} \Vert + \alpha _{n} \bigl\Vert f(z)-z \bigr\Vert + \Vert e_{n} \Vert . \end{aligned}$$

It follows by the same arguments again that

$$\begin{aligned} \Vert x_{n+1}-z \Vert =& \bigl\Vert J_{\lambda _{n}}^{B}(y_{n}- \lambda _{n} Ay_{n}+e_{n})-J_{ \lambda _{n}}^{B} (z-\lambda _{n} Az) \bigr\Vert \\ \leq & \bigl\Vert (y_{n}-\lambda _{n} Ay_{n}+e_{n})-(z-\lambda _{n} Az) \bigr\Vert \\ \leq & \Vert y_{n}-z \Vert + \Vert e_{n} \Vert \\ \leq & \bigl(1-\alpha _{n}(1-k)\bigr) \Vert x_{n}-z \Vert \\ &{} +\alpha _{n}(1-k) \biggl( \frac{1}{1-k} \frac{\theta _{n}}{\alpha _{n}} \Vert x_{n}-x_{n-1} \Vert + \frac{ \Vert f(z)-z \Vert }{1-k} \biggr) +2 \Vert e_{n} \Vert . \end{aligned}$$

So, by condition (C4) and putting \(M = \frac{1}{1-k} ( \|f(z)-z\|+ \sup_{n\in \mathbb{N}} \frac{\theta _{n}}{\alpha _{n}} \|x_{n}-x_{n-1}\| ) \geq 0\) in Lemma 2.6 (i), we conclude that the sequence \(\{\|x_{n}-z\|\}\) is bounded. That is, the sequence \(\{x_{n}\}\) is bounded, and so is \(\{z_{n}\}\). Moreover, by condition (C4), \(\sum_{n=1}^{\infty }\|e_{n}\| < \infty \) implies \(\lim_{n\rightarrow \infty } \|e_{n}\| =0\), that is, \(\lim_{n \rightarrow \infty } e_{n} =0\); it follows that the sequence \(\{y_{n}\}\) is also bounded.

Since \(P_{(A+B)^{-1}(0)} f\) is a k-contraction on C, by Banach’s contraction principle there exists a unique element \(x^{*} \in C\) such that \(x^{*} = P_{(A+B)^{-1}(0)} f(x^{*})\); in particular, \(x^{*} \in (A+B)^{-1}(0)\), and hence \(x^{*}=J_{\lambda _{n}}^{B} (x^{*}-\lambda _{n} Ax^{*})\). Now, we will show that \(x_{n} \rightarrow x^{*}\) as \(n\rightarrow \infty \). To this end, observe that

$$\begin{aligned} \bigl\Vert z_{n}-x^{*} \bigr\Vert ^{2} =& \bigl\langle z_{n}-x^{*},z_{n}-x^{*} \bigr\rangle \\ =& \bigl\langle x_{n}+\theta _{n}(x_{n}-x_{n-1})-x^{*},z_{n}-x^{*} \bigr\rangle \\ =& \bigl\langle x_{n}-x^{*},z_{n}-x^{*} \bigr\rangle +\theta _{n} \bigl\langle x_{n}-x_{n-1},z_{n}-x^{*} \bigr\rangle \\ \leq & \bigl\Vert x_{n}-x^{*} \bigr\Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert + \theta _{n} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert \\ \leq & \frac{1}{2} \bigl( \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2}+ \bigl\Vert z_{n}-x^{*} \bigr\Vert ^{2} \bigr)+ \theta _{n} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert . \end{aligned}$$

This implies that

$$ \bigl\Vert z_{n}-x^{*} \bigr\Vert ^{2} \leq \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2}+2\theta _{n} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert . $$
(3.1)

It follows by (3.1), Lemma 2.4, and the firm nonexpansiveness of \(J_{\lambda _{n}}^{B}\) that

$$\begin{aligned}& \bigl\| J_{\lambda _{n}}^{B} (z_{n}-\lambda _{n} Az_{n}+e_{n})-x^{*}\bigr\| ^{2} \\& \quad = \bigl\Vert J_{\lambda _{n}}^{B} (z_{n}-\lambda _{n} Az_{n}+e_{n})-J_{ \lambda _{n}}^{B} \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\& \quad \leq \bigl\Vert (z_{n}-\lambda _{n} Az_{n}+e_{n})-\bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\& \qquad {} - \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})- \bigl(I-J_{ \lambda _{n}}^{B}\bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\& \quad \leq \bigl( \bigl\Vert (z_{n}-\lambda _{n} Az_{n}) -\bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert + \Vert e_{n} \Vert \bigr)^{2} \\& \qquad {} - \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})- \bigl(I-J_{ \lambda _{n}}^{B}\bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\& \quad = \bigl\Vert (I-\lambda _{n} A)z_{n} -(I-\lambda _{n} A)x^{*} \bigr\Vert ^{2} +2 \bigl\Vert (z_{n}- \lambda _{n} Az_{n})- \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert \Vert e_{n} \Vert \\& \qquad {} + \Vert e_{n} \Vert ^{2} - \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})-\bigl(I-J_{ \lambda _{n}}^{B} \bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\& \quad \leq \bigl\Vert z_{n}-x^{*} \bigr\Vert ^{2} -\lambda _{n} (2\alpha -\lambda _{n}) \bigl\Vert Az_{n}-Ax^{*} \bigr\Vert ^{2} \ \\& \qquad {} +2 \bigl\Vert (z_{n}-\lambda _{n} Az_{n})-\bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert \Vert e_{n} \Vert + \Vert e_{n} \Vert ^{2} \\& \qquad {} - \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})- \bigl(I-J_{ \lambda _{n}}^{B}\bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\& \quad \leq \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2}+2\theta _{n} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert -\lambda _{n} (2\alpha -\lambda _{n}) \bigl\Vert Az_{n}-Ax^{*} \bigr\Vert ^{2} \\& \qquad {} +2 \bigl\Vert (z_{n}-\lambda _{n} Az_{n})-\bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert \Vert e_{n} \Vert + \Vert e_{n} \Vert ^{2} \\& \qquad {} - \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})- \bigl(I-J_{ \lambda _{n}}^{B}\bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2}. \end{aligned}$$
(3.2)

We also have

$$\begin{aligned} \bigl\Vert y_{n}-x^{*} \bigr\Vert ^{2} =& \bigl\langle y_{n}-x^{*},y_{n}-x^{*} \bigr\rangle \\ =& \bigl\langle \alpha _{n} f(z_{n})+(1-\alpha _{n})J_{\lambda _{n}}^{B} (z_{n}- \lambda _{n} Az_{n}+e_{n})-x^{*} ,y_{n}-x^{*}\bigr\rangle \\ =& \bigl\langle \alpha _{n} \bigl(f(z_{n})-x^{*} \bigr)+(1-\alpha _{n}) \bigl(J_{\lambda _{n}}^{B} (z_{n}-\lambda _{n} Az_{n}+e_{n})-x^{*} \bigr) ,y_{n}-x^{*}\bigr\rangle \\ =& \alpha _{n} \bigl\langle f(z_{n})-f \bigl(x^{*}\bigr),y_{n}-x^{*} \bigr\rangle + \alpha _{n} \bigl\langle f\bigl(x^{*} \bigr)-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ & {}+(1-\alpha _{n})\bigl\langle J_{\lambda _{n}}^{B} (z_{n}-\lambda _{n} Az_{n}+e_{n})-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ \leq & \alpha _{n} k \bigl\Vert z_{n}-x^{*} \bigr\Vert \bigl\Vert y_{n}-x^{*} \bigr\Vert + \alpha _{n} \bigl\langle f\bigl(x^{*} \bigr)-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ &{} +(1-\alpha _{n}) \bigl\Vert J_{\lambda _{n}}^{B} (z_{n}-\lambda _{n} Az_{n}+e_{n})-x^{*} \bigr\Vert \bigl\Vert y_{n}-x^{*} \bigr\Vert \\ \leq & \frac{1}{2} \alpha _{n} k \bigl( \bigl\Vert z_{n}-x^{*} \bigr\Vert ^{2}+ \bigl\Vert y_{n}-x^{*} \bigr\Vert ^{2} \bigr) +\alpha _{n} \bigl\langle f\bigl(x^{*}\bigr)-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ &{} +\frac{1}{2} (1-\alpha _{n}) \bigl( \bigl\Vert J_{\lambda _{n}}^{B} (z_{n}- \lambda _{n} Az_{n}+e_{n})-x^{*} \bigr\Vert ^{2}+ \bigl\Vert y_{n}-x^{*} \bigr\Vert ^{2} \bigr). \end{aligned}$$

This implies that

$$\begin{aligned} \bigl\Vert y_{n}-x^{*} \bigr\Vert ^{2} \leq & \frac{\alpha _{n} k}{1+\alpha _{n} (1-k)} \bigl\Vert z_{n}-x^{*} \bigr\Vert ^{2} +\frac{2\alpha _{n}}{1+\alpha _{n} (1-k)} \bigl\langle f \bigl(x^{*}\bigr)-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ &{} +\frac{1-\alpha _{n}}{1+\alpha _{n} (1-k)} \bigl\Vert J_{\lambda _{n}}^{B} (z_{n}- \lambda _{n} Az_{n}+e_{n})-x^{*} \bigr\Vert ^{2}. \end{aligned}$$
(3.3)

Hence, by (3.1), (3.2), (3.3), the nonexpansiveness of \(J_{\lambda _{n}}^{B}\), and \(I-\lambda _{n} A\), we obtain

$$\begin{aligned} \bigl\Vert x_{n+1}-x^{*} \bigr\Vert ^{2} =& \bigl\Vert J_{\lambda _{n}}^{B} (y_{n}-\lambda _{n} Ay_{n}+e_{n}) - J_{\lambda _{n}}^{B} \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\ \leq & \bigl\Vert (y_{n}-\lambda _{n} Ay_{n}+e_{n}) - \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \\ \leq & \bigl( \bigl\Vert y_{n}-x^{*} \bigr\Vert + \Vert e_{n} \Vert \bigr)^{2} \\ =& \bigl\Vert y_{n}-x^{*} \bigr\Vert ^{2} + 2 \bigl\Vert y_{n}-x^{*} \bigr\Vert \Vert e_{n} \Vert + \Vert e_{n} \Vert ^{2} \\ \leq & \frac{\alpha _{n} k}{1+\alpha _{n} (1-k)} \bigl( \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2}+2\theta _{n} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert \bigr) \\ &{} + 2 \bigl\Vert y_{n}-x^{*} \bigr\Vert \Vert e_{n} \Vert + \Vert e_{n} \Vert ^{2} + \frac{2\alpha _{n}}{1+\alpha _{n} (1-k)} \bigl\langle f\bigl(x^{*} \bigr)-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ &{} +\frac{1-\alpha _{n}}{1+\alpha _{n} (1-k)} \bigl( \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2}+2 \theta _{n} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert \\ &{} -\lambda _{n} (2\alpha -\lambda _{n}) \bigl\Vert Az_{n}-Ax^{*} \bigr\Vert ^{2} +2 \bigl\Vert (z_{n}- \lambda _{n} Az_{n})- \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert \Vert e_{n} \Vert \\ &{} + \Vert e_{n} \Vert ^{2}- \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})-\bigl(I-J_{ \lambda _{n}}^{B} \bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \bigr). \end{aligned}$$

It follows that

$$\begin{aligned} \bigl\| x_{n+1}-x^{*}\bigr\| ^{2} \leq & \biggl( 1-\frac{\alpha _{n} (1-k)}{1+\alpha _{n} (1-k)} \biggr) \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2} \\ & {}+ \frac{\alpha _{n} (1-k)}{1+\alpha _{n} (1-k)} \biggl( \frac{2}{1-k} \frac{\theta _{n}}{\alpha _{n}} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert \\ & {}+ \frac{4}{1-k} \frac{ \Vert e_{n} \Vert }{\alpha _{n}} \bigl\Vert y_{n}-x^{*} \bigr\Vert + \frac{2}{1-k} \frac{ \Vert e_{n} \Vert }{\alpha _{n}} \Vert e_{n} \Vert +\frac{2}{1-k} \bigl\langle f\bigl(x^{*}\bigr)-x^{*} ,y_{n}-x^{*} \bigr\rangle \\ &{} +\frac{2}{1-k} \frac{ \Vert e_{n} \Vert }{\alpha _{n}} \bigl\Vert (z_{n}-\lambda _{n} Az_{n})- \bigl(x^{*}- \lambda _{n} Ax^{*}\bigr) \bigr\Vert \biggr) \end{aligned}$$

and

$$\begin{aligned} \bigl\| x_{n+1}-x^{*}\bigr\| ^{2} \leq & \bigl\Vert x_{n}-x^{*} \bigr\Vert ^{2} - \bigl( \lambda _{n} (2\alpha -\lambda _{n}) \bigl\Vert Az_{n}-Ax^{*} \bigr\Vert ^{2} \\ &{} + \bigl\Vert \bigl(I-J_{\lambda _{n}}^{B}\bigr) (z_{n}-\lambda _{n} Az_{n}+e_{n})- \bigl(I-J_{ \lambda _{n}}^{B}\bigr) \bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert ^{2} \bigr) \\ &{} + \biggl( 2\alpha _{n} \frac{\theta _{n}}{\alpha _{n}} \Vert x_{n}-x_{n-1} \Vert \bigl\Vert z_{n}-x^{*} \bigr\Vert +2\alpha _{n} \frac{ \Vert e_{n} \Vert }{\alpha _{n}} \bigl\Vert y_{n}-x^{*} \bigr\Vert +2 \Vert e_{n} \Vert ^{2} \\ &{} +2\alpha _{n} \bigl\Vert f\bigl(x^{*} \bigr)-x^{*} \bigr\Vert \bigl\Vert y_{n}-x^{*} \bigr\Vert + 2 \bigl\Vert (z_{n}- \lambda _{n} Az_{n})-\bigl(x^{*}-\lambda _{n} Ax^{*}\bigr) \bigr\Vert \Vert e_{n} \Vert \biggr), \end{aligned}$$

which are of the forms

$$ s_{n+1} \leq (1-\gamma _{n})s_{n} + \gamma _{n} \delta _{n} $$

and

$$ s_{n+1} \leq s_{n} -\eta _{n} + \rho _{n}, $$

respectively, where \(s_{n}=\|x_{n}-x^{*}\|^{2}\), \(\gamma _{n} = \frac{\alpha _{n} (1-k)}{1+\alpha _{n} (1-k)}\), \(\delta _{n} = \frac{2}{1-k} \frac{\theta _{n}}{\alpha _{n}} \| x_{n}-x_{n-1}\| \|z_{n}-x^{*} \| + \frac{4}{1-k} \frac{\|e_{n}\|}{\alpha _{n}} \|y_{n}-x^{*}\| + \frac{2}{1-k}\frac{\|e_{n}\|}{\alpha _{n}}\|e_{n}\| +\frac{2}{1-k} \langle f(x^{*})-x^{*} ,y_{n}-x^{*} \rangle +\frac{2}{1-k} \frac{\|e_{n}\|}{\alpha _{n}} \|(z_{n}-\lambda _{n} Az_{n})-(x^{*}- \lambda _{n} Ax^{*})\|\), \(\eta _{n} = \lambda _{n} (2\alpha -\lambda _{n}) \|Az_{n}-Ax^{*}\|^{2} +\|(I-J_{\lambda _{n}}^{B})(z_{n}-\lambda _{n} Az_{n}+e_{n})-(I-J_{ \lambda _{n}}^{B})(x^{*}-\lambda _{n} Ax^{*})\|^{2} \) and \(\rho _{n} =2\alpha _{n} \frac{\theta _{n}}{\alpha _{n}} \| x_{n}-x_{n-1} \| \|z_{n}-x^{*} \|+2\alpha _{n} \frac{\|e_{n}\|}{\alpha _{n}}\|y_{n}-x^{*} \|+2\|e_{n}\|^{2} +2\alpha _{n} \|f(x^{*})-x^{*}\| \|y_{n}-x^{*}\| + 2 \|(z_{n}-\lambda _{n} Az_{n})-(x^{*}-\lambda _{n} Ax^{*})\| \|e_{n}\| \). Therefore, using conditions (C1), (C3), and (C4), we can check that all those sequences satisfy conditions (i) and (ii) in Lemma 2.7. To complete the proof, we verify that condition (iii) in Lemma 2.7 is satisfied. Let \(\lim_{i\rightarrow \infty }\eta _{n_{i}} = 0\). Then, by condition (C2), we have

$$ \lim_{i\rightarrow \infty } \bigl\Vert Az_{n_{i}}-Ax^{*} \bigr\Vert = 0 $$
(3.4)

and

$$ \lim_{i\rightarrow \infty } \bigl\Vert \bigl(I-J_{\lambda _{n_{i}}}^{B} \bigr) (z_{n_{i}}- \lambda _{n_{i}} Az_{n_{i}}+e_{n_{i}})- \bigl(I-J_{\lambda _{n_{i}}}^{B}\bigr) \bigl(x^{*}- \lambda _{n_{i}} Ax^{*}\bigr) \bigr\Vert = 0. $$

It follows by conditions (C2), (C4) and (3.4) that

$$\begin{aligned}& \lim_{i\rightarrow \infty } \bigl\Vert (z_{n_{i}}-\lambda _{n_{i}} Az_{n_{i}}+e_{n_{i}})-J_{ \lambda _{n_{i}}}^{B} (z_{n_{i}}-\lambda _{n_{i}} Az_{n_{i}}+e_{n_{i}}) \\& \quad {}-\bigl(\bigl(x^{*}-\lambda _{n_{i}} Ax^{*} \bigr)-J_{\lambda _{n_{i}}}^{B} \bigl(x^{*}- \lambda _{n_{i}} Ax^{*}\bigr)\bigr) \bigr\Vert = 0, \\& \lim_{i\rightarrow \infty } \bigl\Vert z_{n_{i}}-J_{\lambda _{n_{i}}}^{B} (z_{n_{i}}- \lambda _{n_{i}} Az_{n_{i}}) \bigr\Vert = 0. \end{aligned}$$
(3.5)

Consider a subsequence \(\{x_{n_{i}}\}\) of \(\{x_{n}\}\). Since \(\{x_{n}\}\) is bounded, so is \(\{x_{n_{i}}\}\); hence there exists a subsequence \(\{x_{n_{i_{j}}}\}\) of \(\{x_{n_{i}}\}\) which converges weakly to some \(x \in C\). Without loss of generality, we may assume that \(x_{n_{i}} \rightharpoonup x\) as \(i\rightarrow \infty \). On the other hand, by conditions (C1) and (C4), we have

$$ \lim_{i\rightarrow \infty } \Vert z_{n_{i}}-x_{n_{i}} \Vert = \lim_{i \rightarrow \infty } \alpha _{n_{i}} \frac{\theta _{n_{i}}}{\alpha _{n_{i}}} \Vert x_{n_{i}}-x_{{n_{i}}-1} \Vert = 0. $$

It follows that \(z_{n_{i}} \rightharpoonup x\) as \(i\rightarrow \infty \). Hence, by (3.5) and the demiclosedness at zero in Lemma 2.5, we obtain \(x \in \text{Fix}(J_{\lambda _{n_{i}}}^{B} (I-\lambda _{n_{i}}A))\), that is, \(x \in (A+B)^{-1}(0)\). Since

$$\begin{aligned} \Vert y_{n_{i}}-z_{n_{i}} \Vert \leq \alpha _{n_{i}} \bigl\Vert f(z_{n_{i}})-z_{n_{i}} \bigr\Vert +(1-\alpha _{n_{i}}) \bigl\Vert J_{\lambda _{n_{i}}}^{B} (z_{n_{i}}-\lambda _{n_{i}}Az_{n_{i}}+e_{n_{i}})-z_{n_{i}} \bigr\Vert , \end{aligned}$$

then, by (3.5) and conditions (C1) and (C4), we obtain

$$ \lim_{i\rightarrow \infty } \Vert y_{n_{i}}-z_{n_{i}} \Vert = 0. $$

It implies that \(y_{n_{i}} \rightharpoonup x\) as \(i\rightarrow \infty \). Therefore, by Lemma 2.1(iii), we obtain

$$ \limsup_{i\rightarrow \infty }\bigl\langle f\bigl(x^{*} \bigr)-x^{*},y_{n_{i}}-x^{*} \bigr\rangle = \bigl\langle f\bigl(x^{*}\bigr)-x^{*},x-x^{*} \bigr\rangle \leq 0. $$

It follows by conditions (C1), (C3), and (C4) that \(\limsup_{i\rightarrow \infty } \delta _{n_{i}} \leq 0\). So, by Lemma 2.7, we conclude that \(x_{n} \rightarrow x^{*}\) as \(n\rightarrow \infty \). This completes the proof. □

Remark 3.2

Indeed, the parameter \(\theta _{n}\) can be chosen as follows:

$$ \theta _{n} = \textstyle\begin{cases} \min \{ \frac{\omega _{n}}{ \Vert x_{n}-x_{n-1} \Vert }, \alpha _{n} \} & \text{if } x_{n} \neq x_{n-1}, \\ \alpha _{n} & \text{otherwise}, \end{cases}\displaystyle \quad \forall n\in \mathbb{N} $$

or

$$ \theta _{n} = \textstyle\begin{cases} \textstyle\begin{cases} \sigma _{n} \in [0,1) \text{ such that }\sigma _{n} \rightarrow 0 \text{ as } n\rightarrow \infty \text{ or} \\ \sigma _{n} \in [0,1) \text{ such that }\sigma _{n} \rightarrow 1 \text{ as } n\rightarrow \infty \text{ or} \\ \sigma _{n} \in [0,1) \text{ to be chosen arbitrarily} \end{cases}\displaystyle & \text{if } n \leq N, \\ \textstyle\begin{cases} \min \{ \frac{\omega _{n}}{ \Vert x_{n}-x_{n-1} \Vert }, \alpha _{n} \} & \text{if } x_{n} \neq x_{n-1}, \\ \alpha _{n} & \text{otherwise}, \end{cases}\displaystyle & \text{otherwise}, \end{cases}\displaystyle \quad \forall n\in \mathbb{N}, $$

where \(N \in \mathbb{N}\) and \(\{\omega _{n}\}\) is a positive sequence such that \(\omega _{n} = o(\alpha _{n})\).
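
A minimal MATLAB sketch of the first choice of \(\theta _{n}\) above, taking \(\omega _{n} = 1/(n+1)^{3}\) (the sequence used in Sect. 5) as an illustrative \(o(\alpha _{n})\) sequence:

```matlab
% theta_n from Remark 3.2 (first option), with omega_n = 1/(n+1)^3, which
% satisfies omega_n = o(alpha_n) for alpha_n proportional to 1/(n+1).
function th = choose_theta(n, x_n, x_nm1, alpha_n)
    d = norm(x_n(:) - x_nm1(:), 2);      % ||x_n - x_{n-1}||
    if d > 0
        th = min((1/(n+1)^3) / d, alpha_n);
    else
        th = alpha_n;
    end
end
```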

4 IFISTA

Let \(F:H \rightarrow \mathbb{R}\) be a convex and differentiable function and \(G:H \rightarrow \mathbb{R}\cup \{+\infty \}\) be a convex and lower semi-continuous function such that the gradient ∇F is L-Lipschitz continuous and ∂G is the subdifferential of G. It is well known that if ∇F is L-Lipschitz continuous, then it is \(\frac{1}{L}\)-inverse strongly monotone [32]. Moreover, ∂G is maximal monotone [33]. Putting \(A=\nabla F\), \(B=\partial G\), and \(\alpha = \frac{1}{L}\) into Theorem 3.1, we obtain the following result.

Theorem 4.1

Let H be a real Hilbert space. Let \(F: H\rightarrow \mathbb{R}\) be a convex and differentiable function with L-Lipschitz continuous gradient ∇F and \(G: H \rightarrow \mathbb{R}\) be a convex and lower semi-continuous function. Let f be a k-contraction mapping of H into itself, and assume that \((\nabla F+\partial G)^{-1}(0)\) is nonempty. Let \(x_{0},x_{1} \in H\) and \(\{x_{n}\} \subset H\) be a sequence generated by

$$\begin{aligned} \textstyle\begin{cases} z_{n} = x_{n}+\theta _{n} (x_{n}-x_{n-1}), \\ y_{n} = \alpha _{n} f(z_{n})+(1-\alpha _{n})\textit{prox}_{\lambda _{n} G} (z_{n}-\lambda _{n} \nabla F(z_{n})+e_{n}), \\ x_{n+1} = \textit{prox}_{\lambda _{n} G}(y_{n}-\lambda _{n} \nabla F(y_{n})+e_{n}), \end{cases}\displaystyle \end{aligned}$$

for all \(n \in \mathbb{N}\), where \(\{\alpha _{n}\} \subset (0,1)\), \(\{\lambda _{n}\} \subset (0, \frac{2}{L})\), \(\{e_{n}\} \subset H\), and \(\{\theta _{n}\} \subset [0,\theta ]\) such that \(\theta \in [0,1)\) satisfy the following conditions:

  1. (C1)

    \(\lim_{n\rightarrow \infty } \alpha _{n} = 0\) and \(\sum_{n=1}^{\infty }\alpha _{n} = \infty \),

  2. (C2)

    \(0< a\leq \lambda _{n} \leq b < \frac{2}{L}\) for some \(a,b>0\),

  3. (C3)

    \(\lim_{n\rightarrow \infty }\frac{\|e_{n}\|}{\alpha _{n}}=0\),

  4. (C4)

    \(\sum_{n=1}^{\infty }\|e_{n}\| < \infty \) and \(\lim_{n\rightarrow \infty } \frac{\theta _{n}}{\alpha _{n}}\|x_{n}-x_{n-1} \| = 0\),

then the sequence \(\{x_{n}\}\) converges strongly to a point \(x^{*} \in (\nabla F+\partial G)^{-1}(0)\), where \(x^{*} = P_{(\nabla F+\partial G)^{-1}(0)} f(x^{*})\).
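
For illustration, a minimal MATLAB sketch of the iteration in Theorem 4.1 is given below; gradF, proxG, f, and the parameter rules alpha, lambda, theta, err are user-supplied function handles (placeholder names, not the paper's code) assumed to satisfy conditions (C1)–(C4):

```matlab
% Sketch of the IFISTA iteration of Theorem 4.1. All handles are placeholders:
%   gradF(x)      : gradient of F at x
%   proxG(x, lam) : prox_{lam*G}(x)
%   f(x)          : k-contraction, e.g. f = @(x) x/5 as in Section 5
%   alpha(n), lambda(n), theta(n, x_n, x_nm1), err(n) : parameter rules
function x = ifista(x0, x1, nmax, gradF, proxG, f, alpha, lambda, theta, err)
    x_prev = x0;  x = x1;
    for n = 1:nmax
        a   = alpha(n);   lam = lambda(n);
        th  = theta(n, x, x_prev);
        e   = err(n);
        z   = x + th * (x - x_prev);                       % inertial step
        y   = a * f(z) + (1 - a) * proxG(z - lam * gradF(z) + e, lam);
        x_prev = x;
        x   = proxG(y - lam * gradF(y) + e, lam);          % x_{n+1}
    end
end
```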

We now focus on image restoration using the fixed point optimization algorithm in Theorem 4.1. The image deblurring problem takes the form

$$\begin{aligned} Ax = b+\varepsilon , \end{aligned}$$
(4.1)

where \(A \in \mathbb{R}^{m \times n}\) represents a known blurring operator (called the point spread function, PSF), \(b \in \mathbb{R}^{m}\) is the known observed blurred (and noisy) image, \(\varepsilon \in \mathbb{R}^{m}\) is an unknown additive white Gaussian noise, and \(x \in \mathbb{R}^{n}\) is the unknown signal/image to be restored (estimated). Both b and x are formed by stacking the columns of their corresponding two-dimensional images.

In order to solve problem (4.1), we use the least absolute shrinkage and selection operator (LASSO) of Tibshirani [34], that is, the following minimization problem:

$$ \min_{x \in \mathbb{R}^{n}} \bigl\lbrace \Vert Ax-b \Vert _{2}^{2} +\lambda \Vert Wx \Vert _{1} \bigr\rbrace , $$
(4.2)

where \(\lambda > 0\) is a regularization parameter and \(W: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) represents the orthogonal or tight frame wavelet synthesis. This is a special case of the convex minimization problem (1.2) with \(F(x) = \|Ax-b\|_{2}^{2}\) and \(G(x) = \lambda \|Wx\|_{1}\), where \(\|x\|_{1} = \sum_{i=1}^{n} |x_{i}|\) and \(\|x\|_{2} = \sqrt{\sum_{i=1}^{n} |x_{i}|^{2}}\) for all \(x = (x_{1},x_{2},\ldots,x_{n})^{T} \in \mathbb{R}^{n}\). It follows from Lemma 2.2, by putting \(T(Ax) = P_{\{b\}}(Ax) = b\), that \(\nabla F(x) = 2A^{T} (Ax-b)\) and ∇F is L-Lipschitzian with \(L=2\|A\|^{2}\), where \(A^{T}\) stands for the transpose of A and \(\|A\|\) is the largest singular value of A (i.e., the square root of the largest eigenvalue of the matrix \(A^{T} A\)), that is, the spectral norm \(\|A\|_{2}\).

In this image deblurring setting of Theorem 4.1, if the blurring operator (PSF) A is smaller than the observed blurred image b and the image x to be restored, it is first embedded by padPSF in MATLAB into the matrix \(A_{\text{big}} \in \mathbb{R}^{m \times n}\), which is then transformed to the signal matrix \(A_{\text{sig}} \in \mathbb{R}^{m \times n}\), and the matrix \(A_{\text{eig}} = (a^{\text{eig}}_{ij})\in \mathbb{R}^{m \times n}\) of eigenvalues of the signal matrix \(A_{\text{sig}}\) is computed using the fast Fourier transform (FFT) or the discrete cosine transform (DCT). That is,

$$ L = 2 \Vert A_{\text{eig}} \Vert _{\text{max}}^{2} = 2 \Bigl(\max_{ij} \bigl\vert a^{\text{eig}}_{ij} \bigr\vert \Bigr)^{2}. $$

We set \(m=n\), and the gradient ∇F maps the signal x to 2 times the signal \(A^{T} (Ax-b)\), where x, \(A^{T}\), A, and b are represented in the FFT or DCT signal domain. That is,

$$ \nabla F(x) := \nabla F(x_{\text{sig}}) = 2A_{\text{eig}}^{T} (A_{\text{eig}}x_{\text{sig}}-b_{\text{sig}}) := 2 \underbrace{A^{T}(Ax-b)}_{ \text{signal form in } \mathbb{R}^{m}}, $$

where \(A_{\text{eig}}^{T} = A_{\text{eig}}^{-1}\), with \(A_{\text{eig}}^{T}\) and \(A_{\text{eig}}^{-1}\) standing for the transpose and the inverse signal transform of the eigenvalue matrix \(A_{\text{eig}}\), respectively.
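
As an illustration, the sketch below forms the eigenvalue array, the Lipschitz constant L, and ∇F under the simplifying assumption of periodic boundary conditions (so fft2 gives the eigenvalues); the experiments in Sect. 5 instead use the DCT for symmetric boundaries. psfGauss and padPSF are the deblurring-toolbox routines named in the text, with their exact calling forms assumed, and b is the observed blurred image assumed to be in the workspace.

```matlab
% Eigenvalue array of the blur and the Lipschitz constant L (periodic BCs).
[PSF, center] = psfGauss([9 9], 4);          % 9x9 Gaussian PSF, std 4 (toolbox call assumed)
Abig  = padPSF(PSF, [512 512]);              % embed PSF into a 512x512 array (call assumed)
Aeig  = fft2(circshift(Abig, 1 - center));   % spectral eigenvalues of the blurring operator
L     = 2 * max(abs(Aeig(:)))^2;             % Lipschitz constant of grad F

% grad F(x) = 2 A^T (A x - b), evaluated in the spectral domain:
gradF = @(x) 2 * real(ifft2(conj(Aeig) .* (Aeig .* fft2(x) - fft2(b))));
```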

By [35] and the references therein, for all \(u=(u_{1},u_{2},\ldots,u_{m})^{T} \in \mathbb{R}^{m}\) and for each \(n \in \mathbb{N}\), we have

$$ \text{prox}_{\lambda _{n} G}(u) = \text{prox}_{\lambda _{n} \lambda \|Wu \|_{1}}(u) = v $$

where \(v = (v_{1},v_{2},\ldots,v_{m})^{T} \in \mathbb{R}^{m}\) with \(v_{i} = \text{sign}((Wu)_{i})\max \{|(Wu)_{i}|-\lambda _{n} \lambda ,0 \}\) for all \(i=1,2,\ldots,m\). After the \(\text{prox}_{\lambda _{n} G}\) step, the result is mapped back by \(W^{-1}(\text{prox}_{\lambda _{n} G}(u))\), where \(W^{-1}\) denotes the inverse wavelet synthesis with \(W^{-1}(\cdot ) = W^{T}(\cdot )\), before the remaining steps are carried out. That is,

$$ \text{prox}_{\lambda _{n} G} \bigl(z_{n}-\lambda _{n} \nabla F(z_{n})+e_{n}\bigr) = W^{-1} \bigl(\text{prox}_{\lambda _{n} \lambda \|W(\cdot )\|_{1}} \bigl(z_{n}-2 \lambda _{n} \underbrace{A^{T} (Az_{n}-b)}_{\text{signal form in } \mathbb{R}^{m}}+e_{n} \bigr)\bigr) $$

and

$$ \text{prox}_{\lambda _{n} G} \bigl(y_{n}-\lambda _{n} \nabla F(y_{n})+e_{n}\bigr) = W^{-1} \bigl(\text{prox}_{\lambda _{n} \lambda \|W(\cdot )\|_{1}} \bigl(y_{n}-2 \lambda _{n} \underbrace{A^{T} (Ay_{n}-b)}_{\text{signal form in } \mathbb{R}^{m}}+e_{n} \bigr)\bigr) $$

for all \(n \in \mathbb{N}\).
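
A sketch of one such prox evaluation is given below; W and Winv are user-supplied function handles for the wavelet analysis and synthesis (e.g. the three-stage Haar transform of Sect. 5), and the function name is a placeholder.

```matlab
% prox_{lam_n * lambda * ||W(.)||_1} followed by the inverse synthesis W^{-1},
% as used for both the y_n- and x_{n+1}-updates. W and Winv are user handles.
function p = prox_wavelet_l1(u, lam_n, lambda, W, Winv)
    c = W(u);                                        % wavelet coefficients of u
    c = sign(c) .* max(abs(c) - lam_n * lambda, 0);  % soft thresholding
    p = Winv(c);                                     % back to the image domain
end
```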

In the next section, we present IFISTA as Algorithm 1, an improvement of the fast iterative shrinkage thresholding algorithm [19], implemented with the same programming techniques as in [36].

Algorithm 1

Improved fast iterative shrinkage thresholding algorithm (IFISTA).

5 Applications and numerical examples

In this section, we illustrate the performance of IFISTA compared with IFBS, FISTA, FBMWA, FBMMMA, NAGA, and SCTIP for solving the image deblurring problem (4.1) through the LASSO problem (4.2) with \(\lambda = 10^{-4}\). We implemented all algorithms in MATLAB R2019a and ran them on a personal laptop with an Intel(R) Core(TM) i5-8250U CPU @ 1.80 GHz and 8 GB RAM. We use the following quality measures for image restoration (larger values are better).

Let \(x, x_{n} \in \mathbb{R}^{M \times N}\) represent the original image and the estimated image at the \(n\)th iteration, respectively. A short code sketch of these measures is given after the list.

  1. (1)

To measure how strong the signal is relative to the noise, we use the signal-to-noise ratio (SNR) of the images x and \(x_{n}\), defined (in decibels, dB) by

    $$ \text{SNR}(x,x_{n}) = 10\log _{10} \frac{ \Vert x_{n} \Vert _{2}^{2}}{ \Vert x-x_{n} \Vert _{2}^{2}}. $$
  2. (2)

To measure the signal relative to its peak value, we use the peak signal-to-noise ratio (PSNR) of the images x and \(x_{n}\), defined (in decibels, dB) by

    $$ \text{PSNR}(x,x_{n}) = 10\log _{10} \frac{\text{MAX}^{2}}{\text{MSE}(x,x_{n})} = 10\log _{10} \frac{\text{MAX}^{2}}{\frac{1}{cMN} \Vert x-x_{n} \Vert _{2}^{2}} $$

where MAX is the maximum possible pixel value of an m-bit image, MAX = \(2^{m}-1\) (for instance, MAX = 255 for an 8-bit image and MAX = 65535 for a 16-bit image), and \(\mathrm{MSE} (x,x_{n})\) is the mean squared error (MSE) of the images x and \(x_{n}\), defined by \(\text{MSE}(x,x_{n}) = \frac{1}{cMN}\|x-x_{n}\|_{2}^{2}\), where x and \(x_{n}\) are c-channel images (for instance, \(c=1\) for a gray or monochrome image, \(c = 3\) for an RGB color image, and \(c=4\) for a CMYK color image).

Similarly, the improvement in signal-to-noise ratio (ISNR) of the images x, \(x_{n}\), and b, where \(b\in \mathbb{R}^{M \times N}\) represents the observed blurred (and noisy) c-channel image, is defined (in dB) by

    $$\begin{aligned} \text{ISNR}(x,x_{n},b) =& \text{PSNR}(x,x_{n})-\text{PSNR}(x,b) \\ =& 10\log _{10} \frac{\text{MAX}^{2}}{\frac{1}{cMN} \Vert x-x_{n} \Vert _{2}^{2}} - 10\log _{10} \frac{\text{MAX}^{2}}{\frac{1}{cMN} \Vert x-b \Vert _{2}^{2}} \\ =& 10\log _{10} \frac{ \Vert x-b \Vert _{2}^{2}}{ \Vert x-x_{n} \Vert _{2}^{2}}. \end{aligned}$$
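
A minimal MATLAB sketch of these measures and of the relative-error stopping rule used below (handle names are placeholders; the image array is assumed to stack its channels so that numel(x) = cMN):

```matlab
% Quality measures (all in dB); x: original image, xn: current estimate,
% b: blurred/noisy observation, MAXv: peak pixel value (e.g. 1 for double images in [0,1]).
snr_   = @(x, xn)       10*log10( norm(xn(:))^2 / norm(x(:) - xn(:))^2 );
psnr_  = @(x, xn, MAXv) 10*log10( MAXv^2 * numel(x) / norm(x(:) - xn(:))^2 );
isnr_  = @(x, xn, b)    10*log10( norm(x(:) - b(:))^2 / norm(x(:) - xn(:))^2 );
% Relative-error stopping rule: ||x_{n+1} - x_n|| / ||x_n|| <= tol.
relerr = @(xnew, xold)  norm(xnew(:) - xold(:)) / norm(xold(:));
```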

For comparison, we consider the standard test images Cameraman, Woman, Pirate, and Living room, downloaded from [37]; each monochrome image consists of \(512 \times 512\) pixels and represents an original image \(x \in \mathbb{R}^{512 \times 512}\). We converted them to the double class type by im2double(imread(‘image_name’)) in MATLAB; the original images and their 2D three-stage Haar wavelet transforms \(Wx \in \mathbb{R}^{512 \times 512}\) are shown in Fig. 1.

Figure 1

Original images and their 2D three-stage Haar wavelet transform

The original images were blurred by a Gaussian blur of size \(9 \times 9\) and standard deviation 4 as the point spread function (PSF), representing the blurring operator A, generated by fspecial or psfGauss in MATLAB and applied with imfilter in MATLAB (with the symmetric option, which mirror-reflects the array across the array border), followed by additive zero-mean white Gaussian noise with standard deviation \(10^{-3}\); this produces the observed blurred and noisy images \(b \in \mathbb{R}^{512 \times 512}\) shown in Fig. 2. The PSF A was embedded by padPSF in MATLAB into the matrix \(A_{\text{big}} \in \mathbb{R}^{512 \times 512}\), which was transformed to a signal matrix \(A_{\text{sig}} \in \mathbb{R}^{512 \times 512}\), and the matrix \(A_{\text{eig}} = (a^{\text{eig}}_{ij})\in \mathbb{R}^{512 \times 512}\) of eigenvalues of the signal matrix \(A_{\text{sig}}\) was computed using the discrete cosine transform (DCT). That is, \(L = 2\|A_{\text{eig}}\|_{\text{max}}^{2} = 2(\max_{ij} |a^{\text{eig}}_{ij}|)^{2} \).

Figure 2

Observed blurred and noisy images

For the compared algorithms, all parameters were set for high performance. For each \(n \in \mathbb{N}\), we set

$$\begin{aligned}& \alpha _{n} = \textstyle\begin{cases} \frac{10^{-6}}{n+1} & \text{for FBMWA, NAGA, SCTIP, IFISTA}, \\ \frac{1}{2} (1-\frac{10^{-6}}{n+1} ) & \text{for FBMMMA}, \end{cases}\displaystyle \\& \beta _{n} = \textstyle\begin{cases} \frac{2n}{5n+1} & \text{for SCTIP}, \\ \frac{10^{-6}}{n+1} & \text{for FBMWA}, \\ \frac{1}{2} (1-\frac{10^{-6}}{n+1} ) & \text{for FBMMMA}, \end{cases}\displaystyle \gamma _{n} = \textstyle\begin{cases} 1-10^{-6}-\frac{10^{-6}}{n+1} & \text{for FBMWA,} \\ 1-\frac{10^{-6}}{n+1} & \text{for FBMMMA,} \end{cases}\displaystyle \end{aligned}$$

and, following [35], we selected the best choices of the parameter \(\lambda _{n}\) for fast convergence as in Table 1 (see also Tables 1–4 of Examples 4.3, 4.5, 4.8, and 4.10 in [35], respectively), where \(M = \frac{1}{L}\) for the \(\frac{1}{M}\)-Lipschitz continuous gradient ∇F; this leads to the high-performance setting

$$\begin{aligned}& \lambda _{n} = \textstyle\begin{cases} 2M-\frac{M}{10} & \text{for IFBS, FBMWA, FBMMMA, SCTIP, IFISTA (A5 type}), \\ \frac{M(n+2)}{n+1} & \text{for NAGA (B2 type}), \end{cases}\displaystyle \\& \theta _{n} = \textstyle\begin{cases} \sigma _{n} = \frac{t_{n}-1}{t_{n+1}} \text{ such that } t_{1} = 1 \text{ and } t_{n+1} = \frac{1+\sqrt{1+4t_{n}^{2}}}{2} & \text{if } n \leq 100, \\ (\text{except FISTA, for all $n \in \mathbb{N}$} ) & \\ \textstyle\begin{cases} \frac{1}{2^{n}} \text{ for FBMWA, FBMMMA}, \\ \textstyle\begin{cases} \min \{\frac{1/(n+1)^{2}}{ \Vert x_{n}-x_{n-1} \Vert ^{2}_{2}},0.5\} & \text{if } x_{n} \neq x_{n-1}, \\ 0 & \text{otherwise}, \end{cases}\displaystyle & \text{for IFBS} \\ \textstyle\begin{cases} \min \{ \frac{1/(n+1)^{3}}{ \Vert x_{n}-x_{n-1} \Vert _{2}}, \alpha _{n} \} & \text{if } x_{n} \neq x_{n-1}, \\ \alpha _{n} & \text{otherwise}, \end{cases}\displaystyle & \text{otherwise}, \end{cases}\displaystyle & \text{otherwise}, \end{cases}\displaystyle \end{aligned}$$

and the error sequence \(\{e_{n}\}\subset \mathbb{R}^{512\times 512}\) such that

$$ e_{n} = \textstyle\begin{cases} \frac{b}{n^{n}} & \text{if }n \leq 100, \\ \frac{b}{(n+1)^{3}} & \text{otherwise.} \end{cases} $$
Table 1 The best choices of the parameter \(\lambda _{n}\) for fast convergence
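
Putting the IFISTA parameter choices above into code, a minimal sketch (\(\lambda _{n}\) of A5 type; the handles plug into the iteration sketch given after Theorem 4.1, and the variable names are placeholders):

```matlab
% Parameter rules used for IFISTA in this section (values as stated above).
Mc     = 1 / L;                                   % written M in the text
alpha  = @(n) 1e-6 / (n + 1);
lambda = @(n) 2*Mc - Mc/10;                       % constant "A5 type" step size
% Error term e_n: b/n^n for n <= 100 (underflows to zero quickly), b/(n+1)^3 afterwards.
err    = @(n) (n <= 100) .* (b ./ n.^n) + (n > 100) .* (b ./ (n + 1).^3);
% theta_n follows the FISTA rule (via t_n) for n <= 100 and the Remark 3.2 rule
% with omega_n = 1/(n+1)^3 afterwards, as specified above.
```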

We also set \(f(x) = \frac{x}{5}\) for all \(x \in \mathbb{R}^{512\times 512}\) and choose the initial points \(x_{0} = x_{1} = b\) for all algorithms (except for FISTA, where \(y_{1} = x_{0} = b\)). Quoting from [38], max-norm regularization can be used; it constrains the norm of the vector of incoming weights at each hidden unit to be bounded by a constant c, and it was used for weights in both convolutional and fully connected layers. Since \(L = 2\|A_{\text{eig}}\|_{\text{max}}^{2}\), we can use SNR and ISNR, two quality measures of image restoration (larger values are better), to find the best hidden estimated image before convergence within the first 1 to 100 iterations and thus to show the high performance of each compared algorithm. That is, we find the hidden estimated images \(x^{*}\) and \(y^{*}\) such that

$$ \text{SNR}\bigl(x,x^{*}\bigr) = \max_{1\leq n \leq 100} \text{SNR}(x,x_{n}) \text{ and } \text{ISNR} \bigl(x,y^{*},b\bigr) = \max_{1\leq n \leq 100} \text{ISNR}(x,x_{n},b), $$

it is better if \(x^{*} = y^{*}\) (meaning that both hidden estimated images occur at the same iteration); the results are shown in Table 2. Moreover, we also report the relative error defined by

$$ \frac{ \Vert x_{n+1}-x_{n} \Vert _{2}}{ \Vert x_{n} \Vert _{2}} \leq \text{tol}, $$

where tol denotes the tolerance value attained by each compared algorithm within the first 1000 iterations, reported together with the corresponding SNR and ISNR values in Table 3; the convergence behavior is shown in Fig. 3 and Fig. 4.

Figure 3

The SNR convergence behavior of image deblurring

Figure 4

The ISNR convergence behavior of image deblurring

Table 2 The maximum SNR and ISNR values within the first 1 to 100 iterations for image deblurring
Table 3 The SNR and ISNR values at the first 1000 iterations for image deblurring

To evaluate the algorithms, we use arithmetic means over the compared algorithms, where \(\overline{n}\), \(\overline{\text{SNR}}\), \(\overline{\text{ISNR}}\), \(\overline{\text{CPU}}\), and \(\overline{\text{tol}}\) denote the means of n, SNR, ISNR, CPU time, and tol, respectively.

From the results of each algorithm within the first 1 to 100 iterations in Table 2, we see that IFBS, FISTA, and IFISTA have SNR and ISNR values close to the others (except for SCTIP), while their values of n and CPU time are vastly different from the others. We evaluate the algorithms in Table 4 against the criteria \(\text{CPU} \leq \overline{\text{CPU}}\), \(\text{SNR} \geq \overline{\text{SNR}}\), \(\text{ISNR} \geq \overline{\text{ISNR}}\), and \(n \leq \overline{n}\). All criteria are satisfied by IFISTA, while only the CPU time, SNR, and ISNR criteria are satisfied by both IFBS and FISTA, with the SNR and ISNR of IFBS greater than those of FISTA. We therefore conclude that IFISTA, IFBS, and FISTA take the 1st, 2nd, and 3rd place, respectively, among the top three in fast processing with high performance for the compared image deblurring, as shown in Fig. 5, Fig. 6, Fig. 7, and Fig. 8.

Figure 5

Top three fast processing with high performance for deblurring of Cameraman

Figure 6

Top three fast processing with high performance for deblurring of Woman

Figure 7

Top three fast processing with high performance for deblurring of Pirate

Figure 8

Top three fast processing with high performance for deblurring of Living room

Table 4 The effectiveness comparison of image deblurring

From the results of each algorithm at the first 1000 iterations in Table 3, we see that the SNR and ISNR values of the algorithms differ widely. We evaluate the algorithms in Table 4 against the criteria \(\text{CPU} \leq \overline{\text{CPU}}\), \(\text{SNR} \geq \overline{\text{SNR}}\), \(\text{ISNR} \geq \overline{\text{ISNR}}\), \(n \leq \overline{n}\), and \(\text{tol} \leq \overline{\text{tol}}\). All criteria are satisfied by IFBS, NAGA, and IFISTA, with the SNR and ISNR of NAGA greater than those of both IFBS and IFISTA, and those of IFBS greater than those of IFISTA. We therefore conclude that NAGA, IFBS, and IFISTA take the 1st, 2nd, and 3rd place, respectively, among the top three in fast convergence with good performance for the compared image deblurring, as shown in Fig. 9, Fig. 10, Fig. 11, and Fig. 12.

Figure 9

Top three fast convergence with good performance for deblurring of Cameraman

Figure 10

Top three fast convergence with good performance for deblurring of Woman

Figure 11

Top three fast convergence with good performance for deblurring of Pirate

Figure 12

Top three fast convergence with good performance for deblurring of Living room

6 Conclusion

A new iterative forward-backward splitting method with an error has been obtained in our main result. It can be applied to obtain an improved fast iterative shrinkage thresholding algorithm (IFISTA) with an error for solving the image deblurring problem. Using the same programming techniques [36] and setting all parameters for high performance, we obtained the following results.

  1. For fast processing with high performance in the compared image deblurring experiments, IFISTA, IFBS, and FISTA rank 1st, 2nd, and 3rd, respectively, and all outperform FBMWA, FBMMMA, NAGA, and SCTIP.

  2. For fast convergence with good performance in the compared image deblurring experiments, NAGA, IFBS, and IFISTA rank 1st, 2nd, and 3rd, respectively, and all outperform FISTA, FBMWA, FBMMMA, and SCTIP.

Availability of data and materials

Not applicable.

References

  1. Bauschke, H.H.: The approximation of fixed points of compositions of nonexpansive mappings in Hilbert space. J. Math. Anal. Appl. 202, 150–159 (1996)
  2. Chidume, C.E., Bashir, A.: Convergence of path and iterative method for families of nonexpansive mappings. Appl. Anal. 67, 117–129 (2008)
  3. Halpern, B.: Fixed points of nonexpansive maps. Bull. Am. Math. Soc. 73, 957–961 (1967)
  4. Ishikawa, S.: Fixed points by a new iteration method. Proc. Am. Math. Soc. 44, 147–150 (1974)
  5. Klen, R., Manojlovic, V., Simic, S., Vuorinen, M.: Bernoulli inequality and hypergeometric functions. Proc. Am. Math. Soc. 142, 559–573 (2014)
  6. Kunze, H., La Torre, D., Mendivil, F., Vrscay, E.R.: Generalized fractal transforms and self-similar objects in cone metric spaces. Comput. Math. Appl. 64, 1761–1769 (2012)
  7. Mann, W.R.: Mean value methods in iteration. Proc. Am. Math. Soc. 4, 506–510 (1953)
  8. Radenovic, S., Rhoades, B.E.: Fixed point theorem for two non-self mappings in cone metric spaces. Comput. Math. Appl. 57, 1701–1707 (2009)
  9. Todorcevic, V.: Harmonic Quasiconformal Mappings and Hyperbolic Type Metrics. Springer, Basel (2019)
  10. Byrne, C.: Iterative oblique projection onto convex subsets and the split feasibility problem. Inverse Probl. 18, 441–453 (2002)
  11. Byrne, C.: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20, 103–120 (2004)
  12. Combettes, P.L., Wajs, V.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)
  13. Censor, Y., Bortfeld, T., Martin, B., Trofimov, A.: A unified approach for inversion problems in intensity-modulated radiation therapy. Phys. Med. Biol. 51, 2353–2365 (2006)
  14. Censor, Y., Elfving, T., Kopf, N., Bortfeld, T.: The multiple-sets split feasibility problem and its applications. Inverse Probl. 21, 2071–2084 (2005)
  15. Censor, Y., Motova, A., Segal, A.: Perturbed projections and subgradient projections for the multiple-sets split feasibility problem. J. Math. Anal. Appl. 327, 1244–1256 (2007)
  16. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
  17. Passty, G.B.: Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J. Math. Anal. Appl. 72, 383–390 (1979)
  18. Moudafi, A., Oliny, M.: Convergence of a splitting inertial proximal method for monotone operators. J. Comput. Appl. Math. 155, 447–454 (2003)
  19. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
  20. Hanjing, A., Suantai, S.: A fast image restoration algorithm based on a fixed point and optimization method. Mathematics 8, 378 (2020)
  21. Padcharoen, A., Kumam, P.: Fixed point optimization method for image restoration. Thai J. Math. 18(3), 1581–1596 (2020)
  22. Verma, M., Shukla, K.K.: A new accelerated proximal gradient technique for regularized multitask learning framework. Pattern Recognit. Lett. 95, 98–103 (2017)
  23. Cholamjiak, P., Kesornprom, S., Pholasa, N.: Weak and strong convergence theorems for the inclusion problem and the fixed-point problem of nonexpansive mappings. Mathematics 7, 167 (2019)
  24. Abubakar, J., Kumam, P., Ibrahim, A.H., Padcharoen, A.: Relaxed inertial Tseng’s type method for solving the inclusion problem with application to image restoration. Mathematics 8, 818 (2020)
  25. Luo, Y.: An inertial splitting algorithm for solving inclusion problems and its applications to compressed sensing. J. Appl. Numer. Optim. 2(3), 279–295 (2020)
  26. Takahashi, W.: Introduction to Nonlinear and Convex Analysis. Yokohama Publishers, Yokohama (2009)
  27. Tang, J.F., Chang, S.S., Yuan, F.: A strong convergence theorem for equilibrium problems and split feasibility problems in Hilbert spaces. Fixed Point Theory Appl. 2014, 36 (2014)
  28. Nadezhkina, N., Takahashi, W.: Weak convergence theorem by an extragradient method for nonexpansive mappings and monotone mappings. J. Optim. Theory Appl. 128, 191–201 (2006)
  29. Goebel, K., Kirk, W.A.: Topics in Metric Fixed Point Theory. Cambridge Studies in Advanced Mathematics, vol. 28. Cambridge University Press, Cambridge (1990)
  30. Takahashi, W., Xu, H.-K.: Iterative algorithms for nonlinear operators. J. Lond. Math. Soc. 66, 240–256 (2002)
  31. He, S., Yang, C.: Solving the variational inequality problem defined on intersection of finite level sets. Abstr. Appl. Anal. 2013, Article ID 942315 (2013)
  32. Baillon, J.B., Haddad, G.: Quelques propriétés des opérateurs angle-bornés et cycliquement monotones. Isr. J. Math. 26, 137–150 (1977)
  33. Rockafellar, R.T.: On the maximal monotonicity of subdifferential mappings. Pac. J. Math. 33, 209–216 (1970)
  34. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc., Ser. B, Stat. Methodol. 58, 267–288 (1996)
  35. Tianchai, P.: The zeros of monotone operators for the variational inclusion problem in Hilbert spaces. J. Inequal. Appl. 2021, 126 (2021)
  36. Guide to the MATLAB code for wavelet-based deblurring with FISTA. Available online: https://docplayer.net/128436542-Guide-to-the-matlab-code-for-wavelet-based-deblurring-with-fista.html (accessed on 1 June 2021)
  37. Image Databases. Available online: http://www.imageprocessingplace.com/downloads_V3/root_downloads/image_databases/standard_test_images.zip (accessed on 1 June 2021)
  38. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)


Acknowledgements

The author would like to thank the Faculty of Science, Maejo University for its financial support.

Funding

This research was supported by Faculty of Science, Maejo University.

Author information


Contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Pattanapong Tianchai.

Ethics declarations

Competing interests

The author declares that he has no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Tianchai, P. An improved fast iterative shrinkage thresholding algorithm with an error for image deblurring problem. Fixed Point Theory Algorithms Sci Eng 2021, 18 (2021). https://doi.org/10.1186/s13663-021-00703-6


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13663-021-00703-6

