Circumcentering approximate reflections for solving the convex feasibility problem

The circumcentered-reflection method (CRM) has been applied for solving convex feasibility problems. CRM iterates by computing a circumcenter upon a composition of reflections with respect to convex sets. Since reflections are based on exact projections, their computation might be costly. In this regard, we introduce the circumcentered approximate-reflection method (CARM), whose reflections rely on outer-approximate projections. The appeal of CARM is that, in rather general situations, the approximate projections we employ are available under low computational cost. We derive convergence of CARM and linear convergence under an error bound condition. We also present successful theoretical and numerical comparisons of CARM to the original CRM, to the classical method of alternating projections (MAP), and to a correspondent outer-approximate version of MAP, referred to as MAAP. Along with our results and numerical experiments, we present a couple of illustrative examples.


Introduction
We consider the convex feasibility problem (CFP) consisting of finding a point in the intersection of a finite number of closed convex sets. We are going to employ Pierra's product space reformulation [1] in order to reduce CFP to seeking a point common to a closed convex set and an affine subspace.
Computing exact projections onto general convex sets can be, context depending, too demanding in comparison to solving the given CFP itself. Bearing this in mind, we present It is well known that Q-linear convergence implies R-linear convergence (with the same asymptotic constant), but the converse statement does not hold true [20].
We remind now the notion of Fejér monotonicity.

Definition 2.2
A sequence {z k } k∈N ⊂ R n is Fejér monotone with respect to a nonempty closed convex set M ⊂ R n when z k+1y ≤ z ky for all k ∈ N and y ∈ M. Proof See Theorem 2.16 in [3].
Next we introduce the separating operator needed for the approximate versions of MAP and CRM, namely MAAP and CARM.

Definition 2.4
Given a closed and convex set K ⊂ R n , a separating operator for K is a point-to-set mapping S : R n → P(R n ) satisfying: (A1) S(x) is closed and convex for all x ∈ R n . (A2) K ⊂ S(x) for all x ∈ R n . (A3) If a sequence {z k } k∈N ⊂ R n converges to z * ∈ R n and lim k→∞ dist(z k , S(z k )) = 0, then z * ∈ K .
We have the following immediate result regarding Definition 2.4.

Proposition 2.5 If S is a separating operator for K , then x ∈ S(x) if and only if x ∈ K .
Proof The "if " statement follows from A2. For the "only if " statement, take x ∈ S(x), consider the constant sequence z k = x for all k ∈ N, which converges to x, and apply A3.
Proposition 2.5 implies that if x / ∈ K then x / ∈ S(x), which, in view of A2, indicates that the set S(x) separates indeed x from K . The separating sets S(x) will provide to the approximate projections that we are going to employ throughout the paper. Several notions of separating operators have been introduced in the literature; see, e.g., [21,Sect. 2.1.13] and the references therein. Our definition is a point-to-set version of the separating operators in [22,Definition 2.1]. It encompasses not only hyperplane-based separators as the ones in the seminal work by Fukushima [19], considered next in Example 2.6, but also more general situations. Indeed, in Example 2.7, S(x) is the Cartesian product of half-spaces, which is not a half-space.
For the family of convex sets in Examples 2.6 and 2.7, we get both explicit separating operators complying with Definition 2.4 and closed formulas for projections onto them.
We mention that any closed and convex set K can be written as the 0-sublevel set of a convex and even smooth function g, for instance, g(x) = dist(x, K) 2 , but in general this is not advantageous, because for this g it holds that ∇g(x) = 2(x -P K (x)), so that P K (x), the exact projection of x onto K , is needed for computing the separating half-space, and nothing has been won. The scheme is interesting when the function g has easily computable gradient or subgradients. For instance, in the quite frequent case in which K = {x ∈ R n : g i (x) ≤ 0 (1 ≤ i ≤ )}, where the g i s are convex and smooth, we can take g(x) = max 1≤i≤ g i (x), and the subgradients of g are easily obtained from the gradients of the g i s.
Example 2.7 Assume that K = K 1 ×· · ·×K m ⊂ R nm , where K i ⊂ R n is of the form K i = {x ∈ R n : g i (x) ≤ 0} and g i : R n → R is convex for 1 ≤ i ≤ m. Write x ∈ R nm as x = (x 1 , . . . , x m ) with x i ∈ R n (1 ≤ i ≤ m). We define the separating operator S : R nm → P(R nm ) as S(x) = S 1 (x 1 ) × · · · × S m (x m ), with where u i ∈ ∂g i (x i ) is an arbitrary subgradient of g i at x i . Proof We start with S as in Example 2.6. First we observe that if x / ∈ K then all subgradient of g at x are nonzero: since K is assumed nonempty, there exist points where g is nonpositive, so that x, which satisfies g(x) > 0, cannot be a minimizer of g, and hence 0 / ∈ ∂g(x), i.e., u = 0 for all u ∈ ∂g(x). Regarding A1, S(x) is either equal to K or to a half-space, both of which are closed and convex. For A2, obviously it holds for x ∈ K . If x / ∈ K , we take z ∈ K , and conclude, taking into account the fact that z ∈ K and the subgradient inequality, that u (xz) + g(x) ≤ g(z) ≤ 0, implying that z ∈ S(x) in view of (2.1).
We deal now with A3. Take a sequence {z k } k∈N converging to some z * such that lim k→∞ dist(z k , S(z k )) = 0. We must prove that z * ∈ K . If some subsequence of {z k } k∈N is contained in K , then z * ∈ K , because K is closed. Otherwise, for large enough k, S(z k ) is a half-space. It is well known, and easy to check, that the projection P H onto a half-space H = {y ∈ R n : a y ≤ α} ⊂ R n , with a ∈ R n , α ∈ R, is given by (2.2) Denote by P S k the projection onto S(z k ). By (2.2), P S k (z) = zu k -2 max{0, g(z)}u k , so that dist z k , S z k = z k -P S k z k = u k -1 max 0, g z k .
Note that {z k } k∈N is bounded, because it is convergent. Since the subdifferential operator ∂g is locally bounded in the interior of the domain of g, which here we take as R n , there exists μ > 0 so that u k ≤ μ for all k and all u k ∈ ∂g(z k ). Hence, dist(z k , S(z k )) ≥ μ -1 max{0, g(z k )} ≥ 0. Since by assumption lim k→∞ dist(z k , S(z k )) = 0, and g being convex, is continuous, we get that 0 = lim k→∞ μ -1 max{0, g(z k )} = μ -1 max{0, g(z * )}, implying that 0 = max{0, g(z * )},i.e., g(z * ) ≤ 0, so that z * ∈ K and A3 holds. Now we consider S as in Example 2.7. As before, if Concerning A1-A3, A1 holds because S(x) is the Cartesian product of closed and convex sets (either K i or a half-space in R n ). For A2, take (x 1 , . . . , x m ) ∈ K. If x i ∈ K i , then x i ∈ S i (z i ) = K i . Otherwise, we take z i ∈ K i , and invoking again the subgradient inequality, we get ( for all i, and the result follows taking into account the definitions of K and S. For A3, note that lim k→∞ dist(z k , S(z k )) = 0 if and only if lim k→∞ dist(z k,i , S i (z k,i )) = 0 for 1 ≤ i ≤ m, where z k = (z k,1 , . . . , z k,m ) with z k,i ∈ R n . Then, the result follows with the same argument as in Example 2.6, with z k,i , S i , g i substituting for z k , S, g.

Convergence results for MAAP and CARM
We recall now the definitions of MAP and CRM and introduce the formal definitions of MAAP and CARM. Consider a closed convex set K ⊂ R n and an affine manifold U ⊂ R n . We remind that an affine manifold is a set of the form {x ∈ R n : Qx = b} for some Q ∈ R n×n and some b ∈ R n . Let P K , P U be the projections onto K , U respectively and define R K , R U , T, C : R n → R n as where Id is the identity operator in R n and circ(x, y, z) is the circumcenter of x, y, z, i.e., the point in the affine hull of x, y, z equidistant to them. We remark that circ(x, y, y) = (1/2)(x + y) (in this case the affine hull is the line through x, y) and circ(x, x, x) = x (the affine hull being the singleton {x}). Then, starting from any x 0 ∈ R n , MAP generates a sequence {x k } k∈N ⊂ R n according to and, starting with x 0 ∈ U, CRM generates a sequence {x k } k∈N ⊂ R n given by For MAAP and CARM, we assume that S : R n → P(R n ) is a separating operator for K satisfying A1-A3, we take P U as before and define P S as the operator given by P S (x) := P S(x) (x), where P S(x) is the projection onto S(x).
Take R U as in (3.1), and define R S , T S , C S : R n → R n as Then, starting from any x 0 ∈ R n , MAAP generates a sequence {x k } k∈N ⊂ R n according to and, starting with x 0 ∈ U, CARM generates a sequence {x k } k∈N ⊂ R n given by We observe now that the "trivial" separating operator S(x) = K for all x ∈ R n satisfies A1-A3, and that in this case we have T S = T, C S = C, so that MAP, CRM are particular instances of MAAP, CARM respectively. Hence, the convergence analysis of the approximate algorithms encompasses the exact ones. Global convergence of MAP is well known (see, e.g., [23]) and the corresponding result for CRM has been established in [15]. The following propositions follow quite closely the corresponding results for the exact algorithms, with T S as in (3.2).
Proof The projection operator P M onto any closed and convex set M is known to be firmly nonexpansive [24,Proposition 4.16], that is, for all x ∈ R n and all y ∈ M. Applying consecutively (3.4) with M = U and M = S(x) and noting that for z ∈ K ∩ U we get z ∈ U and also z ∈ K ⊂ S(x) (due to Assumption A2), we obtain (3.3).
A similar result for operator C S is more delicate due to the presence of the reflections and the circumcenter and requires some intermediate results. We follow closely the analysis for operator C presented in [15].
The crux of the convergence analysis of CRM, performed in [15], is the remarkable observation that for x ∈ U \ K , C(x) is indeed the projection of x onto a half-space H(x) separating x from K ∩ U. Next, we extend this result to C S . Proposition 3.2 Let U, H ⊂ R n be an affine manifold and a subspace, respectively, such that H ∩ U = ∅. Denote as P H , R H : R n → R n the projection and the reflection with respect to H, respectively. Then Proof See Lemmas 2 and 3 in [15].
Proposition 3.2 means that when the sets in CFP are an affine manifold and a hyperplane, CRM indeed converges in one step, which is a first indication of its superiority over MAP, which certainly does not enjoy this one-step convergence property, but also points to the main weakness of CRM, namely that for its convergence we may replace H by a general closed and convex set, but the other set must be kept as an affine manifold.

Lemma 3.3 Define H(x) ⊂ R n as
{z ∈ R n : (z -P S (x)) (x -P S (x)) ≤ 0}, otherwise.  Proof Take x ∈ U. If x ∈ K , then x ∈ S(x) by A2, and it follows that R U (x) = R S (x) = x, so that C S (x) = circ(x, x, x) = x. Also, P H(x) (x) = P K (x) = x by (3.5), and the result holds. Assume that x ∈ U \ K , so that H(x) is the half-space in (3.5).
In view of (3.5), we get, using (2.2) with a = x -P S (x), α = (x -P S (x)) P S (x), that P H(x) (x) = P S (x). It follows from the definition of the reflection operator R S that Now, by (3.2) and (3.6), Since U is an affine manifold and H(x) is a half-space, we can apply Proposition 3.2 and conclude that C S (x) = P H(x)∩U (x), proving the last statement of the lemma. By assumption, This rewriting of the operator C S as a projection onto a half-space (which varies with the argument of C S ) allows us to obtain the result for CARM analogous to Proposition 3.1.

Proposition 3.4 For all z ∈ K ∩ U and all x ∈ U, it holds that
Proof For (i), take z ∈ K ∩ U and x ∈ U. By Lemma 3. Propositions 3.1 and 3.4 allow us to prove convergence of the MAAP and CARM sequences, respectively, using the well-known Fejér monotonicity argument. Theorem 3.5 Consider a closed and convex set K ⊂ R n and an affine manifold U ⊂ R n such that K ∩ U = ∅. Consider also a separating operator S for K satisfying Assumptions A1-A3. Then the sequences generated by either MAAP or CARM, starting from any initial point in the MAAP case and from a point in U in the CARM case, are well defined, contained in U, Fejér monotone with respect to K ∩ U, convergent, and their limits belong to K ∩ U, i.e., they solve CFP.
Proof Let first {x k } k∈N be the sequence generated by MAAP, i.e., x k+1 = T S (x k ). Take any z ∈ K ∩ U. Then, by Proposition 3.1, and so {x k } k∈N is Fejér monotone with respect to K ∩ U. By the definition of T S in (3.2), iterates converges to 0. Hence, rewriting (3.7) as we conclude that Letx be a cluster point of {x k } k∈N and {x j k } j k ∈N be a subsequence of {x k } k∈N convergent tox. By (3.9), lim k→∞ dist(x j k , S(x j k )) = 0. By Assumption A3 on the separating operator S, x ∈ K . It follows also from (3.9) that lim k→∞ P S (x j k ) =x. By (3.8) and the continuity of P U , P U (x) =x, so thatx ∈ U and thereforex ∈ K ∩ U. By Proposition 2.3(ii),x = lim k→∞ x k , completing the proof for the case of MAAP. Let now {x k } k∈N be the sequence generated by CARM with x 0 ∈ U. By Lemma 3.3, whenever x k ∈ U, x k+1 is the projection onto a closed and convex set, namely H(x k ), and hence it is well defined. Since x 0 ∈ U by assumption, the whole sequence is well defined, and using recursively Proposition 3.4(ii), we have that {x k } k∈N ⊂ U. Now, we use Proposition 3.2, obtaining, for any z ∈ K ∩ U, so that again {x k } k∈N is Fejér monotone with respect K ∩ U, and henceforth bounded. Also, with the same argument as before, we get In view of (3.10) and the definition of circumcenter, Letx be any cluster point of {x k } k∈N . Looking at (3.11) along a subsequence of {x k } k∈N converging tox and invoking Assumption A3 of the separating operator S, we conclude thatx ∈ K . Since {x k } k∈N ⊂ U, we get that all cluster points of {x k } k∈N belong to K ∩ U, and then, using Proposition 2.3(ii), we get that lim k→∞ x k =x ∈ K ∩ U, establishing the convergence result for CARM.

Linear convergence rate of MAAP and CARM under a local error bound assumption
In [9], the following global error bound assumption on the sets K , U, denoted as (EB), was considered: Let us comment on the connection between (EB) and other notions of error bounds which have been introduced in the past, all of them related to regularity assumptions imposed on the solutions of certain problems. If the problem at hand consists of solving H(x) = 0 with a smooth H : R n → R m , a classical regularity condition demands that m = n and the Jacobian matrix of H be nonsingular at a solution x * , in which case Newton's method, for instance, is known to enjoy superlinear or quadratic convergence. This condition implies local uniqueness of the solution x * . For problems with nonisolated solutions, a less demanding assumption is the notion of calmness (see [25], Chap. 8, Sect. F), which requires that for all x ∈ R n \ S * and some ω > 0, where S * is the solution set, i.e., the set of zeros of H. Calmness, also called upper-Lipschitz continuity (see [26]), is a classical example of error bound, and it holds in many situations (e.g., when H is affine by virtue of Hoffman's lemma [27]). It implies that the solution set is locally a Riemannian manifold (see [28]), and it has been used for establishing superlinear convergence of Levenberg-Marquardt methods in [29]. When dealing with convex feasibility problems, as in this paper, it seems reasonable to replace the numerator of (4.1) by the distance from x to some of the convex sets, giving rise to (EB). Similar error bounds for feasibility problems can be found, for instance, in [30][31][32][33].
Under (EB), it was proved in [9] that MAP converges linearly with asymptotic constant bounded above by √ 1 -ω 2 , and that CRM also converges linearly with a better upper bound for the asymptotic constant, namely (1 -ω 2 )/(1 +ω 2 ). In this section, we prove similar results for MAAP and CARM, assuming that a local error bound related not just to K , U, but also to the separating operator S. The local error bound, denoted as (LEB), is defined as follows: (LEB) There exist a set V ⊂ R n and a scalar ω > 0 such that We reckon that (LEB) becomes meaningful, and relevant for establishing convergence rate results, only when the set V contains the tail of the sequence generated by the algorithm; otherwise it might be void (e.g., it holds trivially, with any ω, when U ∩ V = ∅). In order to facilitate the presentation, we opted for introducing additional conditions on V in our convergence results rather than in the definition of (LEB).
The use of a local error bound instead of a global one makes sense, because the definition of linear convergence rate deals only with iterates x k of the generated sequence with large enough k, and the convergence of the sequences of interest has already been established in Theorem 3.5, so that only points close enough to the limit x * of the sequence matter. In fact, the convergence rate analysis for MAP and CRM in [9] holds, without any substantial change, under a local rather than global error bound.
The set V could be expected to be a neighborhood of the limit x * of the sequence, but we do not specify it for the time being, because for the prototypical example of separating operator, i.e., the one in Example 2.6 of Sect. 3, it will have, as we will show later, a slightly more complicated structure: a ball centered at x * minus a certain "slice". We start with the convergence rate analysis for MAAP.
Proposition 4.1 Assume that K , U and the separating operator S satisfy (LEB). Consider with ω as in Assumption (LEB).
Proof By Proposition 3.1, for all z ∈ K ∩ U and all x ∈ R n , Take now x ∈ U ∩ V and invoke (LEB) to get from (4.4) which immediately implies the result.
Proposition 4.1 implies that if {x k } k∈N is the sequence generated by MAAP and x k ∈ V for large enough k, then the sequence {dist(x k , K ∩ U)} k∈N converges Q-linearly, with asymptotic constant bounded above by √ 1ω 2 . In order to move from the distance sequence to the sequence {x k } k∈N itself, we will invoke the following lemma from [9].  {x k } k∈N be the sequence generated by MAAP from any starting point x 0 ∈ R n . If there exists k 0 such that x k ∈ V for all k ≥ k 0 , then {x k } k∈N converges R-linearly to some point x * ∈ K ∩ U, and the asymptotic constant is bounded above by √ 1ω 2 , with ω and V as in (LEB).
Proof The fact that {x k } k∈N converges to some x * ∈ K ∩ U has been established in Theorem 3.5. Take any k ≥ k 0 . By assumption, x k ∈ V , and by definition of T S , x k ∈ U. So, we can take x = x k in Proposition 4.1, in which case T S (x) = x k+1 , and rewrite (4.2) as (1 - can be found in [15], where it is proved with a geometrical argument. Here we present an analytical one.

Proposition 4.4
Consider the operators C S , T S : R n → R n defined in (3.2). Then T S (x) belongs to the segment between x and C S (x) for all x ∈ U.
Proof Let E denote the affine manifold spanned by x, R S (x) and R U (R S (x)). By definition, the circumcenter of these three points, namely C S (x), belongs to E. We claim that T S (x) also belongs to E, and next we proceed to proving the claim. Since U is an affine manifold, P U is an affine operator, so that P U (αx + (1α)x ) = αP U (x) + (1α)P U (x ) for all α ∈ R and all x, x ∈ R n . By (3.1), R U (R S (x)) = 2P U (R S (x)) -R S (x), so that On the other hand, using the affinity of P U , the definition of T S , and the assumption that x ∈ U, we have so that Combining (4.5) and (4.7), i.e., T S (x) is a convex combination of x, R U (R S (x)) and R S (x). Since these three points belong to E, the same holds for T S (x), and the claim holds. We observe now that x ∈ U by assumption, T S (x) ∈ U by definition, and C S (x) ∈ U by Proposition 3.4(ii). Now we consider three cases: if dim(E ∩ U) = 0 then x, T S (x) and C S (x) coincide, and the result holds trivially. If dim(E ∩ U) = 2 then E ⊂ U, so that R S (x) ∈ U so that R U (R S (x)) = R S (x), in which case C S (x) is the midpoint between x and R S (x), which is precisely P S (x). Hence, P S (x) ∈ U, so that T S (x) = P U (P S (x)) = P S (x) = C S (x), implying that T S (x) and C S (x) coincide, and the result holds trivially. The interesting case is the remaining one, i.e., dim(E ∩ U) = 1. In this case x, T S (x) and C S (x) lie in a line, so that we can write C S (x) = x + η(T S (x)x) with η ∈ R, and it suffices to prove that η ≥ 1. By the definition of η, In view of (3.4) with M = U, y = C S (x), and x = R S (x), using the definition of the circumcenter in the first equality, (4.9) in the inequality, and (4.6), as well as the definition of η, in the third equality. Combining (4.8) and (4.10), we get implying that |η| ≥ |2 -η|, which holds only when η ≥ 1, completing the proof.
We continue with another intermediate result.

Proposition 4.5
Assume that (LEB) holds for K , U, and S, and take x ∈ U. If x, C S (x) ∈ V , then with V , ω as in (LEB).
Proof Take z ∈ K ∩ U, x ∈ V ∩ U. We use Proposition 3.1, rewriting (3.3) as for all x ∈ R n and all z ∈ K ∩ U. Since x ∈ U, we get from Lemma 3.3 that C S (x) = P H(x) (x). We apply next the well-known characterization of projections [24,Theorem 3.16] to get In view of Proposition 4.4, P U (P S (x)) is a convex combination of x and C S (x), meaning that P U (P S (x)) -C S (x) is a nonnegative multiple of x -C S (x), so that (4.13) implies Expanding the inner product in (4.14), we obtain Combining (4.12) and (4.15), we have Now, since U is an affine manifold, (y -P U (y)) (w -P U (y)) = 0 for all y ∈ R n and all w ∈ U, so that Replacing (4.18) in (4.16), we obtain using the facts that P S (x) ∈ S(x) and z ∈ K ∩ U in the last inequality. Now, we take z = P K∩U (x), recall that x, C S (x) ∈ V by hypothesis, and invoke the (LEB) assumption, together with (4.19), in order to get The result follows rearranging (4.20).
Next we present our convergence rate result for CARM. by CARM from any starting point x 0 ∈ U. If there exists k 0 such that x k ∈ V for all k ≥ k 0 , then {x k } k∈N converges R-linearly to some point x * ∈ K ∩ U, and the asymptotic constant is bounded above by (1ω 2 )/(1 + ω 2 ), with ω and V as in (LEB).
Proof The fact that {x k } k∈N converges to some x * ∈ K ∩ U has been established in Theorem 3.5. Take any k ≥ k 0 . By assumption, x k ∈ V and by the definition of T S , x k ∈ U. Also, k + 1 ≥ k 0 , so that C S (x k ) = x k+1 ∈ V So, we can take x = x k in Proposition 4.5 and rewrite (4.11) as ( for all k ≥ 0, which immediately implies that {dist(x k , K ∩ U)} k∈N converges Q-linearly, and hence R-linearly, with asymptotic constant bounded by (1ω 2 )/(1 + ω 2 ). The corresponding result for the R-linear convergence of {x k } k∈N with the same bound for the asymptotic constant follows then from Lemma 4.2, since {x k } k∈N is Fejér monotone with respect to K ∩ U by Theorem 3.5.
From now on, given z ∈ R n , α > 0, B(z, α) will denote the closed ball centered at z with radius α.
The results of Theorems 4.3 and 4.6 become relevant only if we are able to find a separating operator S for K such that (LEB) holds. This is only possible if the "trivial" separating operator satisfies an error bound, i.e., if an error bound holds for the sets K , U themselves. Let {x k } k∈N be a sequence generated by CARM starting at some x 0 ∈ U. By Theorem 3.5, {x k } k∈N converges to some x * ∈ K ∩ U. Without loss of generality, we assume that x k / ∈ K for all k, because otherwise the sequence is finite and it makes no sense to deal with convergence rates. In such a case, x * ∈ ∂K , the boundary of K . We also assume from now on that a local error bound for K , U, say (LEB1), holds at some neighborhood of x * , i.e., (LEB1) There exist ρ,ω > 0 such that dist(x, K) ≥ω dist(x, K ∩ U) for all x ∈ U ∩ B(x * , ρ). Note that (LEB1) is simply a local version of (EB). Observe further that (LEB1) does not involve the separating operator S, and that it gives a specific form to the set V in (LEB), namely a ball around x * .
We will analyze the situation for what we call the "prototypical" separating operator, namely the operator S presented in Example 2.6, for the case in which K is the 0-sublevel set of a convex function g : R n → R.
We will prove that under some additional mild assumptions on g, for any ω <ω, there exists a set V such that U, K , S satisfy a local error bound assumption, say (LEB), with the pair ω, V .
Indeed, it will not be necessary to assume (LEB) in the convergence rate result; we will prove that under (LEB1), (LEB) will be satisfied for any ω <ω with an appropriate set V which does contain the tail of the sequence.
Our proof strategy will be as follows: in order to check that the error bounds for K , U and S(x), U are virtually the same for x close to the limit x * of the CARM sequence, we will prove that the quotient between dist(x, S(x)) and dist(x, K) approaches 1 when x approaches x * . Since both distances vanish at x = x * , we will take the quotient of their first order approximations, in a L'Hôspital's rule fashion, and the result will be established as long as the numerator and denominator of the new quotient are bounded away from 0, because otherwise this quotient remains indeterminate. This "bad" situation occurs when x approaches x * along a direction almost tangent to K ∩ U, or equivalently almost normal to ∇g(x * ). Fortunately, the CARM sequence, being Fejér monotone with respect to K ∩ U, does not approach x * in such a tangential way. We will take an adequate value smaller than the angle between ∇g(x * ) and x kx * . Then, we will exclude directions whose angle with ∇g(x * ) is smaller than such a value and find a ball around x * such that, given any ω <ω, (LEB) holds with parameter ω in the set V defined as the ball minus the "slice" containing the "bad" directions. Because of the Fejér monotonicity of the CARM sequence, its iterates will remain in V for large enough k, and the results of Theorem 4.6 will hold with such ω. We proceed to follow this strategy in detail. The additional assumptions on g are continuous differentiability and a Slater condition, meaning that there existsx ∈ R n such that g(x) < 0. When g is of class C 1 , the separating operator of Example 2.6 becomes Proposition 4.7 Let g : R n → R be convex, of class C 1 , and such that there existsx ∈ R n satisfying g(x) < 0. Take K = {x ∈ R n : g(x) ≤ 0}. Assume that K , U satisfy (LEB1). Take x * as in (LEB1), fix 0 < ν < ∇g(x * ) (we will establish that 0 = ∇g(x * ) in the proof of this proposition), and define L ν : Consider the separating operator S defined in (4.21). Then, for any ω <ω, withω as in (LEB1), there exists β > 0 such that K , U, S satisfy (LEB) with ω and V := B(x * , β) \ L ν .
Proof The fact that 0 < ν < ∇g(x * ) ensures that V = ∅. We will prove that, for x close to x * , the quotient dist(x, S(x))/ dist(x, K) approaches 1, and first we proceed to evaluate dist(x, S(x)). Note that when x ∈ K ⊂ S(x), the inequality in (LEB1) holds trivially because of A1. Thus, we assume that x / ∈ K , so that x / ∈ S(x) by Proposition 2.5, and hence g(x) > 0 and S(x) = {z ∈ R n : ∇g(x) (zx) + g(x) ≤ 0}, implying, in view of (2.2), that Now we look for a more manageable expression for dist(x, K) = x -P K (x) . Let y = P K (x). So, y is the unique solution of the problem min zx 2 s.t. g(z) ≤ 0, whose first order optimality conditions, sufficient by convexity of g, are xz = λ∇g(z) (4.23) with λ ≥ 0, so that dist(x, K) = xy = λ ∇g(y) . (4.24) Now we observe that the Slater condition implies that the right-hand sides of both (4.22) and (4.24) are well defined: since x / ∈ K , g(x) > 0; since y = P K (x) ∈ ∂K , g(y) = 0. By the Slater condition, g(x) > g(x) and g(y) > g(x), so that neither x nor y are minimizers of g, and hence both ∇g(y) and ∇g(x) are nonzero. By the same token, ∇g(x * ) = 0, because x * is not a minimizer of g: being the limit of a sequence lying outside K , x * belongs to the boundary of K , so that g(x * ) = 0 > g(x). From (4.22), (4.24), , (4.25) where the notation y(x), λ(x) emphasizes that both y = P K (x) and the multiplier λ depend on x.
We look at the right-hand side of (4.25) for x close to x * ∈ K , in which case y, by the continuity of P K , approaches P(x * ) = x * , so that ∇g(y(x)) approaches ∇g(x * ) = 0, and hence, in view of (4.22), λ(x) approaches 0. So, the product of the first two factors in the right-hand side of (4.25) approaches ∇g(x * ) 2 , but the quotient is indeterminate, because both the numerator and the denominator approach 0, requiring a more precise first order analysis.
Expanding g(x) around x * and taking into account that g(x * ) = 0, we get Now we look at λ(x). Let φ(t) = λ(x * + td). Note that, for x ∈ ∂K , we get y(x) = x, so that 0 = λ(x)∇g(x) and hence λ(x) = 0. Thus, φ(0) = 0 and (4.27) where φ + (0) denotes the right derivative of φ(t) at 0. Since we assume that x / ∈ K , we have y(x) ∈ ∂K and hence, using (4.23), , so that (4.28) becomes 0 = ψ(t) = g(x * + tdσ (t)) for all t > 0 and hence Taking limits in (4.29) with t → 0 + and noting that y(x * ) = x * because x * ∈ K , we get 0 = ∇g x * dσ + (0) , (4.30) where σ + (0) denotes the right derivative of σ (t) at 0. We compute σ + (0) directly from the definition, because we assume that g is of class C 1 but perhaps not of class C 2 . Recalling that φ(0) = 0, we have that using the facts that g is class C 1 and that y(x * ) = x * . Replacing (4.31) in (4.30), we get that 0 = ∇g(x * ) (dφ + (0)∇g(x * )), and therefore Using (4.27) and (4.32), we obtain (4.33) Replacing (4.33) and (4.26) in (4.25), we obtain Now we recall that we must check the inequality of (LEB) only for points in V , and that V ∩ , ∇g(x * ) d is bounded away from 0, independently of the direction d. In this situation, it is clear that the rightmost expression of (4.34) tends to 1 when t → 0 + , and so there exists some β > 0 such that, for t ∈ (0, β), such an expression is not smaller than ω/ω, with ω as in (LEB) andω as in (LEB1). Without loss of generality, we assume that β ≤ ρ, with ρ as in Assumption (LEB1). Since t = x-x * , we have proved that, for It follows from (4.35) that for all x ∈ V ∩ U. Dividing both sides of (4.36) by dist(x, K ∩ U), recalling that β ≤ ρ, and invoking Assumption (LEB1), we obtain for all x ∈ U ∩ V , thus proving that (LEB) holds for any ω <ω, with V = B(x * , β) \ L ν and withω as in (LEB1). We have proved that for the prototypical separating operator given by (4.21), the result of Proposition 4.5 holds. In order to obtain the convergence rate result of Theorem 4.6 for this operator, we must prove that in this case the tail of the sequence {x k } k∈N generated by CARM is contained in V = B(x * , β) \ L ν . Note that β depends on ν. Next we will show that if we take ν smaller than a certain constant which depends on x * , the initial iterate x 0 , the Slater pointx, and the parameterω of (LEB1), then the tail of the sequence {x k } k∈N will remain outside L ν . Clearly, this will suffice, because the sequence eventually remains in any ball around its limit, which is x * , so that its tail will surely be contained in B(x  *  , β). The fact that x k / ∈ L ν for large enough k is a consequence of the Fejér monotonicity of the sequence with respect to K ∩ U, proved in Theorem 3.5. In the next proposition we will prove that indeed x k / ∈ L ν for large enough k, and so the result of Theorem 4.6 holds for this separating operator.

Proposition 4.8
Let g : R n → R be convex, of class C 1 , and such that there existsx ∈ R n satisfying g(x) < 0. Take K = {x ∈ R n : g(x) ≤ 0}. Assume that K , U satisfy (LEB1). Consider the separating operator S defined in (4.21). Let {x k } k∈N be a sequence generated by (CARM) with starting point x 0 ∈ U and limit point x * ∈ K ∩ U. Take ν > 0 satisfying withω as in (LEB1), and define Then there exists k 0 such that, for all k ≥ k 0 , x k ∈ B(x * , β) \ L ν , with β as in Proposition 4.7.
Proof Assume that x k ∈ L ν , i.e., Using the gradient inequality, the fact that g(x * ) = 0, and (4.38), we obtain By Theorem 3.5, {x k } k∈N is Fejér monotone with respect to K ∩ U. Thus, we use Proposition 2.3(iii) and (LEB1) in (4.39), obtaining Denote y k = P K (x k ). Using again the gradient inequality, together with the facts that g(y k ) = 0 and that x ky k and ∇g(y k ) are collinear, which is a consequence of (4.23), and the nonnegativity of λ, we get from (4.40) Now we use the Slater assumption on g for finding a lower bound for ∇g(y k ) . Takex such that g(x) < 0, and apply once again the gradient inequality.
Multiplying (4.42) by -1, we get using the facts that y k = P K (x k ) and that x * ∈ K in the third inequality and the Féjer monotonicity of {x k } k∈N with respect to K ∩ U in the fourth one. Now, since lim k→∞ x k = x * , there exists k 1 such that x kx * ≤ ρ for k ≥ k 1 , with ρ as in (LEB1). So, in view of (4.43), (4.44) Combining (4.40), (4.41), (4.44), and (4.37), we obtain The inequality in (4.45) has been obtained by assuming that x k ∈ L ν . Now, since lim k→∞ x k = x * and g is of class C 1 , there exists k 0 ≥ k 1 such that ∇g(x * ) -∇g(x k ) ≤ ν for k ≥ k 0 , and hence (4.45) implies that, for k ≥ k 0 , Now we conclude the analysis of CARM with the prototypical separating operator, proving that under smoothness of g and a Slater condition, the CARM method achieves linear convergence with precisely the same bound for the asymptotic constant as CRM, thus showing that the approximation of P K by P S produces no deterioration in the convergence rate. We emphasize again that, for this operator S, P S has an elementary closed formula, namely the one given by Theorem 4.9 Let g : R n → R be convex, of class C 1 , and such that there existsx ∈ R n satisfying g(x) < 0. Take K = {x ∈ R n : g(x) ≤ 0}. Assume that K , U satisfy (LEB1). Consider the separating operator S defined in (4.21). Let {x k } k∈N be a sequence generated by CARM with the starting point x 0 ∈ U. Then {x k } k∈N converges to some x * ∈ K ∩ U with linear convergence rate and asymptotic constant bounded above by (1 -ω 2 )/(1 +ω 2 ), withω as in (LEB1).
Proof The fact that {x k } k∈N converges to some x * ∈ K ∩ 1 follows from Theorem 3.5. Let ω be the parameter in (LEB1). By Proposition 4.7, P, K , and S satisfy (LEB) with any parameter ω ≤ω and a suitable V . By Proposition 4.8, x k ∈ V for large enough k, so that the assumptions of Theorem 4.6 hold, and hence lim sup for any ω ≤ω. Taking infimum in the right-hand side of (4.46) with ω <ω, we conclude that the inequality holds also forω, i.e., completing the proof.
We mention that the results of Propositions 4.7 and 4.8 and Theorem 4.9 can be extended without any complications to the separating operator S in Example 2.7, so that they can be applied for accelerating SiPM for CFP with m convex sets, presented as 0-sublevel sets of smooth convex functions. We omit the details.
Let us continue with a comment on the additional assumptions on g used for proving Theorem 4.9, namely continuous differentiability and the Slater condition. We guess that the second one is indeed needed for the validity of the result; regarding smoothness of g, we conjecture that the CARM sequence still converges linearly under (LEB) when g is not smooth, but with an asymptotic constant possibly larger than the one for CRM. It seems clear that the proof of such a result requires techniques quite different from those used here.
Finally, we address the issue of the extension of the results in this paper to the framework of infinite dimensional Hilbert spaces. We have refrained from developing our analysis in such a framework because our main focus lies in the extension of the convergence rate results for the exact algorithms presented in [9] to the approximate methods introduced in this paper, so that in order to establish the appropriate comparisons between the exact and approximate methods one should start by rewriting the results of [9] in the context of Hilbert spaces, which would unduly extend the length of this paper. We just comment that it is possible to attain such an aim following the approach presented in [11,12].

Convergence rate results for CARM and MAAP applied to specific instances of CFP
The results of Sect. 4 indicate that when K , U satisfy an error bound assumption, both CARM and MAAP enjoy linear convergence rates (with a better asymptotic constant for the former). In this section we present two families of CFP instances for which the difference between CARM and MAAP is more dramatic: using the prototypical separating operator, in the first one (for which (LEB) does not hold), MAAP converges sublinearly and CARM converges linearly; in the second one, MAAP converges linearly, as in Sect. 4, but CARM converges superlinearly. Similar results on the behavior of MAP and CRM for these two families can be found in [9]. Throughout this section, K ⊂ R n+1 will be the epigraph of a convex function f : R n → R of class C 1 and U will be the hyperplane U := {x ∈ R n+1 : x n+1 = 0}. We mention that the specific form of U and the fact that K is an epigraph entail little loss of generality; but the smoothness assumption on f and the fact that U is a hyperplane (i.e. an affine manifold of codimension 1) are indeed more restrictive.
First we look at the case when the following assumptions hold: B2. ∇f (x) = 0 if and only if x = 0. Note that under B1-B2, 0 is the unique minimizer of f and that K ∩ U = {0}. It follows from Theorem 3.5 that the sequences generated by MAAP and CARM, from any initial iterate in R n and U respectively, converge to x * = 0. We prove next that under these assumptions MAAP converges sublinearly. Proposition 5.1 Assume that K ⊂ R n+1 is the epigraph of a convex function f : R n → R of class C 1 satisfying B1-B2, and U := {x ∈ R n+1 : x n+1 = 0}. Consider the separating operator given by (4.21) for the function g : R n+1 → R defined as g(x 1 , . . . , x n+1 ) = f (x 1 , . . . , x n )x n+1 . Then the sequence {x k } k∈N generated by MAAP starting at any x 0 ∈ R n+1 converges sublinearly to x * = 0.
Proof Convergence of {x k } k∈N to x * = 0 results from Theorem 3.5. We write vectors in R n+1 as (x, s) with x ∈ R n , s ∈ R. We start by computing the formula for T S (x, 0). By definition of g, ∇g(x, s) = (∇f (x), -1) . Let which implies, since P U (x, s) = (x, 0), completing the proof. We mention that the results of Corollary 5.5 coincide with those obtained in Corollary 4.11 of [9] for the sequences generated by MAP and CRM applied to the same families of instances of CFP, showing that the convergence rate results of the exact methods are preserved without any deterioration also in these cases.

Numerical comparisons
In this section, we perform numerical comparisons between CARM, MAAP, CRM, and MAP. These methods are employed for solving the particular CFP of finding a common point in the intersection of finitely many ellipsoids, that is, findinḡ with each ellipsoid E i being given by where g i : R n → R is given by g i (x) = x A i x + 2x b iα i , each A i is a symmetric positive definite matrix, b i is an n-vector, α i is a positive scalar. Problem (6.1) has importance on its own (see [34,35]), and both CRM and MAP are suitable for solving it. Nevertheless, the main motivation for tackling it with approximate projection methods is that the computation of exact projections onto ellipsoids is a formidable burden for any algorithm to bear. Since the gradient of each g i is easily available, we can consider the separable operators given in Examples 2.6 and 2.7 and use CARM and MAAP to solve problem (6.1) as well. What is more, the experiments illustrate that, in this case, CARM handily outperforms CRM in terms of CPU time, while still being competitive in terms of iteration count. The exact projection onto each ellipsoid is so demanding that even MAAP has a better CPU time result than CRM.
The four methods are employed upon Pierra's product space reformulation, that is, we seek a point x * ∈ K ∩ D, where K := E 1 × E 2 × · · · × E m and D is the diagonal space. For each sequence {x k } k∈N that we generate, we consider the tolerance ε := 10 -6 and use as stopping criteria the gap distances where P K (x k ) is utilized for CRM and MAP, and P S K (x k ) is used for CARM and MAAP. Note that if the correspondent criterion is not met in a given iteration, the projection computed is employed to yield the next step. We also set the maximum number of iterations as 50,000.
To execute our tests, we randomly generate 160 instances of (6.1) in the following manner. We range the dimension size n in {10, 50, 100, 200}, and for each n, we took the number m of underlying sets varying in {5, 10, 20, 50}. For each of these 16 pairs (m, n), we build 10 randomly generated instances of (6.1). Each matrix A i is of the form A i = γ Id +B i B i , with B i ∈ R n×n , γ ∈ R ++ . Matrix B i is a sparse matrix sampled from the standard normal distribution with sparsity density p = 2n -1 , and each vector b i is sampled from the uniform distribution between [0, 1]. We then choose each α i so that α i > (b i ) Ab i , which ensures that 0 belongs to every E i , and thus (6.1) is feasible. The initial point x 0 is of the form (η, η, . . . , η) ∈ R n , with η being negative and |η| sufficient large, guaranteeing that x 0 is far from all E i s. The computational experiments were performed on an Intel Xeon W-2133 3.60 GHz with 32 GB of RAM running Ubuntu 20.04 and using Julia v1.5 programming language [36]. The codes for our experiments are fully available in https://github.com/ lrsantos11/CRM-CFP.
We remark that, as CRM and MAP rely on the computation of exact projections, AL-GENCAN [37], an augmented Lagrangian algorithm implemented in Fortran (wrapped in Julia using NLPModels.jl [38]) was used in our code to compute projections onto the ellipsoids. Each projection was found by solving the correspondent quadratic minimization problem with quadratic constraints (the g i s).
The results are summarized in Fig. 1 using a performance profile [39]. Performance profiles allow one to compare different methods on problems set with respect to a performance measure. The vertical axis indicates the percentage of problems solved, while the horizontal axis indicates, in log-scale, the corresponding factor of the performance index used by the best solver. In this case, when looking at CPU time (in seconds), the performance profile shows that CARM always did better than the other three methods. The picture also shows that MAAP took less time than CRM and MAP. We conclude this examination by presenting, in Table 1, the following descriptive statistics of the benchmark of CARM, MAAP, CRM, and MAP: mean, maximum (max), minimum (min), and  standard deviation (std) for iteration count (it) and CPU time in seconds (CPU (s)). In particular, CARM was, in average, almost 3000 times faster than CRM.

Concluding remarks
In this paper, we have introduced a new circumcenter iteration for solving convex feasibility problems. The new method is called CARM, and it utilizes outer-approximate projections instead of the exact ones taken in the original CRM.
We have drawn our attention to questions on whether similar convergence results known for CRM could be generalized for CARM. We have derived many positive theoretical statements in this regard. For instance, the convergence of CARM was proven, and linear rates were achieved under error bound conditions. In addition to that, we presented numerical experiments in which subgradient approximate projections were employed. This choice of approximate projections is a particular case of the ones covered by CARM. The numerical results show CARM to be much faster in CPU time than the other methods we compared it with, namely, the pure CRM, the classical MAP, and an approximate version of MAP called MAAP.