Preconditioned CG
The conjugate gradient method is one of the most powerful methods for the solution of linear systems Ax⋆=b, where A is a symmetric positive definite (SPD) matrix. It is particularly convenient for large, sparse matrices because the only operation involving A required at each step is a matrix-vector product. Recall that the convergence rate of the method can be estimated through the energy norm of the error Ek=xk−x⋆: the following upper bound holds
$$
\frac{\|E_k\|_A}{\|E_0\|_A} \;\le\; 2\left(\frac{\sqrt{\kappa(A)}-1}{\sqrt{\kappa(A)}+1}\right)^{k}, \qquad (1)
$$
where κ(A)=λmax(A)/λmin(A) is the ℓ2 condition number of A.
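Since the plots requested below compare the observed convergence with this bound, it can be convenient to have its right-hand side available as a function; a minimal MATLAB sketch (the handle name cg_bound is our own choice):

% Evaluates the right-hand side of the bound (1) for a condition number kappa
% and iteration indices k (scalar or vector).
cg_bound = @(kappa, k) 2 * ((sqrt(kappa) - 1) ./ (sqrt(kappa) + 1)).^k;

% Example: bound after 20 iterations for the diagonal test matrix, kappa = 5.
cg_bound(5, 20)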
Given a matrix A, in this lab you are asked to:
- Fix a solution vector x⋆ (it can be random or any fixed vector you like) and define the vector b=Ax⋆
- Implement the Conjugate Gradient routine to solve the system Ax=b
- Build from A the Jacobi preconditioner P=diag(A)
- Solve the linear system Ax=b with the Preconditioned Conjugate Gradient method, using the Jacobi and the Incomplete Cholesky preconditioners (a driver sketch covering these steps follows the list)
- Generate convergence plots in semi-log scale of the conjugate gradient iteration for all the test problems. The plots should display the relative residual norm ‖rk‖/‖b‖ on a log10 vertical axis vs. the iteration number k, and should include the different choices of preconditioner, including the unpreconditioned case P=I
- Generate plots that compare the convergence behavior of the recursively updated PCG residual with the true residual rk=b−Axk and with the theoretical bound (1) (a plotting sketch for this comparison is given in the implementation notes below)
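As a rough guide to how these steps fit together, here is a minimal driver sketch. It assumes my_pcg has been extended with a second output resvec containing the history of relative residual norms ‖rk‖/‖b‖; that extra output, and names such as x_star, P_jac and res_jac, are our own choices, not prescribed above. The incomplete Cholesky factor is computed with MATLAB's ichol, which requires a sparse matrix.

% Driver sketch: build the problem, the preconditioners, and the convergence plot.
n = size(A, 1);
x_star = ones(n, 1);                  % fixed exact solution (could also be rand(n,1))
b = A * x_star;                       % right-hand side
x0 = zeros(n, 1);
tol = 1e-10;  maxiter = 200;

P_id  = speye(n);                     % unpreconditioned case, P = I
P_jac = spdiags(diag(A), 0, n, n);    % Jacobi preconditioner, diag(A)
L     = ichol(sparse(A));             % incomplete Cholesky factor, A must be sparse
P_ic  = L * L';                       % incomplete Cholesky preconditioner
                                      % (in practice one applies L and L' by two triangular solves)

% Assumes my_pcg also returns the relative residual history resvec.
[~, res_id]  = my_pcg(A, P_id,  b, x0, maxiter, tol);
[~, res_jac] = my_pcg(A, P_jac, b, x0, maxiter, tol);
[~, res_ic]  = my_pcg(A, P_ic,  b, x0, maxiter, tol);

figure;
semilogy(res_id); hold on; semilogy(res_jac); semilogy(res_ic);
xlabel('iteration k'); ylabel('||r_k|| / ||b||');
legend('P = I', 'Jacobi', 'Incomplete Cholesky');

Applying the Jacobi preconditioner amounts to dividing each residual entry by the corresponding diagonal entry of A, which is why it costs only O(n) per iteration.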
The matrices A you should work with are the following (a construction sketch for the first two is given after the list):
1. Diagonal 100×100 matrix with eigenvalues λ1=5, λ2=4, λ3=1.5, λ4=1.4, λ5=1.3 and λ6=⋯=λ100=1.
2. The matrix with entries, for a given offset k,
$$
A_{ij} = \begin{cases}
\sqrt{i}, & i = j,\\
1, & |i-j| = k,\\
0, & \text{otherwise},
\end{cases}
$$
3.
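A possible construction of the first two test matrices in MATLAB; the size n and the offset k of the second matrix are not specified above, so the values below are only placeholders.

% Test matrix 1: 100x100 diagonal matrix with the prescribed eigenvalues.
d1 = ones(100, 1);
d1(1:5) = [5; 4; 1.5; 1.4; 1.3];
A1 = spdiags(d1, 0, 100, 100);

% Test matrix 2: sqrt(i) on the diagonal, ones on the k-th sub- and super-diagonal.
% n and k are placeholders; use the values given in the assignment.
n = 100;  k = 10;
e = ones(n, 1);
A2 = spdiags([e, sqrt((1:n)'), e], [-k, 0, k], n, n);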
Pseudocode of the PCG algorithm
function x = my_pcg(A, P, b, x0, maxiter, tol)
% Input:  A, SPD matrix
%         P, preconditioner (SPD); pass P = speye(size(A,1)) for plain CG
%         b, rhs vector
%         x0, starting vector
%         maxiter, max number of iterations allowed
%         tol, convergence tolerance on the relative residual norm
% Output: x, approximation of the solution to Ax = b
% Note: the preconditioner is applied by solving P*z = r (never form inv(P)).
x = x0;
r = b - A*x;                     % initial residual
z = P \ r;                       % apply the preconditioner: z = P^{-1} r
p = z;                           % first search direction
rz = r' * z;                     % inner product (r, P^{-1} r)
normb = norm(b);
for k = 1 : maxiter
    Ap = A * p;                  % the only product with A per iteration
    alpha = rz / (p' * Ap);      % step length along p
    x = x + alpha * p;           % update the iterate
    r = r - alpha * Ap;          % update the (recursive) residual
    if norm(r) / normb <= tol    % convergence test on the relative residual
        break;
    end
    z = P \ r;                   % apply the preconditioner
    rz_new = r' * z;
    beta = rz_new / rz;          % = (r_k, P^{-1}r_k) / (r_{k-1}, P^{-1}r_{k-1})
    p = z + beta * p;            % new search direction
    rz = rz_new;
end
end
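To validate the routine, its output can be cross-checked against MATLAB's built-in pcg, which takes the preconditioner and the starting vector as extra arguments; the variables reused here (P_jac, x0, maxiter, tol) are the ones from the driver sketch above.

% Cross-check against MATLAB's built-in pcg (preconditioner passed as M1, M2 left empty).
x_my = my_pcg(A, P_jac, b, x0, maxiter, tol);
[x_ml, flag] = pcg(A, b, tol, maxiter, P_jac, [], x0);
norm(x_my - x_ml) / norm(x_ml)   % should be small, of the order of tol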
Implementation notes
Note that the method can (and in fact must) be implemented so that only four vectors are allocated in memory overall. Moreover, the most costly operation per step is the matrix-vector product A×p, which requires O(nnz) flops, where nnz is the number of non-zero entries of A.
You can plot the norm of the recursive residual ‖rk‖ together with the norm of the true residual ‖b−Axk‖ and compare them. You should observe that their behavior eventually differs because of machine precision: the recursive residual keeps decreasing, while the true residual stagnates at the level of the attainable accuracy. Also, you should observe that the upper bound (1) is quite loose; still, when κ(A)≫1 the method can be very slow.
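A sketch of such a comparison plot, assuming two vectors res_rec (recursively updated ‖rk‖/‖b‖) and res_true (‖b−Axk‖/‖b‖) have been stored at each iteration of the unpreconditioned run, and reusing the cg_bound handle defined earlier; keep in mind that (1) bounds the A-norm of the error, so the comparison with the residual is only qualitative.

% res_rec and res_true are assumed to have been collected inside the iteration.
kappa = cond(full(A));               % 2-norm condition number (fine for these small matrices)
kk = 1:numel(res_rec);
figure;
semilogy(kk, res_rec); hold on;
semilogy(kk, res_true);
semilogy(kk, cg_bound(kappa, kk));   % right-hand side of the bound (1)
xlabel('iteration k');
legend('recursive residual', 'true residual', 'bound (1)');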