Wednesday, February 08, 2006

Full Multigrid Cycle

It's been a while since I've posted anything, but since I've actually given this address to more than one person, I figure I should at least finish what I set out to do originally, that is, document my learning of image processing algorithms as they come my way.

But first: the full multigrid algorithm (FMG).

So now that we've seen how to make a v-cycle, how can we make it so that we need only run the algorithm once, rather than running it many times and waiting for convergence? The answer lies with the first guess. The answer to this problem greatly depends on the initial conditions on the grid; if those initial conditions are set 'properly', then the convergence should be instant.

A trivial example: in the original equation, Del U = F, if we set U to zero when F is zero, our first guess is the right answer.

But how do we get there?

Well, if we start off at the coarsest possible level on the grid, and solve the problem there, we'll have a very coarse solution. If we then interpolate that answer to the next finer level, we'll get a slightly less coarse solution. Of course, to solve that slightly less coarse grid level, we'll just do a regular v-cycle, using residuals and everything that we did before. So, by the third-coarsest level, we use the result from the second-coarsest level as the start, and then solve via a v-cycle.

The claim is that, if done properly (ie, if the upconversion from coarse to fine level is done with at least tricubic interpolation for a second-order derivative approximation), the coarse-to-fine guesses will converge to a reasonable solution within 1 iteration.

Some code:



function outGuess = FullMultiGridSolver(inSize, inNumSteps, inF)
%at each level, the solution is accomplished by a call to
%ComplexMultiGridSolver, and the up interpolation is accomplished
%by CubicInterpolator.


theRealSize = inSize+1;

if (inNumSteps == 0)
%just solve it if it's that small
%guess at rand, and then go from there
rand('seed', 40234513);
theGuess = rand(theRealSize);
for i = 1:10000
theGuess = GaussSeidelStep(theGuess, ...
zeros(max(size(theGuess))), inF);
end; % really, could just solve this directly; but this will be %fast enough

outGuess = theGuess

else

%ok, now just call the full multigrid solver, interpolate the
%result, solve it, and then return.
theFSubsample = Generate2xSubsample(inF);
theGuess = FullMultiGridSolver(inSize/2, inNumSteps-1, theFSubsample);
theUpsampleGuess = CubicInterpolator(theGuess);
theUpsampleGuess = SimpleMultiGridSolver(theUpsampleGuess, ...
inF, inNumSteps);
outGuess = theUpsampleGuess;

end


So, some to go with that, you'll need the Cubic Interpolation:



function outResult = CubicInterpolator(inGrid)
%does a cubic interpolation on interior points, and linear
%interpolation near the edges, in order to create a new guess for the
%'next level up' in the fullmultigrid solver


sizes = size(inGrid);
for i = 1:sizes(1)
for j = 1:sizes(2)
outResult((i-1)*2+1, (j-1)*2+1) = inGrid(i, j);

end;

end;

%interpolate in one direction
[newX, newY] = size(inGrid);
newX = (newX-1) *2+1;
newY = (newY-1) *2+1;
denominator = 1/16;
for i = 1:2:newX
j = 2;
outResult(i, j) = 0.5*(outResult(i, j-1)+ outResult(i, j+1));
j = newY-1;
outResult(i, j) = 0.5*(outResult(i, j-1)+ outResult(i, j+1));

end;
for i = 1:2:newX
for j = 4:2:newY-3
outResult(i, j) = denominator*(-1*outResult(i, j-3)+ 9*outResult(i, j+1) + ...
9*outResult(i, j-1)+ -1*outResult(i, j+3));

end;

end;

% now t'other
for (j = 1:newY)
i = 2;
outResult(i, j) = 0.5*(outResult(i-1, j)+ outResult(i+1, j));
i = newX - 1;
outResult(i, j) = 0.5*(outResult(i-1, j)+ outResult(i+1, j));
end
for i = 4:2:newX-3
for j = 1:newY
outResult(i, j) = denominator*(-1*outResult(i-3, j)+ 9*outResult(i+1, j) + ...
9*outResult(i-1, j)+ -1*outResult(i+3, j));

end;

end;


A slight modification needs to be made to the SimpleMultiGridSolver code in order to work with this code; I leave that as an 'exercise to the reader' (basically, you need to change the return code to be the guess itself as a result of the iteration, rather than the other stuff from before).

Thursday, December 08, 2005

A brief refresher, and the basic V-Cycle

The 2-cycle solution is interesting, in that it solves the equation

LU = F

where L is the Laplacian operator, u is the answer we want, and F is what we have, with the idea of:

LU - Lu = F - Lu

Where u is your current guess, and LU is the 'actual' answer. So, LU - Lu is a measurement of error, and F - Lu is a measurement of the closeness of Lu to F, also known as the residuals. So:

LV = LU - Lu

and

F - Lu = R

So

LV = R

We're using Gauss-Seidel iterations to update our grid, so the residuals are smooth, and we can subsample them. This transformation is from a grid sampled at h to a grid sampled at 2h, where h is the distance between sampling points.

Think of it this way: h is your dx or dy in your differential equation. dx represents the sampling spacing in the x direction, while dy is the sampling spacing in the y direction. Of course, if we don't have isotropic grids (that is, dx isn't the same as dy), then the Laplacian operator is a bit trickier. I'm ignoring that now, mainly because I don't know how to do that (yet). So, if we subsample over the same domain, we've decreased our sampling rate to 2h; we've essentially made dx twice as big. In calculus, we try to make dx equal to infinity, but for practical purposes, we have to use some finite sampling. One trick to solving these kinds of problems is choosing an appropriate h for your grid. I'll return to this topic only for a bit, and only with respect to this particular algorithm; h choice will vary with problem.

But! We can solve

LV = R

on a smaller grid, because the residuals are smooth. It follows, then, that the residuals of that equation themselves are also smooth! So, we could have many grids, and many levels of solver. Many grids = multigrid! (but not all the grids, not yet...)

That solver is called the 'v-cycle of the multigrid.' As the matrices are subsampled, they go down, and then as they are upsampled, they come up, and make a chart like a v:
On the way down, we use injection on the residuals, and on the way back up, we use linear interpolation to increase the size of the error. Of course, we can go lower than 8h, if we wanted.

The idea here is to get rid of the 10000 steps after we go down one level. We cut the problem size down by a factor of 4 with a single shrinkage, but if you have millions (or billions) of points, then a factor of 4 is cold comfort. With this continued shrinking, we can get the number of operations down significantly (around 25-100 per variable, depending on the problem, implementation of the solution, and so forth).

For our purposes, recursion is 'good enough' to solve this problem.

Question: Why not just solve on the smaller grid and be done with it? What's so special about a grid of sample size h?

Answer: Aliasing.

So what's the code look like for a full v-cycle of the Poisson equation?


function [outCorrection] = VCycle(inGuess, inF, inNumSteps)
%this function does one step down in the multigrid, and at the bottom
%step, does a massive number of convergence sweeps in order to
%make it work (ie, makes sure the lowest level converges)
%poisson equation
if inNumSteps > 0,
inSize = max(size(inGuess));

theGuess = inGuess;

theGuess = GaussSeidelStep(theGuess, zeros(inSize), inF);%see this
theGuess = GaussSeidelStep(theGuess, zeros(inSize), inF);

theResiduals = theGuess;
theResiduals = CalcResiduals(theGuess, inF);%see this

theResidualSubsample = Generate2xSubsample(theResiduals); %see this

theSmallSize = max(size(theResidualSubsample));
theSmallerError = zeros(theSmallSize);

theSmallerError = VCycle(theSmallerError, ...
theResidualSubsample, inNumSteps - 1);

theErrorUpsample = Generate2xUpsample(theSmallerError);%see this

theGuess = theGuess + theErrorUpsample;
outCorrection = GaussSeidelStep(theGuess, zeros(inSize), inF);

theCurrentResidualNorm = sqrt(sum(sum(...
CalcResiduals(theGuess, inF).^2)));

else
theGuess = inGuess;
for i = 1:10000
theGuess = GaussSeidelStep(theGuess, zeros(max(size(theGuess))), inF);
end;
outCorrection = theGuess;
end;

Now, that should give similar convergence speed as the 2-grid cycle, but be much faster. Note that you can call this routine with a given depth into your v-cycle, so mess around with whether depth changes convergence speed (hint: it shouldn't unless I messed up).

That particular routine takes a guess in, so you can set your random matrix up outside this function.

An Important Note: When solving PDEs on grids, remember that you're subsampling a space, NOT using pixels! That's a big difference. So, if h = 1 / 32 (that is, you have 32 spaces on your grid), you actually have 32+1 grid points! So, when using the above routine, call it with a matrix that's got one more element in both directions (as in, rand(65), not rand(64)). Before, I kind of hid this from you (it's in the code, though, with that 'theSize = inSize + 1'). Now, it's more explicit. I personally prefer to think in terms of powers of 2, but we're going to reuse this code in a bit, so I think the distinction needs to be noted.

What about the guess? We could keep using random numbers, but what if we tried for something a bit closer to the solution in the first place?

2-cycle multigrid code

This code attempts to solve the poisson equation through the use of subsampled residuals. Why? Because subsampled residuals are, well, smaller than residuals that are the same size as the current matrix. Since we know that the residuals are smooth, we can subsample them, and since they are subsampled, we can manipulate them, and then upsample them back to the original grid spacing, and get a tasty update.

The general algorithm:

0) make your guess (in our case, random numbers)
1) Gauss-Seidel iteration on the guess as on the equation LU = F
2) a second GS iteration on the guess
3) Subsample your residuals
4) Solve your residuals as the equation LV = R (ie, solve for V). We can use the same code as for step 1 (note that the equations are identical in terms of operator, but differ in the actual numbers that will be manipulated)
5) Add the resulting error to the guess from step 2
6) Another GS step

Each iteration of this algorithm should converge at a rate of 0.1! That is, the matrix should get 10 times closer to the solution after each iteration, which is pretty rapid. Also, if you put two GS steps instead of 1 at step 5, you get a convergence rate of 0.03.

How do I measure convergence? First, I calculate the L2Norm of the error (which is the current guess - the expected result. For our sample equation, LU = 0 with 0 on the borders, that's a field of zeroes, so the error is anything in our U matrix that isn't a zero), and then the L2Norm of the residuals.

The L2 Norm Calculation:
theL2Norm = sqrt(sum(sum((theActual - theGuess).^2)));

(later, we'll do a modification to this norm that pays attention to the number of elements in a matrix, so we can compare two matrices of different sizes to one another. But for now, this will work).

So, convergence is measured by theL2Norm of the error and residuals of the previous iteration divided by the L2Norm of the error and residuals of the current iteration. In a 'real problem', where we don't know the answer, we instead just use the residuals. But for this problem, at least, your error should improve by a factor of 10 with each iteration.

When the convergence factor is based on residuals, if the convergence factor approaches 1, that means you're going v_e_r_y s_l_o_w_l_y, and that's no fun. In fact, if you calculated the convergence factor for each step of our GS algorithm from earlier, you should see that the first few steps have some serious activity, and then subsequent steps go to 1. We want to capture the activity of those first few steps, not the slowness of later steps. Just because the convergence is slow doesn't mean that the algorithm is done; it just means that it's very slowly getting there.

Anyway, here's the code:


function [outResidualConvergence, outIterations] = ...
SimpleMultiGridSolver(inRealSize)
%this function does one step down in the multigrid, and at the bottom
%step, does a massive number of convergence sweeps in order to
%make it work (ie, makes sure the lowest level converges)
%poisson equation

inSize = inRealSize+1;
theActual =zeros(inSize);
rand('seed', 40234513);
theGuess = rand(inSize);
keepGoing = 1;
iteration = 0;
theCurrentResidualNorm = sqrt(sum(sum(...
CalcResiduals(theGuess, zeros(inSize)).^2)));

while keepGoing == 1
theL2Error = sqrt(sum(sum((theGuess - theActual).^2)));

theGuess = GaussSeidelStep(theGuess, zeros(inSize), zeros(inSize));
theGuess = GaussSeidelStep(theGuess, zeros(inSize), zeros(inSize));%see this

theResiduals = theGuess;
theResiduals = CalcResiduals(theGuess, zeros(inSize));%see this

theResidualSubsample = Generate2xSubsample(theResiduals); %see below

theSmallSize = max(size(theResidualSubsample));
theSmallerError = zeros(theSmallSize);
for i = 1:10000
theSmallerError = GaussSeidelStep(theSmallerError, ...
zeros(theSmallSize), theResidualSubsample);

end;

theErrorUpsample = Generate2xUpsample(theSmallerError);

theGuess = theGuess + theErrorUpsample;
theGuess = GaussSeidelStep(theGuess, zeros(inSize), zeros(inSize));

iteration = iteration + 1;

thePreviousResidualNorm = theCurrentResidualNorm;
theCurrentResidualNorm = sqrt(sum(sum(...
CalcResiduals(theGuess, zeros(inSize)).^2)));

theResidualCorrection = theCurrentResidualNorm/thePreviousResidualNorm
theResidualL2Norm(iteration) = theResidualCorrection;

end;
outResidualConvergence = theResidualL2Norm(1:iteration-1)/theResidualL2Norm(2:iteration);
outIterations = iteration;




I apologize for the formatting; I can't quite bend Blogger to my will yet. At least the indentation is somewhat preserved...

Anyway! Here are the two routines you need, one for subsampling and the other for upsampling:

function outResult = Generate2xSubsample(inMatrix)
%this function generates a 2x subsample matrix of the original, given
%matrix. Assumes a 2D matrix for now.

sizes = size(inMatrix);
outResult = inMatrix(1:2:sizes(1), 1:2:sizes(2));



function outResult = Generate2xUpsample(inMatrix)
%this will generate a 2x upsample that's purely a linear interpolation,
%increasing the size of the matrix by 2 in each direction. Assumes
%a 2D matrix for now.

sizes = size(inMatrix);
for i = 1:sizes(1)
for j = 1:sizes(2)
outResult((i-1)*2+1, (j-1)*2+1) = inMatrix(i, j);
end;
end;

%interpolate in one direction
[newX, newY] = size(inMatrix);
newX = (newX-1) *2+1;
newY = (newY-1) *2+1;
for i = 1:2:newX

for j = 2:2:newY-1

outResult(i, j) = 0.5*(outResult(i, j-1)+ outResult(i, j+1));

end;

end;
j = newY+1;
for i = 1:2:newX

outResult(i, j) = 0.5 * outResult(i, j-1);

end;

% now t'other
for i = 2:2:newX-1

for j = 1:newY+1

outResult(i, j) = 0.5*(outResult(i-1, j)+ outResult(i+1, j));

end;

end;
i = newX+1;
for j = 1:newY+1

outResult(i, j) = 0.5 * outResult(i-1, j);

end;



There's a few details about this code that's interesting, at least to me.
1) why 10000 GS sweeps on the lowest level? Because we have to solve the lower level still, and that lower level could still be 512x512 or something huge. As such, we have to run enough GS sweeps to check to make sure it's converged. Remember, slow residual convergence means slow convergence in general, but that doesn't mean you've solved the problem, it just means you're taking forever to get there.
2) Are there other methods of subsampling, other than injection? Sure, there's something called 'full weighting'-- that's not important for now.
3) So, how do I call this for a 2-cycle solution?


>> SimpleMultiGridSolver(32, 2)

Note that your size should be a power of 2; this code doesn't really handle non-2 powers for now.

Now that we've gotten the single step down to work, what's next? Well, we solved LV=R on the lower level, so we should solve that equation as well using this method. That equation also has residuals... so the same idea will work! That means, recursion!

Wednesday, November 30, 2005

Basic Multigrid

OK, so you've got the residuals, right?

Now what?

First, we should use the residuals as a way to determine how quickly the error gets reduced as a function of grid size. Smaller the grid (ie, larger the space between grid points, or h), the faster the GS solution should converge, giving you smaller residuals.

In the simple function LU = 0, with 0 on the boundaries, as we decrease h (ie, make a much more finely sampled grid over the domain, or increase the number of pixels we have on the domain, however the hell you want to say it), the ratio of the residuals from one iteration to the next approaches 1.

To test this out, do this:
1) rewrite the previous GS solver so that the L2Norm of the residuals are calculated at each step
2) make the ratio of the L2Norm from iteration n to n+1. Note that as n increases, the ratio asymptotically approaches some value.
3) for large h (ie, h = 1/16 or so), the ratio goes to about 0.8. At higher h, the ratio goes to 1.

Why is this bad? Well, this phenomenon is the reason why GS is so freakin' slow to converge. If you add a step 1.5 in the above algorithm to also calculate the L2 of the error (which is really just the L2 norm of the solution itself, since the actual answer of LU = 0 with 0 borders is zero everywhere on the domain), you can see that the error also asymptotically improves with each iteration. In fact, one can demonstrate that if the residuals are slowly improving, then that means that the error itself is also very slowly converging. So, we want to improve the speed with which the residuals improve from one iteration to the next; that way, we can get the error to improve with each iteration as well.

Remember:

V = U - u eq(1)

Where V is the error, U is the actual solution, and u is the calculated solution. So, if we make the assumption of linearity (which we can relax later):

LV = LU - Lu eq(2)

But also

LU = F eq(3)

LU - Lu = F - Lu eq(4)

so,
LV = F - Lu = R eq(5)

then

LV = R eq(6)

Thus, if you have a decreasing residual set, that also means that you have a decreasing error as well.

Fascinating, isn't it, that

LV = R eq(7)

is of the same form as

LU = F eq(8)

So why not just solve for the residuals, rather than solve for the equation directly? And while we're at it, let's be honest-- those residuals, they're smooth as butter (mmm so creamy), so we don't need to be using all of them. Instead, we can subsample them, using a grid that's half the size (instead of a grid with spacing h, we can use a grid with spacing 2h). Why would this matter? Because a smaller grid means a faster iteration time! We can cut the problem size down by a factor of 4 by doing some subsampling of the residuals.

Once we know the residuals, we can do this:

LV = LU - Lu eq(9)

LU = LV + Lu eq(10)

Which, again with the linearity assumption, is just a restatement of

U = V + u eq(1)

That is, we can add the results of solving for the residuals back to the original u solution, and get an actual answer.

Of course, this is an iterative process, so really it's

U (n+1) = V(n) + u(n) eq(11)

And then we recalculate the residuals for LU(n+1) and continue the cycle again.

So this is the crux of the basic multigrid solution:

1) do a few GS sweeps on the problem (2 is a good number)
2) Calculate the residuals
3) Subsample the residuals
4) 'Solve' the residuals on the smaller grid
5) Upsample the residuals back to the original grid size
6) update your solution by adding back the residuals

Step '4' is where it gets really interesting, but before we jump ahead, let's do some code.

Calculating residuals

Ah, I forgot to include the code for calculating residuals. It's actually fairly straightforward, given the form

R = F - Lu

It's easier than I thought it would be at first:

===================
function outResiduals = CalcResiduals(inError, inF)
%calculates the residuals for multigrid methods
%that is, for LU=F, this will calculate F - Lu~, or the residuals.
inSize = max(size(inError));
outResiduals = inError;
h = (1/(inSize-1));
for j = 2:inSize-1,
for i = 2:inSize-1,
u0 = inError(i, j);
u1 = inError(i+1, j);
u2 = inError(i-1, j);
u3 = inError(i, j+1);
u4 = inError(i, j-1);
outResiduals(i, j) = inF(i, j) - ((u1+u2+u3+u4) - (4*u0))/(h*h);
end;
end;

===================

Again, this assumes a square grid, an assumption that can be fairly easily relaxed.

As a side note: I've made the implicit assumption here that I'm going to be solving a linear equation (which it just so happens this form of LU = F is linear). If LU= F is not linear, then it's harder to calculate the residuals, because you can't just distribute the L operator through both the U and u terms in order to arrive at LU-Lu = F - Lu.

One form of nonlinear equation is called the Bratu equation, and one form it takes is:

LU + (lambda)exp(u) = F

Where F will be, for now, 0.

When calculating Gauss-Seidel iterations for this problem, the resulting equation is much different than for the linear problem; however, we can reduce this equation to a linear form.

However, we have other stuff to go through first; namely, how to use the residuals to our benefit.

Calculating the residuals gives a sort of approximation for how well the solution was found compared to the 'real' solution. As such, we should be able to use it to estimate the next step in the solver; rather than using a straight GS solution, there should be some trick to finding the 'real' u (ie, where the error in u is dominated more by truncation error than algebraic error, more on this later) more quickly using the residuals.

The answer to that is: Multigrid.

Residuals

At the end of the last post, I posed a bunch of questions. Some of them I can answer right away, others I can't.

1) How do you know you did it right? For this problem, the 'right' answer is everything going to zero. Did that happen for your results?
2) How do you know the result is the correct one? Well, for this problem, that's pretty easy; for other problems, not so much. Suffice it to say, without proving anything, that GS relaxation is provably going to (eventually) reach the solution; it may just take forever to get there. I'm not one for proofs here; I'm just trying to explain the algorithms. There are many books that contains proofs for this.
3) Why not start with a bunch of zeros? For this problem zero is the answer, so starting at zero would be a great way to get right to the answer, and not really see how the algorithm work. More generally, for more complicate problems (like sin over the domain, for instance), starting from zero gives the algorithm nowhere to go; starting with rand allows the algorithm to have something to start with, to get to the fluctuations present in the answer.
4) Why 10000 steps? Because it'll probably converge by that point on smaller (64x64) grids.

What we really want is some measure of how we're getting better from step to step. Initially, we choose a problem that we know the answer to (ie, zero = zero) so that we can gauge the performance from one step to the next by gauging how quickly the solution goes to zero. More generally, we want to know how well the iteration performed, but since we won't know the answer, we have to be more clever.

Consider:
LU = F

(where L is the Laplacian operator, U is the answer, and F is the 'given' result of the differential equation)

L(U - u)

I've subtracted Lu from both sides. u, here, is the 'current guess', while U is the actual answer; thus, the difference between the two should give us some suggestion as to the goodness of the update step.

So, if we distribute the L's....

LU - Lu

If we subtract Lu from both sides, we get

Lu - Lu = F - Lu

The term F - Lu can be further characterized as:

F- Lu = R

Where R is a term known as the Residual. By calculating the ratio of the residual norm from the previous iteration to the norm of the current iteration, we can get an idea of the speed of convergence; furthermore, this ratio will eventually asymptotically reach some ratio, and let us know how quickly we're going to get there. A low ratio means that the solution is moving quite quickly.

How do you calculate the norm?

Probably the easiest way to do it is to use the L2 norm:

theSize = max(size(theMatrix)); %again, assuming a square matrix
L2Norm = sqrt(1/theSize * sum(sum(theMatrix.^2)));

So for GS relaxation, the ratio of residual norms should asymptotically approach some value. It turns out, that value depends on h, the grid size of the problem. If we have a 16x16 grid, the solution will be approached at a different speed than a 32x32 grid. What does the ratio as a function of grid size look like?

Sunday, November 27, 2005

Gauss-Seidel Relaxation

In the last post, I babbled insanely into the night about solving the Poisson equation with some kind of update step. Without further ado, I give you the code for that update step, in Matlab:

===============

function outResult = GaussSeidelStep(inMatrix, inBoundaries, inRHS)
%calculates a single step of Gauss-Seidel updates on a given
%matrix, assuming its boundary conditions are directly
%on the edge of the given matrix

sizes = size(inMatrix);
if (sizes(1) ~= sizes(2))
error('Sizes are not equal!');
end;

theSize = sizes(1);


outResult = inMatrix;
h = (1/((theSize-1)));
h2 = h*h;
for j = 2:theSize-1,
for i = 2:theSize-1,
u1 = inMatrix(i+1, j);
u2 = inMatrix(i-1, j);
u3 = inMatrix(i, j+1);
u4 = inMatrix(i, j-1);
theAnswer = ((-1*inRHS(i, j)*h2) + (u1+u2+u3+u4)) /4;
inMatrix(i, j) = theAnswer;
end;
end;
outResult = inMatrix;
outResult(:, 1) = inBoundaries(:, 1);
outResult(:, theSize) = inBoundaries(:, theSize);
outResult(1, :) = inBoundaries(1, :);
outResult(theSize, :) = inBoundaries(theSize, :);

================

This code assumes that you're dealing with a square matrix, but that assumption can be relaxed fairly readily. It also assumes a 2D matrix, which can also be dropped pretty straightfowardly as well; I'm going to be making such assumptions anyway, for the time being.

So what's going on here?

First, note that I'm ignoring the boundaries. Matlab is a 1-based language, not 0-based (like C++), so you have to start your indeces from 1 and include the last index in the series. Here, I go from 2 to theLastIndex-1, and leave the borders out.

Second, and relatedly, I'm forcing the borders to be what they are from a given matrix; that is, I'm forcing the edges of the matrix to be particular values. This is called 'enforcing boundary conditions' (which, if you'll recall from the last post, are what you need in order to make an actual, specific solution to the problem, rather than some general solution that could have any number of actual answers).

Third, I'm updating the matrix into itself. Typically, this is a big mistake, because you don't want your halfway-updated matrix to affect the results of your equation. Here, though, this is OK; the second derivative is smooth, and the results will therefore be smooth by updating with your current answer. That last sentence was known as 'waving my hands', hopefully distracting you from not having any proof whatsoever. If you want proofs, check out books like http://www.amazon.com/gp/product/0801854148/102-3481034-6699322?v=glance&n=283155&n=507846&s=books&v=glance
(incidentally, there's at least one other form of this update step. At this stage in the game, the other method is called Jacobi iterations, which place the results into the outResult rather than back into the inMatrix. This method just makes convergence slower.)

Back to the original problem, solving the Poisson equation with F = 0. Using the above method, you can solve by constantly calling GaussSeidelSteps on the same matrix, vis:

===============

theGuess = rand(someSize); %makes a square matrix of random numbers as the first guess
theZeros = zeros(someSize); % makes a matrix of zeros the size of your guess
for i = 1:10000
theGuess = GaussSeidelStep(theGuess, theZeros, theZeros);
end;

===============

But how do you know that you did it right? How do you know that the result is the correct one, and not something you got spuriously? Why not start with a bunch of zeros as the guess, and not random numbers? And why 10000 steps, why not 10 or 10 million?

That's for the next post...

Saturday, November 26, 2005

Poisson Equation and Gauss- Seidel relaxation

The most interesting class I'm taking at the moment is being taught by Achi Brandt on Multi-grid methods. I'm the only non-math person who's stayed in the class thus far (I think; there might be a few other engineering students in the back who don't say much), which is kind of a shame, because there's some really fascinating stuff going on here.

This class has dealt with solving this equation, and only this equation:

Eq(1)


Where upside-down triangle is the 'nabla' or the Laplacian operator, means this:

second partial derivative of U with respect to (wrt) x + second partial derivative of u wrt y = F
Eq(2)

[as a side note, the equation editor that comes with www.OpenOffice.org is so broken it's not worth using, and I can't quite figure out how to do make it work with html. I really don't care, because I'm learning that writing in equations is just a shorthand to avoid having to explain yourself]

We care about u, NOT F. This is a fairly key insight; the idea is to solve for u, given F and the Laplacian operator; that is, we already have F, and we know that in order to get there, two derivatives were taken in different directions, so we want to go backwards and solve for those original equations. This understanding will then let us predict how we can modify u to get different Fs, or how to predict which F will arise from different u's.

Consider:

x = y Eq(3)

Using this above form, y is what what we're given. We can measure it, it's determinable. The left-hand side, x, is what we want to learn. Using different values of y, we can predict what x will be. In the above equation, the answer is fairly simple; any given x immediately corresponds to the given y directly, ie:

y = 0 -> x = 0
y = 1000-> x= 1000

And so forth.

But, if we have something more complicated:

cx + b = y Eq(4)

Again, we just get values for y, and now we need to find information for c and b in order to make predictions. That is, if we have a value for x, and we want to know what y will result from the equation, then we need to know c and b. This information comes from our boundary conditions.

This equation says that y will change linearly with x; that is, as x increases/goes forward/etc, y will change predictably. In order to make that prediction, we need to know b and c. When x = 0, we can determine what b is pretty easily:

c(0) + b = y Eq(6)

So, b = y when x = 0. This is called the 'boundary condition.'

Now, we want to know another boundary condition in order to determine c. This condition is usually given by the problem; people who have taken differential equations know that I've just played fast and loose with a whole lot of stuff; that's not what's important. What's important is that we have F, and we want to get U in order to be able to predict future Fs from changes in U.

If we imagine that F is a matrix, rather than a single number, we can use a whole bunch of interesting math in order to arrive at a guess for u.

Eq(2) can be discretized into the following form:

(u(i + 1, j) + u(i - 1, j) + u(i, j+1) + u (i, j-1) - 4*u(i,j))/(h*h) = F(i,j) Eq(7)

This says that F, at any given point, is equal the the four neighbors of those same coordinates on u minus four times the point at u. h here is the grid distance, the distance from u(i, j) to any one of its neighbors. This equation is derived fairly straightfowardly if you know calculus. For the sake of ease, I'm saying that h is the same in either the horizontal or vertical direction. In image processing terms, h is the size of a pixel (sort of), so that if you have a 512x512 pixel image, h is 1/513 (each pixel is the space between grid points, and we solve on grid points; this kind of distinction is bounced back and forth depending on the math involved, and is a massive headache). h is so small because the image is typically assumed to be between 0 and 1 in both directions; that assumption helps with a bunch of other kinds of math, not least being Fourier methods.

We only care about u, not F. So, we can rearrange terms to say:

u(i,j) = ( (u(i + 1, j) + u(i - 1, j) + u(i, j+1) + u (i, j-1) - F(i, j) * h * h )/(4) Eq(8)

One way to solve this equation is to be iterative about it. We already have F; we want u. So, to solve this equation, we can make iterative updates.

The algorithm looks something like this:
1) guess that u is some set of random numbers between 0 and 1.
2) set each pixel to update with the formula in Eq(8)
3) rinse and repeat until the solution stops converging and the error is small.

How do you do 2? and 3? More on that next time...

First post!

This blog, first and foremost, is an attempt to document the things I'm learning while attending UCLA as a master's student. This is more of a free-form diary, but I've learned a few things that I thought other people might be particularly interested in. Specifically, I'm going to be talking most about the different image processing algorithms I'm learning about, as well as methods for solving partial differential equations quickly. I've found that most of the literature on these topics is written assuming you have at least 4 years of math background; I don't, I have 6 years of being a computer programmer as background. As such, this is intended to be mainly a rosetta stone of sorts, allowing programmers to translate math into code.