Homework 2: Laplace's Equation, Domain Decomposition

Homework due date has been extended to Thursday, April 11, 2004, 11:59PM

Last modified (Greenwich Mean Time): $Id: index.html,v 1.25 2004/04/09 07:19:51 kenta Exp $ ChangeLog

This homework will explore solving Laplace's equation in two dimensions. Written as a partial differential equation (PDE), Laplace's equation is

$\Delta T = \frac{\partial^2T}{\partial x^2}+ \frac{\partial ^2T}{\partial y^2} = 0$

In this homework we will refer to the dependent variable, T, as "temperature", so physically we are simulating the equilibrium temperature of a sheet of metal whose temperature along its edges are specified by the boundary conditions. (The astute student may realize this is the same equation solved in homework 1, although we didn't tell you it was Laplace's equation.)

We will choose a simple discretization of Laplace's equation on a rectangular grid. The solution to the discrete equation has the temperature at each grid point to be the average of its four neighbors in each direction. Intuitively, this means that temperature various smoothly within the interior of the sheet.

The boundary conditions are what makes the equation interesting. The boundary conditions are specified one grid point beyond the edge of the grid. For a rectangular grid, the boundary conditions are specified at the 14 points B marked below for an example 3x4 grid.

 BBBB
B....B
B....B
B....B
 BBBB

Each boundary edge can be one of two types:

The temperature along the edge can be explicitly defined.
Or, it can be "insulated", meaning that the grid point adjacent to this a boundary point ignores this boundary when calculating the average of its neighbors. Therefore, an the temperature of an edge grid point adjacent to an insulated boundary point is the average of three neighbors, not four. [see figure] And the temperature of a corner grid point adjacent to two insulated boundary points would be the the average of just two neighbors.

It should be pretty clear that the discretized Laplace's equation specifies a linear system. For example, for a 2x3 grid with temperatures 1,2 and 3,4 along the sides, and insulation along the top and temperature 5,6,7 along the bottom, the system of equations is:


  I I I
1 a b c 3       I=insulation
2 d e f 4
  5 6 7

(1 + b + d)/3=a
(a + e + c)/3=b
(b + f + 3)/3=c
(a + e + 2 + 5)/4 = d
(d + b + f + 6)/4 = e
(e + c + 4 + 7)/4 = f

Question -1

Please complete the parallel programming productivity questionnaire at: http://care.cs.umd.edu:3555/18_337/Assignment_1_Questionnaire.html The questionnaire is specifically for gathering feedback in regards to Homework 1 "Grid of Resistors". Although participation in the study is voluntary, please be candid in your responses as no course staff will see your answers, nor will it have any bearing on your grade.

Also, as you do this current homework, please submit an Effort log for approximately each day you work.

Question 0

0(a) Translate (by hand) the above linear system into the form Ax=b.
0(b) Find the solution using backslash in Matlab; that is x=A\b. (regular Matlab will suffice here, Star-p is not necessary.)

Question 1

1(a) Solve a discretized Laplace's equation on a rectangle using Matlab*P and dense linear algebra. Use Matlab's backslash. You should write a function
```
function T=solve_dense(left_is_insulated,left_temp,
  right_is_insulated,right_temp,top_is_insulated,top_temp,
  bottom_is_insulated,bottom_temp)
```
where T is either an n-by-m matrix or a (n*m)-by-1 vector (your choice). left_is_insulated and right_is_insulated are boolean (values are either zero or one) values specifying whether or not a boundary edge is insulated. left_temp and right_temp are 1-by-n real vectors specifying the temperature along the boundary. The temperature of foo_temperature(i) does not matter (is ignored) if foo_is_insulated is true. top and bottom are similarly specified.

HINT: you might find the reshape function helpful to get the (n*m)-by-1 vector solution from backslash into n-by-m form.
1(b) Solve Laplace's equation on a 50-by-50 grid with the top and bottom insulated, the left edge having one period of sin(x) and the right edge having one period of cos(x). Apply surf(T), and turn in a screen shot (or some other graphical representation) of the solution.
1(c) Experiment with different matrix sizes and time the results. Using two nodes (i.e., 8 processes on beowulf2), how large a problem can you solve in a reasonable amount of time?
1(d) After you've verified that your code is perfectly working, try running with (say) 75% of the available nodes at the time. You can see how many nodes are available with the command "pbsnodes -a". BUT DO NOT USE THAT MANY NODES FOR MORE THAN 5 MINUTES! Do one or two runs and comment on your results. Please e-mail 18.337/staff AT mit.edu if you see someone tying up a large number of nodes for an extended period of time.

Question 2

Repeat question 1 with matlab's sparse linear algebra. For part (c), try some different boundary conditions. NOTE: Do only SERIAL Matlab for this question. Sparse distributed matrices using Matlap*P is currently not working. HINT: you may find the Matlab functions sparse() and spdiags() useful.

Question 3

3(a) The serial Matlab code in laplacefft.m solves Laplace's equation on a rectangle using the Fast Fourier Transform (FFT). Examine how it works and translate the code into serial C (or C++) code calling the FFTW library. Use type double, not float. You can find documentation about FFTW at the FFTW home page. Note that we have installed FFTW version 2. The libraries to link against are installed in /usr/local/lib/*fftw*. You will probably want to include
```
#include <dfftw.h>
```
at the top of your source code, and compile with the linker flags:
```
$ gcc foo.c -ldfftw -lm
```
The only functions you will need are the complex-to-complex fftw_one(), (which corresponds to Matlab's fft() function), fftw_create_plan(), and fftw_destroy_plan(). All three functions are demonstrated in the Complex One-dimensional Transforms Tutorial in the FFTW documentation. You should not try to use FFTW's 2-dimensional or multi-dimensional functions as they are tricky to use. (If you know what you are doing, go right ahead.)

As it says in help fft, if Matlab's fft() function takes as input a matrix, it will perform a one-dimensional FFT on each column independently. You will have to "unroll" this loop when calling FFTW's fftw_one.
3(b) Measure the speed of your implementation for various grid sizes. Try grid sizes of aspect ratio 1:1 and 1:2. When measuring speed, place an outer for-loop to repeat the FFTW solving so that the program takes at least a minute or so to run. (You may wish to read the number of iterations from the *argv command line pointer.) When calling fftw_create_plan, use FFTW_MEASURE. Remember that you can re-use the same plan for FFTs of the same same size.
3(c) Solve a 500x500 grid with initial conditions left(t)=1+sin(t), right(t)=1+cos(2*t), top and bottom insulated. (Scale appropriately so that the length of the edge is 2 pi.)

Use this routine (which you may modify as you see fit) to produce a greyscale PGM file, and use the program ppmtogif (installed on the beowulf2 front end) to produce a GIF file.

/* type "man pgm" to understand the (very simple!) PGM format */
printf("P5\n");  /* This is the "raw pgm" format */
printf("%d %d\n", width, height);
printf("255\n");
for(int i=0;i<height;i++){
  for(int j=0;j<width;j++){
     /* scale appropriately */
     int c=(int)(T[i][j]/2*255);  /* T[i][j] is the temperature */

     printf("%c",c);
  }
}

If you would like to make color output (optional), try the "man ppm".

Question 4

4(a) Using the Matlab FFT solver, now implement a parallel Star-p solver on a very-long sheet using domain decomposition. (Hint, use MM-mode.) Each domain should be solved in parallel, and then the values in each solved domain are fed into the boundary conditions of the adjacent domains in the next iteration. In the example figure below, the values along the thick black line within sub-domain 1 in iteration 1 are fed into the left boundary condition of sub-domain 2 in iteration 2.

The following files show how to do domain decomposition on a serial machine in one-dimension:
We will leave it up to you to choose some interesting boundary conditions in 2D.

How do you know when to stop iterating? For this problem, go ahead an "cheat"--use the FFT or sparse solver from Question 2 or 3 on the whole domain to get the exact solution. Or use the exact solution given in the comments of laplacefft.m. Continue iterating until all points are within 1e-6 of the exact solution.
4(b) Examine how the amount of overlap between the boundaries affects the number of iterations required for solution, as well as the total amount of time needed for the solution. Be sure you are comparing problems of the same size as you adjust the size of the overlap.

Question 5

Repeat question 4 with MPI. You may wish to make use of your results of FFTW on various sizes of FFT from question 3(b). You do not need to use any of the parallel features of FFTW, although you are welcome to try if you want.

Question 6

This question is OPTIONAL. Repeat question 5 with OpenMP. You don't have to do this unless you want the learning experience, or you want to use OpenMP for Question 7

Question 7

Using your favorite domain-decomposition solver of Question 4, 5, or 6 (that is, you can choose just one of your favorite parallel programming method; you do NOT need to do this question for all three methods), define a creative-shaped region with interesting boundary conditions, and solve it. Turn in a picture of your solution (along with your code). At a bare minimum, your region should be at least as complicated as an L-shape.

Using creatively placed boundary conditions and domain decomposition, you can place infinitely thin insulating barriers (or explicitly set the temperature) within a complex region. (Note: it is probably impossible (though you can prove us wrong) to do the insulating barriers unless you implement boundaries which can be partially Dirichlet and partially Neumann. If you want to do this, you'll probably want to extend your sparse solver from Question 2.) However, it is possible to explicily set temperature within a complex region without having to resort to such hackery.

This question will be graded "on a curve" (that is, the amount of your effort relative to the rest of the class). Special recognition will be given in class for the most impressive solutions to this question.

Question 8

Click ME! (See Question -1.)