18.337/6.338 Parallel Computing


Homework 1

Given out 9/10/12, due 9/19/12.

In this assignment you will be exposed to different models of parallel computation. The goal is simply to say “hello world” through different tools and environments so you know how to access them.

We will be using two platforms: a local cluster and Amazon Elastic Compute Cloud (EC2). We will also try out three models of parallelism: message passing via MPI (in C and Fortran), loop parallelism, and data parallelism (in Julia).

This assignment has four sections. For each section, copy a transcript of the interesting parts of your terminal session to a text file.

1. MPI on a cluster

Read about the Darwin cluster, and make sure you can connect to beagle.darwinproject.mit.edu and evolution.darwinproject.mit.edu. Follow the instructions for “Setting up passwordless SSH”.

Find the “MPI example” on the same page. Copy this text into an editor such as emacs or vi on beagle and save it with a file name ending in .c. Follow the instructions for compiling and running MPI programs. Try running the program several times. What do you notice?

Here is another MPI program, in both C and Fortran. It evaluates an integral that equals pi. Compile and run the Fortran version.

MPI pi integral in C

MPI pi integral in Fortran
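The linked programs are not reproduced on this page. If they follow the classic MPI pi example (an assumption, not something stated here), the integral and its midpoint-rule approximation with n subintervals of width h = 1/n would be:

```latex
\pi = \int_0^1 \frac{4}{1+x^2}\,dx
    \approx h \sum_{i=1}^{n} \frac{4}{1+x_i^2},
\qquad x_i = h\left(i - \tfrac{1}{2}\right), \quad h = \frac{1}{n}
```

The identity holds because the antiderivative of 4/(1+x^2) is 4 arctan(x), and 4 arctan(1) = pi.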

(Optional!) MPI can also be used from Julia. Check out some examples, and see our instructions.

2. Data parallelism

See our instructions for running Julia on the Darwin cluster. In this exercise, we will describe a problem that maps well to data parallel computation, and use Julia to solve it. In fact, we will tell you roughly how to solve it, and you will put the pieces together. All work can be done at the command prompt.

In the popular board game of Risk, it is common for an “attacker” to roll three dice and a “defender” to roll two. Let’s call the sorted rolls of the attacker and defender A3 ≥ A2 ≥ A1 and D2 ≥ D1, respectively.

If A3 > D2 and A2 > D1 then the attacker “wins” 2.
If A3 ≤ D2 and A2 > D1 then the attacker “wins” 1 and loses 1.
Also if A3 > D2 and A2 ≤ D1 then the attacker “wins” 1 and loses 1.
If A3 ≤ D2 and A2 ≤ D1, then the attacker “loses” 2.

Run as many simulations as you wish. Figure out the probability of each of three events: A wins 2, A wins 1 and loses 1, A loses 2.
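Simulated estimates can be sanity-checked against an exact count, since five dice have only 6^5 = 7776 equally likely outcomes. The following is a minimal sketch in plain Python rather than the Julia approach this section asks for; the names outcome and exact_counts are illustrative and do not come from the course materials.

```python
from itertools import product

def outcome(attack, defend):
    """Return how many armies the attacker wins (0, 1, or 2) in one battle.

    attack holds the three attacker dice, defend the two defender dice,
    in any order. Ties go to the defender, matching the rules above.
    """
    a = sorted(attack, reverse=True)   # a[0] is A3, a[1] is A2
    d = sorted(defend, reverse=True)   # d[0] is D2, d[1] is D1
    return int(a[0] > d[0]) + int(a[1] > d[1])

def exact_counts():
    """Tally outcomes over all 6**5 equally likely rolls of five dice."""
    counts = {0: 0, 1: 0, 2: 0}
    for roll in product(range(1, 7), repeat=5):
        counts[outcome(roll[:3], roll[3:])] += 1
    return counts

if __name__ == "__main__":
    counts = exact_counts()
    total = sum(counts.values())
    for wins in (2, 1, 0):
        print(f"attacker wins {wins}: {counts[wins]}/{total}")
```

Dividing each count by 7776 gives the probability that a long simulation should approach.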

The expression A=ceil(6*drand(nt,3)) will create a distributed array of dice rolls, where nt is the number of trials. The rolls can be ordered using A=sort(A,2) (sorting in the second dimension, along rows).

In Julia, comparisons are applied to whole arrays elementwise using the .> operator. As in Matlab, the results of comparisons can be summed to form counts. Combining these features, simulation trials can be done using sum(sub(A,1:nt,2:3).>B, 2), where B is the corresponding nt-by-2 array of sorted defender rolls. The sub function references a section of an array while leaving it in place (not copying). sum and .== will give you the final result counts.
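The same sequence of steps (roll, sort each row, compare the attacker's top two dice with the defender's two, sum the comparisons, then count each result) can be sketched serially in plain Python. This is only an illustration of the logic, not the distributed Julia version; simulate and its seed parameter are hypothetical names introduced here.

```python
import random

def simulate(nt, seed=None):
    """Estimate the probability that the attacker wins 0, 1, or 2 armies.

    Returns a list p where p[k] is the fraction of nt trials in which
    the attacker won exactly k armies.
    """
    rng = random.Random(seed)
    tally = [0, 0, 0]
    for _ in range(nt):
        # One row of each array, sorted ascending as sort(A,2) would do.
        a = sorted(rng.randint(1, 6) for _ in range(3))
        b = sorted(rng.randint(1, 6) for _ in range(2))
        # Columns 2:3 of the attacker row against both defender columns;
        # summing the boolean comparisons gives the number of wins.
        wins = sum(x > y for x, y in zip(a[1:], b))
        tally[wins] += 1
    return [count / nt for count in tally]
```

For example, simulate(100000) returns three fractions that sum to 1; the last entry estimates the probability that the attacker wins 2.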

The simulation can be scaled up by adding processors. Additional Julia processes can be requested from the Sun Grid Engine batch queue with the command addprocs_sge(n). If the cluster is busy, it can take a few minutes for jobs to start. You can use the qstat command to get a sense of the load and monitor your jobs.

Side note:
Counting the number of 2s in an array is an example of a map-reduce process. This can also be expressed as mapreduce(+, x->x==2, X), where the first function is the reducer and the second one is the mapper.
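For comparison, the same map-then-reduce pattern can be written with Python's standard library. This is a small sketch with an illustrative function name, not part of the Julia API described above.

```python
from functools import reduce
from operator import add

def count_twos(xs):
    """Count elements equal to 2 by mapping to booleans and reducing with +."""
    mapped = map(lambda x: x == 2, xs)   # mapper: element -> True/False
    return reduce(add, mapped, 0)        # reducer: add up the booleans
```

Here count_twos([2, 1, 2, 0, 2]) evaluates to 3, since each True counts as 1 in the sum.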

3. Loop parallelism

If you are a Pythonista, or would rather use or learn another parallel programming tool, you may do this section of the assignment in parallel Python or any other language.

Peruse the "Parallel Computing" chapter in the Julia docs, especially the section on parallel loops.

The beagle and evolution machines themselves (as well as the compute nodes) have multiple cores, and so are each parallel computers on their own. Starting julia with the command line option -p 4 will give you four processors on the local machine.

Port the MPI pi integral program from section 1 to Julia. You can ignore the user input, printing, and timing parts, and just do the computation. You should write a function that looks like this:

function parallelpi(n)
    h = 1.0/n
    YOUR CODE
end

The function can be written at the Julia prompt, and then called with parallelpi(n). Or if you wish, you can write the function in a file and load it with load("file.jl").
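For those taking the Python route, the decomposition might look like the sketch below. It assumes the MPI program integrates 4/(1+x^2) over [0,1] with the midpoint rule and deals loop iterations out to processes in an interleaved fashion. Neither detail is confirmed on this page, and partial_sum, parallel_pi, and the workers parameter are hypothetical names.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat

def partial_sum(start, n, stride):
    """Midpoint-rule sum over subintervals start, start+stride, ... below n."""
    h = 1.0 / n
    total = 0.0
    for i in range(start, n, stride):
        x = h * (i + 0.5)               # midpoint of subinterval i
        total += 4.0 / (1.0 + x * x)    # assumed integrand
    return h * total

def parallel_pi(n, workers=4):
    """Give each worker process an interleaved share of the loop, then add up."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(partial_sum, range(workers), repeat(n), repeat(workers))
        return sum(parts)
```

When run as a script, call parallel_pi(1000000) inside an `if __name__ == "__main__":` guard so that worker processes can import the module safely. The final sum plays the role of the reduction step in the MPI version.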

You might want to check whether the machine you’re on is busy using top. If it is busy (i.e. there are some processes using close to 100% CPU), you can try connecting to a compute node with SSH instead.

Julia is still somewhat experimental, and you may encounter problems. Just show us what you tried and what happened, and don’t worry if you can’t get it to work. If you're curious about Julia, the fastest way to get questions answered is to join the mailing list, where you will find around-the-clock Julia-related chatter of all kinds.

4. A brief visit to the cloud

Read our intro to EC2. Follow the instructions for starting an instance, connect to it, and find the value of peakflops() on your instance.