Let S be a random string of the alphabet Q = { A, C, G, T } with length mn. S is distributed among m processors such that processor i has a substring S_i of length n. S equals concat(S_0, S_1, ..., S_(m-1)).
Let I be an input string of the alphabet Q.
Part II:
Merge Sort
Using MATLAB*P's mm mode, try to run a merge sort.
With np processors, generate a random vector of length 10000*np. Something like randn(10000*np*p,1) would work. Then, using mm mode, do a merge sort. The issue here is how do you proceed beyond the initial sort step - this is for you to figure out. MATLAB*P <-> frontend traffic is allowed, but can only be done in blocks of size 1000.
Constraint - at any time you are not allowed to have more than 10000 elements of the input data on each node, no matter where they come from.
What should you submit
- mergesort.m - entry point. Takes in an distributed vector and returns the sorted version.
- Any associated m-files.
Hints/Clarifications
- Q: "at any time you are not allowed to have more than 10000 elements of
the random vector on any one MATLAB process". What exactly does this
mean? Does the merge sort have to happen "in place", without using
additional memory? For example, does a = mm('sort',b) violate this
constraint? I guess the main problem here is that I don't know how
exactly the mm mode works and what exactly the constraint means.
A: Let the distributed vector to be sorted be x. The statement means each processor cannot hold more than 10000 elements of x at any time. You don't have to care about the internals of any function. The point is you cannot have more than 10000 elements of the input data.
- Q: "MATLAB*P <-> frontend traffic is allowed, but can only be done in
blocks of size 1000". What exactly is "MATLAB*P <-> frontend" traffic?
Is that when you use the pp2matlab/matlab2pp function calls? Does this
type of traffic occur at any other times? Is the final result supposed
to a ddense object? If so, when does this type of traffic occur?
A: I don't remember if the architecture of MATLAB*P has been explained in lecture. But the MATLAB that you are seeing is the 'frontend' and MATLAB*P itself is a parallel server running in the back. All the distributed matrices that you do calculate on are on the backend. They are never brought to the frontend unless explicitly requested (e.g. pp2matlab, matlab2pp, or subsref like A(1:5)).
The point here is that since you have a limit (10000) of the number of elements you can hold on each proc, to do merge sort you will need to swap data around. mm mode does not allow message passing from one node to another, so it has to pass through the frontend. And this is allowed.
- Q: Do the blocks really have to be sized EXACTLY at 1000? Or can the blocks be smaller?
A: They can be smaller.
- Q: It seems that if you make an assignment like temp = A(1:10,1) where A is a distributed dense array, temp is no longer distributed. (if this is followed by 'whos' temp is listed as a double array). Isn't this passing data to the front end? If not, on which node would temp reside? (I have been assuming that it gets sent to the front end...)
A: Temp is now on the frontend (and this counts as data to frontend traffic). This behavior might change in the future, but not for this homework.
- Q: As a note question -- the method I'm using for swapping the data works reliably -- and I am sorting the arrays, but it is fairly slow. I don't know how much I should be worried about this...
A: I won't be too worried. 100,000 elements (<1MB worth of data) is awfully small for a parallel application so communication cost will dominate.
- Q: For the second problem, can we use frontend matlab to do merging? And is it that it can also hold no more than 10,000 data of x? Also Does 10000
data ofx means any 10000 data of x, no matter which part they come
from x?
A: Yes you can use frontend to merge, and yes it can hold no more than
10000 element of x.
10000 elements mean any 10000 elements