julia-users › Parallel computing: SharedArrays not updating on cluster
5 posts by 5 authors

PMab — Jun 21

Hi everyone,

I am using shared arrays and an @sync @parallel for loop to run computations on my university's cluster. The body of the loop takes a given line of a shared array, calls a function on it that uses a solver to return an array, and finally updates another shared array with the returned array. Here is a snippet of that code, which is part of a bigger function called from the main file. The functions and shared arrays are defined @everywhere.

```julia
function do_for_i(i::Int64)
    tmp_grid        = gridSt[i,:];
    tmp_resmat      = resmat[i,:];
    tmp_resmat_prev = resmat_prev[i,:];
    tmp_resmat_new, tmp_VF, tmp_TF, tmp_failed =
        solvePointList(mobj, tmp_grid, tmp_resmat, tmp_resmat_prev, printmode);
    return tmp_resmat_new, tmp_VF, tmp_TF, tmp_failed;
end

@sync @parallel for i in 1:NPT
    temp = do_for_i(i);
    # temp = solve_for_i(i, gridSt, resmat, resmat_prev, mobj, printmode)
    resmat[i,:] = temp[1];
    VFnext[i,:] = temp[2];
    TFnext[i,:] = temp[3];
end
```

The very puzzling thing is that when I run this code on a single Mac or PC with multiple workers (2 and 8, respectively), everything works fine and the shared arrays resmat, VFnext, and TFnext are updated. However, when I run it on a cluster (using the --machinefile option, and whatever the number of workers used), they are not updated. They seem to be updated only within the @sync @parallel for loop, but not in the body of the bigger function. Does anyone know what is going on? Is it possible that Julia's SharedArrays don't work on clusters?

Greg Plowman — Jun 21

Yes. AFAIK, shared arrays are shared across multiple processes on the same machine. Distributed arrays can be distributed across different machines.

Stefan Karpinski — Jun 22

That's right – shared-memory arrays cannot, by definition, be used on a non-shared-memory distributed system like a cluster. You may want DistributedArrays.
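For reference, a minimal single-machine sketch of the pattern in the original post, written against modern Julia names (where SharedArrays is a standard library and @parallel has become @distributed); the array sizes and the row-filling body are made up for illustration:

```julia
# Single-machine sketch: a SharedArray is visible to every worker started
# with addprocs(n) ON THIS HOST, but not to workers on other cluster nodes,
# which is exactly the failure described in this thread.
# (Illustrative sizes; in modern Julia, @parallel is spelled @distributed.)
using Distributed, SharedArrays

NPT, M = 4, 3
resmat = SharedArray{Float64}(NPT, M)   # backed by shared memory on this host

@sync @distributed for i in 1:NPT
    resmat[i, :] .= Float64(i)          # each iteration fills one row
end
```

After the loop, the master process sees the rows written by the workers, because all of them map the same shared-memory segment.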
Matthew Pearce — Jun 24

As the others have said, it won't work like that. I found a few options:

- DistributedArrays: message passing is handled in the background. Some limitations, but I haven't used it much.
- SharedArrays on each machine: you can share memory between all the pids on a single machine, and then pass messages between one process per machine to update.
- Regular Arrays on each machine: swap messages between all processes.

Which one works for you will depend on how big your arrays are and the access patterns of the code you're trying to run on them.

Kevin Keys — Jun 25

To clarify: shared-memory arrays cannot be used across multiple nodes of a compute cluster. If you schedule your code to run on only one node of a cluster, then your code should work fine. This is what I do on my university cluster; see here. If you need more parallel computing power than what is available on one cluster node, then, as others have said, you will need to appeal to a different array paradigm.

KLK
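The "regular arrays + message passing" option above can be sketched with pmap, which ships each index to a worker (on any machine) and returns the results to the master; no shared memory is needed, so it works across cluster nodes. Here `solve_row` is a hypothetical stand-in for the thread's solvePointList call, and the sizes are illustrative:

```julia
# Sketch: collect per-row results on the master with pmap, then write them
# into ordinary arrays. With real cluster workers, solve_row would need to
# be defined @everywhere; captured data like gridSt is serialized to the
# workers automatically as part of the closure.
using Distributed

solve_row(row) = (2 .* row, sum(row))   # hypothetical stand-in solver

NPT, M = 4, 3
gridSt = rand(NPT, M)

results = pmap(i -> solve_row(gridSt[i, :]), 1:NPT)  # results come back in order

resmat = zeros(NPT, M)
VFnext = zeros(NPT)
for i in 1:NPT
    resmat[i, :] = results[i][1]
    VFnext[i]    = results[i][2]
end
```

The trade-off versus SharedArrays is communication cost: every row and result crosses process boundaries, which is fine for coarse-grained work like the per-point solver in this thread.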