julia-users ›
Questions on parallelizing code - or how to deal with objects (and not just gathering data) in parallel.
5 posts by 3 authors  


Sleort 	

Sep 1


Hi,

I am trying to figure out how to parallelize a slightly convoluted Monte Carlo simulation in Julia (0.4.6), but I am having a hard time working out the "best"/"recommended" way of doing it. The non-parallel program structure goes like this:
1. Initialize a (large) Monte Carlo state object (of its own type), which is going to be updated using a Markov chain Monte Carlo update algorithm. Say,
   x = MCState()
   In my case this is NOT an array, but a linked list/graph structure. The state object also contains some parameters, to be determined iteratively.
2. Do n Monte Carlo updates (which change the state x) and gather some data from them in a dataobject.
   for it = 1:n
       doMCupdate!(x, dataobject)
   end
3. Based on the gathered data, update the parameters of the MC state,
   updateparameters!(x, dataobject)
4. Repeat from 2 until convergence by some measure.
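For concreteness, the sequential loop above can be sketched as follows. This is only an illustration: the fields of MCState and DataObject, the toy Metropolis update, and the step-size tuning rule are all hypothetical stand-ins (the real state is a linked list/graph), and the syntax is that of current Julia (`mutable struct`; on 0.4 this would be `type`).

```julia
# Hypothetical stand-ins for the poster's types; a real MCState would be a
# linked list/graph rather than a single number.
mutable struct MCState
    position::Float64   # toy "configuration"
    stepsize::Float64   # the parameter tuned between sweeps
end

mutable struct DataObject
    naccepted::Int
    nproposed::Int
end

# One Metropolis update targeting a standard normal density.
function doMCupdate!(x::MCState, d::DataObject)
    proposal = x.position + x.stepsize * (2rand() - 1)
    d.nproposed += 1
    if rand() < exp((x.position^2 - proposal^2) / 2)
        x.position = proposal
        d.naccepted += 1
    end
    return x
end

# Nudge the step size toward a 50% acceptance rate and return the measured rate.
function updateparameters!(x::MCState, d::DataObject)
    rate = d.naccepted / d.nproposed
    x.stepsize *= rate > 0.5 ? 1.1 : 0.9
    return rate
end

# Steps 2 to 4: sweep, update parameters, repeat until the rate is close to 50%.
function run!(x::MCState; n = 10_000, maxsweeps = 100, tol = 0.05)
    for sweep in 1:maxsweeps
        d = DataObject(0, 0)                    # fresh data for each sweep
        for it in 1:n
            doMCupdate!(x, d)
        end
        rate = updateparameters!(x, d)
        abs(rate - 0.5) < tol && return sweep   # converged by this measure
    end
    return maxsweeps
end

x = MCState(0.0, 1.0)   # step 1
run!(x)
```

The convergence measure here (acceptance rate within `tol` of 50%) is chosen purely to make the loop terminate; any other criterion slots into the same place.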
Ideally, the parallel code should read something like this:
1. Initialize a Monte Carlo state object on each worker. The state is large (in memory), so it should not be copied/moved around between workers.
2. Do independent Monte Carlo updates on each worker, collecting the data in independent dataobjects.
3. Gather all the relevant data of the dataobjects on the master process. Calculate what the new parameters should be based on these (compared to the non-parallel case, statistically improved) data. Distribute these parameters back to the Monte Carlo state objects on each worker process.
4. Repeat from 2 until convergence by some measure.
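One possible shape for the four parallel steps is sketched below. It is not an established recipe: each worker keeps its state in a per-process global `Ref`, and the master only ever sends small arguments and receives small summaries through `remotecall`/`remotecall_wait`. The names `STATE`, `init_state!`, `run_updates!`, `set_stepsize!`, and `tune!`, as well as the toy update, are hypothetical. The code targets current Julia, where these functions live in the Distributed standard library; on 0.4 they were in Base and the worker id came before the function argument.

```julia
using Distributed
addprocs(2)                      # assumption: two local worker processes

@everywhere begin
    # Hypothetical per-worker state: a toy position plus one tunable parameter.
    mutable struct MCState
        position::Float64
        stepsize::Float64
    end

    # Every process has its own STATE; its contents never leave that process.
    const STATE = Ref{MCState}()

    # Step 1: build the state on the worker itself.
    init_state!(stepsize) = (STATE[] = MCState(0.0, stepsize); nothing)

    # Step 2: run n updates on the local state and return only a small
    # summary (accepted, proposed), never the state.
    function run_updates!(n)
        x = STATE[]
        accepted = 0
        for it in 1:n
            proposal = x.position + x.stepsize * (2rand() - 1)
            if rand() < exp((x.position^2 - proposal^2) / 2)
                x.position = proposal
                accepted += 1
            end
        end
        return (accepted, n)
    end

    # Step 3 (second half): receive a new parameter from the master.
    set_stepsize!(s) = (STATE[].stepsize = s; nothing)
end

# Driver on the master process.
function tune!(; n = 10_000, maxsweeps = 100, tol = 0.05)
    stepsize = 1.0
    foreach(p -> remotecall_wait(init_state!, p, stepsize), workers())
    for sweep in 1:maxsweeps
        # Launch the updates on all workers first, then collect the summaries.
        futures = [remotecall(run_updates!, p, n) for p in workers()]
        results = fetch.(futures)
        # Pool the data from all workers and derive the new parameter.
        rate = sum(first, results) / sum(last, results)
        abs(rate - 0.5) < tol && return stepsize
        stepsize *= rate > 0.5 ? 1.1 : 0.9
        foreach(p -> remotecall_wait(set_stepsize!, p, stepsize), workers())
    end
    return stepsize
end

stepsize = tune!()
```

The global `Ref` is a deliberate simplification: it avoids shipping the state between processes, at the cost of tying each worker to a single state object.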
The question is: What is the "best" way of accomplishing this in Julia? 

As long as the entire program is wrapped within the same function/global scope, the parallel case can be accomplished with @everywhere, @parallel for, and @eval @everywhere x.parameters = $newparameters (for broadcasting the new parameters from the master to the workers). This, however, results in long, ugly code, which probably isn't very efficient from a compiler point of view. I would rather pass the parallel MCState objects between the various steps in the algorithm, as in the non-parallel version. This could (should?) maybe be achieved with RemoteRefs? However, RemoteRefs are references to the results of a calculation rather than to the objects on which the calculations are performed. The objects could of course be accessed by clever use of identity functions, the put!() function, etc., but again the approach seems rather inelegant/"hackish" to me...
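To make the remote-reference idea concrete, here is a minimal sketch in which a reference acts as a handle to an object that lives on one worker. It uses `RemoteChannel` from current Julia (from 0.5 on, `RemoteRef` was split into `Future` and `RemoteChannel`). The `Counter` type and the helpers `bump!` and `on_owner` are hypothetical. The sketch assumes that `fetch` on the owning process returns the stored object itself rather than a copy, so that in-place changes persist; that assumption is worth verifying on the Julia version in use.

```julia
using Distributed
nprocs() == 1 && addprocs(1)     # assumption: at least one local worker

@everywhere begin
    mutable struct Counter       # hypothetical stand-in for a large state object
        value::Int
    end
    bump!(c::Counter, n) = (c.value += n; c.value)
end

p = first(workers())

# A one-slot channel that lives on worker p; the handle `ref` is cheap to pass around.
ref = RemoteChannel(() -> Channel{Counter}(1), p)

# Construct the object on the worker itself and store it behind the handle.
remotecall_wait(r -> put!(r, Counter(0)), p, ref)

# Run f on the process `pid` that owns `ref`, handing it the stored object.
# fetch reads the channel's item without removing it, so the object stays put.
on_owner(f, pid, ref, args...) =
    remotecall_fetch((r, a...) -> f(fetch(r), a...), pid, ref, args...)

on_owner(bump!, p, ref, 5)
total = on_owner(bump!, p, ref, 2)
```

Only the handle and the small arguments cross process boundaries; the object itself is never serialized back to the master unless the master calls `fetch(ref)` directly.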

To summarize/generalize: I'm wondering how to deal with independent objects defined on each worker process. How to pass them between functions in parallel. How to gather information from them to the master process. How to broadcast information from the master to the workers... To me, my problem seems to be somewhat beyond the @parallel for, pmap and similar "distribute calculations and gather the result and that's it" approaches explained in the documentation and elsewhere. However, I'm sure there is a natural way to deal with it in Julia. After all, I'm trying to achieve a rather generic parallel programming pattern.

Any suggestions/ideas are very welcome!
 

Chris Rackauckas 	

Sep 1


Hey,
  Some of this changed in v0.5, so I would suggest starting this part of the project on v0.5.

  That said, I think you have to build the tools yourself using the basic parallel macros. You might want to look into ParallelDataTransfer.jl. It's built off a solution from StackExchange a while ago, though there is a relevant bug you'll need to help us squash. Anything you find helpful in this area I would love to have as a contribution to this package. It would be helpful to the community to have a curated repository of these functions/macros.
 

Sleort 	

Sep 3


Thanks for the information! I will have a look into it and see if there is anything I can help with.

Hmm... I must say I am a bit surprised that there is no such built-in functionality in Base..? Seems to be a rather basic parallel programming functionality to me. Is there something about Julia's parallel programming model/philosophy I have missed/misunderstood?
 

michae...@gmail.com 	

Sep 3


MPI.jl has a montecarlo.jl function which could possibly be used. A simple example that shows the idea is https://github.com/JuliaParallel/MPI.jl/blob/master/examples/07-pi-montecarlo.jl. montecarlo.jl automatically collects results. The part about updating the state and broadcasting the new state to the workers could also be done using the MPI package.


On Saturday, September 3, 2016 at 6:08:46 AM UTC+2, Sleort wrote:
Thanks for the information! I will have a look into it and see if there is anything I can help with.

Hmm... I must say I am a bit surprised that there is no such built-in functionality in Base..? Seems to be a rather basic parallel programming functionality to me. Is there something about Julia's parallel programming model/philosophy I have missed/misunderstood?
 

Sleort 	

Sep 4


I was kind of hoping that it would be possible to achieve the desired functionality within Julia's own parallel framework / without having to add the extra complexity of MPI... But it is good to know that there is an alternative, should Chris Rackauckas's solution (or other "Julian" solutions) not work as desired.