julia-users › Parallel loop — 3 posts by 2 authors

ami...@gmail.com  10/22/15

Hi all,

I swear I tried to look into the documentation and online, but I can't figure out how to do what I want. I have a lot of sequential code, and at some point I want to parallelize the following loop:

    mat_a = zeros(n, n)
    for i = 1:n
        mat_a[i, i:n] = mean(mat_b[:, i] .* mat_b[:, i:n], 1)
    end

with mat_b being computed beforehand. I have a bunch of questions in order to better understand things:

- How do I best choose the number of procs with which to run Julia?
- Since each operation on the rows of mat_a can be done independently of the others, I'd like to send mat_b to each worker so that it can compute certain rows of mat_a as an array of vectors, which I would concatenate afterwards to recover mat_a. I wanted to send mat_b with the @everywhere macro, but it seems this works only for defining variables directly on a worker; I don't know how to send already-computed data to a specific worker.
- More generally, is this the best approach to parallelizing this kind of code?

Any advice appreciated. Thanks a lot!

Tim Holy  10/22/15

In short: use SharedArrays, if all processes are on the same host. The example in the documentation should make this pretty clear, but feel free to post again if it's not.

As for # of procs, just try different choices up to the # of cores in your machine and see what happens.

--Tim

ami...@gmail.com  10/23/15

Thanks a lot for this hint, Tim. I think this is indeed exactly what I need. I have implemented it more or less successfully, in the sense that it works and computes the correct matrix:

    # parallel helpers, from
    # http://stackoverflow.com/questions/27677399/julia-how-to-copy-data-to-another-processor-in-julia
    function sendto(p::Int; args...)
        for (nm, val) in args
            @spawnat(p, eval(Main, Expr(:(=), nm, val)))
        end
    end

    function sendto(ps::Vector{Int}; args...)
        for p in ps
            sendto(p; args...)
        end
    end

    getfrom(p::Int, nm::Symbol; mod=Main) = fetch(@spawnat(p, getfield(mod, nm)))

    function passobj(src::Int, target::Vector{Int}, nm::Symbol; from_mod=Main, to_mod=Main)
        r = RemoteRef(src)
        @spawnat(src, put!(r, getfield(from_mod, nm)))
        for to in target
            @spawnat(to, eval(to_mod, Expr(:(=), nm, fetch(r))))
        end
        nothing
    end

    function passobj(src::Int, target::Int, nm::Symbol; from_mod=Main, to_mod=Main)
        passobj(src, [target], nm; from_mod=from_mod, to_mod=to_mod)
    end

    function passobj(src::Int, target, nms::Vector{Symbol}; from_mod=Main, to_mod=Main)
        for nm in nms
            passobj(src, target, nm; from_mod=from_mod, to_mod=to_mod)
        end
    end

    # variables
    m = Int(1e3)
    n = Int(1e4)
    mat_b = rand(m, n)

    # sequential
    function compute_row_sequential(mat_b, i, n)
        return mean(mat_b[:, i] .* mat_b[:, i:n], 1)
    end

    mat_a = zeros(n, n)
    tic()
    for i = 1:n
        mat_a[i, i:n] = compute_row_sequential(mat_b, i, n)
    end
    toc()

    # parallel
    addprocs(3)

    @everywhere function compute_row_shared!(smat_a, mat_b, irange, n)
        for i in irange
            smat_a[i, i:n] = mean(mat_b[:, i] .* mat_b[:, i:n], 1)
        end
    end

    sendto(workers(), n=n)
    sendto(workers(), mat_b=mat_b)

    smat_a = SharedArray(Float64, (n, n), pids=workers())

    tic()
    @sync begin
        for p in procs(smat_a)
            @async begin
                irange = (p - 1):length(procs(smat_a)):n
                remotecall_wait(p, compute_row_shared!, smat_a, mat_b, irange, n)
            end
        end
    end
    toc()

    println(mat_a == smat_a)

The last line prints true, but I tried different values of m and n and I could not find a case where the parallel implementation is more efficient than the sequential one. So obviously, either I'm doing something wrong, or it's not the best approach for this case... I'll keep trying to improve the efficiency; any advice is again welcome :)

Thank you.
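
[Editor's sketch, not part of the thread] For readers following along in current Julia (1.x), the 0.4-era loop above translates as below: mean now lives in the Statistics stdlib and takes dims as a keyword, and its 1×k result needs vec() before assigning into a row slice. It also helps to see what the loop actually computes: mat_a[i, j] = mean(mat_b[:, i] .* mat_b[:, j]) for j ≥ i, i.e. the upper triangle of the Gram matrix mat_b' * mat_b scaled by 1/m. Sizes here are small for illustration, not the thread's m = 1e3, n = 1e4.

```julia
using LinearAlgebra, Statistics

m, n = 100, 20            # small sizes for illustration
mat_b = rand(m, n)

# The thread's loop, in Julia 1.x syntax.
mat_a = zeros(n, n)
for i in 1:n
    mat_a[i, i:n] = vec(mean(mat_b[:, i] .* mat_b[:, i:n]; dims=1))
end

# Equivalent single matrix multiply: mat_a[i, j] == dot(mat_b[:, i], mat_b[:, j]) / m
# for j >= i, so the whole result is the upper triangle of mat_b' * mat_b / m.
gram = triu(mat_b' * mat_b ./ m)
@assert mat_a ≈ gram
```

Since one multithreaded BLAS gemm typically outruns the explicit column-slicing loop by a wide margin, this reformulation may matter more for speed than how the loop is parallelized.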
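
[Editor's sketch, not part of the thread] On the original question of sending already-computed data to workers: in current Julia (≥1.0) the Distributed stdlib lets you interpolate a local value into @everywhere, which replaces hand-rolled helpers like sendto. A minimal sketch, assuming all the variable names below are illustrative:

```julia
using Distributed
addprocs(2)

mat_b = rand(10, 4)

# $mat_b splices the master's value into the expression, so each worker
# gets its own copy bound to Main.mat_b -- no sendto() helper needed.
@everywhere mat_b = $mat_b

# Referring to Main.mat_b (rather than capturing the local variable in the
# closure) makes the worker read its local copy instead of reserializing ours.
w = first(workers())
s = remotecall_fetch(() -> sum(Main.mat_b), w)
@assert s ≈ sum(mat_b)
```

One caveat relevant to the timing result above: in the @sync loop, mat_b is passed as an argument to remotecall_wait, so it is serialized to each worker on every call, making the earlier sendto moot. For a 1e3 × 1e4 matrix, that communication alone can swamp the per-row computation, which would explain why the parallel version never beats the sequential one.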