julia-users › (How?) Has parallel computing changed in 0.4?
11 posts by 3 authors

Nils Gudat 8/27/15
Might be in relation to this thread - I've ported some code from 0.3.11 to 0.4, but when trying to run it in parallel I encounter all kinds of issues. My main file starts with

    nprocs()==CPU_CORES || addprocs(CPU_CORES-1)

    @everywhere begin
      include(path*"Optimizations.jl")
      include(path*"Interpolations.jl")
      include(path*"Parameters.jl")
      ...
    end

Each file included here has a using statement at the beginning importing the necessary modules (e.g. Optimizations.jl will start with using Optim).

Now when I run this, I get warnings about "replacing modules", one for each worker. Furthermore, every now and then (I'd say about half the time), I also get warnings of the form "both Grid and Grid export "CoordInterpGrid"; uses of it in module Main must be qualified", and the whole thing ends in an UndefVarError on one of the workers, complaining that a function from one of the modules that should have been imported is undefined.

If I get past this stage and my code gets to the parallel loop, it finishes after a couple of seconds without any results. I've narrowed this down to the fact that the function being called in my parallel loop for some reason doesn't work on the worker processes (i.e. I can call func(a,b,c) on the main process, but remotecall_fetch(3,func,a,b,c) gives a MethodError). However, instead of getting lots of errors from my workers (as in 0.3.11), it seems that the loop simply skips over the errors and returns an empty results array without displaying an error. Is this expected behavior?

Steven G. Johnson 8/27/15
On Thursday, August 27, 2015 at 7:29:00 AM UTC-4, Nils Gudat wrote:
> Might be in relation to this thread - I've ported some code from 0.3.11 to 0.4, but when trying to run it in parallel I encounter all kinds of issues.

If you do "using Foo" then it imports the module on all the workers.
If you do "@everywhere using Foo" then it imports it twice, hence the warnings.

Solution: do "import Foo" only on node 1 (i.e. outside of @everywhere), which imports Foo everywhere (once), and then it is safe to do "@everywhere using Foo" (since using a module that has already been imported does not re-import it, it just puts the exported names into the namespace).

I think the same thing happened in 0.3, but there was no warning about the double import.

See also: https://github.com/JuliaLang/julia/issues/12381

Nils Gudat 8/27/15
Great, this does the trick! So as a general rule, for parallel processing the way to go would be

1. addprocs()
2. import all modules used
3. @everywhere include auxiliary files such as function definitions, which should include a using statement for all modules so that they are in the namespace of each worker

Is this broadly speaking correct?

Nils Gudat 8/27/15
Now that my code is running, in case anyone is still following this thread: is it possible that there has been a regression in parallel speed in 0.4? I recall that running my code in 0.3.11 resulted in a more or less linear speedup on my local machine (i.e. adding seven processors on my machine with 4 local physical cores resulted in a runtime improvement of about a factor of 3), while now the ratio is a little under 2.45.

Steven G. Johnson 8/27/15
Yes.

Nils Gudat 8/27/15
Thanks for the clarification :) Any pointers to where I could read up on the reasons for this, and (maybe?) ways around it?

Kristoffer Carlsson 8/28/15
https://github.com/JuliaLang/julia/issues/12794

On Thursday, August 27, 2015 at 6:47:07 PM UTC+2, Nils Gudat wrote:
> Thanks for the clarification :) Any pointers to where I could read up on the reasons for this, and (maybe?) ways around it?

Nils Gudat 8/28/15
Thanks! So this means the issue is fixed already? I just installed the latest 0.4 nightly (Windows), but performance hasn't changed.

Kristoffer Carlsson 8/28/15
When was that nightly built?
Maybe it isn't new enough to include the commit. It could be something else, of course.

Nils Gudat 8/28/15
0.4.0-dev+7053, commit ff77f73 (2015-08-28 04:25 UTC)
Windows (x86_64-w64-mingw32)

Kristoffer Carlsson 8/28/15
Could very well be another regression then. It would likely be appreciated if you could make a smallish test example and post an issue for it. Even better if you could git bisect to the commit that introduced the regression, but if you are using binaries that might be difficult.
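
Editorial sketch: a "smallish test example" of the kind suggested above might look roughly like the following. It is not code from the thread. The function name busywork, the task sizes, and the worker count are all hypothetical stand-ins for the real workload. The first line is needed on Julia 0.7 and later; on 0.3 and 0.4, where these functions lived in Base, it should be deleted.

```julia
using Distributed   # Julia 0.7+; delete this line on 0.3/0.4 (functions were in Base)

# Start a few local workers unless some are already running.
# The count 3 is an arbitrary choice for illustration.
nprocs() == 1 && addprocs(3)

# Hypothetical CPU-bound stand-in for the function called in the parallel loop.
# @everywhere defines it on the main process and on every worker.
@everywhere function busywork(n)
    s = 0.0
    for i = 1:n
        s += sin(i) * cos(i)
    end
    return s
end

tasks = fill(2 * 10^6, 16)   # 16 identical work items

# Warm up both code paths so that compilation time is not measured.
map(busywork, tasks[1:1])
pmap(busywork, tasks[1:1])

# Time the same work serially and spread across the workers.
serial_time   = @elapsed map(busywork, tasks)
parallel_time = @elapsed pmap(busywork, tasks)

println("workers:  ", nworkers())
println("serial:   ", serial_time, " s")
println("parallel: ", parallel_time, " s")
println("speedup:  ", serial_time / parallel_time)
```

Running the same script under each Julia version and comparing the reported speedups would give a self-contained number to attach to an issue. Whether this toy workload reproduces the slowdown seen with the real code is an open question, so the body of busywork may need to be adjusted to resemble the actual computation.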