julia-users: Parallelizing Error on 0.5 on Ubuntu
ABB, Sep 30

On this Julia version:

    Version 0.5.0 (2016-09-19 18:14 UTC)
    Official http://julialang.org/ release
    x86_64-pc-linux-gnu

running on: Ubuntu 14.04.3 LTS

I am trying to do a Monte Carlo simulation in parallel across 36 workers. I have two problems (at least).

1. Some of the workers terminate at the beginning of the simulation, but I don't understand the error message:

    Worker 5 terminated.ERROR (unhandled task failure): ProcessExitedException()
     in yieldto(::Task, ::ANY) at ./event.jl:136
     in wait() at ./event.jl:169
     in wait(::Condition) at ./event.jl:27
     in wait(::Channel{Any}) at ./channels.jl:92
     in take!(::Channel{Any}) at ./channels.jl:73
     in #remotecall_fetch#606(::Array{Any,1}, ::Function, ::Function, ::Base.Worker, ::Function, ::Vararg{Any,N}) at ./multi.jl:1066
     in remotecall_fetch(::Function, ::Base.Worker, ::Function, ::Vararg{Any,N}) at ./multi.jl:1062
     in #remotecall_fetch#609(::Array{Any,1}, ::Function, ::Function, ::Int64, ::Function, ::Vararg{Any,N}) at ./multi.jl:1080
     in remotecall_fetch(::Function, ::Int64, ::Function, ::Vararg{Any,N}) at ./multi.jl:1080
     in (::Base.##667#668{Base.#+,ProjectModule.##45#47{Int64,Array{Any,1},Array{Any,2}},UnitRange{Int64},Array{UnitRange{Int64},1}})() at ./multi.jl:1998

This is not a huge problem, as the rest of the workers keep going and can finish the simulation, but I would like to understand what is going on, if possible. And maybe how to fix it so as to use those workers.

2. The more important problem is that at the end of the simulation, I run into other errors and nothing is returned. My uninformed (and probably wrong) guess is that there is something the program doesn't like about the fact that the different workers are finishing at different times?
The errors I get are:

    ERROR (unhandled task failure): EOFError: read end of file
    Worker 16 terminated.ERROR (unhandled task failure): ProcessExitedException()
     in yieldto(::Task, ::ANY) at ./event.jl:136
     in wait() at ./event.jl:169
     in wait(::Condition) at ./event.jl:27
     in wait(::Channel{Any}) at ./channels.jl:92
     in take!(::Channel{Any}) at ./channels.jl:73
     in #remotecall_fetch#606(::Array{Any,1}, ::Function, ::Function, ::Base.Worker, ::Function, ::Vararg{Any,N}) at ./multi.jl:1066
     in remotecall_fetch(::Function, ::Base.Worker, ::Function, ::Vararg{Any,N}) at ./multi.jl:1062
     in #remotecall_fetch#609(::Array{Any,1}, ::Function, ::Function, ::Int64, ::Function, ::Vararg{Any,N}) at ./multi.jl:1080
     in remotecall_fetch(::Function, ::Int64, ::Function, ::Vararg{Any,N}) at ./multi.jl:1080
     in (::Base.##667#668{Base.#+,ProjectModule.##45#47{Int64,Array{Any,1},Array{Any,2}},UnitRange{Int64},Array{UnitRange{Int64},1}})() at ./multi.jl:1998

And -

    ERROR: LoadError: ProcessExitedException()
     in wait(::Task) at ./task.jl:135
     in collect_to!(::Array{Array{Float64,2},1}, ::Base.Generator{Array{Task,1},Base.#wait}, ::Int64, ::Int64) at ./array.jl:340
     in collect(::Base.Generator{Array{Task,1},Base.#wait}) at ./array.jl:308
     in preduce(::Function, ::Function, ::UnitRange{Int64}) at ./multi.jl:2002
     in (::ProjectModule.##44#46{Int64,Array{Any,1},Array{Any,2},Int64})() at ./multi.jl:2011
     in macro expansion at ./task.jl:326 [inlined]
     in #OuterSim#43(::Int64, ::Int64, ::Int64, ::Array{Any,1}, ::Array{Any,2}, ::Function, ::Int64) at /home/ubuntu/dynhosp/DataStructs.jl:1321
     in (::ProjectModule.#kw##OuterSim)(::Array{Any,1}, ::ProjectModule.#OuterSim, ::Int64) at ./<missing>:0
     in include_from_node1(::String) at ./loading.jl:488
     in process_options(::Base.JLOptions) at ./client.jl:262
     in _start() at ./client.jl:318
    while loading /home/ubuntu/dynhosp/Run.jl, in expression starting on line 9

And finally:

    ERROR (unhandled task failure): On worker 9:
    ArgumentError: Dict(kv): kv needs to be an iterator of tuples or pairs
     in Type at ./dict.jl:388
     in CalcWTP at /home/ubuntu/dynhosp/DataStructs.jl:728
     in WTPMap at /home/ubuntu/dynhosp/DataStructs.jl:747
     in #PSim#32 at /home/ubuntu/dynhosp/DataStructs.jl:1024
     in #45 at ./multi.jl:2016
     in #625 at ./multi.jl:1421
     in run_work_thunk at ./multi.jl:1001
     in macro expansion at ./multi.jl:1421 [inlined]
     in #624 at ./event.jl:68
     in #remotecall_fetch#606(::Array{Any,1}, ::Function, ::Function, ::Base.Worker, ::Function, ::Vararg{Any,N}) at ./multi.jl:1070
     in remotecall_fetch(::Function, ::Base.Worker, ::Function, ::Vararg{Any,N}) at ./multi.jl:1062
     in #remotecall_fetch#609(::Array{Any,1}, ::Function, ::Function, ::Int64, ::Function, ::Vararg{Any,N}) at ./multi.jl:1080
     in remotecall_fetch(::Function, ::Int64, ::Function, ::Vararg{Any,N}) at ./multi.jl:1080
     in (::Base.##667#668{Base.#+,ProjectModule.##45#47{Int64,Array{Any,1},Array{Any,2}},UnitRange{Int64},Array{UnitRange{Int64},1}})() at ./multi.jl:1998

The actual function I am calling is:

    function OuterSim(MCcount::Int; T1::Int64 = 3, dim1::Int64 = 290, dim2::Int64 = 67, fi = fips, da = data05)
      outp = @sync @parallel (+) for j = 1:MCcount
        Texas = MakeNew(fi, da);
        eq_patients = NewPatients()
        neq_patients = NewPatients()
        ResultsOut(NewSim(T1, Texas, eq_patients), PSim(T1, neq_patients); T = T1)
      end
      outp[:,1] = outp[:,1]/MCcount
      return outp
    end

I added the @sync following the suggestion of a colleague here - I am not sure it's necessary. FWIW - I get the errors above on Ubuntu whether I include it or not.

This code does run and terminate without error on my own home machine (running OS-X, also v0.5), which has only four cores.

I would love your feedback!

Thanks - AB