Errors when loading parallel packages with ssh tunnel #16778

Closed · EthanAnderes opened this issue on Jun 5 · 4 comments

EthanAnderes commented on Jun 5

Ref: google groups

I seem to be encountering a problem using packages in parallel when using ssh tunneling to set up parallel workers on a remote server. Here is how I am setting up the workers.

                   _
       _       _ _(_)_     |  A fresh approach to technical computing
      (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
       _ _   _| |_  __ _   |  Type "?help" for help.
      | | | | | | |/ _` |  |
      | | |_| | | | (_| |  |  Version 0.4.6-pre+37 (2016-05-27 22:56 UTC)
     _/ |\__'_|_|_|\__'_|  |  Commit 430601c (9 days old release-0.4)
    |__/                   |  x86_64-apple-darwin15.5.0

    julia> machines = ["anderes@xxx.xxx.edu", "anderes@xxx.xxx.edu"]
    2-element Array{ASCIIString,1}:
     "anderes@xxx.xxx.edu"
     "anderes@xxx.xxx.edu"

    julia> addprocs(machines, tunnel=true, dir="/home/anderes", exename="/usr/local/bin/julia", topology=:master_slave)
    2-element Array{Int64,1}:
     2
     3

After this, all of the following four code blocks fail (I'll only show the errors for the last one, for readability). Note that these commands work fine when I launch the workers on the same machine as the master node (without ssh tunneling).

    import Dierckx
    @everywhere using Dierckx
    @everywhere spl = Dierckx.Spline1D([1., 2., 3.], [1., 2., 3.], k=2)

    import Dierckx
    using Dierckx
    @everywhere spl = Dierckx.Spline1D([1., 2., 3.], [1., 2., 3.], k=2)

    using Dierckx
    @everywhere spl = Dierckx.Spline1D([1., 2., 3.], [1., 2., 3.], k=2)

    @everywhere using Dierckx
    @everywhere spl = Dierckx.Spline1D([1., 2., 3.], [1., 2., 3.], k=2)

The last gives the following errors:

    julia> @everywhere using Dierckx
    WARNING: node state is inconsistent: node 2 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Dierckx.ji
    WARNING: node state is inconsistent: node 3 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Dierckx.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Dierckx.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Dierckx.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Compat.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Compat.ji
    ERROR: On worker 2:
    LoadError: InitError: Dierckx not properly installed. Run Pkg.build("Dierckx")
     in __init__ at /Users/ethananderes/.julia/v0.4/Dierckx/src/Dierckx.jl:27
     in include_string at loading.jl:282
     in include_from_node1 at ./loading.jl:323
     in require at ./loading.jl:259
     in eval at ./sysimg.jl:14
     in anonymous at multi.jl:1394
     in anonymous at multi.jl:923
     in run_work_thunk at multi.jl:661
     [inlined code] from multi.jl:923
     in anonymous at task.jl:63
    during initialization of module Dierckx
    while loading /Users/ethananderes/.julia/v0.4/Dierckx/src/Dierckx.jl, in expression starting on line 714
     in remotecall_fetch at multi.jl:747
     in remotecall_fetch at multi.jl:750
     in anonymous at multi.jl:1396

    ...and 1 other exceptions.

     in sync_end at ./task.jl:413
     in anonymous at multi.jl:1405

    julia> @everywhere spl = Dierckx.Spline1D([1., 2., 3.], [1., 2., 3.], k=2)
    ERROR: On worker 2:
    error compiling __Spline1D#6__: could not load library "/Users/ethananderes/.julia/v0.4/Dierckx/src/../deps/src/ddierckx/libddierckx"
    /Users/ethananderes/.julia/v0.4/Dierckx/src/../deps/src/ddierckx/libddierckx: cannot open shared object file: No such file or directory
     in eval at ./sysimg.jl:14
     in anonymous at multi.jl:1394
     in anonymous at multi.jl:923
     in run_work_thunk at multi.jl:661
     [inlined code] from multi.jl:923
     in anonymous at task.jl:63
     in remotecall_fetch at multi.jl:747
     in remotecall_fetch at multi.jl:750
     in anonymous at multi.jl:1396

    ...and 1 other exceptions.

     in sync_end at ./task.jl:413
     in anonymous at multi.jl:1405

    julia>

The errors seem to be package dependent. Here are similar code blocks for the Distributions package. I only get an ERROR on the last one when using ssh tunneling, yet each one works fine when the workers are launched with addprocs(2).

    julia> import Distributions
    WARNING: node state is inconsistent: node 2 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: node state is inconsistent: node 3 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji

    julia> @everywhere using Distributions
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/PDMats.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/PDMats.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Compat.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Compat.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsFuns.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsFuns.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsBase.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsBase.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/ArrayViews.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/ArrayViews.ji

    julia> @everywhere spl = Distributions.Normal(0,1)

    julia>

    julia> @everywhere using Distributions
    WARNING: node state is inconsistent: node 2 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/PDMats.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Compat.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: node state is inconsistent: node 3 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsFuns.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/PDMats.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/Compat.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsBase.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsFuns.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/ArrayViews.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/StatsBase.ji
    WARNING: deserialization checks failed while attempting to load cache from /Users/ethananderes/.julia/lib/v0.4/ArrayViews.ji

    julia> @everywhere spl = Distributions.Normal(0,1)

    julia>

    julia> using Distributions
    WARNING: node state is inconsistent: node 2 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji
    WARNING: node state is inconsistent: node 3 failed to load cache from /Users/ethananderes/.julia/lib/v0.4/Distributions.ji

    julia> @everywhere spl = Distributions.Normal(0,1)
    ERROR: On worker 2:
    UndefVarError: Distributions not defined
     in eval at ./sysimg.jl:14
     in anonymous at multi.jl:1394
     in anonymous at multi.jl:923
     in run_work_thunk at multi.jl:661
     [inlined code] from multi.jl:923
     in anonymous at task.jl:63
     in remotecall_fetch at multi.jl:747
     in remotecall_fetch at multi.jl:750
     in anonymous at multi.jl:1396

    ...and 1 other exceptions.

     in sync_end at ./task.jl:413
     in anonymous at multi.jl:1405

tkelman added the parallel label on Jun 6

RossBoylan referenced this issue on Jun 6: Errors when loading parallel packages on same machine #16788 (Closed)

RossBoylan commented on Jun 6

This may be related to #16788, which shows similar errors using parallelism on one machine.

vtjnash (The Julia Language member) commented on Jun 12

The binary build on each client needs to be the same, not just the source checkout.

vtjnash closed this on Jun 12

EthanAnderes commented on Jun 13

@vtjnash Maybe I don't understand what you mean by binary build, but after the source checkout I compiled julia. In particular, when launching julia on both machines it displays the exact same version number.

vtjnash (The Julia Language member) commented on Jun 13

The exact bitwise same usr/lib/julia binaries, which requires copying them to each machine.
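
The point of the last comment, that a matching version number does not imply bitwise-identical binaries, can be checked from the master process. The following is only a minimal sketch under several assumptions: it is written for current Julia (1.x), where `Distributed` and `SHA` are standard libraries and `remotecall_fetch` takes the function first, not for the 0.4 build used in this issue; the helper names `sysimage_path`, `sysimage_digest`, `mismatched_workers` and `check_builds` are hypothetical; and it hashes only the system image reported by the internal `Base.JLOptions()`, not every file under `usr/lib/julia`.

```julia
# Sketch for current Julia (1.x). Assumption: the issue itself ran 0.4,
# where this code would need a different argument order and no stdlib SHA.
using Distributed
using SHA

# Define the (hypothetical) helpers on the master and on every worker.
@everywhere begin
    using SHA
    # Path of the system image the running process was started with.
    # Base.JLOptions is internal, so treat this as an assumption.
    sysimage_path() = unsafe_string(Base.JLOptions().image_file)
    # Hex SHA-256 of that file; equal digests mean bitwise-equal images.
    sysimage_digest() = bytes2hex(open(sha256, sysimage_path()))
end

# Pure comparison step: ids of workers whose digest differs from the master's.
function mismatched_workers(master_digest::AbstractString,
                            worker_digests::Dict{Int,String})
    return sort([id for (id, d) in worker_digests if d != master_digest])
end

# Collect one digest per worker and compare against the master process.
function check_builds()
    master = sysimage_digest()
    digests = Dict{Int,String}(p => remotecall_fetch(sysimage_digest, p)
                               for p in workers())
    return mismatched_workers(master, digests)
end
```

Calling `check_builds()` after `addprocs(...)` returns the ids of workers whose system image differs from the master's. A non-empty result would correspond to the situation described above, where the suggested remedy is to copy the binaries to each machine instead of compiling separately.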