parallel bug in 0.35 release #10085
Closed · armgong opened this issue on Feb 5, 2015 · 22 comments

@armgong commented on Feb 5, 2015

I am writing the rjulia package, which calls Julia from R. The following R code runs correctly on Julia 0.3.3 and 0.4, but fails on 0.3.5.

    library(rjulia)
    julia_init()
    julia_eval("addprocs(1)")
    for (i in 1:2)
    {
      julia_void_eval(paste("r=remotecall(",i,", rand, 2, 2)",sep = ""))
      y <- j2r("fetch(r)")
      cat("\n")
      cat("process", i, "got value:\n");
      print(y)
    }
    julia_void_eval("rmprocs(workers())")

On 0.3.5 it shows the following error:

    > library(rjulia)
    > julia_init()
    > julia_eval("addprocs(1)")
    NULL
    > for (i in 1:2)
    + {
    + julia_void_eval(paste("r=remotecall(",i,", rand, 2, 2)",sep = ""))
    + y <- j2r("fetch(r)")
    + cat("\n")
    + cat("process", i, "got value:\n");
    + print(y)
    + }

    process 1 got value:
              [,1]      [,2]
    [1,] 0.5080583 0.5226671
    [2,] 0.3843354 0.6072951
    fatal: error thrown and no exception handler available.
    MemoryError()
    signal (11): Segmentation fault
    unknown function (ip: -1616141115)
    unknown function (ip: -1615757548)
    unknown function (ip: -1616127477)
    unknown function (ip: -1618661289)
    unknown function (ip: -1618550229)
    unknown function (ip: -1618537003)
    unknown function (ip: -1618576727)
    unknown function (ip: -1616064465)
    unknown function (ip: -1616064242)
    unknown function (ip: -1616064028)
    unknown function (ip: -1624995090)
    unknown function (ip: -1624994754)
    jl_trampoline at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    jl_apply_generic at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    jl_f_apply at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    julia_rmprocs_20121 at (unknown line)
    terminate_all_workers at ./multi.jl:1614
    jlcall_terminate_all_workers_18781 at /data/julia0.3/julia/usr/bin/../lib/julia/sys.so (unknown line)
    jl_apply_generic at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    _atexit at ./client.jl:423
    jlcall__atexit_18051 at /data/julia0.3/julia/usr/bin/../lib/julia/sys.so (unknown line)
    jl_apply_generic at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    uv_atexit_hook at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    jl_exit at /data/julia0.3/julia/usr/lib/libjulia.so (unknown line)
    unknown function (ip: -1624911487)
    unknown function (ip: -1624911380)
    unknown function (ip: -1624920120)
    unknown function (ip: -1547017712)
    unknown function (ip: 33247064)

@armgong commented on Feb 5, 2015

After searching the code, I think this issue may be caused by a GC bug?

@ViralBShah added the parallel label on Feb 5, 2015
@armgong referenced this issue in armgong/rjulia on Feb 5, 2015: new issues about parallel #13 (Closed)
@jiahao added the regression label on Feb 5, 2015

@tkelman (The Julia Language member) commented on Feb 5, 2015

Can you do a git bisect on the release-0.3 branch to determine exactly which commit introduced this?
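For readers unfamiliar with the workflow requested here, the following is a minimal sketch of an automated `git bisect` run. It is not the procedure used in this thread: it builds a throwaway repository in a temporary directory, and the file names (`state.txt`, `revision.txt`), the commit messages, and the `grep`-based check are all hypothetical stand-ins. In a real bisect of release-0.3, the test command would instead rebuild Julia and run the failing rjulia script.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real Julia checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Bisect Demo"
git config user.email "bisect-demo@example.com"

# Five commits; the (hypothetical) regression is introduced in commit 4.
for n in 1 2 3 4 5; do
    if [ "$n" -ge 4 ]; then
        echo "status=broken" > state.txt
    else
        echo "status=ok" > state.txt
    fi
    echo "revision $n" > revision.txt
    git add state.txt revision.txt
    git commit -q -m "commit $n"
done

good=$(git rev-list --max-parents=0 HEAD)  # root commit, known good
bad=$(git rev-parse HEAD)                  # branch tip, known bad

# The test command's exit status drives the search: 0 marks a commit good, 1 marks it bad.
git bisect start "$bad" "$good" >/dev/null
git bisect run grep -q "status=ok" state.txt >/dev/null

# After the run, refs/bisect/bad points at the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
subject=$(git log -1 --format=%s "$first_bad")
git bisect reset >/dev/null 2>&1

echo "first bad commit: $subject"
```

In this toy history the script reports `commit 4` as the first bad commit. The same pattern applies to a real repository as long as the test command reliably exits with 0 on good revisions and with a status between 1 and 127 (other than 125, which means "skip") on bad ones.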
@armgong commented on Feb 5, 2015

As @tkelman suggested, I hunted for the bug and found that commit 06d01c2 causes this problem. Before it, everything is OK. With this commit, even simple code crashes:

    library(rjulia)
    julia_init()
    julia_void_eval("versioninfo()")

Before this commit, even a stress test like the following passes:

    library(rjulia)
    julia_init()
    julia_void_eval("versioninfo()")
    julia_void_eval("addprocs(2)")
    for (j in 1:1000)
    {
      for (i in 2:3)
      {
        julia_void_eval(paste("r=remotecall(",i,", rand, 2, 2)",sep = ""))
        y <- j2r("fetch(r)")
        cat("process",j, i, "got value:\n");
        print(y)
        cat("\n")
      }
    }
    julia_void_eval("rmprocs(workers())")
    cat("done\n")

@armgong commented on Feb 5, 2015

After reverting commit 06d01c2 on the head of the release-0.3 branch, everything runs like a charm, so Julia core devs, please revert 06d01c2.

@jiahao (The Julia Language member) commented on Feb 5, 2015

cc @vtjnash

@tkelman (The Julia Language member) commented on Feb 5, 2015

Very interesting. Thank you for running the bisect @armgong. Such a useful tool, git bisect.

@tkelman (The Julia Language member) commented on Feb 5, 2015

That particular commit was fixing a different bug, so I'm not sure if just reverting it is the correct solution. It's possible that RJulia might be doing something wrong here with respect to tasks or initialization, but I'll wait for Jameson to weigh in.

@tkelman referenced this issue on Feb 5, 2015: 0.3.6 release planning issue #10058 (Closed)

@armgong commented on Feb 6, 2015

If RJulia is doing something wrong, it might be in julia_init(), which only does three things:

1. jl_init(JULIA_HOME)
2. jl_eval_string("Base.init_parallel()")
3. jl_eval_string("Base.init_bind_addr(ARGS)")

Julia 0.3 does not call the parallel init functions in jl_init_with_image (jlapi.c), so we need the last two lines. 0.4 already has them in jl_init_with_image.

On Julia 0.3.3, the test runs OK with the last two lines added; without them the worker process cannot start.
On Julia 0.4.0, the test also runs OK with the last two lines added (though they are not needed, since they are already in jl_init_with_image).
On Julia 0.3.5, the worker process cannot start without the last two lines, and with them the test crashes.

@vtjnash (The Julia Language member) commented on Feb 6, 2015

#9461 will likely fix this when merged

@tkelman (The Julia Language member) commented on Feb 6, 2015

Is that backportable? It isn't even on master yet, would be good to get it merged sooner rather than later to give it time to be well-tested before backporting into 0.3.6.

@vtjnash (The Julia Language member) commented on Feb 6, 2015

That PR needs to be rewritten for the new GC, so that's not really an available option to merge to master first

@tkelman (The Julia Language member) commented on Feb 6, 2015

Ah right. It doesn't cherry-pick cleanly to 0.3 either, does it need to be rewritten twice then?

@vtjnash (The Julia Language member) commented on Feb 6, 2015

there's actually a couple of alternative ways of fixing this also. the undocumented issue is that you can't call a julia function from a higher stack frame than the one it created in julia_init

@vtjnash (The Julia Language member) commented on Feb 7, 2015

actually, i missed that you said this worked on 0.4. that means #9461 isn't really necessary. what is more necessary is finishing cherry-picking the other changes such as 54affdb. or call the JL_SET_STACK_BASE from some function that doesn't return (in 0.4, this is rolled into julia_init)

@tkelman (The Julia Language member) commented on Feb 7, 2015

Oh dear. #9266 caused lots of problems, that seems really questionable for backporting. I'd be in favor of a less drastic modification on release-0.3 if possible.

@vtjnash (The Julia Language member) commented on Feb 7, 2015

the less drastic measure is to record the jl_stack_base value in every jl_task_t, as we did before 06d01c2, during the first call to save_stack (for any task != jl_root_task).
the restore_stack function assumes that value is const after that point, whereas jl_eval_string will change it if it was previously undefined (due to a lack of call to JL_SET_STACK_BASE)

@tkelman (The Julia Language member) commented on Feb 7, 2015

I see. Is that doable without unfixing #8551, or do we have to choose between trading one bug for another, or backporting a change that caused some still-unresolved issues?

@vtjnash (The Julia Language member) commented on Feb 7, 2015

oh right, not really -- #9266 was the implementation of what i just re-described above

@armgong commented on Feb 7, 2015

@vtjnash After further testing, 0.3.3 and 0.4 both have problems, but different ones. Just increase the iteration count to torture the Julia worker processes and rjulia. The test code is:

    library(rjulia)
    julia_init()
    julia_void_eval("versioninfo()")
    julia_void_eval("addprocs(2)")
    for (j in 1:5015)
    {
      for (i in 2:3)
      {
        julia_void_eval(paste("r=remotecall(",i,", rand, 2, 2)",sep = ""))
        y <- j2r("fetch(r)")
        cat("process",j, i, "got value:\n");
        print(y)
        cat("\n")
      }
    }
    julia_void_eval("rmprocs(workers())")
    cat("done\n")

********* 0.4 log: note that under 0.4 only the Julia worker processes crashed; the R process is still OK *************

    signal (11): Segmentation fault
    unknown function (ip: 624957107)
    unknown function (ip: 624965742)
    unknown function (ip: 624972752)
    jl_gc_collect at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    unknown function (ip: 624983543)
    jl_alloc_tuple_uninit at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    jl_f_tuple at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    jl_f_apply at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    ntuple at ./tuple.jl:30
    ....(lot of ntuple at ./tuple.jl:30 )
    ntuple at ./tuple.jl:30

    signal (11): Segmentation fault
    ntuple at ./tuple.jl:30
    ....(lot of ntuple at ./tuple.jl:30 )
    ntuple at ./tuple.jl:30
    deserialize_tuple at serialize.jl:355
    handle_deserialize at serialize.jl:350
    anonymous at task.jl:855
    unknown function (ip: 624884065)
    unknown function (ip: 0)
    unknown function (ip: -1397220928)
    unknown function (ip: -1397213074)
    unknown function (ip: -1397206064)
    jl_gc_collect at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    unknown function (ip: -1397195273)
    jl_alloc_tuple_uninit at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    jl_f_tuple at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    jl_f_apply at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    ntuple at ./tuple.jl:30
    ....(lot of ntuple at ./tuple.jl:30 )
    ntuple at ./tuple.jl:30
    Worker 2 terminated.ProcessExitedException ()
    process 4940 2 got value:
    NULL
    ArgumentErrorWorker 3 terminated.( "stream is closed or unusable")
    ProcessExitedException()
    process 4940 3 got value:
    NULL
    ProcessExitedException()
    ProcessExitedException()
    process 4941 2 got value:
    NULL

************* 0.3 log: note that under 0.3 the R process dumps core *************

    process 4632 3 got value:
              [,1]      [,2]
    [1,] 0.3489215 0.3766013
    [2,] 0.7847451 0.8211198

    *** stack smashing detected ***: /usr/lib64/R/bin/exec/R terminated
    ======= Backtrace: =========
    /usr/lib/libc.so.6(+0x732ae)[0x7f88025232ae]
    /usr/lib/libc.so.6(__fortify_fail+0x37)[0x7f88025a8907]
    /usr/lib/libc.so.6(__fortify_fail+0x0)[0x7f88025a88d0]
    /usr/lib64/R/lib/libR.so(Rf_applyClosure+0x6f2)[0x7f8802b44cc2]
    /usr/lib64/R/lib/libR.so(Rf_eval+0x341)[0x7f8802b3eaa1]
    /usr/lib64/R/lib/libR.so(+0xd1dd0)[0x7f8802b40dd0]
    /usr/lib64/R/lib/libR.so(Rf_eval+0x534)[0x7f8802b3ec94]
    /usr/lib64/R/lib/libR.so(+0xd44dd)[0x7f8802b434dd]
    /usr/lib64/R/lib/libR.so(Rf_eval+0x534)[0x7f8802b3ec94]
    /usr/lib64/R/lib/libR.so(+0xd1dd0)[0x7f8802b40dd0]
    /usr/lib64/R/lib/libR.so(Rf_eval+0x534)[0x7f8802b3ec94]
    /usr/lib64/R/lib/libR.so(+0xd44dd)[0x7f8802b434dd]
    /usr/lib64/R/lib/libR.so(Rf_eval+0x534)[0x7f8802b3ec94]
    /usr/lib64/R/lib/libR.so(Rf_ReplIteration+0x252)[0x7f8802b677d2]
    /usr/lib64/R/lib/libR.so(+0xf8b31)[0x7f8802b67b31]
    /usr/lib64/R/lib/libR.so(run_Rmainloop+0x44)[0x7f8802b68084]
/usr/lib64/R/bin/exec/R(main+0x1b)[0x40082b] /usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f88024d0040] /usr/lib64/R/bin/exec/R[0x40085b] ======= Memory map: ======== 00400000-00401000 r-xp 00000000 08:01 13240744 /usr/lib/R/bin/exec/R 00600000-00601000 r--p 00000000 08:01 13240744 /usr/lib/R/bin/exec/R 00601000-00602000 rw-p 00001000 08:01 13240744 /usr/lib/R/bin/exec/R 0238e000-0d3a3000 rw-p 00000000 00:00 0 [heap] 7f87c0000000-7f87c0021000 rw-p 00000000 00:00 0 7f87c0021000-7f87c4000000 ---p 00000000 00:00 0 7f87c8000000-7f87c8021000 rw-p 00000000 00:00 0 7f87c8021000-7f87cc000000 ---p 00000000 00:00 0 7f87cfd77000-7f87cfd78000 ---p 00000000 00:00 0 7f87cfd78000-7f87d0578000 rw-p 00000000 00:00 0 [stack:8359] 7f87d0578000-7f87d0579000 ---p 00000000 00:00 0 7f87d0579000-7f87d0d79000 rw-p 00000000 00:00 0 [stack:8358] 7f87d0d79000-7f87d0d7a000 ---p 00000000 00:00 0 7f87d0d7a000-7f87d157a000 rw-p 00000000 00:00 0 [stack:8357] 7f87d157a000-7f87d157b000 ---p 00000000 00:00 0 7f87d157b000-7f87d1d7b000 rw-p 00000000 00:00 0 [stack:8356] 7f87d1d7b000-7f87d1d9f000 r-xp 00000000 08:11 1052567 /data/julia0.3/0.33/lib/libopenlibm.so.1.0 7f87d1d9f000-7f87d1f9f000 ---p 00024000 08:11 1052567 /data/julia0.3/0.33/lib/libopenlibm.so.1.0 7f87d1f9f000-7f87d1fa0000 rw-p 00024000 08:11 1052567 /data/julia0.3/0.33/lib/libopenlibm.so.1.0 7f87d1fa0000-7f87d1ffa000 r-xp 00000000 08:11 1052937 /data/julia0.3/0.33/lib/libmpfr.so.4.1.2 7f87d1ffa000-7f87d21fa000 ---p 0005a000 08:11 1052937 /data/julia0.3/0.33/lib/libmpfr.so.4.1.2 7f87d21fa000-7f87d21fc000 rw-p 0005a000 08:11 1052937 /data/julia0.3/0.33/lib/libmpfr.so.4.1.2 7f87d21fc000-7f87d2267000 r-xp 00000000 08:11 1052924 /data/julia0.3/0.33/lib/libgmp.so.10.1.3 7f87d2267000-7f87d2467000 ---p 0006b000 08:11 1052924 /data/julia0.3/0.33/lib/libgmp.so.10.1.3 7f87d2467000-7f87d2470000 rw-p 0006b000 08:11 1052924 /data/julia0.3/0.33/lib/libgmp.so.10.1.3 7f87d2470000-7f87d2473000 r-xp 00000000 08:11 1052623 /data/julia0.3/0.33/lib/libdSFMT.so 
7f87d2473000-7f87d2673000 ---p 00003000 08:11 1052623 /data/julia0.3/0.33/lib/libdSFMT.so 7f87d2673000-7f87d2674000 rw-p 00003000 08:11 1052623 /data/julia0.3/0.33/lib/libdSFMT.so 7f87d2674000-7f87d4674000 rw-p 00000000 00:00 0 7f87d4674000-7f87d46d2000 r-xp 00000000 08:11 1052861 /data/julia0.3/0.33/lib/libpcre.so.1.0.1 7f87d46d2000-7f87d48d1000 ---p 0005e000 08:11 1052861 /data/julia0.3/0.33/lib/libpcre.so.1.0.1 7f87d48d1000-7f87d48d2000 rw-p 0005d000 08:11 1052861 /data/julia0.3/0.33/lib/libpcre.so.1.0.1 7f87d48d2000-7f87da8d2000 rw-p 00000000 00:00 0 7f87da8d2000-7f87da8d3000 ---p 00000000 00:00 0 7f87da8d3000-7f87db0d3000 rw-p 00000000 00:00 0 [stack:8339] 7f87db0d3000-7f87df0d3000 rw-p 00000000 00:00 0 7f87df0d3000-7f87df0d4000 ---p 00000000 00:00 0 7f87df0d4000-7f87df8d4000 rw-p 00000000 00:00 0 [stack:8338] 7f87df8d4000-7f87e18d4000 rw-p 00000000 00:00 0 7f87e18d4000-7f87e18d5000 ---p 00000000 00:00 0 7f87e18d5000-7f87e20d5000 rw-p 00000000 00:00 0 [stack:8337] 7f87e20d5000-7f87e60d5000 rw-p 00000000 00:00 0 7f87e60d5000-7f87e60d6000 ---p 00000000 00:00 0 7f87e60d6000-7f87e68d6000 rw-p 00000000 00:00 0 [stack:8336] 7f87e68d6000-7f87e68d7000 ---p 00000000 00:00 0 7f87e68d7000-7f87e70d7000 rw-p 00000000 00:00 0 [stack:8335] 7f87e70d7000-7f87e70d8000 ---p 00000000 00:00 0 7f87e70d8000-7f87e78d8000 rw-p 00000000 00:00 0 [stack:8334] 7f87e78d8000-7f87e78d9000 ---p 00000000 00:00 0 7f87e78d9000-7f87e80d9000 rw-p 00000000 00:00 0 [stack:8333] 7f87e80d9000-7f87ee0d9000 rw-p 00000000 00:00 0 7f87ee0d9000-7f87ee0da000 ---p 00000000 00:00 0 7f87ee0da000-7f87ee8da000 rw-p 00000000 00:00 0 [stack:8332] 7f87ee8da000-7f87f08da000 rw-p 00000000 00:00 0 7f87f08da000-7f87f08db000 ---p 00000000 00:00 0 7f87f08db000-7f87f10db000 rw-p 00000000 00:00 0 [stack:8331] 7f87f10db000-7f87f50db000 rw-p 00000000 00:00 0 7f87f50db000-7f87f50dc000 ---p 00000000 00:00 0 7f87f50dc000-7f87f58dc000 rw-p 00000000 00:00 0 [stack:8330] 7f87f58dc000-7f87f78dc000 rw-p 00000000 00:00 0 
7f87f78dc000-7f87f78dd000 ---p 00000000 00:00 0 7f87f78dd000-7f87f80dd000 rw-p 00000000 00:00 0 [stack:8329] 7f87f80dd000-7f87f80de000 ---p 00000000 00:00 0 7f87f80de000-7f87f88de000 rw-p 00000000 00:00 0 [stack:8328] 7f87f88de000-7f87f88df000 ---p 00000000 00:00 0 7f87f88df000-7f87f90df000 rw-p 00000000 00:00 0 [stack:8327] 7f87f90df000-7f87f90e0000 ---p 00000000 00:00 0 7f87f90e0000-7f87f98e0000 rw-p 00000000 00:00 0 [stack:8326] 7f87f98e0000-7f87f98e1000 ---p 00000000 00:00 0 7f87f98e1000-7f87fa0e1000 rw-p 00000000 00:00 0 [stack:8325] 7f87fa0e1000-7f87fbf7e000 r-xp 00000000 08:11 1052751 /data/julia0.3/0.33/lib/libopenblas.so 7f87fbf7e000-7f87fc17d000 ---p 01e9d000 08:11 1052751 /data/julia0.3/0.33/lib/libopenblas.so 7f87fc17d000-7f87fc19a000 rw-p 01e9c000 08:11 1052751 /data/julia0.3/0.33/lib/libopenblas.so 7f87fc19a000-7f87fc2ae000 rw-p 00000000 00:00 0 7f87fc30c000-7f87fcaae000 rw-p 00000000 00:00 0 7f87fcaae000-7f87fce90000 r-xp 00000000 08:11 1053030 /data/julia0.3/0.33/lib/julia/sys.so 7f87fce90000-7f87fd08f000 ---p 003e2000 08:11 1053030 /data/julia0.3/0.33/lib/julia/sys.so 7f87fd08f000-7f87fd0c0000 rw-p 003e1000 08:11 1053030 /data/julia0.3/0.33/lib/julia/sys.so 7f87fd0c0000-7f87fd0df000 rw-p 00000000 00:00 0 7f87fd2e0000-7f87fd360000 rwxp 00000000 00:00 0 7f87fd360000-7f87fd561000 rw-p 00000000 00:00 0 7f87fd561000-7f87fd568000 r-xp 00000000 08:01 2099261 /home/armgong/R/x86_64-unknown-linux-gnu-library/3.1/rjulia/libs/rjulia.so 7f87fd568000-7f87fd767000 ---p 00007000 08:01 2099261 /home/armgong/R/x86_64-unknown-linux-gnu-library/3.1/rjulia/libs/rjulia.so 7f87fd767000-7f87fd768000 r--p 00006000 08:01 2099261 /home/armgong/R/x86_64-unknown-linux-gnu-library/3.1/rjulia/libs/rjulia.so 7f87fd768000-7f87fd769000 rw-p 00007000 08:01 2099261 /home/armgong/R/x86_64-unknown-linux-gnu-library/3.1/rjulia/libs/rjulia.so 7f87fd769000-7f87fd8d2000 r-xp 00000000 08:01 12586451 /usr/lib/libstdc++.so.6.0.20 7f87fd8d2000-7f87fdad1000 ---p 00169000 08:01 12586451 
/usr/lib/libstdc++.so.6.0.20 7f87fdad1000-7f87fdadb000 r--p 00168000 08:01 12586451 /usr/lib/libstdc++.so.6.0.20 7f87fdadb000-7f87fdadd000 rw-p 00172000 08:01 12586451 /usr/lib/libstdc++.so.6.0.20 7f87fdadd000-7f87fdae1000 rw-p 00000000 00:00 0 7f87fdae1000-7f87fdaf6000 r-xp 00000000 08:01 12589184 /usr/lib/libz.so.1.2.8 7f87fdaf6000-7f87fdcf5000 ---p 00015000 08:01 12589184 /usr/lib/libz.so.1.2.8 7f87fdcf5000-7f87fdcf6000 r--p 00014000 08:01 12589184 /usr/lib/libz.so.1.2.8 7f87fdcf6000-7f87fdcf7000 rw-p 00015000 08:01 12589184 /usr/lib/libz.so.1.2.8 7f87fdcf7000-7f87fe97c000 r-xp 00000000 08:11 1052962 /data/julia0.3/0.33/lib/libjulia.so 7f87fe97c000-7f87feb7b000 ---p 00c85000 08:11 1052962 /data/julia0.3/0.33/lib/libjulia.so 7f87feb7b000-7f87fec53000 rw-p 00c84000 08:11 1052962 /data/julia0.3/0.33/lib/libjulia.so 7f87fec53000-7f87fed2d000 rw-p 00000000 00:00 0 7f87fed2d000-7f87ff2c5000 r-xp 00000000 08:01 12601905 /usr/lib/liblapack.so 7f87ff2c5000-7f87ff4c4000 ---p 00598000 08:01 12601905 /usr/lib/liblapack.so 7f87ff4c4000-7f87ff4c5000 r--p 00597000 08:01 12601905 /usr/lib/liblapack.so 7f87ff4c5000-7f87ff4c8000 rw-p 00598000 08:01 12601905 /usr/lib/liblapack.so 7f87ff4c8000-7f87ff5d6000 rw-p 00000000 00:00 0 7f87ff5d6000-7f87ff676000 r-xp 00000000 08:01 13240661 /usr/lib/R/library/stats/libs/stats.so 7f87ff676000-7f87ff875000 ---p 000a0000 08:01 13240661 /usr/lib/R/library/stats/libs/stats.so 7f87ff875000-7f87ff877000 r--p 0009f000 08:01 13240661 /usr/lib/R/library/stats/libs/stats.so 7f87ff877000-7f87ff879000 rw-p 000a1000 08:01 13240661 /usr/lib/R/library/stats/libs/stats.so 7f87ff879000-7f87ff8b7000 r-xp 00000000 08:01 13239404 /usr/lib/R/library/grDevices/libs/grDevices.so 7f87ff8b7000-7f87ffab6000 ---p 0003e000 08:01 13239404 /usr/lib/R/library/grDevices/libs/grDevices.so 7f87ffab6000-7f87ffabb000 r--p 0003d000 08:01 13239404 /usr/lib/R/library/grDevices/libs/grDevices.so 7f87ffabb000-7f87ffabd000 rw-p 00042000 08:01 13239404 
/usr/lib/R/library/grDevices/libs/grDevices.so 7f87ffabd000-7f87ffabe000 rw-p 00000000 00:00 0 7f87ffaf0000-7f87ffb2e000 r-xp 00000000 08:01 13239270 /usr/lib/R/library/graphics/libs/graphics.so 7f87ffb2e000-7f87ffd2e000 ---p 0003e000 08:01 13239270 /usr/lib/R/library/graphics/libs/graphics.so 7f87ffd2e000-7f87ffd2f000 r--p 0003e000 08:01 13239270 /usr/lib/R/library/graphics/libs/graphics.so 7f87ffd2f000-7f87ffd30000 rw-p 0003f000 08:01 13239270 /usr/lib/R/library/graphics/libs/graphics.so 7f87ffd30000-7f87ffd32000 r-xp 00000000 08:01 12586283 /usr/lib/gconv/ISO8859-1.so 7f87ffd32000-7f87fff31000 ---p 00002000 08:01 12586283 /usr/lib/gconv/ISO8859-1.so 7f87fff31000-7f87fff32000 r--p 00001000 08:01 12586283 /usr/lib/gconv/ISO8859-1.so 7f87fff32000-7f87fff33000 rw-p 00002000 08:01 12586283 /usr/lib/gconv/ISO8859-1.so 7f87fff33000-7f880003a000 rw-p 00000000 00:00 0 7f880003a000-7f8800042000 r-xp 00000000 08:01 13239779 /usr/lib/R/library/methods/libs/methods.so 7f8800042000-7f8800241000 ---p 00008000 08:01 13239779 /usr/lib/R/library/methods/libs/methods.so 7f8800241000-7f8800242000 r--p 00007000 08:01 13239779 /usr/lib/R/library/methods/libs/methods.so 7f8800242000-7f8800243000 rw-p 00008000 08:01 13239779 /usr/lib/R/library/methods/libs/methods.so 7f8800243000-7f880024d000 r-xp 00000000 08:01 13240445 /usr/lib/R/library/utils/libs/utils.so 7f880024d000-7f880044d000 ---p 0000a000 08:01 13240445 /usr/lib/R/library/utils/libs/utils.so 7f880044d000-7f880044e000 r--p 0000a000 08:01 13240445 /usr/lib/R/library/utils/libs/utils.so 7f880044e000-7f880044f000 rw-p 0000b000 08:01 13240445 /usr/lib/R/library/utils/libs/utils.so 7f880044f000-7f88004f7000 rw-p 00000000 00:00 0 7f88004f7000-7f8800502000 r-xp 00000000 08:01 12586092 /usr/lib/libnss_files-2.20.so 7f8800502000-7f8800702000 ---p 0000b000 08:01 12586092 /usr/lib/libnss_files-2.20.so 7f8800702000-7f8800703000 r--p 0000b000 08:01 12586092 /usr/lib/libnss_files-2.20.so 7f8800703000-7f8800704000 rw-p 0000c000 08:01 
12586092 /usr/lib/libnss_files-2.20.so 7f8800704000-7f88007b5000 rw-p 00000000 00:00 0 7f88007b5000-7f8800ae1000 r--p 00000000 08:01 12601662 /usr/lib/locale/locale-archive 7f8800ae1000-7f8800af7000 r-xp 00000000 08:01 12586447 /usr/lib/libgcc_s.so.1 7f8800af7000-7f8800cf6000 ---p 00016000 08:01 12586447 /usr/lib/libgcc_s.so.1 7f8800cf6000-7f8800cf7000 rw-p 00015000 08:01 12586447 /usr/lib/libgcc_s.so.1 7f8800cf7000-7f8800d34000 r-xp 00000000 08:01 12586464 /usr/lib/libquadmath.so.0.0.0 7f8800d34000-7f8800f33000 ---p 0003d000 08:01 12586464 /usr/lib/libquadmath.so.0.0.0 7f8800f33000-7f8800f34000 rw-p 0003c000 08:01 12586464 /usr/lib/libquadmath.so.0.0.0 7f8800f34000-7f8800f93000 r-xp 00000000 08:01 12586503 /usr/lib/libncursesw.so.5.9 7f8800f93000-7f8801193000 ---p 0005f000 08:01 12586503 /usr/lib/libncursesw.so.5.9 7f8801193000-7f8801197000 r--p 0005f000 08:01 12586503 /usr/lib/libncursesw.so.5.9 7f8801197000-7f8801199000 rw-p 00063000 08:01 12586503 /usr/lib/libncursesw.so.5.9 7f8801199000-7f88012be000 r-xp 00000000 08:01 12586457 /usr/lib/libgfortran.so.3.0.0 7f88012be000-7f88014be000 ---p 00125000 08:01 12586457 /usr/lib/libgfortran.so.3.0.0 7f88014be000-7f88014c0000 rw-p 00125000 08:01 12586457 /usr/lib/libgfortran.so.3.0.0 7f88014c0000-7f88014d6000 r-xp 00000000 08:01 12586439 /usr/lib/libgomp.so.1.0.0 7f88014d6000-7f88016d5000 ---p 00016000 08:01 12586439 /usr/lib/libgomp.so.1.0.0 7f88016d5000-7f88016d6000 rw-p 00015000 08:01 12586439 /usr/lib/libgomp.so.1.0.0 7f88016d6000-7f88016d9000 r-xp 00000000 08:01 12586089 /usr/lib/libdl-2.20.so 7f88016d9000-7f88018d8000 ---p 00003000 08:01 12586089 /usr/lib/libdl-2.20.so 7f88018d8000-7f88018d9000 r--p 00002000 08:01 12586089 /usr/lib/libdl-2.20.so 7f88018d9000-7f88018da000 rw-p 00003000 08:01 12586089 /usr/lib/libdl-2.20.so 7f88018da000-7f88018e1000 r-xp 00000000 08:01 12586133 /usr/lib/librt-2.20.so 7f88018e1000-7f8801ae0000 ---p 00007000 08:01 12586133 /usr/lib/librt-2.20.so 7f8801ae0000-7f8801ae1000 r--p 00006000 
08:01 12586133 /usr/lib/librt-2.20.so 7f8801ae1000-7f8801ae2000 rw-p 00007000 08:01 12586133 /usr/lib/librt-2.20.so 7f8801ae2000-7f8801b07000 r-xp 00000000 08:01 12593683 /usr/lib/liblzma.so.5.2.0 7f8801b07000-7f8801d06000 ---p 00025000 08:01 12593683 /usr/lib/liblzma.so.5.2.0 7f8801d06000-7f8801d07000 r--p 00024000 08:01 12593683 /usr/lib/liblzma.so.5.2.0 7f8801d07000-7f8801d08000 rw-p 00025000 08:01 12593683 /usr/lib/liblzma.so.5.2.0 7f8801d08000-7f8801d49000 r-xp 00000000 08:01 12589082 /usr/lib/libreadline.so.6.3 7f8801d49000-7f8801f49000 ---p 00041000 08:01 12589082 /usr/lib/libreadline.so.6.3 7f8801f49000-7f8801f4b000 r--p 00041000 08:01 12589082 /usr/lib/libreadline.so.6.3 7f8801f4b000-7f8801f52000 rw-p 00043000 08:01 12589082 /usr/lib/libreadline.so.6.3 7f8801f52000-7f8801f53000 rw-p 00000000 00:00 0 7f8801f53000-7f8802056000 r-xp 00000000 08:01 12586119 /usr/lib/libm-2.20.so 7f8802056000-7f8802256000 ---p 00103000 08:01 12586119 /usr/lib/libm-2.20.so 7f8802256000-7f8802257000 r--p 00103000 08:01 12586119 /usr/lib/libm-2.20.so 7f8802257000-7f8802258000 rw-p 00104000 08:01 12586119 /usr/lib/libm-2.20.so 7f8802258000-7f88022af000 r-xp 00000000 08:01 12601874 /usr/lib/libblas.so 7f88022af000-7f88024ae000 ---p 00057000 08:01 12601874 /usr/lib/libblas.so 7f88024ae000-7f88024af000 r--p 00056000 08:01 12601874 /usr/lib/libblas.so 7f88024af000-7f88024b0000 rw-p 00057000 08:01 12601874 /usr/lib/libblas.so 7f88024b0000-7f8802649000 r-xp 00000000 08:01 12586120 /usr/lib/libc-2.20.so 7f8802649000-7f8802849000 ---p 00199000 08:01 12586120 /usr/lib/libc-2.20.so 7f8802849000-7f880284d000 r--p 00199000 08:01 12586120 /usr/lib/libc-2.20.so 7f880284d000-7f880284f000 rw-p 0019d000 08:01 12586120 /usr/lib/libc-2.20.so 7f880284f000-7f8802853000 rw-p 00000000 00:00 0 7f8802853000-7f880286a000 r-xp 00000000 08:01 12586098 /usr/lib/libpthread-2.20.so 7f880286a000-7f8802a69000 ---p 00017000 08:01 12586098 /usr/lib/libpthread-2.20.so 7f8802a69000-7f8802a6a000 r--p 00016000 08:01 
12586098 /usr/lib/libpthread-2.20.so 7f8802a6a000-7f8802a6b000 rw-p 00017000 08:01 12586098 /usr/lib/libpthread-2.20.so 7f8802a6b000-7f8802a6f000 rw-p 00000000 00:00 0 7f8802a6f000-7f8802d15000 r-xp 00000000 08:01 13240745 /usr/lib/R/lib/libR.so 7f8802d15000-7f8802f15000 ---p 002a6000 08:01 13240745 /usr/lib/R/lib/libR.so 7f8802f15000-7f8802f1a000 r--p 002a6000 08:01 13240745 /usr/lib/R/lib/libR.so 7f8802f1a000-7f8802f26000 rw-p 002ab000 08:01 13240745 /usr/lib/R/lib/libR.so 7f8802f26000-7f8803012000 rw-p 00000000 00:00 0 7f8803012000-7f8803034000 r-xp 00000000 08:01 12586095 /usr/lib/ld-2.20.so 7f880304c000-7f880306c000 rwxp 00000000 00:00 0 7f880306c000-7f880321e000 rw-p 00000000 00:00 0 7f880321e000-7f8803220000 rw-p 00000000 00:00 0 7f8803220000-7f8803230000 rwxp 00000000 00:00 0 7f8803230000-7f8803231000 r--p 00000000 08:01 13239951 /usr/lib/R/library/translations/en/LC_MESSAGES/R.mo 7f8803231000-7f8803233000 rw-p 00000000 00:00 0 7f8803233000-7f8803234000 r--p 00021000 08:01 12586095 /usr/lib/ld-2.20.so 7f8803234000-7f8803235000 rw-p 00022000 08:01 12586095 /usr/lib/ld-2.20.so 7f8803235000-7f8803236000 rw-p 00000000 00:00 0 7fff60d02000-7fff60d2c000 rw-p 00000000 00:00 0 [stack] 7fff60d5e000-7fff60d60000 r--p 00000000 00:00 0 [vvar] 7fff60d60000-7fff60d62000 r-xp 00000000 00:00 0 [vdso] ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] signal (6): Aborted gsignal at /usr/lib/libc.so.6 (unknown line) abort at /usr/lib/libc.so.6 (unknown line) unknown function (ip: 38941363) __fortify_fail at /usr/lib/libc.so.6 (unknown line) __fortify_fail at /usr/lib/libc.so.6 (unknown line) Rf_applyClosure at /usr/lib64/R/lib/libR.so (unknown line) Rf_eval at /usr/lib64/R/lib/libR.so (unknown line) unknown function (ip: 45354448) Rf_eval at /usr/lib64/R/lib/libR.so (unknown line) unknown function (ip: 45364445) Rf_eval at /usr/lib64/R/lib/libR.so (unknown line) unknown function (ip: 45354448) Rf_eval at /usr/lib64/R/lib/libR.so (unknown line) unknown 
function (ip: 45364445) Rf_eval at /usr/lib64/R/lib/libR.so (unknown line) Rf_ReplIteration at /usr/lib64/R/lib/libR.so (unknown line) unknown function (ip: 45513521) run_Rmainloop at /usr/lib64/R/lib/libR.so (unknown line) main at /usr/lib64/R/bin/exec/R (unknown line) __libc_start_main at /usr/lib/libc.so.6 (unknown line) unknown function (ip: 4196443) unknown function (ip: 0) Aborted (core dumped)

@armgong commented on Feb 7, 2015

OK, the 0.4 problem is not related to embedding Julia. Running the following in the Julia 0.4 REPL

    for i=1:10000 r=remotecall(2,rand,2,2);fetch(r);r=remotecall(3,rand,2,2);fetch(r) end

also leads to the same issue: the Julia workers die, but the Julia head process stays alive.

    julia> versioninfo()
    Julia Version 0.4.0-dev+3172
    Commit 456b85a (2015-02-06 21:24 UTC)
    Platform Info:
      System: Linux (x86_64-unknown-linux-gnu)
      CPU: Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
      WORD_SIZE: 64
      BLAS: libopenblas (USE64BITINT NO_AFFINITY ATOM)
      LAPACK: libopenblas
      LIBM: libopenlibm
      LLVM: libLLVM-3.3

    julia> for i=1:10000 r=remotecall(2,rand,2,2);fetch(r);r=remotecall(3,rand,2,2);fetch(r) end

    signal (11): Segmentation fault
    unknown function (ip: -259180096)
    unknown function (ip: -259172242)
    unknown function (ip: -259165232)
    jl_gc_collect at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    unknown function (ip: -259154441)
    jl_alloc_tuple_uninit at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    jl_f_tuple at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    jl_f_apply at /data/julia/usr/bin/../lib/libjulia.so (unknown line)
    ntuple at ./tuple.jl:30
    ntuple at ./tuple.jl:30
    ntuple at ./tuple.jl:30
    ntuple at ./tuple.jl:30
    lot of it .....
    deserialize_tuple at serialize.jl:355
    handle_deserialize at serialize.jl:350
    anonymous at task.jl:855
    unknown function (ip: -259253919)
    unknown function (ip: 0)
    Worker 3 terminated.ERROR: ProcessExitedException()
     in wait at ./task.jl:288
     in wait at ./task.jl:198
     in wait_full at ./multi.jl:571
     in remotecall_fetch at multi.jl:671
     in call_on_owner at ./multi.jl:716
     in fetch at multi.jl:726
     in anonymous at no file:1

    julia>

Julia 0.3.5 runs this code just fine.

@armgong commented on Feb 7, 2015

    for i=1:10000 r=remotecall(2,rand,2,2);fetch(r);r=remotecall(3,rand,2,2);fetch(r) end

also fails in the Julia REPL on Windows x86_64 (0.4 master branch), with this message:

    F:>julia\usr\bin\julia -p 2
                   _
       _       _ _(_)_     |  A fresh approach to technical computing
      (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
       _ _   _| |_  __ _   |  Type "help()" for help.
      | | | | | | |/ _` |  |
      | | |_| | | | (_| |  |  Version 0.4.0-dev+3174 (2015-02-07 05:30 UTC)
     _/ |\__'_|_|_|\__'_|  |  Commit 49a1f2e* (0 days old master)
    |__/                   |  x86_64-w64-mingw32

    julia> for i=1:10000 r=remotecall(2,rand,2,2);fetch(r);r=remotecall(3,rand,2,2);fetch(r) end
    Worker 3 terminated.ERROR: InexactError()
     in _uv_hook_return_spawn at process.jl:229 (repeats 2 times)

@vtjnash added a commit that referenced this issue on Feb 8, 2015: fix #9883, #10085 (d780b7f)
@vtjnash added a commit that referenced this issue on Feb 8, 2015: fix #9883, #10085 (f769e41)
@vtjnash closed this on Feb 8, 2015
@vtjnash added the backport pending label on Feb 8, 2015
@tkelman added this to the 0.3.7 milestone on Feb 18, 2015

@tkelman (The Julia Language member) commented on Mar 11, 2015

Just for reference - see the comments on f769e41, I don't plan on backporting the fix for this myself because it has messy conflicts that I don't know how to resolve. If anyone would like to see this resolved for 0.3.7 or later, please prepare a PR against release-0.3.

@tkelman removed this from the 0.3.7 milestone on Jul 1, 2015
@tkelman removed the backport pending label on Jul 19, 2015
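As a rough illustration of what preparing such a backport involves, the sketch below cherry-picks a single fix commit onto a topic branch based on a maintenance branch. Everything in it is a hypothetical stand-in built in a temporary repository: `release-demo` plays the role of release-0.3, `backport-demo` is the branch a PR would be opened from, and the file names and commit messages are invented. It shows only the clean case. When the pick conflicts, as reported for f769e41 in this thread, `git cherry-pick` stops and the conflicts have to be resolved by hand before continuing.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real Julia checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Backport Demo"
git config user.email "backport-demo@example.com"

echo "base" > core.txt
git add core.txt
git commit -q -m "base"

# Hypothetical stand-in for a maintenance branch such as release-0.3.
git branch release-demo

# Development continues on the default branch: an unrelated change, then the fix.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "unrelated feature"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix crash"
fix=$(git rev-parse HEAD)

# Create a topic branch from the maintenance branch and backport only the fix.
# -x records the original commit hash in the new commit message.
git checkout -q -b backport-demo release-demo
git cherry-pick -x "$fix" >/dev/null

backport_msg=$(git log -1 --format=%B)
echo "$backport_msg"
```

The `-x` flag is a design choice rather than a requirement: it makes the backported commit traceable to the original, which helps reviewers compare the two when the pick needed manual adjustments.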