Worker process exiting when calling into undefined module #13554

Closed. malmaud opened this issue on Oct 11, 2015 · 4 comments · 2 participants (malmaud, amitmurthy)

malmaud commented on Oct 11, 2015

This should give an error, but not crash the worker.

    module M
    f()=0
    end
    addprocs(1)
    remotecall_fetch(M.f, 2)

    WARNING: Module M not defined on process 2
    fatal error on 2: ERROR: UndefVarError: M not defined
     in deserialize at serialize.jl:502
     [inlined code] from operators.jl:313
     in handle_deserialize at serialize.jl:475
     [inlined code] from int.jl:187
     in deserialize at serialize.jl:521
     [inlined code] from operators.jl:313
     in handle_deserialize at serialize.jl:475
     [inlined code] from essentials.jl:111
     in deserialize at serialize.jl:696
     in deserialize_datatype at serialize.jl:649
     in handle_deserialize at serialize.jl:465
     [inlined code] from int.jl:187
     in deserialize at serialize.jl:435
     in message_handler_loop at multi.jl:847
     in process_tcp_streams at multi.jl:836
     in anonymous at task.jl:59
    Worker 2 terminated.
    ERROR: ProcessExitedException()
     in yieldto at ./task.jl:67
     in wait at ./task.jl:367
     in wait at ./task.jl:282
     in wait at ./channels.jl:93
     in take! at ./channels.jl:82
     in take! at ./multi.jl:792
     in remotecall_fetch at multi.jl:729
     [inlined code] from multi.jl:368
     in remotecall_fetch at multi.jl:734
    ERROR (unhandled task failure): EOFError: read end of file

malmaud added the bug and parallel labels on Oct 11, 2015

amitmurthy (The Julia Language member) commented on Oct 12, 2015

dup of #3680 ?

malmaud commented on Oct 12, 2015

Seems plausible, but I'm uncertain.

amitmurthy (The Julia Language member) commented on Oct 12, 2015

One way to handle this is to:

- serialize a message to a buffer,
- retrieve the length of the buffer, and then
- write the length to the socket, followed by the serialized bytes.

This way a socket connection can recover from serialization/deserialization errors.

The problem with this approach shows up mainly with large arrays, where we end up needing double the memory because of the buffer write, and writing the data twice.

I suggest just documenting this behaviour and the rationale for it.

malmaud commented on Oct 14, 2015

Yes, tricky situation. At any rate, I do think you're right that this is a dup so I'll close it.

malmaud closed this on Oct 14, 2015
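
For readers landing here later, the length-prefixed framing amitmurthy describes above can be sketched roughly as follows. This is only an illustration, not what Base does: `send_framed` and `recv_framed` are hypothetical names, the code assumes a current Julia where `serialize`/`deserialize` live in the `Serialization` stdlib (the thread predates that), and an `IOBuffer` stands in for the worker socket.

```julia
using Serialization  # assumption: Julia 1.x stdlib; not what the 2015 thread ran on

# Hypothetical helper: serialize `msg` into a temporary buffer, then write the
# byte count followed by the serialized bytes.
function send_framed(io::IO, msg)
    buf = IOBuffer()
    serialize(buf, msg)
    bytes = take!(buf)               # full serialized copy of `msg` held in memory
    write(io, Int64(length(bytes)))  # length prefix (host byte order in this sketch)
    write(io, bytes)                 # payload
    return nothing
end

# Hypothetical helper: consume one whole frame from `io`, then deserialize from
# the private copy. If deserialization throws, `io` is already positioned at the
# next frame, so the caller can catch the error and keep reading.
function recv_framed(io::IO)
    n = read(io, Int64)
    bytes = read(io, n)
    length(bytes) == n || throw(EOFError())
    return deserialize(IOBuffer(bytes))
end

# Usage: a bad frame in the middle does not desynchronize the stream.
io = IOBuffer()
send_framed(io, [1, 2, 3])
write(io, Int64(0))                  # empty payload, so deserialization fails
send_framed(io, "still alive")
seekstart(io)

first_msg = recv_framed(io)
bad_frame_failed = try
    recv_framed(io)
    false
catch
    true
end
last_msg = recv_framed(io)
```

The temporary `IOBuffer` in `send_framed` is where the cost mentioned above comes from: a large array is fully serialized into memory before anything reaches the socket, and the bytes are then written a second time.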