julia-users ›
pmap - intermingled output from workers on v0.4
5 posts by 2 authors  


Greg Plowman 	

11/23/15


Has output from parallel workers changed in Julia v0.4 from v0.3?

I guess that running parallel processes might lead to intermingled output.
However, I have (more or less) the same parallel simulation code using pmap running on v0.3 and v0.4.

On v0.3 the output from workers is always orderly.

On v0.4 it's often intermingled between workers.
Moreover, the output sometimes seems delayed, as if it's being buffered and not flushed straight away.

Is there a way I can get the output from workers written immediately?

 
Greg Plowman 	

11/23/15


I should add that this problem occurs only when using remote workers (in my case, over ssh on Windows).

The following code produces intermingled output with multiple workers on multiple machines (Julia v0.4).
Output is orderly when using Julia v0.3, or with v0.4 when all workers are on the local machine.


function Launch()
    @everywhere function sim(trial, numIterations)
        println("Starting trial $trial")
        s = 0.0
        for i = 1:numIterations
            s += sum(sqrt(rand(10^6)))
        end
        println("Finished trial $trial")
        s
    end
    
    numTrials = 100
    numIterations = 100
    println("Running random simulation: $numTrials trials of $numIterations iterations ... ")
    results = pmap(sim, 1:numTrials, fill(numIterations, numTrials))
end 


bernhard 	

11/24/15


In my view it is natural that the order of the "output" (print statements) is intermingled, as the code runs in parallel. To my knowledge this was the same in 0.3. Is it possible that you had no workers at all (i.e. nprocs() evaluates to 1)?
Also, I cannot see any noticeable delay...
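
A quick way to check the worker count mentioned above is sketched below. It assumes a recent Julia (0.7 or later), where these functions live in the Distributed standard library; on v0.3/v0.4 nprocs() and workers() are available directly from Base. The helper name describe_cluster is only illustrative.

```julia
using Distributed  # Julia 0.7 and later; on v0.3/v0.4 these functions are in Base

# Hypothetical helper: collect the process count and the worker ids.
function describe_cluster()
    return (nprocs = nprocs(), workers = workers())
end

info = describe_cluster()
println("nprocs()  = ", info.nprocs)
println("workers() = ", info.workers)
```

If nprocs() reports 1, only the master process exists and pmap runs everything locally, so all output comes from a single process.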
 

Greg Plowman 	

11/25/15


Thanks for your reply.

"In my view it is natural that the order of the "output" (print statements) is intermingled, as the code runs in parallel."

Yes, I agree. But I'd like to make sure we're talking about the same level of intermingledness (is this a new word?).
Firstly, I don't really understand parallel processing, output streams, switching etc.
But when I first started using Julia for parallel sims (Julia v0.3), I was initially surprised that output from each worker was NOT intermingled, in the sense that each print statement from a worker was delivered to the master process console "atomically", i.e. there were discrete lines on the console, each wholly from a single worker.
Sure, the order of the lines depended on the speed of the processor, the amount of work to do etc.
After a while, I just assumed this was either magic, or there was some kind of queuing system with locking or similar.
In any case, I didn't really think about it until I started using Julia v0.4, where output lines are sometimes not discrete and sometimes delayed.

Here's an example of output:
 
     ...
     From worker 3:  Completed random trial 69
     From worker 3:  Starting random trial 86 with 1000000 games
     From worker 5:  Starting random trial 87 with 1000000 games
     From worker 2:  Completed random trial 70
     From worker 2:  Starting random trial 88 with 1000000 games
     From worker 27: Starting random trial 89 with 1000000 games
     From worker 21: Completed random trial  From worker 22: Starting random trial 90 with 1000000 games
     From worker 23: Starting random trial 93 with 1000000 games
     From worker 21: 81
     From worker 19: Starting random trial 91 with 1000000 games
     From worker 14: Starting random trial 96 with 1000000 games
     From worker 4:  Completed random trial 82
     From worker 4:  Starting random trial 98 with 1000000 games
     From worker 24: Completed random trial  From worker 26: Completed random trial 76
     From worker 25: Completed random trial 80
     From worker 24: 85
     From worker 22: Completed random trial 90
     From worker 3:  Completed random trial 86
     From worker 8:  Completed random trial  From worker 9:  Starting random trial 94 with 1000000 games
     From worker 8:  78
     From worker 3:  Starting random trial 99 with 1000000 games
     From worker 27: Completed random trial  From worker 29: Starting random trial 92 with 1000000 games
     From worker 28: Starting random trial 95 with 1000000 games
     From worker 27: 89
     From worker 2:  Completed random trial 88
     From worker 2:  Starting random trial 100 with 1000000 games
     From worker 23: Completed random trial 93
     From worker 29: Completed random trial 92
     From worker 28: Completed random trial 95
     From worker 14: Completed random trial  From worker 16: Completed random trial 72
     From worker 15: Completed random trial 75
     From worker 20: Completed random trial 79
     From worker 17: Completed random trial 83
     From worker 18: Completed random trial 84
     From worker 19: Completed random trial 91
     From worker 14: 96
     From worker 4:  Completed random trial 98
     From worker 9:  Completed random trial 94
     From worker 3:  Completed random trial 99
     From worker 10: Completed random trial  From worker 11: Completed random trial 65
     From worker 12: Completed random trial 66
     From worker 13: Completed random trial 71
     From worker 10: 77      From worker 11: Starting random trial 97 with 1000000 games
     From worker 10:
     From worker 2:  Completed random trial 100
     From worker 5:  Completed random trial  From worker 6:  Completed random trial 73
     From worker 7:  Completed random trial 74
     From worker 5:  87
     From worker 11: Completed random trial 97


Again, I have no idea how these things work, but here's the code from Julia v0.3 (multi.jl):

     if isa(stream, AsyncStream)
        let wrker = w
            # redirect console output from workers to the client's stdout:
            @async begin
                while !eof(stream)
                    line = readline(stream)
                    print("\tFrom worker $(wrker.id):\t$line")
                end
            end
        end
    end


And equivalent code from Julia v0.4:

function redirect_worker_output(ident, stream)
    @schedule while !eof(stream)
        line = readline(stream)
        if startswith(line, "\tFrom worker ")
            # STDOUT's of "additional" workers started from an initial worker on a host are not available
            # on the master directly - they are routed via the initial worker's STDOUT.
            print(line)
        else
            print("\tFrom worker $(ident):\t$line")
        end
    end
end


It seems we've gone from @async to @schedule.
Would this make a difference?
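
For comparison, here is a minimal sketch of the same line-based relay, written for a recent Julia (1.x) and run against in-memory buffers instead of real worker streams. The name relay_lines is hypothetical and not part of multi.jl. Both snippets above read one complete line and print it with its prefix in a single call, so the loop body itself looks equivalent in the two versions. As far as I know (an assumption, not confirmed in this thread), @schedule differed from @async only in not registering the task with an enclosing @sync, which would not by itself change how lines are written.

```julia
# Sketch of a line-based relay for Julia 1.x.
# `relay_lines` is a hypothetical name, not a function from multi.jl.
function relay_lines(ident, input::IO, output::IO)
    while !eof(input)
        # keep=true retains the trailing newline, as readline did on v0.3/v0.4
        line = readline(input; keep=true)
        # The prefix and the complete line go out in one print call.
        print(output, "\tFrom worker $(ident):\t$line")
    end
end

# In-memory stand-ins for a worker's stdout and the master's console.
input = IOBuffer("Starting trial 1\nFinished trial 1\n")
output = IOBuffer()
relay_lines(3, input, output)
relayed = String(take!(output))
print(relayed)
```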

 
Greg Plowman 	

11/26/15


OK, I've done a little more digging.

It seems that in v0.4, remote workers are started differently. This is my understanding:
Only one worker for each host is started directly from the master process.
Additional workers on each host are started from the first worker on that host.
Thus output from these additional workers is routed via the first worker on the host (rather than directly to master process).
Somehow this causes the intermingled output.
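
One thing that might be worth trying on the worker side is sketched below, under a loud assumption: that the split lines arise because a print call with several arguments writes its pieces separately, and relayed lines from other workers slip in between. That is not confirmed anywhere in this thread. In the sample output above, the split lines all seem to belong to the lowest-numbered worker of each group, which may be the first worker on each host. The helper print_whole_line is hypothetical and written for Julia 1.x.

```julia
# Hypothetical helper: build the complete line first, then hand it to the
# stream in a single write, instead of printing each piece separately.
function print_whole_line(io::IO, parts...)
    line = string(parts..., "\n")
    write(io, line)
    return line
end

# Demonstrated on an in-memory buffer; a worker would pass its stdout instead.
buf = IOBuffer()
print_whole_line(buf, "Completed random trial ", 81)
written = String(take!(buf))
print(written)
```

Whether a single write actually reaches the master console unsplit depends on the stream and transport, so this is only something to test, not a guaranteed fix.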

To overcome this, I can start all workers directly from the master process, and output is orderly again (as for v0.3).
Presumably, the new v0.4 indirect method was to speed up adding remote workers.

Clearly, I don't really understand much of this. And I'm not sure how connecting all workers directly to master process affects performance or scalability.
Intuitively, it doesn't sound good, but for my purpose it does give more readable output.

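To help speed up the startup of workers, I can start workers on different hosts in parallel (but each worker on a host is started serially and directly from the master process):

# `machines` is assumed to be a collection of (host, nworkers) tuples.
@sync begin
    for (host, nworkers) in machines
        # One task per host, so different hosts start up in parallel.
        @async begin
            for i = 1:nworkers
                # Workers on the same host are added one at a time, directly from the master.
                addprocs([(host,1)])
            end
        end
    end
end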
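
The scheduling pattern can be tried without any remote machines. The sketch below, for Julia 1.x, replaces addprocs with a hypothetical stand-in (fake_launch) that just sleeps briefly and records the host name; the host names and worker counts are made up for illustration.

```julia
# Same @sync/@async structure as above, with the launch step passed in
# so it can be swapped for a stand-in. `start_all` is a hypothetical name.
function start_all(machines, launch)
    @sync for (host, nworkers) in machines
        # One task per host: hosts proceed concurrently.
        @async for i in 1:nworkers
            # Within a host, launches happen one after another.
            launch(host)
        end
    end
end

launched = String[]
# Stand-in for addprocs([(host, 1)]): pretend startup takes a moment.
fake_launch(host) = (sleep(0.01); push!(launched, host))

machines = [("hostA", 2), ("hostB", 3)]  # made-up hosts and counts
start_all(machines, fake_launch)
println(launched)
```

Because each host gets its own task, launches for different hosts overlap, while the @sync block still waits until every launch has finished.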