julia-users › asynchronous reading from file
3 posts by 2 authors

pev...@gmail.com    Apr 15

Hi All,
I would like to implement asynchronous reading from a file. I am doing stochastic gradient descent, and while the optimisation is running I would like to load the data in the background. Since reading the data is followed by quite complicated parsing, it is not just a simple IO operation that can be done without CPU cycles.

The skeleton of my current implementation looks like this:

    rr = RemoteChannel()
    @async put!(rr, remotecall_fetch(loaddata, 2))
    for ii in 1:maxiter
        # do some steps of the gradient descent
        # check if the data are ready and schedule next reading
        if isready(rr)
            append!(dss[1], take!(rr));
            @async put!(rr, remotecall_fetch(loaddata, 2))
        end
    end

Nevertheless, isready(rr) always returns false, which looks as if the data are never loaded. I start julia as julia -p 2, therefore I expect there will be a processor available.

Can anyone please explain what I am doing wrong?
Thank you very much.
Tomas

James Fairbanks    Apr 16

Hi Tomas,

tomas writes:
> the skeleton of my current implementation looks like this
>
>     rr = RemoteChannel()
>     @async put!(rr, remotecall_fetch(loaddata, 2))
>     for ii in 1:maxiter
>         # do some steps of the gradient descent
>         # check if the data are ready and schedule next reading
>         if isready(rr)
>             append!(dss[1], take!(rr));
>             @async put!(rr, remotecall_fetch(loaddata, 2))
>         end
>     end

The example of pmap shown here uses @sync around a block with multiple @async operations.
http://docs.julialang.org/en/release-0.4/manual/parallel-computing/#synchronization-with-remote-references

My usage for stuff like this is to wrap the IO into a task:
http://docs.julialang.org/en/release-0.4/manual/control-flow/#tasks-aka-coroutines
http://docs.julialang.org/en/release-0.4/stdlib/parallel/

I think that @async is a lower-level API than using a `Task` that calls `produce(data)` when it has the data and another Task that calls `consume(iotask)` on the first task.
This approach is similar to Python generators.

> I start the julia as julia -p 2, therefore I expect there will be a processor.

The `@async` and Task tools work in a single process. The `@spawn` macro sends work to different processors.

> Can anyone explain me please, what am I doing wrong?

I am sure that others know better than I do. Here is my Task-based example. I am open to suggestions to make this example clearer.

Julia code:
------------------------------------------------
# set up a Task to do the IO in a pseudothread
# read from STDIN in a loop up to 20 lines.
iotask = @task begin
    info("reading from stdin")
    for i in 1:20
        s = readline(STDIN)
        produce(s)
    end
end

# our fake computation just prepends val to our input
function f(x)
    return "val:$x"
end

# a function that takes values and applies f to them in a worker Task aka pseudothread
# this function uses @task instead of creating a 0-argument function and passing it to Task().
function work(t::Task)
    @task begin
        for i in 1:20
            s = consume(t)
            info("worker got: $s")
            produce(f(s))
        end
    end
end

# the worker needs a handle to the IO task which is why we create it second
worktask = work(iotask)

# schedule both tasks so that they start executing
schedule(iotask)
schedule(worktask)

# this task based computation is based on pulling data. That is, if we don't ask
# the worker for any results, then no computation happens.
for i in 1:20
    x = consume(worktask)
    info("computed $x")
end
------------------------------------------------

Results
------------------------------------------------
for i in {0..22}; do echo $i; done | julia taskio-jl
INFO: reading from stdin
INFO: worker got: 0
INFO: computed val:0
INFO: worker got: 1
INFO: computed val:1
INFO: worker got: 2
INFO: computed val:2
INFO: worker got: 3
INFO: computed val:3
INFO: worker got: 4
INFO: computed val:4
INFO: worker got: 5
INFO: computed val:5
INFO: worker got: 6
INFO: computed val:6
INFO: worker got: 7
INFO: computed val:7
INFO: worker got: 8
INFO: computed val:8
INFO: worker got: 9
INFO: computed val:9
INFO: worker got: 10
INFO: computed val:10
INFO: worker got: 11
INFO: computed val:11
INFO: worker got: 12
INFO: computed val:12
INFO: worker got: 13
INFO: computed val:13
INFO: worker got: 14
INFO: computed val:14
INFO: worker got: 15
INFO: computed val:15
INFO: worker got: 16
INFO: computed val:16
INFO: worker got: 17
INFO: computed val:17
INFO: worker got: 18
INFO: computed val:18
INFO: worker got: 19
INFO: computed val:19
------------------------------------------------

Notice that only 20 results appear even though the input has 23 lines (0 through 22). Changing the loop bounds in the code is left as an exercise to the reader.

pev...@gmail.com    Apr 21

Hi James,
thanks for the reply. In your implementation, though, the reading is not in a separate process/thread, as I expect that you are bound by IO operations. In my problem there is computationally intensive post-processing. Should I modify the iotask as

    iotask = @task begin
        info("reading from stdin")
        for i in 1:20
            s = @spawn loaddata()
            produce(s)
        end
    end

Do I need to have the consumer of s wrapped as another task? Meaning, will my stochastic gradient descent loop look like your worker, and does the stochastic gradient descent need to produce something? I would like to understand the details.
Thanks for the answer.
Tomas
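
For readers on current Julia (1.x), where `produce`/`consume` were replaced by `Channel` and the multi-process tools live in the `Distributed` standard library, the pattern asked about in this thread can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the posters' code: the body of `loaddata`, the helper `schedule_load!`, the function `run_sgd`, and the batch numbering are all hypothetical stand-ins. One likely reason `isready(rr)` stayed false in the original skeleton is that `@async` tasks are scheduled cooperatively, so the task that fills `rr` only runs when the main loop yields; the sketch calls `yield()` once per iteration for that reason.

```julia
using Distributed

# Add one worker process unless Julia was already started with `julia -p N`.
nprocs() == 1 && addprocs(1)

# Hypothetical stand-in for the poster's `loaddata`: pretend to read and parse one batch.
@everywhere function loaddata(batch)
    sleep(0.05)                      # placeholder for file IO plus CPU-heavy parsing
    return fill(Float64(batch), 4)
end

# Hypothetical helper: start loading `batch` on worker `w`; the result is put
# into `rr` by a background task once the worker has finished.
schedule_load!(rr, w, batch) = @async put!(rr, remotecall_fetch(loaddata, w, batch))

function run_sgd(maxiter)
    w = first(workers())
    rr = RemoteChannel()             # created on the calling process
    dss = Vector{Float64}[]          # batches received so far
    batch = 1
    schedule_load!(rr, w, batch)
    for ii in 1:maxiter
        # ... some steps of gradient descent would go here ...
        yield()                      # give the @async task a chance to run
        if isready(rr)
            push!(dss, take!(rr))
            batch += 1
            schedule_load!(rr, w, batch)
        end
    end
    push!(dss, take!(rr))            # wait for the load that is still in flight
    return dss
end

dss = run_sgd(1000)
println("loaded ", length(dss), " batches")
```

In this arrangement the parsing runs in the worker process, and the optimisation loop never blocks on it until the final `take!`. The result still has to be transferred back to the calling process, so if `loaddata` returns large arrays that cost may matter; the sketch does not try to address it.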