Slow startup w/ --proc and --proc 80 .. can crash my machine
Posted to julia-users by Páll Haraldsson, 11/24/15

    $ time julia --proc auto -e
    real    0m3.292s
    user    0m15.464s
    sys     0m0.593s

"auto" means 8 CPUs (+1 for the master) on my machine. Maybe it should mean 4 because of hyperthreading? Or some in-between number, like 6? I didn't wait 15 sec, more like 4 (the 15 s is user CPU time summed over all processes; the wall-clock time is the 3.3 s "real" figure).

I'm not really worried about the startup wait for procs 4, 8 or 80, unless this indicates problems elsewhere, just curious.. and it seemed abnormal at first that the wait would get longer with higher numbers, or even with 1. The point of many procs is parallel speedup; even if all the CPUs have to do the same work on startup, in theory it should run in parallel. I guess this is just too much for the L3 cache..

The wait gets really long with --proc 80. I do not have 80 CPUs, so I don't need that case to be fast, just not to crash. Does it ever make sense to go above the number of virtual CPUs for --proc?

I was just testing the slowdown with up to --proc 80 on my Ubuntu 14.04 Linux machine. It crashed with VM (virtual memory, i.e. swap) off but worked with it on. With VM off I got a black screen with a brown blinking cursor, couldn't even get to a virtual terminal, and had to reset (not entirely unexpected). On a second try it froze for a long while, I couldn't get a virtual terminal, and my session got closed in the end, but I managed not to have to restart.

I'm not too worried about proc 1, but should I make a PR that limits procs to at most the number of virtual CPUs? I think I could manage that. Or maybe, if you can think of a reason to go higher, at most double the number of CPUs? More complicated would be to take the amount of virtual memory into account. If there is a reason to go higher, can't the number of workers always be changed from within the program?
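To make the proposed cap concrete, here is a minimal sketch of the idea as a shell wrapper. It is not something Julia does itself: `clamp_procs` is a hypothetical helper name, and it assumes `getconf _NPROCESSORS_ONLN` is available to report the number of online logical CPUs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Julia): cap a requested worker count
# at a limit, defaulting to the number of online logical CPUs.
clamp_procs() {
    requested=$1
    # An optional second argument overrides the detected CPU count.
    limit=${2:-$(getconf _NPROCESSORS_ONLN)}
    if [ "$requested" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$requested"
    fi
}
```

It could be used as `julia --proc "$(clamp_procs 80)" script.jl`, where `script.jl` is a placeholder. On an 8-CPU machine that would start 8 workers instead of 80.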
The programmer should know better and maybe have that capability, but for users it does not seem user friendly to be able to crash the machine by invoking Julia from the shell with high numbers like --proc 80. That option does not seem needed; it is just waiting for tinkerers like me who like to try everything out.. :-)

    $ time julia --proc 0 -e
    ERROR: julia: -p,--procs=<n> must be an integer >= 1

With --proc 1 I can see:

    $ ps aux |grep julia
    qwerty  8278 45.7  1.5 8720228 127244 pts/9  Sl+  17:05  0:01 julia --proc 1
    qwerty  8282 22.7  1.4 8616920 116076 ?      Ssl  17:05  0:00 /usr/bin/julia -Cx86-64 -J/usr/lib/x86_64-linux-gnu/julia/sys.so --bind-to 130.208.69.54 --worker

that you get one worker on top of the one master, but is it mostly a waste? Should the message say "must be an integer >= 2 and less than number of virtual processors"? Does proc 1 ever make sense? Is it for testing, or should it maybe do the same as if --proc is skipped (1 CPU vs. 1+1)?

Is this memory use (8616920, 1.4% of my 8 GB) roughly a constant that can't be reduced much? It would mean that a low-end Android phone (512 MB) would max out at 4.4 cores, if that, as the system must use something and you have zram (compressed VM) and no real VM, and it would crash with proc 5, maybe 4 or lower.

Note I tested everything with virtual memory off and then also on (I have lots of VM on an SSD, maybe too much, with some of the swap used). Because of copy-on-write (COW), I did not expect Julia to use a constant amount of memory multiplied by --proc, but that is in fact what happens. COW probably doesn't help on fork, as most of the work Julia does is considered data, not only code..?
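The memory estimate above can be checked with a rough sketch. It assumes each worker keeps a constant resident size of 116076 KiB (the RSS column for the worker in the ps output) and ignores whatever the OS and the master process need, so it is an upper bound at best. `max_workers` is a hypothetical helper name.

```shell
#!/bin/sh
# Rough upper bound on how many workers fit in RAM, assuming every worker
# has the same constant resident size (an assumption, taken from ps output).
max_workers() {
    ram_kib=$1
    per_worker_kib=$2
    # Integer division: only whole workers count.
    echo $(( ram_kib / per_worker_kib ))
}

max_workers $(( 512 * 1024 )) 116076   # 512 MB phone: prints 4
max_workers 8130224 116076             # 8130224 KiB total, as top reports here: prints 70
```

Under these assumptions about 70 workers would fit in this machine's RAM with nothing else running, which is in line with --proc 80 only surviving when swap is on.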
Compared to:

    $ time julia -e
    real    0m0.137s
    user    0m0.117s
    sys     0m0.038s

    $ time julia --proc 1 -e
    real    0m1.646s
    user    0m2.302s
    sys     0m0.213s

    $ time julia --proc 2 -e
    real    0m2.307s
    user    0m4.475s
    sys     0m0.317s

    $ time julia --proc 3 -e
    real    0m2.509s
    user    0m6.118s
    sys     0m0.437s

    $ time julia --proc 4 -e
    real    0m2.608s
    user    0m7.845s
    sys     0m0.502s

    $ time julia --proc 8 -e
    real    0m3.003s
    user    0m15.457s
    sys     0m0.824s

    $ time julia --proc 80 -e
    real    0m26.826s
    user    2m15.768s
    sys     0m8.688s

I also tested with VM on:

    top - 15:19:51 up 12 days, 22:55, 12 users, load average: 0,47, 3,35, 3,79
    Tasks: 267 total, 1 running, 266 sleeping, 0 stopped, 0 zombie
    %Cpu0 :   0,0 us,  0,0 sy,  0,0 ni,100,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu1 :   0,3 us,  0,0 sy,  0,0 ni, 99,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu2 :   0,3 us,  0,0 sy,  0,0 ni, 99,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu3 :   0,0 us,  0,0 sy,  0,0 ni, 99,3 id,  0,7 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu4 :   0,0 us,  0,3 sy,  0,0 ni, 99,7 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu5 :   0,0 us,  0,0 sy,  0,0 ni,100,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu6 :   0,0 us,  0,0 sy,  0,0 ni,100,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    %Cpu7 :   0,0 us,  0,0 sy,  0,0 ni,100,0 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
    KiB Mem:   8130224 total,  1201108 used,  6929116 free,     1212 buffers
    KiB Swap: 31264764 total,  2585184 used, 28679580 free.   116152 cached Mem

    top - 15:01:12 up 12 days, 22:37, 12 users, load average: 1,60, 1,72, 1,54
    Tasks: 298 total, 2 running, 296 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 16,4 us,  1,0 sy,  0,0 ni, 82,0 id,  0,5 wa,  0,0 hi,  0,0 si,  0,0 st
    KiB Mem:   8130224 total,  7020028 used,  1110196 free,     3116 buffers
    KiB Swap: 31264764 total,  1924952 used, 29339812 free.   704908 cached Mem

      PID USER    PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    13950 qwerty  20   0 1340360 247648  17256 R  99,8  3,0 418:09.71 chromium-browse
    12606 qwerty  20   0 2896688 1,359g 195360 S   8,3 17,5 290:57.88 chromium-browse

--
Palli.