julia-users › Slow startup w/--proc and --proc=80 .. can crash my machine

Páll Haraldsson, 11/24/15

$ time julia --proc=auto -e ""
real 0m3.292s
user 0m15.464s
sys 0m0.593s

"auto" means 8 CPUs (+1 for the master) on my machine. Maybe it should mean 4 because of hyperthreading? Or some in-between number (6?)? I didn't actually wait 15 sec, more like 4; the 15 s is "user" CPU time summed over all the processes.

I'm not really worried about the startup wait for --proc=4, 8 or 80, unless it indicates problems elsewhere. I'm just curious, and it seemed abnormal (at first) that the wait gets longer with higher numbers, or even with 1. The point of many procs is parallel speedup, so even if every process has to do the same work on startup, in theory that work should run in parallel. I guess this is just too much for the [L3] cache..

The wait gets really long with --proc=80 (I don't have 80 CPUs, so I don't need that to be fast, just not to crash..). Does it ever make sense to go above the number of virtual CPUs for --proc?

I was just testing the slowdown with up to --proc=80 on my Ubuntu 14.04 Linux. With VM (virtual memory, i.e. swap) off it crashed (with VM on it worked): I got a black screen with a brown blinking cursor, couldn't even get to a virtual terminal, and had to reset (not entirely unexpected..). On a second try the machine froze for a long while, I couldn't get a virtual terminal, and in the end my session was closed, but I managed not to have to restart..

I'm not too worried about --proc=1, but should I make a PR that limits procs to at most the number of [virtual] CPUs? I think I could manage that. Or, if you can think of a reason to go higher, say at most double the number of CPUs? Taking the amount of [virtual] memory into account would be more complicated. If there is a reason to go higher, can't the number of workers always be changed from within the program?
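To make the proposed cap concrete, here is a minimal shell sketch of the clamping logic, done in a wrapper rather than in Julia itself. The helper name clamp_workers is hypothetical (it is not part of Julia), and this is only an illustration of the idea, not what a PR would look like.

```shell
# Hypothetical helper (not part of Julia): clamp a requested worker
# count to the range [1, limit]. The lower bound of 1 mirrors the
# "must be an integer >= 1" check that julia already does.
clamp_workers() {
    requested=$1
    limit=$2
    if [ "$requested" -lt 1 ]; then
        echo 1
    elif [ "$requested" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$requested"
    fi
}
```

Assuming GNU coreutils' nproc is available to report the number of virtual CPUs, it could be used like this:

$ julia --proc="$(clamp_workers 80 "$(nproc)")" -e ""

Passing twice the nproc value as the limit would give the "at most double" variant.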

The programmer should know better and maybe should have that capability, but for users it doesn't seem user friendly to be able to crash the machine by invoking julia from the shell with a high number like --proc=80. That option doesn't seem needed; it's just waiting for tinkerers like me who like to try everything out.. :-/

$ time julia --proc=0 -e ""
ERROR: julia: -p,--procs= must be an integer >= 1

With --proc=1 I can see:

$ ps aux |grep julia
qwerty 8278 45.7 1.5 8720228 127244 pts/9 Sl+ 17:05 0:01 julia --proc=1
qwerty 8282 22.7 1.4 8616920 116076 ? Ssl 17:05 0:00 /usr/bin/julia -Cx86-64 -J/usr/lib/x86_64-linux-gnu/julia/sys.so --bind-to 130.208.69.54 --worker

that you get one worker on top of the one master, but is it mostly a waste? Should the message say "must be an integer >= 2 and at most the number of virtual processors"? Does --proc=1 ever make sense? Is it for testing, or should it maybe do the same as when --proc is omitted (1 CPU vs 1+1)?

Is this worker's memory use (116076 KiB resident, 1.4% of my 8 GB; the 8616920 KiB is virtual size) roughly a constant that can't be reduced much? It would mean that a low-end Android phone (512 MB) would max out at about 4.4 Julia processes, if that (the system must use some memory itself, and you have zram "compressed VM" and no "real VM"), and would crash with --proc=5, maybe with 4 or lower.

Note that I ran all tests with virtual memory off (and then also on), as I have lots of VM (on an SSD), maybe too much (with some of the swap in use). Because of copy-on-write (COW) I did not expect Julia's memory use to be a constant amount multiplied by --proc, but that is in fact what happens. COW probably doesn't help on fork, as most of what Julia works on is considered data, not (only) code..?
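The memory estimate above is just a division, and it can be sketched in shell as follows. The helper name max_procs_by_memory is made up for illustration, and it assumes every Julia process costs the same resident size as the worker in the ps output (116076 KiB). That probably overestimates a little, since RSS also counts pages shared between processes.

```shell
# Hypothetical helper: how many processes of a given resident size (KiB)
# fit into a given amount of RAM (KiB), rounded down.
# Assumes a constant per-process cost, which is only an approximation.
max_procs_by_memory() {
    total_kib=$1
    per_proc_kib=$2
    if [ "$per_proc_kib" -le 0 ]; then
        echo "per-process size must be > 0" >&2
        return 1
    fi
    echo $(( total_kib / per_proc_kib ))
}
```

With 512 MB taken as 524288 KiB, max_procs_by_memory 524288 116076 gives 4, in line with the 4.4 estimate. With the 8130224 KiB total that top reports on my machine it gives 70, which under the same assumption would be consistent with --proc=80 running out of RAM when swap is off.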

Compared to:

$ time julia -e ""
real 0m0.137s
user 0m0.117s
sys 0m0.038s

$ time julia --proc=1 -e ""
real 0m1.646s
user 0m2.302s
sys 0m0.213s

$ time julia --proc=2 -e ""
real 0m2.307s
user 0m4.475s
sys 0m0.317s

$ time julia --proc=3 -e ""
real 0m2.509s
user 0m6.118s
sys 0m0.437s

$ time julia --proc=4 -e ""
real 0m2.608s
user 0m7.845s
sys 0m0.502s

$ time julia --proc=8 -e ""
real 0m3.003s
user 0m15.457s
sys 0m0.824s

$ time julia --proc=80 -e ""
real 0m26.826s
user 2m15.768s
sys 0m8.688s

I also tested with VM on:

top - 15:19:51 up 12 days, 22:55, 12 users, load average: 0,47, 3,35, 3,79
Tasks: 267 total, 1 running, 266 sleeping, 0 stopped, 0 zombie
%Cpu0 : 0,0 us, 0,0 sy, 0,0 ni,100,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu1 : 0,3 us, 0,0 sy, 0,0 ni, 99,7 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu2 : 0,3 us, 0,0 sy, 0,0 ni, 99,7 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu3 : 0,0 us, 0,0 sy, 0,0 ni, 99,3 id, 0,7 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu4 : 0,0 us, 0,3 sy, 0,0 ni, 99,7 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu5 : 0,0 us, 0,0 sy, 0,0 ni,100,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu6 : 0,0 us, 0,0 sy, 0,0 ni,100,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
%Cpu7 : 0,0 us, 0,0 sy, 0,0 ni,100,0 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem: 8130224 total, 1201108 used, 6929116 free, 1212 buffers
KiB Swap: 31264764 total, 2585184 used, 28679580 free. 116152 cached Mem

top - 15:01:12 up 12 days, 22:37, 12 users, load average: 1,60, 1,72, 1,54
Tasks: 298 total, 2 running, 296 sleeping, 0 stopped, 0 zombie
%Cpu(s): 16,4 us, 1,0 sy, 0,0 ni, 82,0 id, 0,5 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem: 8130224 total, 7020028 used, 1110196 free, 3116 buffers
KiB Swap: 31264764 total, 1924952 used, 29339812 free. 704908 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
13950 qwerty    20   0 1340360 247648  17256 R 99,8  3,0 418:09.71 chromium-browse
12606 qwerty    20   0 2896688 1,359g 195360 S  8,3 17,5 290:57.88 chromium-browse

--
Palli.