julia-users › why's my julia code running slower than matlab, despite performance tips
26 posts by 9 authors

feza  May 8
I have read the performance section and believe I have followed all the suggested guidelines. The same matlab script takes less than 3 seconds; julia 0.4.5 takes 9.7 seconds, and julia 0.5 is even worse...
Script: https://gist.github.com/musmo/27436a340b41c01d51d557a655276783

michae...@gmail.com  May 8
I see that c is a constant array of Ints, and its elements multiply ux, uy and uz in a loop, where ux, uy and uz are arrays of floats, so there's a type stability problem.

feza  May 8
Good catch, although this still doesn't explain away the difference. @code_warntype shows me feq, f, ρ, ux, uy, uz are red for some reason, even though I have explicitly stated their types...

STAR0SS  May 8
You are using a lot of vectorized operations, and Julia isn't as good as matlab is with those. The usual solution is to devectorize your code and to use loops, except for matrix multiplication if you have large matrices.

Patrick Kofod Mogensen  May 8
As for the v0.5 performance (which is horrible), I think it's the boxing issue with closures: https://github.com/JuliaLang/julia/issues/15276. Right?

Patrick Kofod Mogensen  May 8
For what it's worth, it runs in about 3-4 seconds on my computer on the latest v0.4. CPU: Intel(R) Core(TM) i7-4600U @ 2.10GHz.

feza  May 8
That's no surprise, your CPU is better :) Regarding devectorization,

    for l in 1:q
        for k in 1:nz
            for j in 1:ny
                for i in 1:nx
                    u = ux[i,j,k]
                    v = uy[i,j,k]
                    w = uz[i,j,k]
                    cu = c[k,1]*u + c[k,2]*v + c[k,3]*w
                    u2 = u*u + v*v + w*w
                    feq[i,j,k,l] = weights[k]*ρ[i,j,k]*(1 + 3*cu + 9/2*cu*cu - 3/2*u2)
                    f[i,j,k,l] = f[i,j,k,l]*(1-ω) + ω*feq[i,j,k,l]
                end
            end
        end
    end

actually makes the code a lot slower....

Tim Holy  May 8
One of the really cool features of julia is that functions are allowed to have more than 0 arguments. It's even considered good style, and I highly recommend making use of this awesome feature in your code! :-)
In other words: try passing all variables as arguments to the functions. Even though you're wrapping everything in a function, performance-wise you're running up against an inference problem (https://github.com/JuliaLang/julia/issues/15276). In terms of coding style, you're still essentially using global variables. Honestly, these make your life harder in the end (http://c2.com/cgi/wiki?GlobalVariablesAreBad); it's not a bad thing that julia provides gentle encouragement to avoid using them, and you're losing out on opportunities by trying to sidestep that encouragement.
Best,
--Tim

feza  May 8
Thanks for the tip; initially I just translated the matlab verbatim. Now I have made all the changes: in-place operations and direct function calls. Despite these changes, matlab takes 3.6 seconds and the new Julia version 7.6 seconds. TBH the results of this experiment are frustrating; I was hoping Julia was going to provide a huge speedup, on the level of C. Am I still missing anything in the Julia code that is crucial to speed? @code_warntype looks ok, sans a few red Unions which I don't think are in my control.
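A minimal sketch of the kind of refactor Tim is describing; the function name, argument names and indexing below are illustrative and are not taken from the gist. Every array the kernel touches arrives as a function argument, so the compiler sees concrete types instead of non-constant bindings captured from an enclosing scope, and the lattice-velocity table is stored as Float64, which also addresses the Int/Float mix pointed out earlier in the thread.

    # Sketch only: names and indexing are illustrative, not the gist's actual code.
    # Every array used in the hot loop is passed in as an argument.
    function collide!(f, feq, ρ, ux, uy, uz, c::Matrix{Float64}, weights, ω)
        nx, ny, nz = size(ρ)
        q = size(f, 4)
        for l in 1:q, k in 1:nz, j in 1:ny, i in 1:nx
            u  = ux[i,j,k]; v = uy[i,j,k]; w = uz[i,j,k]
            cu = c[l,1]*u + c[l,2]*v + c[l,3]*w
            u2 = u*u + v*v + w*w
            feq[i,j,k,l] = weights[l]*ρ[i,j,k]*(1 + 3*cu + 4.5*cu*cu - 1.5*u2)
            f[i,j,k,l]   = f[i,j,k,l]*(1 - ω) + ω*feq[i,j,k,l]
        end
        return f
    end

    # In the driver, convert the Int velocity table once and pass everything in:
    # collide!(f, feq, ρ, ux, uy, uz, convert(Matrix{Float64}, c), weights, ω)

Passing the arrays explicitly like this sidesteps both the "soft global" style problem and the inference/boxing issue (#15276) linked above.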
feza  May 8
Milan: Script is here: https://gist.github.com/musmo/27436a340b41c01d51d557a655276783

STAR0SS  May 8
Try changing the order of your loops:

    for i in 1:nx, j in 1:ny, k in 1:nz

to

    @inbounds for k in 1:nz, j in 1:ny, i in 1:nx

(@inbounds disables bounds checking for arrays; it usually gives a small improvement.)

feza  May 8
Wow, thank you guys. I totally thought for i in 1:nx, j in 1:ny, k in 1:nz ran the i index first, and then j, and then k!!!!! This has been a great learning experience. Much appreciated; now the julia code is about twice as fast!
On Sunday, May 8, 2016 at 1:12:30 PM UTC-4, Tk wrote:
> Also try: julia -O --check-bounds=no yourcode.jl

David Gold  May 8
So the issue here was the indexing clashing up against the column-major storage of multi-dimensional arrays?
On Sunday, May 8, 2016 at 10:10:54 AM UTC-7, Tk wrote:
> Could you try replacing for i in 1:nx, j in 1:ny, k in 1:nz with for k in 1:nz, j in 1:ny, i in 1:nx, since your arrays are indexed like a[i,j,k]? Another question: how many cores is your Matlab code using?

feza  May 8
Well, the first problem was that the vectorized version of my code was very slow. Then I devectorized and it was still slow, because the index order clashed with the column-major storage. I wrongly assumed that for i = 1:10, j = 1:10, k = 1:10 runs the i index first, then j, then k...
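As a minimal illustration of the loop-order point (the function and array names below are made up, not from the script): Julia arrays are column-major, so a[i,j,k] and a[i+1,j,k] sit next to each other in memory and the first index belongs in the innermost loop. In the compact for k in 1:nz, j in 1:ny, i in 1:nx form, the leftmost range is the outermost loop.

    # Illustrative only: column-major storage means the first index should be
    # the innermost loop, so consecutive iterations touch adjacent memory.
    function colmajor_sum(A)
        s = zero(eltype(A))
        # The compact form is equivalent to nested loops with k outermost and
        # i innermost; @inbounds drops the bounds checks inside the loop.
        @inbounds for k in 1:size(A, 3), j in 1:size(A, 2), i in 1:size(A, 1)
            s += A[i, j, k]
        end
        return s
    end

    # colmajor_sum(rand(64, 64, 64))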
feza  May 8
With all that done, the julia code runs about the same as, if not better than, matlab using 4 threads.

Patrick Kofod Mogensen  May 8
Out of curiosity, what about v0.5?

feza  May 8
Roughly the same speed.

Patrick Kofod Mogensen  May 8
Same as v0.4, or same as before you changed the code?

feza  May 8
I mean the revised script runs just as fast, if not a tad faster, with the latest master as it does on 0.4.5 :)

Christian Peel  May 9
> The usual solution is to devectorize your code and to use loops, except for matrix multiplication if you have large matrices.
I am hopeful that ParallelAccelerator.jl [1][2] or similar projects can enable fast vectorized Julia code.
[1] https://github.com/IntelLabs/ParallelAccelerator.jl
[2] http://julialang.org/blog/2016/03/parallelaccelerator
-- chris...@ieee.org

Ford O.  May 9
I have checked the link and read the article. Am I right that ParallelAccelerator basically uses C code instead of Julia to do the computation? That would be kind of a shame, don't you think?

Yichao Yu  May 9
No, I don't think so. IIUC it uses C for the threading API; it even has a backend using the Julia threading API (and the Julia threading API is very incomplete and experimental). In general this is not so different from Julia generating LLVM IR, especially since LLVM has a C backend. Generating C is just usually not as efficient as generating LLVM IR, since you have parser overhead and it is much less flexible and expressive, unless, as in this case, the function API is in C.

Yichao Yu  May 9
Or in other words, it is at most a shame that LLVM IR doesn't have a threading construct (which, admittedly, is a very hard problem, but people are working on it).
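As a rough sketch of how ParallelAccelerator.jl presents itself in the README and blog post linked above (untested here, so treat the exact API as an assumption): you keep the vectorized, MATLAB-like style and annotate the function with @acc, letting the package compile and parallelize the array expressions instead of devectorizing by hand.

    # Sketch based on the ParallelAccelerator.jl docs linked above; untested, and
    # the package/API may have changed since 2016. scale_add is a made-up example.
    using ParallelAccelerator

    @acc function scale_add(a, x, y)
        # whole-array expression that @acc aims to fuse and run in parallel
        return a .* x .+ y
    end

    # x = rand(10^6); y = rand(10^6)
    # scale_add(2.0, x, y)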