why's my julia code running slower than matlab, despite performance tips
26 posts by 9 authors
feza
May 8
I have read the performance section and believe I have followed all the suggested guidelines.
The same matlab script takes less than 3 seconds, while julia 0.4.5 takes 9.7 seconds (julia 0.5 is even worse...)
feza
May 8
https://gist.github.com/musmo/27436a340b41c01d51d557a655276783
michae...@gmail.com
May 8
I see that c is a constant array of Ints, and its elements multiply ux, uy and uz in a loop, where ux, uy and uz are arrays of floats, so there's a type stability problem.
feza
May 8
Good catch, although this still doesn't explain away the difference.
@code_warntype shows me feq, f, ρ, ux, uy, uz are red for some reason even though I have explicitly stated their types...
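One way to check for this kind of inference failure without reading the full @code_warntype output is `@inferred` from the Test standard library. The sketch below is only an illustration under stated assumptions: it uses a non-const global (a simpler, related situation, not the closure issue from the gist), the names `factor`, `unstable` and `stable` are hypothetical, and the syntax is current Julia (1.x) rather than 0.4.

```julia
using Test  # standard library; provides @inferred

factor = 2.0                 # non-const global (hypothetical): its type may change later
unstable(x) = factor * x     # reads the global, so the return type cannot be inferred
stable(x, s) = s * x         # same arithmetic, with the factor passed as an argument

# Passes silently: the compiler can infer the return type of `stable`.
@inferred stable(3.0, 2.0)

# `@inferred unstable(3.0)` would throw instead, because the inferred
# return type does not match the concrete type of the actual result.
```

Both functions return the same value; only the second lets the compiler know the type of everything it touches.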
STAR0SS
May 8
You are using a lot of vectorized operations and Julia isn't as good as matlab is with those.
The usual solution is to devectorize your code and use loops (except for matrix multiplication if you have large matrices).
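A minimal sketch of what devectorizing can look like, assuming three equally sized Float64 arrays named as in the thread (ux, uy, uz). The function names `speed2_vectorized` and `speed2_loop!` are hypothetical, and the syntax is current Julia (1.x) rather than 0.4.

```julia
# Vectorized form: each elementwise product is materialized as a temporary
# array before the sum is formed.
speed2_vectorized(ux, uy, uz) = ux .* ux + uy .* uy + uz .* uz

# Devectorized form: a single pass that writes into a preallocated `out`
# and creates no temporary arrays.
function speed2_loop!(out, ux, uy, uz)
    # eachindex with several arrays also checks that their shapes agree
    for i in eachindex(out, ux, uy, uz)
        out[i] = ux[i]*ux[i] + uy[i]*uy[i] + uz[i]*uz[i]
    end
    return out
end

ux, uy, uz = rand(4, 5, 6), rand(4, 5, 6), rand(4, 5, 6)
out = similar(ux)
speed2_loop!(out, ux, uy, uz)
```

The loop version only pays off when it lives inside a function and `out` is reused across iterations.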
Patrick Kofod Mogensen
May 8
As for the v0.5 performance (which is horrible), I think it's the boxing issue with closure https://github.com/JuliaLang/julia/issues/15276 . Right?
Patrick Kofod Mogensen
May 8
For what it's worth, it runs in about 3-4 seconds on my computer on the latest v0.4.
CPU : Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
feza
May 8
That's no surprise, your CPU is better :)
Regarding devectorization:
    for l in 1:q
        for k in 1:nz
            for j in 1:ny
                for i in 1:nx
                    u = ux[i,j,k]
                    v = uy[i,j,k]
                    w = uz[i,j,k]
                    cu = c[k,1]*u + c[k,2]*v + c[k,3]*w
                    u2 = u*u + v*v + w*w
                    feq[i,j,k,l] = weights[k]*ρ[i,j,k]*(1 + 3*cu + 9/2*(cu*cu) - 3/2*u2)
                    f[i,j,k,l] = f[i,j,k,l]*(1-ω) + ω*feq[i,j,k,l]
                end
            end
        end
    end
This actually makes the code a lot slower....
Tim Holy
May 8
Re: [julia-users] Re: why's my julia code running slower than matlab, despite performance tips
One of the really cool features of julia is that functions are allowed to have
more than 0 arguments. It's even considered good style, and I highly recommend
making use of this awesome feature in your code! :-)
In other words: try passing all variables as arguments to the functions. Even
though you're wrapping everything in a function, performance-wise you're
running up against an inference problem
(https://github.com/JuliaLang/julia/issues/15276). In terms of coding style,
you're still essentially using global variables. Honestly, these make your
life harder in the end (http://c2.com/cgi/wiki?GlobalVariablesAreBad)---it's
not a bad thing that julia provides gentle encouragement to avoid using them,
and you're losing out on opportunities by trying to sidestep that
encouragement.
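As a rough sketch of the suggestion above, the collision loop posted earlier in the thread could be written as a function that receives everything it needs as arguments. This is not the author's revised script. The function name `collide!` is hypothetical, the syntax is current Julia (1.x), and the sketch assumes that `c` is a q×3 table and `weights` a length-q vector, both indexed by the direction index `l` (the posted snippet indexes them with `k`).

```julia
# Every input arrives as an argument, so the compiler sees concrete types.
# Assumption: `c` is q×3 and `weights` has length q, indexed by direction l.
function collide!(f, feq, ρ, ux, uy, uz, c, weights, ω)
    nx, ny, nz, q = size(f)
    # The last range is the innermost loop, so i (the first index) varies fastest.
    for l in 1:q, k in 1:nz, j in 1:ny, i in 1:nx
        u, v, w = ux[i,j,k], uy[i,j,k], uz[i,j,k]
        cu = c[l,1]*u + c[l,2]*v + c[l,3]*w
        u2 = u*u + v*v + w*w
        feq[i,j,k,l] = weights[l]*ρ[i,j,k]*(1 + 3*cu + 9/2*(cu*cu) - 3/2*u2)
        f[i,j,k,l] = f[i,j,k,l]*(1 - ω) + ω*feq[i,j,k,l]
    end
    return f
end

# Tiny made-up problem: fluid at rest, two directions, unit density.
nx, ny, nz, q = 3, 4, 5, 2
c = [1.0 0.0 0.0; 0.0 1.0 0.0]
weights = [0.5, 0.5]
ρ = ones(nx, ny, nz)
ux, uy, uz = zeros(nx, ny, nz), zeros(nx, ny, nz), zeros(nx, ny, nz)
f, feq = ones(nx, ny, nz, q), zeros(nx, ny, nz, q)
collide!(f, feq, ρ, ux, uy, uz, c, weights, 0.5)
```

Bounds checks are left on here; adding @inbounds would only be safe once all array sizes are known to agree.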
Best,
--Tim
feza
May 8
Thanks for the tip (initially I just translated the matlab verbatim).
Now I have made all the changes: in-place operations and direct function calls.
Despite these changes, matlab takes 3.6 seconds and the new julia code 7.6 seconds.
TBH the results of this experiment are frustrating. I was hoping julia was going to provide a huge speedup (on the level of C).
Am I still missing anything in the julia code that is crucial to speed?
@code_warntype looks ok, sans a few red unions which I don't think are in my control.
feza
May 8
Milan
Script is here: https://gist.github.com/musmo/27436a340b41c01d51d557a655276783
STAR0SS
May 8
Try changing the order of your loops:
    for i in 1:nx, j in 1:ny, k in 1:nz
->
    @inbounds for k in 1:nz, j in 1:ny, i in 1:nx
(@inbounds disables bounds checking for arrays; it usually gives a small improvement.)
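A small sketch of why the order matters, using made-up array sizes and a hypothetical function name `sum_colmajor`; the syntax is current Julia (1.x). In a multi-range `for`, the last range is the innermost loop, and Julia stores arrays column-major, so the first index should be the one that varies fastest.

```julia
# In a multi-range `for`, the LAST range is the innermost (fastest) loop.
order = Tuple{Int,Int}[]
for i in 1:2, j in 1:3
    push!(order, (i, j))
end
# `order` begins (1,1), (1,2), (1,3): j varies fastest, not i.

# Julia arrays are column-major: stepping the FIRST index moves to the
# neighbouring element in memory.
a = reshape(collect(1:24), 2, 3, 4)   # a[2,1,1] is stored right after a[1,1,1]

# Cache-friendly traversal: the first index sits in the innermost loop.
function sum_colmajor(a)
    s = zero(eltype(a))
    for k in axes(a, 3), j in axes(a, 2), i in axes(a, 1)
        s += a[i, j, k]
    end
    return s
end

total = sum_colmajor(a)
```

Swapping the ranges so that `i` comes first gives the same result but walks memory with large strides, which is typically slower on big arrays.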
feza
May 8
Wow thank you guys
I totally thought
for i in 1:nx, j in 1:ny, k in 1:nz
ran the i index first and then j and then k !!!!!
This has been a great learning experience.
Much appreciated, now the julia code is about twice as fast!
On Sunday, May 8, 2016 at 1:12:30 PM UTC-4, Tk wrote:
Also try:
julia -O --check-bounds=no yourcode.jl
David Gold
May 8
So, the issue here was the indexing clashing up against the column-major storage of multi-dimensional arrays?
On Sunday, May 8, 2016 at 10:10:54 AM UTC-7, Tk wrote:
Could you try replacing
    for i in 1:nx, j in 1:ny, k in 1:nz
with
    for k in 1:nz, j in 1:ny, i in 1:nx
because your arrays are indexed like a[i,j,k]?
Another question is, how many cores is your Matlab code using?
feza
May 8
Well, the first problem was that the vectorized version of my code was very slow.
Then I devectorized, and it was still slow because the loop order clashed with the column-major storage.
I wrongly assumed that for i=1:10, j=1:10, k=1:10 varies the index i fastest, then j, then k...
feza
May 8
With all that done, the julia code runs about the same if not better than matlab (using 4 threads)
Patrick Kofod Mogensen
May 8
out of curiosity, what about v0.5?
feza
May 8
roughly the same speed.
Patrick Kofod Mogensen
May 8
Same as v0.4, or same as before you changed the code?
feza
May 8
I mean the revised script runs just as fast if not a tad faster with the latest master as it does on 0.4.5 : )
Christian Peel
May 9
> The usual solution is to devectorized your code and to use loops (except for matrix multiplication if you have large matrices).
I am hopeful that ParallelAccelerator.jl [1][2] or similar projects can enable fast vectorized Julia code
[1] https://github.com/IntelLabs/ParallelAccelerator.jl
[2] http://julialang.org/blog/2016/03/parallelaccelerator
--
chris...@ieee.org
Ford O.
May 9
I have checked the link and read the article. Am I right that ParallelAccelerator basically uses C code instead of julia to do the computation? That would be kind of a shame, don't you think?
Yichao Yu
May 9
On Mon, May 9, 2016 at 1:15 AM, Ford Ox wrote:
> I have checked the link and read the article. Am I right that the parallel
> accelerator basically uses C code instead of julia to do the computation?
> That would be kinda shame dont you think?
No, I don't think so.
IIUC it uses C for the threading API; it even has a backend using the
julia threading API. (And the julia threading API is very incomplete
and experimental.)
And in general this is not so different from julia generating LLVM IR
(especially since LLVM has a C backend). Generating C is just usually
not as efficient as generating LLVM IR, since you'll have parser
overhead and it is much less flexible and expressive, unless, as in this
case, the function/API is in C.
Yichao Yu
May 9
In other words, it is at most a shame that LLVM IR does not have a
threading construct (which, admittedly, is a very hard problem, but
people are working on it).