Double free or corruption (out)
julia-users, 18 posts by 5 authors

Nils Gudat 9/26/15
Does anyone have an idea what this could be? This occurred in the middle of a minimization routine that has previously (with slightly different parameter values) converged successfully. This is on 0.4.0 (one of the last commits before rc1) on Ubuntu 14.04.3.
[Attachments: 2 inline images (37 KB, 137 KB)]

Tony Kelman 9/26/15
Please provide code that can reproduce the problem.

Nils Gudat 9/26/15
That's the problem I alluded to in my question: this happened in the middle of a very lengthy minimization problem, which had been running for a couple of hours. On a previous run, a very similar version of the code finished successfully after about 10 hours. I was hoping that someone could at least tell me what this error message is about; it seems to be Linux-related and I have no clue what's going on.

Bill Hart 9/26/15
The malloc/free functions are the ones that allocate and free blocks of memory. They are provided by the system (e.g. Linux). A double free or corruption likely means that free was called twice on the same block of memory, or that something was overwritten that shouldn't have been, e.g. by an array overrun or something similar. This might have happened deep within Julia itself or in some C-language library that your code calls.

Just an absolute guess based on the output you posted: some finalizer is trying to call a free or cleanup function on some data from a C-language library, but is passing invalid pointers to the C-language library... or there is a bug in the C-language library itself.

I'm sorry I don't know anything about the minimization you are speaking of. I'm not a numerical person. And I don't recognise any of the libraries mentioned in your stack trace other than libjulia.so. But does this information help in any way? Tracking such things down can be very difficult.
If you make a pile of much smaller examples, can you get the same thing to happen repeatedly with similar code?

Bill.

On Saturday, 26 September 2015 19:07:43 UTC+2, Nils Gudat wrote:
> That's the problem I alluded to in my question: this happened in the middle of a very lengthy minimization problem, which had been running for a couple of hours. On a previous run, a very similar version of the code finished successfully after about 10 hours. I was hoping that someone could at least tell me what this error message is about; it seems to be Linux-related and I have no clue what's going on.

Yichao Yu 9/26/15
The error message means that something corrupted the memory. The most likely cause I've seen is an incorrectly used ccall (or other unsafe memory stores). What packages are you using? Do you at least have a list of the ones that use ccall?

Nils Gudat 9/26/15
The minimization itself is NLopt. The problem is to solve an economic model, which takes around 2 minutes to solve on 16 cores, and compare its output (a 100x4 Float64 Array) to some data moments. The model results depend on two parameters. The model itself is mostly minimization (via Optim) and numerical integration (using FastGaussQuadrature), and is parallelized via SharedArrays. Since you asked for a list of packages, I'm also using ApproXD (for linear interpolation) and Distributions (to draw from a bivariate Normal).

Yichao Yu 9/26/15
Looks like there's at least one segfault in NLopt (AppVeyor Nightly Win32) and I can reproduce locally with aggressive GC. Will investigate.

Yichao Yu 9/26/15
> Looks like there's at least one segfault in NLopt (AppVeyor Nightly Win32) and I can reproduce locally with aggressive GC. Will investigate.
Fixed in https://github.com/JuliaLang/julia/pull/13325. I have no idea if it is the same segfault/corruption you are seeing, or the one on AppVeyor, though.

Nils Gudat 9/29/15
Thanks for that. I've updated my version to the latest 0.5 master, but am now getting this segfault, which looks like it's still connected to garbage collection:
[Attachment: 1 inline image (140 KB)]

Yichao Yu 9/29/15
On Tue, Sep 29, 2015 at 4:25 AM, Nils Gudat (nils....@gmail.com) wrote:
> Thanks for that. I've updated my version to the latest 0.5 master, but am now getting this segfault, which looks like it's still connected to garbage collection:

Not necessarily. The GC is just one of the pieces of code most vulnerable to memory corruption. The backtrace itself is basically useless. Running the code with a few debug options may help, but the way to debug this is not that easy to describe. I'm planning to update the GC debug doc but haven't gotten to it yet.

If the issue is reproducible enough (total time to reproduce less than 1 week), it would be helpful to post your full code. If you don't want to make it public, please feel free to send a private email.

Nils Gudat 9/29/15
The code usually segfaults after 2-5 hours, and is available at http://github.com/nilshg/LearningModels. However, I haven't written it up in a way that is easy to run right now (it depends on some data not included in the repo), so I'll have to restructure it a bit before you can run it. I'll try to do so today if I find the time.

Nils Gudat May 31
Resurrecting this very old thread - after I had been able to solve the model with no segfaults over the last couple of months, they have now returned and occur much faster (usually within 2 hours of running the code).
I have refactored the code a little so that it will hopefully be possible for others to run it. After cloning the entire repo at http://github.com/nilshg/LearningModels, it should run once the path in https://github.com/nilshg/LearningModels/blob/master/NHL/NHL_maximize.jl is changed to wherever the repo has been cloned. I'm running this code on a 16-core Ubuntu 14.04 machine with Julia 0.4.5 installed and all packages on the latest tagged versions.

On Tuesday, September 29, 2015 at 1:43:31 PM UTC+1, Nils Gudat wrote:
> The code usually segfaults after 2-5 hours, and is available at http://github.com/nilshg/LearningModels. However, I haven't written it up in a way that is easy to run right now (it depends on some data not included in the repo), so I'll have to restructure it a bit before you can run it. I'll try to do so today if I find the time.

Bill Hart May 31
We are also suddenly getting crashes with 0.4.5 when running our Nemo test suite. It says that some memory allocation is failing due to an invalid next size. I suspect there is a bug that wasn't there until the last few days, since we were passing just fine on Travis. Though at this stage, I haven't checked whether we are still passing on Travis.

Bill.

Bill Hart Jun 1
I've checked that the problem we were having doesn't happen with Julia 0.4.5 on Travis. In fact, it also doesn't happen on another one of our systems with Julia 0.4.5, so at this stage we have no idea what the problem is. It may be totally unrelated to the problem you are having.

Bill.

Nils Gudat Jun 2
Fair enough. Does anyone have any clues as to how I would go about investigating this? As has been said before, the stack traces aren't very helpful for segfaults, so how do I figure out what's going wrong here?
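
Editorial illustration: the two causes Bill Hart described earlier (free called twice on the same block, or an overrun overwriting something it shouldn't) can be mimicked with a toy allocator. This is only a sketch, written in Python because the thread contains no code of its own. `ToyHeap`, its one-byte header and the marker values are hypothetical and far simpler than what glibc really does. It may help show why, as Yichao Yu noted, the crash tends to surface in whatever code touches the heap next (such as the GC) rather than in the code that did the damage.

```python
class ToyHeap:
    """Toy bump allocator with a one-byte header in front of every block.

    Illustrative only: real allocators such as glibc's keep richer metadata,
    but the failure modes have the same shape.
    """

    LIVE = 0xA1   # hypothetical marker: block is currently allocated
    FREED = 0xF3  # hypothetical marker: block has already been freed

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.top = 0  # index of the next unused byte

    def malloc(self, n):
        if self.top + 1 + n > len(self.mem):
            raise MemoryError("toy heap exhausted")
        self.mem[self.top] = self.LIVE   # header byte
        addr = self.top + 1              # payload starts after the header
        self.top = addr + n
        return addr

    def write(self, addr, data):
        # Like a raw pointer store in C, this does no bounds checking.
        self.mem[addr:addr + len(data)] = data

    def free(self, addr):
        header = self.mem[addr - 1]
        if header == self.FREED:
            raise RuntimeError("double free")
        if header != self.LIVE:
            raise RuntimeError("corruption")
        self.mem[addr - 1] = self.FREED


def demo():
    heap = ToyHeap()
    errors = []

    a = heap.malloc(4)
    heap.free(a)
    try:
        heap.free(a)             # same block freed twice
    except RuntimeError as err:
        errors.append(str(err))

    b = heap.malloc(4)
    c = heap.malloc(4)
    heap.write(b, bytes(5))      # 5 bytes into a 4-byte block: overrun
    try:
        heap.free(c)             # the damage is only noticed here
    except RuntimeError as err:
        errors.append(str(err))

    return errors
```

In `demo()`, the faulty write targets block `b`, but the error is raised while freeing the unrelated block `c`. That is the same reason the stack traces in this thread point at the allocator or the GC instead of at the real culprit.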
Andrew Jun 2
Have you tried running the code without using parallel? I have been getting similar errors in my economics code. It segfaults sometimes, though not always, after a seemingly random amount of time: sometimes an hour or so, sometimes less. However, I don't recall it ever having occurred when I've run it without parallel. I'm using SharedArrays like you. I've seen this occur on both 0.4.1 and 0.4.5.

The error isn't too serious for me because I periodically save the optimization state to disk, so I can just restart.

I also can't remember this ever occurring on my own Linux computer. It's happened on a Linux cluster with many cores.

On Thursday, June 2, 2016 at 3:45:24 AM UTC-4, Nils Gudat wrote:
> Fair enough. Does anyone have any clues as to how I would go about investigating this? As has been said before, the stack traces aren't very helpful for segfaults, so how do I figure out what's going wrong here?

Nils Gudat Jun 2
Hm, interesting observation... I suppose the issue in my case is that the code as it is takes about 3-4 days to complete, so running it on 1 instead of 15 cores means I'm unlikely to ever get my PhD! I will at least try to run a shorter version without parallel that might be solvable in a day or two.

Nils Gudat Jun 12
So it looks like I'm having the same issue - I have been running the code without parallelization (defining my SharedArrays as regular ones), and it has now been going for about 3 days without any segfaults. Is this a known issue? If so, do we know whether there's a Julia version one can revert to in which SharedArrays work?
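
Editorial illustration: Andrew's workaround above (periodically saving the optimization state to disk so that a crashed run can simply be restarted) might look roughly like the following. This is a sketch in Python for illustration only, since the thread shows no code. `save_checkpoint`, `load_checkpoint`, `minimize`, the JSON file format and the stand-in gradient-descent loop are all assumptions, and none of this is the NLopt or Optim API.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically, so a crash mid-write leaves the old file intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename within one filesystem


def load_checkpoint(path, default):
    """Return the saved state, or `default` if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return json.load(f)


def gradient(objective, x, h=1e-6):
    """Central finite-difference gradient (stand-in for a real solver's internals)."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((objective(up) - objective(down)) / (2 * h))
    return grad


def minimize(objective, x0, path, steps=100, lr=0.1, every=10):
    """Hypothetical optimizer loop that resumes from `path` if a checkpoint exists."""
    state = load_checkpoint(path, {"iter": 0, "x": list(x0)})
    while state["iter"] < steps:
        g = gradient(objective, state["x"])
        state["x"] = [xi - lr * gi for xi, gi in zip(state["x"], g)]
        state["iter"] += 1
        if state["iter"] % every == 0:
            save_checkpoint(path, state)  # periodic save, as Andrew describes
    save_checkpoint(path, state)
    return state
```

If the process dies, calling `minimize` again with the same `path` picks up from the last saved iterate instead of from `x0`, so at most `every` iterations of work are lost. Writing to a temporary file and renaming it is a design choice meant to avoid a half-written checkpoint if the crash happens during the save itself.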