Future of RemoteRefs #12174

Closed. amitmurthy opened this issue on Jul 16, 2015 · 7 comments
Labels: decision, needs docs, parallel
Participants: amitmurthy, eschnett, malmaud, JeffBezanson, josefsachsconning, ViralBShah

amitmurthy (The Julia Language member) commented on Jul 16, 2015

This issue is to generate a consensus on the future of RemoteRefs.

IMO, the main issue we have with RemoteRefs is distributed gc (DGC for short). DGC relies on add messages sent to the process holding the data pointed to by the ref whenever a ref is sent to another process. delete messages are sent when the refs are collected locally. This is done from a finalizer on the ref. The problem is that while the refs themselves are small, and hence may not be collected quickly enough, the data pointed to by them may be large, resulting in a delay in collection which is perceived as a leak.

Listing 3 options discussed so far:

1. Keep RemoteRefs as is. Improve documentation, mention pitfalls and move folks away to the manually managed ChannelRefs, which are references to remote channels (#12042).
2. Deprecate RemoteRefs, force folks to use manually managed Channels / ChannelRefs.
3. carnaval suggested the following:
   - Limit RemoteRefs to immutables only.
   - add messages are sent when a ref is serialized to another process, which increments a ref count on the owner.
   - put! / take! are not allowed.
   - Every fetch decrements the count. The fetched value is to be cached locally.
   - The object is freed automatically when the ref count reaches zero.
   - We will need a manual close on the ref to indicate that the creator of the ref is no longer interested in the value. The remote value will persist till all folks who have been sent the ref have fetched the same.
   - This is a breaking change.

I would like folks to add to this discussion by listing:

- any other issues they see with RemoteRefs as currently designed
- other possible replacements for this functionality

My preference is for 1.
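The reference-counting scheme in option 3 can be modelled in a few lines. The sketch below is not Julia's implementation; it is a single-process Python model in which the names Owner, Ref, send and release are hypothetical. It assumes the creator holds one reference from the start and that only the first fetch on a given ref decrements the count, with later fetches served from the local cache.

```python
class Owner:
    """Models the process that stores values and their reference counts."""

    def __init__(self):
        self.values = {}  # ref id -> stored value
        self.counts = {}  # ref id -> outstanding reference count

    def create(self, ref_id, value):
        self.values[ref_id] = value
        self.counts[ref_id] = 1  # assumption: the creator holds one reference

    def add(self, ref_id):
        # "add" message: the ref was serialized to another process.
        self.counts[ref_id] += 1

    def release(self, ref_id):
        # Sent on fetch or close; the value is freed when the count hits zero.
        self.counts[ref_id] -= 1
        if self.counts[ref_id] == 0:
            del self.values[ref_id]
            del self.counts[ref_id]

    def read(self, ref_id):
        return self.values[ref_id]


class Ref:
    """Models one process's copy of a reference to an immutable value."""

    def __init__(self, owner, ref_id):
        self.owner = owner
        self.ref_id = ref_id
        self._cached = False
        self._value = None
        self._released = False

    def send(self):
        # Model serializing the ref to another process.
        # Sending after this ref was released is not handled in this sketch.
        self.owner.add(self.ref_id)
        return Ref(self.owner, self.ref_id)

    def fetch(self):
        if not self._cached:
            self._value = self.owner.read(self.ref_id)
            self._cached = True
            self._release()  # the first fetch decrements the owner's count
        return self._value

    def close(self):
        # The holder is no longer interested in the value.
        self._release()

    def _release(self):
        if not self._released:
            self._released = True
            self.owner.release(self.ref_id)


owner = Owner()
owner.create("r1", [1, 2, 3])
creator = Ref(owner, "r1")
a = creator.send()
b = creator.send()
creator.close()      # value persists: a and b have not fetched yet
first = a.fetch()
second = a.fetch()   # served from a's local cache
still_alive = "r1" in owner.values
b.fetch()            # last outstanding reference: the owner frees the value
```

In this model the value outlives the creator's close and is freed only after every recipient has fetched it, which is the behaviour the last bullet of option 3 describes.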
amitmurthy added the decision and parallel labels on Jul 16, 2015

eschnett commented on Jul 16, 2015

My preference is also for 1.

A perceived memory leak can also be handled by giving better tools to users. For example, a special debug mode where each object knows where the references live that keep it alive may help, since it allows users to track down where these references are hiding. Another option (maybe less expensive) would be a way to dump all references from all processes, maybe filtered and sorted in some way.

malmaud commented on Jul 16, 2015

My concern with 1 is that if @spawn continues to return RemoteRefs, then it will remain dominant over ChannelRef no matter what the docs say. That pushes me to support 3.

malmaud commented on Jul 16, 2015

Although RemoteRef could still have a finalizer that decrements the remote reference count in the event that fetch was never called on it.

JeffBezanson (The Julia Language member) commented on Jul 16, 2015

> where each object knows where the references live that keep it alive

The DGC uses reference lists, so we kind of have this information already.

I like option 3, but we don't necessarily need every aspect of it listed here. We can use the assignable cell model, where a RemoteRef can be an immutable pointer to a mutable Channel. I dislike the non-orthogonality of having both RemoteRef and ChannelRef, and having each RemoteRef require a Channel. For the simplest case where we just want an async reference to a single RPC result, we should have an efficient immutable RemoteRef. Adding put! and take! to RemoteRef was a mistake (my mistake!).

A good thing to think about is how to implement DArrays. RemoteRefs are sufficient for this, but arguably too flexible. For example, if we make a copy of a DArray, we need a whole bunch of DGC messages, even though we know exactly which processors are involved already.
Also, it's hard to imagine a use case for sending a DArray to an unrelated processor. For example, say we have 4 nodes, and node 1 creates a DArray that's stored on nodes 2 and 3. It doesn't seem necessary to allow node 1 to send the DArray to node 4.

amitmurthy (The Julia Language member) commented on Jul 17, 2015

How about this:

- RemoteRef becomes RemoteRef{T, SystemManaged}, where T is the type and SystemManaged is a boolean which implies that its lifetime is managed by Julia via DGC. Since these are typically the results of @spawn type of calls, the data pointed to by the ref is immutable and hence can be cached locally.
- A Future is a typealias for RemoteRef{Any, true}. Everyone understands the term future / promise, and that is what @spawn / remotecall return.
- put! / take! are not allowed on Futures. Only isready, wait and fetch are allowed.
- Thus, all RemoteRefs returned by the RPC API always point to immutables and are managed by a ref-counting DGC.
- User created refs always point to channels and must be closed explicitly.
- Channels are for inter-task communication. They are always created via channel(pid::Int=myid(), T::Type=Any, sz::Int=1).
- channel always returns a RemoteRef{T, false}.
- They need to be explicitly freed via a close(rr::RemoteRef).
- The only new export is channel.

As for DArrays, the implementation needs to change to manually manage references. When user code is something like:

    foo(D)   on all procs(D), where D is a DArray
    local processing
    bar(D)   on all procs(D)
    local processing
    baz(D)   on all procs(D)
    ......

each of those calls (foo, bar and baz) currently generates a bunch of DGC messages, too many in fact, especially when a DArray is distributed over 100s of workers. Also, stencil type of operations require all participating nodes to be aware of the distribution and to fetch from neighboring cells. We should change it to manually manage the refs and call an explicit close(D::DArray) when done.
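The split proposed above, between system-managed Futures and user-managed channel refs, can be illustrated with a small model. This is a sketch in Python rather than the proposed Julia API: the classes Future and ChannelRef, the thread-based remotecall stand-in, and the method names put and take (standing in for put! and take!) are all hypothetical, and no remote processes or DGC are involved.

```python
import queue
import threading


class Future:
    """Write-once result of an RPC-style call: only isready, wait and fetch."""

    def __init__(self):
        self._done = threading.Event()
        self._value = None

    def _set(self, value):
        # Called once by the code that produced the result, never by users.
        if self._done.is_set():
            raise RuntimeError("a Future can be set only once")
        self._value = value
        self._done.set()

    def isready(self):
        return self._done.is_set()

    def wait(self, timeout=None):
        return self._done.wait(timeout)

    def fetch(self):
        # Blocks until the value is available; repeated fetches reuse it.
        self._done.wait()
        return self._value

    def put(self, value):
        raise TypeError("put! is not allowed on a Future")

    def take(self):
        raise TypeError("take! is not allowed on a Future")


class ChannelRef:
    """User-created reference to a channel; must be closed explicitly."""

    def __init__(self, sz=1):
        self._q = queue.Queue(maxsize=sz)
        self._closed = False

    def put(self, value):
        self._check_open()
        self._q.put(value)

    def take(self):
        self._check_open()
        return self._q.get()

    def close(self):
        self._closed = True

    def _check_open(self):
        if self._closed:
            raise ValueError("channel is closed")


def remotecall(f, *args):
    # Stand-in for an RPC: runs f on a local thread and returns a Future.
    fut = Future()
    threading.Thread(target=lambda: fut._set(f(*args))).start()
    return fut


fut = remotecall(pow, 2, 10)
result = fut.fetch()

ch = ChannelRef(sz=1)
ch.put("msg")
received = ch.take()
ch.close()  # explicit, user-managed lifetime
```

Because a Future here is write-once and read-only for users, its value can be cached safely by every holder, while a ChannelRef stays mutable and therefore needs the explicit close.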
This was referenced on Jul 19, 2015:
- Closed: Added Channels #12042
- Merged: Remove use of RemoteRefs JuliaParallel/DistributedArrays.jl#44

amitmurthy (The Julia Language member) commented on Nov 19, 2015

Closed by #13923

amitmurthy closed this on Nov 19, 2015

josefsachsconning commented on Dec 9, 2015

Will there not be a deprecation warning for the disappearance of RemoteRef? How about inclusion in Compat.jl?

josefsachsconning referenced this issue on Dec 9, 2015:
- Merged: Updating RemoteRefs to Futures and RemoteChannels #13923

ViralBShah added the needs-docs label on Dec 22, 2015

oxinabox referenced this issue in JuliaParallel/Blocks.jl on Apr 21:
- Open: Broken in 0.5 as of RemoteRef Changes #19