Raw Memory Management patterns in Nim language

Overview:

Terminology:

    1. GC:   Garbage Collector
    2. ARC:  Automatic Reference Counting
    3. MM:   Memory management
    4. RC:   Reference Count

Nim is a high-level language with a robust and flexible memory-management (GC/ARC/ORC) runtime. The default GC (aka refc) settings work well for most cases and free the user from thinking about memory and resource management, as long as the user sticks with pure Nim data structures. As with other managed-memory languages, it generally takes relatively little time to get from idea to production. But unlike many other languages, Nim's memory management is quite flexible: users can even leave out the memory-management runtime entirely with --gc:none, for embedded-development use cases. In such cases the user is responsible for managing resources, but even then the Nim compiler can lend a helping hand, as we discuss below.

Besides writing a lot of code in pure Nim, I also use Nim extensively for wrapping existing C codebases, which generally involves a lot of pointer/direct memory access code. Since Nim itself compiles to C, it has a first-class relationship with C, and hence tools like c2nim work out of the box for generating Nim bindings for existing C code. But I prefer to wrap C code by hand, as it lets me learn the architecture of the corresponding codebase/library, which is indispensable for bigger codebases like ffmpeg and generally leads to some optimization opportunities when creating a higher-level API.

I never knew much about garbage-collection systems, or how I could use GC-specific routines to manage resources allocated in another runtime, such as a DLL. It was only after reading the ARC announcement post for Nim that I started reading more about ARC and the hooks that override the default object-management behaviour.

Some patterns that I have found useful for managing direct memory access are presented below.

The first pattern applies when we have a shared library with a C API. Many such libraries expose allocation and de-allocation routines for their resources, and the user is expected to call the matching de-allocation routine once done. It is quite common for me to forget to release some resource, as the API would be new to me. I can use defer statements to make this easier, but if not done properly this can still lead to memory leaks.
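As a rough illustration of the manual approach, here is a minimal sketch that pairs an allocation with a defer statement. acquireBuffer and releaseBuffer are hypothetical stand-ins for a library's allocate/release routines (here they just wrap alloc0/dealloc), and the liveBuffers counter exists only to make the effect observable.

```nim
# Hypothetical stand-ins for a C library's allocate/release pair.
var liveBuffers = 0            # bookkeeping for the demonstration only

proc acquireBuffer(size: int): pointer =
  inc liveBuffers
  result = alloc0(size)

proc releaseBuffer(p: pointer) =
  dealloc(p)
  dec liveBuffers

proc work(): int =
  let buf = acquireBuffer(32)
  defer: releaseBuffer(buf)    # runs when work() exits, also on exceptions
  result = liveBuffers         # the buffer is still alive at this point

let during = work()
echo during, " ", liveBuffers
```

The release only happens because the defer line was written by hand; leaving it out compiles just as well and leaks the buffer, which is the weakness the hooks below address.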

With some basic routines like =destroy and =copy we can instead use Nim compiler to make this process automatic and have increased memory-safety.

=destroy

Nim allows you to define a =destroy routine for any custom Object, which would be called as soon as that particular instance of object goes out of scope.

Note: this is a part of Nim compiler and works even when there is no Memory-management runtime included like with --gc:none.

#example_1

#an example object whose data field contains a pointer to some data. Note: gc/nim does not know what ``data`` field is pointing to
type
    Resource = object
        data:ptr uint8      #storing an address, default is nil.
        field_1:int
        field_2:float
        ...

proc `=destroy`(obj: var Resource)=
    if not isNil(obj.data):    
        echo "I am destroyer"             #should not use `echo` when gc=none, just for example.
        dealloc(obj.data)                 #release resource/memory

var a:Resource
a.data = cast[ptr uint8](alloc0(32))  #allocating 32 bytes. Note: we would be responsible for memory allocated using alloc0, GC have no knowledge about this memory.

#here a goes out of scope, so destructor for ``a`` should be called before returning

nim c -r --gc:none example_1.nim
#output
I am destroyer

So, the corresponding =destroy is being called for Resource instance a before returning, which is what we wanted. But, even though above code looks OK, it would not work as soon as we update the code as shown below.

var a:Resource
a.data = cast[ptr uint8](alloc0(32))  #allocating 32 bytes. Note: we would be responsible for memory allocated using alloc0, GC have no knowledge about this memory.

var b = a
echo a                    #have to add this statement, otherwise compiler would optimize away by moving a resources to b and resetting a to default state, which beats the purpose of understanding. 

#b goes out of scope   #destroy(b)
#a goes out of scope   #destroy(a)
# output

[data = ptr 00000207e6780030 --> 0,
field_1 = 0,
field_2 = 0.0]

ptr 00000207e6780030 --> 0

ptr 00000207e6780030 --> 0

As the output shows, =destroy (and hence de-allocation) is called twice for the same memory location. It probably wouldn't crash with the default Nim allocator, but it shouldn't happen in the first place[1]. The problem is that, for now, we have no way to tell the compiler not to call =destroy for both a and b when both point to the same resource. We can instead prevent any attempt to copy the data field[2], so that only one instance has access to the raw pointer to a resource at any point in time.

=copy

Nim also provides a way to override default assignment operations, we can define a =copy routine for our custom object, and that would be called whenever we try to assign one variable to another variable of that object type.

proc `=copy`(a: var Resource, b:Resource){.error.}  #here we prevent any copying operation for Resource type.

nim c -r --gc:none ./example_1.nim
#output

`=copy' is not available for type <Resource>; requires a copy because it's not the last read of 'a';

Now the compiler prevents us from copying/assigning a Resource object to another variable, so no more than one variable holds the raw-pointer value at any point in time. Even though this pattern forbids copying and may feel rigid, I have used it a lot by initializing such objects once and only reading from them afterwards, thus avoiding =copy operations by design. The complete pattern in code form looks like:

Pattern

#Object represents corresponding C struct.
type
    Resource = object          
        data:pointer           #pointing to some resource allocated by some shared-lib/C-code.
        field_1:int
        field_2:float

proc `=copy`(a:var Resource, b:Resource){.error.}  #here we prevent any copying operation for Resource type.
proc `=destroy`(obj: var Resource)=
    if not isNil(obj.data):
        dealloc_some_resource(obj.data)        # a method provided by a shared-lib/C-code to release resources.

var a:Resource
allocate_some_resource(addr(a))   # generally `a.data` field would be updated in-place with a pointer by C-code/shared-lib.


# when a will go out of scope, ``=destroy(a)`` would be called.

In essence: take any existing C API exposed by a shared library, create the corresponding Nim bindings by hand or using c2nim, provide =destroy and =copy routines, add your own Nim code to extend functionality, and finally use it with existing Nim code in the form of a package or library. If the code compiles without any copy errors, resources are released automatically based on the scope of variables.
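To make the recipe concrete, here is a minimal sketch that uses plain C malloc/free as a stand-in for a library's allocate/release pair. The names libAlloc, libFree, Buffer and initBuffer are made up for this illustration, and the freed counter is only there so the effect can be checked; a real binding would import the library's own routines instead.

```nim
# Stand-ins for a library's C API: plain malloc/free from <stdlib.h>.
proc libAlloc(size: csize_t): pointer {.importc: "malloc", header: "<stdlib.h>".}
proc libFree(p: pointer) {.importc: "free", header: "<stdlib.h>".}

var freed = 0                  # bookkeeping for the demonstration only

type
  Buffer = object
    data: pointer              # owned by the "library", not by Nim's MM
    size: int

proc `=destroy`(b: var Buffer) =
  if b.data != nil:
    libFree(b.data)            # release through the library's own routine
    inc freed

proc `=copy`(dst: var Buffer, src: Buffer) {.error.}   # no copies allowed

proc initBuffer(size: int): Buffer =
  result.data = libAlloc(csize_t(size))
  result.size = size

proc useBuffer() =
  var buf = initBuffer(64)
  doAssert buf.data != nil
  # no explicit free here: `=destroy` is injected where buf leaves scope

useBuffer()
echo "buffers freed: ", freed
```

Wrapping the use inside a proc keeps the lifetime of buf obvious: the release happens when useBuffer returns, without any call written by the user.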

Protecting Resources using automatic reference Counting (ARC).

Another pattern is more flexible, and allows us to assign variables fearlessly even in environments with constrained resources. We need a way to protect our Resource pointer while letting more than one variable access it, without releasing the resource more than once. So, basically, we allow access to our resource, but in a way that can be tracked. This leads us to references in Nim. In simple terms, a reference is just a tracked/traced pointer: we can still access an object's fields in the normal way, but the MM runtime keeps track of the number of references to a particular instance of Resource. All references are tracked by GC/ARC, hence GC/ARC knows when the reference count of an object instance becomes zero, and at that point the corresponding =destroy routine is called.

type
    Resource = object
        data:ptr float32

proc `=destroy`(obj:var Resource)=
    if not isNil(obj.data):
        dealloc(obj.data)
proc `=copy`(a:var Resource, b:Resource){.error.}

Code above looks the same as before, still no copying is allowed for Resource object, but now rather than using Resource object directly, we would be instead using it as a field of another object.

type
    Resource = object
        data:ptr float32

proc `=destroy`(obj:var Resource)=
    if not isNil(obj.data):
        echo "destroyer: ", repr(obj.data)
        dealloc(obj.data)
proc `=copy`(a:var Resource, b:Resource){.error.}

type
    SomeObject = object
        resource: ref Resource          #this field holds a reference to Resource.

        field_1:int
        field_2:int
        ...

proc newSomeObject(field_1:int, field_2:int, ...):SomeObject=

    var output:SomeObject
    output.field_1 = field_1
    output.field_2 = field_2

    output.resource = new Resource                          #reference to Resource.
    output.resource.data = cast[ptr float32](alloc0(512))   #allocating resources.

    return output

var a = newSomeObject(...)
var b = a                                      #copy operation for SomeObject is allowed.
echo repr(a)

nim c --gc:arc ./example_2.nim

output.resource = new Resource creates an object of Resource type on the heap, and we are given a reference to that object. Assignment is now allowed because we only copy a reference to the Resource instance, and since the MM keeps a reference count, =destroy for Resource is called only when the RC becomes zero. Note that the MM runtime has no knowledge of what the Resource.data field points to, but it can track which objects/instances use this Resource object as a field by keeping a reference count.

In simple terms, MM runtime can be thought of as true owner of some resource, and allowing other variables to borrow access to that resource. =copy for Resource is not allowed and would prevent any accidental attempt to own raw-pointer to that resource. All accesses to a resource would be tracked across that resource lifetime.

ARC and GC are two popular choices for Memory management and we would be discussing ARC below in detail.

ARC

ARC refers to automatic reference counting. We already discussed references and how the MM can keep track of the reference count. Based on that reference count, ARC calls the appropriate =destroy routine. This leads to deterministic memory management, because all the extra code is injected at compile time by the compiler based on the scope of variables. This is unlike GCs, which pause threads to collect all dead objects/references.
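The determinism can be observed with a small counter. This is a minimal sketch that assumes compilation with --gc:arc (or ORC); Holder, newHolder and the released counter are names invented for the illustration.

```nim
var released = 0               # bookkeeping for the demonstration only

type
  Resource = object
    data: pointer

proc `=destroy`(r: var Resource) =
  if r.data != nil:
    dealloc(r.data)
    inc released

proc `=copy`(dst: var Resource, src: Resource) {.error.}

type
  Holder = object
    resource: ref Resource     # traced reference, counted by ARC

proc newHolder(size: int): Holder =
  result.resource = new(Resource)
  result.resource.data = alloc0(size)

proc demo() =
  var a = newHolder(64)
  var b = a                    # only the reference is copied
  doAssert a.resource == b.resource   # both point at the same Resource
  doAssert released == 0       # still referenced, so nothing released yet

demo()
# Under ARC/ORC both references are gone once demo() has returned,
# so the Resource has been destroyed exactly once by this point.
echo "released: ", released
```

With the default refc GC the release would instead happen whenever the collector runs, so the value printed right after demo() is not guaranteed there.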

In my experience this doesn't mean that ARC-based memory management is always faster, or vice versa; it depends on the actual use case and code. Depending on the control flow of the code, ARC may need to inject many routine calls that run at runtime, and this may lead to lower throughput, and thus higher overall latency, than the default GC.

ARC can make memory management automatic for embedded/GPU programming, i.e. where resources like RAM are limited. Rather than tuning a GC for a particular platform, we can use ARC to make sure a resource is released as soon as it goes out of scope. A GC may even fail to release resources in time in such environments.

Pattern

For context, it is very common for linear-algebra/ML libraries to allocate a buffer for the underlying data once and then assign a view/slice of that data to another variable. It doesn't make sense to copy all the data each time we assign it to another variable, so we would rather have a pointer to that data and pass that around. If we were writing our own such implementation, we could work with raw pointers and still use ARC to manage memory automatically for us.

A simple use-case is discussed below:

#example_2

type
    Resource = object
        data:ptr float32

proc `=destroy`(obj:var Resource)=
    if not isNil(obj.data):
        echo "Releasing: ",repr(obj.data)           #debugging
        dealloc(obj.data)
proc `=copy`(a:var Resource, b:Resource){.error.}

type
    Tensor = object
        resource: ref Resource           #this field  holds a traced reference to Resource.

        #meta-data or whatever
        H:int
        W:int

proc newTensor(H:int, W:int):Tensor=
    result.resource = new Resource     #Resource object is created on HEAP, and we are provided with a reference to that object.
    result.resource.data = cast[ptr float32](alloc0(H*W*sizeof(float32)))

    result.H = H
    result.W = W

var a = newTensor(H=16, W=16)
var b = a                      #no-cost op, because only  reference(pointer),int,int fields are copied.
echo repr(a)

nim c -r --gc:arc  ./example_2.nim
#output

Tensor(resource: ref Resource(data: ptr 0.0), H: 16, W: 16)
Releasing: ptr 0.0

As long as we have a way to allocate and de-allocate resources, this pattern should work on GPU/OpenCL or any embedded system. The Nim compiler only uses =copy for a variable when it cannot prove that the read is the last read of that variable (lastReadOf(variable)), i.e. where possible it replaces the copy with a move, which steals the resources from one variable and passes them to another, thus helping new developers avoid costly accidental copies. Nim also provides more advanced hooks to manage custom objects; for more details check out the official guide.
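The copy-to-move rewrite can be seen with a type whose =copy is disabled. This is a minimal sketch, assuming ARC/ORC; initResource and consume are hypothetical helpers, not part of any library.

```nim
type
  Resource = object
    data: pointer

proc `=destroy`(r: var Resource) =
  if r.data != nil:
    dealloc(r.data)

proc `=copy`(dst: var Resource, src: Resource) {.error.}   # copies are rejected

proc initResource(size: int): Resource =
  result.data = alloc0(size)

proc consume(r: sink Resource): bool =
  # `sink` means this proc takes ownership; r is destroyed when it returns.
  r.data != nil

proc demo(): bool =
  var a = initResource(32)
  var b = a                # last read of `a`, so this is compiled as a move
  var c = move(b)          # explicit move: `b` is reset to its default state
  doAssert b.data == nil
  result = consume(c)      # last read of `c`: moved into the sink parameter

echo demo()
```

Adding any later read of a after var b = a would turn that line into a real copy, and the compiler would then stop with the `=copy` is not available error shown earlier.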

Since scopes for all variables can be determined at compile time, and appropriate code can be injected based on reference count, this leads to highly portable code with no need to ship an extra GC runtime. Not depending on a GC runtime also makes it easier to call Nim code from other languages like Python.

ARC/ORC[3] is expected to be default Memory-management starting from Nim 2.0.

A more detailed look for =copy hook.

Since we are preventing any =copy for a particular object, in theory this should lead to a lot of friction while writing Nim code, because the user would probably write a lot of statements like var a = b, either to keep the code clean or to get temporary access to a resource. The user cannot be expected to spend a lot of time avoiding such statements in cases where the compiler shouldn't complain in the first place.

But in practice, the compiler very rarely complains except where it is due. We can look at some examples to see why it works.

#nim c --gc:arc ./example_3.nim 

type
    Resource = object
        data: ptr float32

proc `=destroy`(a:var Resource) ...
proc `=copy`(a:var Resource, b:Resource){.error.}  #no copying allowed.

var a = new(Resource)
a.data = cast[ptr float32](alloc0(32))  #allocate 32 bytes.

var b = a[]    # `[]` gets the underlying object
echo repr(b)

The compiler would complain about the var b = a[] statement, and rightly so, since =copy is not allowed.

proc isAllowed(a:ref Resource)=
    var b = a[]         #compiler would allow this statement.
    echo repr(b)

isAllowed(a)  # this would work 

The routine isAllowed() would work. To see what happens, compile the above code with

nim c --gc:arc --expandArc:isAllowed ./example_3.nim
--expandArc: isAllowed

var :tmpD
try:
  var b_cursor = a[]    #b_cursor is treated as a raw pointer, which would  go out of scope, so no hook is necessary.
  echo [
    :tmpD = repr(b_cursor)
    :tmpD]
finally:
  `=destroy`(:tmpD)

  #Note: that there is no ``=destroy`` for b_cursor, since it is basically a raw pointer.

-- end of expandArc ------------------------

Note the statement var b_cursor = a[]: b_cursor is effectively a raw pointer that goes out of scope when isAllowed() returns, so no action from the compiler is necessary. The compiler can do this kind of inference most of the time, avoiding friction with the user. Cursor inference is a form of copy elision.

The {.cursor.} pragma is used to tell the compiler to treat a reference as a raw pointer and so avoid any reference counting: it tells the compiler to skip the object construction/destruction pairs for whatever is marked with the cursor pragma. This is the concept used in isAllowed for copy elision.
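For illustration, a typical place to write the pragma by hand is a traversal variable that only visits nodes owned elsewhere. This is a minimal sketch; the Node type and the sum proc are made up for the example.

```nim
type
  Node = ref object
    value: int
    next: Node                 # owning reference to the rest of the list

proc sum(head: Node): int =
  # `it` only visits nodes owned by the list itself, so marking it as a
  # cursor lets the compiler skip reference-count updates on each step.
  var it {.cursor.} = head
  while it != nil:
    result += it.value
    it = it.next

let list = Node(value: 1, next: Node(value: 2, next: Node(value: 3)))
echo sum(list)
```

A cursor is only safe while something else keeps the object alive; here the list variable does that for the whole traversal.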

Remarks:

ARC cannot prevent issues like dangling pointers (i.e. keeping references around even after the corresponding resource has been de-allocated) while allowing raw-memory access at the same time. But the above patterns should be really useful as long as we wrap raw pointers inside objects and use those objects.

Generally, I wrap some functionality involving raw-pointer arithmetic inside a function/proc to keep nice separate scope and remain careful never to return a raw-pointer.
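A minimal sketch of that habit, with hypothetical names (FloatBuf, initFloatBuf): the raw pointer stays inside the object, and all indexing goes through small procs that check bounds and return values rather than pointers.

```nim
type
  FloatBuf = object
    data: ptr UncheckedArray[float32]   # raw memory, never handed out
    n: int

proc `=destroy`(b: var FloatBuf) =
  if b.data != nil:
    dealloc(b.data)

proc `=copy`(dst: var FloatBuf, src: FloatBuf) {.error.}

proc initFloatBuf(n: int): FloatBuf =
  result.data = cast[ptr UncheckedArray[float32]](alloc0(n * sizeof(float32)))
  result.n = n

proc `[]`(b: FloatBuf, i: int): float32 =
  doAssert i >= 0 and i < b.n, "index out of bounds"
  b.data[i]                    # returns a value, not a pointer

proc `[]=`(b: var FloatBuf, i: int, v: float32) =
  doAssert i >= 0 and i < b.n, "index out of bounds"
  b.data[i] = v

proc demo(): float32 =
  var buf = initFloatBuf(4)
  buf[2] = 1.5'f32
  result = buf[2]              # buf is released when demo() returns

echo demo()
```

Callers never see buf.data, so a pointer cannot outlive the buffer it came from unless someone reaches into the field deliberately.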

This post has been written with the intention of showcasing some Nim features from a user's viewpoint, to make working with raw pointers/memory easier and somewhat safer, and is based on my understanding of those features. It is quite possible that I am missing some important information, so any suggestions and comments on this topic are welcome.


Footnotes:

  1. It would crash if we use --d:useMalloc flag.

  2. Even though Resource contains a single data field, Nim compiler prevents creating a =destroy routine for something like type Resource = pointer. We have to wrap it in an object.