Stupid question, are there standard GC libraries that are used with C? Alternati...

nu11ptr · on April 8, 2023

The Boehm collector exists and is mentioned in the article, but is not part of the standard library. Typical memory management practice is to store as much as possible on the stack, where memory is reclaimed as it goes out of scope. Larger chunks of memory, or those that outlive their scope, typically use manual malloc + free, and these are often batched to minimize overhead. It is fairly unusual to use a GC in C. This is likely due to multiple reasons: history/culture, language design (C only allows for conservative collectors), less precise control over runtime overhead and pauses, higher memory utilization, etc.

cancerhacker · on April 8, 2023

A common idiom that is no longer as common is to always create a stack buffer up to a certain small bound and then an allocation if it goes beyond:

   void dwim(unsigned count) {
     struct Something _buffer[10];
     struct Something* buffer;
     /* Use the stack buffer whenever the request fits in it. */
     if (count <= sizeof(_buffer) / sizeof(_buffer[0]))
       buffer = _buffer;
     else {
       buffer = (struct Something*) malloc(count * sizeof(struct Something));
       if (buffer == NULL)
         return;
     }
     doSomething(buffer, count);
     /* Only the heap fallback needs to be freed. */
     if (buffer != _buffer)
       free(buffer);
   }

It used to be really common to write that without considering if it was a necessary optimization or not - nowadays I would just malloc up front.

gmueckl · on April 9, 2023

This pattern still exists in C++ in the form of custom containers that implement "small array optimization". The LLVM project is a heavy user of this pattern as far as I know and I have seen it in several proprietary codebases.

spacechild1 · on April 9, 2023

It's also used in std::string (small string optimization).

alpaca128 · on April 9, 2023

Same with Rust's smallvec crate.

hsn915 · on April 9, 2023

The consensus seems to be Arena allocators.

Ryan Fleury has a great write up on the subject:

https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...

mcguire · on April 8, 2023

"The usual memory management practice" in C depends a lot an what you are doing. Malloc/free scattered through your code is...not uncommon. That does require discipline or ugliness ensues. Other common options are not allocating, which is kind of limiting but the only option under very tight memory constraints, and doing specific domain management options.

One of the best of the latter is arena management. (https://en.wikipedia.org/wiki/Region-based_memory_management)
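
For anyone who hasn't seen one, here is a minimal sketch of what arena (region) management can look like. The names (`arena`, `arena_init`, `arena_alloc`, `arena_reset`, `arena_destroy`) are made up for illustration and don't come from any particular library. It assumes a single fixed-size backing block and C11 for `max_align_t`; real arenas often grow by chaining blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-capacity bump arena. */
typedef struct {
    unsigned char *base;  /* start of the single backing block */
    size_t capacity;      /* total bytes in the backing block */
    size_t used;          /* bytes handed out so far */
} arena;

/* Grab the backing block up front. Returns 1 on success, 0 on failure. */
static int arena_init(arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Bump-allocate `size` bytes, aligned for any object type.
   Returns NULL when the arena is full. */
static void *arena_alloc(arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    assert((align & (align - 1)) == 0);  /* mask trick needs a power of two */
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* "Free" every allocation at once; the backing block is kept for reuse. */
static void arena_reset(arena *a)
{
    a->used = 0;
}

/* Return the backing block to the system. */
static void arena_destroy(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

The point is that individual objects are never freed: everything allocated from the arena shares one lifetime and goes away with a single `arena_reset` or `arena_destroy`.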

mandarax8 · on April 8, 2023

Boehm GC is mentioned in the article: https://github.com/ivmai/bdwgc

ritcgab · on April 8, 2023

Manual malloc() + free() is the way. The language design makes it hard (if not impossible) to generate GC routines at compile time. A runtime GC is another matter, and you will probably lose raw pointer access, because some higher-level struct has to wrap the raw pointer to feed more information to the garbage collector.

sirwhinesalot · on April 8, 2023

If you want to keep your sanity, you don't want to be manually mallocing and freeing all over the place.

The most common pattern I see is a "context" object of some kind (typically used by libraries) which handles all memory management internally and must be passed to every API call. So you only ever allocate and free that one object. (Internally they might be doing all sorts of crazy things, I've even seen a basic tracing GC!)

Applications typically use some combination of bump allocators (aka arenas) and pools. You put temporary allocations in a dedicated temp arena and clear it out when appropriate for the application (e.g. every frame).
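
To make the "pool" half of that concrete, here is a rough sketch of a fixed-size object pool backed by a free list. All names (`pool`, `pool_init`, `pool_get`, `pool_put`, `pool_destroy`) are hypothetical, and it assumes a fixed slot count chosen up front and C11 for `max_align_t`. Free slots store the link to the next free slot inside themselves, so the pool needs no extra bookkeeping memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-size object pool. */
typedef struct {
    void *block;       /* one backing allocation holding every slot */
    void *free_list;   /* head of the singly linked list of free slots */
    size_t slot_size;  /* object size rounded up for alignment */
} pool;

/* Allocate `count` slots of at least `obj_size` bytes each.
   Returns 1 on success, 0 on failure. */
static int pool_init(pool *p, size_t obj_size, size_t count)
{
    const size_t align = _Alignof(max_align_t);
    /* A free slot must be able to hold the next-free pointer. */
    size_t slot = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    slot = (slot + align - 1) / align * align;

    p->block = NULL;
    p->free_list = NULL;
    p->slot_size = slot;
    if (count == 0 || slot > SIZE_MAX / count)
        return 0;
    p->block = malloc(slot * count);
    if (p->block == NULL)
        return 0;

    /* Thread every slot onto the free list. */
    for (size_t i = 0; i < count; i++) {
        void *s = (unsigned char *)p->block + i * slot;
        *(void **)s = p->free_list;
        p->free_list = s;
    }
    return 1;
}

/* Pop a slot off the free list; returns NULL when the pool is exhausted. */
static void *pool_get(pool *p)
{
    void *s = p->free_list;
    if (s != NULL)
        p->free_list = *(void **)s;
    return s;
}

/* Push a slot back onto the free list so it can be reused. */
static void pool_put(pool *p, void *s)
{
    assert(s != NULL);
    *(void **)s = p->free_list;
    p->free_list = s;
}

/* Release the backing allocation; every slot becomes invalid. */
static void pool_destroy(pool *p)
{
    free(p->block);
    p->block = NULL;
    p->free_list = NULL;
}
```

Unlike a bump allocator, a pool lets you return individual objects, which suits things like entities or nodes that come and go at different times but all have the same size.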

sparkie · on April 9, 2023

> Is it typical just to keep track within the confines of the code you're writing and explicitly free memory when you're done with it?

Yes. It's common to write OOP style C by using an opaque pointer in a header file and handling all the allocation and deallocation within the code file. The header file will export a "constructor" and "destructor" function. The constructor and destructor are still called manually, but this method keeps the state visible only within a single code file, which prevents some accidental misuses of a type. The destructor should have a free for every allocation in the constructor, but done in reverse order. If you follow this pattern consistently, then `malloc` will only ever appear inside a constructor and `free` will only appear in a destructor. All other object allocation and deallocation is done via the relevant constructor/destructor.

.h file:

    typedef struct my_type_t my_type;

    my_type* my_type_alloc (type1, size_t);

    void my_type_free (my_type*);

.c file:

    struct my_type_t
    {
        type1 member1;
        type2* member2;
    };

    my_type* my_type_alloc (type1 m1_copy, size_t m2_size)
    {
        my_type* result = (my_type*) malloc (sizeof (my_type));
        if (result == NULL) return NULL;
        result->member1 = m1_copy;
        result->member2 = type2_alloc (m2_size);
        return result;
    }

    void my_type_free (my_type* value)
    {
        type2_free (value->member2);
        free (value);
    }

Another technique is to make use of the GCC `constructor` and `destructor` attributes, which mark functions to be called before `main` and after `main` returns (or `exit()` is called). For example, you might have a static container type and only expose methods `add`, `remove` and `get`, then all allocation and deallocation happens solely within the code file and consumers of this API don't need to concern themselves with allocating and deallocating. This is only really useful for singleton-like (process global) data structures, but you could perhaps utilize these for implementing a GC, since an allocator is usually global to the process. (E.g., the `GC_init` function in the article could be given the constructor attribute so that the programmer would not need to manually call it.)

.h file:

    void container_add (obj value);
    void container_remove (obj value);
    obj container_get (size_t index);

.c file:

    struct container_t
    {
        obj* items;
        size_t num_items;
        size_t capacity;
    };

    static struct container_t* c;

    __attribute__((constructor))
    void container_initialize(void)
    {
        c = malloc (sizeof (struct container_t));
        c->num_items = 0;
        c->capacity = DEFAULT_NUM_ITEMS;
        c->items = malloc(DEFAULT_NUM_ITEMS * sizeof (obj));
    }

    __attribute__((destructor))
    void container_uninitialize(void)
    {
        free (c->items);
        free (c);
    }

    void container_resize (size_t new_size)
    {
        obj* tmp = c->items;
        c->capacity = new_size;
        c->items = malloc (new_size * sizeof (obj));
        memcpy (c->items, tmp, c->num_items * sizeof (obj));
        free (tmp);
    }

    void container_add (obj value) {
        if (c->num_items == c->capacity) container_resize (c->capacity * 2);
        ...
    }

    void container_remove (obj value) { ... }

    obj container_get (size_t index) {
        return c->items[index];
    }