Write Your Own Memory Allocator

In this article, I will guide you to implement your own memory allocator. This memory allocator is very simple and is intended to help you gain a deeper understanding of the memory allocation process within operating systems. Therefore, it will not be very efficient. Simply put, this memory allocator just works.

We will primarily use the C language for programming. But before coding, we first need to have a certain understanding of how our operating system stores memory.

Memory Storage in Operating Systems#

In our operating system's memory, there are mainly 5 parts. These 5 parts are:

Different partitions in the RAM

kernel: Stores the kernel of our operating system.
stack: Stores the variables of the program.
heap: Stores the memory dynamically allocated by the program.
data: Stores static variables.
text: Stores the code segment of the program.

Among them, the sizes of the relatively dynamic heap and stack are not fixed. The heap can grow upwards (towards higher memory addresses), and the stack can grow downwards (towards lower memory addresses). Therefore, when we use commands like malloc in C, we are actually growing the heap area upwards a bit. We can see from the diagram that there is a brk pointer at the top of the heap, which limits the range of the heap area. So when we want to malloc a new memory segment, we move the brk pointer upwards. The process of moving brk has a dedicated function called sbrk in Unix/Linux systems. This function has 3 usages:

sbrk(0) returns the current brk address.
sbrk(n) increases brk by n bytes. Here, n is a positive number.
sbrk(-n) decreases brk by n bytes. Here, n is a positive number.

If sbrk() is successful in allocating memory, it will return the starting address of the allocated area. If sbrk() fails to allocate memory, it will return (void *) -1 (i.e., 0xFFFFFFFF).

sbrk() has actually been deprecated by modern operating systems because there are better memory allocation methods, such as mmap(). However, since sbrk() is simple and intuitive, we still use sbrk() for memory allocation here.

Implementing `malloc(size_t size)`#

malloc(size_t size) takes one parameter representing the size of memory to be allocated. We might intuitively think that we can just call sbrk(size) to complete the task?

Indeed, calling it directly can allocate new memory, but when we finish using this memory and try to free() it, problems arise. This is because we do not know how large the memory to be freed is, so we cannot release it.

Additionally, since the brk pointer always points to the top of the heap, and we adjust the position of brk to return memory to the operating system, we can only truly release (return to the operating system) memory allocations at the top of the heap. The memory in between is actually very difficult to be truly released (you could do a complete migration, but that would be too costly). Although we cannot truly release the memory in between, we can effectively utilize it, i.e., mark the freed memory so that this segment can be "allocated" by another program. Therefore, when we malloc, it actually includes a loop that checks all our allocated memory to see if it has been freed and whether its size is sufficient to meet our requirements.

Finally, the sbrk() command can be called by any program, not just ours. So when we allocate a segment of memory, it is entirely possible that another program also tries to allocate a segment of memory. In other words, we cannot guarantee that the memory we allocate is contiguous in physical addresses, which will make the "loop" process we mentioned earlier impossible. Therefore, we need to link each segment of memory we allocate together using a linked list.

Thus, when we malloc, we need to write some necessary information about this allocated memory together. Specifically, this information is written at the front of the allocated memory. This necessary information includes allocated memory size, whether it is freed, and the address of the next memory allocation.

struct header_t {
    size_t size;
    unsigned is_free;   /* shorthand for unsigned int */
    union header *next; /* linked list for our heap */
}

Then, for ease of processing, we want all the header information of our allocated memory to be the same size. Therefore, we need to align it so that the size of all header information is 16 bytes.

typedef char ALIGH[16]; /* ALIGH is 16 bytes */

union header {
    struct {
        size_t size;
        unsigned is_free;   /* shorthand for unsigned int */
        union header *next; /* linked list for our heap */
    } s;
    ALIGH stub;
};

typedef union header header_t;  /* so header_t will have a fixed 16 bytes size */

For the sake of the linked list and freeing, we also need two pointers to point to the head and tail of the linked list.

/* a head and tail pointer to track our heap */
header_t *head, *tail;

Finally, to avoid contention, we need to introduce a mutex lock.

/* set a mutex lock to prevent concurrently accessing memory */
pthread_mutex_t global_malloc_lock;

Now, we can finally complete our malloc() function.

header_t *get_free_block(size_t size);

/* sbrk(0) => return the current address of program break
 * sbrk(x) => increase brk by x bytes
 * sbrk(-x) => decrease brk by x bytes
 * if sbrk() fails, it will return (void *) -1 / 0xFFFFFFFF
 * */
void *malloc(size_t size) {
    if (!size)  /* if the requested size is zero, return */
        return NULL;
    size_t total_size;  /* header size + block size */
    void *block;
    header_t *header;

    pthread_mutex_lock(&global_malloc_lock);    /* acquire the lock */
    header = get_free_block(size);  /* check if there have available free block that has a size of bigger than ‘size’ */
    if (header) {   /* exists */
        header -> s.is_free = 0;    /* this block is not free */
        pthread_mutex_unlock(&global_malloc_lock);  /* release the lock */
        return (void *)(header + 1);    /* return the address of the block instead of the header */
    }

    total_size = sizeof(header_t) + size;   /* not exists, we have to create one */
    block = sbrk(total_size);
    if (block == (void *) -1) { /* malloc failed */
        pthread_mutex_unlock(&global_malloc_lock);  /* release the lock */
        return NULL;
    }

    header = block;
    header -> s.size = size;
    header -> s.is_free = 0;
    header -> s.next = NULL;

    if (!head)  /* if the head pointer of the heap not exists, make header be the head */
        head = header;
    if (tail)   /* if the tail pointer of the heap exists, */
        tail -> s.next = header;
    tail = header; /* make this header be the tail (or the new tail) */
    pthread_mutex_unlock(&global_malloc_lock);  /* release the lock */
    return (void *)(header + 1);    /* return the address of the block instead of the header */
}

header_t *get_free_block(size_t size) {
    header_t *curr = head;
    while (curr) {
        if (curr -> s.is_free == 1 && curr -> s.size >= size)
            return curr;
        curr = curr -> s.next;
    }
    return NULL;
}

The process of malloc() is as follows: first, check whether the incoming parameter size is valid. If valid, continue to try to acquire the mutex lock. After successfully obtaining the lock 🔒, use get_free_block() to search for a freed memory block in the memory allocation linked list that is large enough. If found, mark it as not free, release the mutex lock, and return the address of this memory block. Note that (void *)(header + 1) means returning the address of the memory block allocated, because header is the address of the header information, and the header information occupies 16 bytes. On the other hand, if no suitable memory block is found in the allocated memory, we need to request a new block through sbrk. The size we request is equal to the sum of the size of our header information and the size of the memory block. If the request is successful, we link it to our linked list and return the address of the memory block.

Implementing `free(void *block)`#

free(void *block) accepts the address of the memory block and releases it. The release strategy is as follows: if this memory block is exactly at the end of the linked list, i.e., at brk, we can call sbrk() to return this memory block to the operating system. If this memory block is in the middle of the linked list, we simply mark it as "free".

/* if the block-to-be-freed is at the end of the heap, then we return it to the OS.
 * otherwise we simply mark it as ‘free’
 * */
void free(void *block){
    header_t *header, *tmp;
    void *programbreak;

    if (!block) /* if this block not exists */
        return;

    pthread_mutex_lock(&global_malloc_lock);
    header = (header_t *)block - 1; /* header address */

    programbreak = sbrk(0); /* get current brk location */

    if ((char *)block + header -> s.size == programbreak) { /* this is the end of the heap */
        if (head == tail) { /* currently there is only one block in the heap, we will erase it */
            head = tail = NULL;
        } else {    /* otherwise we will remove the tail */
            tmp = head;
            while (tmp) {
                if (tmp -> s.next == tail) {
                    tmp -> s.next = NULL;
                    tail = tmp;
                }
                tmp = tmp -> s.next;
            }
        }
        /* release the space */
        sbrk(0 - sizeof(header_t) - header -> s.size);
        pthread_mutex_unlock(&global_malloc_lock);
        return;
    }

    /* if this block is in the middle of our heap, we mark it as ‘free’ */
    header -> s.is_free = 1;
    pthread_mutex_unlock(&global_malloc_lock);
}

In this code, (header_t *)block - 1 means mapping the memory block address to header *, and -1 gets the address of the header information. (char *)block is the initial address of the memory block, and adding the size of the memory block to the initial address gives the end address. If the end address is pointed to by brk, it indicates that this memory can be released back to the operating system. When releasing, be sure to update the linked list.

Thus, the two core functions of memory allocation have been implemented. Next, we will continue to implement two very useful functions: calloc and realloc.

Implementing `calloc(size_t num, size_t nsize)`#

calloc(size_t num, size_t nsize) accepts a quantity parameter num and a size parameter nsize, then allocates num blocks of memory of size nsize, initializes all allocated memory to 0, and returns the initial address.

/* calloc allocates ‘num’ elements with size of ‘nsize’.
 * */
void *calloc(size_t num, size_t nsize) {
    if (!num || !nsize)
        return NULL;    /* num and nsize must be valid */

    size_t size;
    void *block;
    size = num * nsize;
    if (nsize != size / num)
        return NULL;    /* multiply overflow */
    block = malloc(size);
    if (!block)
        return NULL;    /* malloc failed */
    memset(block, 0, size); /* reset all value in this block */
    return block;
}

It is worth noting that there is a check for overflow in the middle. Other than that, there should be no major issues.

Implementing `realloc(void *block, size_t size)`#

realloc(void *block, size_t size) resizes a memory block. The logic here is as follows: if the new size to be allocated is less than or equal to the original memory block size, the function simply returns the address of the original memory block, i.e., does not change it. If the new size is greater than the original memory block size, then it calls malloc to allocate a new memory block, copies the contents of the original memory block over, and returns the address of the new memory block.

/* change the given block’s size
 * */
void *realloc(void *block, size_t size){
    if (!block || !size)    /* block and size must be valid */
        return NULL;

    header_t *header;
    void *ret;
    header = (header_t *)block - 1;
    if (header -> s.size >= size)   /* if the original block already has enough space for the data */
        return block;
    ret = malloc(size); /* allocate a new space */
    if (ret) {
        memcpy(ret, block, header -> s.size);   /* move data in block to ret */
        free(block);    /* free original block */
    }
    return ret;
}