Redis code reading: sds

Last weekend, I read and summarized zmalloc.c which provides basic memory allocation mechanisms of Redis. In this weekend, I explored sds.c which implements many functions to manage a dynamic string efficiently; in fact, “sds” stands for Simple Dynamic String. This post describes a core of sds implementations. The version of Redis is 3.0.5. I put some comments on following code snippets.

sds is just a char pointer with hidden header

As well as zmalloc family, sds implicitly keeps track of available size and length of string by placing them on its (hidden) prefix. Actually, sds is defined as a char pointer:

// sds.h:39
typedef char *sds;

The prefix of sds contains a length of string and remaining allocated memory. They are packed to sdshdr:

// sds.h:41
struct sdshdr {
    unsigned int len;
    unsigned int free;
    char buf[]; // sds
};

and this is the core data structure of sds. Most part of sds.c implements basic string functions such as copy, concatenation, capitalization and even string formatting for sds because standard string functions are unavailable.

Instantiation and extension of sds

Here, I introduce two important functions out of over thirty in sds.c. The first function is sdsnewlen which creates new sds from a given string:

// sds.c:51
sds sdsnewlen(const void *init, size_t initlen) {
    struct sdshdr *sh;

    // memory allocation
    if (init) {
        sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
    } else {
        sh = zcalloc(sizeof(struct sdshdr)+initlen+1);
    }
    if (sh == NULL) return NULL;

    // store sizes
    sh->len = initlen;
    sh->free = 0;

    // copy given string
    if (initlen && init)
        memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    return (char*)sh->buf;
}

It allocates a memory block for sdshdr at first. Actual size depends on a back-end malloc implementation for zmalloc and a CPU architecture. If the back-end is libc malloc and the architecture is 64-bit, it consumes 16 bytes for the whole prefix.

The second function is sdsMakeRoomFor which ensures that a sds has at least addlen free memory for future string addition by extending memory if necessary:

// sds.c:129
sds sdsMakeRoomFor(sds s, size_t addlen) {
    struct sdshdr *sh, *newsh;
    size_t free = sdsavail(s); // `free` value of sdshdr
    size_t len, newlen;

    // return original sds if it has enough free space
    if (free >= addlen) return s;

    // calculate new length
    len = sdslen(s);
    sh = (void*) (s-(sizeof(struct sdshdr)));
    newlen = (len+addlen);
    if (newlen < SDS_MAX_PREALLOC)
        newlen *= 2;
    else
        newlen += SDS_MAX_PREALLOC;

    // reallocate memory
    newsh = zrealloc(sh, sizeof(struct sdshdr)+newlen+1);
    if (newsh == NULL) return NULL;

    newsh->free = newlen - len;
    return newsh->buf;
}

This function allocates twice as much memory as the minimum capacity where a string with addlen size can be added but the additional size of this pre-allocation is bounded by SDS_MAX_PREALLOC (1MB). Then, it reallocates the memory of sds by zrealloc. For all I know, zrealloc can return brand new memory block when it cannot extend the original block. Thus, all references to the original sds must be replaced by the return value.

Other functions

Other functions in sds.c would be easy-to-read with above basic knowledge for sds though sdscatfmt for formatting and sdssplitargs for splitting a string like function arguments are little bit long and tricky.

 
comments powered by Disqus