Last weekend, I read and summarized zmalloc.c which provides basic memory allocation mechanisms of Redis. In this weekend, I explored sds.c which implements many functions to manage a dynamic string efficiently; in fact, “sds” stands for Simple Dynamic String. This post describes a core of sds implementations. The version of Redis is 3.0.5. I put some comments on following code snippets.
sds is just a char pointer with hidden header
As well as zmalloc family, sds implicitly keeps track of available size and length of string by placing them on its (hidden) prefix. Actually, sds is defined as a char pointer:
// sds.h:39
typedef char *sds;
The prefix of sds contains a length of string and remaining allocated memory. They are packed to sdshdr:
// sds.h:41
struct sdshdr {
unsigned int len;
unsigned int free;
char buf[]; // sds
};
and this is the core data structure of sds. Most part of sds.c implements basic string functions such as copy, concatenation, capitalization and even string formatting for sds because standard string functions are unavailable.
Instantiation and extension of sds
Here, I introduce two important functions out of over thirty in sds.c. The first function is sdsnewlen which creates new sds from a given string:
// sds.c:51
sds sdsnewlen(const void *init, size_t initlen) {
struct sdshdr *sh;
// memory allocation
if (init) {
sh = zmalloc(sizeof(struct sdshdr)+initlen+1);
} else {
sh = zcalloc(sizeof(struct sdshdr)+initlen+1);
}
if (sh == NULL) return NULL;
// store sizes
sh->len = initlen;
sh->free = 0;
// copy given string
if (initlen && init)
memcpy(sh->buf, init, initlen);
sh->buf[initlen] = '\0';
return (char*)sh->buf;
}
It allocates a memory block for sdshdr at first. Actual size depends on a back-end malloc implementation for zmalloc and a CPU architecture. If the back-end is libc malloc and the architecture is 64-bit, it consumes 16 bytes for the whole prefix.
The second function is sdsMakeRoomFor which ensures that a sds has at least addlen free memory for future string addition by extending memory if necessary:
// sds.c:129
sds sdsMakeRoomFor(sds s, size_t addlen) {
struct sdshdr *sh, *newsh;
size_t free = sdsavail(s); // `free` value of sdshdr
size_t len, newlen;
// return original sds if it has enough free space
if (free >= addlen) return s;
// calculate new length
len = sdslen(s);
sh = (void*) (s-(sizeof(struct sdshdr)));
newlen = (len+addlen);
if (newlen < SDS_MAX_PREALLOC)
newlen *= 2;
else
newlen += SDS_MAX_PREALLOC;
// reallocate memory
newsh = zrealloc(sh, sizeof(struct sdshdr)+newlen+1);
if (newsh == NULL) return NULL;
newsh->free = newlen - len;
return newsh->buf;
}
This function allocates twice as much memory as the minimum capacity where a string with addlen size can be added but the additional size of this pre-allocation is bounded by SDS_MAX_PREALLOC (1MB). Then, it reallocates the memory of sds by zrealloc. For all I know, zrealloc can return brand new memory block when it cannot extend the original block. Thus, all references to the original sds must be replaced by the return value.
Other functions
Other functions in sds.c would be easy-to-read with above basic knowledge for sds though sdscatfmt for formatting and sdssplitargs for splitting a string like function arguments are little bit long and tricky.
comments powered by Disqus