Generic Containers in C: vec

Martin Uecker, 2025-07-20


I discuss the implementation of type and bounds safe generic containers in C. Previously, I discussed a span type, and bounds checking using arrays.

Here, I will discuss a vector type. A vector type is essentially a resizable array. A vector type could be used as in the following example.


	int main()
	{
		vec(int) *vec_ptr = calloc(1, sizeof *vec_ptr);

		if (!vec_ptr)	// memory out
			abort();

		for (int i = 0; i < 10; i++)
			vec_push(int, &vec_ptr, i);

		int sum = sum_vec(vec_ptr);

		free(vec_ptr);

		printf("sum: %d\n", sum);
	}
	

In C, it is actually fairly easy to define a vector type.


	#define vec(T) struct vec_##T { ssize_t N; T data[/* .N */]; }
	

It is very similar to the span type I discussed two weeks ago and you may want to read this post first. In comparison to the span type, the last element is not a pointer to an array but a so-called flexible array member, which extends the structure by an array of unknown length.

Growing the Vector

The key feature is that we can append elements to a vector, which we can implement using realloc.


	#define vec_push(T, v, x)					\
	({								\
		vec(T) **_vp = (v);					\
		ssize_t _N = (*_vp)->N + 1;				\
 		ssize_t _S = _N * (ssize_t)sizeof((*_vp)->data[0])	\
				+ (ssize_t)sizeof(vec(T));		\
		if (!(*_vp = realloc(*_vp, _S))) abort();		\
		(*_vp)->N++;						\
		(*_vp)->data[_N - 1] = (x);				\
	})	
	

Error Handling

I simply terminate the program using abort in case memory can not be allocated. Of course, this may not be what you want. If you need to handle errors (e.g. in embedded or systems programming), you need to adapt the API to return an error. You might also notice that I use ssize_t instead of size_t. the idea is to use the signed overflow sanitizer to catch overflow. If overflow must be handled explicitly, C23's checked integers could be used.

Why no Capacity Field?

Many vector types include a capacity field, so that resizing on every push can be avoided. I do not include one, because simplicity is more important to me and realloc often does this already internally. In most scenarios, the performance is already good enough.

Still, there are cases where it is really important to get optimal performance. For this, I experiment with two other interfaces. The first uses an external integer for tracking the capacity. One can use this interface in a performance critical routine to avoid repeated allocation. At the end, a single call to vec_fit can be used to release the memory that is not needed.


	T vec_push_cap(_Type T, ssize_t *cap_ptr, vec(T) *v, T x);
	T vec_pop_cap(_Type T, ssize_t *cap_ptr, vec(T) *v);
	vec(T) *vec_fit(_Type T, vec(T) *v);
	

On top of this interface I have another version that always assumes that the available capacity is the current size rounded up to the next power of two (at least below a certain limit).


	T vec_push_auto(_Type T, vec(T) *v, T x);
	T vec_pop_auto(_Type T, vec(T) *v);
	

I am still experimenting with these interfaces, but it seems that usability and performace can be quite good. One open question is for the automatic version is how to implement a hysteresis. If I find some time I will do some benchmarking and post the results.

Bounds Safety

Safety assumes that the members are not accessed directly but by using the API. Exactly as for the span type, the type of the data array should be T[.N] so that is has a known length. Until we get something like this in C, we can use the same workaround and define a macro that adds the length back to the array type.


	#define vec2array(T, v) (*({ vec(T) *_v = (v); (T(*)[_v->N])_v->data; }))
	

Using such a macro, this vector type nicely interoperates with conventional C arrays. In the following example, we use the array_slice macro from my previous post. I would usually write a loop, but I wanted to show how one can continue to do bounds safe processing with the array that represents the vector.

Example:


	static int array_sum(int N, int (*arr_ptr)[N])
	{
		return N ? (*arr_ptr)[0] + array_sum(N - 1, &array_slice(*arr_ptr, 1, N)) : 0;
	}

	int main()
	{
		vec(int) *vec_ptr = calloc(1, sizeof *vec_ptr);

		for (int i = 0; i < 10; i++)
			vec_push(int, &vec_ptr, i);

		// int (*arr_ptr)[*] = ...
		auto arr_ptr = &vec2array(int, vec_ptr);

		int sum = array_sum(_Countof *arr_ptr, atr_ptr);

		free(vec_ptr);

		printf("sum: %d\n", sum);
	}
	

In practice, there is still one bounds safety issue in the code, because the undefined behavior sanitizer does not check that the first argument in the call to array_sum matches the second argument. I hope we can fix this omission at some point.

If you want, you can check out my experimental library where I am experimenting with these ideas: link. If you have ideas on how to do this better, let me know!