|
C Container Collection (CCC)
|
#include <private_array_tree_map.h>

An array tree map is a modified struct of arrays layout with the modification being that we may have the arrays as pointer offsets in a contiguous allocation if the user desires a dynamic map.
The user data array comes first allowing the user to store any type they wish in the container contiguously with no intrusive element padding, saving space.
The nodes array is next. These nodes track the indices of the child and parent nodes in the WAVL tree.
Finally, comes the parity bit array. This comes last because it allows the optimal storage space to be used, a single bit. Usually, a data structure theorist's "bit" of extra information per node comes out to a byte in practice due to portability and implementation concerns. If this byte were to be included in the current map elem, it would then be given 7 more bytes of padding by the compiler to ensure proper alignment, wasting large amounts of space. Instead all the bits are packed into their own bit array at the end of the allocation. The bit at a given index represents the parity of data and its node at that same index. This allows the implementation to follow the theorist's ideal.
Here is the layout in one contiguous array.
(D = Data Array, N = Nodes Array, P = Parity Bit Array, _N = Capacity - 1)
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐ │D_0│D_1│...│D_N│N_0│N_1│...│N_N│P_0│P_1│...│P_N│ └───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
Consider how this layout saves space. Here is a more traditional approach.
This struct wastes a byte for parity when only a bit is needed. It also has an alignment of 8 bytes meaning the trailing 7 bytes are completely useless. If this element were to intrude on a user element it would also force its 8 byte alignment on user data. This is more wasted space if the user wanted to simply use the map as a set for a smaller type such as an int.
If the user wanted a simple set of 64 ints we have the following waste.
In contrast the current struct of arrays design lays out data as follows.
That means there are only 4 bytes + B bits wasted, 4 bytes of padding between the end of the user type array and the start of the nodes array and the unused bits at the end of the parity bit array. This also means it important to consider the alignment differences that may occur between the user type and the node type.
This layout comes at the cost of consulting multiple arrays for many operations. However, once user data has been inserted or removed the tree fix up operations only need to consult the nodes array and the bit array which means more bits and nodes can be loaded on a cache line. We no longer need to consider arbitrarily sized or organized user data while we do operations on nodes and bits. Performance metrics still must be measured to say whether this is faster or slower than other approaches. However, the goal with this design is space efficiency first, speed second.
Data Fields | |
| void * | data |
| struct CCC_Array_tree_map_node * | nodes |
| unsigned * | parity |
| size_t | root |
| size_t | free_list |
| size_t | capacity |
| size_t | count |
| size_t | sizeof_type |
| size_t | key_offset |
| CCC_Key_comparator * | compare |
| CCC_Allocator * | allocate |
| void * | context |
| CCC_Allocator* CCC_Array_tree_map::allocate |
The provided allocation function, if any.
| size_t CCC_Array_tree_map::capacity |
The current capacity.
| CCC_Key_comparator* CCC_Array_tree_map::compare |
The provided key comparison function.
| void* CCC_Array_tree_map::context |
The provided context data, if any.
| size_t CCC_Array_tree_map::count |
The current size.
| void* CCC_Array_tree_map::data |
The contiguous array of user data.
| size_t CCC_Array_tree_map::free_list |
The start of the free singly linked list.
| size_t CCC_Array_tree_map::key_offset |
Where user key can be found in type.
| struct CCC_Array_tree_map_node* CCC_Array_tree_map::nodes |
The contiguous array of WAVL tree meta data.
| unsigned* CCC_Array_tree_map::parity |
The parity bit array corresponding to each node.
| size_t CCC_Array_tree_map::root |
The root node of the WAVL tree.
| size_t CCC_Array_tree_map::sizeof_type |
The size of the type stored in the map.