|
C Container Collection (CCC)
|
#include <private_flat_hash_map.h>

The layout of the map uses only pointers to account for the possibility of memory provided from the data segment, stack, or heap. When the map is allowed to allocate it will take care of aligning pointers appropriately. In the fixed size case we rely on the user defining a fixed size type. In either case the arrays are in one contiguous allocation but split as follows:
(N == capacity - 1) Where capacity is a power of 2. (G == group size - 1).
┌───┬───┬───┬───┬────┬───┬───┬───┬───┬───┬───┬───┬───┐ │D_0│D_1│...│D_N│Swap│T_0│T_1│...│T_N│R_0│R_1│...│R_G│ └───┴───┴───┴───┴─┬──┼───┴───┴───┴───┼───┴───┴───┴───┘ ┌───────────────┘ │ │ ┌─┴───────────┐ ┌────┴─────────┐ ┌───┴──────────────────────────────────────┐ │Swap slot for│ │Base address │ │Start of replica of first group to support│ │in place │ │of Tag array. │ │a group load that starts at T_N as well as│ │rehashing. │ │Possible pad │ │erase and inserts. This means R_G is never│ │Size = 1 data│ │bytes between.│ │needed but duplicated for branchless ops. │ └─────────────┘ └──────────────┘ └──────────────────────────────────────────┘
This is a different layout than Rust's Hashbrown table. Instead of a shared base address of the data and tag arrays with padding at the start of the data array, we use a normal two array layout with padding between. This is because we must allow the user to allocate a hash table at compile time in the data segment or on the stack. The fixed size map defined at compile time is a struct and structs in the C standard are specified to have no padding at the start. Therefore, so that the code within the map does not care whether it deals with a fixed or dynamic map, we must assume data starts at the base address and that there may be zero or more bytes of padding between the data and tag arrays for alignment.
We may lose some of the assembly optimizations in indexing that Rust's table gets by adding and subtracting from a shared base address. However, this table still needs to use byte offset multiplication because the data is stored as void * so we don't have the same optimizations available to us as Rust's template generic system. Simple 0 based indexing makes the addition and multiplication we perform as simple as possible.
Data Fields | |
| void * | data |
| struct CCC_Flat_hash_map_tag * | tag |
| size_t | count |
| size_t | remain |
| size_t | mask |
| size_t | sizeof_type |
| size_t | key_offset |
| CCC_Key_comparator * | compare |
| CCC_Key_hasher * | hash |
| CCC_Allocator * | allocate |
| void * | context |
| CCC_Allocator* CCC_Flat_hash_map::allocate |
The allocation function, if any.
| CCC_Key_comparator* CCC_Flat_hash_map::compare |
The user callback for equality comparison.
| void* CCC_Flat_hash_map::context |
Auxiliary data, if any.
| size_t CCC_Flat_hash_map::count |
The number of user active slots.
| void* CCC_Flat_hash_map::data |
Reversed user type data array.
| CCC_Key_hasher* CCC_Flat_hash_map::hash |
The hash function provided by user.
| size_t CCC_Flat_hash_map::key_offset |
The location of the key field in user type.
| size_t CCC_Flat_hash_map::mask |
The mask for power of two table sizing.
| size_t CCC_Flat_hash_map::remain |
Track available slots given load factor constrains. When 0, rehash.
| size_t CCC_Flat_hash_map::sizeof_type |
Size of each user data element being stored.
| struct CCC_Flat_hash_map_tag* CCC_Flat_hash_map::tag |
Tag array on byte following data(0).