C Container Collection (CCC)
Loading...
Searching...
No Matches
private_flat_hash_map.h File Reference

Private Flat Hash Map Interface. More...

#include "../types.h"
#include "private_types.h"
Include dependency graph for private_flat_hash_map.h:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Detailed Description

Private Flat Hash Map Interface.

This flat hash map is a C Container Collection friendly interpretation of the Rust Hashbrown hash table. This in turn is based on the Abseil flat hash table from Google in C++. I simplified and modified the implementation for maximum readability in one header and one file. Tracking how to manage different platform implementations of groups and metadata fingerprint masks should be much easier this way, rather than jumping across countless small implementation files.

One key feature that is rigorously tested via static asserts is the ability to create a static data segment or stack based map. This is a key feature of the implementation but it requires significant set up ahead of time and lazy initialization support. The lazy initialization presents the map with the most complexity in the implementation.

Data Structures

struct  CCC_Flat_hash_map_tag
 
struct  CCC_Flat_hash_map
 
struct  CCC_Flat_hash_map_entry
 
union  CCC_Flat_hash_map_entry_wrap
 

Macros

#define CCC_private_flat_hash_map_declare_fixed_map( fixed_map_type_name, key_val_type_name, capacity)
 
#define CCC_private_flat_hash_map_fixed_capacity(fixed_map_type_name)    (sizeof((fixed_map_type_name){}.tag) - CCC_FLAT_HASH_MAP_GROUP_COUNT)
 
#define CCC_private_flat_hash_map_initialize( private_fixed_map_pointer, private_type_name, private_key_field, private_hash, private_key_compare, private_allocate, private_context_data, private_capacity)
 
#define CCC_private_flat_hash_map_from( private_key_field, private_hash, private_key_compare, private_allocate, private_context_data, private_optional_cap, private_array_compound_literal...)
 
#define CCC_private_flat_hash_map_with_capacity( private_type_name, private_key_field, private_hash, private_key_compare, private_allocate, private_context_data, private_cap)
 
#define CCC_private_flat_hash_map_and_modify_with( Flat_hash_map_entry_pointer, type_name, closure_over_T...)
 
#define CCC_private_flat_hash_map_or_insert_with(Flat_hash_map_entry_pointer, type_compound_literal...)
 
#define CCC_private_flat_hash_map_insert_entry_with( Flat_hash_map_entry_pointer, type_compound_literal...)
 
#define CCC_private_flat_hash_map_try_insert_with(flat_hash_map_pointer, key, type_compound_literal...)
 
#define CCC_private_flat_hash_map_insert_or_assign_with( flat_hash_map_pointer, key, type_compound_literal...)
 

Enumerations

enum  : typeof((struct CCC_Flat_hash_map_tag)
 

Functions

struct CCC_Flat_hash_map_entry CCC_private_flat_hash_map_entry (struct CCC_Flat_hash_map *, void const *)
 
void CCC_private_flat_hash_map_insert (struct CCC_Flat_hash_map *, void const *, struct CCC_Flat_hash_map_tag, size_t)
 
void CCC_private_flat_hash_map_erase (struct CCC_Flat_hash_map *, size_t)
 
void * CCC_private_flat_hash_map_data_at (struct CCC_Flat_hash_map const *, size_t)
 
void * CCC_private_flat_hash_map_key_at (struct CCC_Flat_hash_map const *, size_t)
 
void CCC_private_flat_hash_map_set_insert (struct CCC_Flat_hash_map_entry const *)
 

Variables

enum { ... }  CCC_FLAT_HASH_MAP_GROUP_COUNT = 8
 

Macro Definition Documentation

◆ CCC_private_flat_hash_map_and_modify_with

#define CCC_private_flat_hash_map_and_modify_with (   Flat_hash_map_entry_pointer,
  type_name,
  closure_over_T... 
)
Value:
(__extension__({ \
__auto_type private_flat_hash_map_mod_ent_pointer \
= (Flat_hash_map_entry_pointer); \
struct CCC_Flat_hash_map_entry private_flat_hash_map_mod_with_ent \
= {.status = CCC_ENTRY_ARGUMENT_ERROR}; \
if (private_flat_hash_map_mod_ent_pointer) \
{ \
private_flat_hash_map_mod_with_ent \
= private_flat_hash_map_mod_ent_pointer->private; \
if (private_flat_hash_map_mod_with_ent.status \
& CCC_ENTRY_OCCUPIED) \
{ \
type_name *const T = CCC_private_flat_hash_map_data_at( \
private_flat_hash_map_mod_with_ent.map, \
private_flat_hash_map_mod_with_ent.index); \
if (T) \
{ \
closure_over_T \
} \
} \
} \
private_flat_hash_map_mod_with_ent; \
}))
void * CCC_private_flat_hash_map_data_at(struct CCC_Flat_hash_map const *, size_t)
Definition: private_flat_hash_map.h:170
size_t index
Definition: private_flat_hash_map.h:174
enum CCC_Entry_status status
Definition: private_flat_hash_map.h:178
struct CCC_Flat_hash_map * map
Definition: private_flat_hash_map.h:172

A fairly good approximation of closures given C23 capabilities. The user facing docs clarify that T is a correctly typed reference to the desired data if occupied.

◆ CCC_private_flat_hash_map_declare_fixed_map

#define CCC_private_flat_hash_map_declare_fixed_map (   fixed_map_type_name,
  key_val_type_name,
  capacity 
)
Value:
static_assert((capacity) > 0, \
"fixed size map must have capacity greater than 0"); \
static_assert( \
"fixed size map must have capacity >= CCC_FLAT_HASH_MAP_GROUP_COUNT " \
"(8 or 16 depending on platform)"); \
static_assert(((capacity) & ((capacity) - 1)) == 0, \
"fixed size map must be a power of 2 capacity (32, 64, " \
"128, 256, etc.)"); \
typedef struct \
{ \
key_val_type_name data[(capacity) + 1]; \
tag[(capacity) + CCC_FLAT_HASH_MAP_GROUP_COUNT]; \
}(fixed_map_type_name)
enum @6 CCC_FLAT_HASH_MAP_GROUP_COUNT
Definition: private_flat_hash_map.h:81

Helps the user declare a type for a fixed size map. They can then use this type when they want a hash map as global, static global, or stack local. They would need to define their fixed size type every time but that should be fine as they are likely to only declare one or two. They would likely only have a one fixed size map per translation unit if they are using these capabilities. They control the name of the type so they can organize types as they wish.

The declaration specifies that we have one extra data slot for swapping during in place rehashing and some interface functions and an extra duplicate group of tags at the end of the tag array for safer group loading.

Finally, we must align the tag array to start on an aligned group size byte boundary to be able to perform aligned loads and stores.

◆ CCC_private_flat_hash_map_fixed_capacity

#define CCC_private_flat_hash_map_fixed_capacity (   fixed_map_type_name)     (sizeof((fixed_map_type_name){}.tag) - CCC_FLAT_HASH_MAP_GROUP_COUNT)

If the user does not want to remember the capacity they chose for their type or make mistakes this macro offers consistent calculation of total capacity (aka buckets) of the map. This is not the capacity that is limited by load factor.

The sizeof operator does not decay to a simple pointer here because the tag array of a fixed type has a length known at compile time. Also the tag array is simple byte sized chunks so no division needed. See earlier static asserts for how we ensure no fixed size type is allowed to be defined in a way to make this call unsafe.

◆ CCC_private_flat_hash_map_from

#define CCC_private_flat_hash_map_from (   private_key_field,
  private_hash,
  private_key_compare,
  private_allocate,
  private_context_data,
  private_optional_cap,
  private_array_compound_literal... 
)

Initialize dynamic container with a compound literal array.

◆ CCC_private_flat_hash_map_initialize

#define CCC_private_flat_hash_map_initialize (   private_fixed_map_pointer,
  private_type_name,
  private_key_field,
  private_hash,
  private_key_compare,
  private_allocate,
  private_context_data,
  private_capacity 
)
Value:
{ \
.data = (private_fixed_map_pointer), \
.tag = NULL, \
.count = 0, \
.remain = (((private_capacity) / (size_t)8) * (size_t)7), \
.mask \
= (((private_capacity) > (size_t)0) ? ((private_capacity) - (size_t)1) \
: (size_t)0), \
.sizeof_type = sizeof(private_type_name), \
.key_offset = offsetof(private_type_name, private_key_field), \
.compare = (private_key_compare), \
.hash = (private_hash), \
.allocate = (private_allocate), \
.context = (private_context_data), \
}

Initialization is tricky but we simplify by only accepting a pointer to the map this pointer could be any of the following.

- The address of a user defined fixed size map stored in data segment.
- The address of a user defined fixed size map stored on the stack.
- The address of a user defined fixed size map allocated on the heap.
- NULL if the user intends for a dynamic map.

All of the above cases are covered by accepting the pointer at .data and only evaluating the argument once. This also allows the user to pass a compound literal to the first argument and eliminate any dangling references, such as &(static user_defined_map_type){}. However, to accept a map from all of these sources at compile or runtime, we must implement lazy initialization. This is because we can't initialize the tag array at compile time. By setting the tag field to NULL we will be able to tell if our map is initialized whether it is fixed size and has data or is dynamic and has not yet been given allocation.

◆ CCC_private_flat_hash_map_insert_entry_with

#define CCC_private_flat_hash_map_insert_entry_with (   Flat_hash_map_entry_pointer,
  type_compound_literal... 
)
Value:
(__extension__({ \
__auto_type private_flat_hash_map_ins_ent_pointer \
= (Flat_hash_map_entry_pointer); \
typeof(type_compound_literal) *private_flat_hash_map_ins_ent_res \
= NULL; \
if (private_flat_hash_map_ins_ent_pointer) \
{ \
if (!(private_flat_hash_map_ins_ent_pointer->private.status \
& CCC_ENTRY_INSERT_ERROR)) \
{ \
private_flat_hash_map_ins_ent_res \
private_flat_hash_map_ins_ent_pointer->private.map, \
private_flat_hash_map_ins_ent_pointer->private.index); \
*private_flat_hash_map_ins_ent_res = type_compound_literal; \
if (private_flat_hash_map_ins_ent_pointer->private.status \
== CCC_ENTRY_VACANT) \
{ \
CCC_private_flat_hash_map_set_insert( \
&private_flat_hash_map_ins_ent_pointer->private); \
} \
} \
} \
private_flat_hash_map_ins_ent_res; \
}))

Insert entry also should not fail and therefore returns a reference directly. This is similar to insert or assign where overwriting may occur.

◆ CCC_private_flat_hash_map_insert_or_assign_with

#define CCC_private_flat_hash_map_insert_or_assign_with (   flat_hash_map_pointer,
  key,
  type_compound_literal... 
)

Because this function does not start with an entry it has the option to give user more information and therefore returns an entry. Importantly, this function makes sure the key is in sync with key in table. Similar to insert entry this will overwrite.

◆ CCC_private_flat_hash_map_or_insert_with

#define CCC_private_flat_hash_map_or_insert_with (   Flat_hash_map_entry_pointer,
  type_compound_literal... 
)
Value:
(__extension__({ \
__auto_type private_flat_hash_map_or_ins_ent_pointer \
= (Flat_hash_map_entry_pointer); \
typeof(type_compound_literal) *private_flat_hash_map_or_ins_res \
= NULL; \
if (private_flat_hash_map_or_ins_ent_pointer) \
{ \
if (!(private_flat_hash_map_or_ins_ent_pointer->private.status \
& CCC_ENTRY_INSERT_ERROR)) \
{ \
private_flat_hash_map_or_ins_res \
private_flat_hash_map_or_ins_ent_pointer->private.map, \
private_flat_hash_map_or_ins_ent_pointer->private \
.index); \
if (private_flat_hash_map_or_ins_ent_pointer->private.status \
== CCC_ENTRY_VACANT) \
{ \
*private_flat_hash_map_or_ins_res = type_compound_literal; \
CCC_private_flat_hash_map_set_insert( \
&private_flat_hash_map_or_ins_ent_pointer->private); \
} \
} \
} \
private_flat_hash_map_or_ins_res; \
}))

The or insert method is unique in that it directly returns a reference to the inserted data rather than a entry with a status. This is because it should not fail. If NULL is returned the user knows there is a problem.

◆ CCC_private_flat_hash_map_try_insert_with

#define CCC_private_flat_hash_map_try_insert_with (   flat_hash_map_pointer,
  key,
  type_compound_literal... 
)

Because this function does not start with an entry it has the option to give user more information and therefore returns an entry. Importantly, this function makes sure the key is in sync with key in table.

◆ CCC_private_flat_hash_map_with_capacity

#define CCC_private_flat_hash_map_with_capacity (   private_type_name,
  private_key_field,
  private_hash,
  private_key_compare,
  private_allocate,
  private_context_data,
  private_cap 
)
Value:
(__extension__({ \
struct CCC_Flat_hash_map private_map \
NULL, private_type_name, private_key_field, private_hash, \
private_key_compare, private_allocate, private_context_data, \
0); \
(void)CCC_flat_hash_map_reserve(&private_map, private_cap, \
private_allocate); \
private_map; \
}))
CCC_Result CCC_flat_hash_map_reserve(CCC_Flat_hash_map *map, size_t to_add, CCC_Allocator *allocate)
Reserve space required to add a specified number of elements to the map. If the current capacity is s...
#define CCC_private_flat_hash_map_initialize( private_fixed_map_pointer, private_type_name, private_key_field, private_hash, private_key_compare, private_allocate, private_context_data, private_capacity)
Definition: private_flat_hash_map.h:276
Definition: private_flat_hash_map.h:142

Initializes the flat hash map with the specified capacity.

Enumeration Type Documentation

◆ anonymous enum

anonymous enum : typeof((struct CCC_Flat_hash_map_tag)

Vectorized group scanning allows more parallel scans but a fallback of 8 is good for a portable implementation that will use the widest word on a platform for group scanning. Right now, this lib targets 64-bit so that means uint64_t is widest default integer widely supported. That width is still valid on 32-bit but probably very slow due to emulation.

Function Documentation

◆ CCC_private_flat_hash_map_data_at()

void * CCC_private_flat_hash_map_data_at ( struct CCC_Flat_hash_map const *  ,
size_t   
)

◆ CCC_private_flat_hash_map_entry()

struct CCC_Flat_hash_map_entry CCC_private_flat_hash_map_entry ( struct CCC_Flat_hash_map ,
void const *   
)

◆ CCC_private_flat_hash_map_erase()

void CCC_private_flat_hash_map_erase ( struct CCC_Flat_hash_map ,
size_t   
)

◆ CCC_private_flat_hash_map_insert()

void CCC_private_flat_hash_map_insert ( struct CCC_Flat_hash_map ,
void const *  ,
struct CCC_Flat_hash_map_tag  ,
size_t   
)

◆ CCC_private_flat_hash_map_key_at()

void * CCC_private_flat_hash_map_key_at ( struct CCC_Flat_hash_map const *  ,
size_t   
)

◆ CCC_private_flat_hash_map_set_insert()

void CCC_private_flat_hash_map_set_insert ( struct CCC_Flat_hash_map_entry const *  )

Variable Documentation

◆ 

enum { ... } CCC_FLAT_HASH_MAP_GROUP_COUNT

Vectorized group scanning allows more parallel scans but a fallback of 8 is good for a portable implementation that will use the widest word on a platform for group scanning. Right now, this lib targets 64-bit so that means uint64_t is widest default integer widely supported. That width is still valid on 32-bit but probably very slow due to emulation. A group of tags that can be loded into a 64 bit integer.