Enterprise-grade C library providing an in-flight de-duplication gate keyed only by SHA-256(body) using process-shared memory (POSIX shm_open + mmap).
- After SHA-256 finalization, `hk_gate_try_acquire()` is atomic across all processes:
  - If the lock is absent (or stale), exactly one caller gets `HK_ALLOW` and the lock is created/refreshed.
  - While the lock exists and is not stale, all callers for the same hash get `HK_DROP`.
- `hk_gate_release()` removes the lock (idempotent). Next acquire after release is `HK_ALLOW` unless another process acquires first.
- Locks older than `ttl_seconds` are considered stale, recovered, and do not block indefinitely (stale recovery is counted).
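The acquire/release/TTL rules above can be illustrated with a small single-process model. This is only a sketch of the semantics, not the library's code: every `toy_*` name, the 8-slot table, the `TOY_FULL` result, and the choice to treat a backwards clock jump as "not stale" are assumptions made for the example. The real gate performs the same decision under a process-shared shard lock.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS 8 /* hypothetical capacity, for illustration only */

enum toy_result { TOY_DROP = 0, TOY_ALLOW = 1, TOY_FULL = -1 };

struct toy_entry {
    uint8_t  hash[32];     /* SHA-256 digest used as the key */
    uint64_t acquired_sec; /* wall-clock second the lock was taken */
    int      in_use;
};

struct toy_gate {
    struct toy_entry slots[TOY_SLOTS];
    uint64_t ttl_seconds;
    uint64_t stale_recovered_count;
};

/* An entry is stale once it is strictly older than the TTL. */
static int toy_is_stale(const struct toy_entry *e, uint64_t ttl, uint64_t now_sec)
{
    return now_sec > e->acquired_sec && now_sec - e->acquired_sec > ttl;
}

static enum toy_result toy_try_acquire(struct toy_gate *g, const uint8_t hash[32],
                                       uint64_t now_sec)
{
    struct toy_entry *free_slot = NULL;
    assert(g != NULL && hash != NULL);
    for (int i = 0; i < TOY_SLOTS; i++) {
        struct toy_entry *e = &g->slots[i];
        if (!e->in_use) {
            if (free_slot == NULL) free_slot = e; /* remember first free slot */
            continue;
        }
        if (memcmp(e->hash, hash, 32) != 0) continue;
        if (!toy_is_stale(e, g->ttl_seconds, now_sec)) return TOY_DROP;
        g->stale_recovered_count++;  /* stale lock: recover and refresh it */
        e->acquired_sec = now_sec;
        return TOY_ALLOW;
    }
    if (free_slot == NULL) return TOY_FULL; /* assumption: table-full is reported */
    memcpy(free_slot->hash, hash, 32);
    free_slot->acquired_sec = now_sec;
    free_slot->in_use = 1;
    return TOY_ALLOW;
}

/* Idempotent: releasing a hash that is not locked does nothing. */
static void toy_release(struct toy_gate *g, const uint8_t hash[32])
{
    assert(g != NULL && hash != NULL);
    for (int i = 0; i < TOY_SLOTS; i++) {
        struct toy_entry *e = &g->slots[i];
        if (e->in_use && memcmp(e->hash, hash, 32) == 0) e->in_use = 0;
    }
}
```

With `ttl_seconds = 30`, an acquire at second 100 is allowed, a second acquire for the same hash at second 110 is dropped, and an acquire at second 131 is allowed again and counted as a stale recovery.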
- It is not possible to deterministically drop duplicates before reading and hashing the full body, because the key is content-derived and no client-provided early key exists.
- Therefore, multiple identical requests may hash concurrently in an unavoidable overlap window before the first finishes hashing.
- Shared memory segment layout:
  - header (magic/version/config + atomic counters)
  - array of shard locks (`pthread_mutex_t`, `PTHREAD_PROCESS_SHARED`, robust if available)
  - fixed-capacity hash table entries
- Hash table:
  - deterministic fixed-capacity open addressing per shard segment (linear probing)
  - no pointers stored in shared memory (only plain structs)
- Sharding:
  - shard = `hash[0] % shard_count`
  - each shard owns a contiguous region of `capacity_entries / shard_count`
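As a rough illustration of the sharding and probing rules, the sketch below maps a digest to a shard and to the global slot index visited at a given probe step. The function names are hypothetical, and the starting offset inside the shard (taken here from digest bytes 1 to 4) is an assumption: the text above only states that probing is linear and confined to the shard's region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shard index: first digest byte modulo the shard count. */
static size_t shard_index(const uint8_t hash[32], size_t shard_count)
{
    assert(shard_count > 0);
    return (size_t)hash[0] % shard_count;
}

/* Global entry index visited at probe step `step` for this digest.
 * Probing wraps around inside the shard's contiguous region, so a key
 * never touches entries owned by another shard. */
static size_t probe_slot(const uint8_t hash[32], size_t shard_count,
                         size_t capacity_entries, size_t step)
{
    assert(shard_count > 0 && capacity_entries >= shard_count);
    size_t per_shard = capacity_entries / shard_count;
    size_t base = shard_index(hash, shard_count) * per_shard;

    /* Assumption: start offset derived from digest bytes 1..4. */
    uint32_t h = (uint32_t)hash[1]
               | (uint32_t)hash[2] << 8
               | (uint32_t)hash[3] << 16
               | (uint32_t)hash[4] << 24;

    size_t offset = ((size_t)(h % per_shard) + step % per_shard) % per_shard;
    return base + offset;
}
```

Storing indices rather than pointers keeps the table valid when processes map the segment at different addresses. If `hash` is a byte array, `hash[0]` takes at most 256 distinct values, so under this formula a `shard_count` above 256 would leave the extra shards unused.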
Shared memory must accommodate:
    sizeof(header)
    + shard_count * sizeof(pthread_mutex_t)
    + capacity_entries * sizeof(entry)
Example: shard_count=256, capacity_entries=262144 (1024 entries/shard) uses roughly:
- mutexes: 256 * ~40 bytes (implementation-dependent)
- entries: 262144 * 64 bytes (entry struct) ≈ 16 MiB
- plus header/alignment
In practice, allocate >= 20 MiB for this example.
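The arithmetic can be checked with a small helper. This is a sketch only: `segment_size`, `struct seg_size`, the 64-byte alignment, and the 128-byte header used in the usage note are illustrative assumptions, and a real caller would pass `sizeof(pthread_mutex_t)` and the actual struct sizes from the library headers.

```c
#include <assert.h>
#include <stddef.h>

struct seg_size {
    size_t header;   /* header bytes after alignment */
    size_t mutexes;  /* shard lock array bytes after alignment */
    size_t entries;  /* hash table bytes */
    size_t total;
};

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

static struct seg_size segment_size(size_t header_size, size_t mutex_size,
                                    size_t entry_size, size_t shard_count,
                                    size_t capacity_entries)
{
    const size_t align = 64; /* assumption: sections are cache-line aligned */
    struct seg_size s;
    s.header  = round_up(header_size, align);
    s.mutexes = round_up(shard_count * mutex_size, align);
    s.entries = capacity_entries * entry_size;
    s.total   = s.header + s.mutexes + s.entries;
    return s;
}
```

For `segment_size(128, 40, 64, 256, 262144)` the mutex array is 10240 bytes and the entries are 16777216 bytes (16 MiB), for a total of 16787584 bytes, which fits comfortably inside the 20 MiB recommendation.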
- TTL source: wall-clock seconds (`time(NULL)` / caller-provided `now_sec`). If the wall clock jumps, TTL behavior follows the jump.
- Crash safety: uses robust mutexes where supported (`PTHREAD_MUTEX_ROBUST`). If a process dies while holding a shard lock, the next locker recovers (`EOWNERDEAD` + `pthread_mutex_consistent`), increments `error_count`, and continues.
  - If robust mutexes are not available, build falls back to non-robust; deadlock freedom after owner death is then not guaranteed.
- Stale recovery: on lookup during `try_acquire`, if an entry is older than TTL it is deleted and the new acquire is allowed; `stale_recovered_count` increments.
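The crash-safety behaviour relies on standard POSIX robust-mutex calls. The sketch below shows one way to set up and lock such a mutex; `shard_mutex_init` and `shard_mutex_lock` are hypothetical helper names, and the plain `unsigned long` counter stands in for the atomic `error_count` kept in the shared header. It assumes a platform that provides `pthread_mutexattr_setrobust` (for example Linux with glibc) and is typically built with `-pthread`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialise a mutex that can live in shared memory and that reports
 * the death of its owner instead of blocking forever. */
static int shard_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock the shard. If the previous owner died while holding the lock,
 * pthread_mutex_lock returns EOWNERDEAD with the lock held; mark the
 * mutex consistent, count the event, and continue. */
static int shard_mutex_lock(pthread_mutex_t *m, unsigned long *error_count)
{
    int rc;

    assert(m != NULL && error_count != NULL);
    rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* The protected data may be mid-update; a real implementation
         * would validate or repair it before carrying on. */
        rc = pthread_mutex_consistent(m);
        if (rc == 0) (*error_count)++;
    }
    return rc; /* 0 means the caller now holds the lock */
}
```

If `pthread_mutex_consistent` is not called before unlocking, later lock attempts fail with `ENOTRECOVERABLE`, which is why the recovery step belongs in the lock wrapper itself.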
Dependencies:
- OpenSSL development headers/libs (`libssl`, `libcrypto`)
- pthread
make
make test
Optional sanitizer builds:
make asan
# ThreadSanitizer generally conflicts with robust process-shared mutexes; provided for best-effort only.
make tsan
- `examples/demo_multiproc.c` – simple multi-process acquire behavior
- `examples/bench.c` – multi-process stress/throughput reporting
See include/hkgate.h.
Minimal flow:
- Stream request body into SHA-256 context via `hk_sha256_init/update/final`.
- Call `hk_gate_try_acquire()` with the finalized hash.
- If `HK_ALLOW`, process request; when complete, an external agent calls `hk_gate_release()` for that hash.