
Add the PivCo-Huffman encode/decode kernels (#849)

Open: terrelln wants to merge 6 commits into facebook:dev from terrelln:export-D109882916


@terrelln terrelln commented Jun 26, 2026

Contributor

Summary:

Implements the PivCo-Huffman kernel on top of the architecture kernel interface.

`common_pivco_kernel.{h,c}` provides the shared coding tree `ZL_PivCoHuffmanTree`, which is the core of the algorithm. Both encoding and decoding perform a pre-order traversal of this tree and follow the instructions it gives.

`decode_pivco_kernel.{h,c}` provides `ZL_PivCoHuffman_decode`; the core complexity is in `decodeNode`.

`encode_pivco_kernel.{h,c}` provides `ZL_PivCoHuffman_encode`; the core complexity is in `encodeNode`.

Differential Revision: D109882916


meta-codesync Bot commented Jun 26, 2026


@terrelln has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109882916.

@meta-codesync meta-codesync Bot changed the title Add the Huffman Pivoted encode/decode kernels Jun 29, 2026
terrelln added a commit to terrelln/openzl that referenced this pull request Jun 29, 2026
Summary:

Implements the Huffman Pivoted kernel on top of the architecture kernel interface.

`common_pivoted_kernel.{h,c}` provides the shared coding tree `ZL_HuffmanPivotedTree`, which is the core of the algorithm. Both encoding and decoding perform a pre-order traversal of this tree and follow the instructions it gives.

`decode_pivoted_kernel.{h,c}` provides `ZL_HuffmanPivoted_decode`; the core complexity is in `decodeNode`.

`encode_pivoted_kernel.{h,c}` provides `ZL_HuffmanPivoted_encode`; the core complexity is in `encodeNode`.

Differential Revision: D109882916
@terrelln terrelln force-pushed the export-D109882916 branch 2 times, most recently from 0f4358e to cd87eba Compare June 29, 2026 18:19
terrelln added a commit to terrelln/openzl that referenced this pull request Jun 29, 2026
@terrelln terrelln force-pushed the export-D109882916 branch from cd87eba to e094433 Compare June 29, 2026 18:30
terrelln added a commit to terrelln/openzl that referenced this pull request Jun 29, 2026
@terrelln terrelln force-pushed the export-D109882916 branch from e094433 to 3d1b935 Compare June 29, 2026 18:41
terrelln added a commit to terrelln/openzl that referenced this pull request Jun 29, 2026
@terrelln terrelln closed this Jun 29, 2026
@terrelln terrelln reopened this Jun 30, 2026
terrelln added 6 commits June 30, 2026 14:09
Summary:

As title

Reviewed By: daniellerozenblit

Differential Revision: D109877947

Summary:

Adds byte-aligned bit reservation and pop primitives to the FF bitstream, needed by the upcoming PivCo-Huffman kernels:

- `ZS_BitCStreamFF_reserveAlignedBits` / `ZS_BitCStreamFF_commitReservedBits` on the write side.
- `ZS_BitDStreamFF_popAlignedBits` on the read side, plus a `begin` cursor in `ZS_BitDStreamFF` so the reader can compute byte offsets.
- `ZS_BitDStreamFF_finish` now returns the number of bytes consumed.
- Hardens `ZS_BitCStreamFF_init`/`_finish`/`_flush` against small buffers by avoiding pointer arithmetic that could underflow.
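
To make the reserve/commit/pop idea concrete, here is a minimal sketch of byte-aligned primitives. It is a toy under stated assumptions, not the actual `ZS_BitCStreamFF` / `ZS_BitDStreamFF` code: the `ToyWriter` and `ToyReader` types and every `toy_*` function are hypothetical names, and the real primitives may have different signatures and semantics. It also illustrates one way to avoid underflowing pointer arithmetic on small buffers, by comparing sizes rather than forming a pointer such as `end - nbBytes`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; NOT the real ZS_BitCStreamFF / ZS_BitDStreamFF. */
typedef struct {
    uint8_t* buf; /* start of the output buffer */
    size_t cap;   /* total capacity in bytes */
    size_t pos;   /* bytes committed so far; invariant: pos <= cap */
} ToyWriter;

typedef struct {
    const uint8_t* begin; /* start of the input, kept so byte offsets can be computed */
    size_t size;          /* total input size in bytes */
    size_t pos;           /* bytes consumed so far; invariant: pos <= size */
} ToyReader;

/* Round a bit count up to whole bytes without risking overflow in nbBits + 7. */
static size_t toy_bitsToBytes(size_t nbBits)
{
    return nbBits / 8 + (nbBits % 8 != 0);
}

/* Return a pointer to a byte-aligned region large enough for nbBits, or NULL
 * if it does not fit. Nothing is consumed until toy_commit() is called.
 * The check compares sizes (cap - pos cannot underflow because pos <= cap)
 * instead of computing a pointer that could land before the buffer. */
static uint8_t* toy_reserveAligned(ToyWriter* w, size_t nbBits)
{
    assert(w->pos <= w->cap);
    if (toy_bitsToBytes(nbBits) > w->cap - w->pos) return NULL;
    return w->buf + w->pos;
}

/* Commit the bits actually written into the reserved region.
 * Returns 0 on success, -1 if the region would not fit. */
static int toy_commit(ToyWriter* w, size_t nbBits)
{
    size_t const nbBytes = toy_bitsToBytes(nbBits);
    assert(w->pos <= w->cap);
    if (nbBytes > w->cap - w->pos) return -1;
    w->pos += nbBytes;
    return 0;
}

/* Reader side: return a pointer to the next nbBits (byte-aligned) and skip
 * past them, or return NULL if the input is too short. */
static const uint8_t* toy_popAligned(ToyReader* r, size_t nbBits)
{
    size_t const nbBytes = toy_bitsToBytes(nbBits);
    const uint8_t* p;
    assert(r->pos <= r->size);
    if (nbBytes > r->size - r->pos) return NULL;
    p = r->begin + r->pos;
    r->pos += nbBytes;
    return p;
}
```

In this sketch a caller reserves space, writes directly into the returned region, and then commits the number of bits it actually produced; the reader recovers its byte offset as `pos` relative to `begin`.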

Differential Revision: D109878843

Summary:

Introduces the architecture-specific kernel interface for the PivCo-Huffman codec along with a portable, correctness-oriented generic implementation and the kernel test suite.

The `ZL_PivCoHuffmanEncode` and `ZL_PivCoHuffmanDecode` structs provide an interface for cleanly swapping between implementations based on the CPU. The `select()` function will pick the best implementation for the current CPU once architecture-specific implementations are added; for now it just returns the generic implementation.
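
As a rough illustration of this dispatch pattern, the sketch below uses a struct of function pointers plus a selector. The fields of the real `ZL_PivCoHuffmanEncode` struct are not shown in this PR text, so `ToyEncodeKernel`, `toy_encodeGeneric`, and `toy_selectEncodeKernel` are hypothetical names, and the "generic" kernel here is just a byte copy standing in for a portable implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical kernel interface; NOT the real ZL_PivCoHuffmanEncode layout. */
typedef struct {
    const char* name;
    /* Returns the number of bytes written, or 0 if dst is too small. */
    size_t (*encode)(uint8_t* dst, size_t dstCap,
                     const uint8_t* src, size_t srcSize);
} ToyEncodeKernel;

/* Placeholder "generic" kernel: copies the input unchanged.
 * It stands in for a portable, correctness-oriented implementation. */
static size_t toy_encodeGeneric(uint8_t* dst, size_t dstCap,
                                const uint8_t* src, size_t srcSize)
{
    assert(dst != NULL && src != NULL);
    if (srcSize > dstCap) return 0;
    memcpy(dst, src, srcSize);
    return srcSize;
}

static const ToyEncodeKernel kToyGeneric = { "generic", toy_encodeGeneric };

/* Select a kernel for the current CPU. No architecture-specific kernels
 * exist in this sketch, so it always returns the generic one; a later
 * version could test CPU features here and return a specialized kernel. */
static const ToyEncodeKernel* toy_selectEncodeKernel(void)
{
    return &kToyGeneric;
}
```

Callers only ever go through the pointer returned by the selector, so adding a specialized kernel later would not change call sites.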

These are the kernels defined in the [PivCo Huffman paper](https://arxiv.org/pdf/2606.05765). The only difference is that the encode side is modified, following the same strategy as the [upstream repo](https://github.com/MarcinZukowski/pivco-huffman), to operate on `uint8_t rank` values rather than `uint16_t codeword` values. The reasoning for how that change works is explained in the next diff; the benefit is working on `uint8_t` instead of `uint16_t`.

Note that some buffers require `SLOP` bytes at the end, and `partitionFull` has an odd aliasing requirement. These constraints are imposed by the full implementation in the next diff.
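
The PR text does not say why `SLOP` bytes are needed. One common reason for such a requirement, shown here purely as an assumption, is that a fast loop processes fixed-size chunks and may touch a few bytes past the logical end of a buffer. In this sketch `TOY_SLOP` and `toy_copyWithSlop` are hypothetical, and the value 8 is illustrative rather than the real `SLOP` size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value; the PR does not state the real SLOP size. */
#define TOY_SLOP 8

/* Copy n bytes using fixed 8-byte chunks. The final chunk may read and write
 * up to 7 bytes past n, so both buffers must have at least n + TOY_SLOP
 * addressable bytes. The buffers must not overlap. */
static void toy_copyWithSlop(uint8_t* dst, const uint8_t* src, size_t n)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i += 8) {
        memcpy(dst + i, src + i, 8);
    }
}
```

The trade-off is that the loop has no tail handling, at the cost of requiring callers to over-allocate; only the first `n` bytes of the destination are meaningful afterwards.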

Differential Revision: D109878842

Summary: Adds `micro_pivco_huffman.cpp` and wires it into the micro benchmark harness (`micro_bench.{cpp,h}`) to measure the PivCo-Huffman encode/decode kernels.

Differential Revision: D110077474

Summary:

Implements the PivCo-Huffman kernel on top of the architecture kernel interface.

`common_pivco_kernel.{h,c}` provides the shared coding tree `ZL_PivCoHuffmanTree`, which is the core of the algorithm. Both encoding and decoding perform a pre-order traversal of this tree and follow the instructions it gives.

`decode_pivco_kernel.{h,c}` provides `ZL_PivCoHuffman_decode`; the core complexity is in `decodeNode`.

`encode_pivco_kernel.{h,c}` provides `ZL_PivCoHuffman_encode`; the core complexity is in `encodeNode`.
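
The shared pre-order traversal mentioned above can be sketched generically as follows. This is not the layout of `ZL_PivCoHuffmanTree` or the logic of `encodeNode` / `decodeNode`, none of which appear in this PR text: `ToyNode`, `toy_preorder`, `ToyTrace`, and `toy_recordVisit` are hypothetical, and the per-node callback merely records the visit order where a real kernel would do its encode or decode work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: children are indices into the node array. */
typedef struct {
    int id;    /* label used only to show the visit order */
    int left;  /* index of the left child, or -1 if none */
    int right; /* index of the right child, or -1 if none */
} ToyNode;

/* Per-node callback; an encoder and a decoder would each supply their own. */
typedef void (*ToyVisitFn)(const ToyNode* node, void* ctx);

/* Pre-order traversal: visit the node first, then its left subtree,
 * then its right subtree. */
static void toy_preorder(const ToyNode* nodes, int idx,
                         ToyVisitFn visit, void* ctx)
{
    if (idx < 0) return;
    visit(&nodes[idx], ctx);
    toy_preorder(nodes, nodes[idx].left, visit, ctx);
    toy_preorder(nodes, nodes[idx].right, visit, ctx);
}

#define TOY_TRACE_MAX 16

/* Records the ids of visited nodes, in order. */
typedef struct {
    int ids[TOY_TRACE_MAX];
    size_t count;
} ToyTrace;

static void toy_recordVisit(const ToyNode* node, void* ctx)
{
    ToyTrace* t = (ToyTrace*)ctx;
    assert(t->count < TOY_TRACE_MAX);
    t->ids[t->count++] = node->id;
}
```

Because both sides would walk the same tree with the same traversal, they process nodes in the same order; for a root 0 with children 1 and 2, where node 1 has children 3 and 4, the recorded order is 0, 1, 3, 4, 2.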

Differential Revision: D109882916
@meta-codesync meta-codesync Bot changed the title Add the Huffman Pivoted encode/decode kernels (#849) Jun 30, 2026
@terrelln terrelln force-pushed the export-D109882916 branch from 3d1b935 to ffeb7c6 Compare June 30, 2026 21:11