Skip to content

[repo-assist] perf: use bottom-up construction in Heap.ofSeq, extract mergeData helper#242

Open
github-actions[bot] wants to merge 2 commits into
masterfrom
repo-assist/perf-heap-ofseq-2026-03-11-ccea0939034d3b00
Open

[repo-assist] perf: use bottom-up construction in Heap.ofSeq, extract mergeData helper#242
github-actions[bot] wants to merge 2 commits into
masterfrom
repo-assist/perf-heap-ofseq-2026-03-11-ccea0939034d3b00

Conversation

@github-actions

Copy link
Copy Markdown
Contributor

🤖 This PR was created by Repo Assist, an automated AI assistant.

Addresses the performance issue described in #166 ("Heap.Tail slow").

Root cause

The previous ofSeq built the heap by a sequential left-to-right fold, treating each new element as a merge with the current root. For sorted ascending input on a min-heap, every new element is larger than the root, so it is prepended as a direct child of the root. After inserting n elements this creates:

T(min, [T(n,[]);  T(n-1,[]); …; T(2,[])])
```

The root has O(n) direct children. The first call to `Tail()` (delete-min) must pair and merge all _n−1_ children, costing **O(n)** for that first call instead of the O(log n) amortised bound expected by users.

## Fix

Replace the sequential fold with **bottom-up heap construction**:

1. Create _n_ singleton heaps from the input array.
2. Repeatedly merge adjacent pairs; each pass halves the count.
3. After ⌈log₂ n⌉ passes the result is a single balanced heap of height O(log n).

The first `Tail()` on this balanced heap costs O(log n) regardless of input ordering.

This is the same bottom-up technique used for heapsort and efficient priority-queue initialisation.

## Additional refactor

The `mergeData` logic (merging two `HeapData` values) was duplicated as a local function inside `Tail()`. It is now extracted as a **private static helper** (`Heap.mergeData`) shared by both `ofSeq` and `Tail()`, and the existing `merge` wrapper delegates to it.

## Trade-offs

| Aspect | Before | After |
|--------|--------|-------|
| First `Tail()` on sorted input | O(n) | O(log n) |
| `ofSeq` allocation | O(n) list nodes | O(n) array slots + O(n) list nodes (same asymptote) |
| Streaming input | Yes (lazy fold) | No — `Array.ofSeq` materialises first |
| Code clarity | Inline `mergeData` duplicated | Shared helper |

The materialisation cost of `Array.ofSeq` is O(n) time and space — the same as the existing fold — so the net allocation is unchanged. For users who do not call `Tail()` (e.g., only `Head`/`Peek`), the build is equally fast.

## Test Status

All existing tests pass: **706 passed, 0 failed** (6 skipped, pre-existing).

```
Passed!  - Failed: 0, Passed: 706, Skipped: 6, Total: 712

Format check (dotnet fantomas --check) passes with no changes needed.

Closes #166

Generated by Repo Assist ·

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@346204513ecfa08b81566450d7d599556807389f
The previous ofSeq implementation built the heap by sequential left-to-right
merges. For sorted input (ascending for a min-heap), every new element was
appended as a direct child of the root, producing a root with O(n) children
after n insertions. The first Tail() call then had to merge all those children,
costing O(n) instead of the amortised O(log n) expected by users (see #166).

New approach: build n singleton heaps from the input array and repeatedly
merge adjacent pairs (bottom-up construction). Each pass halves the number of
heaps; log₂(n) passes produce a balanced heap whose height is O(log n). The
first Tail() on such a heap costs O(log n) regardless of the input ordering.

Also refactors the inline mergeData function (previously duplicated inside
Tail()) into a private static helper Heap.mergeData, used by both Tail() and
the new ofSeq, and simplifies the static merge wrapper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dsyme dsyme marked this pull request as ready for review March 16, 2026 13:44
@dsyme

dsyme commented Mar 17, 2026

Copy link
Copy Markdown
Contributor

/repo-assist Add testing for this, and open a separate PR to add an AGENTS.md of about 100 lines the includes instructions to add comprehensive tests for bug fixes and also do code formatting etc.

@dsyme

dsyme commented May 13, 2026

Copy link
Copy Markdown
Contributor

/repo-assist Add testing for this as part of this PR, and open a separate PR to add an AGENTS.md of about 100 lines the includes instructions to add comprehensive tests for bug fixes and also do code formatting etc.

@dsyme dsyme changed the title [Repo Assist] perf: use bottom-up construction in Heap.ofSeq, extract mergeData helper May 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment