Skip to content

Add geodesic polygon coverage mode for polygonToCellsExperimental#1052

Open
holoskii wants to merge 39 commits intouber:masterfrom
holoskii:geodesic_coverage
Open

Add geodesic polygon coverage mode for polygonToCellsExperimental#1052
holoskii wants to merge 39 commits intouber:masterfrom
holoskii:geodesic_coverage

Conversation

@holoskii
Copy link
Copy Markdown

@holoskii holoskii commented Oct 5, 2025

Introduce a geodesic flag for polygonToCellsExperimental that interprets
polygon edges as great-circle arcs instead of planar segments. This is
critical for accurate coverage of very large polygons spanning hundreds
or thousands of kilometers, where planar interpolation diverges
significantly from the true spherical geometry.

The implementation operates entirely on the unit sphere using 3D
Cartesian coordinates, with sphere-cap and AABB acceleration structures
for efficient cell pruning. Supports CONTAINMENT_FULL and
CONTAINMENT_OVERLAPPING modes. This mode is slower than the planar
algorithm, so lower resolutions are recommended.

Key changes:

  • New FLAG_GEODESIC_MASK flag bit and FLAG_SET_GEODESIC/RESET macros
  • GeodesicPolygon/GeodesicEdge/GeodesicLoop internal representations
  • Sphere cap bounding with precomputed per-resolution cosine tables
  • 3D AABB and great-circle edge crossing tests for containment
  • Vec3d utility functions (dot, cross, normalize, etc.)
  • cellToVec3/vec3dToCell/cellToGeodesicBoundary conversions
  • Dedicated fuzzer, benchmarks, and comprehensive test suite

Developed at FloeDB for internal use; contributed upstream.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Oct 5, 2025

CLA assistant check
All committers have signed the CLA.

@coveralls
Copy link
Copy Markdown

coveralls commented Oct 6, 2025

Coverage Status

coverage: 98.543% (-0.6%) from 99.095%
when pulling fb857a1 on holoskii:geodesic_coverage
into 1d5346b on uber:master.

@ajfriend
Copy link
Copy Markdown
Collaborator

ajfriend commented Oct 9, 2025

Thanks @holoskii for the contribution! Geodesic polyfill is an exciting addition that I think a lot of folks would use, myself included!

Please bear with us as we try to find the time to properly review this PR. Some initial questions from my end:

  • Is it fair to characterize this PR as providing two major improvements:
    1. geodesic mode for "polyfill", for polygons that would otherwise already be well-handled in the existing planar mode in polygonToCellsExperimental, and
    2. the ability to do polyfill on polygons larger or tricker (e.g. around poles or the antimeridian) than those that could be handled in the previous algorithm (and, if so, would these improvements also generalize to the existing planar mode?)
  • This may become obvious after I'm able to review more closely, but how much code can we share between the planar and geodesic algorithms?
  • Is it necessary to expose Vec3d and GeodesicCellBoundary in the public api (h3api.h.in), or can we keep those structs internal?
  • You mentioned that this mode is slower than the existing algorithm. How much slower? Do you have any example benchmarks you could share?
  • Any thoughts on what to do about the failing CI checks? Or would you like some input from the team?
Copy link
Copy Markdown
Collaborator

@isaacbrodsky isaacbrodsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution and opening this PR! Please let me know if you have questions. I am beginning to review the contribution and will add more comments as I read & test.

@ajfriend ajfriend added this to the v4.5 milestone Oct 16, 2025
@holoskii
Copy link
Copy Markdown
Author

Fixes / changes

  1. Introduce an explicit epsilon for the SphereCap check. The previous iteration relied on tolerance built into precomputed cosine values; in practice, we also need an additional epsilon.
  2. Use GeodesicPolygon* instead of void* in the iterator structure.
  3. Fix the MSVC build by using a macro instead of a constant.
  4. Move GeodesicCellBoundary and Vec3d to their own headers instead of defining them in the public API header.
  5. Move internal headers to the include directory.
  6. Add a new geodesic benchmark.
  7. Rebase on the latest master.

Performance

I added a dedicated geodesic benchmark that exercises small, medium, and large shapes.

  • Small cells: up to 60× slower.
  • Medium cells / large shapes: ~20×.
  • “Primary” geodesic use case (huge shapes + large cells, e.g., NY ↔ London great-circle path): ~2.5×.

Notes:

  • At Yellowbrick Data, we’ve seen ~10× performance improvements by pushing heavy inlining (by moving code to headers), branch hints, etc. I avoided invasive changes here to keep code style intact; the current PR aims for minimal intrusion.
@holoskii
Copy link
Copy Markdown
Author

Thanks @holoskii for the contribution! Geodesic polyfill is an exciting addition that I think a lot of folks would use, myself included!

Please bear with us as we try to find the time to properly review this PR. Some initial questions from my end:

* Is it fair to characterize this PR as providing two major improvements:
  
  1. geodesic mode for "polyfill", for polygons that would otherwise already be well-handled in the existing planar mode in `polygonToCellsExperimental`, and
  2. the ability to do polyfill on polygons larger or tricker (e.g. around poles or the antimeridian) than those that could be handled in the previous algorithm (and, if so, would these improvements also generalize to the existing planar mode?)

* This may become obvious after I'm able to review more closely, but how much code can we share between the planar and geodesic algorithms?

* Is it necessary to expose `Vec3d` and `GeodesicCellBoundary` in the public api (`h3api.h.in`), or can we keep those structs internal?

* You mentioned that this mode is slower than the existing algorithm. How much slower? Do you have any example benchmarks you could share?

* Any thoughts on what to do about the failing CI checks? Or would you like some input from the team?

1.1 - Yes.

1.2 - The geodesic approach behaves better for large polygons and near poles/antimeridian due to great-circle behavior, but it’s architecturally different from the planar path; there isn’t a practical feature we can “back-port” without re-implementing substantial pieces.

2 - We only share FaceIjk and iterator code between the two implementations. Intersection/containment logic and related optimization structures are written from scratch for the geodesic mode.

3 - Performance metrics are now included in the main PR comment (see “Performance” above) with the new benchmark.

@holoskii holoskii requested a review from isaacbrodsky October 27, 2025 09:48
@holoskii
Copy link
Copy Markdown
Author

A few small fixes for the CI:

  • examples did not compile because of "constants.h" included in h3api.h. Removed that include to keep api header standalone
  • fuzzer hit a legitimate timeout. Fixed this by limiting resolution and number of points for geodesic runs
@holoskii
Copy link
Copy Markdown
Author

holoskii commented Nov 3, 2025

@isaacbrodsky was right, separate fuzzer is a way to go. Old fuzzers left not enough time until timeout in each iteration. In the last commit I've added a separate geodesic fuzzer. PR is now fully ready for review

@holoskii
Copy link
Copy Markdown
Author

Hi @nrabinowitz @dfellis @isaacbrodsky - friendly bump on this.
Current state: fixed CI issues, added benchmarks, answered questions and addressed the comments.

@isaacbrodsky
Copy link
Copy Markdown
Collaborator

Hi @nrabinowitz @dfellis @isaacbrodsky - friendly bump on this. Current state: fixed CI issues, added benchmarks, answered questions and addressed the comments.

Thanks for updating the PR and for the bump! We appreciate the contribution and know we need to review this, so it's just a matter of finding time to read through it in depth. We would like to bring this in for the next release of H3.

Comment on lines +40 to +43
if (iter->_polygon->geoloop.numVerts <= 0) {
iterErrorPolygonCompact(iter, E_DOMAIN);
return NULL;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a zero-vertex geodesic polygon is not allowed?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Algorithm is complex and needs a lot of setup and auxiliary structures, which only work for valid geo loops. Easier to error out right away

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually take this further and require that every loop (outer or hole) have at least 3 vertices, raising an error otherwise. Since it isn't clear what an algorithm should do in the degenerate case of 0, 1, or 2 vertices, I think its easier to eliminate this scenario from consideration entirely.

This would also align with the GeoJSON spec, RFC 7946, Section 3.1.6: "A linear ring is a closed LineString with four or more positions". (We don't repeat the closing vertex, so we can do 3.)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During testing, I found a use case for two-point 'polygons,' such as long-distance flight paths or marine trajectories. The algorithm correctly covers these start-to-end paths. There is a test case that covers that use case: polygonToCellsLondonNY

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paths are a totally valid use case, but I'd treat that separate from polygons. Given that this function has polygon in the name, I'd prefer it to be restricted to just that.

A separate function that handles lines would be more than welcome, tho! And, hopefully, it wouldn't be much work to add one, since I'm assuming we can reuse a lot of the machinery from this function.

Side note: In that test, it looks like you repeat the closing vertex for three total points, so you have London -> NY -> London. If you want to test two-point "polygons", should that only be two?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this: only loops with >= 3 vertexes are allowed. I kept the test with London -> NY -> London as an edge case of 0-area polygon (as we do not reject those)

@ajfriend ajfriend modified the milestones: v4.5, v4.6 Feb 11, 2026
@isaacbrodsky
Copy link
Copy Markdown
Collaborator

@holoskii Thanks for the update! I think CI is failing on the formatting assertion (needs make format with clang-format-14)

@holoskii
Copy link
Copy Markdown
Author

@holoskii Thanks for the update! I think CI is failing on the formatting assertion (needs make format with clang-format-14)

My update is a bit overdue, this year has been very busy so far 😅
I will address a few more comments (missed them in my first push), and then will use the correct formatter (I probably used wrong version)

@holoskii
Copy link
Copy Markdown
Author

Resolved final issues, formatted with version 14. Should be ready for the CI

2 small issues to raise after this PR is merged:

@holoskii
Copy link
Copy Markdown
Author

Windows workflow failed because on 32-bit Windows systems, the math library doesn't handle NaN inputs correctly. Instead of ignoring NaN data when building a bounding box, it lets the NaN poison the BBOX. This leads to us hitting NaN down the road and failing later.

I added a check at the very beginning, during polygon creation to catch invalid coordinates (NaN or Infinity). Now, we reject bad input right away with a clear error code (E_DOMAIN) instead of letting it break things further down the line.

@holoskii holoskii changed the title Geodesic polygon coverage Feb 16, 2026
@holoskii
Copy link
Copy Markdown
Author

Updated PR title and description, should be ready for the merge

holoskii and others added 18 commits March 13, 2026 14:24
- move test to appropriate file
- move comment to appropriate file
…ygon edges

Two floating-point computation paths for hex→Vec3d conversion
(direct to vec3 vs through geo) produced ~1e-16 divergence. When a cell boundary
edge coincided with the polygon boundary, _geodesicEdgesCross falsely rejected
the pair, causing boundary intersection to return false.
The fallback containment check then tested boundary.verts[0] - a vertex sitting
on the polygon edge - which could be classified as outside.

Fix by:
1. Introducing _hex2dToVec3Consistent that routes through _hex2dToGeo→_geoToVec3d
   (the same path polygon vertices take), ensuring shared vertices are
   bitwise-identical. Used in both hex and pentagon geodesic boundary functions.
2. Testing the cell center (cellToVec3) instead of boundary.verts[0] for the
   fully-inside containment check, avoiding edge-coincident ambiguity.
Our algorithm does not support them, but we didn't use to have any
protection against not supported input. Now we will throw E_DOMAIN if
we can't find a hemisphere that the polygon fits into.
Fix issue where we did not trim cells correctly and could "leak"
some of the incorrect cells. Use hemisphere produced during check
to fix that path. Add more tests covering extreme polygons
We will not calculate expensive polygon-cell intersection if its result is not needed
@holoskii holoskii force-pushed the geodesic_coverage branch from c75a49f to 38293ec Compare March 13, 2026 12:25
@ajfriend
Copy link
Copy Markdown
Collaborator

ajfriend commented Mar 15, 2026

Thanks again for this work, addressing the comments, and fixing the issues, @holoskii. Really excited to get geodesic coverage into the library.

One note for alignment:

As well as we would need to implement the same changes to the non-geodesic polyfill

This isn't actually the case. We can't change the API of the non-geodesic polyfill (like adding a requirement for winding order) without a major version change. Since there are no plans for a major version bump any time soon, the non-geodesic mode is basically fixed as is. However, the geodesic mode is new, so we're free to design that API however we like. The point I was attempting to make is that we don't want paint ourselves into a similar corner with the API for geodesic mode. Since this code is tagged as experimental, we can be a bit more relaxed, but we'd want to avoid a final release that's agnostic to winding order if we're later planning on being strict about winding order to allow for global polygons.

Edit: And just to be clear, I don't think we need any changes for this PR, necessarily. I'm just calling this out for the future. For this PR, being agnostic to winding order is OK because 1) the function is still experimental, and 2) we could add a lightweight validation as a preprocessing step to check that polygons follow the right hand rule.

@ajfriend
Copy link
Copy Markdown
Collaborator

ajfriend commented Mar 15, 2026

Just noting this here, since I was having trouble finding it earlier: #1052 (comment)

The updated PR now requires all loops to have >= 3 vertices. Thanks!

@ajfriend
Copy link
Copy Markdown
Collaborator

ajfriend commented Mar 15, 2026

One chore I'd like to propose: splitting the faceijk.c / vec3d.c refactoring into its own PR, separate from the geodesic polygon code. It's a little more work to separate, but I think it's worth it for review quality and confidence. I'd be happy to help with this.

By splitting it up, we'll have an isolated PR focused on the internal refactor, which touches H3's core indexing functions. Subtle numerical changes here could potentially shift cell assignments, or affect indexing performance. @isaacbrodsky already flagged this in his review. Isolating it would let us add exhaustive round-trip tests and benchmarks to build full confidence, beyond what the existing test suite covers (since I'm not sure it was designed to capture regressions from this kind of change).

We can also write that PR with an eye towards reusable vec3d components for things like geodesic "line to cells" or "sphere cap to cells" functions.

It has the side benefit of making review easier. This is a significant (in size and functionality!) change, so splitting it into smaller PRs is helpful.

Curious to hear what other people think. And as mentioned, I'm happy to help out with this.

I'll review the recent changes, but my expectation at this point is that--pending this split--this PR is ready to merge.

@holoskii
Copy link
Copy Markdown
Author

The idea is great. This is a huge PR, so digesting it in smaller portions will indeed be easier. Thanks for the help with this part! I will be waiting for the focused faceijk/vec3 PR to review, and rebase this one after prep PR is merged

@ajfriend
Copy link
Copy Markdown
Collaborator

Great! I'll try to get tho this soon.

ajfriend pushed a commit to ajfriend/h3 that referenced this pull request Mar 31, 2026
ajfriend pushed a commit to ajfriend/h3 that referenced this pull request Mar 31, 2026
Extract _vec3dToClosestFace, _vec3dToHex2d, _vec3dToFaceIjk from LatLng
wrappers. Add _vec3dAzimuthRads (3D tangent-plane azimuth) and _hex2dToVec3
(inverse via Rodrigues' rotation). Add vec3ToCell and cellToVec3 public API.

Existing LatLng API is unchanged — wrappers delegate to the new Vec3d functions.

From uber#1052.
ajfriend pushed a commit to ajfriend/h3 that referenced this pull request Mar 31, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

5 participants