
GitHub Profile for Andy Curtis

Selected works, algorithmic research, and high-performance C infrastructure by Andy Curtis.




✦ Archival Works & Algorithms

A selection of algorithms and systems I've developed over my career, largely in the public domain at this point.

  • Quicksort Improvement (2019) — Designed an efficient modification to quicksort/introsort. By reusing the pivot sample to detect when input is already sorted, reverse-sorted, or all-equal, the algorithm finishes in O(n) with a single verification pass. Yields 5–30× speedups for sorted inputs with no regressions on adversarial cases.
  • Ad Inventory Overlapping Set Problem (2011–2012) — Designed a real-time ad inventory model to handle millions of buys with overlapping targeting criteria using inverted indices and bitmap intersections. Enabled accurate forecasting and scalable allocation at web scale.
  • Expected Frequency & Long Correlation (2009+) — An algorithm for search and recommendations leveraging entire user histories. By adjusting local co-occurrence with an expected vs. actual frequency correction, it systematically removes global popularity bias.
  • Click-Based Search & Recommendations (2003–2005) — A framework leveraging full user sessions (queries + clicks) to learn correlations, improving ranking, personalization, and localization.
  • Efficient Near-Shingling (2001) — An approach to near-duplicate detection that approximates shingling accuracy while reducing storage and compute costs. Documents are reduced to title + first + 10 longest sentence hashes, indexed in dual forward/inverted indices.
  • EzResult Search Engine (1998–1999) — A distributed search engine written entirely in C/C++/assembly, independently utilizing inverted indices, tries, and cosine similarity. Supported instant index updates and was acquired in 1999.
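
The quicksort item above describes checking a pivot sample for order before partitioning. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the author's implementation: the names `sketch_sort`, `sample_direction`, and `verify` are hypothetical, the sample is assumed to be five evenly spaced elements, the element type is fixed to `int`, and the fallback is the standard library's `qsort` rather than a custom introsort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical, chosen for this sketch only. */
typedef enum { PATH_ALREADY_SORTED, PATH_REVERSED, PATH_FALLBACK } sort_path_t;

static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Inspect five evenly spaced elements, the kind of sample a pivot
   picker might already read. Returns +1 if the sample is non-decreasing
   (this includes all-equal), -1 if it is non-increasing, 0 otherwise.
   Requires n >= 2. */
static int sample_direction(const int *a, size_t n) {
    size_t step = (n - 1) / 4;
    int asc = 1, desc = 1;
    for (size_t i = 0; i < 4; i++) {
        int lo = a[i * step], hi = a[(i + 1) * step];
        if (lo > hi) asc = 0;
        if (lo < hi) desc = 0;
    }
    return asc ? 1 : (desc ? -1 : 0);
}

/* One O(n) pass: does every adjacent pair respect the direction? */
static int verify(const int *a, size_t n, int dir) {
    for (size_t i = 1; i < n; i++) {
        if (dir > 0 ? a[i - 1] > a[i] : a[i - 1] < a[i]) return 0;
    }
    return 1;
}

/* Sorts a[0..n) ascending and reports which path was taken. */
sort_path_t sketch_sort(int *a, size_t n) {
    if (n < 2) return PATH_ALREADY_SORTED;
    assert(a != NULL);
    int dir = sample_direction(a, n);
    if (dir != 0 && verify(a, n, dir)) {
        if (dir > 0) return PATH_ALREADY_SORTED;
        /* Non-increasing input: reversing in place yields sorted order. */
        for (size_t i = 0, j = n - 1; i < j; i++, j--) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
        return PATH_REVERSED;
    }
    /* The sample or the verification pass found disorder. */
    qsort(a, n, sizeof a[0], cmp_int);
    return PATH_FALLBACK;
}
```

The verification pass stops at the first out-of-order pair, so an input that only looks sorted in the sample costs little extra before the general sort takes over.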


✦ The C Infrastructure Ecosystem

A modular, multi-tier dependency graph of C libraries built for out-of-core data processing, search, and system reliability. Designed strictly for performance, simplicity, and composability.

🧱 Foundation & Memory

  • a-memory-library: Zero-overhead memory pools, auto-growing buffers, and debug-wrapped allocators.
  • the-macro-library: Type-safe C macros for core algorithms (introsort, bsearch, red-black trees, heaps).
  • a-bitset-library: Expandable bitset structures for setting, querying, and bitwise operations.
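
To make "expandable bitset" concrete, here is a minimal sketch of the general technique: a word array that grows on demand, with set, test, and in-place AND. It does not reproduce the a-bitset-library API. Every name (`sketch_bitset_t`, `sb_set`, `sb_test`, `sb_and`, `sb_count`, `sb_free`) is hypothetical, and growing to the exact size needed is an assumption made for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for this sketch; zero-initialize before use. */
typedef struct {
    uint64_t *words;   /* bit i lives in words[i / 64] */
    size_t num_words;  /* number of allocated words */
} sketch_bitset_t;

/* Grow the word array so it holds at least num_words, zeroing new words. */
static bool sb_reserve(sketch_bitset_t *b, size_t num_words) {
    if (num_words <= b->num_words) return true;
    uint64_t *w = realloc(b->words, num_words * sizeof *w);
    if (w == NULL) return false;
    memset(w + b->num_words, 0, (num_words - b->num_words) * sizeof *w);
    b->words = w;
    b->num_words = num_words;
    return true;
}

/* Set a bit, expanding the set if the bit is past the current end. */
bool sb_set(sketch_bitset_t *b, size_t bit) {
    if (!sb_reserve(b, bit / 64 + 1)) return false;
    b->words[bit / 64] |= (uint64_t)1 << (bit % 64);
    return true;
}

/* Bits past the allocated range read as 0. */
bool sb_test(const sketch_bitset_t *b, size_t bit) {
    size_t w = bit / 64;
    return w < b->num_words && ((b->words[w] >> (bit % 64)) & 1);
}

/* dst &= src. Words that src does not have count as all zeros. */
void sb_and(sketch_bitset_t *dst, const sketch_bitset_t *src) {
    for (size_t i = 0; i < dst->num_words; i++) {
        dst->words[i] &= (i < src->num_words) ? src->words[i] : 0;
    }
}

/* Count set bits with a portable clear-lowest-bit loop. */
size_t sb_count(const sketch_bitset_t *b) {
    size_t total = 0;
    for (size_t i = 0; i < b->num_words; i++) {
        for (uint64_t w = b->words[i]; w != 0; w &= w - 1) total++;
    }
    return total;
}

void sb_free(sketch_bitset_t *b) {
    free(b->words);
    b->words = NULL;
    b->num_words = 0;
}
```

A caller would declare `sketch_bitset_t s = {0};`, call `sb_set` for each member, intersect two sets with `sb_and`, and release storage with `sb_free`. A production version would likely grow geometrically to amortize reallocation.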

⚙️ Distributed Processing & I/O

  • a-map-reduce-library: Single-node, partitioned DAG execution engine for out-of-core data processing and pipelining.
  • the-io-library: Record-oriented file processing with transparent compression, partitioning, and sort-merging.
  • the-lz4-library: Fast LZ4 compression and decompression primitives.

🕸️ Networking & Security

  • a-curl-library: Async event-loop wrapper over libcurl with rate limiting, backoff, and dependency scheduling.
  • a-curl-openai-plugin: Builder API on the curl event loop for handling OpenAI streams and structured outputs.
  • an-encryption-library: Secure key generation and in-place AES-GCM encryption/decryption.

🗂️ Parsing & Serialization

🔎 NLP, Search & ML



✦ Fleet Management & Orchestration

Tools designed to tame the complexity of multi-repo ecosystems through declarative configuration and GitOps automation.

  • scaffold-repo — A declarative fleet manager and build orchestrator. Resolves dynamic dependency graphs, enforces OSS license compliance, and automates Git branching/releases across dozens of interconnected micro-repos via a single Python CLI.
  • scaffold-templates — The centralized Template Registry powering scaffold-repo. Defines language stacks (C/CMake, Python), organizational profiles, and dynamic Jinja2 file routing to prevent vendor lock-in. Clone this to define your own fleet!



Pinned

  • a-map-reduce-library (C): A library for orchestrating a map reduce workload on a machine.
  • scaffold-repo (Python): A repo to scaffold other repos (handles licenses, builds, clones, makefiles, boilerplate stuff).
  • a-memory-library (C): A library for handling allocation.
  • a-json-library (C): A very fast JSON library.
  • search-index-library (C, forked from knode-ai-open-source/search-index-library): A library for indexing and finding data like a search engine.
  • sql-parser-library (C, forked from knode-ai-open-source/sql-parser-library): A library for matching data structures to SQL.