Stars
Open-source framework for the research and development of foundation models.
Named Tensors for Legible Deep Learning in JAX
A simple, performant, and scalable JAX LLM!
JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training
SpecTrax is a JAX-native library for neural networks and graph learning, built for performance, composability and modularity.
torchax is a PyTorch frontend for JAX. It lets users author JAX programs using familiar PyTorch syntax. It also provides JAX-PyTorch interoperability, meaning one can mix JAX & Pytor…
A Python DSL to write NVIDIA PTX for Hopper and Blackwell in JAX and PyTorch
Experimentation using the XLA compiler from Rust
Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
ATLAS: Autoformalized Textbook Library At Scale
ThunderKittens LCF forward non-causal attention kernel benchmarked against FlashAttention-2 and FlashAttention-3 on Hopper.
This module defines a type system for distributed training code, based on JAX's sharding in types, but adapted for the PyTorch ecosystem.
FoundationDB - the open source, distributed, transactional key-value store
Seastar boilerplate project with CMake
FlashMLA: Efficient Multi-head Latent Attention Kernels
DeepGEMM: clean and efficient BLAS kernel library on GPU
DeepEP: an efficient expert-parallel communication library
NoSQL data store using the Seastar framework, compatible with Redis
mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems
Algorithm powering the For You feed on X






