casper-hansen
🎯 Focusing
  • Copenhagen, Denmark

Highlights

  • Pro

Pinned

  1. AutoAWQ (public archive)

    AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

    Python · 2.3k stars · 297 forks

  2. AutoAWQ_kernels (public archive)

    CUDA · 79 stars · 23 forks

  3. OpenCoconut (public archive)

    OpenCoconut implements a latent reasoning paradigm in which the model generates thoughts before decoding.

    Python · 175 stars · 23 forks