
Hi 😊

I am a first-year PhD student in the CS Dept. at Tsinghua University, focusing on efficient training and inference of large models.

  • WeChat ID: Zjt_Tete

Pinned

  1. thu-ml/TurboDiffusion (Public)

    TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

    Python · 3.3k stars · 224 forks

  2. thu-ml/SageAttention (Public)

    [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2–5× speedup over FlashAttention without losing end-to-end metrics across language, image, and video models. (A minimal usage sketch follows this list.)

    Cuda · 3.1k stars · 330 forks

  3. thu-ml/SpargeAttn (Public)

    [ICML2025] SpargeAttention: a training-free sparse attention method that accelerates inference for any model.

    Cuda · 921 stars · 83 forks

  4. CardinalityEstimationTestbed (Public)

    Python · 49 stars · 14 forks

  5. Sparse_Attention_API (Public)

    Python · 67 stars · 7 forks

  6. attention-survey/Efficient_Attention_Survey (Public)

    A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention

    278 stars · 5 forks
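
As a quick illustration of the drop-in usage SageAttention targets, here is a minimal sketch. It assumes the `sageattn` entry point and its `tensor_layout` / `is_causal` parameters as described in the SageAttention repository's own README; those names are not specified on this profile, so treat them as assumptions.

```python
# A minimal sketch, assuming SageAttention exposes a `sageattn` kernel that acts as a
# drop-in replacement for torch.nn.functional.scaled_dot_product_attention.
import torch
import torch.nn.functional as F
from sageattention import sageattn  # assumed entry point, per the project's README

# Illustrative tensor shapes: (batch, heads, seq_len, head_dim), i.e. the "HND" layout.
batch, heads, seq_len, head_dim = 2, 16, 1024, 128
q = torch.randn(batch, heads, seq_len, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn(batch, heads, seq_len, head_dim, dtype=torch.float16, device="cuda")
v = torch.randn(batch, heads, seq_len, head_dim, dtype=torch.float16, device="cuda")

# Quantized attention in place of the standard kernel.
out = sageattn(q, k, v, tensor_layout="HND", is_causal=False)

# Reference output from PyTorch's built-in attention, for comparison.
ref = F.scaled_dot_product_attention(q, k, v, is_causal=False)
print((out - ref).abs().max())
```

The point of the pattern is that a caller swaps one attention call for another, with no retraining and no changes to the rest of the model code.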