Organizations

@uw-midsun @castorini

farazkh80/README.md

Hi, I’m Faraz.

I completed my Software Engineering degree at the University of Waterloo. I spent two years at Cohere AI working on model inference optimization and post‑training/finetuning. Now, I work on TensorRT-LLM at NVIDIA.



Thanks for visiting, and feel free to connect.

Pinned

  1. TensorRT-LLM Public

    Forked from NVIDIA/TensorRT-LLM

    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

    Python 1

  2. sglang Public

    Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python

  3. NVIDIA/TensorRT-Incubator Public

    Experimental projects related to TensorRT

    MLIR 125 27