junkangwu


Pinned

  1. QAE Public

    [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning

    Python · 23 stars

  2. alpha-DPO Public

    [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization"

    Python · 30 stars

  3. Dr_DPO Public

    [ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"

    Python · 18 stars · 3 forks

  4. beta-DPO Public

    [NeurIPS 2024] Official code of "$\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$"

    Python · 50 stars · 5 forks

  5. ADNCE Public

    [NeurIPS 2023] Official code of "Understanding Contrastive Learning via Distributionally Robust Optimization"

    Python · 41 stars · 2 forks

  6. Adap_tau Public

    [WWW 2023] Official code of "Adap-$\tau$: Adaptively Modulating Embedding Magnitude for Recommendation"

    Python · 29 stars · 3 forks