[Refactor] Extract model specific logics in export lib by h-guo18 · Pull Request #1828 · NVIDIA/Model-Optimizer

h-guo18 · 2026-06-25T20:06:14Z

What does this PR do?

Type of change: Refactor / code health

Extracts model-specific logic out of the generic export code into a new model-family registry, modelopt/torch/export/modeling/. Each family is described declaratively (per-model data + optional behavioral hooks); the export engine resolves a family's ModelSpec and reads from it instead of branching on model names. An unmatched lookup returns None, so the engine falls back to its original path — the migration is incremental and behavior-preserving.
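The resolve-then-fall-back flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `match_keywords`, `resolve_spec`, `legacy_activation`, `pick_activation`, and the `demofamily` entry are all hypothetical names, and substring matching on the model type is an assumed rule.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ModelSpec:
    """Trimmed-down, hypothetical spec; the real class carries more fields."""

    name: str
    # Assumption: a family is matched by substrings of the model type.
    match_keywords: Tuple[str, ...] = ()
    activation_override: Optional[str] = None


_REGISTRY: List[ModelSpec] = []


def register(spec: ModelSpec) -> ModelSpec:
    """Add a family spec to the module-level registry."""
    _REGISTRY.append(spec)
    return spec


def resolve_spec(model_type: str) -> Optional[ModelSpec]:
    """Return the first matching spec, or None for an unmatched lookup."""
    lowered = model_type.lower()
    for spec in _REGISTRY:
        if any(keyword in lowered for keyword in spec.match_keywords):
            return spec
    return None


def legacy_activation(model_type: str, default: str) -> str:
    """Stand-in for the engine's original name-branching path."""
    return default


def pick_activation(model_type: str, default: str) -> str:
    """Read from the spec when one resolves; otherwise use the legacy path."""
    spec = resolve_spec(model_type)
    if spec is None or spec.activation_override is None:
        return legacy_activation(model_type, default)
    return spec.activation_override


# Illustrative family and value only; not taken from the PR.
register(ModelSpec(name="demo", match_keywords=("demofamily",), activation_override="gelu"))
```

Because an unmatched lookup yields `None` rather than raising, families can move into the registry one at a time while everything else keeps taking the original branch.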

Migrated so far:

- Data: MoE expert names, iterable-expert flag, activation override, shared-embedding flag, MLP keyword roles, and AWQ pre_quant_scale fusion rules.
- Hooks (ModelHooks + ExportContext, dependency-injected to keep modeling/ cycle-free): decoder sub-module placement for Gemma2/3 layernorms, Gemma3 q/k-norms, and MLlama self/cross attention.
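One way the hook and context split might look is sketched below. Only the names `ModelHooks` and `ExportContext` come from the PR; the method `place_submodule`, the field `build_layernorm`, the `DemoNormHooks` class, and the `engine_place` helper are assumptions made for illustration. The point is that the family module receives engine callables through the context instead of importing the engine, which keeps the import graph acyclic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class ExportContext:
    """Engine-provided callables, injected so hooks never import the engine."""

    build_layernorm: Callable[[Any], Dict[str, Any]]


class ModelHooks:
    """Default hooks handle nothing, so the engine keeps its default path."""

    def place_submodule(
        self, name: str, module: Any, layer_config: Dict[str, Any], ctx: ExportContext
    ) -> bool:
        return False


class DemoNormHooks(ModelHooks):
    """Hypothetical family hook that places two extra layernorms."""

    # Sub-module name -> config slot; names chosen for illustration.
    _NORM_SLOTS = {
        "pre_feedforward_layernorm": "pre_feedforward_layernorm",
        "post_feedforward_layernorm": "post_feedforward_layernorm",
    }

    def place_submodule(
        self, name: str, module: Any, layer_config: Dict[str, Any], ctx: ExportContext
    ) -> bool:
        slot = self._NORM_SLOTS.get(name)
        if slot is None:
            return False
        layer_config[slot] = ctx.build_layernorm(module)
        return True


def engine_place(
    name: str,
    module: Any,
    layer_config: Dict[str, Any],
    hooks: Optional[ModelHooks],
    ctx: ExportContext,
) -> str:
    """Try the family hook first; fall back to the engine's default handling."""
    if hooks is not None and hooks.place_submodule(name, module, layer_config, ctx):
        return "hook"
    layer_config.setdefault("unplaced", []).append(name)
    return "default"
```

A hook returning `False` is the signal that the engine should proceed as before, mirroring the `None` fallback used for data lookups.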

Shared tables (HF_CONFIG_MAP) and generic algorithms stay in the engine. See modelopt/torch/export/MODEL_SPECIFIC_REFACTOR.md for the inventory, design, and migration plan.

Not yet migrated

Planned for follow-ups (still uses the engine's default path):

- build_moe: per-family MoE router / experts / shared-expert construction (Llama/Phi3, DBRX, DeepSeek, Qwen).
- unwrap_decoder_layer: DBRX/ExaOne/Deci module-tree unwrap, plus the head_is_first_dim flag (Bloom/Falcon/Phi3Small/InternLM) as data.
- TRT-LLM target-config extras (tensorrt_llm_utils): positional-embedding type, MPT alibi, RecurrentGemma, DBRX clip_qkv, Phi3-MoE sparse-mixer (mostly data).
- HF path (unified_export_hf / moe_utils): MoE expert export branches (Llama4/GptOss/DBRX/iterable) are a separate engine and would need HF-side seams; VLM language-tower extraction currently lives in model_utils.

Deferred (low value): embed √-scale (Gemma1-only + version-gated) and norm+1 (mixes the generic LayerNorm1P).

Intentionally kept in the engine: HF_CONFIG_MAP (a shared alias table, not a per-model branch), generic algorithms (_GATE_UP_PAIRS, expert-amax fallback), the ChatGLM/Phi3 fused gate/fc chunk-swap (a fused-weight reshape), and speculative-decoding export (already modular under plugins/hf_spec_*).

Usage

# Adding a model family is a small declarative file; no engine edits.
# modelopt/torch/export/modeling/families/<family>.py
register(
    ModelSpec(
        name="qwen_moe",
        moe_block_names=("Qwen3MoeSparseMoeBlock",),
        expert_linear_names=("gate_proj", "down_proj", "up_proj"),
        has_iterable_experts=True,
    )
)
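To show how an engine could consume the three fields from the spec above, here is a hedged sketch. The field names match the usage snippet; the helpers `is_moe_block` and `expert_linears`, the exact-class-name matching rule, the `experts` attribute, and the toy classes are assumptions rather than the PR's implementation.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class ModelSpec:
    name: str
    moe_block_names: Tuple[str, ...] = ()
    expert_linear_names: Tuple[str, ...] = ()
    has_iterable_experts: bool = False


def is_moe_block(module: Any, spec: Optional[ModelSpec]) -> bool:
    """Assumed rule: match the module's class name against the spec's list."""
    if spec is None:
        return False  # the caller would fall back to its legacy check here
    return type(module).__name__ in spec.moe_block_names


def expert_linears(moe_block: Any, spec: ModelSpec) -> List[Any]:
    """Collect each expert's linear layers in the order the spec declares."""
    if not spec.has_iterable_experts:
        raise NotImplementedError("fused-expert layouts are not covered by this sketch")
    return [
        getattr(expert, linear_name)
        for expert in moe_block.experts
        for linear_name in spec.expert_linear_names
    ]


# Toy stand-ins for real modules, used only to exercise the helpers.
class _Expert:
    def __init__(self, tag: str) -> None:
        self.gate_proj = f"{tag}.gate_proj"
        self.down_proj = f"{tag}.down_proj"
        self.up_proj = f"{tag}.up_proj"


class Qwen3MoeSparseMoeBlock:
    def __init__(self) -> None:
        self.experts = [_Expert("e0"), _Expert("e1")]


qwen_moe = ModelSpec(
    name="qwen_moe",
    moe_block_names=("Qwen3MoeSparseMoeBlock",),
    expert_linear_names=("gate_proj", "down_proj", "up_proj"),
    has_iterable_experts=True,
)
block = Qwen3MoeSparseMoeBlock()
layers = expert_linears(block, qwen_moe) if is_moe_block(block, qwen_moe) else []
```

With this shape, supporting another iterable-expert family would mean registering another spec rather than adding a branch to the helpers.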

Testing

pytest tests/unit/torch/export/ passes (78 passed; test_export_diffusers.py skipped — pre-existing CUDA/glibc collection error unrelated to this change).
Each migration step verified for behavioral equivalence against the legacy branches; fallback-first means un-migrated families are byte-for-byte unaffected.
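A behavioral-equivalence check of the kind mentioned above might be structured like this. It is a sketch under stated assumptions: `legacy_expert_names`, `SPEC_TABLE`, the `demo_a` / `demo_b` families, and `check_parity` are hypothetical and do not come from the repository's test suite.

```python
from typing import Dict, List, Optional, Tuple


def legacy_expert_names(model_type: str) -> Tuple[str, ...]:
    """Stand-in for a name-branching helper; the branches are illustrative."""
    lowered = model_type.lower()
    if "demo_a" in lowered:
        return ("w1", "w2", "w3")
    if "demo_b" in lowered:
        return ("gate_proj", "down_proj", "up_proj")
    return ()


# Only demo_a has been "migrated"; demo_b still relies on the legacy branch.
SPEC_TABLE: Dict[str, Tuple[str, ...]] = {"demo_a": ("w1", "w2", "w3")}


def spec_expert_names(model_type: str) -> Optional[Tuple[str, ...]]:
    """Return the migrated data, or None when the family is not registered."""
    lowered = model_type.lower()
    for key, names in SPEC_TABLE.items():
        if key in lowered:
            return names
    return None


def expert_names(model_type: str) -> Tuple[str, ...]:
    """Fallback-first lookup: spec data if present, legacy branch otherwise."""
    names = spec_expert_names(model_type)
    return legacy_expert_names(model_type) if names is None else names


def check_parity(model_types: List[str]) -> List[str]:
    """List the model types whose new result differs from the legacy result."""
    return [m for m in model_types if expert_names(m) != legacy_expert_names(m)]
```

An empty result from `check_parity` over migrated and un-migrated model types alike is what "behavior-preserving" would mean for this one lookup.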

Before your PR is "Ready for review"

Is this change backward compatible?: ✅ (behavior-preserving; fallback-first)
If you copied code from any other sources or added a new PIP dependency: N/A
Did you write any new necessary tests?: N/A — behavior-preserving refactor covered by existing tests/unit/torch/export/
Did you update Changelog?: N/A (internal refactor, no public API change)
Did you get Claude approval on this PR?: ❌ (pending)

Additional Information

Refactor only — no change to exported checkpoints. Design/rationale: modelopt/torch/export/MODEL_SPECIFIC_REFACTOR.md.

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

copy-pr-bot · 2026-06-25T20:06:18Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


coderabbitai · 2026-06-25T20:06:23Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1f7520a8-570a-432d-a5dd-0b18214bd98f


codecov · 2026-06-25T20:16:47Z

Codecov Report

❌ Patch coverage is 62.96296% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.16%. Comparing base (4093664) to head (8ddd26c).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| modelopt/torch/export/layer_utils.py | 14.28% | 24 Missing ⚠️ |
| modelopt/torch/export/modeling/families/gemma.py | 33.33% | 16 Missing ⚠️ |
| modelopt/torch/export/modeling/registry.py | 50.00% | 16 Missing ⚠️ |
| modelopt/torch/export/modeling/families/mllama.py | 53.84% | 6 Missing ⚠️ |
| modelopt/torch/export/modeling/hooks.py | 72.72% | 3 Missing ⚠️ |
| modelopt/torch/export/quant_utils.py | 25.00% | 3 Missing ⚠️ |
| modelopt/torch/export/model_config_export.py | 33.33% | 2 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1828      +/-   ##
==========================================
- Coverage   77.36%   77.16%   -0.21%     
==========================================
  Files         513      534      +21     
  Lines       56889    57365     +476     
==========================================
+ Hits        44012    44265     +253     
- Misses      12877    13100     +223

| Flag | Coverage Δ |
|---|---|
| unit | `54.69% <62.96%> (+0.09%)` ⬆️ |



h-guo18 · 2026-06-25T23:18:29Z

+
+
+@dataclass
+class ModelSpec:


The TRT-LLM export path is deprecated.

h-guo18 · 2026-06-25T23:19:27Z

@@ -60,6 +60,13 @@
    RgLruConfig,
 )
 from .model_config_utils import pad_weights


HF export only.

export modeling lib

0fbbbf9

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

h-guo18 changed the title from ~~[Refactor] Extract model specific logics in export~~ to [Refactor] Extract model specific logics in export lib · Jun 25, 2026

h-guo18 added 7 commits June 25, 2026 20:23

moe

2d30cb9

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

activation function

87a8815

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

move force_share_embedding_table

86e3966

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

move mlp keywords

9d54011

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

move PQS_FUSE_MODULE_MAPPING

e2ab51c

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

clean up comments

cd74be8

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

add hook

8ddd26c

Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>

h-guo18 wants to merge 8 commits into main from haoguo/export-modelinglib
