Update speculator config & converter to support hidden states indexing#142
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
…culators into hz-update-config
rahul-tuli left a comment:
The PR looks good, but since we are now removing the forward pass through the model, does it still make sense to keep the --validate/--validate-device arguments?
rahul-tuli left a comment:
A few questions/nits that can be addressed in a follow-up. Good work on this; LGTM once we raise the NotImplementedError for forward passes.
dsikka left a comment:
Do we have test cases for multiple decoder layers?
fynnsu left a comment:
One question below which might require a fix.
LGTM pending quality!
Added tests and addressed the review comments.
Changes:
eagle_aux_hidden_state_layer_ids and inference_type.
target_vocab_size. We default to the length of "t2d"; if that is not available, we load the verifier model's config file and recursively search the dict for vocab_size. (The search is needed for nested dicts, e.g. target_config_dict["text_config"]["vocab_size"].)
Command used:
Converted checkpoint:
shanjiaz/Llama4-Maverick-Eagle3-Speculators-converted
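
For readers following the target_vocab_size fallback described under Changes, here is a minimal sketch of how such a lookup could work. It is not the code from this PR: the function names find_vocab_size and resolve_target_vocab_size are hypothetical, and it assumes "t2d" is a sized sequence (or None when absent) and that the verifier config has already been loaded into a plain dict.

```python
from typing import Any, Optional, Sequence


def find_vocab_size(config: Any) -> Optional[int]:
    """Depth-first search for the first integer "vocab_size" in a nested dict.

    Hypothetical helper: returns None when no such key is found.
    """
    if not isinstance(config, dict):
        return None
    value = config.get("vocab_size")
    if isinstance(value, int):
        return value
    # Not at this level: descend into nested dicts such as "text_config".
    for child in config.values():
        found = find_vocab_size(child)
        if found is not None:
            return found
    return None


def resolve_target_vocab_size(
    t2d: Optional[Sequence[Any]],
    target_config_dict: dict,
) -> int:
    """Prefer the length of t2d; otherwise fall back to the verifier config."""
    if t2d is not None:
        return len(t2d)
    found = find_vocab_size(target_config_dict)
    if found is None:
        raise ValueError("vocab_size not found in the verifier config")
    return found


# Illustrative values only; 1000 is a placeholder, not a real vocabulary size.
nested_config = {"model_type": "example", "text_config": {"vocab_size": 1000}}
print(resolve_target_vocab_size(None, nested_config))
print(resolve_target_vocab_size([True] * 32, nested_config))
```

The recursion is what lets the nested case (target_config_dict["text_config"]["vocab_size"]) resolve without hard-coding the "text_config" key.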