[Tests][Eagle3] Extend vLLM test cases with conversion step#93
Merged
Conversation
|
📦 Build Artifacts Available |
rahul-tuli
approved these changes
Aug 26, 2025
markurtz
requested changes
Aug 27, 2025
Collaborator
There was a problem hiding this comment.
One major issue that I left as a comment on the code, otherwise there are several minor cleanups that we could get in for this or for a later PR:
- The sampling params use temperature and top_p which causes extra, unneeded variability in the tests
- The test is asserting that the number of returned tokens has to be 20. Should be updated to >1 and <=20 or we should null out the stop tokens so it's forced to go to 20
- There's no cleanup of the vLLM resources (del llm, gc.collect, empty_cache, etc). A parameterized fixture with a yield would ensure both setup and teardown always run
- Currently only checking for vLLM imports are available, we likely would want to put an extra case on there for torch.cuda / GPU availability so we don't waist resources
- We could require in the dev environments hf_transfer and set HF_HUB_ENABLE_HF_TRANSFER for faster downloads
- _run_vllm_engine runs as a helper, so any failures within there are not going to surface properly with pytest
- Enabling a test config/settings longer term would be good so we can pull down things like gpu_memory_utilization and have a central place to update them for workers running tests
- Base comparisons / interface comparisons for what the output from conversion should look like before running in vLLM
markurtz
approved these changes
Aug 27, 2025
Collaborator
markurtz
left a comment
There was a problem hiding this comment.
Missed one of the pytest skips in the original review, approving with a note towards the potential future improvements we should get onto the backlog
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary