add multimodal preprocessing support #344
Conversation
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
The quality checks have failed. Please run …
This pull request has merge conflicts that must be resolved before it can be merged.
@shanjiaz Thanks so much for your review! I saw that the other PR has already landed, so I’ll update this one based on the latest main. |
Signed-off-by: Haoxiang Sun <shx2005@126.com>
Force-pushed from 7214735 to abcae25
I’ve rebased onto the latest main and also fixed a missing … I ran … I haven’t re-tested the image-text pair path yet because the training logic has changed. I’m planning to finish the remaining image-text support over the next few weeks and then test everything together. Thank you again for your patience and for all the guidance throughout this process!
Signed-off-by: Haoxiang Sun <shx2005@126.com>
Force-pushed from 0a5d18c to 56c47b5
Purpose
Enable multimodal-aware (currently a single image-text pair) preprocessing for the EAGLE offline data-generation pipeline while preserving backward compatibility for existing text-only workflows.
Description
This PR adds multimodal preprocessing support and refactors related preprocessing internals:
- Added a `--multimodal` flag to the offline generation entrypoint and passed it through to preprocessing.
- `is_multimodal=None` defaults to text mode.

Related Issue
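To illustrate the backward-compatibility idea, here is a minimal sketch (not the PR's actual code; the function and field names are hypothetical) of how an `is_multimodal=None` default can keep existing text-only workflows unchanged while opting in to single image-text pairs:

```python
from typing import Optional


def preprocess_sample(sample: dict, is_multimodal: Optional[bool] = None) -> dict:
    """Hypothetical preprocessing step for the offline data-generation pipeline.

    is_multimodal=None falls back to text-only mode, so callers that never
    pass the flag behave exactly as before.
    """
    if is_multimodal is None:
        is_multimodal = False  # default: preserve the text-only path

    if not is_multimodal:
        return {"text": sample["text"]}

    # Multimodal path: currently a single image-text pair per sample.
    return {"text": sample["text"], "image": sample["image"]}
```

A caller that omits the argument gets the old text-only output, while `preprocess_sample(sample, is_multimodal=True)` additionally carries the image through, mirroring how the `--multimodal` CLI flag would be threaded into preprocessing.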
#290
Tests
Notes:
I have filled in: