Release/v1.3.0 alpha.1 #6

Open
euphoria0-0 wants to merge 14 commits into main from release/v1.3.0-alpha.1

Conversation

euphoria0-0 commented Feb 4, 2026

Release v1.3.0-alpha.1 with updated datasets

Copilot AI review requested due to automatic review settings February 4, 2026 08:09

Copilot AI left a comment

Pull request overview

This PR updates configuration parameters and adds support for new datasets in the vector database benchmarking tool, specifically for the v1.3.0 alpha.1 release.

Changes:

  • Updated IVF-GAS index parameters (nlist and nprobe values) in product configuration
  • Added new food dataset configuration file with multiple index types
  • Refactored dataset preparation script to use case-based configuration with support for new datasets
  • Added utility script for generating random benchmark datasets

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vectordb_bench/config-files/envector_products_config.yml | Updated nlist from 32768 to 1024 and nprobe from 6 to 16 for IVF-GAS configuration |
| vectordb_bench/config-files/envector_food_config.yml | New configuration file for FOOD512D101K dataset with FLAT, IVF-FLAT, and IVF-GAS index configurations |
| scripts/prepare_random_dataset.py | New script for generating random normalized vectors and ground truth neighbors for benchmarking |
| scripts/prepare_dataset.py | Refactored to use case-based dataset configuration with simplified CLI interface |
| README.md | Updated documentation with new dataset references and corrected example commands |
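
For readers unfamiliar with what "random normalized vectors and ground truth neighbors" involves, here is a minimal sketch using NumPy. It is not the contents of `scripts/prepare_random_dataset.py`; the function name `make_random_dataset`, its parameters, and the sizes used are hypothetical, and the ground truth is computed by brute-force inner product, which is an assumption about the metric.

```python
import numpy as np


def make_random_dataset(n_train, n_test, dim, k, seed=0):
    """Return unit-norm train/test vectors and exact top-k neighbor ids.

    Hypothetical helper for illustration only; not taken from the PR.
    """
    rng = np.random.default_rng(seed)
    train = rng.standard_normal((n_train, dim)).astype(np.float32)
    test = rng.standard_normal((n_test, dim)).astype(np.float32)

    # L2-normalize so that the inner product equals cosine similarity.
    train /= np.linalg.norm(train, axis=1, keepdims=True)
    test /= np.linalg.norm(test, axis=1, keepdims=True)

    # Brute-force ground truth: score every test vector against every
    # train vector, then keep the k highest-scoring train ids per query.
    scores = test @ train.T
    neighbors = np.argsort(-scores, axis=1)[:, :k]
    return train, test, neighbors


# Small illustrative sizes; a real benchmark dataset would be far larger.
train, test, neighbors = make_random_dataset(n_train=1000, n_test=10, dim=64, k=5)
print(train.shape, test.shape, neighbors.shape)
```

Brute force is only practical at small scale, but it gives exact neighbors, which is what recall for an approximate index is measured against.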

Comment thread scripts/prepare_dataset.py Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
--train-centroids True \
--centroids-path "./centroids/embeddinggemma-300m/centroids.npy" \
--nlist 32768 \
--nlist 1024 \

Isn't 32768 the right value when using embedding-gemma?

Author

Applied! 166e5a6

Comment on lines +33 to +43
nlist: 128
nprobe: 6
train_centroids: true
centroids_path: food/centroids/centroids_128.npy

# GAS: enVector-customized ANN
envectorivfgas:
<<: [*base_dataset, *base_envector]
index_name: food101_ivfgas
db_label: FOOD512D101K-IVFGAS
nlist: 1024

The nlist values differ between IVF-FLAT and IVF-GAS. Is that intentional?

Author

Yes, that is intentional!
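
To make the difference between the two nlist values concrete, the following is a rough back-of-the-envelope sketch using the standard IVF cost estimate with evenly sized lists. The vector count of 101,000 is an assumption read off the dataset name FOOD512D101K, the helper `ivf_scan_estimate` is hypothetical, and the estimate may not carry over to IVF-GAS, which is an enVector-specific index.

```python
def ivf_scan_estimate(num_vectors, nlist, nprobe):
    """Rough per-query cost of an IVF search, assuming evenly sized lists."""
    avg_list_size = num_vectors / nlist
    return {
        "avg_list_size": avg_list_size,
        "vectors_scanned": avg_list_size * nprobe,
        "fraction_scanned": nprobe / nlist,
    }


NUM_VECTORS = 101_000  # assumption: read off the dataset name FOOD512D101K

# IVF-FLAT values from the snippet under review.
ivf_flat = ivf_scan_estimate(NUM_VECTORS, nlist=128, nprobe=6)

# The IVF-GAS nprobe is not shown in the snippet, so only list size is compared.
ivf_gas_avg_list_size = NUM_VECTORS / 1024

print(ivf_flat)
print(ivf_gas_avg_list_size)
```

Under these assumptions, a larger nlist means much smaller lists, so the two index types are not expected to share one setting.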

Comment thread README.md Outdated
- `PRODUCTS512D400K`
- `FASHION512D200K`
- `FOOD512D75K`
- `PRODUCTS512D400K`: [cryptolab-playground/amazon-products-clip-vit-b-32](https://huggingface.co/datasets/cryptolab-playground/amazon-products-clip-vit-b-32)

It looks like Food is missing?!

Author

Added it now!

Copilot AI review requested due to automatic review settings February 4, 2026 08:58

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.


Labels

None yet

3 participants