Description
Preconditions and environment
- Magento version : CE 2.4.6
- MariaDB version : 10.6.16
- RAM: 16 GB
- CPU Cores: 8
- Large catalog ( 500k+ products )
The problem was critical to the extent that the storefront was basically inaccessible
No reindex + everywhere where catalog_product_entity_int table was involved in queries - the query never completed, which broke the functionality of the entire storefront
Steps to reproduce
- A relatively large catalog is needed.
In a test environment, it is reproduced on a test catalog generated with the
bin/magento setup:perf:generate-fixtures setup/performance-toolkit/profiles/ce/extra_large.xml
- Start the reindex process
bin/magento indexer:reset
bin/magento indexer:reindex catalogsearch_fulltext
Expected result
Reindexing is successful, there are docs in elasticsearch/opensearch indexes
Actual result
Reindexing is not happening. Indexes are created in elasticsearch, but the data does not get into them
Additional information
In the process of reindexing the catalogsearch_fulltext index, a query is made to the database to retrieve the " batch" data, as an example :
SELECT `e`.`entity_id`, `e`.`type_id`, `e`.`sku` FROM `catalog_product_entity` AS `e`
INNER JOIN `catalog_product_website` AS `website` ON website.product_id = e.entity_id AND website.website_id = '1'
INNER JOIN `catalog_product_entity_int` AS `visibility_default` ON visibility_default.entity_id= e.entity_id AND visibility_default.attribute_id = '99' AND visibility_default.store_id = 0
LEFT JOIN `catalog_product_entity_int` AS `visibility_store` ON visibility_store.entity_id= e.entity_id AND visibility_store.attribute_id = '99' AND visibility_store.store_id = 2
INNER JOIN `catalog_product_entity_int` AS `status_default` ON status_default.entity_id= e.entity_id AND status_default.attribute_id = '97' AND status_default.store_id = 0
LEFT JOIN `catalog_product_entity_int` AS `status_store` ON status_store.entity_id= e.entity_id AND status_store.attribute_id = '97' AND status_store.store_id = 2 WHERE (IF(visibility_store.value_id > 0, visibility_store.value, visibility_default.value) IN (3, 2, 4)) AND (IF(status_store.value_id > 0, status_store.value, status_default.value) IN (1)) AND (e.entity_id > 2156) ORDER BY `e`.`entity_id` ASC
LIMIT 500;
In a normal situation this request should take up to a second to execute (rough approximation). In my case, on a real project, it was executed for several hours before the request timed out ( The screenshot only shows 10 minutes, but even that is extremely long )
Further investigation revealed that the problem is in catalog_product_entity_int , namely missing columns in the indexes.
After adding indexes for the specified table, the query started to succeed in short order
Release note
No response
Triage and priority
- Severity: S0 - Affects critical data or functionality and leaves users without workaround.
- Severity: S1 - Affects critical data or functionality and forces users to employ a workaround.
- Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
- Severity: S3 - Affects non-critical data or functionality and does not force users to employ a workaround.
- Severity: S4 - Affects aesthetics, professional look and feel, “quality” or “usability”.