Open
Description
Bug Description
The fix_incorrect_toc function in pageindex/page_index.py:760-827 has a variable shadowing bug that causes results to be written back to the wrong index positions.
Problem
The list_index variable (storing the original TOC index to fix) is overwritten by the page loop iterator:
async def process_and_check_item(incorrect_item):
    list_index = incorrect_item['list_index']  # ← Original TOC index
    # ...
    for page_index in range(prev_correct, next_correct+1):
        list_index = page_index - start_index  # ← Overwrites list_index!
        # ...
    return {
        'list_index': list_index,  # ← Uses overwritten value!
        # ...
    }

When the result is written back to toc_with_page_number[list_idx], it updates the wrong position.
Example
If incorrect_item['list_index'] = 5 and the page loop iterates over pages 10 through 20 (assuming start_index = 1):
- Original list_index = 5 (correct)
- After the loop, list_index = 19 (wrong)
- The update goes to index 19 instead of 5
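The example above can be reproduced in isolation. This is a minimal sketch, not the actual code from page_index.py: `buggy_process` is a hypothetical synchronous stand-in for `process_and_check_item`, reduced to the index handling, and `start_index=1` is an assumption chosen to match the numbers in the example.

```python
def buggy_process(incorrect_item, prev_correct, next_correct, start_index=1):
    """Hypothetical stand-in for process_and_check_item (index handling only)."""
    list_index = incorrect_item['list_index']  # original TOC index
    for page_index in range(prev_correct, next_correct + 1):
        # Reuses the same name, so the TOC index is lost on the first iteration.
        list_index = page_index - start_index
    return {'list_index': list_index}


result = buggy_process({'list_index': 5}, prev_correct=10, next_correct=20)
print(result['list_index'])  # 19 (last page 20 minus start_index 1), not 5
```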
Fix
Rename the loop variable to page_list_idx to avoid shadowing:
for page_index in range(prev_correct, next_correct+1):
    page_list_idx = page_index - start_index
    if page_list_idx >= 0 and page_list_idx < len(page_list):
        page_text = f"..."
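To check that the rename is sufficient, here is a small self-contained sketch of the fixed pattern together with the write-back step. Everything outside the loop is an assumption made for illustration: `fixed_process`, the `pages_seen` field, the `physical_index` key, the placeholder value 12, and the sample `page_list` and `toc` data are hypothetical and not taken from the repository.

```python
def fixed_process(incorrect_item, prev_correct, next_correct, page_list, start_index=1):
    """Hypothetical stand-in for the fixed process_and_check_item."""
    list_index = incorrect_item['list_index']  # original TOC index, never reassigned
    page_texts = []
    for page_index in range(prev_correct, next_correct + 1):
        page_list_idx = page_index - start_index  # separate name for the page offset
        if 0 <= page_list_idx < len(page_list):
            page_texts.append(page_list[page_list_idx])
    return {'list_index': list_index, 'pages_seen': len(page_texts)}


# Illustrative data only: 30 pages and a 25-entry TOC.
page_list = [f"page {n}" for n in range(1, 31)]
toc = [{'title': f"Section {i}", 'physical_index': None} for i in range(25)]

result = fixed_process({'list_index': 5}, 10, 20, page_list)
toc[result['list_index']]['physical_index'] = 12  # placeholder corrected value
```

With the separate loop variable, the write-back lands on entry 5 and entry 19 is left untouched.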