I am working with NVIDIA NeMo's ASR collection (a QuartzNet model) in Google Colab, trying to evaluate the model with CTC beam search without a language model (alpha=0.0).
Even after installing `pyctcdecode` and `kenlm`, calling `change_decoding_strategy` raises `ModuleNotFoundError: No module named 'ctc_decoders'`.
Environment:

- OS: Google Colab (Ubuntu 22.04 LTS)
- Python: 3.12
- NeMo version: latest (installed via pip)
Code Snippet:
```python
import os
from omegaconf import DictConfig

# Create a dummy file to bypass the initial FileNotFoundError for kenlm_path
with open("dummy.bin", "w") as f:
    f.write("")

beam_widths = [3, 10, 20]
for width in beam_widths:
    try:
        # Reset the decoding strategy
        quartznet_model_copy.decoding = None
        beam_cfg = DictConfig({
            "strategy": "beam",
            "beam": {
                "search_type": "default",
                "kenlm_path": os.path.abspath("dummy.bin"),
                "width": width,
                "alpha": 0.0,
                "beta": 0.0,
                "return_best_hypothesis": True,
            },
        })
        quartznet_model_copy.change_decoding_strategy(decoding_cfg=beam_cfg)
        # evaluate_wer call...
    except Exception as e:
        print(f"Error: {e}")
        raise e
```
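To confirm what is actually importable in this Colab session (and that only the legacy backend is missing), I also ran a quick stdlib check over the three backend modules mentioned above:

```python
import importlib.util

# Check which beam-search backends are importable in this session
# (pyctcdecode and kenlm were pip-installed; ctc_decoders was never compiled).
backends = {
    mod: importlib.util.find_spec(mod) is not None
    for mod in ("pyctcdecode", "kenlm", "ctc_decoders")
}
for mod, found in backends.items():
    print(f"{mod}: {'available' if found else 'MISSING'}")
```

In my session this reports `pyctcdecode` and `kenlm` as available and `ctc_decoders` as missing, yet the error below is still raised.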
Traceback (excerpt):

```
ModuleNotFoundError: No module named 'ctc_decoders'
...
/usr/local/lib/python3.12/dist-packages/nemo/collections/asr/modules/beam_search_decoder.py in __init__(...)
     64         from ctc_decoders import Scorer, ctc_beam_search_decoder_batch
     65     except ModuleNotFoundError:
---> 66         raise ModuleNotFoundError(
     67             "BeamSearchDecoderWithLM requires the installation of ctc_decoders "
     68             "from scripts/asr_language_modeling/ngram_lm/install_beamsearch_decoders.sh"
```
What I've tried:
- Installing `pyctcdecode` via pip, but the `default` search type still triggers the legacy `ctc_decoders` dependency.
- Changing `search_type` to `pyctcdecode`, but I still get the same requirement for `ctc_decoders`.
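For completeness, this is the shape of the config I used for the second attempt (shown as a plain dict for clarity; I wrap it in `DictConfig` before passing it to `change_decoding_strategy`, exactly as in the snippet above). The field names mirror my original snippet; I am not certain whether my NeMo version expects `width`/`alpha`/`beta` or `beam_size`/`beam_alpha`/`beam_beta`, so treat this as a sketch rather than the definitive schema:

```python
# Variant of the beam config above selecting the pyctcdecode backend.
# kenlm_path is left unset, since alpha=0.0 should mean no LM rescoring.
pyctc_beam_cfg = {
    "strategy": "beam",
    "beam": {
        "search_type": "pyctcdecode",   # instead of "default"
        "kenlm_path": None,             # no LM needed when alpha == 0.0
        "width": 10,
        "alpha": 0.0,
        "beta": 0.0,
        "return_best_hypothesis": True,
    },
}
```

Even with this config, the same `ctc_decoders` error is raised.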
Question: How can I use beam search in NeMo on Google Colab without compiling the legacy `ctc_decoders` library? Is there a way to force NeMo to use `pyctcdecode` as the backend so that this missing-module error is never triggered?