I am working with NVIDIA NeMo's ASR collection (QuartzNet model) in Google Colab. I am trying to evaluate the model using CTC Beam Search without a Language Model (setting alpha=0.0).

Even after installing pyctcdecode and kenlm, I encounter a ModuleNotFoundError: No module named 'ctc_decoders' when I call change_decoding_strategy.

Environment:

  • OS: Google Colab (Ubuntu 22.04 LTS)

  • Python: 3.12

  • NeMo version: Latest (installed via pip)

Code Snippet:

import os
from omegaconf import DictConfig

# Creating a dummy file to bypass initial FileNotFoundError for kenlm_path
with open("dummy.bin", "w") as f:
    f.write("")

beam_widths = [3, 10, 20]

for width in beam_widths:
    try:
        # Resetting decoding strategy
        quartznet_model_copy.decoding = None

        beam_cfg = DictConfig({
            "strategy": "beam",
            "beam": {
                "search_type": "default",  
                "kenlm_path": os.path.abspath("dummy.bin"),
                "width": width,            
                "alpha": 0.0,              
                "beta": 0.0,
                "return_best_hypothesis": True
            }
        })

        quartznet_model_copy.change_decoding_strategy(decoding_cfg=beam_cfg)
        # evaluate_wer call...
        
    except Exception as e:
        # Log the failure, then re-raise with the original traceback intact
        print(f"Error: {e}")
        raise

Traceback:

ModuleNotFoundError: No module named 'ctc_decoders'
...
/usr/local/lib/python3.12/dist-packages/nemo/collections/asr/modules/beam_search_decoder.py in __init__(...)
     64             from ctc_decoders import Scorer, ctc_beam_search_decoder_batch
     65         except ModuleNotFoundError:
---> 66             raise ModuleNotFoundError(
     67                 "BeamSearchDecoderWithLM requires the installation of ctc_decoders "
     68                 "from scripts/asr_language_modeling/ngram_lm/install_beamsearch_decoders.sh"

What I've tried:

  1. Installing pyctcdecode via pip, but the default search type still triggers the old ctc_decoders dependency.

  2. Changing search_type to pyctcdecode, but NeMo still raises the same ModuleNotFoundError for ctc_decoders.
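
For reference, this is a quick way to check which of the relevant modules are importable in the Colab runtime. It is only a sketch: the helper name importable is hypothetical, and the module names are the ones from the error message and the packages installed above.

```python
import importlib.util


def importable(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Module names taken from the traceback and the pip installs mentioned above
status = importable(["ctc_decoders", "pyctcdecode", "kenlm"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

The check only tells whether a module can be located; it says nothing about which backend NeMo selects.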

Question: How can I use Beam Search in NeMo on Google Colab without having to compile the legacy ctc_decoders library? Is there a way to force NeMo to use pyctcdecode as the backend without triggering this missing module error?

1 Answer

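To use pyctcdecode, you must set both strategy and beam.search_type to "pyctcdecode". This matches what you saw: changing only search_type while strategy stays "beam" still takes the code path that imports ctc_decoders. Your config should be: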

# `width` is the loop variable from the question's snippet
cfg = DictConfig({
    "strategy": "pyctcdecode",
    "beam": {
        "search_type": "pyctcdecode",
        "kenlm_path": os.path.abspath("dummy.bin"),
        "width": width,
        "alpha": 0.0,
        "beta": 0.0,
        "return_best_hypothesis": True
    }
})
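
To run this over the beam widths from the question, the config can be built once per width. The sketch below uses plain dicts so that it runs without NeMo installed. The helper name make_pyctcdecode_cfg is hypothetical, and the key names simply mirror the config above; they have not been checked against a specific NeMo release.

```python
import os


def make_pyctcdecode_cfg(width, kenlm_path="dummy.bin"):
    """Return the config above as a plain dict for a single beam width."""
    return {
        "strategy": "pyctcdecode",
        "beam": {
            "search_type": "pyctcdecode",
            "kenlm_path": os.path.abspath(kenlm_path),
            "width": width,
            "alpha": 0.0,
            "beta": 0.0,
            "return_best_hypothesis": True,
        },
    }


beam_widths = [3, 10, 20]  # same widths as in the question
configs = [make_pyctcdecode_cfg(w) for w in beam_widths]

# In the notebook, each dict would be wrapped and applied as in the question:
# for cfg in configs:
#     quartznet_model_copy.change_decoding_strategy(decoding_cfg=DictConfig(cfg))
```

Building the configs separately from applying them makes it easy to print and inspect each one before calling change_decoding_strategy.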
