All Questions

0 votes
0 answers
147 views

Montreal Forced Aligner (MFA) taking too long (almost 18 days and still running) to train on a 33 GB corpus

We are using Montreal Forced Aligner (MFA) 3.x to train an acoustic model on a large dataset (~33 GB of audio and transcripts in an Indian language). The training process takes an extremely long time (...
Swayangjit's user avatar
  • 1,881
0 votes
1 answer
40 views

I am unable to produce search results from Google/YouTube in my speech recognition code

I am trying to build a chatbot that can interact with people and help them with quick updates. Below is the code that I am using to get the search results from YouTube/Google. Please tell me where the ...
Sai Sasidhar's user avatar
0 votes
0 answers
165 views

train_model not found in GenericAssistant

In the book GenericAssistant, when we enter the data from a JSON file, we have to press train_model, but when I press train_model, it says there is no such module! import sys import threading import tkinter ...
Yashar Moh's user avatar
-2 votes
1 answer
175 views

Understanding LSTM for speech recognition

I am trying to understand LSTMs for speech recognition. I do understand that the LSTM here basically generates a phone index (let's say each unique phone is mapped to a unique integer) at the output for ...
Anantha Krishnan's user avatar
2 votes
1 answer
1k views

How can I split a WAV file into words in Python?

E.g., the WAV file contains "How are you?" and I want to split it into 3 WAV files: "How", "are", and "you". Could you help me?
ezgi ozgur's user avatar
0 votes
1 answer
802 views

How can I make r.recognize_google keep listening and not stop?

I'm working on a chatbot and it works fine so far. However, if you don't talk directly after the chatbot talks to you, it gives you this error: in recognize_google if not isinstance(actual_result, dict) ...
Enrique Gil Garcia's user avatar
2 votes
0 answers
307 views

"phones in the dictionary that do not have acoustic models" montreal forced aligner

I am trying to follow the example in the MFA documentation. On my computer (Windows 10, Python 3.9, pip 21.2.4) I run: pip install montreal-forced-aligner mfa download acoustic english Then, when I ...
Yanirmr's user avatar
  • 1,042
0 votes
1 answer
355 views

How can I get only the text part out of the recognised object in Microsoft Speech Service

The following is my speech recognition output from a file, produced with the Microsoft Azure Speech SDK. I want to know how I can extract just the 'text' part from this output rather than the complete output. ...
Arihant Jain's user avatar
3 votes
3 answers
3k views

How to load a percentage of the data with Hugging Face load_dataset

I am trying to download the "librispeech_asr" dataset, which totals 29 GB, but due to limited space in Google Colab I'm not able to download/load the dataset, i.e. the notebook crashes. So I ...
dev1ce's user avatar
  • 1,757
1 vote
0 answers
178 views

Text normalizer too slow on large text files (Python)

I am working on a text normalizer. It works just fine with small text files but takes a very long time with large text files such as 5 MB or more. Is there anything to change in the code to make it ...
mehio hatab's user avatar
0 votes
1 answer
533 views

How to make custom speech recognition in Python?

I tried out the following code: import speech_recognition as sr r = sr.Recognizer() with sr.Microphone() as source: r.adjust_for_ambient_noise(source) print("Say something!") audio = r....
Nitin Singhal's user avatar
0 votes
1 answer
158 views

Speech-to-text recognition: text correction and result improvement in Python

How can I achieve the result below in Python? The reference text is: "We wanted people to know that we've got something brand new and essentially this product is uh what we call disruptive changes the way ...
Manoj Deshpande's user avatar
-1 votes
2 answers
1k views

Enabling audio input for the SpeechRecognition library

How do I turn on audio input for all device indexes using a speech recognition library? I want to pass in the audio for testing, and there is a possibility that the library uses a different ...
Aditya Totla's user avatar
1 vote
2 answers
7k views

Speech recognition duration setting issue in Python

I have an audio file in WAV format that I want to transcribe. My code is: import speech_recognition as sr harvard = sr.AudioFile('speech_file.wav') with harvard as source: try: audio = r....
0 votes
0 answers
165 views

How to convert speech to text using the Google API

What solution would you suggest? I want to convert speech (not in English) to text, then translate the text to English, look for specific keywords, and save the data in the database.
syed irfan's user avatar