All Questions
20 questions
0
votes
0
answers
147
views
Montreal Forced Aligner (MFA) taking too long (almost 18 days and still running) to train on a 33 GB corpus
We are using Montreal Forced Aligner (MFA) 3.x to train an acoustic model on a large dataset (~33 GB of audio and transcripts in an Indian language). The training process takes an extremely long time (...
0
votes
1
answer
40
views
I am unable to get search results from Google/YouTube in my speech recognition code
I am trying to build a chatbot that can interact with people and help them with quick updates. Below is the code that I am using to get the search results from YouTube/Google. Please tell me where the ...
0
votes
0
answers
165
views
Not having the train model in Generic Assistant
In the GenericAssistant library, after we load the data from a JSON file, we have to press train_modle, but when I press train_modle, it says there is no such module!
import sys
import threading
import tkinter ...
-2
votes
1
answer
175
views
Understanding LSTM for speech recognition
I am trying to understand LSTM for speech recognition. I do understand that the LSTM here basically generates a phone index (let's say each unique phone is mapped to a unique integer) at the output for ...
2
votes
1
answer
1k
views
How can I split a WAV file into words in Python?
E.g. the WAV file contains "How are you?" and I want to split it into 3 WAV files: "How", "are", "you". Could you help me?
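One common starting point for this kind of request is to cut the audio wherever there is a long enough pause. The sketch below is only that heuristic, not forced alignment: it assumes mono 16-bit PCM input, and the function names, the `threshold` of 500, and the `min_silence` of 2000 samples are all hypothetical values that would need tuning. Words spoken without a pause between them will not be separated.

```python
import array
import sys
import wave


def read_mono_16bit(path):
    """Load a mono 16-bit PCM WAV file as (samples, frame_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        rate = wav.getframerate()
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples, rate


def split_on_silence(samples, threshold=500, min_silence=2000):
    """Return (start, end) index pairs of non-silent runs; end is exclusive."""
    segments = []
    start = None  # index where the current run began, or None
    quiet = 0     # consecutive quiet samples seen inside the current run
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # Close the run at the first quiet sample of this pause.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        segments.append((start, len(samples) - quiet))
    return segments


def write_segment(path, samples, rate):
    """Write a slice of samples back out as a mono 16-bit WAV file."""
    out = array.array("h", samples)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(out.tobytes())
```

A typical use would be to call `read_mono_16bit`, pass the samples to `split_on_silence`, and then call `write_segment` once per returned pair with `samples[start:end]`.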
0
votes
1
answer
802
views
How can I make r.recognize_google keep listening and not stop?
I'm working on a chatbot and it works fine so far. However, if you don't talk directly after the chatbot talks to you, it gives you this error:
in recognize_google
if not isinstance(actual_result, dict) ...
2
votes
0
answers
307
views
"phones in the dictionary that do not have acoustic models" in Montreal Forced Aligner
I am trying to follow the example in the MFA documentation:
I run the following on my computer (Windows 10, Python 3.9, pip 21.2.4):
pip install montreal-forced-aligner
mfa download acoustic english
Then, when I ...
0
votes
1
answer
355
views
How can I get only the text part out of the recognised object in Microsoft Speech Service
The following is my speech recognition output for a file, from the Microsoft Azure Speech SDK. I want to know how I can extract just the 'text' part from this output rather than the complete output.
...
3
votes
3
answers
3k
views
How to load a percentage of data from Hugging Face load_dataset
I am trying to download the "librispeech_asr" dataset, which totals 29 GB, but due to limited space in Google Colab, I'm not able to download/load the dataset, i.e. the notebook crashes.
So I ...
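For reference, the `datasets` library accepts a slice in the `split` string, and it also has a streaming mode. The sketch below only builds the slice string, so it runs without the library; the helper name `percent_slice` and the split name "train" are hypothetical placeholders (the real split names are listed on the dataset card). As far as I know, a percentage slice is applied after the data has been downloaded, so for the disk-space problem described here the streaming variant in the comments is probably the more relevant one.

```python
def percent_slice(split, percent):
    """Build a split string such as 'train[:10%]' for load_dataset."""
    if not isinstance(percent, int) or not 0 < percent <= 100:
        raise ValueError("percent must be an integer from 1 to 100")
    return f"{split}[:{percent}%]"


# Usage, left commented out because it needs `pip install datasets` and
# network access. "train" is a placeholder split name.
#
# from datasets import load_dataset
#
# # First 10% of the split (the slice is taken from the prepared dataset):
# subset = load_dataset("librispeech_asr", split=percent_slice("train", 10))
#
# # Streaming: iterate over examples without storing the full dataset:
# streamed = load_dataset("librispeech_asr", split="train", streaming=True)
# first_examples = list(streamed.take(100))
```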
1
vote
0
answers
178
views
Text Normalizer on large text files TOO SLOW (Python)
I am working on a text normalizer. It works just fine with small text files but takes a very long time with large text files such as 5 MB or more.
Is there anything to change in the code to make it ...
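The asker's code is not shown, so the following is only a generic sketch of two changes that often help in this situation: compiling every pattern once instead of inside a loop, and processing the file line by line in a single pass per line. The replacement table and the function names are hypothetical and stand in for whatever rules the real normalizer applies.

```python
import re

# Hypothetical replacement rules; the real normalizer's rules are unknown.
REPLACEMENTS = {"&": "and", "%": " percent", "dr.": "doctor"}

# One alternation, compiled once. Longer keys come first so they win
# over shorter keys that share a prefix.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(REPLACEMENTS, key=len, reverse=True))
)
_SPACES = re.compile(r"\s+")


def normalize_line(line):
    """Lowercase, apply all replacements in one pass, collapse whitespace."""
    line = line.lower()
    line = _PATTERN.sub(lambda match: REPLACEMENTS[match.group(0)], line)
    return _SPACES.sub(" ", line).strip()


def normalize_file(src_path, dst_path):
    """Stream the input line by line so memory use stays flat."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(normalize_line(line) + "\n")
```

Running a profiler such as `cProfile` on the original code would show whether the time is actually spent in the replacement step before changing anything.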
0
votes
1
answer
533
views
How to make custom speech recognition in Python?
I tried out the following code:
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
    r.adjust_for_ambient_noise(source)
    print("Say something!")
    audio = r....
0
votes
1
answer
158
views
Speech-to-text recognition: text correction and result improvement in Python
How can I achieve the result below in Python?
The reference text is: "We wanted people to know that we've got something brand new and essentially this product is uh what we call disruptive changes the way ...
-1
votes
2
answers
1k
views
Enabling Audio Input for Speech Recognition Library
How do I turn on audio input for all device indexes using a Speech Recognition library? I want to pass in the audio for testing, and there might be a possibility that the library uses a different ...
1
vote
2
answers
7k
views
Speech Recognition duration setting issue in Python
I have an audio file in WAV format that I want to transcribe.
My code is:
import speech_recognition as sr
harvard = sr.AudioFile('speech_file.wav')
with harvard as source:
    try:
        audio = r....
0
votes
0
answers
165
views
How to convert speech to text using the Google API
What solution would you suggest?
I want to convert speech (not in English) to text, then translate the text to English, look for specific keywords, and save the data in the database.
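The steps listed above (transcribe, translate, search for keywords, store) can be wired together roughly as follows. This is a sketch under assumptions: `transcribe` and `translate` are passed in as plain callables because the question does not say which Google services are in use, the keyword set is made up, and SQLite with a hypothetical `transcripts` table stands in for "the database".

```python
import sqlite3

KEYWORDS = {"invoice", "refund"}  # hypothetical keywords to look for


def find_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur as whole words in the text."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    return sorted(words & keywords)


def save_result(conn, original, english, hits):
    """Store one processed utterance; the table layout is an assumption."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts "
        "(original TEXT, english TEXT, keywords TEXT)"
    )
    conn.execute(
        "INSERT INTO transcripts VALUES (?, ?, ?)",
        (original, english, ",".join(hits)),
    )
    conn.commit()


def process(audio_path, transcribe, translate, conn):
    """Run the pipeline with caller-supplied speech and translation steps."""
    original = transcribe(audio_path)  # speech -> text in the source language
    english = translate(original)      # source-language text -> English
    hits = find_keywords(english)
    save_result(conn, original, english, hits)
    return hits
```

Keeping the two service calls as parameters means the keyword and storage logic can be tried out with stub functions before any API credentials are set up.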