Questions tagged [natural-language-processing]

The field of natural language processing covers attempts to make sense of text in a human language using computers.

5 votes
2 answers
111 views

I created a Python 3.11 utility that truncates an input string to a fixed word count—splitting on any whitespace, collapsing runs, and dropping trailing stop-words—so you get clean, concise snippets ...
Bob's user avatar
  • 221
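
As a rough illustration of the utility described in the excerpt above (not the asker's actual code, which is not shown here), a minimal sketch might look like the following. The function name `truncate_words` and the tiny `STOP_WORDS` set are hypothetical placeholders; a real stop-word list would be much longer.

```python
# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "or"})


def truncate_words(text: str, max_words: int, stop_words=STOP_WORDS) -> str:
    """Keep at most max_words words, then drop trailing stop-words."""
    if max_words < 0:
        raise ValueError("max_words must be non-negative")
    # str.split() with no arguments splits on any whitespace and
    # collapses runs, so no empty strings appear in the result.
    words = text.split()[:max_words]
    # Strip stop-words from the end so the snippet does not end on "and the".
    while words and words[-1].lower() in stop_words:
        words.pop()
    return " ".join(words)


print(truncate_words("The  quick brown fox and the hound", 6))
```

Here the first six words are kept, then the trailing "the" and "and" are dropped, so the call prints `The quick brown fox`.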
2 votes
3 answers
135 views

I’ve implemented a small utility function in Python 3.11 that takes an input string, splits it into word-based chunks of a given size, and allows a specified overlap between consecutive chunks. This ...
Bob's user avatar
  • 221
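
The chunking idea in the excerpt above can be sketched as follows. This is only one plausible reading of "word-based chunks with overlap", not the asker's implementation; the name `chunk_words` and its parameters are assumptions made for the example.

```python
def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of `size` words; consecutive chunks share `overlap` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a chunk has reached the end of the text, so the
        # tail is not repeated as a shorter, redundant chunk.
        if start + size >= len(words):
            break
    return chunks


print(chunk_words("a b c d e", size=3, overlap=1))
```

With five words, a chunk size of 3 and an overlap of 1, the call prints `['a b c', 'c d e']`. Requiring `overlap < size` keeps the step positive, which avoids an infinite loop.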
4 votes
1 answer
121 views

I made a simple search engine using the xkcd API in Rust which turned out better than I'd hoped for! I decided to use tf-idf as a way to rank results, which I feel has some room for improvement. ...
joeymalvinni's user avatar
6 votes
2 answers
131 views

I am trying to build a usable NLP corpus but am getting bottlenecked by how long the program takes (200 hours). With so much data I know that optimizing my code even a little bit will net me huge time ...
evader110's user avatar
  • 163
3 votes
1 answer
152 views

I have a data set of approximately 300,000 rows and two columns. Each row contains a string, and some are longer than others. All in all, the data set in a ...
Louis's user avatar
  • 131
7 votes
4 answers
472 views

Occasionally, we want to do some rudimentary parsing of English text: we split the text into individual words. ...
Samuel Muldoon's user avatar
1 vote
1 answer
274 views

For educational purposes, I am preprocessing multiple short texts describing the symptoms of car faults. The text is written by humans and is rich in misspellings, capital letters and ...
Andrea Ciufo's user avatar
2 votes
1 answer
137 views

I made a Python voice assistant. It takes the user's voice input and runs it through multiple if-else statements; each one checks a condition and, if the input satisfies it, executes a specific function. ...
Rohith Nambiar's user avatar
2 votes
1 answer
140 views

I have the following DataFrame in pandas: code town district suburb 02 Benalmádena Málaga Arroyo de la Miel 03 Alicante Jacarilla Jacarilla, Correntias Bajas (Jacarilla) 04 Cabrera d'Anoia Barcelona ...
Carola's user avatar
  • 163
3 votes
0 answers
766 views

I've been trying to create a piece of code that loops through each element of a list of questions, preprocesses it, and then calculates the cosine similarity with the rest of the elements (...
Shodai Thox's user avatar
2 votes
2 answers
370 views

This is my first non-trivial program in Python. I am coming from a Java background and I might have messed up or ignored some conventions. I would like to hear feedback on my code. ...
BovineScatologist's user avatar
1 vote
1 answer
225 views

I've just picked coding back up for the first time in a long time, so I understand if your eyes bleed looking at this code. It all works, but I'd be grateful for any tips (how to improve the Python ...
Will's user avatar
  • 111
1 vote
1 answer
119 views

I wanted to build an Inverse Document Frequency function, because in my opinion it was not easy to do with scikit and I also wanted to show how it works for educational reasons. Also reading this question ...
Andrea Ciufo's user avatar
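
For readers unfamiliar with the term in the excerpt above, a minimal sketch of inverse document frequency follows. It assumes the plain textbook definition, idf(t) = ln(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. The asker's function is not shown here and may differ, and libraries often use smoothed variants. The name `inverse_document_frequency` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def inverse_document_frequency(documents: list[list[str]]) -> dict[str, float]:
    """Return idf(t) = ln(N / df(t)) for every term in the tokenized documents."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # set() ensures each document counts a term at most once.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


docs = [["car", "engine", "noise"], ["car", "brake"]]
idf = inverse_document_frequency(docs)
print(idf["car"], idf["engine"])
```

A term that appears in every document ("car" here) gets a score of 0, while rarer terms score higher, which is why idf is commonly used to down-weight very common words.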
2 votes
0 answers
68 views

As part of my NLP project at work, I want to loop over all files that are either PDF or docx in the same directory. The end purpose is to create a dataframe with text content of the files in one ...
Sam.H's user avatar
  • 143
2 votes
2 answers
264 views

I am working on a text normalizer. It works just fine with small text files but takes a very long time with large text files of 5 MB or more. Is there anything to change in the code to make it ...
mehio hatab's user avatar