Advice
0 votes
0 replies
37 views

I’m currently working on extracting / segmenting text lines from handwritten documents. Most of the input images are camera-captured, which introduces several challenges: Lines may be curved or ...
Activa Suit
1 vote
2 answers
232 views

I want to use Benepar with a French model to do syntactic segmentation. I followed the tutorial, but I always get this error: RuntimeError: Error(s) in loading state_dict for ChartParser: ...
nassima.crt
0 votes
1 answer
216 views

For a text image input, I need to break the text into segments using the OpenCV library. Say the image has 4 lines of text; I need to write a function that breaks down and cuts the lines and ...
Rozira
  • 1
3 votes
0 answers
95 views

The icu4x icu_segmenter::WordSegmenter seems like the best word segmenter out there. I don't understand how data providers work with word segmentation at all. It seems very complicated to me and I ...
mash
  • 2,544
1 vote
1 answer
721 views

I'm trying to write a method to count the number of words when the content is in Chinese and Japanese. This should exclude special characters, punctuation, and whitespace. I tried creating a regex ...
Sherlock
1 vote
0 answers
39 views

I am currently working on a problem that requires segmenting a video lecture transcript based on the topics present within the video. My dataset consists of sentence-wise labels where 1 indicates the ...
Yusha Arif
0 votes
1 answer
593 views

OriginalImage1 BinarizedImage1 OriginalImage2 BinarizedImage2 OriginalImage3 BinarizedImage3 OriginalImage4 BinarizedImage4 I'm preparing images for OCR with Tesseract (pre-trained for this custom font) ...
user18722995
2 votes
1 answer
476 views

I want to split a large corpus (.txt) into sentences with a custom rule, i.e. {SENT}, using spaCy 3.1. My main issue is that I want to "disable" the segmentation from the pretrained spacy ...
Artemis
  • 145
1 vote
0 answers
47 views

Is it possible to segment a bs4.element.Tag into several bs4.element.Tag objects? Consider the following application: 1- The original bs4.element.Tag contains a paragraph. 2- We want to segment ...
A.M.
  • 1,807
1 vote
1 answer
2k views

The following code uses SymSpell in Python; see the symspellpy guide on word_segmentation. It uses the "de-100k.txt" and "en-80k.txt" frequency dictionaries from a GitHub repo, you ...
questionto42
  • 9,922
9 votes
1 answer
3k views

What is the difference between tokenization and segmentation in NLP? I searched for both but didn't really find any differences.
Mahmoud Noor
0 votes
1 answer
89 views

So the question revolves around character segmentation. My problem is the following: I want to segment characters based on y-axis pixel numbers, following this (in Python): source. What I already ...
Questonary
0 votes
1 answer
199 views

How can I obtain a whole word within a string-type sentence? For instance, if the given string was: The app has been updated to 88.0.1234.141 which contains a number of fixes and improvements. And ...
Bhj
  • 13
-1 votes
2 answers
710 views

Is there a simple way to convert plain text into a segmented array of chunks in Python? Each chunk should be, for example, 16 bytes. If the last part of the plain text is smaller than 16 bytes it should ...
Pm740
  • 423
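A minimal Python sketch of one way to do the chunking asked about above, assuming "16 bytes" refers to the UTF-8 encoding of the text. The function name chunk_bytes and its parameters are hypothetical. The excerpt is cut off before it says what should happen to a short final chunk, so this sketch simply leaves that chunk short.

```python
def chunk_bytes(text: str, size: int = 16, encoding: str = "utf-8") -> list[bytes]:
    """Encode text and split the resulting bytes into chunks of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    data = text.encode(encoding)
    # Step through the byte string in strides of `size`; the final slice
    # may be shorter than `size` and is returned unchanged (assumption).
    return [data[i:i + size] for i in range(0, len(data), size)]


chunks = chunk_bytes("a" * 40)
print([len(c) for c in chunks])
```

Note that the split is done on bytes, so a multi-byte character can end up divided across two chunks; rejoin the chunks with b"".join(chunks) before decoding.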
2 votes
3 answers
436 views

I'd like to remove all the timestamps in parentheses from the sample text data below. Input: Agent: Can I help you? ( 3s ) Customer: Thank you( 40s ) Customer: I have a question about X. ( 8m 1s ) ...
LY1
  • 35
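A hedged Python sketch of one regex-based approach to the timestamp question above. It assumes every timestamp is a parenthesised run of integer-plus-unit pairs using h, m, or s, as in the visible part of the sample; the rest of the input is truncated in the excerpt, so other formats are not covered. The names TIMESTAMP and strip_timestamps are hypothetical.

```python
import re

# Matches a parenthesised duration such as "( 3s )", "( 40s )" or "( 8m 1s )",
# together with any whitespace immediately before the opening parenthesis.
# Assumption: durations consist only of integers followed by h, m, or s.
TIMESTAMP = re.compile(r"\s*\(\s*(?:\d+\s*[hms]\s*)+\)")


def strip_timestamps(text: str) -> str:
    """Remove parenthesised durations from a transcript string."""
    return TIMESTAMP.sub("", text)


sample = (
    "Agent: Can I help you? ( 3s ) Customer: Thank you( 40s ) "
    "Customer: I have a question about X. ( 8m 1s )"
)
print(strip_timestamps(sample))
```

Parentheses that do not contain a duration, such as "(see above)", are left untouched because the pattern requires at least one digit-plus-unit pair.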
