Questions tagged [string-matching]
Use for generic string matching that may be exact substring matching (though then prefer the tag `exact-string-matching`), may be matching to a regular expression, or may be approximate matching (e.g. finding substrings within a given Levenshtein distance of a reference string)
79 questions
4
votes
0
answers
75
views
Remove contiguous 5th powers (5-fold repetitions) from list of 'a's and 'b's?
Given a list of characters in $\{a,b\}$, for example $abababababa$, what is the most efficient way to remove all 5th powers in a way that makes the string as short as possible?
(This example would ...
0
votes
0
answers
61
views
Fastest algorithm to get thortest uncommon substring for every word within a list of words
Given a list of words of length $N$ where each word is off length atmost $M$, what is the best known algorithm with respect to time complexity for computing the shortest "unique" substring ...
0
votes
0
answers
53
views
string matching algorithm question for matching approximately similar names between two lists
The focus of this question is on natural language processing, specifically matching names between 2 lists. I am looking at employees that work in the same organization, however I obtained data from ...
0
votes
0
answers
55
views
Why does the Horspool algorithm check the substring backward?
I have some doubts about the need for backward checking in the Horspool algorithm.
The shift value depends only on the last character of the current window in the text. Whether the substring ...
1
vote
0
answers
69
views
Sparse bit string pattern matching
Suppose there are two strings of bits. Let's call them the needle (n) and the haystack (h).
We'll say that the needle matches ...
1
vote
1
answer
95
views
Knuth–Morris–Pratt algorithm - Determine the next position of the proper prefix of the pattern string
When I learn Knuth–Morris–Pratt algorithm, I got a function f from the book Fundamentals of Data Structures in C.
...
1
vote
1
answer
136
views
A problem maybe related to pattern-matching
Let $\Sigma_{1}=\{a,b\}$ and $\Sigma_{2}=\{t,f\}$.
Define the function $f_{w}:\Sigma_{1}^{*}\rightarrow\Sigma_{2}^{*}$ for every $w\in\Sigma_{1}^{*}$; $f_{w}(w')\in\Sigma_{2}^{*}$ is the word obtained ...
1
vote
0
answers
89
views
Simultaneous matching of all Caesar rotations of a pattern in a text
Suppose we have an alphabet of size $S$, a pattern of length $P$ and a text of length $T$. We want to design an algorithm for matching all caesar rotations of the pattern $P$ in the text $T$. The ...
1
vote
1
answer
149
views
Finding the subset of a dictionary that has the minimum edit distance to a given string
I'm looking for the most efficient way of solving an Levenshtein edit distance problem.
We are given as input:
A set of strings S of size ...
0
votes
0
answers
38
views
What algrorithm computes a mutally exclusive partitioning from two regulair expressions?
The Question
A regular expression, such as AL+[EYI]+, represents a set of strings.
...
0
votes
1
answer
156
views
Shortest string which all input strings is its substring
Given a set of input strings, how do I find a shortest string S so that all input strings appear as a substring of S?
for example:
...
0
votes
1
answer
333
views
Longest prefix of string S that is also a sub-prefix of S in linear time
As the question suggest. I want an algorithm that runs in linear time which finds the longest prefix of a substring that is a sub-prefix of the same string.
Formally:
Given a string $S$ of length $n$,...
2
votes
1
answer
414
views
Understanding Rabin-Karp's rolling hash computation
Possibly related to this. Let $T$ be the text and $n$ be the length of the pattern. I understand that if substrings of $T$ are interpreted as base-$d$ numbers where $d$ is the alphabet's size, then ...
2
votes
1
answer
109
views
Complexity of right-to-left Knuth-Morris-Pratt algorithm
Suppose we modify the Knuth-Morris-Pratt string-matching algorithm to scan the pattern right-to-left a la Boyer-Moore, and consequently apply the shift rule on the right side of the cursor instead of ...
1
vote
1
answer
85
views
Compressed string lookup by name or ordinal
I have a large list of strings with associated values. I need to do a fast lookup into the list by the string prefix, but also by the string's ordinal within the list. For example,
...