Skip to main content
Source Link

Here is a simple-to-state, but notoriously difficult open question from algorithmics, which I think might not have received yet the attention of the larger mathematics community.

The Greedy Superstring Conjecture:

Consider a set of $n$ strings $s_1, \dots, s_n$ over a finite alphabet $\Sigma$. Assume that none of the strings is a substring of another. We want to find a shortest string $s$ that contains all $s_1, \dots, s_n$ as substrings.

Here's an "obvious" heuristic: find the two strings $s_i, s_j$ with the largest overlap, and replace $s_i$ and $s_j$ with the string obtained by overlapping $s_i$ and $s_j$ as much as possible. After $n-1$ steps we obtain a string that is the superstring of all $s_1, \dots, s_n$. The conjecture says that this solution is at most twice as long as the optimum.

The problem is described (among other places) in Vazirani's book Approximation Algorithms, Chapter 2.3.

Post Made Community Wiki by László Kozma