Finding a 'short' binary sequence not contained in a given binary sequence

Question

Let $k>1$ be an integer, let $n = 2^k$. Let $s: \{0, \ldots, n-1\} \to \{0,1\}$ be any binary sequence of length $n = 2^k$. There are $n - k$ coherent subsequences of $s$ having length $k$, and the total number of binary sequence of length $k$ is $n=2^k$, so there are some binary sequences of length $k$ that are not a subsequence of $s$.

Question. Is there an algorithm with time $O(n)$ outputting one binary sequence of length $k$ that is not a subsequence of $s$?

Yuval Filmus · Accepted Answer · 2022-01-14 12:05:40Z

Construct a suffix tree for your string. If your string contained all $2^k$ substrings of length $k$, then the first $k$ levels of the suffix tree, ignoring edges labelled \$, would be a complete binary tree, and moreover, in the first $k-1$ levels, all edge labels have length $1$. Since this is not the case, one of the following must happen:

Some edge label in the first $k-1$ levels has length more than $1$. Take such an edge label as high up in the tree as possible. In spells a substring $p$ of length at most $k$ which does occur in the word, but if we flip the final bit, then we get a substring $p'$ which doesn't.
All edge labels in the first $k-1$ levels have length exactly $1$. In that case, after removing edges labelled \$, the first $k$ levels cannot form a complete binary tree. In particular, some node, corresponding to a substring $p$ of length less than $k$, has a child whose label starts with $\sigma \neq \\\$$, but no child whose label starts with $\overline{\sigma}$. Thus $p\overline{\sigma}$ does not occur in the word.

Here is an example: Let $k = 2$, and consider the word $0011$.

After removing edges labeled \$, the root has two outoing edges labelled $0,1$; its $0$-child has two outgoing edges lablled $011\\\$$ and $11\\\$$; and its $1$-child has a single outgoing edge $1\\\$$. Thus the subtring $10$ is missing.

Here is another example: Again let $k=2$, and consider the word $0101$.

After removing edges labeled \$, the root has two outgoing edges labelled $01,1$. Thus the substring $00$ is missing. The $1$-child has a single outgoing edge labelled $01\\\$$, and so the substring $11$ is also missing.

Stack Exchange Network

Finding a 'short' binary sequence not contained in a given binary sequence

1 Answer 1

Hot Network Questions

Finding a 'short' binary sequence not contained in a given binary sequence

1 Answer 1

Related

Hot Network Questions