2
$\begingroup$

Most of us are familiar with Godel numbering. While ideal for many theoretical purposes, it seems unlikely to be the most efficient mapping of ordered $k$-tuples to $\mathbb N$.

I realize my question might turn on my definition of "efficient", so I guess as a starting point: given $k$ natural numbers in $[1,N]$, the most efficient encoding is that function which minimizes its aggregate sum across all possible $k$-tuples. This may be a bad definition if it turns out to support many wildly different encoding schemas, but my guess would be that an obvious best choice (or best general approach) would quickly emerge as $k$ and $N$ increase.

It's also a requirement that every unique $k$-tuple correspond to one and only one natural number, such that the ordered $k$-tuple can be recovered. It's also a requirement that one can use $k,N$ values outside of the bounds without the function failing; my hope is that it would support arbitrarily many arbitrarily large values in a reasonable way.

Is there an established answer to this question, or one similar to it in spirit?

As an example, I've got a method which at least improves slightly on Godel numbering in that the base for each additional term remains at $2$ instead of increasing, although it grows on the same order as Godel otherwise. Also, you needn't keep track of any primes to use it.

$$ \begin{array}{} s_1:=2^{n_1}+1 \\ s_i:= 2^{n_i}s_{i-1}+1 \end{array} $$

This is effectively nothing more sophisticated than encoding $(5,3)$ as $1000100_2$, however. Surely there's a smarter approach? Ideally, something sub-exponential?

$\endgroup$
3
  • 3
    $\begingroup$ Is it your intention to fix $k$, or to have it vary? For fixed $k$ and $N$ the best encodings are some form of square/triangular array, with size $N\choose k$ for unordered pairs; for fixed $k$ and variable $N$ one can arrange the elements of these arrays such that the sequence for $N$ is an initial subsequence of the sequence for $N+1$ and so get optimal size for all $N$ for that fixed $k$. If you want to vary $k$, you may have to be more specific about your efficiency. $\endgroup$ Commented Jul 8, 2021 at 23:22
  • $\begingroup$ Do you have some practical application in mind? Note that programming languages like Lisp, Standard ML, OCaml, Haskell etc., all provide efficient generic implementations of recursive data types. Of course, these language use representations that are a compromise between space and speed (of access), while you seem to be focussed on minimising space. $\endgroup$ Commented Jul 8, 2021 at 23:53
  • $\begingroup$ @StevenStadnicki Thanks, that's helpful. But my intention is for $k$ to vary; what I'd like to say is I'm looking for the best encoding mechanism for unknown $k$ and $N$, but I realized that's probably too vague, so I guess I need another definition of efficiency where $k$ can vary in some well-defined way... $\endgroup$ Commented Jul 9, 2021 at 0:08

1 Answer 1

3
$\begingroup$

Just a thought that someone else could choose to improve, but you could instead rewrite your numbers in base 9, and then use 9 as your delimiter to separate the values. So you could write $(5,3)$ as $593$. Likewise, you could write $(35,19,31)$ as $( 38_9,21_9,34_9)$ which converts to $38921934$. This is not the best possible method for encoding though, since there are at least some strings of digits that are never encoded into (such as $9999$). The most efficient encoding should not miss any strings.

$\endgroup$
1
  • 2
    $\begingroup$ Wow, I don't know why I didn't consider this. This actually does seem to satisfy what I was looking for. I changed it to using base-3 and 2s as delimiters, which is a bit more compact (for small $k$ and $N$), and it seems to work awfully well. $\endgroup$ Commented Jul 9, 2021 at 0:24

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.