What happens if I start a bash array with a big index?

Question

I was trying to create a bash "multidimensional" array, I saw the ideas on using associative arrays, but I thought the simplest way to do it would be the following:

for i in 0 1 2
do 
    for j in 0 1 2
    do
        a[$i$j]="something"
    done
done

It is easy to set and get values, but the jumps in the indexes might make it terrible for slightly bigger arrays if bash allocates space for the elements from index 00 to 22 sequentially (I mean allocating positions {0,1,2,3,4,...,21,22}), instead of just the elements which were actually set: {00,01,02,10,11,...,21,22}.

This made me wonder, what happens when we start a bash array with an index 'n'? Does it allocate enough space for indexes 0 to n, or does it allocate the nth element sort of individually?

Not to detract from Stéphane's excellent answer below, but the fact that you even need multi-dimensional arrays suggests that you are doing something that shell scripts are not suited to doing (such as processing data rather than executing other programs to process that data) and should probably be using a language better suited to your task - awk, perl, or python for example. The fact that you can wrestle bash into doing something it's bad at doesn't mean that you should. — cas, Commented Aug 12, 2023 at 12:55

Stéphane Chazelas · Accepted Answer · 2023-08-11 20:29:45Z

Array indices in bash like in ksh (whose array design bash copied) can be any arithmetic expression.

In a[$i$j]="something", the $i and $j variables are expanded first, so with i=0 j=1 that becomes a[01]="something", and 01 as an arithmetic expression is the octal number 1 in bash. With i=0 j=10, that would be a[010]="something", which is the same as a[8]="something". And you'd get a[110]="something" for both i=11 j=0 and i=1 j=10.
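The two pitfalls described above can be reproduced with a short sketch. The variable names octal_keys, collision_keys and collision_value are only illustrative and are not part of the original answer:

```bash
#!/usr/bin/env bash
# Subscripts of indexed arrays are evaluated as arithmetic expressions,
# so a leading zero turns the digits into an octal number.
unset a
i=0 j=10
a[$i$j]="octal"              # expands to a[010], i.e. a[8]
octal_keys="${!a[*]}"        # the indices that are actually set

# Two different (i, j) pairs that concatenate to the same digits.
unset a
i=11 j=0; a[$i$j]="first"    # a[110]
i=1 j=10; a[$i$j]="second"   # also a[110], overwrites "first"
collision_keys="${!a[*]}"
collision_value="${a[110]}"

printf '%s\n' "$octal_keys" "$collision_keys" "$collision_value"
```

This prints 8, 110 and second. Digits that are not valid octal are worse still: a[08]=x is rejected by bash with a "value too great for base" error.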

It should be obvious by now that it's not what you want.

Instead, do what you would do in C for two-dimensional arrays (matrices) and compute a flat index from the row and column:

matrix_size=3
for (( i = 0; i < matrix_size; i++ )) {
  for (( j = 0; j < matrix_size; j++ )) { 
    a[i * matrix_size + j]="something"
  }
}

(The C-like for (( ...; ...; ... )) construct is copied from ksh93.)
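As a minimal sketch of how the flat index is used afterwards, the same formula reads a cell back, and integer division and modulo recover the row and column from a flat index. The names cell, k, row and col are assumptions made for the example:

```bash
#!/usr/bin/env bash
matrix_size=3
a=()
for (( i = 0; i < matrix_size; i++ )); do
  for (( j = 0; j < matrix_size; j++ )); do
    a[i * matrix_size + j]="$i,$j"   # store the coordinates as the value
  done
done

# Read the element at row 1, column 2 with the same formula.
i=1 j=2
cell=${a[i * matrix_size + j]}

# Go the other way: recover row and column from flat index 7.
k=7
row=$(( k / matrix_size ))
col=$(( k % matrix_size ))

printf '%s\n' "$cell" "$row $col"
```

This prints 1,2 and then 2 1. Because every (i, j) pair with j < matrix_size maps to a distinct index, the collisions of the a[$i$j] approach cannot occur.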

Or switch to ksh93 which has multidimensional array support:

for (( i = 0; i < 3; i++ )) {
  for (( j = 0; j < 3; j++ )) { 
    a[i][j]="something"
  }
}

It's also somewhat possible to implement multidimensional arrays using associative arrays whose keys are just strings:

typeset -A a
for (( i = 0; i < 3; i++ )) {
  for (( j = 0; j < 3; j++ )) { 
    a[$i,$j]="something"
  }
}
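A short sketch of reading such an array back in bash (4.0 or later, for associative arrays). The stored values and the names value, count and has_55 are illustrative only:

```bash
#!/usr/bin/env bash
declare -A a                 # associative arrays need bash 4.0 or later
for (( i = 0; i < 3; i++ )); do
  for (( j = 0; j < 3; j++ )); do
    a[$i,$j]="cell $i/$j"    # the key is the literal string "i,j"
  done
done

i=2 j=1
value=${a[$i,$j]}            # look up the key "2,1"
count=${#a[@]}               # number of keys stored

# ${var+word} expands to word only if the element is set.
has_55=${a[5,5]+yes}

printf '%s\n' "$value" "$count" "${has_55:-no}"
```

This prints cell 2/1, 9 and no. The comma has no special meaning here; any separator that cannot appear in the indices themselves would work equally well.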

The resulting variable in each of the three cases (bash indexed array, ksh93 multidimensional array, bash associative array, in that order), as reported by typeset -p:

declare -a a=([0]="something" [1]="something" [2]="something" [3]="something" [4]="something" [5]="something" [6]="something" [7]="something" [8]="something")

typeset -a a=((something something something) (something something something) (something something something) )

declare -A a=([0,2]="something" [0,1]="something" [0,0]="something" [2,1]="something" [2,0]="something" [2,2]="something" [1,2]="something" [1,0]="something" [1,1]="something" )

Now to answer the question in the subject: in bash, like in ksh, plain arrays are sparse, which means you can have a[n] defined without a[0] to a[n-1] being defined. In that sense they're not like the arrays of C or most other languages or shells.
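Sparseness is easy to observe from within bash itself. In this sketch the indices and the names count, indices and first are chosen arbitrarily for illustration:

```bash
#!/usr/bin/env bash
unset a
a[1000000]="far"
a[3]="near"

count=${#a[@]}          # number of elements that are set, not the highest index
indices="${!a[*]}"      # the set indices, in ascending order
first=${a[0]-UNSET}     # ${var-word} gives word when the element is unset

printf '%s\n' "$count" "$indices" "$first"

# Iterate over the elements that exist rather than over 0..n.
for k in "${!a[@]}"; do
  printf 'a[%s]=%s\n' "$k" "${a[k]}"
done
```

The first three lines of output are 2, 3 1000000 and UNSET. Looping over "${!a[@]}" instead of counting from 0 is what keeps a script correct when the array has holes.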

Initially in ksh, array indices were limited to 4095, so matrices could be at most 64x64. That limit has since been raised to 4,194,303. In ksh93, I see that a[4194303]=1 allocates over 32MiB of memory, I guess to hold 4194304 64-bit pointers plus some overhead. That doesn't seem to happen in bash, where array indices can go up to 9223372036854775807 (at least on GNU/Linux amd64 here) without allocating more memory than is needed to store the elements that are actually set.

In all other shells with array support ((t)csh, zsh, rc, es, fish...), array indices start at 1 instead of 0 and arrays are normal non-sparse arrays where you can't have a[2] set without a[1] also set even if that's to the empty string.

Like in most programming languages, associative arrays in bash are implemented as hash tables with no notion of order or rank (you'll notice typeset -p shows them in seemingly random order above).
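Since the key order of an associative array cannot be relied upon, one way to walk a "matrix" stored in one is to regenerate the keys from the loop counters instead of enumerating "${!a[@]}". A minimal sketch, with the sample values and the name ordered made up for the example:

```bash
#!/usr/bin/env bash
# Keys deliberately listed out of order; bash stores them in hash order anyway.
declare -A a=([1,0]=c [0,1]=b [0,0]=a [1,1]=d)

ordered=
for (( i = 0; i < 2; i++ )); do
  for (( j = 0; j < 2; j++ )); do
    ordered+=${a[$i,$j]}     # visit cells row by row, whatever the hash order
  done
done

printf '%s\n' "$ordered"
```

This prints abcd regardless of the order in which typeset -p would list the keys.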

For more details on array design in different shells, see this answer to Test for array support by shell.
