2

I have created a 2d Numpy string array like so:

a = np.full((2, 3), '#', dtype=np.str_)
print(repr(a))

The output is:

array([['#', '#', '#'],
       ['#', '#', '#']], dtype='<U1')

I would like to pad it with '?' on all sides with a width of 1. I'm expecting output as:

array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')

I tried the following:

b = np.pad(a, ((1, 1), (1, 1)), 'constant', constant_values=(('?', '?'), ('?', '?')))

But that gives the following error:

File "<stdin>", line 1, in <module>
File "/usr/lib/python3/dist-packages/numpy/lib/arraypad.py", line 1357, in pad
    cast_to_int=False)
File "/usr/lib/python3/dist-packages/numpy/lib/arraypad.py", line 1069, in _normalize_shape
    return tuple(tuple(axis) for axis in arr.tolist())
AttributeError: 'tuple' object has no attribute 'tolist'

Similar code works for integers. What am I doing wrong for strings?

2
  • This is simply a numpy bug: its implementation assumes overly specific conditions about the array, and it dumps an unhelpful message when it can't go on. As @Kasramvd has shown, you can circumvent it by creating your own padding function. Commented Mar 1, 2018 at 12:53
  • I get a different error in 1.14. Notice that some of the modes involve maximum and interpolation. The 'constant' mode may be just a special case of one of those. It's overkill for a simple padding like this. Commented Mar 1, 2018 at 17:21

2 Answers

4

In this version of NumPy, np.pad's 'constant' mode does not accept string values for constant_values. Instead, as mentioned in the documentation, you can pass np.pad a padding function such as pad_with:

In [79]: def pad_with(vector, pad_width, iaxis, kwargs):
    ...:     pad_value = kwargs.get('padder', '?')
    ...:     vector[:pad_width[0]] = pad_value
    ...:     vector[-pad_width[1]:] = pad_value
    ...:     return vector
    ...: 

In [80]: np.pad(a, 1, pad_with)
Out[80]: 
array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')

Note that the line pad_value = kwargs.get('padder', '?') in pad_with supplies a default padding value, which is used when the caller of np.pad passes no padder argument. To pad with a different value, pass it to np.pad as the padder keyword argument:

In [82]: np.pad(a, 1, pad_with, padder='*')
Out[82]: 
array([['*', '*', '*', '*', '*'],
       ['*', '#', '#', '#', '*'],
       ['*', '#', '#', '#', '*'],
       ['*', '*', '*', '*', '*']], dtype='<U1')
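
One caveat with the slice vector[-pad_width[1]:] used above: when the trailing pad width for an axis is 0, -0 is just 0, so the slice covers the whole vector and overwrites the data. Below is a minimal sketch of a variant that indexes from the vector's length instead. The name pad_with_safe is made up for illustration; the padder keyword is the same one used above.

```python
import numpy as np

def pad_with_safe(vector, pad_width, iaxis, kwargs):
    # Hypothetical variant of pad_with that tolerates a pad width of 0.
    pad_value = kwargs.get('padder', '?')
    before, after = pad_width
    n = len(vector)
    vector[:before] = pad_value      # empty slice when before == 0
    vector[n - after:] = pad_value   # empty slice when after == 0
    return vector

a = np.full((2, 3), '#')
# One row on top, none below; no column on the left, two on the right.
b = np.pad(a, ((1, 0), (0, 2)), pad_with_safe, padder='*')
print(b)
```

With symmetric widths of 1 or more, this should behave the same as pad_with.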

0

Even if you can get pad to work, it would be faster to insert a into a blank b. pad is set up for complex padding patterns, and does the job iteratively - row by row, column by column.

In [29]: a = np.full((2,3),'#')
In [30]: a
Out[30]: 
array([['#', '#', '#'],
       ['#', '#', '#']], dtype='<U1')
In [31]: b = np.full((4,5),'?')
In [32]: b
Out[32]: 
array([['?', '?', '?', '?', '?'],
       ['?', '?', '?', '?', '?'],
       ['?', '?', '?', '?', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')
In [33]: b[1:-1,1:-1] = a
In [34]: b
Out[34]: 
array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')
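
The same idea can be wrapped in a small function that works for any number of dimensions. This is only a sketch: pad_constant is a hypothetical helper name, not a NumPy function, and it assumes the same pad width on every side and a fill value that fits the array's dtype (a longer string would be truncated to fit '<U1').

```python
import numpy as np

def pad_constant(arr, width, fill):
    # Hypothetical helper: allocate the padded array pre-filled with `fill`,
    # then copy `arr` into the interior with a single slice assignment.
    shape = tuple(n + 2 * width for n in arr.shape)
    out = np.full(shape, fill, dtype=arr.dtype)
    interior = tuple(slice(width, width + n) for n in arr.shape)
    out[interior] = arr
    return out

a = np.full((2, 3), '#')
b = pad_constant(a, 1, '?')
print(b)
```

Building explicit slice(width, width + n) objects rather than using 1:-1 keeps the function correct when width is 0.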

Here's the clever pad_with solution, with an added print so we can see how often it is called:

In [36]: def pad_with(vector, pad_width, iaxis, kwargs):
    ...:     print(vector)
    ...:     pad_value = kwargs.get('padder', '?')
    ...:     vector[:pad_width[0]] = pad_value
    ...:     vector[-pad_width[1]:] = pad_value
    ...:     return vector
    ...: 
In [37]: np.pad(a,1,pad_with)
['' '' '' '']
['' '#' '#' '']
['' '#' '#' '']
['' '#' '#' '']
['' '' '' '']
['?' '?' '?' '?' '?']
['' '#' '#' '#' '']
['' '#' '#' '#' '']
['?' '?' '?' '?' '?']
Out[37]: 
array([['?', '?', '?', '?', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '#', '#', '#', '?'],
       ['?', '?', '?', '?', '?']], dtype='<U1')
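
The nine printed vectors match a simple count: for each axis, the function is called once per 1-D slice along that axis, so a padded (4, 5) array needs 5 + 4 = 9 calls. The sketch below checks that arithmetic by recording calls instead of printing; counting_pad and the calls list are illustrative names, and the per-slice behaviour is what the output above shows, not a documented guarantee.

```python
import numpy as np

calls = []  # one entry per invocation, recording the axis being padded

def counting_pad(vector, pad_width, iaxis, kwargs):
    # Same logic as pad_with above, but records each call.
    calls.append(iaxis)
    pad_value = kwargs.get('padder', '?')
    vector[:pad_width[0]] = pad_value
    vector[-pad_width[1]:] = pad_value
    return vector

a = np.full((2, 3), '#')
b = np.pad(a, 1, counting_pad)

# Number of 1-D slices along each axis, summed over all axes.
expected_calls = sum(b.size // n for n in b.shape)
print(len(calls), expected_calls)
```

The count grows with the array's size, while the slice-assignment approach stays at one allocation and one copy.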
