10

I have a 2D numpy char array (from a NetCDF4 file) which actually represents a list of strings. I want to convert it into a list of strings.

I know I can use join() to concatenate the chars into a string, but I can only find a way to do this one string at a time:

import numpy as np

data = np.array([['a','b'],['c','d']])
for row in data[:]:
    print(''.join(row))

But it's very slow. How can I return an array of strings in a single command? Thanks

1
  • 3
    Why are you copying data in your for loop? Commented Jun 11, 2012 at 17:07

3 Answers

14

The list comprehension is the most "pythonic" way.

The most "numpythonic" way would be:

>>> data = np.array([['a','b'],['c','d']])
# a 2D view
>>> data.view('S2')
array([['ab'],
       ['cd']], 
      dtype='|S2')
# or maybe a 1D view ...fastest solution:
>>> data.view('S2').ravel()
array(['ab', 'cd'], 
      dtype='|S2')

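No looping, no list comprehension, not even a copy. The buffer stays unchanged and is only reinterpreted through a different dtype, which is why this should be the fastest of the options here. Note that it relies on each element occupying exactly one byte (dtype 'S1'), as in the session above.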
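
For readers on Python 3, here is a minimal sketch of the same view trick. It assumes the character array has dtype 'S1' (one byte per element); a default Python 3 string array is Unicode ('U1', four bytes per element) and would not regroup correctly under `view('S2')`. The variable names `width`, `joined` and `as_text` are illustrative only.

```python
import numpy as np

# Assumption: single-byte characters, so the dtype is forced to 'S1'.
data = np.array([['a', 'b'], ['c', 'd']], dtype='S1')

# Reinterpret each row of `width` single bytes as one fixed-width byte string.
width = data.shape[1]
joined = data.view('S%d' % width).ravel()

print(joined)                          # [b'ab' b'cd']
print(np.shares_memory(data, joined))  # True: still the same buffer

# Converting to str is optional and does make a copy.
as_text = joined.astype('U')
print(as_text.tolist())                # ['ab', 'cd']
```

The result is an array of byte strings; the final `astype('U')` step is only needed if Python `str` values are wanted, and it gives up the zero-copy property.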

1 Comment

An important caveat is that the array must be contiguous in memory -- otherwise the view fails. You can ensure this by using data = np.ascontiguousarray(data).
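
A small sketch of that caveat, assuming an 'S1' array and using a strided slice (the hypothetical `wide[:, ::2]`) to produce a non-contiguous input:

```python
import numpy as np

wide = np.array([['a', 'x', 'b', 'x'],
                 ['c', 'x', 'd', 'x']], dtype='S1')
strided = wide[:, ::2]   # every other column, so rows are not contiguous

try:
    strided.view('S2')
    view_failed = False
except ValueError:
    # NumPy refuses to regroup bytes that are not adjacent in memory.
    view_failed = True

# ascontiguousarray copies the data into a contiguous buffer first.
fixed = np.ascontiguousarray(strided)
joined = fixed.view('S2').ravel()
print(view_failed, joined)
```

In this case the copy made by `np.ascontiguousarray` is unavoidable, so the approach is no longer copy-free, but it still avoids a Python-level loop.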
5

Try a list comprehension:

>>> s = [''.join(row) for row in data]
>>> s
['ab', 'cd']

which is just your for loop rewritten.
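
If the array holds single bytes rather than Python strings (dtype 'S1', which is an assumption about how the file was read), the same comprehension might look like the sketch below on Python 3. The ASCII encoding is also an assumption.

```python
import numpy as np

# Assumption: one byte per element.
data = np.array([['a', 'b'], ['c', 'd']], dtype='S1')

# Join each row of single bytes, then decode the result to str.
strings = [b''.join(row).decode('ascii') for row in data]
print(strings)   # ['ab', 'cd']
```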

2 Comments

@DavidRobinson Hadn't thought of that - very nice.
@AdrianR- don't forget to accept his answer (by clicking on the green checkmark) if it answered your question.
2
# tobytes() is the current name for the deprecated tostring()
[row.tobytes() for row in data]
