106

How would you convert a string to ASCII values?

For example, "hi" would return [104 105].

I can individually do ord('h') and ord('i'), but it's going to be troublesome when there are a lot of letters.

10 Answers

159

You can use a list comprehension:

>>> s = 'hi'
>>> [ord(c) for c in s]
[104, 105]
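
If you need this in more than one place, the comprehension can be wrapped in a small helper and reversed with chr(). This is only a minimal sketch; the names to_codes and from_codes are made up for illustration and are not from any library:

```python
def to_codes(s):
    """Return the code point of every character in s."""
    return [ord(c) for c in s]


def from_codes(codes):
    """Rebuild the string from a list of code points."""
    return ''.join(chr(n) for n in codes)


codes = to_codes('hi')
print(codes)              # [104, 105]
print(from_codes(codes))  # hi
```

For pure ASCII input the code points returned by ord() are the ASCII values, so the round trip is lossless.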

34

Here is a pretty concise way to perform the concatenation:

>>> s = "hello world"
>>> ''.join(str(ord(c)) for c in s)
'10410110810811132119111114108100'

And a sort of fun alternative:

>>> '%d'*len(s) % tuple(map(ord, s))
'10410110810811132119111114108100'

1 Comment

What was I thinking? This is much more pythonic than mine. That's what I get for trying to answer a python question right after reading a bunch of Haskell questions... +1
18

In 2021 we can assume only Python 3 is relevant, so...

If your input is bytes:

>>> list(b"Hello")
[72, 101, 108, 108, 111]

If your input is str:

>>> list("Hello".encode('ascii'))
[72, 101, 108, 108, 111]

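If you want a single solution that works with both, encode only when the input is a str (bytes(text, 'ascii') itself raises TypeError when text is already bytes):

list(text.encode('ascii') if isinstance(text, str) else text)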

(all the above will intentionally raise UnicodeEncodeError if str contains non-ASCII chars. A fair assumption as it makes no sense to ask for the "ASCII value" of non-ASCII chars.)
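
To make the failure mode for non-ASCII text concrete, here is a small sketch. The helper name ascii_values is hypothetical, and returning None for bad input is just one possible choice; re-raising or substituting a placeholder would work equally well:

```python
def ascii_values(text):
    """Return the ASCII codes of a str, or None if it is not pure ASCII."""
    try:
        return list(text.encode('ascii'))
    except UnicodeEncodeError:
        # Raised when text contains a character outside the ASCII range.
        return None


print(ascii_values("Hello"))  # [72, 101, 108, 108, 111]
print(ascii_values("héllo"))  # None
```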


9

If you are using python 3 or above,

>>> list(bytes(b'test'))
[116, 101, 115, 116]

1 Comment

A great approach, but bytes() is redundant for a bytes input, and for a string input you need to specify an encoding.
7

If you want your result concatenated, as you show in your question, you could try something like:

>>> from functools import reduce  # reduce is no longer a builtin in Python 3
>>> reduce(lambda x, y: str(x)+str(y), map(ord,"hello world"))
'10410110810811132119111114108100'


3

It is not at all obvious why one would want to concatenate the (decimal) "ascii values". What is certain is that concatenating them without leading zeroes (or some other padding or a delimiter) is useless -- nothing can be reliably recovered from such an output.

>>> tests = ["hi", "Hi", "HI", '\x0A\x29\x00\x05']
>>> ["".join("%d" % ord(c) for c in s) for s in tests]
['104105', '72105', '7273', '104105']

Note that the first 3 outputs are of different length. Note that the fourth result is the same as the first.

>>> ["".join("%03d" % ord(c) for c in s) for s in tests]
['104105', '072105', '072073', '010041000005']
>>> [" ".join("%d" % ord(c) for c in s) for s in tests]
['104 105', '72 105', '72 73', '10 41 0 5']
>>> ["".join("%02x" % ord(c) for c in s) for s in tests]
['6869', '4869', '4849', '0a290005']
>>>

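None of these three formats has those problems: every input produces a distinct output that can be split back into the original values.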
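
As a quick check that a fixed-width format really is recoverable, here is a sketch of a decoder for the zero-padded 3-digit form. The function names are made up for illustration, and it assumes every code point is below 1000, which always holds for ASCII:

```python
def encode_fixed(s):
    """Concatenate each character's code as a zero-padded 3-digit decimal."""
    return "".join("%03d" % ord(c) for c in s)


def decode_fixed(digits):
    """Split the digit string into 3-character chunks and map each back to a character."""
    return "".join(chr(int(digits[i:i + 3])) for i in range(0, len(digits), 3))


# Every test string survives the round trip, including the one
# that collided with "hi" in the unpadded format.
for s in ["hi", "Hi", "HI", "\x0A\x29\x00\x05"]:
    assert decode_fixed(encode_fixed(s)) == s
```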


3

Your description is rather confusing; directly concatenating the decimal values doesn't seem useful in most contexts. The following code treats each letter as an 8-bit value and THEN concatenates those bytes into a single integer, which matches how ASCII text is laid out as bytes:

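def ASCII(s):
    # Pack each character's 8-bit code into one integer, first character most significant.
    x = 0
    for i in range(len(s)):  # range, since xrange does not exist in Python 3
        x += ord(s[i])*2**(8 * (len(s) - i - 1))
    return x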
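
In Python 3 the same packing can be sketched with the built-in int.from_bytes, and reversed with int.to_bytes. The function names below are hypothetical, and the sketch assumes the input is pure ASCII:

```python
def ascii_to_int(s):
    """Interpret the ASCII bytes of s as one big-endian unsigned integer."""
    return int.from_bytes(s.encode('ascii'), 'big')


def int_to_ascii(n, length):
    """Inverse: unpack n into `length` bytes and decode them as ASCII."""
    return n.to_bytes(length, 'big').decode('ascii')


print(ascii_to_int("hi"))      # 26729, i.e. 104 * 256 + 105
print(int_to_ascii(26729, 2))  # hi
```

The length has to be passed to the inverse because leading zero bytes are not recoverable from the integer alone.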


2
def stringToNumbers(message):
    # Collect the code point of each character in the message.
    numbers = []
    for ch in message:
        numbers.append(ord(ch))
    return numbers

stringToNumbers("morocco")


1

You can also do it with NumPy:

import numpy as np
a = np.fromstring('hi', dtype=np.uint8)
print(a)

1 Comment

Note fromstring is now deprecated, so something like np.frombuffer(b'hi', dtype=np.uint8) would be preferred.
0

If you don't mind the numpy dependency, you can also do it by simply casting the string as a 1D numpy ndarray and view it as int32 dtype.

import numpy as np

text = "hi"
np.array([text]).view('int32').tolist()   # [104, 105]

Note that, like the built-in ord() function, the above operation returns the Unicode code points of the characters (only much faster if the string is very long). By contrast, .encode() converts the string into bytes (UTF-8 by default). For pure ASCII text, which is the scope of this question, the results are identical, but for non-ASCII text such as Japanese or Russian each character becomes several byte values, which may not be what you expected.

For example:

s = "Меси"
list(map(ord, s))                     # [1052, 1077, 1089, 1080]
np.array([s]).view('int32').tolist()  # [1052, 1077, 1089, 1080]
list(s.encode())                      # [208, 156, 208, 181, 209, 129, 208, 184]

2 Comments

For that to work on my system, it needs to be np.array([text]).view(dtype=np.int32).tolist() and np.array([s]).view(dtype=np.int32).tolist() respectively.
@Andj thanks for pointing me to that issue. I edited the post accordingly. Thanks
