How can I check if a string contains ANY letters from the alphabet?

Question

What is the best pure Python implementation to check if a string contains ANY letters from the alphabet?

string_1 = "(555).555-5555"
string_2 = "(555) 555 - 5555 ext. 5555"

Where string_1 would return False for having no letters of the alphabet in it and string_2 would return True for having letters.

Should this be limited to the English a-z alphabet only? Should 'special' characters from other alphabets, like German, be taken into account?
– Kotch, Commented Jan 31, 2012 at 0:35
Is there any chance that you will receive Unicode? Or just plain ASCII roman letters?
– KobeJohn, Commented Jan 31, 2012 at 0:39
Nice timing there :) Anyway, check this similar question out if you need help testing strings with Unicode characters.
– KobeJohn, Commented Jan 31, 2012 at 0:44
Limited to the English a-z alphabet only and only plain ASCII roman letters :)
– papezjustin, Commented Jan 31, 2012 at 17:15

JBernardo · Accepted Answer · 2019-07-24 18:37:57Z

169

Regex should be a fast approach:

import re

re.search('[a-zA-Z]', the_string)
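Because re.search returns a match object or None, it can be wrapped in a helper that yields a plain boolean. The following is a minimal sketch, assuming only ASCII letters should count; the names ASCII_LETTER and has_ascii_letter are illustrative and not part of the original answer.

```python
import re

# Compile once so repeated calls reuse the same pattern object.
ASCII_LETTER = re.compile(r"[a-zA-Z]")


def has_ascii_letter(text):
    """Return True if text contains at least one letter in a-z or A-Z."""
    # search() scans the whole string and stops at the first match.
    return ASCII_LETTER.search(text) is not None


print(has_ascii_letter("(555).555-5555"))              # False
print(has_ascii_letter("(555) 555 - 5555 ext. 5555"))  # True
```

Comparing against None makes the result an actual bool, which is convenient when the value is stored or returned rather than used directly in an if statement.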

edited Jul 24, 2019 at 18:37

answered Jan 31, 2012 at 0:42

JBernardo


10 Comments

Jollywatt Over a year ago

Regex certainly seems a bit overkill. any(c.isalpha() for c in string_1) is deliciously Pythonic.

JBernardo Over a year ago

@Joseph No, it is not. This regex is far more readable than your expression. Also, what does isalpha even mean? It behaves quite differently in Python 2 and Python 3. Is Chinese part of the alphabet? If not, you are blindly matching it with your generator on Python 3 (or on Python 2 for unicode strings!). If you want Pythonic, here it is: Simple is better than complex. And check the OP's comment above: he wants only the roman alphabet to be matched.

Hinton Over a year ago

I think Joseph's answer is perfectly readable and it's certainly faster than an additional import; plus you don't have to remember the order of arguments in re.search

Srini Over a year ago

In case anyone else is wondering what the return value is, you get a Match object if there is a match, or None if there isn't. So this is compatible with an if re.search(...) pattern.

carloswm85 Over a year ago

@JBernardo Knowing which module to import from is not a triviality. It should at least be mentioned: search comes from the re (regular expression operations) module, so import re is needed (Python 2.7 to 3.9.5).


DSM · Accepted Answer · 2012-01-31 00:31:50Z

116

How about:

>>> string_1 = "(555).555-5555"
>>> string_2 = "(555) 555 - 5555 ext. 5555"
>>> any(c.isalpha() for c in string_1)
False
>>> any(c.isalpha() for c in string_2)
True

answered Jan 31, 2012 at 0:31

DSM


8 Comments

Rik Poggi Over a year ago

Would set(string_1) be more efficient?

KobeJohn Over a year ago

@Rik. You mean converting string_1 to a set before testing it? No, it won't be more efficient. That is guaranteed to process every character at least once, while I believe the any function will short-circuit (stop) as soon as it encounters the first true value.

JBernardo Over a year ago

This code will be somewhat slow because it requires a function call per char. Converting to set may or may not reduce function calls, but adds some overhead.

DSM Over a year ago

@JBernardo: timeit suggests it's about an order of magnitude slower than a compiled regex and takes only about 66% more time than a non-compiled one. That's well within my "I hate regular expressions" limits.

DSM Over a year ago

Sure: and if you use "(555).555-5555 ext. 5555"*1000 you're back to comparable speeds because of the short-circuiting. I much prefer writing in Python to writing regular expressions, which I find hard to debug unless they're trivial, and I'm not going to give up on writing clear Python unless performance requirements demand it.


John Strood · Accepted Answer · 2018-09-06 20:47:15Z

29

You can call islower() on your string: it returns True when the string contains at least one cased character and all of its cased characters are lowercase (other characters are ignored). Combine it with isupper() using or to also cover strings whose letters are all uppercase.

Below, the string contains letters, so the test yields True:

>>> z = "(555) 555 - 5555 ext. 5555"
>>> z.isupper() or z.islower()
True

Below, the string contains no letters, so the test yields False:

>>> z = "(555).555-5555"
>>> z.isupper() or z.islower()
False

Not to be mixed up with isalpha() which returns True only if all characters are letters, which isn't what you want.

Note that Barm's answer completes mine nicely, since mine doesn't handle the mixed case well.

edited Sep 6, 2018 at 20:47

John Strood


answered Mar 24, 2017 at 15:08

Jean-François Fabre♦


6 Comments

Cornbeetle Over a year ago

I like that this will test if it CONTAINS letters, not just test if input is ALL letters.

Jean-François Fabre Over a year ago

@Cornbeetle yes, that kind of really answers the question after all those years, thanks

pnv Over a year ago

Very nice way to put this. How is it in terms of efficiency ? better than regex?

Jean-François Fabre Over a year ago

There are no Python loops involved, so the efficiency is good. I didn't compare with regex, but I suppose it's slightly faster, especially for the initialization phase, because there's no regex to compile.

Jean-François Fabre Over a year ago

doesn't handle mixed case, that's stated in the answer


John Strood · Accepted Answer · 2018-09-06 22:01:59Z

20

I liked the answer provided by @jean-françois-fabre, but it is incomplete.
His approach will work, but only if the text contains purely lower- or uppercase letters:

>>> text = "(555).555-5555 extA. 5555"
>>> text.islower()
False
>>> text.isupper()
False

The better approach is to first upper- or lowercase your string and then check.

>>> string1 = "(555).555-5555 extA. 5555"
>>> string2 = '555 (234) - 123.32   21'

>>> string1.upper().isupper()
True
>>> string2.upper().isupper()
False

edited Sep 6, 2018 at 22:01

John Strood


answered Nov 23, 2017 at 10:40

Barm


Mihir Verma · Accepted Answer · 2019-12-12 09:06:43Z

I tested each of the above methods for checking whether a string contains any letters and measured the average processing time per string on a standard computer.

~250 ns for

import re

~3 µs for

re.search('[a-zA-Z]', string)

~6 µs for

any(c.isalpha() for c in string)

~850 ns for

string.upper().isupper()

Contrary to what was claimed, importing re takes negligible time, and searching with re takes only about half as long as iterating with isalpha(), even for a relatively small string.
Hence for larger strings and greater counts, re would be significantly more efficient.

But converting the string to one case and then checking the case (i.e. either upper().isupper() or lower().islower()) wins here. In every loop it is significantly faster than re.search(), and it doesn't require any additional imports.

You can also compile the regex for further optimization: alpha_regex = re.compile('[a-zA-Z]'), then later alpha_regex.search(string).
Not to mention that isalpha() doesn't work well for multiple languages. I was looking for this because I wanted to check whether a string that is expected to be Korean contains any English letters, and the isalpha() method returns True for every Korean string.
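Timings like these depend heavily on the machine, the Python version, and the input, so it is worth measuring locally. The following is a rough sketch of how such a comparison could be run with timeit; the sample string, the run count, and the variable names are assumptions chosen for illustration, and each figure includes the small overhead of calling a lambda.

```python
import re
import timeit

sample = "(555) 555 - 5555 ext. 5555"  # assumed test input
alpha_regex = re.compile("[a-zA-Z]")

# Each candidate is a zero-argument callable that timeit can invoke.
candidates = {
    "re.search": lambda: re.search("[a-zA-Z]", sample),
    "compiled regex": lambda: alpha_regex.search(sample),
    "any + isalpha": lambda: any(c.isalpha() for c in sample),
    "upper().isupper()": lambda: sample.upper().isupper(),
}

runs = 20_000
timings = {}
for name, func in candidates.items():
    total_seconds = timeit.timeit(func, number=runs)
    timings[name] = total_seconds / runs * 1e9  # nanoseconds per call
    print(f"{name:>18}: {timings[name]:.0f} ns per call")
```

Trying a string whose first letter appears very early or very late shows how much the short-circuiting approaches depend on the input.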

cola · Accepted Answer · 2012-01-31 00:50:13Z

11

You can use a regular expression like this:

import re

print(re.search('[a-zA-Z]+', string))  # string is the text to test

answered Jan 31, 2012 at 0:50

cola


Ronald Saunfe · Accepted Answer · 2018-04-12 13:44:36Z

1

You can also do this in addition:

import re
string = '24234ww'
val = re.search('[a-zA-Z]+', string)
if val is not None:  # re.search returns None when nothing matches
    print(val[0].isalpha())  # True, since the match consists only of letters
    print(val[0])  # prints the first run of matching letters, here 'ww'

Also note that if val is None, the search did not find a match.

edited Apr 12, 2018 at 13:44

answered Apr 12, 2018 at 13:18

Ronald Saunfe

