Checking for many strings in a single string with Python

Question

Is there a more efficient way to do this? I am just creating a block list for a bunch of URLs.

url = 'http://www.google.com'
blocks = ['youtube.com', 'google.com', 'bing.com']

def is_allowed(url):  # wrapper added so the return statements are valid
    for block in blocks:
        if block in url:
            return 0  # blocked
    return 1  # allowed

Algorithmically, you could store your blocks as a radix tree. Practically, in Python, you may not get much better performance than speeding up the iteration via comprehensions and other optimization tricks.
– Joel Cornett, Commented Oct 7, 2015 at 23:08

jaime · Accepted Answer · 2015-10-07 20:54:51Z

2

url = 'http://www.google.com'
blocks = ['youtube.com','google.com','bing.com']
# Blocks found in url; list() keeps the result a list on Python 3 as well
matches = list(filter(lambda b: b in url, blocks))

xiº · Accepted Answer · 2015-10-07 20:59:17Z

0

A list comprehension is preferable, according to PEP.

>>> print([block in url for block in blocks])
[False, True, False]

or if you prefer int:

>>> print([int(block in url) for block in blocks])
[0, 1, 0]

Joel Cornett · Accepted Answer · 2015-10-07 23:38:17Z

If you are going to be doing this operation over and over again, it may give you a modest performance gain to precompile a regex containing all of the blocks. For example:

import re
blocks = ["youtube.com", "google.com", "bing.com"]
precomp_regex = re.compile("|".join(map(re.escape, blocks)))

def string_contains_block(string, regex=precomp_regex):
    return regex.search(string)
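
As a usage sketch for the precompiled regex: `regex.search` returns a match object or `None`, so the result can be mapped back to the 0/1 convention from the question. The helper name `is_allowed` and the example URLs are assumptions for illustration, not part of the original answer.

```python
import re

blocks = ["youtube.com", "google.com", "bing.com"]
# re.escape keeps the dots literal; "|" ORs the escaped blocks together.
precomp_regex = re.compile("|".join(map(re.escape, blocks)))

def is_allowed(url, regex=precomp_regex):
    # Hypothetical helper: 0 if any block occurs anywhere in url, else 1.
    return 0 if regex.search(url) else 1

print(is_allowed("http://www.google.com"))   # 0
print(is_allowed("http://www.example.com"))  # 1
```

Because the blocks are escaped, a string such as "googleXcom" is not treated as a match for "google.com".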

If your set of blocks is very large, or subject to change often, it may be worth storing it as a radix trie. (Think of the OR'd regex as a very naive implementation of a radix trie.)
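
To make the trie idea concrete, here is a minimal sketch using a plain (uncompressed) character trie built from nested dicts, not a true radix trie, which would also merge single-child chains. The names `build_trie`, `contains_any`, and `END` are hypothetical. In pure Python this is unlikely to beat `in` or the compiled regex for a handful of blocks; it only illustrates the data structure.

```python
END = ""  # terminator key; never collides with a single-character key

def build_trie(words):
    # Each node is a dict mapping a character to its child node.
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True  # marks the end of a complete block
    return root

def contains_any(text, trie):
    # Try every start position and walk the trie as far as the text allows.
    for start in range(len(text)):
        node = trie
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break  # no block continues with this character
            if END in node:
                return True  # a complete block ends here
    return False

blocks = ["youtube.com", "google.com", "bing.com"]
trie = build_trie(blocks)
print(contains_any("http://www.google.com", trie))   # True
print(contains_any("http://www.example.com", trie))  # False
```

Adding a block later only requires inserting one more word into the dict structure, whereas the OR'd regex has to be recompiled.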

R Nar · Accepted Answer · 2015-10-07 22:00:35Z

-1

url = 'http://www.google.com'
blocks = ['youtube.com','google.com','bing.com']
# True when no block occurs in url; the generator lets any() stop at the first match
allowed = not any(block in url for block in blocks)

edited Oct 7, 2015 at 22:00

answered Oct 7, 2015 at 21:00

xiº Over a year ago

The result of this is just False.
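
For comparison with the per-block lists shown earlier, the `any()` approach can be wrapped in a function that returns the 0/1 value the question asks for. This is a sketch; the function name `is_allowed` and the second example URL are assumptions.

```python
blocks = ['youtube.com', 'google.com', 'bing.com']

def is_allowed(url, blocks=blocks):
    # A generator expression lets any() stop at the first matching block
    # instead of building the full list of booleans first.
    return int(not any(block in url for block in blocks))

print(is_allowed('http://www.google.com'))   # 0
print(is_allowed('http://www.example.com'))  # 1
```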
