I'm comparing my algorithms, `slice_prefix` and `bit_prefix`, with existing ones to find the common prefix length of two strings as fast as possible. Another program I'm working with spends most of its time on one line, `len(os.path.commonprefix((item1, item2)))`. I've searched for other algorithms in Python and written my own to reduce this bottleneck.

Here's what I've found:
```
>>> import timeit
>>> times = [
...     ['os.path.commonprefix', ospathcommonprefix],
...     ['bit_prefix', bit_prefix],
...     ['slice_prefix', slice_prefix],
...     ['index_prefix', index_prefix],
...     ['zip_prefix', zip_prefix],
...     ['izip_prefix', izip_prefix]]
>>> a = 'a' * 1000000
>>> b = 'a' * 900000
>>> for i, algorithm in enumerate(times):
...     times[i][1] = timeit.timeit(lambda: algorithm[1](a, b), number=100)
...
>>> for result in sorted(times, key=lambda x: x[-1]): print result
...
['slice_prefix', 0.0408039022572666]
['bit_prefix', 6.330115954430312]
['izip_prefix', 6.653096312379603]
['os.path.commonprefix', 7.151773449750181]
['index_prefix', 8.905739173445]
['zip_prefix', 12.252280194828245]
```
```
>>> times = [
...     ['os.path.commonprefix', ospathcommonprefix],
...     ['bit_prefix', bit_prefix],
...     ['slice_prefix', slice_prefix],
...     ['index_prefix', index_prefix],
...     ['zip_prefix', zip_prefix],
...     ['izip_prefix', izip_prefix]]
>>> a = 'a' * 2
>>> b = 'a' * 1
>>> for i, algorithm in enumerate(times):
...     times[i][1] = timeit.timeit(lambda: algorithm[1](a, b), number=1000000)
...
>>> for result in sorted(times, key=lambda x: x[-1]): print result
...
['slice_prefix', 0.7089188635036408]
['index_prefix', 0.8099312869949244]
['izip_prefix', 0.9187295778019688]
['zip_prefix', 1.115640751833098]
['os.path.commonprefix', 1.3019800865420166]
['bit_prefix', 3.7144623631063496]
```
```
>>> a = 'a' * 1000000
>>> b = 'a' * 900000
>>> times = [
...     ['os.path.commonprefix', ospathcommonprefix],
...     ['bit_prefix', bit_prefix],
...     ['slice_prefix', slice_prefix],
...     ['index_prefix', index_prefix],
...     ['zip_prefix', zip_prefix],
...     ['izip_prefix', izip_prefix]]
>>> for i, algorithm in enumerate(times):
...     times[i][1] = algorithm[1](a, b)
...
>>> print 'Sanity check. All results should be 900000'
Sanity check. All results should be 900000
>>> for result in sorted(times, key=lambda x: x[-1]): print result
...
['os.path.commonprefix', 900000]
['bit_prefix', 900000]
['slice_prefix', 900000]
['index_prefix', 900000]
['zip_prefix', 900000]
['izip_prefix', 900000]
```
The algorithms
```python
from itertools import izip, count
import os


def bit_prefix(a, b):
    # Reverse each string so its first character becomes the least
    # significant byte of a big integer, XOR the integers, and count the
    # trailing zero bits: each matching leading byte contributes eight.
    min_len = min(len(a), len(b))
    if min_len > 0:
        x = bin(int(a[::-1].encode('hex'), 16) ^
                int(b[::-1].encode('hex'), 16))
        y = x.strip('0')  # the 'b' of the '0b' prefix stops the left strip
        if len(y) == 1:   # XOR is zero: equal up to the shorter length
            return min_len
        else:
            return (len(x) - len(y)) / 8
    else:
        return 0


def ospathcommonprefix(a, b):
    return len(os.path.commonprefix((a, b)))


def slice_prefix(a, b, start=0, length=1):
    # Gallop: advance while ever-growing slices match, then fall back to
    # single characters. The start < end guard keeps identical strings
    # from looping forever on '' == '' past both ends.
    end = min(len(a), len(b))
    while True:
        while (start < end and
               a[start:start + length] == b[start:start + length]):
            start += length
            length += length
        if length > 1:
            length = 1
        else:
            return min(start, end)


def index_prefix(a, b):
    # Scan by index until the first mismatch.
    i = -1
    min_len = min(len(a), len(b))
    for i in xrange(0, min_len):
        if a[i] != b[i]:
            i -= 1
            break
    return i + 1


def zip_prefix(a, b):
    # zip truncates to the shorter string; enumerate tracks the index.
    i = -1
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            i -= 1
            break
    return i + 1


def izip_prefix(a, b):
    # Like zip_prefix, but lazy: izip avoids building the full pair list.
    i = -1
    for i, x, y in izip(count(), a, b):
        if x != y:
            i -= 1
            break
    return i + 1
```
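For concreteness, a quick check of all six on a pair of short hypothetical strings (the common prefix of `'interstellar'` and `'interstate'` is `'interst'`, length 7):

```python
for f in (ospathcommonprefix, bit_prefix, slice_prefix,
          index_prefix, zip_prefix, izip_prefix):
    assert f('interstellar', 'interstate') == 7
```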
While `slice_prefix` is faster than the other solutions, I still need to improve by about another 50% to remove my bottleneck. Maybe learning to write a Python C extension would work?

Do you think a C extension would give me the 50%+ performance boost I'm looking for? Any other algorithm suggestions? Any optimizations for what I've got here?
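For reference, here's a rough sketch of the kind of thing I have in mind, pushing the byte comparisons into C through `ctypes` and libc's `memcmp` instead of a full extension module. The name `memcmp_prefix` is mine, `libc.so.6` assumes Linux, and I haven't benchmarked it against `slice_prefix`:

```python
import ctypes

# memcmp(p, q, n) returns 0 iff the first n bytes agree, so the prefix
# length can be found by binary search over n.
_libc = ctypes.CDLL('libc.so.6')
_libc.memcmp.argtypes = (ctypes.c_char_p, ctypes.c_char_p, ctypes.c_size_t)


def memcmp_prefix(a, b):
    lo, hi = 0, min(len(a), len(b))
    # Invariant: the first lo bytes match; the prefix is at most hi bytes.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if _libc.memcmp(a, b, mid) == 0:
            lo = mid       # first mid bytes agree
        else:
            hi = mid - 1   # a mismatch lies within the first mid bytes
    return lo
```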
A directional variant of `slice_prefix`:
```python
def directional_prefix(a, b, start=0, length=1):
    end = min(len(a), len(b))
    while True:
        # Gallop forward, doubling the step while the slices match.
        while (start < end and
               a[start:start + length] == b[start:start + length]):
            start += length
            length += length
        if start >= end:
            return end  # matched all the way to the shorter string's end
        # Back off: halve the step until some slice matches again,
        # instead of dropping straight back to single characters.
        length = length / 2
        while length > 0 and a[start:start + length] != b[start:start + length]:
            length = length / 2
        if length > 0:
            start += length
            length = 1
        else:
            return start  # even one character fails to match here
```
```
>>> a = 'a' * 1000000
>>> b = 'a' * 900000
>>> i = 0
>>> while 2 ** i < len(b):
...     length = 2 ** i
...     print length, timeit.timeit(lambda: directional_prefix(a, b, length=length), number=100)
...     i += 1
...
1 0.0380843005421
2 0.0303687541309
4 0.0326917143407
8 0.0325356651429
16 0.0319154189644
32 0.0329461337922
64 0.0315618391367
128 0.0301917666861
256 0.0344714653179
512 0.0325779366976
1024 0.0378156588852
2048 0.0330583311902
4096 0.0388345218751
8192 0.0310024323921
16384 0.0407225196375
32768 0.0332771951284
65536 0.0506670016787
131072 0.0531404802284
262144 0.0809286942078
524288 0.0748048496141
```
```
>>> i = 0
>>> while 2 ** i < len(b):
...     length = 2 ** i
...     print length, timeit.timeit(lambda: slice_prefix(a, b, length=length), number=100)
...     i += 1
...
1 0.041934962645
2 0.0349668721381
4 0.0359395129606
8 0.0346022305948
16 0.0389324970677
32 0.0366881540488
64 0.0372364990776
128 0.0352363039174
256 0.0375355604517
512 0.0350917114961
1024 0.0369461290516
2048 0.034510971444
4096 0.035836797033
8192 0.0356835132641
16384 0.0378843995443
32768 0.0352453903265
65536 0.054919046022
131072 0.0592635347
262144 0.0652649103033
524288 0.0740941344068
```
A commenter suggested the following tweak:

> In `slice_prefix`, I'd try replacing `if length > 1: length = 1 ...` with `if length > 1: length = length / 2 ...`. When running ahead you double the step length each time, so why not shorten it by half when slowing down? Reducing from hundreds or thousands right down to 1 seems too strong a reduction.
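For reference, a minimal sketch of that suggestion applied to `slice_prefix` (the name `halving_prefix` is mine, and I haven't benchmarked it against the numbers above):

```python
def halving_prefix(a, b, start=0, length=1):
    end = min(len(a), len(b))
    while True:
        # Gallop: double the step while the slices keep matching.
        while (start < end and
               a[start:start + length] == b[start:start + length]):
            start += length
            length += length
        if length > 1:
            length = length / 2  # back off gradually instead of resetting to 1
        else:
            return min(start, end)
```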