Re: Re: Fwd: little request :)

From: Date: Mon, 10 Feb 2014 23:55:41 +0000
Subject: Re: Re: Fwd: little request :)
References: 1 2 3 4 5 6  Groups: php.internals 
Request: Send a blank email to internals+get-72450@lists.php.net to get a copy of this message
Hi,


On Tue, Feb 11, 2014 at 3:50 AM, Yasuo Ohgaki <yohgaki@ohgaki.net> wrote:

> Hi Tjerk,
>
> On Mon, Feb 10, 2014 at 7:36 PM, Tjerk Meesters <tjerk.meesters@gmail.com>wrote:
>
>> I've given this problem some more thought and the following looked
>> promising; it would be great to get your feedback on it.
>>
>> Basically, to remove the modulo and MAX(), I've used a combined buffer
>> that's used for the comparison, in two ways:
>>
>> 1. PAD := known (padded with \0 based on given_len)
>> 2. PAD := known CONCAT given
>>
>> The comparison is straightforward because we already know the buffer is at
>> least given_len.
>>
>> Implementations of above can be found here:
>> 1. https://github.com/datibbaw/php-src/compare/master...const-cmp
>> 2. https://github.com/datibbaw/php-src/compare/master...const-cmp-2
>>
>> Bottom line is that this attempts to compromise memory against time
>> stability.
>>
>
> I think length leak does not matter much.
> Therefore, I may ignore length leak as it could know it from other codes
> using strlen(), etc.
>
> Our main concern would be performance.
>

This problem isn't about performance, not necessarily anyway; the primary
problem that this rfc is attempting to solve is a string comparison
function with the tightest possible Theta(given-length) runtime.

If this can be accomplished with reading double words first, followed by a
byte wise suffix comparison, then great. If not, the first requirement
should be honoured.


>
> https://github.com/yohgaki/php-src/compare/PHP-5.6-rfc-hash-compare
>
> and the bench mark result of
>
> https://gist.github.com/yohgaki/ede544f290c6cf9fa90d
>
> is
>
> [yohgaki@dev github-php-src]$ ./php-bin t.php
> str_siphash_compare  Elapsed: 9.450211   Iterations: 1000000 DataSize: 1024
> str_xxhash32_compare Elapsed: 3.849933   Iterations: 1000000 DataSize: 1024
> str_md5_compare      Elapsed: 27.174614   Iterations: 1000000 DataSize:
> 1024
> str_byte_compare     Elapsed: 5.874092   Iterations: 1000000 DataSize: 1024
> str_byte_compare2    Elapsed: 8.761152   Iterations: 1000000 DataSize: 1024
> str_word_compare     Elapsed: 1.867914   Iterations: 1000000 DataSize: 1024
> str_compare          Elapsed: 1.178738   Iterations: 1000000 DataSize: 1024
>
> I had the same idea, but I didn't try as it requires additional memory
> allocation.
>
> str_byte_compare     Elapsed: 5.874092   Iterations: 1000000 DataSize: 1024
> str_byte_compare2    Elapsed: 8.761152   Iterations: 1000000 DataSize: 1024
>
> 1st one is simple byte by byte XOR, 2nd one is using modulo as original
> patch
> proposed. There is obvious difference even with additional modulo.
>
> str_compare          Elapsed: 1.178738   Iterations: 1000000 DataSize: 1024
>
> This is the result of simple strncmp() as a reference.
> How well your version perform compare to simple strncmp?
>
> Regards,
>
> --
> Yasuo Ohgaki
> yohgaki@ohgaki.net
>



-- 
--
Tjerk


Thread (42 messages)

« previous php.internals (#72450) next »