Revisions to Find the first repeating sequence in digits of pi

deleted 19 characters in body

Source Link

edited May 18, 2021 at 9:00

Marc

5.7k
2
15
35

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return list(map(int, seq)), lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in ~50 seconds. The slowest part now seems to be str(pi), which generates the digits. Once the digits are generated, searching for the sequence with this approach takes less than a second.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return list(map(int, seq)), lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in ~50 seconds. The slowest part now seems to be str(pi), which generates the digits. Once the digits are generated, searching for the sequence with this approach takes less than a second.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return list(map(int, seq)), lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in ~50 seconds. The slowest part now seems to be str(pi), which generates the digits. Once the digits are generated, searching for the sequence takes less than a second.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

added 168 characters in body

Source Link

edited May 18, 2021 at 8:32

Marc

5.7k
2
15
35

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return [intlist(s) for s inmap(int, seq]seq)), lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in about ~40~50 seconds. The slowest part now seems to be str(pi), which generates the digits. Once the digits are generated, searching for the sequence with this approach takes less than a second.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return [int(s) for s in seq], lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in about ~40 seconds.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return list(map(int, seq)), lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in ~50 seconds. The slowest part now seems to be str(pi), which generates the digits. Once the digits are generated, searching for the sequence with this approach takes less than a second.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

added 130 characters in body

Source Link

edited May 18, 2021 at 3:17

Marc

5.7k
2
15
35

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return [int(s) for s in seq], lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in about ~40 seconds.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return [int(s) for s in seq], lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

How do I make it simpler and faster?

A more efficient approach:

Iterate on pi with a sliding window of size n.
Save each sequence (and the starting position) in a lookup map
Return as soon as you find a match in the lookup map

Something like this:

def repeats(n):
    lookup = {}
    pi_s = str(pi)[2:]
    for i in range(n, len(pi_s)):
        start = i - n
        seq = pi_s[start:i]
        if seq in lookup:
            return [int(s) for s in seq], lookup[seq], start
        lookup[seq] = start

For n=5 this runs in 0.001 seconds, while your approach takes ~40 seconds.

For n=10 and 1 million digits, this approach finds ([4, 3, 9, 2, 3, 6, 6, 4, 8, 4], 182104, 240478) in about ~40 seconds.

Review

Naming: the function name repeats is a bit ambiguous, it sounds like how many times n appears in the sequence. A better name could be first_repeating_sequence_of.
Edge case: for bigger n, more digits of pi are required, otherwise no matches are found. Consider passing the max number of digits of pi as an argument so the caller is aware of this limit.

deleted 12 characters in body

Source Link

edited May 18, 2021 at 2:32

Marc

5.7k
2
15
35

Loading

Source Link

answered May 18, 2021 at 2:23

Marc

5.7k
2
15
35

Loading

Stack Exchange Network

Return to Answer

Review

Review

Review

Review

Review

Review

Review

Review

Review