
Are there any known subquadratic algorithms for calculating the floor of the square root of an $n$-bit integer?

The naive algorithm would be something like

def sqrt(x):
    """Digit-by-digit (binary) square root: returns floor(sqrt(x))."""
    r = 0
    i = x.bit_length() // 2
    while i >= 0:
        # Cost of setting bit i of r: (r + 2^i)^2 - r^2 = 2^(i+1) r + 2^(2i)
        inc = (r << (i + 1)) + (1 << (i * 2))
        if inc <= x:
            x -= inc
            r += 1 << i
        i -= 1
    return r

This takes $O(n)$ iterations, each one involving additions that take $O(n)$ time, so it's $O(n^2)$ time overall. Is there anything faster? I know that there are special multiplication algorithms that do better than quadratic time, but I can't find anything for square roots.

  • My answer to something related might help: cs.stackexchange.com/a/37338/12052. The only problem is that part of the necessary equation would need to be found empirically, to tweak its accuracy. Commented Jan 26, 2015 at 18:55
  • @FrancescoGramano: Sorry, I don't think that helps. Commented Jan 26, 2015 at 21:55
  • By the way, is this sub-quadratic requirement part of a bigger problem? Because the difference between the simple quadratic and the complicated sub-quadratic might not be that big in practice. Or is it just of theoretical interest? Commented Jan 27, 2015 at 16:51
  • @Aryabhata Sorry, I didn't see your comment earlier. No, it's not part of a bigger problem, just curiosity. Commented Jan 28, 2015 at 15:41

2 Answers


You can use Newton's method or any of a number of other methods for finding approximations to roots of the polynomial $p(x) = x^2 - c$.

The rate of convergence for Newton's method will be quadratic, which means that the number of bits that are correct doubles in each iteration. This means $O(\lg n)$ iterations of Newton's method suffice.

Each iteration of Newton's method computes

$$x_{j+1} = x_j - \frac{x_j^2 - c}{2x_j} = \frac{x_j}{2} + \frac{c}{2x_j}.$$

The bit complexity of multiplication is $\tilde{O}(b \lg b)$ to multiply two $b$-bit integers (ignoring $\lg \lg b$ factors). The bit complexity of division (to $b$ bits of precision) is the same. Therefore, each iteration can be computed in $\tilde{O}(n \lg n)$ operations. Multiplying by the $O(\lg n)$ iterations, we find that the overall running time to compute the square root to $n$ bits of precision is $\tilde{O}(n (\lg n)^2)$. This is subquadratic.

I think a more careful analysis shows that this can be improved to $\tilde{O}(n \lg n)$ running time (by taking into account that we only need to know each $x_j$ to within about $2^j$ bits of precision, rather than $n$ bits of precision). However, even the more basic analysis already shows a running time that is clearly subquadratic.
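
For concreteness, here is a minimal all-integer sketch of this approach in Python. The function name, the initial guess, and the stopping rule are my own choices, not part of the analysis above; a production version would also grow the working precision per iteration as just described. Starting from a guess that is at least $\sqrt{c}$, the iterates decrease monotonically to $\lfloor \sqrt{c} \rfloor$, so the first non-decreasing step signals convergence.

def isqrt_newton(c):
    # Floor of the square root of a nonnegative integer c via Newton's method.
    if c < 2:
        return c
    # Initial guess: a power of two that is >= sqrt(c), from the bit length.
    x = 1 << ((c.bit_length() + 1) // 2)
    while True:
        y = (x + c // x) // 2  # Newton step for p(x) = x^2 - c, in integers
        if y >= x:
            return x  # sequence stopped decreasing: x = floor(sqrt(c))
        x = y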

  • In binary, one also has a great initial guess via the identity $x^{1/2} = 2^{\frac{1}{2} \log_2 x}$. Instead of computing the log, one can approximate $\log_2 x$ by the number of digits in $x$. E.g., $\log_2 101011 \approx 6$. Commented Jan 27, 2015 at 0:00
  • @D.W.: But aren't we looking for an integer square root? If you do the Newton's method iteration using only integer arithmetic, then we need some additional justification for the $O(\log n)$ claim, don't we? Otherwise, we are assuming sufficiently large precision already... Sorry if I am missing something obvious. Commented Jan 27, 2015 at 1:20
  • @D.W.: "The rate of convergence for Newton's method" won't be quadratic if $c = 0$, and I don't know what happens for values of $c$ that aren't non-negative reals. Your estimate for the bit complexity of multiplication is tighter than your following remark suggests. Also, we "need to know each $x_j$ to within about" $2^j$ "bits of precision". Commented Jan 27, 2015 at 4:37
  • @Aryabhata: We're not quite "looking for an integer square root"; we're looking for "the floor of the square root". You're right about the integer arithmetic issue, although the same bit complexities hold for floating-point operations. Commented Jan 27, 2015 at 4:41
  • @RickyDemer, yes, $c=0$ is a special case, because then the root of $p(x)$ has multiplicity 2, but when $c>0$ the root has multiplicity 1, so Newton's method does have quadratic convergence. I assume no one would use Newton's method to compute the square root of $c=0$ (the square root of zero is obviously zero). So what are you trying to say? Is your comment a trivial one, addressed by adding "special-case the square root of zero" to my answer, or is there something deeper here that I'm missing? Commented Jan 27, 2015 at 6:22

One problem with Newton's method is that each iteration requires a division, and division is the slowest of the basic integer operations.

Newton's method for the reciprocal square root, however, doesn't. If $x$ is the number for which you want to find $\frac{1}{\sqrt x}$, iterate:

$$r_{i+1} = \frac{1}{2} r_i (3 - x r_i^2)$$

This is often expressed as:

$$w_i = r_i^2$$ $$d_i = 1 - w_i x$$ $$r_{i+1} = r_i + \frac{r_i d_i}{2}$$

That's three multiplication operations. The division by two can be implemented as a shift-right.
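
As a quick sanity check, here is that iteration in plain Python floats. The starting value and the iteration count are my own choices; the demo assumes $1 \le x < 4$ so that the fixed initial guess $r_0 = 1/2$ lies in the iteration's basin of convergence.

def rsqrt_float(x, iterations=6):
    # Newton iteration for 1/sqrt(x): multiplications and a halving only.
    # Assumes 1 <= x < 4, so the fixed guess r = 0.5 converges.
    r = 0.5
    for _ in range(iterations):
        w = r * r
        d = 1.0 - w * x
        r = r + (r * d) / 2  # the "/ 2" becomes a right-shift in integer form
    return r

print(rsqrt_float(2.0))  # ~0.7071067811865476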

Now the problem is that $r$ is not an integer. However, you can represent it as one by implementing floating point manually, with a bunch of shift operations to compensate where appropriate.

First, let's rescale $x$:

$$x' = 2^{-2e} x$$

where we would like $x'$ to be greater than, but close to, $1$. If we run the above algorithm on $x'$ instead of $x$, we find $r = \frac{1}{\sqrt x'}$. Then, $\sqrt{x} = 2^e r x'$.
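
In code, one convenient choice of $e$ (mine, for illustration) comes straight from the bit length:

e = (x.bit_length() - 1) // 2  # largest e with 2^(2e) <= x
# Then x' = x / 2^(2e) satisfies 1 <= x' < 4.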

Now let's split $r$ into a mantissa and exponent:

$$r_i = 2^{-e_i} r'_i$$

where $r'_i$ is an integer. Intuitively, $e_i$ represents the precision of the answer.

We know that Newton's method roughly doubles the number of accurate significant digits. So we can choose:

$$e_{i+1} = 2e_i$$

With a little manipulation, we find:

$$e_{i+1} = 2e_i$$ $$w_i = {r'_i}^2$$ $$x'_i = \frac{x}{2^{2e - e_{i+1}}}$$ $$d_i = 2^{e_{i+1}} - \frac{w_i x'_i}{2^{e_{i+1}}}$$ $$r'_{i+1} = 2^{e_i} r'_i + \frac{r'_i d_i}{2^{e_i + 1}}$$

At every iteration:

$$\sqrt{x} \approx \frac{r'_i x}{2^{e + e_i}}$$

As an example, let's try calculating the square root of $x = 2^{63}$. We happen to know that the answer is $2^{31}\sqrt{2}$. The reciprocal square root is $\frac{1}{\sqrt{2}} 2^{-31}$, so we'll set $e = 31$ (this is the scale of the problem) and for our initial guess we'll pick $r'_0 = 3$ and $e_0 = 2$. (That is, we pick $\frac{3}{4}$ for our initial estimate to $\frac{1}{\sqrt{2}}$.)

Then:

$$e_1 = 4, r'_1 = 11$$ $$e_2 = 8, r'_2 = 180$$ $$e_3 = 16, r'_3 = 46338$$ $$e_4 = 32, r'_4 = 3037000481$$

We can work out when to stop iterating by comparing $e_i$ to $e$; if I've calculated correctly, $e_i > 2e$ should be good enough. We'll stop here, though, and find:

$$\sqrt{2^{63}} \approx \frac{3037000481 \times 2^{63}}{2^{31+32}} = 3037000481$$

The correct integer square root is $3037000499$, so we're pretty close. We could do another iteration, or do an optimised final iteration which doesn't double $e_i$. The details are left as an exercise.
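
Here is a runnable Python sketch of the whole scheme. The function name, the fixed initial guess $r'_0 = 3$, $e_0 = 2$, the stopping rule, and the final floor-correction loop are my own choices; a tuned implementation would do the optimised final iteration instead.

def isqrt_rsqrt(x):
    # Floor square root via the scaled reciprocal-square-root iteration.
    if x < 4:
        return 0 if x == 0 else 1
    # Rescale: largest e with 2^(2e) <= x, so x' = x / 2^(2e) is in [1, 4).
    e = (x.bit_length() - 1) // 2
    # Initial guess r'_0 / 2^(e_0) = 3/4; it converges for any x' in [1, 4).
    rp, ei = 3, 2
    while ei <= 2 * e:
        ei2 = 2 * ei                                      # e_{i+1} = 2 e_i
        w = rp * rp                                       # w_i = r'_i ^ 2
        shift = 2 * e - ei2
        xs = (x >> shift) if shift >= 0 else (x << -shift)  # x'_i
        d = (1 << ei2) - ((w * xs) >> ei2)                # d_i
        rp = (rp << ei) + ((rp * d) >> (ei + 1))          # r'_{i+1}
        ei = ei2
    s = (rp * x) >> (e + ei)  # sqrt(x) ~ r'_i x / 2^(e + e_i)
    # The estimate can be off by a few ulps; correct it to the exact floor.
    while s * s > x:
        s -= 1
    while (s + 1) * (s + 1) <= x:
        s += 1
    return s

print(isqrt_rsqrt(2**63))  # 3037000499

Run on $x = 2^{63}$, its first four iterations reproduce the $(e_i, r'_i)$ values tabulated above.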

To analyse the complexity of this method, note that multiplying two $b$-bit integers takes $O(b \log b)$ operations (counting word operations; see the comments below). However, we have arranged things so that $r'_i < 2^{e_i}$. So the multiplication to calculate $w_i$ multiplies two $e_i$-bit numbers to produce an $e_{i+1}$-bit number, and the other two multiplications multiply two $e_{i+1}$-bit numbers to produce a $2e_{i+1}$-bit number.

In each case, the number of operations per iteration is $O(e_i \log e_i)$, and there are $O(\log e)$ iterations. The final multiplication takes $O(2e \log 2e)$ operations. So the overall complexity is $O(e \log^2 e)$ operations, which is subquadratic in the number of bits of $x$. That ticks all the boxes.

However, this analysis hides an important principle which everyone working with large integers should keep in mind: because multiplication is superlinear in the number of bits, any multiplication should only be performed on integers whose size roughly matches the current precision (and, I might add, you should try to multiply numbers of similar magnitude together). Using integers larger than that is a waste of effort. Constant factors matter, and for large integers, they matter a lot.

As a final observation, two of the multiplications are of the form $\frac{ab}{2^c}$. Clearly it's wasteful to compute all the bits of $ab$ only to throw $c$ of them away with a right shift. Implementing a smart multiplication method which takes this into account is also left as an exercise.
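
To make the waste concrete, the naive version of that primitive looks like this (the name mulshift is hypothetical):

def mulshift(a, b, c):
    # Naive: computes every bit of a*b, then discards the low c bits.
    return (a * b) >> c

A smarter version would avoid generating the discarded bits in the first place, for instance by truncating $a$ and $b$ before multiplying; that introduces an error of a few ulps, which this algorithm tolerates anyway.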

  • This is great stuff. One comment, though: isn't the bit complexity of division asymptotically approximately the same as the bit complexity of multiplication? So you're talking about a constant-factor improvement, not an asymptotic improvement, right? That wasn't entirely clear from your answer. Commented Jan 27, 2015 at 6:18
  • You say that multiplying two $b$-bit integers takes $O(b \lg b)$ bit operations. I think the correct answer is something like $O(b \lg b \, (\lg \lg b)^{O(1)})$ (right?). You might want to indicate that you are ignoring poly-log-log factors (e.g., by putting a tilde over your big O, or something). Commented Jan 27, 2015 at 6:19
  • @D.W.: No, he says that "multiplying two $b$-bit integers takes $O(b\log b)$ operations." The word "bit" only appears once in that; otherwise I would've already pointed that out. Commented Jan 27, 2015 at 6:21
  • It is a matter of constant factors, yes. The best large-integer division algorithms use a technique very similar to the whole algorithm, such as Newton–Raphson iteration with the effective precision doubling on each iteration. A Newton–Raphson loop within a Newton–Raphson loop piles on the constant factors! Ricky Demer is correct; I was thinking in the word RAM model. I probably should have mentioned this. Commented Jan 27, 2015 at 13:31
