### Arithmetic on bigInts

### L. Allison, FIT, Monash University, Australia 3800.

#### 16 August 2021

### 1. Introduction

An arbitrarily big integer, a *bigInt*, can be represented by a
sign (±) and a vector (array) of *digits* in some *base*:

sign D[d-1], ..., D[0]

where the base is chosen so that each digit fits in an integer
as can be manipulated by basic computer operations.
Addition (subtraction) of two bigInts can be performed digit by digit
taking account of any carries (borrows), e.g., in base ten,
9+9+1
= 19
= 9 carry 1;
it is convenient if
'digit_{1}+digit_{2}+carry'
cannot overflow.
Multiplication
of two bigInts can be performed
by various algorithms that ultimately rely on
basic computer operations to multiply individual digits
and add any *carries*.
Consider for example,
in base ten,

9×9+8
= 81+8
= 89
= 9 carry 8

and,
in base 16,

1111×1111+1110
= 11101111
= 1111 carry 1110

(15×15+14
= 239
= 14×16+15
= 15 carry 14).

We see that the product of two b-bit "digits" can produce a 2b-bit result.
If a 2b-bit integer fits in one computer Int then two digits can be
multiplied by a basic computer multiply.
Failing that, two digits must be multiplied
by a small software procedure of a few computer operations.
Division of bigInts can be performed by multiplication
combined with other operations on bigInts.

Timings of *long* division and *long* multiplication
are reasonably consistent with
quadratic time-complexity O(d^{2}) (d~log N),
div having a constant several times bigger
(but this version is operating bit-wise not digit-wise like '×').
Note that N.toString() does a special 'divide by 10' repeatedly
(log_{10}(N) times) and takes maybe
O(d^{2})-time overall but is very lumpy (try 64K bits or more),
while N.toBinary() is *very* fast of course.
(And the garbage collector may kick in at any time.)

### 2. Addition and Subtraction

Addition and subtraction of d-digit integers take O(d)time.

### 3. Multiplication

### 3.1. Long-Multiplication

1 2 3 x 4 5 6 ----------. 4 9 2 | 4 6 1 5 | 5 7 3 8 | 6 --------- 5 6 0 8 8Figure 2: Long-Multiplication.

For d-digit integers,
long-multiplication takes O(d^{2})-time.

### 3.2. Karatsuba ×

Multiply two d-digit integers by multiplying three half-length, (d/2)-digit, integers [Kar62].

- A;B and C;D are big integers.
- A;B × C;D = A×C<< + (A×D+B×C)< + B×D
- now (A×D+B×C) = −(A−B)×(C−D) + A×C + B×D (also = (A+B)×(C+D) − A×C − B×D), hence
- A;B × C;D = A×C<< + A×C< − (A−B)×(C−D)< + C×D< + C×D
- So to multiply A;B × C;D we only need to do three multiplications of half-size integers, A×C, C×D and (A−B)×(C−D), and some addition, subtraction and left-shift (<).

The algorithm has O(d^{log2(3)}) time complexity
for d-digit multiplication, log_{2}(3)≈1.58,
so it beats long-multiplication for big enough bigInts.

#### 2.2.1 The polynomial view:

An integer corresponds to a polynomial, the integer's digits being the coefficients of the polynomial. Multiplying two integers is mirrored by multiplying their corresponding polynomials to give the product polynomial. Recall that a polynomial of degree 'd' is defined by its values at d+1 points. In particular, a quadratic is defined by its values at three points, e.g., 0, 1 and −1, say.

- f(x) × g(x) = (Ax + B) × (Cx + D)
= Wx
^{2}+ Yx + Z = p(x).

- p(0) = Z = f(0) × g(0) = (0+B)×(0+D) = BD
- p(1) = W + Y + Z = (A+B)×(C+D)
- p(−1) = W − Y + Z = (−A+B)×(−C+D),

- p(1)−p(−1) = 2Y = (A+B)×(C+D) − (−A+B)×(−C+D) = 2(AD + BC)
- p(1)+p(−1) = 2(W + Z) = (A+B)×(C+D) + (−A+B)×(−C+D) = 2(AC + BD)
- W = ×AC, Y = AD+BC, Z = ×BD, so it is right
- and, looking at p(−1), Y = W+Z−(A−B)×(C−D)

### 3.3. Toom-Cook × (Toom3)

Multiply two d-digit integers by multiplying five (d/3)-digit integers, version after M. Bodrato [Bod07].

- A;B;C × D;E;F
- f(x) × g(x)
= (Ax
^{2}+ Bx + C) × (Dx^{2}+ Ex + F) = Ux^{4}+ Vx^{3}+ Wx^{2}+ Yx + Z = p(x)

- p(0) = Z = f(0) × g(0) = C×F
- p(1) = U+V+W+Y+Z = (A+B+C) × (D+E+F)
- p(−1) = U−V+W−Y+Z = (A−B+C) × (D−E+F)
- p(−2) = 16U−8V+4W−2Y+Z = (4A−2B+C) × (4D−2E+F)
- p(∞): U = A×D

- Z = p(0) = CF ✓
- U = p(∞) = AD ✓
- r = (p(-2) − p(1))/3

= (((4A−2B+C)×(4D−2E+F)) − ((A+B+C)×(D+E+F)))/3

= 5AD−3AE+AF−3BD+BE−BF+CD−CE - s = (p(1) − p(-1))/2

= ((A+B+C)×(D+E+F)) − ((A−B+C)×(D−E+F)))/2

= AE+BD+BF+CE - t = p(-1) − Z

= (A−B+C)×(D−E+F)−CF

= AD−AE+AF−BD+BE−BF+CD−CE - V = (t − r)/2 + 2.U

= AE+BD ✓ - W = t + s − U

= AF+BE+CD ✓ - Y = s − V

= BF+CE ✓ - note five '×' (and some shifts, '+'s and '−'s) to calculate p(0), p(1), p(−1), p(−2) and p(∞) and from them U, V, W, Y and Z.

The algorithm has O(d^{log3(5)}) time complexity
(log_{3}(5)≈1.465).
This beats Karatsuba *asymptotically* but the more extensive
house-keeping makes for a larger constant of proportionality.

### 3.4. Faster Multiplication

The Schonhage-Strassen algorithm [SS71],
running in O(n log n log log d)-time,
was the asymptotically fastest multiplication algorithm for
more than four decades.
Harvey and van der Hoeven [HV19] finally removed the log log d term
(as S and S had speculated would be possible); the improvement is of
great theoretical interest but little *practical* use.

1 2 3 123 x 456 <-------- 6 | 7 3 8 | . 5 | 6 1 5 . | . . 4 | 4 9 2 . 8 v . . . 8 . . 0 . 6 5 123 x 456 = 56088Figure 4: Another Angle on Long-Multiplication.

Long multiplication is closely related to vector *convolution*,
M conv N, where we pair-wise multiply elements and
add up diagonals and that is all,
the carries are dealt with afterwards
(this is *linear* convolution below):

1 2 3 123 conv 456 <-------- 6 | 6 12 18 | . 5 | 5 10 15 . | . . 4 | 4 8 12 . 18 v . . . 27 . . 28 . 13 4 4,13,28,27,18 >carries> 5, 6, 0, 8, 8, i.e., 56088Figure 5: Linear Convolution

Trouble is, the obvious way to do the above still takes quadratic time, O(d

^{2}), but there is a faster method.

The Schonhage-Strassen algorithm [SS71] relies on
the Fast Fourier Transform (FFT) algorithm [CT65] which
calculates the Discrete Fourier Transform (F) of a vector
and on the fact that

conv(X, Y) = F^{-1}( F(X) . F(Y) )

where F(X) . F(Y) is element-wise multiplication.

The FFT and FFT^{-1} algorithms carry out O(d log d)
arithmetic operations.

### 4. Division

To divide M≥0 by N>0, M ÷ N, find quotient Q≥0 and remainder r∈[0, N−1] such that M = Q×N + r, e.g., for 14÷4, Q=3 and r=2.

### 4.1. Long-Division

Long-division has O(d^{2}) time complexity.

e.g., 3219876 ÷ 394 8172 rem 108 .------- 394 |3219876 x8 = 3152 ---- 678 x1 = 394 --- 2847 x7 = 2758 ---- 896 x2 = 788 --- 108Figure 3: Long-Division

Long-division can be carried out in any base.
If the base of the *stored* integers is a power of two (2^{k})
the algorithm can still act in terms of binary (2^{1}) and
this is the simplest version to implement
because the only possible digit-multipliers are zero and one,
however the time-complexity does decrease as the base increases
because there are fewer digits.

### 4.2. Recursive Division

There are recursive algorithms [BZ98] to divide bigInts, e.g.,

To divide M by N, giving a quotient, Q, and a remainder:

Initialise Q := 0.

If M < N return 0 remainder M.

If M is "short", or if N is "nearly" as long as M, use long-division.

Otherwise M > N, M is "long" and M is longer than N.

If N is "long"
set *half-length* integers N1 and N0 such that N=N1;N0 and
set N1' =N1+1
(note that (N1'<<^{|N0|}) > N),
otherwise N is "short",
set N1' = N.

Set *half-length* integers M1 and M0 such that M=M1;M0;
note that
M1 <<^{|M0|} ≤ M.
Set
Q_{1} := (M1 **÷** N1') <<^{|M0|-|N0|}
(←recursion);
note that
Q_{1} ≤ M ÷ N
and
Q_{1} > 0.
Set
Q := Q + Q_{1},
and
M := M−Q_{1}**×**N.

Repeat,
accumulating the quotient Q,
until M < N, which leaves the remainder in M.

(<<^{k} is left-shift by k digit places and
|X| is the # of digits in X.)

Naturally this division algorithm avails itself of the fastest multiplication algorithm.

### 5. Conclusion

To make the very fastest implementation of operations on large integers it is best to use a programming language that is close to the underlying hardware and C, C++ or assembler are obvious choices. Javascript is not that kind of choice but it does run in web browsers which is convenient for interactive demonstrations of algorithms.

### References

- [AKS21] L. Allison, A. S. Konagurthu and D. F. Schmidt 'On universal codes for integers: Wallace Tree, Elias Omega and beyond', Data Compression Conference (DCC), IEEE, pp.313-322, March 2021 [more] [doi:10.1109/DCC50243.2021.00039]
- An application
of bigInts to
encoding and decoding large integer values for data compression and
machine learning.

- [Bod07] M. Bodrato. 'Toward optimal Toom-Cook multiplication for univariate and multivariate polynomials in characteristic 2 and 0', Int. Workshop on Arith. & Finite Fields (WAIFI), Springer, 2007, [doi:10.1007/978-3-540-73074-3_10].
- Also see
Toom-Cook@[wikip]['21].

- [BZ98] C. Burnikel and J. Ziegler,
'Fast recursive division',
Im Stadtwald, D-Saarbrucken, 1998
[www]['21].

- [CT65] J. W. Cooley and J. W. Tukey, 'An algorithm for the machine calculation of complex Fourier series', Math. Comp., 19, 297-301, 1965 [doi:10.1090/S0025-5718-1965-0178586-1]
- Also see the
FFT@[wikip]['21].

- [HV19] D. Harvey and J. Van Der Hoeven,
'Integer multiplication in time O(n log n)',
HALarchive, 2019
[www]['21].

- [Kar62] A. Karatsuba and Yu. Ofman, 'Multiplication of many-digital numbers by automatic computers', Proc. USSR Acad. Scis, 145, pp.293-294, 1962
- Translation in the acad. j.
Physics-Doklady, 7, pp.595-596, 1963.

Also see algorithm@[wikip]['21].

- [SS71] A. Schonhage and V. Strassen, 'Schnelle multiplikation grosser zahlen', Computing, 7, pp.281-292, 1971 [doi:10.1007/BF02242355]
- O(n log n log log n)-time.
Also see
algorithm@[wikip]['21].