Question

Multiplication of huge massive of numbers in python

I'm working on a small python program for myself and I need an algorithm for fast multiplication of a huge array with numbers (over 660 000 numbers, each is 9 digits). The result number is over 4 millions digits. Currently I'm using math.prod, which calculates it in ~10 minutes, but that's too slow, especially if I want to increase amount of numbers.

I checked some algorithms for faster multiplications, for example Schönhage–Strassen algorithm and Toom–Cook multiplication, but I didn't understand how they works or how to make them. I tried some versions that I've found on the internet, but they're not working too well and are even slower. I wonder if someone knows how to multiplicate these amounts of numbers faster, or could explain how to use some math to do this?

5 218 5

1 Jan 1970

Solution

There are two keys to making this fast. First, using the fastest mult implementation you can get. For "sufficiently large" multiplicands, Python's Karatsuba mult is O(n^1.585). The decimal module's much fancier NTT mult is more like O(n log n). But fastest of all is to install the gmpy2 extension package, which wraps GNU's GMP library, whose chief goal is peak speed. That has essentially the same asymptotics as decimal mult, but with a smaller constant factor.

Second, the advanced mult algorithms work best when multiplying two large ints of about the same size (number of bits). You can leave that to luck, or, as below, you can force it by using a priority queue and, at each step, multiplying the "two smallest" partial products remaining.

from gmpy2 import mpz
from heapq import heapreplace, heappop, heapify

# Assuming your input ints are in `xs`.
mpzs = list(map(mpz, xs))
heapify(mpzs)
for _ in range(len(mpzs) - 1):
    heapreplace(mpzs, heappop(mpzs) * mpzs[0])
assert len(mpzs) == 1
# the result is mpzs[0]

That's the code I'd use. Note that the cost of recursion (which this doesn't use) is trivial compared to the cost of huge-int arithmetic. Heap operations are more expensive than recursion, but still relatively cheap, and can waaaaay more than repay their cost if the input is in an order such that the "by luck" methods aren't lucky enough.

2024-07-16

Tim Peters

Solution

math.prod will accumulate a product one number at a time. You can do better by recursively dividing the list, taking the product of each half, for example, which reduces the size of the intermediate products.

This runs in a few seconds for me:

import math


def recursive_prod(ns, r):
    if len(r) <= 10:  # arbitrary small base case
        return math.prod(ns[i] for i in r)

    split_at = len(r) // 2
    return recursive_prod(ns, r[:split_at]) * recursive_prod(ns, r[split_at:])


import random
ns = [random.randrange(1_000_000_000) for _ in range(660_000)]
p = recursive_prod(ns, range(len(ns)))

2024-07-16

Ry-

Solution

A tricky one I've been using, always multiplying the oldest two not-yet-multiplied numbers until only one is left:

from operator import mul

def prod(ns):
    ns = list(ns)
    it = iter(ns)
    ns += map(mul, it, it)
    return ns[-1]

About as fast as @Ry-'s. The speed comes from Karatsuba kicking in when multiplying two large numbers instead of one large and one tiny.

And using decimal seems about five times faster due to even faster multiplication algorithms (and has the advantage of being much faster to print in decimal, if you want that):

from operator import mul
from decimal import *

setcontext(Context(prec=MAX_PREC, Emax=MAX_EMAX))

def prod(ns):
    ns = list(map(Decimal, ns))
    it = iter(ns)
    ns += map(mul, it, it)
    return ns[-1]

2024-07-16

no comment