Node:Accelerated GCD, Next:Extended GCD, Previous:Binary GCD, Up:Greatest Common Divisor Algorithms
For sizes above GCD_ACCEL_THRESHOLD
, GMP uses the Accelerated GCD
algorithm described independently by Weber and Jebelean (the latter as the
"Generalized Binary" algorithm), see References. This algorithm is
still O(N^2), but is much faster than the binary algorithm since it
does fewer multi-precision operations. It consists of alternating the
k-ary reduction by Sorenson, and a "dmod" exact remainder reduction.
For operands u and v the k-ary reduction replaces
u with n*v-d*u where n and d are single limb
values chosen to give two trailing zero limbs on that value, which can be
stripped. n and d are calculated using an algorithm similar to
half of a two limb GCD (see find_a
in mpn/generic/gcd.c
).
When u and v differ in size by more than a certain number of bits, a dmod is performed to zero out bits at the low end of the larger. It consists of an exact remainder style division applied to an appropriate number of bits (see Exact Division, and see Exact Remainder). This is faster than a k-ary reduction but useful only when the operands differ in size. There's a dmod after each k-ary reduction, and if the dmod leaves the operands still differing in size then it's repeated.
The k-ary reduction step can introduce spurious factors into the GCD calculated, and these are eliminated at the end by taking GCDs with the original inputs gcd(u,gcd(v,g)) using the binary algorithm. Since g is almost always small this takes very little time.
At small sizes the algorithm needs a good implementation of find_a
. At
larger sizes it's dominated by mpn_addmul_1
applying n and
d.