Decimal Arithmetic Benchmarks

The sources for various benchmarks are available at the download page. Some Python benchmarks are explained in more detail at the quickstart page.

C Libraries

This benchmark runs a modified version of the escape time algorithm for drawing a Mandelbrot set. It was chosen simply because it contains a large number of multiplications, additions and subtractions.
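For orientation, the textbook escape time iteration can be sketched in Python with the standard `decimal` module. This is not the benchmark source (which is a modified version and is written against the C libraries); the function name `escape_time` and the sample point are illustrative assumptions. The sketch mainly shows why the inner loop is dominated by multiplications, additions and subtractions.

```python
from decimal import Decimal, getcontext


def escape_time(cr, ci, max_iter):
    """Return the iteration at which |z|^2 first exceeds 4, or max_iter.

    Plain textbook escape time loop for z -> z*z + c, with c = cr + ci*i.
    Hypothetical helper, not taken from the benchmark sources.
    """
    zr = zi = Decimal(0)
    two = Decimal(2)
    four = Decimal(4)
    for i in range(max_iter):
        zr2 = zr * zr              # multiplication
        zi2 = zi * zi              # multiplication
        if zr2 + zi2 > four:       # addition and comparison
            return i
        zi = two * zr * zi + ci    # uses the old zr, so update zi first
        zr = zr2 - zi2 + cr        # subtraction and addition
    return max_iter


if __name__ == "__main__":
    getcontext().prec = 16         # matches the "16 digits" rows
    print(escape_time(Decimal("0.222"), Decimal("0.333"), 1000))
```

Changing `getcontext().prec` to 9, 19, 34 or 38 corresponds to the other precision rows in the tables below.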

The contestants are libmpdec, DecNumber and IntelRDFPMathLib.

These are the data types of the three libraries. The arbitrary precision data types and the IEEE types are compatible:

                     libmpdec   DecNumber   IntelRDFPMathLib
arbitrary precision  mpd        decNumber   -
IEEE 16 digits       -          decDouble   bid64
IEEE 34 digits       -          decQuad     bid128

Mandelbrot, 64-bit

            mpd     decDouble  decQuad  decNumber  bid64   bid128
16 digits   4.94s   6.10s      -        7.70s      2.40s   -
34 digits   6.78s   -          11.40s   13.10s     -       8.07s
9 digits    4.68s   -          -        5.69s      -       -
19 digits   4.84s   -          -        9.44s      -       -
38 digits   6.67s   -          -        16.02s     -       -

3.16GHz Core 2 Duo, 10000000 iterations

Mandelbrot, 32-bit

            mpd     decDouble  decQuad  decNumber  bid64   bid128
16 digits   4.57s   3.74s      -        5.45s      2.37s   -
34 digits   7.88s   -          7.89s    9.09s      -       9.15s
9 digits    3.47s   -          -        3.74s      -       -
19 digits   5.32s   -          -        6.55s      -       -
38 digits   9.79s   -          -        10.84s     -       -

Athlon 3700+, 3000000 iterations

Python, Java

This benchmark compares the Python modules cdecimal and decimal.py with Java’s BigDecimal class. Python’s native binary float type and gmpy’s binary mpf type are included for a better overview.

The benchmark runs 10000 iterations of calculating pi to various precisions.
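A minimal sketch of such a measurement is shown here. It assumes the series from the recipes section of the Python `decimal` documentation; whether the benchmark uses exactly this series is an assumption, and `pi_decimal` and `time_pi` are hypothetical names. On Python 3.3 and later, the standard `decimal` module is the C implementation derived from cdecimal.

```python
import time
from decimal import Decimal, localcontext


def pi_decimal(prec):
    """Compute pi to `prec` significant digits with a simple series."""
    with localcontext() as ctx:
        ctx.prec = prec + 2        # two guard digits for intermediate steps
        lasts, t, s = Decimal(0), Decimal(3), Decimal(3)
        n, na, d, da = 1, 0, 0, 24
        while s != lasts:          # stop once the sum no longer changes
            lasts = s
            n, na = n + na, na + 8
            d, da = d + da, da + 32
            t = (t * n) / d
            s += t
        ctx.prec = prec
        return +s                  # unary plus rounds to the final precision


def time_pi(prec, iterations):
    """Return the wall-clock seconds for `iterations` calls of pi_decimal."""
    start = time.perf_counter()
    for _ in range(iterations):
        pi_decimal(prec)
    return time.perf_counter() - start


if __name__ == "__main__":
    for prec in (9, 19, 38, 100):  # precisions used in the tables
        print(f"{prec} digits: {time_pi(prec, 1000):.2f}s")
```

The iteration count is reduced to 1000 here to keep the example quick; the tables below use 10000.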

Pi, 64-bit

             Python floats  cdecimal  cdecimal-nt [1]  gmpy    decimal  Java [2]
9 digits     0.12s          0.27s     0.24s            0.52s   17.61s   0.38s
19 digits    -              0.58s     0.55s            0.52s   42.75s   0.73s
38 digits    -              1.32s     1.21s            1.07s   -        1.25s
100 digits   -              4.52s     4.08s            3.57s   -        5.35s

3.16GHz Core 2 Duo

Pi, 32-bit

             Python floats  cdecimal  cdecimal-nt [1]  gmpy    decimal  Java [2]
9 digits     0.33s          0.78s     0.73s            1.25s   34.04s   1.82s
19 digits    -              1.77s     1.68s            1.25s   80.46s   1.61s
38 digits    -              3.95s     3.93s            2.45s   -        2.80s
100 digits   -              12.11s    11.70s           7.42s   -        13.72s

Athlon 3700+

[1] Using the --without-threads option.
[2] Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)

Python

Telco, 64-bit

The telco benchmark was devised by Mike Cowlishaw as a way of measuring decimal performance in a real-world telecom application.
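The kind of work involved can be sketched as a per-call billing step: multiply a duration by a rate, round to cents, then add truncated taxes. The rates, tax percentages and the name `bill_call` below are illustrative assumptions rather than values taken from this page, and the real benchmark also reads its input from a file and accumulates totals.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

# Illustrative constants (assumptions, not taken from this page).
BASIC_RATE = Decimal("0.0013")     # price per second, local call
DIST_RATE = Decimal("0.00894")     # price per second, long-distance call
BASIC_TAX = Decimal("0.0675")      # tax applied to every call
DIST_TAX = Decimal("0.0341")       # extra tax on long-distance calls
CENT = Decimal("0.01")


def bill_call(seconds, long_distance):
    """Return the total charge for one call, rounded to cents."""
    rate = DIST_RATE if long_distance else BASIC_RATE
    # The price is rounded half-even to two decimal places.
    price = (Decimal(seconds) * rate).quantize(CENT, rounding=ROUND_HALF_EVEN)
    # Taxes are truncated to whole cents before being added.
    total = price + (price * BASIC_TAX).quantize(CENT, rounding=ROUND_DOWN)
    if long_distance:
        total += (price * DIST_TAX).quantize(CENT, rounding=ROUND_DOWN)
    return total


if __name__ == "__main__":
    print(bill_call(100, False))
    print(bill_call(1000, True))
```

Each call costs a handful of multiplications, additions and `quantize` operations, which is why the per-operation overhead of the decimal implementation dominates the run time.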

        decimal   cdecimal
telco   172.19s   5.68s

3.16GHz Core 2 Duo

Bignum Benchmarks

Arbitrary Precision Libraries

This benchmark compares libmpdec, apfloat and mpfr. Both libmpdec and apfloat use a power-of-ten base, so it is no surprise that mpfr, which works in binary, leads every benchmark with respect to pure calculation time [3]. However, when the result of a large calculation needs to be converted to decimal, libmpdec and apfloat are faster overall. To show this, the multiplication timings are given as calculation time + conversion time. All other benchmarks show calculation time only.

Various Functions, 64-bit

func/prec/iter      libmpdec          apfloat           mpfr
mul/5000/1          0.00s + 0.00s     0.00s + 0.00s     0.00s + 0.00s
mul/100000/1        0.02s + 0.00s     0.01s + 0.01s     0.01s + 0.02s
mul/1000000/1       0.14s + 0.03s     0.19s + 0.03s     0.04s + 0.75s
mul/10000000/1      2.32s + 0.26s     2.96s + 0.35s     0.64s + 17.44s
mul/100000000/1     20.59s + 2.65s    32.59s + 3.40s    8.56s + 328.59s
div/5000/1          0.01s             0.00s             0.00s
div/100000/1        0.04s             0.05s             0.02s
div/1000000/1       0.62s             0.51s             0.16s
div/10000000/1      7.59s             7.91s             2.85s
div/100000000/1     75.98s            88.64s            49.14s
sqrt/9/100000       0.18s             -                 0.01s
sqrt/19/100000      0.26s             -                 0.01s
sqrt/38/100000      0.36s             0.72s             0.02s
sqrt/1000000/1      1.59s [4]         0.82s             0.13s
sqrt/10000000/1     22.50s            12.67s            2.31s
invroot/1000000/1   0.62s             0.64s             0.11s
invroot/10000000/1  8.44s             9.79s             1.54s
exp/9/100000        0.73s             -                 0.45s
exp/19/100000       1.21s             -                 0.65s
exp/38/100000       2.16s             70.96s [5]        1.71s
exp/5000/1          1.04s             0.62s             0.01s
ln/9/100000         1.27s             -                 0.25s
ln/19/100000        2.00s             -                 0.28s
ln/38/100000        3.30s             20.90s [5]        0.48s
ln/5000/1           2.25s             0.20s             0.01s

3.16GHz Core 2 Duo

[3] mpfr also uses very clever algorithms.
[4] 1.39s without the overhead for a correctly rounded square root.
[5] apfloat’s exp and ln algorithms are optimized for very large numbers.

Python

This benchmark compares the Python modules cdecimal and gmpy. Python’s native integer type is included for a better overview. The benchmark calculates the factorial of huge numbers. The timings are split into pure calculation time and conversion time to decimal string.
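The split between calculation time and conversion time can be sketched as follows. This is a simplified illustration, not the benchmark script: `factorial_decimal` and `split_timing` are hypothetical helpers, the factorial is a naive product rather than whatever algorithm the benchmark uses, and `n` is kept small. The standard `decimal` module stands in for cdecimal (it is the C implementation derived from it since Python 3.3).

```python
import math
import time
from decimal import Context, Decimal


def factorial_decimal(n, ctx):
    """Naive exact factorial using Decimal; ctx.prec must cover all digits."""
    result = Decimal(1)
    for k in range(2, n + 1):
        result = ctx.multiply(result, Decimal(k))
    return result


def split_timing(compute, convert):
    """Time compute() and convert(value) separately.

    Returns (calculation_seconds, conversion_seconds, converted_text).
    """
    t0 = time.perf_counter()
    value = compute()
    t1 = time.perf_counter()
    text = convert(value)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, text


if __name__ == "__main__":
    n = 1000                       # small on purpose; the tables use 100000 and up
    ctx = Context(prec=5000)       # 1000! has 2568 digits, so this is exact

    calc, conv, as_int = split_timing(lambda: math.factorial(n), str)
    print(f"Python integers: {calc:.4f}s + {conv:.4f}s")

    calc, conv, as_dec = split_timing(lambda: factorial_decimal(n, ctx), str)
    print(f"decimal:         {calc:.4f}s + {conv:.4f}s")

    assert as_int == as_dec        # both conversions yield the same digits
```

A decimal type stores its coefficient in a power-of-ten base, so the conversion to a string is close to a linear copy, whereas a binary integer has to be converted digit by digit with repeated divisions. Note that Python 3.11 and later limit `str()` on integers to 4300 digits by default, so larger values of `n` need `sys.set_int_max_str_digits` for the integer column.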

Factorial, 64-bit

            Python integers    cdecimal          cdecimal-nt [6]   gmpy
100000!     1.65s + 78.66s     0.48s + 0.01s     0.45s + 0.01s     0.18s + 0.18s
1000000!    78.54s + >2h       7.84s + 0.07s     7.56s + 0.07s     2.76s + 6.52s
10000000!   -                  99.17s + 0.87s    96.08s + 0.87s    46.57s + 165.80s

3.16GHz Core 2 Duo

Factorial, 32-bit

            Python integers    cdecimal          cdecimal-nt [6]   gmpy
100000!     2.72s + 119.81s    1.73s + 0.01s     1.71s + 0.01s     0.36s + 0.34s
1000000!    103.85s + >2h      29.99s + 0.12s    29.69s + 0.12s    5.47s + 11.30s
10000000!   -                  381.89s + 1.48s   380.51s + 1.49s   86.46s + 291.41s

Athlon 3700+

[6] Using the --without-threads option.