ParaMonte Fortran 2.0.0
Parallel Monte Carlo and Machine Learning Library
See the latest version documentation.
pm_mathSum::getSum Interface Reference

Generate and return the expression sum(x) accurately (almost without roundoff error).
More...

Detailed Description

Generate and return the expression sum(x) accurately (almost without roundoff error).

Parameters
[in]x: The input vector of arbitrary size, of
  1. type complex of kind any supported by the processor (e.g., CK, CK32, CK64, or CK128), or
  2. type real of kind any supported by the processor (e.g., RK, RK32, RK64, or RK128),
representing the x value whose sum of elements is to be returned.
[in]method: The input scalar object that can be,
  1. the constant iteration or equivalently, an object of type iteration_type.
    Specifying this option forces the use of normal iterative approach to summing all array elements.
    This option is equivalent to the default implementations of the Fortran intrinsic sum().
    This approach is the fastest serial method among all, but also generally the least accurate.
  2. the constant recursion or equivalently, an object of type recursion_type.
    Specifying this option forces the use of recursive pairwise approach to summing all array elements.
  3. the constant kahanbabu or equivalently, an object of type kahanbabu_type.
    Specifying this option forces the use of the Kahan-Babuska compensated approach to summing all array elements.
    This algorithm, while accurate, can be up to 2-4 times more expensive than the iterative approach discussed above.
  4. the constant fablocked or equivalently, an object of type fablocked_type.
    Specifying this option forces the use of the Fast Accurate Blocked approach to summing all array elements.
  5. the constant nablocked or equivalently, an object of type nablocked_type.
    Specifying this option forces the use of the Naive Blocked approach to summing all array elements.
The presence of this argument is merely for compile-time resolution of the procedures of this generic interface.
(optional, default = fablocked.)
Returns
sumres : The output scalar of the same type and kind containing the result of summing all elements of the input x.


Possible calling interfaces

use pm_mathSum, only: getSum, iteration, recursion, kahanbabu, fablocked, nablocked
sumres = getSum(x)
sumres = getSum(x, method)
Generate and return the expression sum(x) accurately (almost without roundoff error).
Definition: pm_mathSum.F90:367
This module contains procedures and generic interfaces for computing sum(x) accurately when x is a lo...
Definition: pm_mathSum.F90:72
type(kahanbabu_type), parameter kahanbabu
This is a scalar parameter object of type kahanbabu_type.
Definition: pm_mathSum.F90:266
type(fablocked_type), parameter fablocked
This is a scalar parameter object of type fablocked_type.
Definition: pm_mathSum.F90:143
type(nablocked_type), parameter nablocked
This is a scalar parameter object of type nablocked_type.
Definition: pm_mathSum.F90:206
Warning
The returned value is 0 is the size of the condition size(x) == 0 holds.
The pure procedure(s) documented herein become impure when the ParaMonte library is compiled with preprocessor macro CHECK_ENABLED=1.
By default, these procedures are pure in release build and impure in debug and testing builds.
See also
get1mexp
getLog1p
getCumSum
getLogAddExp
getLogSubExp
getLogSumExp


Example usage

1program example
2
3 use pm_kind, only: SK, IK, LK, RKH
4 use pm_distUnif, only: setUnifRand
7 use pm_mathSum, only: getSum, iteration, recursion, kahanbabu, fablocked, nablocked
8 use pm_io, only: display_type
9
10 implicit none
11
12 real(RKH) :: truth
13 real(RKH), allocatable :: sumres(:), relerr(:)
14 type(display_type) :: disp
15
16 disp = display_type(file = "main.out.F90")
17
18 block
19 use pm_kind, only: TKG => RKS
20 integer(IK), parameter :: lenx = 10**7
21 real(TKG) :: x(lenx), lb, ub
22 call disp%skip()
23 call disp%show("lenx")
24 call disp%show( lenx )
25 call disp%show("lb = real(lenx, TKG); ub = 1._TKG")
26 lb = real(lenx, TKG); ub = 1._TKG
27 call disp%show("call setLinSpace(x, lb, ub) ! call setUnifRand(x)")
28 call setLinSpace(x, lb, ub) ! call setUnifRand(x)
29 call disp%show("truth = (real(ub, TKG) + real(lb, RKH)) * size(x, 1, IK) / 2 ! reference high-precision value for comparison")
30 truth = (real(ub, TKG) + real(lb, RKH)) * size(x, 1, IK) / 2 ! reference high-precision value for comparison
31 call disp%show("truth")
32 call disp%show( truth )
33 call disp%show("sumres = [getSum(x), getSum(x, iteration), getSum(x, recursion), getSum(x, kahanbabu), getSum(x, fablocked), getSum(x, nablocked)]")
34 sumres = [getSum(x), getSum(x, iteration), getSum(x, recursion), getSum(x, kahanbabu), getSum(x, fablocked), getSum(x, nablocked)]
35 call disp%show("sumres")
36 call disp%show( sumres )
37 call disp%show("relerr = abs(truth - sumres) / truth")
38 relerr = abs(truth - sumres) / truth
39 call disp%show("relerr")
40 call disp%show( relerr )
41 call disp%show("getRankDense(relerr)")
42 call disp%show( getRankDense(relerr) )
43 call disp%skip()
44 end block
45
46 block
47 use pm_kind, only: TKG => RKD
48 integer(IK), parameter :: lenx = 10**8
49 real(TKG) :: x(lenx), lb, ub
50 call disp%skip()
51 call disp%show("lenx")
52 call disp%show( lenx )
53 call disp%show("lb = real(lenx, TKG); ub = 1._TKG")
54 lb = real(lenx, TKG); ub = 1._TKG
55 call disp%show("call setLinSpace(x, lb, ub) ! call setUnifRand(x)")
56 call setLinSpace(x, lb, ub) ! call setUnifRand(x)
57 call disp%show("truth = (real(ub, TKG) + real(lb, RKH)) * size(x, 1, IK) / 2 ! reference high-precision value for comparison")
58 truth = (real(ub, TKG) + real(lb, RKH)) * size(x, 1, IK) / 2 ! reference high-precision value for comparison
59 call disp%show("truth")
60 call disp%show( truth )
61 call disp%show("sumres = [getSum(x), getSum(x, iteration), getSum(x, recursion), getSum(x, kahanbabu), getSum(x, fablocked), getSum(x, nablocked)]")
62 sumres = [getSum(x), getSum(x, iteration), getSum(x, recursion), getSum(x, kahanbabu), getSum(x, fablocked), getSum(x, nablocked)]
63 call disp%show("sumres")
64 call disp%show( sumres )
65 call disp%show("relerr = abs(truth - sumres) / truth")
66 relerr = abs(truth - sumres) / truth
67 call disp%show("relerr")
68 call disp%show( relerr )
69 call disp%show("getRankDense(relerr)")
70 call disp%show( getRankDense(relerr) )
71 call disp%skip()
72 end block
73
74 block
75 call disp%skip()
76 call disp%show("[getSum([real ::]), getSum([real :: 1]), getSum([real :: 1, 1]), getSum([real :: 1, 1, 1])]")
77 call disp%show( [getSum([real ::]), getSum([real :: 1]), getSum([real :: 1, 1]), getSum([real :: 1, 1, 1])] )
78 call disp%skip()
79 end block
80
81end program example
Generate and return the Dense rank of the input scalar string or contiguous array of rank 1 in ascend...
Return the linSpace output argument with size(linSpace) elements of evenly-spaced values over the int...
Return a uniform random scalar or contiguous array of arbitrary rank of randomly uniformly distribute...
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11726
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11508
This module contains procedures and generic interfaces for obtaining various rankings of elements of ...
This module contains procedures and generic interfaces for generating arrays with linear or logarithm...
This module contains classes and procedures for computing various statistical quantities related to t...
This module contains classes and procedures for input/output (IO) or generic display operations on st...
Definition: pm_io.F90:252
type(display_type) disp
This is a scalar module variable an object of type display_type for general display.
Definition: pm_io.F90:11393
This module defines the relevant Fortran kind type-parameters frequently used in the ParaMonte librar...
Definition: pm_kind.F90:268
integer, parameter LK
The default logical kind in the ParaMonte library: kind(.true.) in Fortran, kind(....
Definition: pm_kind.F90:541
integer, parameter IK
The default integer kind in the ParaMonte library: int32 in Fortran, c_int32_t in C-Fortran Interoper...
Definition: pm_kind.F90:540
integer, parameter RKD
The double precision real kind in Fortran mode. On most platforms, this is an 64-bit real kind.
Definition: pm_kind.F90:568
integer, parameter SK
The default character kind in the ParaMonte library: kind("a") in Fortran, c_char in C-Fortran Intero...
Definition: pm_kind.F90:539
integer, parameter RKH
The scalar integer constant of intrinsic default kind, representing the highest-precision real kind t...
Definition: pm_kind.F90:858
integer, parameter RKS
The single-precision real kind in Fortran mode. On most platforms, this is an 32-bit real kind.
Definition: pm_kind.F90:567
Generate and return an object of type display_type.
Definition: pm_io.F90:10282

Example Unix compile command via Intel ifort compiler
1#!/usr/bin/env sh
2rm main.exe
3ifort -fpp -standard-semantics -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe

Example Windows Batch compile command via Intel ifort compiler
1del main.exe
2set PATH=..\..\..\lib;%PATH%
3ifort /fpp /standard-semantics /O3 /I:..\..\..\include main.F90 ..\..\..\lib\libparamonte*.lib /exe:main.exe
4main.exe

Example Unix / MinGW compile command via GNU gfortran compiler
1#!/usr/bin/env sh
2rm main.exe
3gfortran -cpp -ffree-line-length-none -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe

Example output
1
2lenx
3+10000000
4lb = real(lenx, TKG); ub = 1._TKG
5call setLinSpace(x, lb, ub) ! call setUnifRand(x)
6truth = (real(ub, TKG) + real(lb, RKH)) * size(x, 1, IK) / 2 ! reference high-precision value for comparison
7truth
8+50000005000000.0000000000000000000000
9sumres = [getSum(x), getSum(x, iteration), getSum(x, recursion), getSum(x, kahanbabu), getSum(x, fablocked), getSum(x, nablocked)]
10sumres
11+50000000188416.0000000000000000000000, +49423623127040.0000000000000000000000, +50000008577024.0000000000000000000000, +50000004382720.0000000000000000000000, +50000000188416.0000000000000000000000, +49999983411200.0000000000000000000000
12relerr = abs(truth - sumres) / truth
13relerr
14+0.962316703768329623167037683296231645E-7, +0.115276363064363693563630643636935635E-1, +0.715404728459527154047284595271540462E-7, +0.123455987654401234559876544012345606E-7, +0.962316703768329623167037683296231645E-7, +0.431775956822404317759568224043177598E-6
15getRankDense(relerr)
16+3, +5, +2, +1, +3, +4
17
18
19lenx
20+100000000
21lb = real(lenx, TKG); ub = 1._TKG
22call setLinSpace(x, lb, ub) ! call setUnifRand(x)
23truth = (real(ub, TKG) + real(lb, RKH)) * size(x, 1, IK) / 2 ! reference high-precision value for comparison
24truth
25+5000000050000000.00000000000000000000
26sumres = [getSum(x), getSum(x, iteration), getSum(x, recursion), getSum(x, kahanbabu), getSum(x, fablocked), getSum(x, nablocked)]
27sumres
28+5000000050000000.00000000000000000000, +5000000050000000.00000000000000000000, +5000000050000000.00000000000000000000, +5000000050000000.00000000000000000000, +5000000050000000.00000000000000000000, +5000000050000000.00000000000000000000
29relerr = abs(truth - sumres) / truth
30relerr
31+0.00000000000000000000000000000000000, +0.00000000000000000000000000000000000, +0.00000000000000000000000000000000000, +0.00000000000000000000000000000000000, +0.00000000000000000000000000000000000, +0.00000000000000000000000000000000000
32getRankDense(relerr)
33+1, +1, +1, +1, +1, +1
34
35
36[getSum([real ::]), getSum([real :: 1]), getSum([real :: 1, 1]), getSum([real :: 1, 1, 1])]
37+0.00000000, +1.00000000, +2.00000000, +3.00000000
38
39
Benchmarks:


Benchmark :: The effects of method on runtime efficiency

The following program compares the runtime performance of getSum algorithms with the default Fortran sum() intrinsic function.
1! Test the performance of `getSum()` with and without the selection `control` argument.
2program benchmark
3
4 use pm_bench, only: bench_type
5 use pm_distUnif, only: setUnifRand
6 use pm_mathCumSum, only: setCumSum
8 use pm_kind, only: SK, IK, LK, RKH, RK, RKG => RKD
9 use iso_fortran_env, only: error_unit
10
11 implicit none
12
13 integer(IK) :: ibench
14 integer(IK) :: iarr
15 integer(IK) :: arrlen
16 integer(IK) :: fileUnit
17 real(RKG) :: dumsum = 0._RKG
18 real(RKG) :: array(10**8)
19 real(RKG) :: sumres
20 type(bench_type) , allocatable :: bench(:)
21
22 bench = [ bench_type(name = SK_"sum()", exec = getSumFortran, overhead = setOverhead) &
23 , bench_type(name = SK_"fablocked", exec = getSumFAB, overhead = setOverhead) &
24 , bench_type(name = SK_"nablocked", exec = getSumNAB, overhead = setOverhead) &
25 , bench_type(name = SK_"kahanbabu", exec = getSumKAB, overhead = setOverhead) &
26 , bench_type(name = SK_"iteration", exec = getSumIte, overhead = setOverhead) &
27 , bench_type(name = SK_"recursion", exec = getSumRec, overhead = setOverhead) &
28 ]
29
30 write(*,"(*(g0,:,' '))")
31 write(*,"(*(g0,:,' '))") "sum() vs. getSum()"
32 write(*,"(*(g0,:,' '))")
33
34 open(newunit = fileUnit, file = "main.out", status = "replace")
35
36 call setUnifRand(array)
37 write(fileUnit, "(*(g0,:,','))") "Array Size", (bench(ibench)%name, ibench = 1, size(bench))
38 loopOverArraySize: do iarr = 2, 26, 2
39
40 arrlen = 2**iarr
41 write(*,"(*(g0,:,' '))") "Benchmarking with array size", arrlen
42 do ibench = 1, size(bench)
43 bench(ibench)%timing = bench(ibench)%getTiming(minsec = 0.07_RK)
44 end do
45 write(fileUnit,"(*(g0,:,','))") arrlen, (bench(ibench)%timing%mean, ibench = 1, size(bench))
46
47 end do loopOverArraySize
48 write(*,"(*(g0,:,' '))") dumsum
49 write(*,"(*(g0,:,' '))")
50
51 close(fileUnit)
52
53contains
54
55 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
56 ! procedure wrappers.
57 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
58
59 subroutine setOverhead()
60 call getDummy()
61 end subroutine
62
63 subroutine getDummy()
64 dumsum = dumsum + sumres
65 end subroutine
66
67 subroutine getSumFortran()
68 sumres = sum(array(1 : arrlen))
69 call getDummy()
70 end subroutine
71
72 subroutine getSumFAB()
73 use pm_mathSum, only: getSum!, fablocked
74 sumres = getSum(array(1 : arrlen))!, fablocked)
75 call getDummy()
76 end subroutine
77
78 subroutine getSumNAB()
79 use pm_mathSum, only: getSum, nablocked
80 sumres = getSum(array(1 : arrlen), nablocked)
81 call getDummy()
82 end subroutine
83
84 subroutine getSumKAB()
85 use pm_mathSum, only: getSum, kahanbabu
86 sumres = getSum(array(1 : arrlen), kahanbabu)
87 call getDummy()
88 end subroutine
89
90 subroutine getSumIte()
91 use pm_mathSum, only: getSum, iteration
92 sumres = getSum(array(1 : arrlen), iteration)
93 call getDummy()
94 end subroutine
95
96 subroutine getSumRec()
97 use pm_mathSum, only: getSum, recursion
98 sumres = getSum(array(1 : arrlen), recursion)
99 call getDummy()
100 end subroutine
101
102end program benchmark
Allocate or resize (shrink or expand) an input allocatable scalar string or array of rank 1....
Generate and return an object of type timing_type containing the benchmark timing information and sta...
Definition: pm_bench.F90:574
Return the cumulative sum of the input array, optionally in the backward direction and optionally,...
This module contains procedures and generic interfaces for resizing allocatable arrays of various typ...
This module contains abstract interfaces and types that facilitate benchmarking of different procedur...
Definition: pm_bench.F90:41
integer, parameter RK
The default real kind in the ParaMonte library: real64 in Fortran, c_double in C-Fortran Interoperati...
Definition: pm_kind.F90:543
This module contains the procedures and interfaces for computing the cumulative sum of an array.
This is the class for creating benchmark and performance-profiling objects.
Definition: pm_bench.F90:386


Example Unix compile command via Intel ifort compiler

1#!/usr/bin/env sh
2rm main.exe
3ifort -fpp -standard-semantics -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe


Example Windows Batch compile command via Intel ifort compiler

1del main.exe
2set PATH=..\..\..\lib;%PATH%
3ifort /fpp /standard-semantics /O3 /I:..\..\..\include main.F90 ..\..\..\lib\libparamonte*.lib /exe:main.exe
4main.exe


Example Unix / MinGW compile command via GNU gfortran compiler

1#!/usr/bin/env sh
2rm main.exe
3gfortran -cpp -ffree-line-length-none -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe


Postprocessing of the benchmark output

1#!/usr/bin/env python
2
3import matplotlib.pyplot as plt
4import pandas as pd
5import numpy as np
6
7import os
8dirname = os.path.basename(os.getcwd())
9
10fontsize = 14
11
12df = pd.read_csv("main.out", delimiter = ",")
13colnames = list(df.columns.values)
14
15
18
19ax = plt.figure(figsize = 1.25 * np.array([6.4,4.6]), dpi = 200)
20ax = plt.subplot()
21
22for colname in colnames[1:]:
23 plt.plot( df[colnames[0]].values
24 , df[colname].values
25 , linewidth = 2
26 )
27
28plt.xticks(fontsize = fontsize)
29plt.yticks(fontsize = fontsize)
30ax.set_xlabel(colnames[0], fontsize = fontsize)
31ax.set_ylabel("Runtime [ seconds ]", fontsize = fontsize)
32ax.set_title(" vs. ".join(colnames[1:])+"\nLower is better.", fontsize = fontsize)
33ax.set_xscale("log")
34ax.set_yscale("log")
35plt.minorticks_on()
36plt.grid(visible = True, which = "both", axis = "both", color = "0.85", linestyle = "-")
37ax.tick_params(axis = "y", which = "minor")
38ax.tick_params(axis = "x", which = "minor")
39ax.legend ( colnames[1:]
40 #, loc='center left'
41 #, bbox_to_anchor=(1, 0.5)
42 , fontsize = fontsize
43 )
44
45plt.tight_layout()
46plt.savefig("benchmark." + dirname + ".runtime.png")
47
48
51
52ax = plt.figure(figsize = 1.25 * np.array([6.4,4.6]), dpi = 200)
53ax = plt.subplot()
54
55plt.plot( df[colnames[0]].values
56 , np.ones(len(df[colnames[0]].values))
57 , linestyle = "--"
58 #, color = "black"
59 , linewidth = 2
60 )
61for colname in colnames[2:]:
62 plt.plot( df[colnames[0]].values
63 , df[colname].values / df[colnames[1]].values
64 , linewidth = 2
65 )
66
67plt.xticks(fontsize = fontsize)
68plt.yticks(fontsize = fontsize)
69ax.set_xlabel(colnames[0], fontsize = fontsize)
70ax.set_ylabel("Runtime compared to {}".format(colnames[1]), fontsize = fontsize)
71ax.set_title("Runtime Ratio Comparison. Lower means faster.\nLower than 1 means faster than {}.".format(colnames[1]), fontsize = fontsize)
72ax.set_xscale("log")
73ax.set_yscale("log")
74plt.minorticks_on()
75plt.grid(visible = True, which = "both", axis = "both", color = "0.85", linestyle = "-")
76ax.tick_params(axis = "y", which = "minor")
77ax.tick_params(axis = "x", which = "minor")
78ax.legend ( colnames[1:]
79 #, bbox_to_anchor = (1, 0.5)
80 #, loc = "center left"
81 , fontsize = fontsize
82 )
83
84plt.tight_layout()
85plt.savefig("benchmark." + dirname + ".runtime.ratio.png")


Visualization of the benchmark output


Benchmark moral

  1. Among all summation algorithms, fablocked_type appears to offer the most accurate result while also being even faster than the default Fortran sum() and all other implemented summation algorithms for array sizes > 100.
Test:
test_pm_mathSum


Final Remarks


If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

  1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
  2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Amir Shahmoradi, August 8, 2024, 10:23 PM, NASA Goddard Space Flight Center, Washington, D.C.

Definition at line 367 of file pm_mathSum.F90.


The documentation for this interface was generated from the following file: