Compute and return an iteratively-refined set of cluster centers given the input sample using the k-means approach.
More...

Detailed Description

Compute and return an iteratively-refined set of cluster centers given the input sample using the k-means approach.

See the documentation of pm_clusKmeans for more information on the Kmeans clustering algorithm.
The metric used within this generic interface is the Euclidean distance.

Parameters

[in,out]	rng	: The input/output scalar that can be an object of, type rngf_type, implying the use of intrinsic Fortran uniform RNG. type xoshiro256ssw_type, implying the use of xoshiro256 uniform RNG. (optional**. If this argument is present, then all `intent(inout)` arguments below have `intent(out)` argument and will be initialized using the k-means++ algorithm. If this argument is missing, then the user must initialize all of the following arguments with `intent(inout)` via k-means++ or any other method before passing them to this generic interface.)
[in,out]	membership	: The input/output vector of shape `(1:nsam)` of type `integer` of default kind IK, containing the membership of each input sample in `sample` from its nearest cluster `center`, such that `cluster(membership(i))` is the nearest cluster center to the `i`th sample `sample(:, i)` at a squared-distance of `disq(i)`. If the argument `rng` is missing, then `membership` has `intent(inout)` and must be properly initialized before calling this routine. If the argument `rng` is present, then `membership` has `intent(out)` and will contain the final cluster memberships on output.
[in,out]	disq	: The input/output vector of shape `(1:nsam)` of the same type and kind as the input argument `sample`, containing the Euclidean squared distance of each input sample in `sample` from its nearest cluster `center`. If the argument `rng` is missing, then `disq` has `intent(inout)` and must be properly initialized before calling this routine. If the argument `rng` is present, then `disq` has `intent(out)` and will contain the final squared-distances from cluster centers on output.
[in]	sample	: The input vector, or matrix of, type `real` of kind any supported by the processor (e.g., RK, RK32, RK64, or RK128), containing the sample of `nsam` points in a `ndim`-dimensional space whose corresponding cluster centers must be computed. If `sample` is a vector of shape `(1 : nsam)` and `center` is a vector of shape `(1 : ncls)`, then the input `sample` must be a collection of `nsam` points (in univariate space). If `sample` is a matrix of shape `(1 : ndim, 1 : nsam)` and `center` is a matrix of shape `(1 : ndim, 1 : ncls)`, then the input `sample` must be a collection of `nsam` points (in `ndim`-dimensional space).
[in,out]	center	: The input/output vector of shape `(1:ncls)` or matrix of shape `(1 : ndim, 1 : ncls)` of the same type and kind as the input argument `sample`, containing the set of `ncls` unique cluster centers (centroids) computed based on the input sample memberships and minimum distances. If the argument `rng` is missing, then `center` has `intent(inout)` and must be properly initialized before calling this routine. If the argument `rng` is present, then `center` has `intent(out)` and will contain the final cluster centers on output.
[in,out]	size	: The input/output vector of shape `(1:ncls)` type `integer` of default kind IK, containing the sizes (number of members) of the clusters with the corresponding centers output in the argument `center`. If the argument `rng` is missing, then `size` has `intent(inout)` and must be properly initialized before calling this routine. If the argument `rng` is present, then `size` has `intent(out)` and will contain the final cluster sizes (member counts) on output.
[in,out]	potential	: The input/output vector of shape `(1:ncls)` of the same type and kind as the input argument `sample`, the `i`th element of which contains the sum of squared distances of all members of the `i`th cluster from the cluster center as output in the `i`th element of `center`. If the argument `rng` is missing, then `potential` has `intent(inout)` and must be properly initialized before calling this routine (although its values are not explicitly referenced). If the argument `rng` is present, then `potential` has `intent(out)` and will contain the final cluster potentials (sums of squared distances from cluster centers) on output.
[out]	failed	: The output scalar of type `logical` of default kind LK that is `.true.` if and only if the algorithm fails to converge within the user-specified or default criteria for convergence. Failure occurs only if `any(size < minsize) .or. maxniter < niter`.
[out]	niter	: The output scalar of type `integer` of default kind IK, containing the number of refinement iterations performed within the algorithm to achieve convergence. An output `niter` value larger than the input `maxniter` implies lack of convergence before return. (optional. If missing, the number of refinement iterations will not be output.)
[in]	maxniter	: The input non-negative scalar of type `integer` of default kind IK, containing the maximum number of refinement iterations allowed within the algorithm to achieve convergence. If convergence does not occur within the maximum specified value, the output arguments can be passed again as is to the generic interface (without the optional `rng` argument) to continue the refinement iterations until convergence. A reasonable choice can be `300` or comparable values. (optional, default = `300`.)
[in]	minsize	: The input non-negative scalar of type `integer` of default kind IK, containing the minimum allowed size of each cluster. If any cluster has any number of members below the specified `minsize`, the algorithm will return without achieving convergence. The situation can be detected of any element of the output `size` is smaller than the specified `minsize`. A reasonable choice can be `ndim = ubound(sample, 1)` or comparable values although any non-negative value including zero is possible. (optional, default = `1`.)
[in]	nfail	: The input non-negative scalar of type `integer` of default kind IK, containing the number of times the k-means algorithm is allowed to fail before returning without convergence. (optional. It can be present only if the argument `rng` is also present, allowing random initializations in case of failures.)

Possible calling interfaces ⛓

: use pm_clusKmeans, only: setKmeans

call setKmeans(membership(1 : nsam), disq(1 : nsam), sample(1 : nsam) , center(1 : ncls) , size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize)

call setKmeans(membership(1 : nsam), disq(1 : nsam), sample(1 : ndim, 1 : nsam), center(1 : ndim, 1 : ncls), size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize)

call setKmeans(rng, membership(1 : nsam), disq(1 : nsam), csdisq(0 : nsam), sample(1 : nsam) , center(1 : ncls) , size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize, nfail = nfail)

call setKmeans(rng, membership(1 : nsam), disq(1 : nsam), csdisq(0 : nsam), sample(1 : ndim, 1 : nsam), center(1 : ndim, 1 : ncls), size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize, nfail = nfail)

pm_clusKmeans::setKmeans
Compute and return an iteratively-refined set of cluster centers given the input sample using the k-m...
Definition: pm_clusKmeans.F90:1661

pm_clusKmeans
This module contains procedures and routines for the computing the Kmeans clustering of a given set o...
Definition: pm_clusKmeans.F90:113

Warning: If the argument rng is present, then all arguments associated with setKmeansPP equally apply to this generic interface.
The condition ubound(center, rank(center)) > 0 must hold for the corresponding input arguments.
The condition ubound(sample, rank(sample)) == ubound(disq, 1) must hold for the corresponding input arguments.
The condition ubound(center, rank(center)) == ubound(size, 1) must hold for the corresponding input arguments.
The condition ubound(center, rank(center)) == ubound(potential, 1) must hold for the corresponding input arguments.
The condition ubound(sample, rank(sample)) == ubound(membership, 1) must hold for the corresponding input arguments.
The condition ubound(center, rank(center)) <= ubound(sample, rank(sample)) must hold for the corresponding input arguments (the number of clusters must be less than or equal to the sample size).
The condition ubound(sample, 1) == ubound(center, 1) must hold for the corresponding input arguments.
The condition 0 <= maxniter must hold for the corresponding input arguments.
The condition 0 <= minsize must hold for the corresponding input arguments.
The condition 0 <= nfail must hold for the corresponding input arguments.
These conditions are verified only if the library is built with the preprocessor macro CHECK_ENABLED=1.; The pure procedure(s) documented herein become impure when the ParaMonte library is compiled with preprocessor macro CHECK_ENABLED=1.
By default, these procedures are pure in release build and impure in debug and testing builds. The procedures of this generic interface are always impure when the input rng argument is set to an object of type rngf_type.

Remarks: The functionality of this generic interface is highly similar to setCenter, with the major difference being that setKmeans simultaneously computes the new cluster centers and sample memberships, whereas setCenter computes the new cluster centers based on given sample membership.

See also: setKmeans
setCenter
setMember
setKmeansPP
k-means clustering

Example usage ⛓

: 1program example

2

3 use pm_kind, only: SK, IK, LK

4 use pm_kind, only: RKG => RKS ! all other real kinds are also supported.

5 use pm_io, only: display_type

6 use pm_distUnif, only: getUnifRand

7 use pm_arrayResize, only: setResized

8 use pm_clusKmeans, only: setKmeansPP

9 use pm_clusKmeans, only: setKmeans, rngf

10

11 implicit none

12

13 logical(LK) :: failed

14 integer(IK) :: ndim, nsam, ncls, itry, niter

15 real(RKG) , allocatable :: sample(:,:), center(:,:), disq(:), csdisq(:), potential(:)

16 integer(IK) , allocatable :: membership(:), size(:)

17 type(display_type) :: disp

18

19 disp = display_type(file = "main.out.F90")

20

21 call disp%skip

22 call disp%show("!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")

23 call disp%show("! Compute cluster centers based on an input sample and cluster memberships and member-center distances.")

24 call disp%show("!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")

25 call disp%skip

26

27 do itry = 1, 10

28 call disp%skip()

29 call disp%show("ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);")

30 ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

31 call disp%show("[ndim, nsam, ncls]")

32 call disp%show( [ndim, nsam, ncls] )

33 call disp%show("sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.")

34 sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

35 call disp%show("sample")

36 call disp%show( sample )

37 call disp%show("call setResized(disq, nsam)")

38 call setResized(disq, nsam)

39 call disp%show("call setResized(csdisq, nsam + 1_IK)")

40 call setResized(csdisq, nsam + 1_IK)

41 call disp%show("call setResized(membership, nsam)")

42 call setResized(membership, nsam)

43 call disp%show("call setResized(center, [ndim, ncls])")

44 call setResized(center, [ndim, ncls])

45 call disp%show("call setResized(potential, ncls)")

46 call setResized(potential, ncls)

47 call disp%show("call setResized(size, ncls)")

48 call setResized(size, ncls)

49 call disp%skip()

50

51 call disp%show("call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.")

52 call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

53 call disp%show("failed")

54 call disp%show( failed )

55 call disp%show("niter")

56 call disp%show( niter )

57 call disp%show("disq")

58 call disp%show( disq )

59 call disp%show("csdisq")

60 call disp%show( csdisq )

61 call disp%show("membership")

62 call disp%show( membership )

63 call disp%show("potential")

64 call disp%show( potential )

65 call disp%show("center")

66 call disp%show( center )

67 call disp%show("size")

68 call disp%show( size )

69 call disp%skip()

70 end do

71

72 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

73 ! Output an example for visualization.

74 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

75

76 block

77 integer(IK) :: funit, i

78 ndim = 2

79 ncls = 5

80 nsam = 5000

81 center = getUnifRand(0., 1., ndim, ncls)

82 sample = getUnifRand(0., 1., ndim, nsam)

83 call setResized(disq, nsam)

84 call setResized(csdisq, nsam + 1_IK)

85 call setResized(membership, nsam)

86 call setResized(potential, ncls)

87 call setResized(size, ncls)

88 ! output the sample and the initial centers.

89 call setKmeansPP(rngf, membership, disq, csdisq, sample, center, size, potential)

90 open(newunit = funit, file = "setKmeansPP.center.txt")

91 do i = 1, ncls

92 write(funit, "(*(g0,:,','))") i, center(:,i)

93 end do

94 close(funit)

95 open(newunit = funit, file = "setKmeansPP.sample.txt")

96 do i = 1, nsam

97 write(funit, "(*(g0,:,','))") membership(i), sample(:,i)

98 end do

99 close(funit)

100 call setKmeans(membership, disq, sample, center, size, potential, failed)

101 open(newunit = funit, file = "setKmeans.center.txt")

102 do i = 1, ncls

103 write(funit, "(*(g0,:,','))") i, center(:,i)

104 end do

105 close(funit)

106 open(newunit = funit, file = "setKmeans.sample.txt")

107 do i = 1, nsam

108 write(funit, "(*(g0,:,','))") membership(i), sample(:,i)

109 end do

110 close(funit)

111 end block

112

113end program example

pm_arrayResize::setResized
Allocate or resize (shrink or expand) an input allocatable scalar string or array of rank 1....
Definition: pm_arrayResize.F90:249

pm_clusKmeans::setKmeansPP
Compute and return an asymptotically optimal set of cluster centers for the input sample,...
Definition: pm_clusKmeans.F90:1181

pm_distUnif::getUnifRand
Generate and return a scalar or a contiguous array of rank 1 of length s1 of randomly uniformly distr...
Definition: pm_distUnif.F90:4150

pm_io::show
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11726

pm_io::skip
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11508

pm_arrayResize
This module contains procedures and generic interfaces for resizing allocatable arrays of various typ...
Definition: pm_arrayResize.F90:81

pm_distUnif
This module contains classes and procedures for computing various statistical quantities related to t...
Definition: pm_distUnif.F90:274

pm_distUnif::rngf
type(rngf_type) rngf
The scalar constant object of type rngf_type whose presence signified the use of the Fortran intrinsi...
Definition: pm_distUnif.F90:2886

pm_io
This module contains classes and procedures for input/output (IO) or generic display operations on st...
Definition: pm_io.F90:252

pm_io::disp
type(display_type) disp
This is a scalar module variable an object of type display_type for general display.
Definition: pm_io.F90:11393

pm_kind
This module defines the relevant Fortran kind type-parameters frequently used in the ParaMonte librar...
Definition: pm_kind.F90:268

pm_kind::LK
integer, parameter LK
The default logical kind in the ParaMonte library: kind(.true.) in Fortran, kind(....
Definition: pm_kind.F90:541

pm_kind::IK
integer, parameter IK
The default integer kind in the ParaMonte library: int32 in Fortran, c_int32_t in C-Fortran Interoper...
Definition: pm_kind.F90:540

pm_kind::SK
integer, parameter SK
The default character kind in the ParaMonte library: kind("a") in Fortran, c_char in C-Fortran Intero...
Definition: pm_kind.F90:539

pm_kind::RKS
integer, parameter RKS
The single-precision real kind in Fortran mode. On most platforms, this is an 32-bit real kind.
Definition: pm_kind.F90:567

pm_io::display_type
Generate and return an object of type display_type.
Definition: pm_io.F90:10282

Example Unix compile command via Intel ifort compiler ⛓
1#!/usr/bin/env sh

2rm main.exe

3ifort -fpp -standard-semantics -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe

4./main.exe

Example Windows Batch compile command via Intel ifort compiler ⛓
1del main.exe

2set PATH=..\..\..\lib;%PATH%

3ifort /fpp /standard-semantics /O3 /I:..\..\..\include main.F90 ..\..\..\lib\libparamonte*.lib /exe:main.exe

4main.exe

Example Unix / MinGW compile command via GNU gfortran compiler ⛓
1#!/usr/bin/env sh

2rm main.exe

3gfortran -cpp -ffree-line-length-none -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe

4./main.exe

Example output ⛓
1

2!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

3! Compute cluster centers based on an input sample and cluster memberships and member-center distances.

4!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

5

6

7ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

8[ndim, nsam, ncls]

9+1, +5, +5

10sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

11sample

12+0.194852054, +1.36804998, +3.50469255, +2.98460984, +3.36448717

13call setResized(disq, nsam)

14call setResized(csdisq, nsam + 1_IK)

15call setResized(membership, nsam)

16call setResized(center, [ndim, ncls])

17call setResized(potential, ncls)

18call setResized(size, ncls)

19

20call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

21failed

22F

23niter

24+1

25disq

26+0.00000000, +0.00000000, +0.00000000, +0.00000000, +0.00000000

27csdisq

28+0.00000000, +0.00000000, +0.00000000, +0.196575504E-1, +0.196575504E-1, +0.196575504E-1

29membership

30+1, +3, +5, +4, +2

31potential

32+0.00000000, +0.00000000, +0.00000000, +0.00000000, +0.00000000

33center

34+0.194852054, +3.36448717, +1.36804998, +2.98460984, +3.50469255

35size

36+1, +1, +1, +1, +1

37

38

39ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

40[ndim, nsam, ncls]

41+4, +3, +2

42sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

43sample

44+1.21075058, +0.477194190, +0.705844164E-1

45+1.29817665, +4.80855989, +1.38401628

46+4.96016550, +1.97615683, +3.32968044

47+1.95087135, +4.59686327, +4.26652861

48call setResized(disq, nsam)

49call setResized(csdisq, nsam + 1_IK)

50call setResized(membership, nsam)

51call setResized(center, [ndim, ncls])

52call setResized(potential, ncls)

53call setResized(size, ncls)

54

55call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

56failed

57F

58niter

59+1

60disq

61+0.00000000, +3.45849395, +3.45849514

62csdisq

63+0.00000000, +0.00000000, +28.7664757, +38.0945740

64membership

65+1, +2, +2

66potential

67+0.00000000, +13.8339787

68center

69+1.21075058, +0.273889303

70+1.29817665, +3.09628820

71+4.96016550, +2.65291858

72+1.95087135, +4.43169594

73size

74+1, +2

75

76

77ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

78[ndim, nsam, ncls]

79+4, +2, +1

80sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

81sample

82+4.78989077, +2.71389627

83+3.91539121, +0.366598368E-2

84+4.81316328, +3.63036513

85+4.66318464, +3.42691422

86call setResized(disq, nsam)

87call setResized(csdisq, nsam + 1_IK)

88call setResized(membership, nsam)

89call setResized(center, [ndim, ncls])

90call setResized(potential, ncls)

91call setResized(size, ncls)

92

93call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

94failed

95F

96niter

97+1

98disq

99+5.63468075, +5.63468075

100csdisq

101+0.00000000, +22.5387230, +22.5387230

102membership

103+1, +1

104potential

105+22.5387230

106center

107+3.75189352

108+1.95952857

109+4.22176409

110+4.04504967

111size

112+2

113

114

115ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

116[ndim, nsam, ncls]

117+2, +4, +2

118sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

119sample

120+4.80599022, +4.60842800, +4.02689409, +2.87200499

121+0.725803375, +4.06653929, +1.99143863, +0.680049956

122call setResized(disq, nsam)

123call setResized(csdisq, nsam + 1_IK)

124call setResized(membership, nsam)

125call setResized(center, [ndim, ncls])

126call setResized(potential, ncls)

127call setResized(size, ncls)

128

129call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

130failed

131F

132niter

133+1

134disq

135+0.983213305, +0.00000000, +0.753585756, +1.26477575

136csdisq

137+0.00000000, +2.20882344, +6.85304737, +6.85304737, +9.90655708

138membership

139+1, +2, +1, +1

140potential

141+5.26233292, +0.00000000

142center

143+3.90162992, +4.60842800

144+1.13243067, +4.06653929

145size

146+3, +1

147

148

149ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

150[ndim, nsam, ncls]

151+1, +5, +4

152sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

153sample

154+2.94972777, +2.89715075, +3.70324373, +4.72440243, +1.00358248

155call setResized(disq, nsam)

156call setResized(csdisq, nsam + 1_IK)

157call setResized(membership, nsam)

158call setResized(center, [ndim, ncls])

159call setResized(potential, ncls)

160call setResized(size, ncls)

161

162call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

163failed

164F

165niter

166+1

167disq

168+0.691085705E-3, +0.691085705E-3, +0.00000000, +0.00000000, +0.00000000

169csdisq

170+0.00000000, +0.00000000, +0.276434282E-2, +0.570550621, +0.570550621, +0.570550621

171membership

172+2, +2, +4, +3, +1

173potential

174+0.00000000, +0.276434282E-2, +0.00000000, +0.00000000

175center

176+1.00358248, +2.92343926, +4.72440243, +3.70324373

177size

178+1, +2, +1, +1

179

180

181ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

182[ndim, nsam, ncls]

183+2, +2, +1

184sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

185sample

186+4.08557463, +1.71922982

187+3.98617840, +0.944840908E-1

188call setResized(disq, nsam)

189call setResized(csdisq, nsam + 1_IK)

190call setResized(membership, nsam)

191call setResized(center, [ndim, ncls])

192call setResized(potential, ncls)

193call setResized(size, ncls)

194

195call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

196failed

197F

198niter

199+1

200disq

201+5.18621778, +5.18621826

202csdisq

203+0.00000000, +0.00000000, +20.7448730

204membership

205+1, +1

206potential

207+20.7448730

208center

209+2.90240216

210+2.04033136

211size

212+2

213

214

215ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

216[ndim, nsam, ncls]

217+1, +3, +2

218sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

219sample

220+3.13750243, +1.17432690, +3.06939936

221call setResized(disq, nsam)

222call setResized(csdisq, nsam + 1_IK)

223call setResized(membership, nsam)

224call setResized(center, [ndim, ncls])

225call setResized(potential, ncls)

226call setResized(size, ncls)

227

228call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

229failed

230F

231niter

232+1

233disq

234+0.115951535E-2, +0.00000000, +0.115949905E-2

235csdisq

236+0.00000000, +0.00000000, +3.85405827, +3.85869622

237membership

238+1, +2, +1

239potential

240+0.463802880E-2, +0.00000000

241center

242+3.10345078, +1.17432690

243size

244+2, +1

245

246

247ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

248[ndim, nsam, ncls]

249+1, +7, +4

250sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

251sample

252+4.73206520, +0.647053123, +1.83548307, +4.30943537, +1.03655994, +0.113397837E-2, +2.97877836

253call setResized(disq, nsam)

254call setResized(csdisq, nsam + 1_IK)

255call setResized(membership, nsam)

256call setResized(center, [ndim, ncls])

257call setResized(potential, ncls)

258call setResized(size, ncls)

259

260call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

261failed

262F

263niter

264+1

265disq

266+0.446540937E-1, +0.276653945, +0.438841194, +0.446538925E-1, +0.186246689E-1, +0.00000000, +0.00000000

267csdisq

268+0.00000000, +0.00000000, +0.151715562, +0.789993763, +0.968609750, +0.968609750, +2.04071665, +2.04071665

269membership

270+1, +2, +2, +1, +2, +4, +3

271potential

272+0.178615972, +0.789993763, +0.00000000, +0.00000000

273center

274+4.52075005, +1.17303216, +2.97877836, +0.113397837E-2

275size

276+2, +3, +1, +1

277

278

279ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

280[ndim, nsam, ncls]

281+2, +4, +4

282sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

283sample

284+3.97530222, +4.11549854, +1.85393333, +2.35237861

285+4.09439468, +2.17375374, +2.54279947, +0.151126087

286call setResized(disq, nsam)

287call setResized(csdisq, nsam + 1_IK)

288call setResized(membership, nsam)

289call setResized(center, [ndim, ncls])

290call setResized(potential, ncls)

291call setResized(size, ncls)

292

293call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

294failed

295F

296niter

297+1

298disq

299+0.00000000, +0.00000000, +0.00000000, +0.00000000

300csdisq

301+0.00000000, +0.00000000, +0.00000000, +0.00000000, +5.96854925

302membership

303+3, +1, +2, +4

304potential

305+0.00000000, +0.00000000, +0.00000000, +0.00000000

306center

307+4.11549854, +1.85393333, +3.97530222, +2.35237861

308+2.17375374, +2.54279947, +4.09439468, +0.151126087

309size

310+1, +1, +1, +1

311

312

313ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);

314[ndim, nsam, ncls]

315+1, +8, +5

316sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.

317sample

318+2.25985265, +0.205343664, +0.625154376, +1.04388475, +2.02893543, +0.152878165, +4.06452274, +4.94983768

319call setResized(disq, nsam)

320call setResized(csdisq, nsam + 1_IK)

321call setResized(membership, nsam)

322call setResized(center, [ndim, ncls])

323call setResized(potential, ncls)

324call setResized(size, ncls)

325

326call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.

327failed

328F

329niter

330+1

331disq

332+0.133307176E-1, +0.688157161E-3, +0.438337810E-1, +0.438337810E-1, +0.133306626E-1, +0.688157161E-3, +0.00000000, +0.00000000

333csdisq

334+0.00000000, +0.533227585E-1, +0.533227585E-1, +0.533227585E-1, +0.228657886, +0.228657886, +0.231410518, +1.01519310, +1.01519310

335membership

336+3, +4, +1, +1, +3, +4, +5, +2

337potential

338+0.175335124, +0.00000000, +0.533227585E-1, +0.275262864E-2, +0.00000000

339center

340+0.834519565, +4.94983768, +2.14439392, +0.179110914, +4.06452274

341size

342+2, +1, +2, +2, +1

343

344

Postprocessing of the example output ⛓
1#!/usr/bin/env python

2

3import matplotlib.pyplot as plt

4import pandas as pd

5import numpy as np

6import glob

7import sys

8

9

10fontsize = 17

11

12prefixes = { "setKmeansPP" : "Memberships based on k-means++."

13 , "setKmeans" : "k-means refinement of k-means++ memberships."

14 }

15for prefix in list(prefixes.keys()):

16

17 legends = []

18 fig = plt.figure(figsize = 1.25 * np.array([6.4, 4.8]), dpi = 200)

19 ax = plt.subplot()

20

21 fileList = glob.glob(prefix + ".*.txt")

22 if len(fileList) == 2:

23

24 for file in fileList:

25

26 kind = file.split(".")[1]

27 df = pd.read_csv(file, delimiter = ",", header = None)

28

29 if kind == "center":

30 ax.scatter ( df.values[:, 1]

31 , df.values[:,2]

32 , zorder = 100

33 , marker = "*"

34 , c = "red"

35 , s = 50

36 )

37 legends.append("center")

38 elif kind == "sample":

39 ax.scatter ( df.values[:, 1]

40 , df.values[:,2]

41 , c = df.values[:, 0]

42 , s = 10

43 )

44 legends.append("sample")

45 else:

46 sys.exit("Ambiguous file exists: {}".format(file))

47

48 ax.legend(legends, fontsize = fontsize)

49 plt.xticks(fontsize = fontsize - 2)

50 plt.yticks(fontsize = fontsize - 2)

51 ax.set_xlabel("X", fontsize = 17)

52 ax.set_ylabel("Y", fontsize = 17)

53 ax.set_title(prefixes[prefix], fontsize = fontsize)

54

55 plt.axis('equal')

56 plt.grid(visible = True, which = "both", axis = "both", color = "0.85", linestyle = "-")

57 ax.tick_params(axis = "y", which = "minor")

58 ax.tick_params(axis = "x", which = "minor")

59 ax.set_axisbelow(True)

60 plt.tight_layout()

61

62 plt.savefig(prefix + ".png")

63

64 else:

65

66 sys.exit("Ambiguous file list exists.")

Visualization of the example output ⛓

Test:: test_pm_clusKmeans

Final Remarks ⛓

If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Copyright: Computational Data Science Lab

Author:: Amir Shahmoradi, September 1, 2012, 12:00 AM, National Institute for Fusion Studies, The University of Texas Austin

Definition at line 1661 of file pm_clusKmeans.F90.

The documentation for this interface was generated from the following file:

src/fortran/main/pm_clusKmeans.F90