ParaMonte Fortran 2.0.0
Parallel Monte Carlo and Machine Learning Library
See the latest version documentation.
pm_clusKmeans::setKmeans Interface Reference

Compute and return an iteratively-refined set of cluster centers given the input sample using the k-means approach.
More...

Detailed Description

Compute and return an iteratively-refined set of cluster centers given the input sample using the k-means approach.

See the documentation of pm_clusKmeans for more information on the Kmeans clustering algorithm.
The metric used within this generic interface is the Euclidean distance.

Parameters
[in,out]rng: The input/output scalar that can be an object of,
  1. type rngf_type, implying the use of intrinsic Fortran uniform RNG.
  2. type xoshiro256ssw_type, implying the use of xoshiro256** uniform RNG.
(optional. If this argument is present, then all intent(inout) arguments below have intent(out) argument and will be initialized using the k-means++ algorithm.
If this argument is missing, then the user must initialize all of the following arguments with intent(inout) via k-means++ or any other method before passing them to this generic interface.)
[in,out]membership: The input/output vector of shape (1:nsam) of type integer of default kind IK, containing the membership of each input sample in sample from its nearest cluster center, such that cluster(membership(i)) is the nearest cluster center to the ith sample sample(:, i) at a squared-distance of disq(i).
If the argument rng is missing, then membership has intent(inout) and must be properly initialized before calling this routine.
If the argument rng is present, then membership has intent(out) and will contain the final cluster memberships on output.
[in,out]disq: The input/output vector of shape (1:nsam) of the same type and kind as the input argument sample, containing the Euclidean squared distance of each input sample in sample from its nearest cluster center.
If the argument rng is missing, then disq has intent(inout) and must be properly initialized before calling this routine.
If the argument rng is present, then disq has intent(out) and will contain the final squared-distances from cluster centers on output.
[in]sample: The input vector, or matrix of,
  1. type real of kind any supported by the processor (e.g., RK, RK32, RK64, or RK128),
containing the sample of nsam points in a ndim-dimensional space whose corresponding cluster centers must be computed.
  1. If sample is a vector of shape (1 : nsam) and center is a vector of shape (1 : ncls), then the input sample must be a collection of nsam points (in univariate space).
  2. If sample is a matrix of shape (1 : ndim, 1 : nsam) and center is a matrix of shape (1 : ndim, 1 : ncls), then the input sample must be a collection of nsam points (in ndim-dimensional space).
[in,out]center: The input/output vector of shape (1:ncls) or matrix of shape (1 : ndim, 1 : ncls) of the same type and kind as the input argument sample, containing the set of ncls unique cluster centers (centroids) computed based on the input sample memberships and minimum distances.
If the argument rng is missing, then center has intent(inout) and must be properly initialized before calling this routine.
If the argument rng is present, then center has intent(out) and will contain the final cluster centers on output.
[in,out]size: The input/output vector of shape (1:ncls) type integer of default kind IK, containing the sizes (number of members) of the clusters with the corresponding centers output in the argument center.
If the argument rng is missing, then size has intent(inout) and must be properly initialized before calling this routine.
If the argument rng is present, then size has intent(out) and will contain the final cluster sizes (member counts) on output.
[in,out]potential: The input/output vector of shape (1:ncls) of the same type and kind as the input argument sample, the ith element of which contains the sum of squared distances of all members of the ith cluster from the cluster center as output in the ith element of center.
If the argument rng is missing, then potential has intent(inout) and must be properly initialized before calling this routine (although its values are not explicitly referenced).
If the argument rng is present, then potential has intent(out) and will contain the final cluster potentials (sums of squared distances from cluster centers) on output.
[out]failed: The output scalar of type logical of default kind LK that is .true. if and only if the algorithm fails to converge within the user-specified or default criteria for convergence.
Failure occurs only if any(size < minsize) .or. maxniter < niter.
[out]niter: The output scalar of type integer of default kind IK, containing the number of refinement iterations performed within the algorithm to achieve convergence.
An output niter value larger than the input maxniter implies lack of convergence before return.
(optional. If missing, the number of refinement iterations will not be output.)
[in]maxniter: The input non-negative scalar of type integer of default kind IK, containing the maximum number of refinement iterations allowed within the algorithm to achieve convergence.
If convergence does not occur within the maximum specified value, the output arguments can be passed again as is to the generic interface (without the optional rng argument) to continue the refinement iterations until convergence.
A reasonable choice can be 300 or comparable values.
(optional, default = 300.)
[in]minsize: The input non-negative scalar of type integer of default kind IK, containing the minimum allowed size of each cluster.
If any cluster has any number of members below the specified minsize, the algorithm will return without achieving convergence.
The situation can be detected of any element of the output size is smaller than the specified minsize.
A reasonable choice can be ndim = ubound(sample, 1) or comparable values although any non-negative value including zero is possible.
(optional, default = 1.)
[in]nfail: The input non-negative scalar of type integer of default kind IK, containing the number of times the k-means algorithm is allowed to fail before returning without convergence.
(optional. It can be present only if the argument rng is also present, allowing random initializations in case of failures.)


Possible calling interfaces

call setKmeans(membership(1 : nsam), disq(1 : nsam), sample(1 : nsam) , center(1 : ncls) , size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize)
call setKmeans(membership(1 : nsam), disq(1 : nsam), sample(1 : ndim, 1 : nsam), center(1 : ndim, 1 : ncls), size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize)
call setKmeans(rng, membership(1 : nsam), disq(1 : nsam), csdisq(0 : nsam), sample(1 : nsam) , center(1 : ncls) , size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize, nfail = nfail)
call setKmeans(rng, membership(1 : nsam), disq(1 : nsam), csdisq(0 : nsam), sample(1 : ndim, 1 : nsam), center(1 : ndim, 1 : ncls), size(1 : ncls), potential(1 : ncls), failed, niter = niter, maxniter = maxniter, minsize = minsize, nfail = nfail)
Compute and return an iteratively-refined set of cluster centers given the input sample using the k-m...
This module contains procedures and routines for the computing the Kmeans clustering of a given set o...
Warning
If the argument rng is present, then all arguments associated with setKmeansPP equally apply to this generic interface.
The condition ubound(center, rank(center)) > 0 must hold for the corresponding input arguments.
The condition ubound(sample, rank(sample)) == ubound(disq, 1) must hold for the corresponding input arguments.
The condition ubound(center, rank(center)) == ubound(size, 1) must hold for the corresponding input arguments.
The condition ubound(center, rank(center)) == ubound(potential, 1) must hold for the corresponding input arguments.
The condition ubound(sample, rank(sample)) == ubound(membership, 1) must hold for the corresponding input arguments.
The condition ubound(center, rank(center)) <= ubound(sample, rank(sample)) must hold for the corresponding input arguments (the number of clusters must be less than or equal to the sample size).
The condition ubound(sample, 1) == ubound(center, 1) must hold for the corresponding input arguments.
The condition 0 <= maxniter must hold for the corresponding input arguments.
The condition 0 <= minsize must hold for the corresponding input arguments.
The condition 0 <= nfail must hold for the corresponding input arguments.
These conditions are verified only if the library is built with the preprocessor macro CHECK_ENABLED=1.
The pure procedure(s) documented herein become impure when the ParaMonte library is compiled with preprocessor macro CHECK_ENABLED=1.
By default, these procedures are pure in release build and impure in debug and testing builds. The procedures of this generic interface are always impure when the input rng argument is set to an object of type rngf_type.
Remarks
The functionality of this generic interface is highly similar to setCenter, with the major difference being that setKmeans simultaneously computes the new cluster centers and sample memberships, whereas setCenter computes the new cluster centers based on given sample membership.
See also
setKmeans
setCenter
setMember
setKmeansPP
k-means clustering


Example usage

1program example
2
3 use pm_kind, only: SK, IK, LK
4 use pm_kind, only: RKG => RKS ! all other real kinds are also supported.
5 use pm_io, only: display_type
6 use pm_distUnif, only: getUnifRand
10
11 implicit none
12
13 logical(LK) :: failed
14 integer(IK) :: ndim, nsam, ncls, itry, niter
15 real(RKG) , allocatable :: sample(:,:), center(:,:), disq(:), csdisq(:), potential(:)
16 integer(IK) , allocatable :: membership(:), size(:)
17 type(display_type) :: disp
18
19 disp = display_type(file = "main.out.F90")
20
21 call disp%skip
22 call disp%show("!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
23 call disp%show("! Compute cluster centers based on an input sample and cluster memberships and member-center distances.")
24 call disp%show("!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
25 call disp%skip
26
27 do itry = 1, 10
28 call disp%skip()
29 call disp%show("ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);")
30 ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
31 call disp%show("[ndim, nsam, ncls]")
32 call disp%show( [ndim, nsam, ncls] )
33 call disp%show("sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.")
34 sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
35 call disp%show("sample")
36 call disp%show( sample )
37 call disp%show("call setResized(disq, nsam)")
38 call setResized(disq, nsam)
39 call disp%show("call setResized(csdisq, nsam + 1_IK)")
40 call setResized(csdisq, nsam + 1_IK)
41 call disp%show("call setResized(membership, nsam)")
42 call setResized(membership, nsam)
43 call disp%show("call setResized(center, [ndim, ncls])")
44 call setResized(center, [ndim, ncls])
45 call disp%show("call setResized(potential, ncls)")
46 call setResized(potential, ncls)
47 call disp%show("call setResized(size, ncls)")
48 call setResized(size, ncls)
49 call disp%skip()
50
51 call disp%show("call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.")
52 call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
53 call disp%show("failed")
54 call disp%show( failed )
55 call disp%show("niter")
56 call disp%show( niter )
57 call disp%show("disq")
58 call disp%show( disq )
59 call disp%show("csdisq")
60 call disp%show( csdisq )
61 call disp%show("membership")
62 call disp%show( membership )
63 call disp%show("potential")
64 call disp%show( potential )
65 call disp%show("center")
66 call disp%show( center )
67 call disp%show("size")
68 call disp%show( size )
69 call disp%skip()
70 end do
71
72 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
73 ! Output an example for visualization.
74 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
75
76 block
77 integer(IK) :: funit, i
78 ndim = 2
79 ncls = 5
80 nsam = 5000
81 center = getUnifRand(0., 1., ndim, ncls)
82 sample = getUnifRand(0., 1., ndim, nsam)
83 call setResized(disq, nsam)
84 call setResized(csdisq, nsam + 1_IK)
85 call setResized(membership, nsam)
86 call setResized(potential, ncls)
87 call setResized(size, ncls)
88 ! output the sample and the initial centers.
89 call setKmeansPP(rngf, membership, disq, csdisq, sample, center, size, potential)
90 open(newunit = funit, file = "setKmeansPP.center.txt")
91 do i = 1, ncls
92 write(funit, "(*(g0,:,','))") i, center(:,i)
93 end do
94 close(funit)
95 open(newunit = funit, file = "setKmeansPP.sample.txt")
96 do i = 1, nsam
97 write(funit, "(*(g0,:,','))") membership(i), sample(:,i)
98 end do
99 close(funit)
100 call setKmeans(membership, disq, sample, center, size, potential, failed)
101 open(newunit = funit, file = "setKmeans.center.txt")
102 do i = 1, ncls
103 write(funit, "(*(g0,:,','))") i, center(:,i)
104 end do
105 close(funit)
106 open(newunit = funit, file = "setKmeans.sample.txt")
107 do i = 1, nsam
108 write(funit, "(*(g0,:,','))") membership(i), sample(:,i)
109 end do
110 close(funit)
111 end block
112
113end program example
Allocate or resize (shrink or expand) an input allocatable scalar string or array of rank 1....
Compute and return an asymptotically optimal set of cluster centers for the input sample,...
Generate and return a scalar or a contiguous array of rank 1 of length s1 of randomly uniformly distr...
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11726
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11508
This module contains procedures and generic interfaces for resizing allocatable arrays of various typ...
This module contains classes and procedures for computing various statistical quantities related to t...
type(rngf_type) rngf
The scalar constant object of type rngf_type whose presence signified the use of the Fortran intrinsi...
This module contains classes and procedures for input/output (IO) or generic display operations on st...
Definition: pm_io.F90:252
type(display_type) disp
This is a scalar module variable an object of type display_type for general display.
Definition: pm_io.F90:11393
This module defines the relevant Fortran kind type-parameters frequently used in the ParaMonte librar...
Definition: pm_kind.F90:268
integer, parameter LK
The default logical kind in the ParaMonte library: kind(.true.) in Fortran, kind(....
Definition: pm_kind.F90:541
integer, parameter IK
The default integer kind in the ParaMonte library: int32 in Fortran, c_int32_t in C-Fortran Interoper...
Definition: pm_kind.F90:540
integer, parameter SK
The default character kind in the ParaMonte library: kind("a") in Fortran, c_char in C-Fortran Intero...
Definition: pm_kind.F90:539
integer, parameter RKS
The single-precision real kind in Fortran mode. On most platforms, this is an 32-bit real kind.
Definition: pm_kind.F90:567
Generate and return an object of type display_type.
Definition: pm_io.F90:10282

Example Unix compile command via Intel ifort compiler
1#!/usr/bin/env sh
2rm main.exe
3ifort -fpp -standard-semantics -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe

Example Windows Batch compile command via Intel ifort compiler
1del main.exe
2set PATH=..\..\..\lib;%PATH%
3ifort /fpp /standard-semantics /O3 /I:..\..\..\include main.F90 ..\..\..\lib\libparamonte*.lib /exe:main.exe
4main.exe

Example Unix / MinGW compile command via GNU gfortran compiler
1#!/usr/bin/env sh
2rm main.exe
3gfortran -cpp -ffree-line-length-none -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe

Example output
1
2!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3! Compute cluster centers based on an input sample and cluster memberships and member-center distances.
4!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5
6
7ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
8[ndim, nsam, ncls]
9+3, +1, +1
10sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
11sample
12+1.86056590
13+2.33848810
14+4.94708586
15call setResized(disq, nsam)
16call setResized(csdisq, nsam + 1_IK)
17call setResized(membership, nsam)
18call setResized(center, [ndim, ncls])
19call setResized(potential, ncls)
20call setResized(size, ncls)
21
22call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
23failed
24F
25niter
26+1
27disq
28+0.00000000
29csdisq
30+0.00000000, +0.00000000
31membership
32+1
33potential
34+0.00000000
35center
36+1.86056590
37+2.33848810
38+4.94708586
39size
40+1
41
42
43ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
44[ndim, nsam, ncls]
45+4, +6, +3
46sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
47sample
48+0.605106354E-1, +3.91550684, +1.78259075, +0.244861841E-1, +3.74207926, +1.23868608
49+3.82219648, +2.51834297, +4.38724089, +1.76161551, +2.37216377, +0.920128822
50+3.29170966, +2.14836836, +2.48454905, +1.69616365, +3.94570684, +4.50155544
51+4.74300432, +1.08490205, +0.494896472, +2.63030267, +3.73949027, +3.34173584
52call setResized(disq, nsam)
53call setResized(csdisq, nsam + 1_IK)
54call setResized(membership, nsam)
55call setResized(center, [ndim, ncls])
56call setResized(potential, ncls)
57call setResized(size, ncls)
58
59call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
60failed
61F
62niter
63+1
64disq
65+4.27282333, +2.12580848, +2.12580872, +2.63968539, +4.27282381, +2.63968563
66csdisq
67+0.00000000, +0.00000000, +8.50323391, +8.50323391, +19.7598000, +36.8510933, +50.0884705
68membership
69+2, +1, +1, +3, +2, +3
70potential
71+8.50323391, +17.0912933, +10.5587416
72center
73+2.84904885, +1.90129495, +0.631586134
74+3.45279193, +3.09718013, +1.34087217
75+2.31645870, +3.61870813, +3.09885955
76+0.789899230, +4.24124718, +2.98601913
77size
78+2, +2, +2
79
80
81ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
82[ndim, nsam, ncls]
83+4, +1, +1
84sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
85sample
86+3.31905961
87+1.77769637
88+4.34667253
89+0.445553660E-1
90call setResized(disq, nsam)
91call setResized(csdisq, nsam + 1_IK)
92call setResized(membership, nsam)
93call setResized(center, [ndim, ncls])
94call setResized(potential, ncls)
95call setResized(size, ncls)
96
97call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
98failed
99F
100niter
101+1
102disq
103+0.00000000
104csdisq
105+0.00000000, +0.00000000
106membership
107+1
108potential
109+0.00000000
110center
111+3.31905961
112+1.77769637
113+4.34667253
114+0.445553660E-1
115size
116+1
117
118
119ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
120[ndim, nsam, ncls]
121+5, +3, +3
122sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
123sample
124+1.91127121, +1.73543835, +4.50633907
125+2.02446437, +0.232717097, +0.109225810
126+2.12177014, +4.74487829, +3.43190169
127+1.99768043, +4.05633163, +1.49159253
128+3.93518448, +0.113532245, +3.09043932
129call setResized(disq, nsam)
130call setResized(csdisq, nsam + 1_IK)
131call setResized(membership, nsam)
132call setResized(center, [ndim, ncls])
133call setResized(potential, ncls)
134call setResized(size, ncls)
135
136call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
137failed
138F
139niter
140+1
141disq
142+0.00000000, +0.00000000, +0.00000000
143csdisq
144+0.00000000, +0.00000000, +0.00000000, +13.0886822
145membership
146+1, +2, +3
147potential
148+0.00000000, +0.00000000, +0.00000000
149center
150+1.91127121, +1.73543835, +4.50633907
151+2.02446437, +0.232717097, +0.109225810
152+2.12177014, +4.74487829, +3.43190169
153+1.99768043, +4.05633163, +1.49159253
154+3.93518448, +0.113532245, +3.09043932
155size
156+1, +1, +1
157
158
159ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
160[ndim, nsam, ncls]
161+5, +7, +4
162sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
163sample
164+4.50138950, +2.56586671, +3.02935314, +2.58235574, +4.41346312, +1.45052254, +0.178368092
165+0.559035540, +4.24414110, +3.28050518, +3.15112329, +4.54099274, +2.02088332, +1.48733282
166+3.66114235, +1.40305936, +2.48747873, +0.932545960, +3.06326628, +4.25475121, +4.35074234
167+3.61076713, +2.94333220, +0.214233398, +1.71196580, +1.57029033, +0.717387497, +1.96168566
168+3.62745976, +3.48571634, +4.92286730, +0.182558894, +4.99397373, +4.07332945, +2.42210770
169call setResized(disq, nsam)
170call setResized(csdisq, nsam + 1_IK)
171call setResized(membership, nsam)
172call setResized(center, [ndim, ncls])
173call setResized(potential, ncls)
174call setResized(size, ncls)
175
176call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
177failed
178F
179niter
180+1
181disq
182+0.00000000, +3.46086335, +1.42001688, +3.46086407, +1.42001677, +1.54676962, +1.54676962
183csdisq
184+0.00000000, +20.3676052, +20.3676052, +26.0476723, +39.8911285, +39.8911285, +39.8911285, +46.0782089
185membership
186+4, +1, +2, +1, +2, +3, +3
187potential
188+13.8434544, +5.68006706, +6.18707848, +0.00000000
189center
190+2.57411122, +3.72140813, +0.814445317, +4.50138950
191+3.69763231, +3.91074896, +1.75410807, +0.559035540
192+1.16780269, +2.77537251, +4.30274677, +3.66114235
193+2.32764912, +0.892261863, +1.33953655, +3.61076713
194+1.83413768, +4.95842075, +3.24771857, +3.62745976
195size
196+2, +2, +2, +1
197
198
199ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
200[ndim, nsam, ncls]
201+1, +3, +3
202sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
203sample
204+4.85299492, +3.85692549, +3.69540405
205call setResized(disq, nsam)
206call setResized(csdisq, nsam + 1_IK)
207call setResized(membership, nsam)
208call setResized(center, [ndim, ncls])
209call setResized(potential, ncls)
210call setResized(size, ncls)
211
212call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
213failed
214F
215niter
216+1
217disq
218+0.00000000, +0.00000000, +0.00000000
219csdisq
220+0.00000000, +0.00000000, +0.260891747E-1, +0.260891747E-1
221membership
222+2, +3, +1
223potential
224+0.00000000, +0.00000000, +0.00000000
225center
226+3.69540405, +4.85299492, +3.85692549
227size
228+1, +1, +1
229
230
231ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
232[ndim, nsam, ncls]
233+3, +10, +5
234sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
235sample
236+2.75528002, +2.93790388, +0.147336721, +0.442206860E-2, +0.712560713, +0.765154064, +0.685321689, +0.690966249, +3.45343304, +2.26599216
237+4.52442741, +1.90929794, +0.782236457E-1, +1.68936908, +0.733100474, +2.87327075, +0.891475379, +4.29375362, +1.72655642, +0.475087166
238+1.95573115, +3.73172855, +2.99981856, +2.11505222, +1.52355337, +1.54282331, +3.17285490, +1.90022445, +4.98853016, +0.505707562
239call setResized(disq, nsam)
240call setResized(csdisq, nsam + 1_IK)
241call setResized(membership, nsam)
242call setResized(center, [ndim, ncls])
243call setResized(potential, ncls)
244call setResized(size, ncls)
245
246call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
247failed
248F
249niter
250+1
251disq
252+1.07942069, +0.469678760, +0.245186925, +0.395962179, +1.15452337, +1.33519077, +0.245186970, +1.07942045, +0.469678760, +0.00000000
253csdisq
254+0.00000000, +0.00000000, +1.87871504, +5.27774096, +5.27774096, +7.04352188, +9.35130405, +11.5705090, +15.8881912, +15.8881912, +15.8881912
255membership
256+1, +3, +5, +4, +4, +4, +5, +1, +3, +2
257potential
258+4.31768274, +0.00000000, +1.87871504, +4.07356310, +0.980747759
259center
260+1.72312307, +2.26599216, +3.19566846, +0.494045615, +0.416329205
261+4.40909052, +0.475087166, +1.81792712, +1.76524675, +0.484849513
262+1.92797780, +0.505707562, +4.36012936, +1.72714305, +3.08633661
263size
264+2, +1, +2, +3, +2
265
266
267ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
268[ndim, nsam, ncls]
269+1, +9, +5
270sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
271sample
272+3.07414961, +0.723273456, +1.06133819, +3.49542165, +1.94098568, +0.823132992, +0.477921963, +0.442555249, +3.12862921
273call setResized(disq, nsam)
274call setResized(csdisq, nsam + 1_IK)
275call setResized(membership, nsam)
276call setResized(center, [ndim, ncls])
277call setResized(potential, ncls)
278call setResized(size, ncls)
279
280call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
281failed
282F
283niter
284+1
285disq
286+0.742006698E-3, +0.213086475E-1, +0.368985347E-1, +0.00000000, +0.00000000, +0.212661899E-2, +0.312701100E-3, +0.312701100E-3, +0.742006698E-3
287csdisq
288+0.00000000, +0.00000000, +0.997192692E-2, +0.667136386E-1, +0.667136386E-1, +0.667136386E-1, +0.667136386E-1, +0.185884297, +0.330723703, +0.333691716
289membership
290+2, +1, +1, +4, +3, +1, +5, +5, +2
291potential
292+0.667136386E-1, +0.296802679E-2, +0.00000000, +0.00000000, +0.125080440E-2
293center
294+0.869248271, +3.10138941, +1.94098568, +3.49542165, +0.460238606
295size
296+3, +2, +1, +1, +2
297
298
299ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
300[ndim, nsam, ncls]
301+2, +10, +5
302sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
303sample
304+3.45596528, +4.18688726, +1.62195742, +4.01878929, +0.605930686, +0.474122763, +2.98025417, +3.16942739, +1.88655758, +0.837803781
305+3.01706028, +1.91547215, +3.67292905, +1.77127719, +3.07917452, +0.681773722, +2.69988775, +1.79693007, +0.562771261, +0.186223388
306call setResized(disq, nsam)
307call setResized(csdisq, nsam + 1_IK)
308call setResized(membership, nsam)
309call setResized(center, [ndim, ncls])
310call setResized(potential, ncls)
311call setResized(size, ncls)
312
313call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
314failed
315F
316niter
317+1
318disq
319+0.817249268E-1, +0.163841620, +0.00000000, +0.547741093E-1, +0.00000000, +0.392473638, +0.817248076E-1, +0.388184130, +0.680419862, +0.136653349
320csdisq
321+0.00000000, +0.00000000, +1.74774337, +1.74774337, +3.61648989, +3.61648989, +3.61648989, +3.94338942, +5.51421118, +7.52334499, +7.90117884
322membership
323+1, +5, +3, +5, +4, +2, +1, +5, +2, +2
324potential
325+0.326899469, +2.38696766, +0.00000000, +0.00000000, +1.09832597
326center
327+3.21810961, +1.06616139, +1.62195742, +0.605930686, +3.79170179
328+2.85847402, +0.476922810, +3.67292905, +3.07917452, +1.82789326
329size
330+2, +3, +1, +1, +3
331
332
333ndim = getUnifRand(1, 5); ncls = getUnifRand(1, 5); nsam = getUnifRand(ncls, 2 * ncls);
334[ndim, nsam, ncls]
335+3, +8, +4
336sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
337sample
338+4.21177197, +3.65612292, +4.27762413, +0.231634676, +3.91298723, +3.40240502, +2.76649618, +4.97584581
339+4.07332706, +4.20142221, +1.51513553, +0.276147127, +3.72665834, +3.36839247, +0.102711618, +4.99919701
340+3.08924484, +1.17688060, +4.06864548, +3.44875050, +1.04852080, +4.63641834, +2.18145585, +1.96244693
341call setResized(disq, nsam)
342call setResized(csdisq, nsam + 1_IK)
343call setResized(membership, nsam)
344call setResized(center, [ndim, ncls])
345call setResized(potential, ncls)
346call setResized(size, ncls)
347
348call setKmeans(rngf, membership, disq, csdisq, sample, center, size, potential, failed, niter) ! compute the new clusters and memberships.
349failed
350F
351niter
352+1
353disq
354+1.95381999, +0.335764915, +2.27954841, +0.00000000, +0.532117009, +0.958829999, +0.00000000, +1.42790747
355csdisq
356+0.00000000, +3.54575348, +3.54575348, +8.06868935, +8.06868935, +8.37654591, +8.37654591, +16.4381847, +19.4334126
357membership
358+1, +2, +1, +3, +2, +1, +4, +2
359potential
360+8.06868935, +3.30308390, +0.00000000, +0.00000000
361center
362+3.96393371, +4.18165207, +0.231634676, +2.76649618
363+2.98561859, +4.30909252, +0.276147127, +0.102711618
364+3.93143630, +1.39594936, +3.44875050, +2.18145585
365size
366+3, +3, +1, +1
367
368

Postprocessing of the example output
1#!/usr/bin/env python
2
3import matplotlib.pyplot as plt
4import pandas as pd
5import numpy as np
6import glob
7import sys
8
9
10fontsize = 17
11
12prefixes = { "setKmeansPP" : "Memberships based on k-means++."
13 , "setKmeans" : "k-means refinement of k-means++ memberships."
14 }
15for prefix in list(prefixes.keys()):
16
17 legends = []
18 fig = plt.figure(figsize = 1.25 * np.array([6.4, 4.8]), dpi = 200)
19 ax = plt.subplot()
20
21 fileList = glob.glob(prefix + ".*.txt")
22 if len(fileList) == 2:
23
24 for file in fileList:
25
26 kind = file.split(".")[1]
27 df = pd.read_csv(file, delimiter = ",", header = None)
28
29 if kind == "center":
30 ax.scatter ( df.values[:, 1]
31 , df.values[:,2]
32 , zorder = 100
33 , marker = "*"
34 , c = "red"
35 , s = 50
36 )
37 legends.append("center")
38 elif kind == "sample":
39 ax.scatter ( df.values[:, 1]
40 , df.values[:,2]
41 , c = df.values[:, 0]
42 , s = 10
43 )
44 legends.append("sample")
45 else:
46 sys.exit("Ambiguous file exists: {}".format(file))
47
48 ax.legend(legends, fontsize = fontsize)
49 plt.xticks(fontsize = fontsize - 2)
50 plt.yticks(fontsize = fontsize - 2)
51 ax.set_xlabel("X", fontsize = 17)
52 ax.set_ylabel("Y", fontsize = 17)
53 ax.set_title(prefixes[prefix], fontsize = fontsize)
54
55 plt.axis('equal')
56 plt.grid(visible = True, which = "both", axis = "both", color = "0.85", linestyle = "-")
57 ax.tick_params(axis = "y", which = "minor")
58 ax.tick_params(axis = "x", which = "minor")
59 ax.set_axisbelow(True)
60 plt.tight_layout()
61
62 plt.savefig(prefix + ".png")
63
64 else:
65
66 sys.exit("Ambiguous file list exists.")

Visualization of the example output


Test:
test_pm_clusKmeans


Final Remarks


If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

  1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
  2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Amir Shahmoradi, September 1, 2012, 12:00 AM, National Institute for Fusion Studies, The University of Texas at Austin

Definition at line 1661 of file pm_clusKmeans.F90.


The documentation for this interface was generated from the following file: