ParaMonte Fortran 2.0.0
Parallel Monte Carlo and Machine Learning Library
See the latest version documentation.
pm_clustering::setMember Interface Reference

Compute and return the memberships and minimum distances of a set of input points with respect to the an input set of cluster centers.
More...

Detailed Description

Compute and return the memberships and minimum distances of a set of input points with respect to the an input set of cluster centers.

The membership ID of the ith sample sample(:,i) is determined by computing the distance of the sample from each cluster and choosing the cluster ID j whose center(:,j) has the minimum distance from the sample among all clusters.
This minimum squared distance is output in disq(:,i).
The metric used within this generic interface is the Euclidean distance.

Parameters
[out]membership: The output scalar or vector of shape (1:nsam) of type integer of default kind IK, containing the membership of each input sample in sample from its nearest cluster center, such that cluster(membership(i)) is the nearest cluster center to the ith sample sample(:, i) at a squared-distance of disq(i).
[out]disq: The output scalar or vector of shape (1:nsam) of the same type and kind as the input argument sample, containing the Euclidean squared distance of each input sample in sample from its nearest cluster center.
[in]sample: The input scalar, vector, or matrix of,
  1. type real of kind any supported by the processor (e.g., RK, RK32, RK64, or RK128),
containing the sample of nsam points in a ndim-dimensional space whose memberships and minimum distances with respect to the input centers must be computed.
  1. If sample is a scalar and center is a vector of shape (1 : ncls), then the input sample must be the coordinate of a single sample in (univariate space) whose distance from ncls cluster centers must be computed.
  2. If sample is a vector of shape (1 : ndim) and center is a matrix of shape (1 : ndim, 1 : ncls), then the input sample must be a single sample (in ndim-dimensional space) whose distance from ncls cluster centers must be computed.
  3. If sample is a vector of shape (1 : nsam) and center is a vector of shape (1 : ncls), then the input sample must be a collection of nsam points (in univariate space) whose distances from ncls cluster centers must be computed.
  4. If sample is a matrix of shape (1 : ndim, 1 : nsam) and center is a matrix of shape (1 : ndim, 1 : ncls), then the input sample must be a collection of nsam points (in ndim-dimensional space) whose distances from ncls cluster centers must be computed.
[in]center: The input vector of shape (1:ncls) or matrix of shape (1 : ndim, 1 : ncls) of the same type and kind as the input argument sample, containing the set of ncls cluster centers (centroids) with respect to which the sample memberships and minimum distances must be computed.


Possible calling interfaces

call setMember(membership , disq , sample , center(1 : ncls))
call setMember(membership(1 : nsam) , disq(1 : nsam), sample(1 : nsam) , center(1 : ncls))
call setMember(membership , disq , sample(1 : ndim) , center(1 : ndim, 1 : ncls))
call setMember(membership(1 : nsam) , disq(1 : nsam), sample(1 : ndim, 1 : nsam), center(1 : ndim, 1 : ncls))
Compute and return the memberships and minimum distances of a set of input points with respect to the...
This module contains procedures and routines for the computing the Kmeans clustering of a given set o...
Warning
The condition ubound(center, rank(center)) > 0 must hold for the corresponding input arguments.
The condition ubound(sample, rank(sample)) == ubound(disq, 1) must hold for the corresponding input arguments.
The condition ubound(sample, rank(sample)) == ubound(membership, 1) must hold for the corresponding input arguments.
The condition ubound(sample, 1) == ubound(center, 1) .or. rank(sample) == 0 .or. rank(center) == 1 must hold for the corresponding input arguments.
These conditions are verified only if the library is built with the preprocessor macro CHECK_ENABLED=1.
The pure procedure(s) documented herein become impure when the ParaMonte library is compiled with preprocessor macro CHECK_ENABLED=1.
By default, these procedures are pure in release build and impure in debug and testing builds.
Remarks
This generic interface implements one of the two major steps involved in Kmeans clustering.
See also
setKmeans
setCenter
setMember
setKmeansPP


Example usage

1program example
2
3 use pm_kind, only: SK, IK, LK
4 use pm_kind, only: RKG => RKS ! all other real kinds are also supported.
5 use pm_io, only: display_type
6 use pm_distUnif, only: getUnifRand
8 use pm_clustering, only: setMember
9 use pm_arrayRange, only: getRange
10
11 implicit none
12
13 integer(IK) :: ndim, nsam, ncls
14 real(RKG) , allocatable :: sample(:,:), center(:,:), disq(:)
15 integer(IK) , allocatable :: membership(:)
16 type(display_type) :: disp
17
18 disp = display_type(file = "main.out.F90")
19
20 call disp%skip
21 call disp%show("!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
22 call disp%show("! Compute memberships of a sample of points from arbitrary dimensional cluster centers.")
23 call disp%show("!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
24 call disp%skip
25
26 call disp%skip()
27 call disp%show("ndim = getUnifRand(1, 5); nsam = getUnifRand(1, 5); ncls = getUnifRand(1, 5);")
28 ndim = getUnifRand(1, 5); nsam = getUnifRand(1, 5); ncls = getUnifRand(1, 5);
29 call disp%show("[ndim, nsam, ncls]")
30 call disp%show( [ndim, nsam, ncls] )
31 call disp%show("center = getUnifRand(0., 5., ndim, ncls) ! initialize random centers.")
32 center = getUnifRand(0., 5., ndim, ncls) ! initialize random centers.
33 call disp%show("center")
34 call disp%show( center )
35 call disp%show("sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.")
36 sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
37 call disp%show("sample")
38 call disp%show( sample )
39 call disp%show("call setResized(disq, nsam)")
40 call setResized(disq, nsam)
41 call disp%show("call setResized(membership, nsam)")
42 call setResized(membership, nsam)
43 call disp%skip()
44
45 call disp%show("call setMember(membership, disq, sample, center) ! sample points memberships.")
46 call setMember(membership, disq, sample, center) ! sample points memberships.
47 call disp%show("membership")
48 call disp%show( membership )
49 call disp%show("disq")
50 call disp%show( disq )
51 call disp%skip()
52
53 call disp%show("call setMember(membership(1), disq(1), sample(:,1), center) ! single point membership.")
54 call setMember(membership(1), disq(1), sample(:,1), center) ! single point membership.
55 call disp%show("membership")
56 call disp%show( membership )
57 call disp%show("disq")
58 call disp%show( disq )
59 call disp%skip()
60
61 call disp%show("call setMember(membership, disq, sample(1,:), center(1,:)) ! sample points memberships in one-dimension.")
62 call setMember(membership, disq, sample(1,:), center(1,:)) ! sample points memberships in one-dimension.
63 call disp%show("membership")
64 call disp%show( membership )
65 call disp%show("disq")
66 call disp%show( disq )
67 call disp%skip()
68
69 call disp%show("call setMember(membership(1), disq(1), sample(1,1), center(1,:)) ! single point membership in one-dimension.")
70 call setMember(membership(1), disq(1), sample(1,1), center(1,:)) ! single point membership in one-dimension.
71 call disp%show("membership")
72 call disp%show( membership )
73 call disp%show("disq")
74 call disp%show( disq )
75 call disp%skip()
76
77 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
78 ! Output an example for visualization.
79 !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
80
81 block
82 integer(IK) :: funit, i
83 ndim = 2
84 ncls = 5
85 nsam = 5000
86 center = getUnifRand(0., 1., ndim, ncls)
87 sample = getUnifRand(0., 1., ndim, nsam)
88 call setResized(disq, nsam)
89 call setResized(membership, nsam)
90 call setMember(membership, disq, sample, center)
91 open(newunit = funit, file = "setMember.center.txt")
92 do i = 1, ncls
93 write(funit, "(*(g0,:,','))") i, center(:,i)
94 end do
95 close(funit)
96 open(newunit = funit, file = "setMember.sample.txt")
97 do i = 1, nsam
98 write(funit, "(*(g0,:,','))") membership(i), sample(:,i)
99 end do
100 close(funit)
101 end block
102
103end program example
Generate minimally-spaced character, integer, real sequences or sequences at fixed intervals of size ...
Allocate or resize (shrink or expand) an input allocatable scalar string or array of rank 1....
Generate and return a scalar or a contiguous array of rank 1 of length s1 of randomly uniformly distr...
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11726
This is a generic method of the derived type display_type with pass attribute.
Definition: pm_io.F90:11508
This module contains procedures and generic interfaces for generating ranges of discrete character,...
This module contains procedures and generic interfaces for resizing allocatable arrays of various typ...
This module contains classes and procedures for computing various statistical quantities related to t...
This module contains classes and procedures for input/output (IO) or generic display operations on st...
Definition: pm_io.F90:252
type(display_type) disp
This is a scalar module variable an object of type display_type for general display.
Definition: pm_io.F90:11393
This module defines the relevant Fortran kind type-parameters frequently used in the ParaMonte librar...
Definition: pm_kind.F90:268
integer, parameter LK
The default logical kind in the ParaMonte library: kind(.true.) in Fortran, kind(....
Definition: pm_kind.F90:541
integer, parameter IK
The default integer kind in the ParaMonte library: int32 in Fortran, c_int32_t in C-Fortran Interoper...
Definition: pm_kind.F90:540
integer, parameter SK
The default character kind in the ParaMonte library: kind("a") in Fortran, c_char in C-Fortran Intero...
Definition: pm_kind.F90:539
integer, parameter RKS
The single-precision real kind in Fortran mode. On most platforms, this is an 32-bit real kind.
Definition: pm_kind.F90:567
Generate and return an object of type display_type.
Definition: pm_io.F90:10282

Example Unix compile command via Intel ifort compiler
1#!/usr/bin/env sh
2rm main.exe
3ifort -fpp -standard-semantics -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe

Example Windows Batch compile command via Intel ifort compiler
1del main.exe
2set PATH=..\..\..\lib;%PATH%
3ifort /fpp /standard-semantics /O3 /I:..\..\..\include main.F90 ..\..\..\lib\libparamonte*.lib /exe:main.exe
4main.exe

Example Unix / MinGW compile command via GNU gfortran compiler
1#!/usr/bin/env sh
2rm main.exe
3gfortran -cpp -ffree-line-length-none -O3 -Wl,-rpath,../../../lib -I../../../inc main.F90 ../../../lib/libparamonte* -o main.exe
4./main.exe

Example output
1
2!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3! Compute memberships of a sample of points from arbitrary dimensional cluster centers.
4!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5
6
7ndim = getUnifRand(1, 5); nsam = getUnifRand(1, 5); ncls = getUnifRand(1, 5);
8[ndim, nsam, ncls]
9+5, +5, +5
10center = getUnifRand(0., 5., ndim, ncls) ! initialize random centers.
11center
12+2.95287347, +0.873432755, +2.94427490, +4.48586750, +2.38695025
13+4.31980324, +2.16027975, +1.91202998, +3.01914549, +0.972197354
14+4.63103580, +0.415405929, +2.97512984, +4.01900101, +2.90705252
15+0.324255228E-1, +4.30651140, +2.72871852, +2.04859352, +3.20556593
16+2.71212721, +0.887698233, +2.62440348, +0.803742111, +2.39284754
17sample = getUnifRand(0., 5., ndim, nsam) ! Create a random sample.
18sample
19+3.16145992, +3.65849543, +3.82833219, +4.83974886, +4.06710815
20+3.87250900, +1.49034524, +3.47379589, +1.40087605, +2.95571947
21+0.848776400, +2.89913535, +2.59124303, +4.34037304, +3.14335108
22+3.55772209, +2.43910432, +4.86771584, +3.81262803, +2.25654960
23+1.82135320, +1.59435725, +4.12382507, +4.43122387, +3.55864477
24call setResized(disq, nsam)
25call setResized(membership, nsam)
26
27call setMember(membership, disq, sample, center) ! sample points memberships.
28membership
29+3, +3, +3, +3, +3
30disq
31+9.74416351, +1.83857572, +10.1916142, +10.1574488, +3.47409105
32
33call setMember(membership(1), disq(1), sample(:,1), center) ! single point membership.
34membership
35+3, +3, +3, +3, +3
36disq
37+9.74416351, +1.83857572, +10.1916142, +10.1574488, +3.47409105
38
39call setMember(membership, disq, sample(1,:), center(1,:)) ! sample points memberships in one-dimension.
40membership
41+1, +1, +4, +4, +4
42disq
43+0.435083099E-1, +0.497902334, +0.432352692, +0.125232011, +0.175359383
44
45call setMember(membership(1), disq(1), sample(1,1), center(1,:)) ! single point membership in one-dimension.
46membership
47+1, +1, +4, +4, +4
48disq
49+0.435083099E-1, +0.497902334, +0.432352692, +0.125232011, +0.175359383
50
51

Postprocessing of the example output
1#!/usr/bin/env python
2
3import matplotlib.pyplot as plt
4import pandas as pd
5import numpy as np
6import glob
7import sys
8import os
9
10fontsize = 17
11fig = plt.figure(figsize = 1.25 * np.array([6.4, 4.8]), dpi = 200)
12ax = plt.subplot()
13
14parent = os.path.basename(os.path.dirname(__file__))
15pattern = parent + "*.txt"
16
17fileList = glob.glob(pattern)
18legends = []
19if len(fileList) == 2:
20 for file in fileList:
21
22 kind = file.split(".")[1]
23 prefix = file.split(".")[0]
24 df = pd.read_csv(file, delimiter = ",", header = None)
25
26 if kind == "center":
27 ax.scatter ( df.values[:,1]
28 , df.values[:,2]
29 , zorder = 100
30 , marker = "*"
31 , c = "red"
32 , s = 50
33 )
34 legends.append("center")
35 elif kind == "sample":
36 ax.scatter ( df.values[:,1]
37 , df.values[:,2]
38 , c = df.values[:, 0]
39 , s = 10
40 )
41 legends.append("sample")
42 else:
43 sys.exit("Ambiguous file exists: {}".format(file))
44
45 ax.legend(legends, fontsize = fontsize)
46 plt.xticks(fontsize = fontsize - 2)
47 plt.yticks(fontsize = fontsize - 2)
48 ax.set_xlabel("X", fontsize = 17)
49 ax.set_ylabel("Y", fontsize = 17)
50 ax.set_title("Membership Scatter Plot", fontsize = fontsize)
51
52 plt.axis('equal')
53 plt.grid(visible = True, which = "both", axis = "both", color = "0.85", linestyle = "-")
54 ax.tick_params(axis = "y", which = "minor")
55 ax.tick_params(axis = "x", which = "minor")
56 ax.set_axisbelow(True)
57 plt.tight_layout()
58
59 plt.savefig(prefix + ".png")
60else:
61 sys.exit("Ambiguous file list exists.")

Visualization of the example output
Test:
test_pm_clustering


Final Remarks


If you believe this algorithm or its documentation can be improved, we appreciate your contribution and help to edit this page's documentation and source file on GitHub.
For details on the naming abbreviations, see this page.
For details on the naming conventions, see this page.
This software is distributed under the MIT license with additional terms outlined below.

  1. If you use any parts or concepts from this library to any extent, please acknowledge the usage by citing the relevant publications of the ParaMonte library.
  2. If you regenerate any parts/ideas from this library in a programming environment other than those currently supported by this ParaMonte library (i.e., other than C, C++, Fortran, MATLAB, Python, R), please also ask the end users to cite this original ParaMonte library.

This software is available to the public under a highly permissive license.
Help us justify its continued development and maintenance by acknowledging its benefit to society, distributing it, and contributing to it.

Author:
Amir Shahmoradi, September 1, 2012, 12:00 AM, National Institute for Fusion Studies, The University of Texas at Austin

Definition at line 126 of file pm_clustering.F90.


The documentation for this interface was generated from the following file: