Line data Source code
1 : !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
2 : !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
3 : !!!! !!!!
4 : !!!! ParaMonte: Parallel Monte Carlo and Machine Learning Library. !!!!
5 : !!!! !!!!
6 : !!!! Copyright (C) 2012-present, The Computational Data Science Lab !!!!
7 : !!!! !!!!
8 : !!!! This file is part of the ParaMonte library. !!!!
9 : !!!! !!!!
10 : !!!! LICENSE !!!!
11 : !!!! !!!!
12 : !!!! https://github.com/cdslaborg/paramonte/blob/main/LICENSE.md !!!!
13 : !!!! !!!!
14 : !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
15 : !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
16 :
17 : !> \brief
18 : !> This module contains procedures and generic interfaces for facilitating parallel computations and assessing the performance of parallel Coarray/MPI/OpenMP algorithms.
19 : !>
20 : !> \details
21 : !> A primary goal in the design of the types and procedures of this module is to facilitate parallelism-agnostic coding within a codebase.<br>
22 : !> As such, the attributes and procedures within this module are guaranteed to also work in serial applications.<br>
23 : !> However, the procedures of this module typically do nothing when the application is serial or the parallelism is not recognized.<br>
24 : !>
25 : !> \test
26 : !> [test_pm_parallelism](@ref test_pm_parallelism)
27 : !>
28 : !> \finmain
29 : !>
30 : !> \author
31 : !> \AmirShahmoradi, Tuesday March 7, 2017, 3:50 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
32 :
33 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
34 :
35 : module pm_parallelism
36 :
37 : use pm_kind, only: SK, IK, LK
38 : use pm_val2str, only: getStr
39 :
40 : implicit none
41 :
42 : character(*, SK), parameter :: MODULE_NAME = SK_"@pm_parallelism"
43 :
44 : #if CAF_ENABLED || MPI_ENABLED
45 : !> \brief
46 : !> The scalar constant of type `character` of default kind \SK, whose value is set to `"parallel"`
47 : !> if the ParaMonte library is built with the preprocessor macro `-DCAF_ENABLED=1` or `-DMPI_ENABLED=1`;
48 : !> otherwise, it is set to `"serial"`.<br>
49 : character(*, SK), parameter :: PARALLELIZATION_MODE = SK_"parallel"
50 : #else
51 : character(*, SK), parameter :: PARALLELIZATION_MODE = SK_"serial"
52 : #endif
53 :
54 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
55 :
56 : !> \brief
57 : !> This is the [imageis_type](@ref pm_parallelism::imageis_type) type for generating objects with components
58 : !> of type `logical` of default kind \LK that contain information about the current image/processor/thread.<br>
59 : !>
60 : !> \details
61 : !> Objects of this type are not meant to be used directly by the end user.<br>
62 : !> This type merely exists to create the `is` component of the [image_type](@ref pm_parallelism::image_type) class.<br>
63 : !>
64 : !> \interface{imageis_type}
65 : !> \code{.F90}
66 : !>
67 : !> use pm_parallelism, only: imageis_type
68 : !> type(imageis_type) :: imageis
69 : !>
70 : !> imageis = imageis_type()
71 : !>
72 : !> \endcode
73 : !>
74 : !> \finmain{imageis_type}
75 : !>
76 : !> \author
77 : !> \AmirShahmoradi, Tuesday March 7, 2017, 3:50 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
78 : type :: imageis_type
79 : logical(LK) :: first = .false._LK !< \public The scalar `logical` of default kind \LK indicating whether the current process is ID #1.
80 : logical(LK) :: extra = .false._LK !< \public The scalar `logical` of default kind \LK indicating whether the current process is NOT ID #1.
81 : logical(LK) :: leader = .false._LK !< \public The scalar `logical` of default kind \LK indicating whether the current process is a leader.<br>
82 : !! The default value is `.false.`.<br>
83 : !! **It must be set by the user after calling the type constructor, depending on the parallelism type** (e.g., `.true.` for image ID `1`).<br>
84 : logical(LK) :: rooter = .false._LK !< \public The scalar `logical` of default kind \LK indicating whether the current process is a follower.<br>
85 : !! The default value is `.false.`.<br>
86 : !! **It must be set by the user after calling the type constructor, depending on the parallelism type** (e.g., `.true.` for any image ID other than `1`).<br>
87 : end type
88 :
89 : !> \brief
90 : !> This is the [image_type](@ref pm_parallelism::image_type) type for generating objects that contain
91 : !> information about the current image/processor/thread and facilitate its synchronization with other processes,
92 : !> or the global finalization of all inter-process parallel communications (e.g., as is done in MPI applications).
93 : !>
94 : !> \details
95 : !> <ol>
96 : !> <li> In Fortran terminology, an **image** (or **process**) is a replication of the program having its own set of data objects.<br>
97 : !> <li> The number of images may be equal to, greater than, or less than the number of available physical processors.
98 : !> <li> A particular implementation may permit the number of images to be chosen at compile time, at link time, or at execution time.
99 : !> <li> Each image executes asynchronously, and the normal rules of Fortran apply within each image.
100 : !> <li> The execution sequence can differ from image to image as specified by the programmer who, with the help
101 : !> of a unique image index, determines the actual path using normal Fortran control constructs and explicit synchronizations.
102 : !> </ol>
103 : !>
104 : !> \return
105 : !> `image` : The output scalar of type [image_type](@ref pm_parallelism::image_type).
106 : !>
107 : !> \interface{image_type}
108 : !> \code{.F90}
109 : !>
110 : !> use pm_parallelism, only: image_type
111 : !> type(image_type) :: image
112 : !>
113 : !> image = image_type()
114 : !>
115 : !> \endcode
116 : !>
117 : !> \warning
118 : !> Even when the application is MPI/OpenMP-parallel, the `id` assigned to MPI rank `0` (or OpenMP thread `0`) is `1` in this type.<br>
119 : !> This convention follows the rules of the Fortran Coarray parallel programming language.<br>
120 : !>
121 : !> \warning
122 : !> For **OpenMP-enabled** ParaMonte library builds, this routine returns the value of the OpenMP intrinsic `omp_get_num_threads()` routine,
123 : !> that is, the number of threads in the team executing the parallel region to which the routine binds.<br>
124 : !> If this constructor is called from the sequential part of an OpenMP-enabled program, this routine returns `1`.<br>
125 : !> To return a value larger than `1`, this routine must be called from within an OpenMP-enabled parallel region.<br>
126 : !> The desired number of OpenMP threads can be set at runtime via [setImageCount()](@ref pm_parallelism::setImageCount),
127 : !> or directly via the OpenMP intrinsic subroutine `omp_set_num_threads()`.<br>
128 : !>
129 : !> \finmain{image_type}
130 : !>
131 : !> \author
132 : !> \AmirShahmoradi, Tuesday March 7, 2017, 3:50 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
133 : type :: image_type
134 : integer(IK) :: count = -huge(1_IK) !< \public The scalar `integer` of default kind \IK representing the total count of runtime parallel processes available within the current communication.
135 : integer(IK) :: id = -huge(1_IK) !< \public The scalar `integer` of default kind \IK representing the ID of the runtime parallel process starting with `1`: `1`, `2`, `3`, ...
136 : type(imageis_type) :: is = imageis_type() !< \public The scalar of type [imageis_type](@ref pm_parallelism::imageis_type) containing `logical` components that signify the current image role.
137 : character(:, SK), allocatable :: label !< \public The `allocatable` scalar `character` of default kind \SK containing the ID of the current process in the format `@process(ID)`.
138 : contains
139 : procedure, nopass :: sync => setImageSynced
140 : procedure, nopass :: finalize => setImageFinalized
141 : end type
142 :
143 : !> \cond excluded
144 : interface image_type
145 : module procedure :: constructImage
146 : end interface
147 : !> \endcond excluded
148 :
149 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
150 :
151 : !> \brief
152 : !> Return the predicted parallel Fork-Join speedup scaling behavior for simulations whose image
153 : !> contributions are stochastically accepted only from the first successful image starting from image ID `1`.
154 : !>
155 : !> \details
156 : !> This generic interface can be used to compute theoretical speedup gained by a
157 : !> Fork-Join parallel simulation where the probability of contribution of an image to the simulation
158 : !> is less than `1`, meaning that image contributions are stochastic at each step of the Fork-Join cycle.<br>
159 : !> If the image contributions are inspected sequentially from the first image to last and only the first successful image contribution
160 : !> is kept in the simulation, then it can be shown that the contribution of the images to the parallel Fork-Join simulation follows a [Cyclic Geometric Distribution](@ref pm_distGeomCyclic).<br>
161 : !> See [Amir Shahmoradi, Fatemeh Bagheri (2020). ParaDRAM: A Cross-Language Toolbox for Parallel High-Performance Delayed-Rejection Adaptive Metropolis Markov Chain Monte Carlo Simulations.](https://www.cdslab.org/pubs/2020_Shahmoradi_I.pdf)
162 : !> for theoretical details.<br>
163 : !>
164 : !> \param[in] conProb : The input scalar of,
165 : !> <ol>
166 : !> <li> type `real` of kind \RKALL,
167 : !> </ol>
168 : !> containing the average probability of contribution of each image to the Fork-Join simulation.<br>
169 : !> For example, `conProb` can be the effective acceptance rate in parallel Fork-Join Monte Carlo or MCMC sampling simulations
170 : !> where each image is tasked with returning a single proposal state for subsequent evaluation and, if it is the first to be accepted,
171 : !> used in the next simulation step.<br>
172 : !> \param[in] seqSecTime : The input scalar of the same type and kind as `conProb` containing the time (in seconds) **per image (or process or thread)** of the inherently-sequential sections of the entire Fork-Join simulation.<br>
173 : !> \param[in] parSecTime : The input scalar of the same type and kind as `conProb` containing the time (in seconds) **per image (or process or thread)** of the inherently-parallel sections of the entire Fork-Join simulation.<br>
174 : !> \param[in] comSecTime : The input scalar of the same type and kind as `conProb` containing the time (in seconds) **per image (or process or thread)** of the **pairwise** inter-process communication sections of the entire Fork-Join simulation.<br>
175 : !> Assuming all rooter images communicate only with the main image and the entire communication takes `overhead` seconds, then `comSecTime = overhead / (nproc - 1)` with `nproc` representing the total number of images.<br>
176 : !> \param[out] scaling : The output `allocatable` vector of the same type and kind as the input `conProb` containing the predicted speedup scaling of
177 : !> the stochastic Fork-Join simulation under a range of possible image (or thread or process) counts given in the output `numproc`.<br>
178 : !> On output, the vector `scaling` is resized such that the maximum speedup and the corresponding image count appear within it.<br>
179 : !> The `i`th element of `scaling` represents theoretically-predicted speedup if `numproc(i)` images (threads or processes) were used in the parallel Fork-Join simulation.<br>
180 : !> \param[out] numproc : The output `allocatable` vector of type `integer` of default kind \IK containing the
181 : !> number of images for the speedup reported in the corresponding element of `scaling`.<br>
182 : !> If the input argument `comSecTime` is positive, then `numproc` is simply a linear range of integers starting with `1` with a jump size of `1`.<br>
183 : !> If the input argument `comSecTime` is zero, then `numproc` is simply a log(2)-linear range of integers starting with `1`.<br>
184 : !> \param[out] scalingMaxVal : The output scalar of the same type and kind as the input `conProb` containing the maximum achievable speedup (corresponding to `maxval(scaling)`).<br>
185 : !> \param[out] scalingMaxLoc : The output scalar of type `integer` of default kind \IK containing the index of the element of the output `scaling` containing `scalingMaxVal` (i.e., `maxloc(scaling)`).<br>
186 : !> \param[in] scalingMinLen : The input positive scalar of type `integer` of default kind \IK containing the minimum size of the output `scaling`.<br>
187 : !> (**optional**, default = `int(2 / max(conProb, 0.001))`)
188 : !>
189 : !> \interface{setForkJoinScaling}
190 : !> \code{.F90}
191 : !>
192 : !> use pm_parallelism, only: setForkJoinScaling
193 : !>
194 : !> call setForkJoinScaling(conProb, seqSecTime, parSecTime, comSecTime, scaling, numproc, scalingMaxVal, scalingMaxLoc, scalingMinLen = scalingMinLen)
195 : !> !
196 : !> \endcode
197 : !>
198 : !> \warning
199 : !> The condition `0 <= seqSecTime` must hold for the corresponding input arguments.<br>
200 : !> The condition `0 <= parSecTime` must hold for the corresponding input arguments.<br>
201 : !> The condition `0 <= comSecTime` must hold for the corresponding input arguments.<br>
202 : !> The condition `0 <= scalingMinLen` must hold for the corresponding input arguments.<br>
203 : !> The condition `0 <= conProb .and. conProb <= 1` must hold for the corresponding input arguments.<br>
204 : !> \vericons
205 : !>
206 : !> \see
207 : !> [pm_sampling](@ref pm_sampling)<br>
208 : !> [pm_distGeomCyclic](@ref pm_distGeomCyclic)<br>
209 : !> [Amir Shahmoradi, Fatemeh Bagheri (2020). ParaDRAM: A Cross-Language Toolbox for Parallel High-Performance Delayed-Rejection Adaptive Metropolis Markov Chain Monte Carlo Simulations.](https://www.cdslab.org/pubs/2020_Shahmoradi_I.pdf)<br>
210 : !>
211 : !> \example{setForkJoinScaling}
212 : !> \include{lineno} example/pm_parallelism/setForkJoinScaling/main.F90
213 : !> \compilef{setForkJoinScaling}
214 : !> \output{setForkJoinScaling}
215 : !> \include{lineno} example/pm_parallelism/setForkJoinScaling/main.out.F90
216 : !> \postproc{setForkJoinScaling}
217 : !> \include{lineno} example/pm_parallelism/setForkJoinScaling/main.py
218 : !> \vis{setForkJoinScaling}
219 : !> \image html pm_parallelism/setForkJoinScaling/setForkJoinScaling.png width=700
220 : !>
221 : !> \test
222 : !> [test_pm_parallelism](@ref test_pm_parallelism)
223 : !>
224 : !> \finmain{setForkJoinScaling}
225 : !>
226 : !> \author
227 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
228 :
229 : interface setForkJoinScaling
230 :
231 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
232 :
233 : #if RK5_ENABLED
234 : PURE module subroutine setForkJoinScaling_RK5(conProb, seqSecTime, parSecTime, comSecTime, scaling, numproc, scalingMaxVal, scalingMaxLoc, scalingMinLen)
235 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
236 : !DEC$ ATTRIBUTES DLLEXPORT :: setForkJoinScaling_RK5
237 : #endif
238 : use pm_kind, only: RKC => RK5
239 : integer(IK) , intent(out) :: scalingMaxLoc
240 : integer(IK) , intent(in) , optional :: scalingMinLen
241 : integer(IK) , intent(out) , allocatable :: numproc(:)
242 : real(RKC) , intent(out) , allocatable :: scaling(:)
243 : real(RKC) , intent(in) :: conProb, seqSecTime, parSecTime, comSecTime
244 : real(RKC) , intent(out) :: scalingMaxVal
245 : end subroutine
246 : #endif
247 :
248 : #if RK4_ENABLED
249 : PURE module subroutine setForkJoinScaling_RK4(conProb, seqSecTime, parSecTime, comSecTime, scaling, numproc, scalingMaxVal, scalingMaxLoc, scalingMinLen)
250 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
251 : !DEC$ ATTRIBUTES DLLEXPORT :: setForkJoinScaling_RK4
252 : #endif
253 : use pm_kind, only: RKC => RK4
254 : integer(IK) , intent(out) :: scalingMaxLoc
255 : integer(IK) , intent(in) , optional :: scalingMinLen
256 : integer(IK) , intent(out) , allocatable :: numproc(:)
257 : real(RKC) , intent(out) , allocatable :: scaling(:)
258 : real(RKC) , intent(in) :: conProb, seqSecTime, parSecTime, comSecTime
259 : real(RKC) , intent(out) :: scalingMaxVal
260 : end subroutine
261 : #endif
262 :
263 : #if RK3_ENABLED
264 : PURE module subroutine setForkJoinScaling_RK3(conProb, seqSecTime, parSecTime, comSecTime, scaling, numproc, scalingMaxVal, scalingMaxLoc, scalingMinLen)
265 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
266 : !DEC$ ATTRIBUTES DLLEXPORT :: setForkJoinScaling_RK3
267 : #endif
268 : use pm_kind, only: RKC => RK3
269 : integer(IK) , intent(out) :: scalingMaxLoc
270 : integer(IK) , intent(in) , optional :: scalingMinLen
271 : integer(IK) , intent(out) , allocatable :: numproc(:)
272 : real(RKC) , intent(out) , allocatable :: scaling(:)
273 : real(RKC) , intent(in) :: conProb, seqSecTime, parSecTime, comSecTime
274 : real(RKC) , intent(out) :: scalingMaxVal
275 : end subroutine
276 : #endif
277 :
278 : #if RK2_ENABLED
279 : PURE module subroutine setForkJoinScaling_RK2(conProb, seqSecTime, parSecTime, comSecTime, scaling, numproc, scalingMaxVal, scalingMaxLoc, scalingMinLen)
280 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
281 : !DEC$ ATTRIBUTES DLLEXPORT :: setForkJoinScaling_RK2
282 : #endif
283 : use pm_kind, only: RKC => RK2
284 : integer(IK) , intent(out) :: scalingMaxLoc
285 : integer(IK) , intent(in) , optional :: scalingMinLen
286 : integer(IK) , intent(out) , allocatable :: numproc(:)
287 : real(RKC) , intent(out) , allocatable :: scaling(:)
288 : real(RKC) , intent(in) :: conProb, seqSecTime, parSecTime, comSecTime
289 : real(RKC) , intent(out) :: scalingMaxVal
290 : end subroutine
291 : #endif
292 :
293 : #if RK1_ENABLED
294 : PURE module subroutine setForkJoinScaling_RK1(conProb, seqSecTime, parSecTime, comSecTime, scaling, numproc, scalingMaxVal, scalingMaxLoc, scalingMinLen)
295 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
296 : !DEC$ ATTRIBUTES DLLEXPORT :: setForkJoinScaling_RK1
297 : #endif
298 : use pm_kind, only: RKC => RK1
299 : integer(IK) , intent(out) :: scalingMaxLoc
300 : integer(IK) , intent(in) , optional :: scalingMinLen
301 : integer(IK) , intent(out) , allocatable :: numproc(:)
302 : real(RKC) , intent(out) , allocatable :: scaling(:)
303 : real(RKC) , intent(in) :: conProb, seqSecTime, parSecTime, comSecTime
304 : real(RKC) , intent(out) :: scalingMaxVal
305 : end subroutine
306 : #endif
307 :
308 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
309 :
310 : end interface
311 :
312 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
313 :
314 : !> \cond excluded
315 : #if OMP_ENABLED
316 : logical(LK) , save :: mv_failed = .false._LK
317 : #endif
318 : !> \endcond excluded
319 :
320 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
321 :
322 : contains
323 :
324 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
325 :
326 : !> \brief
327 : !> Generate and return an object of class [image_type](@ref pm_parallelism::image_type)
328 : !> containing information and statistics of the parallel images/processes/threads available, depending on the type of parallelism requested.<br>
329 : !>
330 : !> \details
331 : !> This procedure is the constructor of the type [image_type](@ref pm_parallelism::image_type).<br>
332 : !> See the documentation of [image_type](@ref pm_parallelism::image_type) for further details and example usage.<br>
333 : !>
334 : !> \return
335 : !> `image` : The output scalar of type [image_type](@ref pm_parallelism::image_type).
336 : !>
337 : !> \interface{constructImage}
338 : !> \code{.F90}
339 : !>
340 : !> use pm_parallelism, only: image_type
341 : !> type(image_type) :: image
342 : !>
343 : !> image = image_type()
344 : !>
345 : !> \endcode
346 : !>
347 : !> \warning
348 : !> Even when the application is MPI/OpenMP-parallel, the `id` assigned to MPI rank `0` (or OpenMP thread `0`) is `1` in this type.<br>
349 : !> This convention follows the rules of the Fortran Coarray parallel programming language.<br>
350 : !>
351 : !> \warning
352 : !> For **OpenMP-enabled** ParaMonte library builds, this routine returns the value of the OpenMP intrinsic `omp_get_num_threads()` routine,
353 : !> that is, the number of threads in the team executing the parallel region to which the routine binds.<br>
354 : !> If this constructor is called from the sequential part of an OpenMP-enabled program, this routine returns `1`.<br>
355 : !> To return a value larger than `1`, this routine must be called from within an OpenMP-enabled parallel region.<br>
356 : !> The desired number of OpenMP threads can be set at runtime via [setImageCount()](@ref pm_parallelism::setImageCount),
357 : !> or directly via the OpenMP intrinsic subroutine `omp_set_num_threads()`.<br>
358 : !>
359 : !> \finmain{constructImage}
360 : !>
361 : !> \author
362 : !> \AmirShahmoradi, Tuesday March 7, 2017, 3:50 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
363 60 : function constructImage() result(image)
364 : type(image_type) :: image
365 : !integer(IK), intent(in), optional :: nthread
366 : ! setup general processor / coarray image variables
367 60 : image%id = getImageID()
368 60 : image%count = getImageCount()
369 60 : image%label = SK_"@process("//getStr(image%id)//SK_")"
370 60 : image%is%first = image%id == 1_IK
371 60 : image%is%extra = image%id /= 1_IK
372 : !image%is%leader = .false._LK ! ATTN: this is to be set by the user at runtime, depending on the parallelism type.
373 : !image%is%rooter = .false._LK ! ATTN: this is to be set by the user at runtime, depending on the parallelism type.
374 60 : end function
375 :
376 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
377 :
378 : !> \brief
379 : !> Synchronize all existing parallel images and return nothing.
380 : !>
381 : !> \details
382 : !> This is a static member of the [image_type](@ref pm_parallelism::image_type) class.<br>
383 : !> This procedure synchronizes all images in the current communication world by,
384 : !> <ol>
385 : !> <li> calling `sync all` in Coarray applications.
386 : !> <li> calling `mpi_barrier()` in MPI applications.
387 : !> <li> executing the `!$omp barrier` directive in OpenMP applications, and doing nothing in all other (including serial) applications.
388 : !> </ol>
389 : !>
390 : !> \warning
391 : !> This routine contains global Coarray and MPI synchronization barriers and therefore, must be called by all processes in the current simulation.
392 : !>
393 : !> \warning
394 : !> The MPI library must be initialized and not finalized prior to calling this routine for MPI-parallel applications.
395 : !>
396 : !> \interface{setImageSynced}
397 : !> \code{.F90}
398 : !>
399 : !> use pm_parallelism, only: setImageSynced
400 : !>
401 : !> call setImageSynced()
402 : !>
403 : !> \endcode
404 : !>
405 : !> \test
406 : !> [test_pm_parallelism](@ref test_pm_parallelism)
407 : !>
408 : !> \finmain{setImageSynced}
409 : !>
410 : !> \author
411 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
412 174 : subroutine setImageSynced()
413 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
414 : !DEC$ ATTRIBUTES DLLEXPORT :: setImageSynced
415 : #endif
416 : #if CAF_ENABLED
417 : sync all
418 : #elif MPI_ENABLED
419 : block
420 : use mpi !, only: mpi_barrier, mpi_comm_world
421 : integer :: ierrMPI
422 : call mpi_barrier(mpi_comm_world, ierrMPI)
423 : end block
424 : #elif OMP_ENABLED
425 : !$omp barrier
426 : #endif
427 174 : end subroutine
428 :
429 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
430 :
431 : !> \brief
432 : !> Finalize the current parallel simulation and return nothing.<br>
433 : !>
434 : !> \details
435 : !> This is a static member of the [image_type](@ref pm_parallelism::image_type) class.<br>
436 : !> <ol>
437 : !> <li> This procedure facilitates a parallelism-agnostic way of finalizing serial and parallel applications by hiding the finalization step within this procedure.<br>
438 : !> <li> This procedure is primarily relevant to MPI applications where MPI needs to be finalized before the simulation stop.<br>
439 : !> <li> This procedure performs a `sync all` for Coarray applications and returns.<br>
440 : !> <li> This procedure executes an `!$omp barrier` for OpenMP applications and performs nothing for all other (e.g., serial) applications.<br>
441 : !> </ol>
442 : !>
443 : !> \interface{setImageFinalized}
444 : !> \code{.F90}
445 : !>
446 : !> use pm_parallelism, only: setImageFinalized
447 : !>
448 : !> call setImageFinalized()
449 : !> !
450 : !> \endcode
451 : !>
452 : !> \warning
453 : !> The MPI library must be initialized and not finalized prior to calling this routine for MPI-parallel applications.
454 : !>
455 : !> \warning
456 : !> The MPI communications will be shut down upon calling this routine and further inter-process communications will be impossible.
457 : !>
458 : !> \test
459 : !> [test_pm_parallelism](@ref test_pm_parallelism)
460 : !>
461 : !> \finmain{setImageFinalized}
462 : !>
463 : !> \author
464 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
465 : subroutine setImageFinalized() ! LCOV_EXCL_LINE
466 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
467 : !DEC$ ATTRIBUTES DLLEXPORT :: setImageFinalized
468 : #endif
469 : #if CAF_ENABLED
470 : sync all
471 : #elif MPI_ENABLED
472 : use mpi !mpi_f08, only: mpi_comm_world, mpi_finalized, mpi_finalize, mpi_barrier
473 : implicit none
474 : integer :: ierrMPI
475 : logical :: isFinalized
476 : call mpi_finalized(isFinalized, ierrMPI)
477 : if (.not. isFinalized) then
478 : call mpi_barrier(mpi_comm_world, ierrMPI)
479 : call mpi_finalize(ierrMPI)
480 : end if
481 : #elif OMP_ENABLED
482 : !$omp barrier
483 : #endif
484 46 : end subroutine
485 :
486 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
487 :
488 : !> \brief
489 : !> Generate and return the ID of the current Coarray image / MPI process / OpenMP thread, all starting with `1`.<br>
490 : !>
491 : !> \return
492 : !> `imageID` : The output scalar `integer` of default kind \IK representing the ID of the current image/process/thread.<br>
493 : !> The output `imageID` is set to,
494 : !> <ol>
495 : !> <li> The value returned by `this_image()` in Coarray applications,
496 : !> <li> The value returned by `mpi_comm_rank() + 1` in MPI applications,
497 : !> <li> The value returned by `omp_get_thread_num() + 1` in OpenMP applications,
498 : !> <li> The value `1` in serial applications,
499 : !> </ol>
500 : !>
501 : !> \interface{getImageID}
502 : !> \code{.F90}
503 : !>
504 : !> use pm_parallelism, only: getImageID
505 : !> use pm_kind, only: IK
506 : !> integer(IK) :: id
507 : !>
508 : !> id = getImageID() ! The ID (starting at 1) of the current Coarray image, MPI process, or OpenMP thread.
509 : !> !
510 : !> \endcode
511 : !>
512 : !> \warning
513 : !> The MPI library must be initialized and not finalized prior to calling this routine for MPI-parallel applications.<br>
514 : !> Otherwise, the MPI library will automatically be initialized, which may fail if it has been already finalized.<br>
515 : !>
516 : !> \test
517 : !> [test_pm_parallelism](@ref test_pm_parallelism)
518 : !>
519 : !> \finmain{getImageID}
520 : !>
521 : !> \author
522 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
523 387364 : function getImageID() result(imageID)
524 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
525 : !DEC$ ATTRIBUTES DLLEXPORT :: getImageID
526 : #endif
527 : integer(IK) :: imageID
528 : #if CAF_ENABLED
529 : imageID = this_image()
530 : #elif MPI_ENABLED
531 : block
532 : use mpi !mpi_f08, only : mpi_initialized, mpi_comm_world, mpi_comm_size, mpi_init
533 : integer :: rank, ierrMPI
534 : logical :: isinit, isfinit
535 : call mpi_initialized(isinit, ierrMPI)
536 : if (.not. isinit) then
537 : call mpi_finalized(isfinit, ierrMPI)
538 : if (isfinit) error stop MODULE_NAME//"@getImageID(): Error occurred. A finalized MPI library cannot be reinitialized."
539 : call mpi_init(ierrMPI)
540 : end if
541 : call mpi_comm_rank(mpi_comm_world, rank, ierrMPI)
542 : if (ierrMPI /= 0) error stop "Failed to fetch the MPI process rank."
543 : imageID = int(rank, IK) + 1_IK
544 : end block
545 : #elif OMP_ENABLED
546 : block
547 : use omp_lib, only: omp_get_thread_num
548 : imageID = int(omp_get_thread_num() + 1, IK)
549 : end block
550 : #else
551 : imageID = 1_IK
552 : #endif
553 387364 : end function
554 :
555 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
556 :
557 : !> \brief
558 : !> Generate and return the number of available processes in the current parallel world communication.
559 : !>
560 : !> \return
561 : !> `imageCount` : The output scalar `integer` of default kind \IK representing the number of available processes in the current world communication.<br>
562 : !> The output `imageCount` is set to,<br>
563 : !> <ol>
564 : !> <li> the value returned by the Fortran intrinsic `num_images()` in Coarray parallel applications,
565 : !> <li> the value returned by `mpi_comm_size()` subroutine in MPI parallel applications,
566 : !> <li> the value returned by `omp_get_num_threads()` in OpenMP parallel applications,
567 : !> <li> the value `1` in serial applications,
568 : !> </ol>
569 : !>
570 : !> \warning
571 : !> For **OpenMP-enabled** ParaMonte library builds, this routine returns the value of the OpenMP intrinsic `omp_get_num_threads()` routine,
572 : !> that is, the number of threads in the team executing the parallel region to which the routine binds.<br>
573 : !> If this routine is called from the sequential part of an OpenMP-enabled program, it returns `1`.<br>
574 : !> To return a value larger than `1`, this routine must be called from within an OpenMP-enabled parallel region.<br>
575 : !> The desired number of OpenMP threads can be set at runtime via [setImageCount()](@ref pm_parallelism::setImageCount),
576 : !> or directly via the OpenMP intrinsic subroutine `omp_set_num_threads()`.<br>
577 : !>
578 : !> \interface{getImageCount}
579 : !> \code{.F90}
580 : !>
581 : !> use pm_kind, only: IK
582 : !> use pm_parallelism, only: getImageCount
583 : !> integer(IK) :: imageCount
584 : !>
585 : !> imageCount = getImageCount()
586 : !>
587 : !> \endcode
588 : !>
589 : !> \warning
590 : !> See the warnings associated with [getImageCountMPI](@ref pm_parallelism::getImageCountMPI).<br>
591 : !>
592 : !> \test
593 : !> [test_pm_parallelism](@ref test_pm_parallelism)
594 : !>
595 : !> \finmain{getImageCount}
596 : !>
597 : !> \author
598 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
599 60 : function getImageCount() result(imageCount)
600 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
601 : !DEC$ ATTRIBUTES DLLEXPORT :: getImageCount
602 : #endif
603 : integer(IK) :: imageCount
604 : #if CAF_ENABLED
605 : imageCount = num_images()
606 : #elif MPI_ENABLED
607 : imageCount = getImageCountMPI()
608 : #elif OMP_ENABLED
609 : imageCount = getImageCountOMP()
610 : #else
611 : imageCount = 1_IK
612 : #endif
613 60 : end function
614 :
615 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
616 :
617 : !> \brief
618 : !> Generate and return the number of available processes in the current **MPI-parallel** world communication.
619 : !>
620 : !> \return
621 : !> `imageCount` : The output scalar `integer` of default kind \IK representing the number of
622 : !> available processes in the current **MPI-parallel** world communication.<br>
623 : !> The output `imageCount` is set to,<br>
624 : !> <ol>
625 : !> <li> the value returned by `mpi_comm_size()` subroutine in MPI parallel applications,
626 : !> <li> the value `0` if the ParaMonte library build is not for MPI applications,
627 : !> </ol>
628 : !>
629 : !> \interface{getImageCountMPI}
630 : !> \code{.F90}
631 : !>
632 : !> use pm_kind, only: IK
633 : !> use pm_parallelism, only: getImageCountMPI
634 : !> integer(IK) :: imageCount
635 : !>
636 : !> imageCount = getImageCountMPI()
637 : !>
638 : !> \endcode
639 : !>
640 : !> \warning
641 : !> The MPI library must be initialized and not yet finalized prior to calling this routine in MPI-parallel applications.<br>
642 : !> Otherwise, the MPI library will be initialized automatically, which may fail if it has already been finalized.<br>
643 : !>
644 : !> \note
645 : !> The C-binding of this routine is required for the automatic detection of the use of the `mpiexec` launcher in dynamic programming languages.
646 : !>
647 : !> \test
648 : !> [test_pm_parallelism](@ref test_pm_parallelism)
649 : !>
650 : !> \finmain{getImageCountMPI}
651 : !>
652 : !> \author
653 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
654 0 : function getImageCountMPI() result(imageCount) bind(C, name = "getImageCountMPI")
655 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
656 : !DEC$ ATTRIBUTES DLLEXPORT :: getImageCountMPI
657 : #endif
658 : #if MPI_ENABLED
659 : use mpi !mpi_f08, only : mpi_initialized, mpi_comm_world, mpi_comm_size, mpi_init
660 : integer :: nproc
661 : integer :: ierrMPI
662 : logical :: isinit
663 : integer(IK) :: imageCount
664 : imageCount = 0_IK
665 : call mpi_initialized(isinit, ierrMPI)
666 : if (ierrMPI /= 0) return ! LCOV_EXCL_LINE
667 : if (.not. isinit) then
668 : call mpi_init(ierrMPI) ! LCOV_EXCL_LINE
669 : if (ierrMPI /= 0) return ! LCOV_EXCL_LINE
670 : end if
671 : call mpi_comm_size(mpi_comm_world, nproc, ierrMPI)
672 : if (ierrMPI /= 0) return ! LCOV_EXCL_LINE
673 : imageCount = int(nproc, IK)
674 : #else
675 : integer(IK) :: imageCount
676 : imageCount = 0_IK
677 : #endif
678 0 : end function
679 :
680 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
681 :
682 : !> \brief
683 : !> Generate and return the number of available processes in the current **OpenMP-parallel** world communication.
684 : !>
685 : !> \return
686 : !> `imageCount` : The output scalar `integer` of default kind \IK representing the number of
687 : !> available processes in the current **OpenMP-parallel** world communication.<br>
688 : !> The output `imageCount` is set to,<br>
689 : !> <ol>
690 : !> <li> the value returned by the OpenMP intrinsic `omp_get_num_threads()` in OpenMP parallel applications,
691 : !> <li> the value `0` if the ParaMonte library build is not for OpenMP applications,
692 : !> </ol>
693 : !>
694 : !> \interface{getImageCountOMP}
695 : !> \code{.F90}
696 : !>
697 : !> use pm_kind, only: IK
698 : !> use pm_parallelism, only: getImageCountOMP
699 : !> integer(IK) :: imageCount
700 : !>
701 : !> imageCount = getImageCountOMP()
702 : !>
703 : !> \endcode
704 : !>
705 : !> \test
706 : !> [test_pm_parallelism](@ref test_pm_parallelism)
707 : !>
708 : !> \finmain{getImageCountOMP}
709 : !>
710 : !> \author
711 : !> \AmirShahmoradi, Tuesday March 7, 2017, 4:13 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
712 0 : function getImageCountOMP() result(imageCount) bind(C, name = "getImageCountOMP")
713 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
714 : !DEC$ ATTRIBUTES DLLEXPORT :: getImageCountOMP
715 : #endif
716 : #if OMP_ENABLED
717 : use omp_lib, only: omp_get_num_threads
718 : integer(IK) :: imageCount
719 : imageCount = omp_get_num_threads()
720 : #else
721 : integer(IK) :: imageCount
722 : imageCount = 0_IK
723 : #endif
724 0 : end function
725 :
726 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
727 :
728 : !> \brief
729 : !> Set the number of parallel **threads** for an OpenMP-enabled application.
730 : !>
731 : !> \details
732 : !> This generic interface is exclusively relevant to the OpenMP-enabled ParaMonte library builds and applications.<br>
733 : !> This interface offers a convenient, consistent, cross-platform method of setting the number of runtime OpenMP threads for a parallel code section.<br>
734 : !> It does so by calling the OpenMP intrinsic subroutine `omp_set_num_threads()`.<br>
735 : !>
736 : !> \param[in] count : The input scalar of type `integer` of default kind \IK containing the requested number of OpenMP threads.<br>
737 : !> This argument is relevant only if the OpenMP parallelism is enabled at the time of building the ParaMonte library.<br>
738 : !> Unlike Coarray or MPI parallelism paradigms, the number of images (threads) can be (re)set at runtime in OpenMP applications, hence the need for this argument.<br>
739 : !> If the input `count` is **non-positive**, this generic interface uses the output of OpenMP intrinsic `omp_get_num_procs()` as the number of threads.<br>
740 : !> This is unlike the default behavior of Intel and GNU OpenMP libraries which set the number of threads to `max(count, 1)`.<br>
741 : !> (**optional**, default = `omp_get_num_procs()`)
742 : !>
743 : !> \devnote
744 : !> Although this procedure exists for all parallelism builds of the ParaMonte library, it does nothing in non-OpenMP-enabled library builds.<br>
745 : !> Note that the number of images/processes for Coarray/MPI-enabled applications must be set by the user at the time of launching the parallel application.<br>
746 : !> This is unlike the OpenMP parallel applications where the number of threads can be (re)set at runtime during the program execution.<br>
747 : !>
748 : !> \interface{setImageCount}
749 : !> \code{.F90}
750 : !>
751 : !> use pm_parallelism, only: setImageCount
752 : !>
753 : !> call setImageCount(count = count)
754 : !>
755 : !> \endcode
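: !> For example, the following hedged sketch (assuming an OpenMP-enabled build) illustrates the fallback behavior for non-positive or missing `count`:
: !> \code{.F90}
: !>
: !> use pm_kind, only: IK
: !> use pm_parallelism, only: setImageCount
: !>
: !> call setImageCount(count = 4_IK) ! subsequent parallel regions use four threads.
: !> call setImageCount(count = 0_IK) ! non-positive: fall back to `omp_get_num_procs()` threads.
: !> call setImageCount() ! missing: likewise fall back to `omp_get_num_procs()` threads.
: !>
: !> \endcode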
756 : !>
757 : !> \finmain{setImageCount}
758 : !>
759 : !> \author
760 : !> \AmirShahmoradi, Tuesday March 7, 2017, 3:50 AM, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin
761 0 : subroutine setImageCount(count)
762 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
763 : !DEC$ ATTRIBUTES DLLEXPORT :: setImageCount
764 : #endif
765 : integer(IK), intent(in), optional :: count
766 : #if OMP_ENABLED
767 : block
768 : use omp_lib, only: omp_get_num_procs, omp_set_num_threads
769 : integer :: count_def
770 : count_def = omp_get_num_procs()
771 : if (present(count)) then
772 : if (0_IK < count) count_def = int(count)
773 : end if
774 : call omp_set_num_threads(count_def)
775 : end block
776 : #endif
777 0 : end subroutine
778 :
779 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
780 :
781 : !> \brief
782 : !> Broadcast the error condition from all images/processes to all images/processes.<br>
783 : !>
784 : !> \details
785 : !> This procedure exists primarily to aid the detection and graceful
786 : !> handling of runtime errors during the testing of ParaMonte library samplers
787 : !> or, less frequently, when calling the samplers from dynamic programming languages or interactive
788 : !> environments such as Jupyter, which require graceful handling of any errors to avoid environment shutdown.<br>
789 : !> The use of this procedure is essential to avoid runtime parallel deadlocks.<br>
790 : !>
791 : !> \param[in] failed : The input scalar `logical` of default kind \LK that should be `.true.` if and only if a fatal error has occurred.<br>
792 : !> The value of `failed` is typically the `occurred` component of an object of type [err_type](@ref pm_err::err_type).<br>
793 : !> On input, the specified value for `failed` is broadcast to all other active processes in the program.<br>
794 : !>
795 : !> \return
796 : !> `failedParallelism` : The output scalar of type `logical` of default kind \LK that is `.true.` if and only if
797 : !> the input argument `failed` on **any** image/process in the current parallel simulation is `.true.`.<br>
798 : !>
799 : !> \interface{isFailedImage}
800 : !> \code{.F90}
801 : !>
802 : !> use pm_kind, only: LK
803 : !> use pm_parallelism, only: isFailedImage
804 : !> logical(LK) :: failedParallelism
805 : !> logical(LK) :: failed
806 : !>
807 : !> failedParallelism = isFailedImage(failed)
808 : !>
809 : !> \endcode
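: !> For example, a hedged sketch of the intended all-or-none usage pattern (with a hypothetical `err` object of type [err_type](@ref pm_err::err_type)):
: !> \code{.F90}
: !>
: !> use pm_err, only: err_type
: !> use pm_parallelism, only: isFailedImage
: !> type(err_type) :: err
: !>
: !> ! ... attempt some work on each image, setting `err%occurred` on failure ...
: !> if (isFailedImage(err%occurred)) then
: !>     ! every image takes the same exit path, avoiding a parallel deadlock.
: !>     return
: !> end if
: !>
: !> \endcode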
810 : !>
811 : !> \warning
812 : !> This function must be called in parallel by **all** images or by **none**.<br>
813 : !>
814 : !> \warning
815 : !> This function primarily exists for soft handling of fatal errors in parallel
816 : !> testing mode and should therefore ideally not be used in production builds.<br>
817 : !> If it is used repeatedly within an application, it can **significantly degrade** the parallel performance.<br>
818 : !>
819 : !> \impure
820 : !>
821 : !> \test
822 : !> [test_pm_parallelism](@ref test_pm_parallelism)
823 : !>
824 : !> \finmain{isFailedImage}
825 : !>
826 : !> \author
827 : !> \AmirShahmoradi, 9:49 PM Friday, March 1, 2013, Institute for Fusion Studies, The University of Texas at Austin
828 3 : function isFailedImage(failed) result(failedParallelism)
829 : #if __INTEL_COMPILER && DLL_ENABLED && (_WIN32 || _WIN64)
830 : !DEC$ ATTRIBUTES DLLEXPORT :: isFailedImage
831 : #endif
832 : logical(LK), intent(in) :: failed
833 : logical(LK) :: failedParallelism
834 : #if CAF_ENABLED
835 : logical(LK), allocatable, save :: failure(:)[:]
836 : integer :: iid, imageCount
837 : imageCount = num_images()
838 : if (.not. allocated(failure)) allocate(failure(imageCount)[*]) ! allocate only once: `failure` has the `save` attribute.
839 : failure(this_image()) = failed
840 : sync all
841 : do iid = 1, imageCount
842 : failedParallelism = failure(iid)[iid]
843 : if (failedParallelism) return
844 : end do
845 : #elif MPI_ENABLED
846 : logical, allocatable :: failure(:) ! This must be default kind.
847 : integer :: imageCount!, imageID ! This must be default kind.
848 : block
849 : use mpi !mpi_f08, only: mpi_comm_world, mpi_comm_rank, mpi_comm_size, mpi_allgather, mpi_logical
850 : use pm_arrayResize, only: setResized
851 : integer :: ierrMPI
852 : !call mpi_comm_rank(mpi_comm_world, imageID, ierrMPI)
853 : call mpi_comm_size(mpi_comm_world, imageCount, ierrMPI)
854 : call setResized(failure, int(imageCount, IK))
855 : call mpi_allgather ( logical(failed) & ! LCOV_EXCL_LINE : send buffer
856 : , 1 & ! LCOV_EXCL_LINE : send count
857 : , mpi_logical & ! LCOV_EXCL_LINE : send datatype
858 : , failure(:) & ! LCOV_EXCL_LINE : receive buffer
859 : , 1 & ! LCOV_EXCL_LINE : receive count
860 : , mpi_logical & ! LCOV_EXCL_LINE : receive datatype
861 : , mpi_comm_world & ! LCOV_EXCL_LINE : comm
862 : , ierrMPI & ! LCOV_EXCL_LINE : error code
863 : )
864 : !call mpi_alltoall ( err%occurred & ! buffer_send : The buffer containing the data that will be scattered to other processes.<br>
865 : ! , 1 & ! count_send : The number of elements that will be sent to each process.<br>
866 : ! , mpi_logical & ! datatype_send : The type of one send buffer element.<br>
867 : ! , failure & ! buffer_recv : The buffer in which store the gathered data.<br>
868 : ! , imageCount & ! count_recv : The number of elements in the message to receive per process, not the total number of elements to receive from all processes altogether.<br>
869 : ! , mpi_logical & ! datatype_recv : The type of one receive buffer element.<br>
870 : ! , mpi_comm_world & ! communicator : The communicator in which the all to all takes place.<br>
871 : ! , ierrMPI & ! LCOV_EXCL_LINE : error code
872 : ! )
873 : failedParallelism = logical(any(failure), LK)
874 : end block
875 : #elif OMP_ENABLED
876 : !$omp critical
877 : mv_failed = mv_failed .or. failed
878 : !$omp end critical
879 : !$omp barrier
880 : failedParallelism = mv_failed
: !$omp barrier ! ensure all threads have read `mv_failed` before the master thread resets it.
881 : !$omp master
882 : mv_failed = .false._LK
883 : !$omp end master
: !$omp barrier ! ensure the reset completes before any thread reenters this routine.
884 : #else
885 3 : failedParallelism = failed
886 : #endif
887 3 : end function
888 :
889 : !%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
890 :
891 : end module pm_parallelism ! LCOV_EXCL_LINE
|