When is the input file needed?
The input files to the ParaMonte library sampler routines are optional. If not provided, all simulation specifications will be set to appropriate default values or the ParaMonte samplers’ best guess for the proper values. However, if the user wants to fine-tune the specifics of a sampling simulation, then, depending on the programming language interface to the ParaMonte library, the input file may or may not be the sole method of setting up the simulation specifications:
- The C/C++ programming language: Providing the path to an input external file is the sole method of simulation setup by the user.
- The Fortran programming language: Providing the path to an input external file is the preferred method of simulation setup by the user. However, the input file is not the sole method of specifying the simulation setup. See the ParaMonte Fortran documentation of the generic interface `getErrSampling` for more information about the alternative, less flexible method.
- The MATLAB programming language: Providing the path to an input external file is NOT the preferred method of simulation setup by the user. This is because the ParaMonte MATLAB library already has a much more flexible dynamic method of simulation specifications setup from within the MATLAB programming language. Nevertheless, it is equally possible to specify everything from within an input simulation specifications file.
Warning: When you provide the path to an input file, all other simulation specifications made within the MATLAB programming environment will be ignored in favor of the corresponding values in the specified external input file, which may or may not be present.
- The Python programming language: Providing the path to an input external file is NOT the preferred method of simulation setup by the user. This is because the ParaMonte Python library already has a much more flexible dynamic method of simulation specifications setup from within the Python programming language. Nevertheless, it is equally possible to specify everything from within an input simulation specifications file.
Warning: When you provide the path to an input file, all other simulation specifications made within the Python programming environment will be ignored in favor of the corresponding values in the specified external input file, which may or may not be present.
The structure of the optional input file
Here is a summary of useful guidelines and rules for writing ParaMonte sampler input files.
Organization
- The input file structure for all ParaMonte samplers is the same in all programming languages.
- The simulation specifications for each ParaMonte sampler (e.g., the ParaDRAM MCMC sampler) must be grouped under the ParaMonte routine’s name. We call each group a `namelist`.
- Each group, corresponding to one ParaMonte sampler routine, is identified by a group name preceded by an `&` and ending with a forward slash `/` (see below for an example input file).
- Multiple namelist groups can coexist within a single input file. Only the ones relevant to the simulation of interest will be read and used. The rest will be ignored.
- Comments are allowed anywhere inside the input file.
- Comments must begin with an exclamation mark (`!`).
- Comments can appear on an empty line or after a value assignment.
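The organization rules above can be illustrated with a small, hedged Python sketch. It is not part of the ParaMonte library and does not reflect how ParaMonte reads input files; it only shows how `!` comments and `&name` group headers could be recognized in the text of an input file. The helper names and the second group name `otherSampler` are hypothetical, and the per-line quote tracking assumes that string values do not span multiple lines.

```python
import re

def strip_comment(line):
    """Drop a trailing `!` comment, ignoring any `!` inside a quoted string."""
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(line):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "!":
            return line[:i].rstrip()
    return line.rstrip()

def group_names(text):
    """Return the lower-cased names of all `&name` group headers in the text."""
    names = []
    for line in text.splitlines():
        code = strip_comment(line).strip()
        match = re.match(r"&(\w+)", code)
        if match:
            names.append(match.group(1).lower())
    return names

# Illustrative file contents: two groups coexisting in one input file.
example = """\
! two namelist groups in one file
&ParaDRAM
    outputFileName = "./out/mvn!"  ! the `!` inside the string is kept
/
&otherSampler
/
"""

print(group_names(example))
print(strip_comment('outputSeparator = ","  ! a trailing comment'))
```

A program interested only in the ParaDRAM group would look for `paradram` in the returned list and skip the rest, mirroring the rule that irrelevant groups are ignored.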
Variables
- All variable assignments are optional and can be dropped or commented out. In such cases, the ParaMonte routines will assign appropriate default values to the missing variables in the input file.
- Variables within a namelist group can be separated from each other by comma or whitespace characters.
- The order in which the variables appear within a namelist group is irrelevant.
- Variables can be defined multiple times, but only the last definition will be considered as input.
- All variable names are case-insensitive. However, for clarity, the ParaMonte library follows the camelCase code-writing practice.
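As a rough illustration of the last two rules (case-insensitive names, last definition wins), here is a minimal Python sketch. It is not ParaMonte code; the function `read_assignments` is a hypothetical helper that handles only simple one-line `name = value` assignments.

```python
def read_assignments(lines):
    """Collect `name = value` pairs; names are case-insensitive, last one wins."""
    specs = {}
    for line in lines:
        name, sep, value = line.partition("=")
        if sep:  # skip lines that contain no assignment
            specs[name.strip().lower()] = value.strip()
    return specs

specs = read_assignments([
    "outputChainSize = 10000",
    "proposalAdaptationPeriod = 35",
    "OUTPUTCHAINSIZE = 20000",  # same variable in a different letter case
])
print(specs)
```

Lower-casing the names before storing them is what makes the second `outputChainSize` assignment overwrite the first one.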
Values
Like variables, values within a namelist group can be separated by either a comma or whitespace characters.
- Strings
  - String values must be enclosed in single or double quotation marks: `''` or `" "`. String values can be continued on multiple lines; however, any additional whitespace characters caused by the line continuation will NOT be ignored.
- Logical (Boolean)
  - Logical values are all case-insensitive and can be either `.true.`, `true`, or `t` for a TRUE value or `.false.`, `false`, or `f` for a FALSE value.
- Real (Float)
  - Real values are, by default, double-precision in the MATLAB and Python programming languages. But they can be `single`, `double`, or `quad` precision within the C and C++ programming languages and any precision supported by the processor within the Fortran programming language.
  - The double precision can hold up to `16` digits of precision and represent numbers as large as $\approx 10^{307}$ and as tiny as $\approx 10^{-307}$.
Tip: To keep the representation of numbers accurate up to 16 digits, consider using the letter `d` instead of `e` for the scientific representation of numbers. The letter `d` stands for double precision (`64-bit` real/float). For example, the value `1.d0` is guaranteed to be `1.0000000000000000` in the simulation, whereas `1.e0` is only guaranteed to have single-precision accuracy. However, most compilers will represent this as `1.0000000000000000` upon conversion to a full-precision value. But in general, it won’t hurt to use `d` in place of `e` for the scientific representation of double-precision values only within the input files (outside the input files, the number-representation rules of the specific programming language of your choice, in which you are coding your objective function, must be followed).
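The logical and real value forms described above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions, not the ParaMonte reader: the helper names are hypothetical, the accepted logical tokens are limited to the six spellings listed above, and the `d` exponent is handled by rewriting it as `e`, since Python's `float` does not understand `d`.

```python
TRUE_TOKENS = {".true.", "true", "t"}
FALSE_TOKENS = {".false.", "false", "f"}

def parse_logical(token):
    """Map a case-insensitive logical token from an input file to a Python bool."""
    key = token.strip().lower()
    if key in TRUE_TOKENS:
        return True
    if key in FALSE_TOKENS:
        return False
    raise ValueError(f"not a recognized logical value: {token!r}")

def parse_real(token):
    """Convert a real value such as `8.d1` or `5.6e7` to a Python float."""
    # Python floats are double precision, so `d` and `e` exponents map to the same type.
    return float(token.strip().lower().replace("d", "e"))

print(parse_logical(".TRUE."), parse_logical("f"))
print(parse_real("1.d0"), parse_real("8.d1"), parse_real("-3.d100"))
```

Because Python has a single double-precision float type, the distinction between `1.e0` and `1.d0` discussed in the tip disappears after conversion; it matters only to the routine that reads the input file.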
Vectors
- All vectors and arrays that are specified inside the input file begin with index `1`. This follows the convention of the majority of science-oriented programming languages and libraries including but not limited to Fortran, Julia, Mathematica, MATLAB, R, and LAPACK.
Important: Remember that this 1-based indexing rule applies only to variable assignments made inside the external input file. Any specification variable assigned from within a programming language environment follows the rules of that language. For example, all variables specified from within C/C++/Python follow the zero-based indexing rules of these languages.
- Vectors (and arrays) of strings, integers, or real numbers can be specified as comma-separated or space-separated values. For example,
    ! real-valued vector of length 5, specified as the starting point of an MCMC simulation
    proposalStart = 1.0, -100 3, 5.6e7 8.d1
You may have noticed above that some values are comma-separated while others are space-separated, which is a valid syntax.
- Vector (and array) values may be specified separately on multiple lines and in any order, like the following,
    ! a vector of strings specifying the names of the variables that are going to be sampled in the simulation,
    ! each corresponds to one dimension of the objective function.
    domainAxisName(2) = "secondVariable"
    domainAxisName(1) = "FirstVariable"
    domainAxisName(3:4) = "ThirdVariable", "FourthVariable"
- Vector values may be selectively provided in the input file, and some values may be missing. For example,
    ! a vector of length 4 specifying the random walker's step sizes along different dimensions of the objective function in a ParaDRAM simulation.
    proposalStd(3) = 3.0
    proposalStd(1:2) = 1.0, 2.0
or,
proposalStd = 1.0, 2.0, 3.0 ! This is identical to the above representation
Notice that the missing fourth element will not be read from the input file. Instead, the ParaMonte routines will assign it a default value.
- Similar values in a vector that appear sequentially can be represented in abbreviated format via a repetition pattern rule involving `*`. For example,
    ! vector of length 4, specifying the lower limits of the domain of the objective function along each dimension
    domainCubeLimitLower = -3.d100, 2*-20.0, -100
is equivalent to,
domainCubeLimitLower = -3.d100, -20.0, -20.0, -100
or,
domainCubeLimitLower = 3*-3.d100
is equivalent to,
domainCubeLimitLower = -3.d100, -3.d100, -3.d100, ! notice the fourth value is missing
In the latter example, only the first three values were provided. In such cases, the missing elements will be assigned appropriate default values.
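The separator and repetition rules for vectors can be sketched in Python as follows. This is an illustrative helper (the name `expand_values` is hypothetical), not the way ParaMonte parses its input; it assumes plain numeric tokens with no quoted strings containing blanks or commas.

```python
import re

def expand_values(text):
    """Split on commas or whitespace and expand `n*value` repetition patterns."""
    values = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue  # skip the empty token left by a trailing comma
        count, star, value = token.partition("*")
        if star:
            values.extend([value] * int(count))
        else:
            values.append(token)
    return values

print(expand_values("-3.d100, 2*-20.0, -100"))
print(expand_values("1.0, -100 3, 5.6e7 8.d1"))  # mixed commas and blanks
print(expand_values("3*-3.d100"))                # only three values are provided
```

The last call returns three values, so a length-4 vector would still have its fourth element unset, which is the situation in which a default value is assigned.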
Arrays
- The Array representation rules are identical to the vectors described in the previous section. For example, the following array value assignments are all equivalent,
    ! a symmetric matrix of size 4-by-4 of 64-bit real numbers representing the initial covariance matrix of the ParaDRAM sampler
    proposalCov = 1.0, 0.0, 0.0, 0.0,
                  0.0, 1.0, 0.0, 0.0,
                  0.0, 0.0, 1.0, 0.0,
                  0.0, 0.0, 0.0, 1.0,
or,
    proposalCov(:,1) = 1.0, 0.0, 0.0, 0.0,
    proposalCov(:,2) = 0.0, 1.0, 0.0, 0.0,
    proposalCov(:,3) = 0.0, 0.0, 1.0, 0.0,
    proposalCov(:,4) = 0.0, 0.0, 0.0, 1.0,
or,
proposalCov(1:4,1:4) = 1.0, 4*0.0, 1.0, 4*0.0, 1.0, 4*0.0, 1.0
Warning: When dealing with multidimensional arrays, keep in mind that the ParaMonte routines read and store array elements in a column-major order from the input file. This is why the columns are represented by `:` in the second matrix representation in the example above (instead of rows). Matrix rows are NOT stored sequentially in the computer memory. For symmetric positive-definite matrices (like covariance or correlation matrices), this convention has no effect (as is the case in the example above).
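To make the column-major convention concrete, the following Python sketch fills a small non-symmetric matrix from a flat list of values in the order they would be listed in an input file. The function name is hypothetical and the code is only an illustration of the ordering, not ParaMonte's reader. Note that Python indices are zero-based, unlike the 1-based indices used inside the input file.

```python
def fill_column_major(flat, nrow, ncol):
    """Place flat[k] at row k % nrow and column k // nrow (column-major order)."""
    matrix = [[None] * ncol for _ in range(nrow)]
    for k, value in enumerate(flat):
        matrix[k % nrow][k // nrow] = value
    return matrix

# Six values in the order they would be written in the input file.
flat = [1, 2, 3, 4, 5, 6]
for row in fill_column_major(flat, nrow=2, ncol=3):
    print(row)
```

The first two values end up in the first column rather than the first row, which is why the order of values matters for any matrix that is not symmetric.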
Example contents of a ParaDRAM simulation input file
The following box shows an example input specifications file for a ParaDRAM simulation of an objective function defined on a 4-dimensional domain (notice the group name `&ParaDRAM` at the beginning and `/` at the end). Notice the ample usage of the comment symbol wherever the user deems it appropriate,
! DESCRIPTION:
!
! The external input file for sampling the 4-dimensional Multivariate Normal distribution function as implemented in the accompanying source files.
! This file is common between all supported programming language environments.
!
! NOTE:
!
! All simulation specifications (including this whole file) are optional and can be nearly safely commented out.
! However, if domain boundaries are finite, set them explicitly.
!
! USAGE:
!
! -- Comments must begin with an exclamation mark `!`.
! -- Comments can appear anywhere on an empty line or, after a variable assignment
! (but not in the middle of a variable assignment whether in single or multiple lines).
! -- All variable assignments are optional and can be commented out. In such cases, appropriate default values will be assigned.
! -- Use ParaDRAM namelist (group) name to group a set of ParaDRAM simulation specification variables.
! -- The order of the input variables in the namelist groups is irrelevant and unimportant.
! -- Variables can be defined multiple times, but only the last definition will be considered as input.
! -- All variable names are case insensitive. However, for clarity, this software follows the camelCase code-writing practice.
! -- String values must be enclosed with either single or double quotation marks.
! -- Logical values are case-insensitive and can be either .true., true, or t for a TRUE value, and .false., false, or f for a FALSE value.
! -- All vectors and arrays in the input file begin with index 1. This is following the convention of
! the majority of science-oriented programming languages: Fortran, Julia, Mathematica, MATLAB, and R.
!
! For comprehensive guidelines on the input file organization and rules, visit:
!
! https://www.cdslab.org/paramonte/generic/latest/usage/sampling/paradram/input/
!
! To see detailed descriptions of each of the variables, visit:
!
! https://www.cdslab.org/paramonte/generic/latest/usage/sampling/paradram/specifications/
!
&paradram
! Base specifications.
description = "
This\n
is a\n
multi-line\n
description.\\n" ! strings must be enclosed with "" or '' and can be continued on multiple lines.
! No comments within strings are allowed.
domain = "cube"
domainAxisName = "variable1"
"variable2" ! values can appear in multiple lines.
domainBallAvg = 0 0 0 0 ! values can be separated with blanks or commas.
domainBallCor = 1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
domainBallCov = 1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
domainBallStd = 1 1 1 1
domainCubeLimitLower = 4*-1.e10 ! repetition pattern rules apply here. 4 dimensions => 4-element vector of values.
domainCubeLimitUpper(1) = +1.e10 ! Elements of vectors can be set individually.
domainCubeLimitUpper(2:4) = 3*+1.e10 ! Elements of vectors can be set individually.
domainErrCount = 100
domainErrCountMax = 1000
inputFileHasPriority = FALSE ! This is relevant only to simulations within the Fortran programming language.
outputChainFileFormat = "compact"
!outputColumnWidth = 25 ! This is an example of a variable that is commented out.
! Therefore, its value will not be read by the sampler routine.
! To pass it to the routine, simply remove the `!` mark at the beginning of the line.
outputFileName = "./out/mvn" ! A forward-slash character at the end of the string value would indicate the specified path
! is to be interpreted as the name of the folder to contain the simulation output files.
! The base name for the simulation output files will be generated from the current date and time.
! Otherwise, the specified base name at the end of the string will be used in naming the simulation output files.
outputPrecision = 17
outputReportPeriod = 1000
outputRestartFileFormat = "ascii"
outputSampleSize = -1
outputSeparator = ","
outputSplashMode = "normal" ! or quiet or silent.
outputStatus = "retry" ! or extend.
parallelism = "multi chain" ! "singleChain" would also work. Similarly, "multichain", "multi chain", or "multiChain".
parallelismMpiFinalizeEnabled = false ! TRUE, true, .true., .t., and t would be also all valid logical values representing truth.
parallelismNumThread = 3 ! number of threads to use in shared-memory parallelism.
!randomSeed = 2136275,
!targetAcceptanceRate = 0.23e0
! MCMC specifications.
outputChainSize = 10000
outputSampleRefinementCount = 10
outputSampleRefinementMethod = "BatchMeans"
proposal = "normal" ! or "uniform" as you wish.
proposalCor(:, 1) = 1 0 0 0 ! first matrix column.
proposalCor(:, 2:4) = 0 1 0 0
0 0 1 0
0 0 0 1 ! other matrix columns.
proposalCov = 1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1 ! or specify all matrix elements in one statement.
proposalScale = "2*0.5*Gelman" ! The asterisk here means multiplication since it is enclosed within quotation marks.
!proposalStart = 4*1.e0 ! four values of 1.e0 are specified here by the repetition pattern symbol *
proposalStartDomainCubeLimitLower = 4*-10.e0 ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
proposalStartDomainCubeLimitUpper = 4*+10.e0 ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
proposalStartRandomized = false
proposalStd = 4*1.0 ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
! DRAM specifications.
proposalAdaptationBurnin = 1.
proposalAdaptationCount = 10000000
proposalAdaptationCountGreedy = 0
proposalAdaptationPeriod = 35
proposalDelayedRejectionCount = 5
proposalDelayedRejectionScale = 4*1., 2. ! The first four elements are 1, followed by 2.
/
- The simulation specifications of any specific ParaMonte routine (e.g., the ParaDRAM sampler) are identical across all supported programming languages.
- When specified from within an input file, all variable names are case-insensitive. However, when specified from within a programming language, all variable names are treated based on the rules of that language. For example, all routines and variable names are case-insensitive within the Fortran programming language. At the same time, the camelCase convention strictly holds for all names defined in Python and C/C++ calls to the ParaMonte library.
- The best and the most up-to-date method of learning about the simulation specifications of a ParaMonte routine of choice is to run a simple toy simulation problem and then look at the contents of the output `*_report.txt` file generated by the ParaMonte routine.
Why is input-file the preferred method of simulation setup?
Specifying the properties of a ParaMonte simulation via an external input file is particularly beneficial when the ParaMonte library routines are called from within compiled languages (e.g., C/C++/Fortran). The reasons might be already clear to advanced programmers:
- Specifying the simulation properties in an external input file ensures your simulation’s highest level of flexibility and portability by avoiding the hardcoding of simulation specifications into your compiled code. Imagine you specify a simulation property inside your code, compile and run it, and then realize that you want to change that property value to something else. Without an external input file, you would have to recompile your code every time for every property change.
- Also, the same specification input file can be used to set up the same simulation settings from any programming language without a single line of change in the input file. The contents of the input files are programming-language-agnostic.
- All variable names are case-insensitive across all programming languages when specified from the input file.
- The order by which the simulation specification variables appear in the input file is irrelevant.
- Multiple simulation namelist groups, each corresponding to an independent call to a different ParaMonte routine, can be placed within a single input file, resulting in a cleaner, more portable organization of the input data for the given simulation problem.