The five different types of output files
Every successful ParaDRAM simulation generates (at least) 5 output files with the following suffixes,
_progress.txt
_report.txt
_sample.txt
_chain.txt or _chain.bin (for output chain files in binary format)
_restart.txt or _restart.bin (for output restart files in binary format)
Each output file name is prefixed with the filename that the user provides via the simulation specification variable outputFileName, as described here. If this variable is missing from the input specifications, the ParaDRAM routine automatically generates a random file name with a format similar to the following,
ParaDRAM_run_20200312_060333_408_process_1_progress.txt
ParaDRAM_run_20200312_060333_408_process_1_restart.txt
ParaDRAM_run_20200312_060333_408_process_1_report.txt
ParaDRAM_run_20200312_060333_408_process_1_sample.txt
ParaDRAM_run_20200312_060333_408_process_1_chain.txt
where _process_1 in the filenames indicates that these files were generated by the first processor in the simulation, whether the simulation is parallel or serial. See this page for a complete description of the pattern of the randomly-generated filename.
When the simulation is run in parallel with parallelizationModel = "multiChain", each leader process (that is, every process in the multiChain parallel simulation) generates its own set of 5 output files. Each set of five files contains information about the unique simulation performed by the corresponding process.
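As a rough illustration of the naming pattern described above, the following Python sketch groups a list of file names by output-file type for a given prefix. The helper name `group_output_files` is hypothetical and not part of ParaMonte; only the suffixes come from the list above, and the example prefix is copied from the sample file names.

```python
import os

# Suffixes listed above; only chain and restart files may be binary.
SUFFIXES = {
    "progress": ("_progress.txt",),
    "report": ("_report.txt",),
    "sample": ("_sample.txt",),
    "chain": ("_chain.txt", "_chain.bin"),
    "restart": ("_restart.txt", "_restart.bin"),
}


def group_output_files(filenames, prefix):
    """Group file names that start with `prefix` by output-file type.

    Hypothetical helper for illustration; not part of ParaMonte.
    """
    groups = {kind: [] for kind in SUFFIXES}
    for name in sorted(filenames):
        if not name.startswith(prefix):
            continue
        for kind, endings in SUFFIXES.items():
            if name.endswith(endings):
                groups[kind].append(name)
    return groups


if __name__ == "__main__":
    # Prefix taken from the example file names above.
    prefix = "ParaDRAM_run_20200312_060333_408"
    for kind, names in group_output_files(os.listdir("."), prefix).items():
        print(kind, names)
```

In a multiChain run, each list would be expected to hold one entry per process, since every process writes its own set of files.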
The output report file
Every ParaDRAM simulation generates an output report file whose name ends with _report.txt. This file contains all the details about the simulation setup, in the following order,
- the ParaDRAM banner and version specifications,
- the specifications of the processor on which the current simulation is being performed,
- the specifications of the current ParaDRAM simulation being performed along with their values and descriptions,
- the relevant details about the simulation timing and performance, if the simulation finishes successfully, along with,
- the statistics of the simulation results, and,
- the final message: “Mission Accomplished.” indicating the successful ending of the simulation.
Tip: This "Mission Accomplished." message is the last piece of information to appear in a report file. If you do not see this message, your simulation did not end properly; most likely, it was interrupted at runtime. This often happens when you schedule your simulation on a supercomputer with a limited preallocated time and the simulation takes longer than the allotted time. In such cases, you can seamlessly restart your simulation to continue from where it left off.
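The check described in the tip can be automated. The following is a minimal Python sketch that assumes the final message appears verbatim somewhere in the report file; the function names are hypothetical and the search is deliberately loose, because the exact layout of the closing lines is not specified here.

```python
from pathlib import Path

# Final message quoted in the list above.
FINAL_MESSAGE = "Mission Accomplished."


def report_is_complete(report_text):
    """Return True if the final message occurs anywhere in the report text."""
    return FINAL_MESSAGE in report_text


def simulation_finished(report_path):
    """Return True only if the report file exists and holds the final message.

    Hypothetical helper for illustration; not part of ParaMonte.
    """
    path = Path(report_path)
    return path.is_file() and report_is_complete(path.read_text(errors="replace"))
```

A job script could call `simulation_finished(...)` on the expected report file and resubmit the run as a restart when it returns False.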
The output sample file
Every ParaDRAM simulation generates an output sample file whose name ends with _sample.txt. This is the final gem produced by the ParaDRAM sampler routine: a refined, decorrelated, independent and identically-distributed (i.i.d.) set of random states (points) sampled from the user-provided mathematical objective function. Each row of this file contains only two pieces of information,
- SampleLogFunc – the value of the user-provided mathematical objective function at the sampled state on that row,
- the sampled state – a row-wise vector of values that represents the state that has been sampled. The variable names corresponding to each of these state values can be specified via the input specification attribute variableNameList as described here.
The number of states in the sample file is determined by the input specification sampleSize according to the rules described here.
Here is an example snippet from the contents of a typical ParaDRAM sample file,
SampleLogFunc,SampleVariable1,SampleVariable2,SampleVariable3
-6653.7715446294633,118.66335229170707,5.8564153719573948,119.27086100863379
-6659.0698939324593,118.64302952747666,5.7628529654206142,119.26241800320899
-6653.3444041667954,119.38362793323084,6.0383527991454482,120.12283908646494
-6653.0709752975054,118.60160131367854,5.8469733471466157,119.23856706244642
...
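Because the sample file holds i.i.d. states, simple unweighted statistics can be computed directly from its columns. The following sketch uses only the Python standard library and assumes a comma-delimited file with a header row, as in the snippet above; the function name `column_means` is hypothetical.

```python
import csv
import io


def column_means(sample_text):
    """Return {column name: mean} for comma-separated sample-file text.

    Assumes a header row followed by purely numeric rows.
    """
    reader = csv.DictReader(io.StringIO(sample_text))
    fieldnames = reader.fieldnames or []
    sums = {name: 0.0 for name in fieldnames}
    count = 0
    for row in reader:
        for name in fieldnames:
            sums[name] += float(row[name])
        count += 1
    if count == 0:
        raise ValueError("no sample rows found")
    return {name: total / count for name, total in sums.items()}


# The first two rows of the snippet above, used as inline test data.
SNIPPET = """SampleLogFunc,SampleVariable1,SampleVariable2,SampleVariable3
-6653.7715446294633,118.66335229170707,5.8564153719573948,119.27086100863379
-6659.0698939324593,118.64302952747666,5.7628529654206142,119.26241800320899
"""

if __name__ == "__main__":
    for name, mean in column_means(SNIPPET).items():
        print(name, mean)
```

For a real run, the text could be read from the file whose name ends with _sample.txt and passed to `column_means`.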
The output progress file
Every ParaDRAM simulation generates an output progress file whose name ends with _progress.txt. This file contains realtime information about the runtime progress of the simulation, including,
- information about the number of calls the ParaDRAM sampler makes to the user-provided mathematical objective function,
- information about the overall efficiency of the ParaDRAM sampler,
- information about the dynamic efficiency of the sampler over the past progressReportPeriod number of calls to the mathematical objective function,
- information about the timing of the simulation, including,
  - the time spent since the start of the simulation,
  - the time since the last progress report,
  - the estimated time remaining to finish the simulation.
Here is a snippet from the contents of a typical progress report file in Comma-Separated-Values (CSV) format,
NumFuncCallTotal,NumFuncCallAccepted,MeanAcceptanceRateSinceStart,MeanAcceptanceRateSinceLastReport,TimeElapsedSinceLastReportInSeconds,TimeElapsedSinceStartInSeconds,TimeRemainedToFinishInSeconds
1000,136,.13371459319922982,.13371459319922982,9.3792600631713867,9.3792600631713867,6887.1354922687306
2000,197,.98254345562296549E-01,.62794097925363279E-01,7.6144800186157227,16.993740081787109,8609.2702608253749
3000,240,.80171526534224075E-01,.44005888478079132E-01,8.3076248168945312,25.301364898681641,10516.934009552002
4000,272,.69136158920019106E-01,.36030056077404171E-01,6.8951001167297363,32.196465015411377,11804.739202415241
...
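A running simulation can be monitored by reading the most recent row of the progress file. The following is a minimal Python sketch that assumes the comma-delimited layout and column names shown in the snippet above; the function names are hypothetical. Python's `float` accepts the leading-dot and exponent notation used in the file (for example, `.98254345562296549E-01`).

```python
import csv
import io


def latest_progress(progress_text):
    """Return the last row of progress-file text as {column name: float}."""
    rows = list(csv.DictReader(io.StringIO(progress_text)))
    if not rows:
        raise ValueError("no progress rows found")
    return {name: float(value) for name, value in rows[-1].items()}


def summarize(progress_text):
    """Build a one-line, human-readable status from the latest progress row."""
    last = latest_progress(progress_text)
    hours_left = last["TimeRemainedToFinishInSeconds"] / 3600
    return (
        f"{int(last['NumFuncCallTotal'])} calls, "
        f"acceptance rate {last['MeanAcceptanceRateSinceStart']:.3f}, "
        f"about {hours_left:.1f} h remaining"
    )
```

Since the file is updated while the simulation runs, the last row may be incomplete at the moment it is read; a more defensive reader would skip rows with missing fields.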
The output restart file
Every ParaDRAM simulation generates an output restart file whose name ends with either _restart.bin (the default suffix) or _restart.txt. When the file extension is .bin, the contents of the file are written in binary format. Although the binary format is NOT human-readable, it has several advantages over the human-readable ASCII-formatted file suffixed by _restart.txt. The format of the output restart file can be specified via the input specification variable restartFileFormat described here.
The output restart file contains all the information needed to restart a simulation should a runtime interruption happen.
To make such a restart possible, make sure to set the random seed of the ParaMonte sampler's random number generator (randomSeed) before the simulation, and fix the prefix of the output file names via the simulation specification outputFileName.
The output chain file
Every ParaDRAM simulation generates an output chain file whose name ends with either _chain.txt (the default suffix) or _chain.bin. When the file extension is .bin, the file has been generated in binary format and is unreadable to humans. The output chain file’s format can be specified via the input specification variable chainFileFormat described here.
The output chain file contains information about all of the useful, non-rejected calls that the ParaDRAM routine makes to the user-provided mathematical objective function during the simulation, including,
- ProcessID – the ID of the processor that has successfully sampled the current state (point) from the user-provided mathematical objective function,
  Note: This information is only relevant for parallel simulations. However, to keep the contents of the chain files consistent with each other, this information is also reported in the output chain files of serial simulations.
- DelayedRejectionStage – the delayed-rejection stage at which the newly sampled state has been accepted,
- MeanAcceptanceRate – the mean acceptance rate of the sampler up to the newly-sampled state at a given row,
- AdaptationMeasure – the amount of adaptation performed on the sampler’s proposal distribution, which is a number between zero and one, with one indicating extreme adaptation of the proposal distribution at that stage in the simulation and zero indicating no adaptation at all since the last sampled state,
  Tip: A well-behaving ParaDRAM simulation should exhibit an adaptation measure that starts at large values close to one and decays fast (within a few hundreds or thousands of newly-sampled points) to tiny values ($\ll 0.01$) comparable to the precision of a 32-bit real/float and less ($\lessapprox 10^{-3}$). Ideally, this continuous decrease should resemble a power-law decay.
- BurninLocation – the runtime estimate of the number of sampled states (from the beginning of the simulation) that are potentially non-useful and must be discarded as the burnin period,
- SampleWeight – the number of times each newly-sampled point is repeated in the Markov chain before the next candidate state is accepted,
  Tip: The value represented by SampleWeight is, by definition, always an integer $\ge 1$.
- SampleLogFunc – the value of the user-provided mathematical objective function at the currently-sampled state,
- followed by a row-wise vector of values that represent the current state that has been sampled. The variable names corresponding to each of these state values can be specified via the input specification attribute variableNameList as described here.
Here is an example snippet from the contents of a typical ParaDRAM chain file,
ProcessID,DelayedRejectionStage,MeanAcceptanceRate,AdaptationMeasure,BurninLocation,SampleWeight,SampleLogFunc,SampleVariable1,SampleVariable2,SampleVariable3
1,0,.88253409849043021E-01,.0000000000000000,1,18,-6661.3317058445919,118.01753854159099,5.5487917511652096,118.80997505960300
3,0,.80642984387170699E-01,.0000000000000000,1,11,-6662.0952096861010,117.79618249468358,5.4936144718729913,118.57124540688957
11,0,.61746061174130930E-01,.0000000000000000,1,21,-6663.8532414945439,118.18713338111942,5.5828295535451078,118.97803516340616
6,0,.81607474204830915E-01,.0000000000000000,1,5,-6663.4173242973156,118.24855055054523,5.6551168673789398,119.04257546168272
...
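Since each row of the chain file stands for SampleWeight repetitions of the same state in the Markov chain, averages over the chain should weight each row accordingly. The following is a minimal Python sketch that assumes the comma-delimited layout and header names shown in the snippet above. The function name is hypothetical, and the sketch neither discards the burnin period nor reproduces the refinement that ParaDRAM applies to build the sample file.

```python
import csv
import io

# Columns that precede the state values in the chain-file header shown above.
META_COLUMNS = (
    "ProcessID",
    "DelayedRejectionStage",
    "MeanAcceptanceRate",
    "AdaptationMeasure",
    "BurninLocation",
    "SampleWeight",
    "SampleLogFunc",
)


def weighted_state_means(chain_text):
    """Return {variable name: SampleWeight-weighted mean} for chain-file text."""
    reader = csv.DictReader(io.StringIO(chain_text))
    variables = [n for n in (reader.fieldnames or []) if n not in META_COLUMNS]
    totals = {name: 0.0 for name in variables}
    weight_sum = 0
    for row in reader:
        weight = int(row["SampleWeight"])  # always an integer >= 1
        weight_sum += weight
        for name in variables:
            totals[name] += weight * float(row[name])
    if weight_sum == 0:
        raise ValueError("no chain rows found")
    return {name: total / weight_sum for name, total in totals.items()}
```

Weighting by SampleWeight gives the same result as first expanding every row into its repeated copies and then taking a plain average, without building the expanded chain in memory.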