Every ParaDRAM simulation generates an output restart file whose name ends with either _restart.bin (the default suffix) or _restart.txt. When the file extension is .bin, the file’s contents are written in binary format. Although the binary format is not human-readable, it has several advantages over the human-readable ASCII-formatted file suffixed by _restart.txt. The format of the output restart file can be specified via the input specification variable outputRestartFileFormat, described here.
The output restart file contains all the information needed to restart a simulation should a runtime interruption happen.
For a restart to be possible, make sure to set the random seed of the ParaMonte sampler's random number generator (randomSeed) before the simulation, and fix the prefix of the output file names via the simulation specification outputFileName.
How to restart an interrupted ParaDRAM simulation?
Short answer
Simply rerun the simulation with the same configuration (i.e., the same input specifications). Most importantly, ensure that the input variable outputFileName is set to the same filename prefix with which the original simulation’s output files were generated. If you do so and the simulation gets interrupted for any reason, all you need to do to restart your simulation is to rerun it.
Long answer
To understand the mechanism behind the restart functionality of the ParaDRAM sampler, you need to know the following facts about its inner workings:
- Every time a ParaDRAM simulation begins, the sampler checks for the value of the simulation specification variable outputFileName provided by the user. This serves as the common prefix in the names of all files that are output by the ParaDRAM sampler.
  - For example, a user-input value of outputFileName=./out/temp will lead to the generation of a set of ParaDRAM output files with the following names (by the first processor in the simulation),
      temp_run1_pid1_progress.txt
      temp_run1_pid1_restart.txt
      temp_run1_pid1_report.txt
      temp_run1_pid1_sample.txt
      temp_run1_pid1_chain.txt
    which are all stored in the folder ./out/ relative to the current working directory from which you called the sampler.
  - If the value of outputFileName ends with a forward slash (/) or a backslash (\) on Windows OS, or with a forward slash (/) on Linux and Darwin (Mac) OS, then the user-provided value will be treated as the name of the folder in which the output files are to be stored. In this case, the ParaDRAM sampler will assign a common random filename prefix (which always starts with ParaDRAM_) to all of the generated output files, for example, ParaDRAM_20240312_060333_408 in the following filenames,
      ParaDRAM_20240312_060333_408_run1_pid1_progress.txt
      ParaDRAM_20240312_060333_408_run1_pid1_restart.txt
      ParaDRAM_20240312_060333_408_run1_pid1_report.txt
      ParaDRAM_20240312_060333_408_run1_pid1_sample.txt
      ParaDRAM_20240312_060333_408_run1_pid1_chain.txt
- Every time a ParaDRAM simulation begins, the sampler either gets the output files’ prefix from the user or generates a fresh new prefix as described above. It then checks for the existence of any collection of output files with this prefix in the simulation’s working directory, and proceeds in one of three ways,
  - Fresh simulation – If none of the output files already exist, the simulation begins as a fresh new simulation.
  - Simulation crash – If all of the output files with the given prefix already exist in the output path, the simulation will stop immediately with an error message stating that it cannot overwrite an already existing simulation.
  - Restart mode – If all output files exist with the given prefix except the sample file (suffixed with _sample.txt), then the ParaDRAM sampler assumes that this simulation has prematurely ended in the past and that it must restart this unfinished simulation. The reason for this assumption is the following,
    Tip: When a ParaDRAM simulation ends prematurely, the interruption virtually always happens before the ParaDRAM sampler reaches the final stage of generating the output sample file.
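The prefix handling and the three-way check described in the list above can be sketched in a few lines of Python. This is only an illustration of the logic as described here, not ParaMonte's implementation: the function names split_output_file_name and classify_run_mode are hypothetical, the _run1_pid1 infix and the file extensions are taken from the example filenames, and the "undetermined" result for a partial set of files is an assumption, since the text does not say what happens in that case.

```python
from pathlib import Path

# File kinds taken from the example filenames in the text.
FILE_KINDS = ("progress", "restart", "report", "sample", "chain")


def split_output_file_name(value: str, windows: bool = False):
    """Split an outputFileName value into (folder, prefix).

    The prefix is None when the value ends with a path separator, i.e. when
    it names a folder and the sampler would generate the prefix itself.
    """
    separators = ("/", "\\") if windows else ("/",)
    if value.endswith(separators):
        return value, None
    cut = max(value.rfind(sep) for sep in separators) + 1
    return value[:cut], value[cut:]


def _exists(folder: str, stem: str) -> bool:
    # Assumption: accept either a .txt or a .bin extension for any file kind.
    return any((Path(folder) / f"{stem}.{ext}").is_file() for ext in ("txt", "bin"))


def classify_run_mode(folder: str, prefix: str, run: int = 1, pid: int = 1) -> str:
    """Return 'fresh', 'error', 'restart', or 'undetermined'."""
    found = {
        kind: _exists(folder, f"{prefix}_run{run}_pid{pid}_{kind}")
        for kind in FILE_KINDS
    }
    if not any(found.values()):
        return "fresh"      # no output files yet: start a new simulation
    if all(found.values()):
        return "error"      # a complete simulation exists: refuse to overwrite
    if all(ok for kind, ok in found.items() if kind != "sample"):
        return "restart"    # everything but the sample file: resume the run
    return "undetermined"   # partial set of files: not covered by the text
```

A check of this kind can be handy before resubmitting a job, for example to confirm that a rerun with outputFileName=./out/temp would be picked up as a restart rather than rejected as an attempt to overwrite a finished simulation.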
Scenario 1 – Suppose you set the specification variable outputFileName = "./output/MyRestartSimulation". Upon starting the simulation, the sampler will generate the following output files (there will be more than one group of files if parallelism = "multiChain"),
    MyRestartSimulation_run1_pid1_progress.txt
    MyRestartSimulation_run1_pid1_restart.txt
    MyRestartSimulation_run1_pid1_report.txt
    MyRestartSimulation_run1_pid1_sample.txt
    MyRestartSimulation_run1_pid1_chain.txt
Then, the simulation gets interrupted and ends prematurely for some reason. To restart this simulation, just rerun it exactly as you did the first time.
Scenario 2 – Suppose you do not provide an input value for the specification variable outputFileName, in which case the sampler will assign a random prefix to the output files, like the following,
    ParaDRAM_20240312_060333_408_run1_pid1_progress.txt
    ParaDRAM_20240312_060333_408_run1_pid1_restart.txt
    ParaDRAM_20240312_060333_408_run1_pid1_report.txt
    ParaDRAM_20240312_060333_408_run1_pid1_sample.txt
    ParaDRAM_20240312_060333_408_run1_pid1_chain.txt
You run the simulation, which is then interrupted and ends prematurely. In these filenames, anything that appears before _run is part of the common prefix that the sampler randomly generates: ParaDRAM_20240312_060333_408. Thus, to restart the simulation you will have to set outputFileName = "./ParaDRAM_20240312_060333_408" in your restart-simulation specifications so that the sampler can find the old output files upon restart.
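If you need to recover the generated prefix from a leftover output file programmatically, a small helper along the following lines may do. It is a sketch, not part of ParaMonte: the name recover_prefix is hypothetical, and the filename pattern is inferred from the example filenames above rather than taken from a documented format.

```python
import re
from pathlib import Path

# Pattern inferred from the example filenames (an assumption):
# <prefix>_run<N>_pid<M>_<kind>.<extension>
_OUTPUT_NAME = re.compile(r"^(?P<prefix>.+)_run\d+_pid\d+_[A-Za-z]+\.\w+$")


def recover_prefix(filename: str) -> str:
    """Return the common prefix, i.e. everything before the final _run<N>."""
    match = _OUTPUT_NAME.match(Path(filename).name)
    if match is None:
        raise ValueError(f"not a recognized output filename: {filename!r}")
    return match.group("prefix")
```

For ParaDRAM_20240312_060333_408_run1_pid1_progress.txt this returns ParaDRAM_20240312_060333_408, which is the value to append to the output folder when setting outputFileName for the restart.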