run

A module for running experiments that involve the execution of commands with different combinations of command line arguments.

This is a somewhat short reference documentation of the interface. There is also a longer explanation including various examples.

  1"""A module for running experiments that involve the execution of
  2commands with different combinations of command line arguments.
  3
  4This is a somewhat short reference documentation of the interface.
  5There is also a [longer explanation including various
  6examples](./index.html).
  7
  8"""
  9
 10import sys
 11import re
 12import itertools
 13import os
 14import subprocess
 15import signal
 16import time
 17from fnmatch import fnmatchcase
 18from collections import namedtuple
 19from inspect import signature
 20from filelock import SoftFileLock
 21import tqdm
 22from pathos.multiprocessing import ProcessingPool
 23
 24
 25def _identity(inp):
 26    """Identity function just returning the input.
 27
 28    Default for the modifier functions of ``add()``
 29
 30    """
 31    return inp
 32
 33
 34_Run = namedtuple(
 35    "_Run",
 36    [
 37        "name",
 38        "command",
 39        "args",
 40        "creates_file",
 41        "stdout_file",
 42        "stdout_mod",
 43        "stdout_res",
 44        "header_string",
 45        "header_command",
 46        "header_mod",
 47        "allowed_return_codes",
 48        "is_selected",
 49    ],
 50)
 51"""Datatype representing a single run."""
 52
 53
 54def add(
 55    name,
 56    command,
 57    arguments_descr,
 58    creates_file=None,
 59    stdout_file=None,
 60    stdout_mod=_identity,
 61    stdout_res=None,
 62    header_string=None,
 63    header_command=None,
 64    header_mod=_identity,
 65    return_string=None,
 66    allowed_return_codes=[0],
 67    combinations_filter=lambda args: True,
 68):
 69    """Add a new experiment.
 70
 71    Based on the experiment description, a set of individual runs is
 72    generated, where each individual run basically corresponds to a
 73    set of command line arguments.  The arguments are represented by a
 74    dictionary, i.e., each argument has a key and a value.
 75
 76    To describe how the arguments of an individual run can be used in
 77    several places, we need the concept of a *blob*.  Ultimately, a
 78    blob is something that will be evaluated to a string by replacing
 79    wildcards of the form ``[[key]]`` with the value of the
 80    corresponding argument.  A blob can also be a function, in which
 81    case it is evaluated (with the arguments as parameter) before
 82    performing these replacements.  For more details on how a blob
 83    turns into a string, see the function ``deblob()``.  Note that a
 84    blob is only defined in the context of an individual run.  Thus,
 85    whenever we talk about blob, we implicitly have an individual run
 86    with a specific set of arguments in mind.
 87
 88    Parameters
 89    ----------
 90
 91    name: blob
 92
 93      Name of the experiment.  An experiment is only run if its name
 94      (or the name of its ``group``) appears as command line parameter.
 95      It does not need to be unique among experiments.
 96
 97    command: blob
 98
 99      The command that will be called for each run.
100
101    arguments_descr: dictionary
102
103      Dictionary of arguments or lists of arguments.  In case of
104      lists, one run for each combination of arguments is generated.
105      Each individual argument is a blob, where the blobs are
106      evaluated to strings in order of appearance in the dictionary.
107
108    creates_file: blob, optional
109
110      Describes the name of a file that is created by calling the
111      command.  This is only used to skip the run if the file already
112      exists (at the time when this method is called, not at the time
113      when the command is actually run).
114
115    stdout_file: blob, optional
116
117      The filename to which the standard output of the run should be
118      written.  There are three different cases, depending on whether
119      this file already exists:
120
121      1. If the file exists when calling this function (i.e., before
122         performing any runs), the run is skipped.
123
124      1. If the run is not skipped but the file exists when executing
125         the run, the standard output is added at the end of the file.
126
127      1. If the file does not exist, it is created beginning with the
128         header (if specified) and then the standard output is added.
129
130    stdout_mod: function, default =``identity``
131
132      A function applied to the standard output of the run, before
133      writing it to the file.  If the function takes one argument, it
134      gets the standard output as string as input.  Otherwise, it
135      should take two arguments, the standard output as string and the
136      result of a call to ``subprocess.run()``.  The latter gives
137      access to additional information such as the return code.  The
138      function can return a blob using the special wildcard
139      ``[[stdout]]`` (similar to ``stdout_res``).
140
141    stdout_res: blob, optional
142
143      If given, this blob is written to the file instead of the
144      standard output itself.  This blob is somewhat special in the
145      sense that it is evaluated after the run has finished with the
146      special argument ``stdout``, i.e., the blob can contain the
147      special wildcard ``[[stdout]]``, which will be replaced by the
148      standard output (after it was modified by ``stdout_mod``).
149
150    header_string: blob, optional
151
152      A string specifying the header; see input parameter
153      ``stdout_file``.
154
155    header_command: blob, optional
156
157      A command that is run to use its standard output as header.
158
159    header_mod: function, default =``identity``
160
161      A function that is applied to the header (specified by
162      ``header_string`` or ``header_command``) before writing it to a
163      file.  It should take one argument (a string) and return a
164      string.
165
166    return_string: blob, optional
167
168      If given, this blob will be evaluated for each run and a list of
169      the results is returned.
170
171    allowed_return_codes: list[int], default =``[0]``
172
173      A list of allowed return codes.  If a command returns any other
174      code, a warning is printed and the run is aborted.  The empty
175      list ``[]`` indicates that any return code should be accepted.
176
177    combinations_filter: function, default = always ``True``
178
179      A function that filters the combinations of arguments.  It
180      should take a dictionary of arguments and decide whether it
181      represents a valid combination by returning ``True`` or
182      ``False``.  The default returns always ``True``, i.e., a run is
183      created for every combination.
184
185    Returns
186    -------
187    None or list[string]
188        See documentation of ``return_string``.
189
190    """
191    if stdout_mod != _identity and stdout_file is None:
192        _print_warning("stdout_mod has no effect if stdout_file is not " "specified")
193    if stdout_res is not None and stdout_file is None:
194        _print_warning("stdout_res has no effect if stdout_file is not " "specified")
195    if header_string is not None and stdout_file is None:
196        _print_warning("header_string has no effect if stdout_file is not " "specified")
197    if header_command is not None and stdout_file is None:
198        _print_warning(
199            "header_command has no effect if stdout_file is not " "specified"
200        )
201    if header_string is not None and header_command is not None:
202        _print_warning(
203            "header_string and header_command specified" " - Which one should I use?"
204        )
205    if header_mod != _identity and header_string is None and header_command is None:
206        _print_warning(
207            "header_mod has no effect if not one of "
208            "header_string or header_command are specified"
209        )
210
211    # generate the set of arguments
212    arguments_descr = {
213        k: v if isinstance(v, list) else [v] for k, v in arguments_descr.items()
214    }
215    arguments_set = [
216        dict(zip(arguments_descr.keys(), vals))
217        for vals in itertools.product(*arguments_descr.values())
218    ]
219
220    return_strings = []
221    for args in arguments_set:
222        if not combinations_filter(args):
223            continue
224        for key, val in args.items():
225            args[key] = deblob(val, args)
226
227        real_name = deblob(name, args)
228        if real_name not in _state.groups[_state.group]:
229            _state.groups[_state.group].append(real_name)
230
231        run = _Run(
232            name=real_name,
233            command=deblob(command, args),
234            args=args,
235            creates_file=deblob(creates_file, args),
236            stdout_file=deblob(stdout_file, args),
237            stdout_mod=stdout_mod,
238            stdout_res=stdout_res,
239            header_string=deblob(header_string, args),
240            header_command=deblob(header_command, args),
241            header_mod=header_mod,
242            allowed_return_codes=allowed_return_codes,
243            is_selected=_is_selected(real_name),
244        )
245        _add_run(run)
246        if return_string is not None:
247            return_strings.append(deblob(return_string, args))
248
249    if return_string is not None:
250        return return_strings
251
252
253def _add_run(run):
254    """Add a single run to the list of runs.
255
256    Checks whether the run was selected or whether it should be
257    skipped and adjusts the corresponding data structures
258    accordingly.
259
260    """
261    if run.name not in _state.runs_by_name:
262        _state.runs_by_name[run.name] = []
263        _state.counts_by_name[run.name] = [0, 0]
264
265    if _is_skipped(run):
266        _state.counts_by_name[run.name][1] += 1
267    else:
268        _state.counts_by_name[run.name][0] += 1
269
270    if run.is_selected and not _is_skipped(run):
271        _state.runs_by_name[run.name].append(run)
272
273
274def _wildcard_match(pattern: str, candidates: 'list[str]'):
275    """Decide whether a string matches any of the candidates, where each
276    candidate may contain Unix shell-style wildcards.
277
278    Parameters
279    ----------
280
281    pattern: str
282
283        The string to test; it is passed to ``fnmatchcase()`` as the name.
284
285    candidates: list[str]
286
287        The list of wildcard patterns to match against.
288
289    Returns
290    -------
291    bool
292        True if the string matches any of the candidates, False otherwise.
293
294    """
295    for candidate in candidates:
296        if fnmatchcase(pattern, candidate):
297            return True
298    return False
299
300
301def _is_selected(name):
302    """Decide whether a given name was selected.
303
304    A name counts as selected if it is given as command line parameter
305    or if it belongs to a group that was given as command line
306    parameter.
307
308    Parameters
309    ----------
310
311    name: string
312
313        The name of the experiment.
314
315    Returns
316    -------
317    bool
318        True if the experiment should be run, False otherwise.
319
320    """
321    if _wildcard_match(name, sys.argv):
322        return True
323    for group, names in _state.groups.items():
324        if _wildcard_match(group, sys.argv) and _wildcard_match(name, names):
325            return True
326    return False
327
328
329def _is_skipped(run):
330    """Decides whether a given run should be skipped.
331
332    A run is skipped if the output file already exists.
333
334    """
335    return (run.creates_file is not None and os.path.isfile(run.creates_file)) or (
336        run.stdout_file is not None and os.path.isfile(run.stdout_file)
337    )
338
339
340def run():
341    """Run the previously declared experiments.
342
343    You should call this exactly once at the end of the file.
344
345    If ``dry_run`` is given as command line parameter, then the runs are
346    not executed but the commands printed to ``stdout``.
347
348    """
349    global _state
350    _state.run_was_called = True
351    _state.time_start_run = time.time()
352
353    _print_runs()
354
355    if _is_selected("dry_run"):
356        _run_dry()
357        _state = _State()
358        return
359
360    _print_section("\nrunning the experiments:")
361
362    with ProcessingPool(nodes=_cores) as pool:
363        for name, runs in _state.runs_by_name.items():
364            if len(runs) == 0:
365                continue
366            # run in parallel
367            orig_sigint_handler = signal.signal(signal.SIGINT, signal.SIG_IGN)
368            signal.signal(signal.SIGINT, orig_sigint_handler)
369            try:
370                for _ in tqdm.tqdm(
371                    pool.uimap(_run_run, runs),
372                    desc=name.ljust(_max_name_len()),
373                    total=len(runs),
374                ):
375                    pass
376            except KeyboardInterrupt:
377                _print_warning("aborted during experiment " + name)
378        pool.close()
379    _state.run_completed = True
380    _state = _State()
381
382
383def use_cores(nr_cores):
384    """Set the number of cores used to run the experiments.
385
386    Parameters
387    ----------
388
389    nr_cores: int
390
391      The number of cores that should be used.
392
393    """
394    global _cores
395    _cores = nr_cores
396
397
398_cores = 4
399"""The number of cores used to run the experiments."""
400
401
402def _run_dry():
403    """Perform a dry run."""
404    _print_section("\ndry run: just printing, no doing")
405    for name, runs in _state.runs_by_name.items():
406        if len(runs) == 0:
407            continue
408        # print the runs
409        _print_section("\ncommands for experiment " + name)
410        for run in runs:
411            print(run.command)
412
413
414def group(group_name):
415    """Set the current group.
416
417    Each experiment created with ``add()`` is added to the group for
418    which this function was last called.
419
420    Parameters
421    ----------
422
423    group_name: string
424
425      The name of the group.
426
427    """
428    _state.group = group_name
429    if group_name not in _state.groups:
430        _state.groups[group_name] = []
431
432
433def section(title):
434    """Print a section title.
435
436    Parameters
437    ----------
438
439    title: string
440
441      The title that should be printed.
442
443    """
444    _print_section(title)
445
446
447def deblob(blob, args):
448    """Transforms a blob into a string.
449
450    This function is meant for internal use but understanding what it
451    does might be useful.  A blob is transformed into a string in the
452    following steps.
453
454    1. If ``blob`` is a function, it is called with ``args`` as parameter.
455
456    1. The result (or ``blob`` itself, if 1. did not apply) is converted
457    to a string (using ``str()``).
458
459    1. Every pattern of the form ``[[key]]`` is replaced by the value of
460    the corresponding argument in ``args``.
461
462    Step 3 assumes that every pattern of the form ``[[key]]`` has a
463    corresponding argument with this key.
464
465    Parameters
466    ----------
467
468    blob: blob
469
470      The blob that should be turned into a string.
471
472    args: dictionary
473
474      The named arguments of the current run.
475
476    """
477    if blob is None:
478        return None
479
480    if callable(blob):
481        blob = blob(args)
482
483    blob = str(blob)
484
485    keys = [m.group(1) for m in re.finditer(r"\[\[([^\[\]]*)\]\]", blob)]
486    result = blob
487    for key in keys:
488        if key not in args:
489            _print_warning("No value for [[" + key + "]] found to replace in " + blob)
490            continue
491        result = result.replace("[[" + key + "]]", str(args[key]))
492    return result
493
494
495def _run_run(run):
496    """Actually run a run."""
497    res = _execute(run.command)
498
499    # check return codes
500    if (
501        res.returncode not in run.allowed_return_codes
502        and run.allowed_return_codes != []
503    ):
504        _print_warning(
505            "unexpected return code ("
506            + str(res.returncode)
507            + ") for command: "
508            + res.args
509            + "\n"
510            + res.stderr.strip()
511            + ""
512        )
513        return
514
515    if run.stdout_file is None:
516        return
517
518    filename = run.stdout_file
519    lock = filename + ".lock"
520    if os.path.dirname(filename) != "":
521        os.makedirs(os.path.dirname(filename), exist_ok=True)
522
523    # create new file with header
524    if not os.path.isfile(filename) and (
525        run.header_command is not None or run.header_string is not None
526    ):
527        header = run.header_string
528        if run.header_command is not None:
529            header = _execute(run.header_command).stdout.strip()
530        header = run.header_mod(header)
531
532        with SoftFileLock(lock):
533            if not os.path.isfile(filename):
534                with open(filename, "w") as out:
535                    print(header, file=out, flush=True)
536
537    # write the (possibly modified) standard output to the file
538    stdout = res.stdout.strip()
539    mod = run.stdout_mod
540    output = mod(stdout) if len(signature(mod).parameters) == 1 else mod(stdout, res)
541    if mod != _identity:
542        run.args["stdout"] = stdout
543        output = deblob(output, run.args)
544    if run.stdout_res is not None:
545        run.args["stdout"] = output
546        output = deblob(run.stdout_res, run.args)
547
548    with SoftFileLock(run.stdout_file + ".lock"):
549        with open(run.stdout_file, "a") as out:
550            print(output, file=out, flush=True)
551
552
553def _execute(command):
554    """Execute a command line command and return the result.
555
556    This is a wrapper for ``subprocess.run()``.
557
558    """
559    return subprocess.run(
560        command,
561        shell=True,
562        stdout=subprocess.PIPE,
563        stderr=subprocess.PIPE,
564        universal_newlines=True,
565    )
566
567
568def _print_warning(string):
569    """Print a warning message."""
570    print(_color_warning("\nWARNING: " + string.replace("\n", "\n\t")), file=sys.stderr)
571
572
573def _print_section(string):
574    """Print section heading."""
575    print("\033[1m" + string + "\033[0m")
576
577
578def _print_runs():
579    """Print summary of all specified runs."""
580    format_str = "{:<" + str(_max_name_len() + 5) + "}{:>10}{:>10}{:>10}"
581    _print_section(format_str.format("", "todo", "skipped", "total"))
582    for group, names in _state.groups.items():
583        if len(names) == 0:
584            continue
585        print(_color_selected(group, _is_selected(group)))
586        for name in names:
587            prefix = " └─ " if name == names[-1] else " ├─ "
588            count = _state.counts_by_name[name]
589            print(
590                _color_selected(
591                    format_str.format(
592                        prefix + name, count[0], count[1], count[0] + count[1]
593                    ),
594                    _is_selected(name),
595                )
596            )
597
598
599def _color_warning(string):
600    """Return the string but with ansi colors representing a warning."""
601    return "\u001b[33m" + string + "\u001b[0m"
602
603
604def _color_selected(string, selected):
605    """Return the string but with ansi colors for (un)selected experiments."""
606    col = "\u001b[32;1m" if selected else "\u001b[91;2m"
607    return col + string + "\u001b[0m"
608
609
610def _max_name_len():
611    """Gives the length of the longest used name.
612
613    Includes group and experiment names.
614
615    """
616    if _state.runs_by_name == {}:
617        return 0
618    return max([len(name) for name in _state.runs_by_name])
619
620
621class _State:
622    """Internal state."""
623
624    def __init__(self):
625        self.runs_by_name = dict()
626        self.counts_by_name = dict()
627        self.group = "ungrouped"
628        self.groups = {self.group: []}
629        self.time_start = time.time()
630        self.time_start_run = time.time()
631        self.run_was_called = False
632        self.run_completed = False
633
634    def __del__(self):
635        if self.runs_by_name == {}:
636            return
637
638        if not self.run_was_called:
639            _print_warning(
640                "Some runs were added without calling run(). "
641                "Did you forget to call run() at the end of the script?"
642            )
643            return
644
645        if not self.run_completed:
646            return
647
648        total_runs = sum(
649            [count[0] + count[1] for count in self.counts_by_name.values()]
650        )
651        print(
652            "time for gathering {0:d} runs: {1:.2f} seconds".format(
653                total_runs, self.time_start_run - self.time_start
654            )
655        )
656
657        print(
658            "time for running the experiments: {0:.2f} seconds".format(
659                time.time() - self.time_start_run
660            )
661        )
662        print()
663
664
665_state = _State()
666"""
667Instance of the internal state.
668"""
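
The selection logic of `_is_selected()` and `_wildcard_match()` in the listing above can be sketched in isolation. The following is a minimal sketch, not the module's code: `matches_any` and `is_selected` are hypothetical names, and the command line is passed in explicitly instead of being read from `sys.argv`. It illustrates that the experiment or group name is the string being tested, while the command line parameters act as the wildcard patterns.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """Hypothetical helper: True if name matches at least one wildcard pattern."""
    # fnmatchcase(name, pat) takes the string to test first, the pattern second.
    return any(fnmatchcase(name, pat) for pat in patterns)


def is_selected(name, groups, argv):
    """Hypothetical stand-in for the selection rule described above."""
    # Selected directly: the name matches a command line parameter.
    if matches_any(name, argv):
        return True
    # Selected via its group: the group matches a command line parameter.
    for group, names in groups.items():
        if matches_any(group, argv) and matches_any(name, names):
            return True
    return False


# Example data (made up for illustration).
groups = {"scaling": ["exp_small", "exp_large"], "ungrouped": ["misc"]}

by_wildcard = is_selected("exp_small", groups, ["script.py", "exp_*"])
by_group = is_selected("exp_large", groups, ["script.py", "scaling"])
not_chosen = is_selected("misc", groups, ["script.py", "scaling"])
```

Under these assumptions, a wildcard such as `exp_*` on the command line selects both experiments in the `scaling` group, while `misc` stays unselected.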

def add(name, command, arguments_descr, creates_file=None, stdout_file=None, stdout_mod=_identity, stdout_res=None, header_string=None, header_command=None, header_mod=_identity, return_string=None, allowed_return_codes=[0], combinations_filter=lambda args: True):

Add a new experiment.

Based on the experiment description, a set of individual runs is generated, where each individual run corresponds to a set of command line arguments. The arguments are represented by a dictionary, i.e., each argument has a key and a value.

To describe how the arguments of an individual run can be used in several places, we need the concept of a blob. Ultimately, a blob is something that will be evaluated to a string by replacing wildcards of the form [[key]] with the value of the corresponding argument. A blob can also be a function, in which case it is evaluated (with the arguments as parameter) before performing these replacements. For more details on how a blob turns into a string, see the function deblob(). Note that a blob is only defined in the context of an individual run. Thus, whenever we talk about a blob, we implicitly have an individual run with a specific set of arguments in mind.
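
The way a blob turns into a string can be illustrated with a small standalone sketch. `expand_blob` is a hypothetical re-implementation written for this example, not the module's `deblob()`: it follows the same three steps (call the function if there is one, convert to a string, replace `[[key]]` wildcards), but it substitutes in a single pass and silently leaves unknown keys in place, whereas `deblob()` prints a warning for them. The command and file names are made up.

```python
import re


def expand_blob(blob, args):
    """Hypothetical stand-in mirroring the documented steps of deblob()."""
    if blob is None:
        return None
    # Step 1: a callable blob is evaluated with the arguments first.
    if callable(blob):
        blob = blob(args)
    # Step 2: the result is converted to a string.
    text = str(blob)

    # Step 3: every [[key]] wildcard with a known key is replaced by its value.
    def substitute(match):
        key = match.group(1)
        return str(args[key]) if key in args else match.group(0)

    return re.sub(r"\[\[([^\[\]]*)\]\]", substitute, text)


# Arguments of one individual run (example values).
args = {"n": "100", "seed": "7"}

# A plain string blob with two wildcards.
plain = expand_blob("./solver --size [[n]] --seed [[seed]]", args)

# A function blob: it is called first, and its result may still contain wildcards.
computed = expand_blob(lambda a: "out/n[[n]]_" + a["seed"] + ".csv", args)

print(plain)
print(computed)
```

The function blob shows why both mechanisms are useful together: arbitrary Python logic can build part of the string, and wildcards fill in the rest.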

Parameters

name (blob): Name of the experiment. An experiment is only run if its name (or the name of its group) appears as command line parameter. It does not need to be unique among experiments.
command (blob): The command that will be called for each run.
arguments_descr (dictionary): Dictionary of arguments or lists of arguments. In case of lists, one run for each combination of arguments is generated. Each individual argument is a blob, where the blobs are evaluated to strings in order of appearance in the dictionary.
creates_file (blob, optional): Describes the name of a file that is created by calling the command. This is only used to skip the run if the file already exists (at the time when this method is called, not at the time when the command is actually run).
stdout_file (blob, optional): The filename to which the standard output of the run should be written. There are three different cases, depending on whether this file already exists:
1. If the file exists when calling this function (i.e., before performing any runs), the run is skipped.
2. If the run is not skipped but the file exists when executing the run, the standard output is added at the end of the file.
3. If the file does not exist, it is created beginning with the header (if specified) and then the standard output is added.
stdout_mod (function, default = identity): A function applied to the standard output of the run before it is written to the file. If the function takes one argument, it receives the standard output as a string. Otherwise, it should take two arguments: the standard output as a string and the result of the call to subprocess.run(). The latter gives access to additional information such as the return code. The function can return a blob using the special wildcard [[stdout]] (similar to stdout_res).
stdout_res (blob, optional): If given, this blob is written to the file instead of the standard output itself. This blob is special in that it is evaluated after the run has finished, with the additional argument stdout. That is, the blob can contain the special wildcard [[stdout]], which is replaced by the standard output (after it was modified by stdout_mod).
header_string (blob, optional): A string specifying the header; see input parameter stdout_file.
header_command (blob, optional): A command that is run to use its standard output as header.
header_mod (function, default = identity): A function that is applied to the header (specified by header_string or header_command) before writing it to a file. It should take one argument (a string) and return a string.
return_string (blob, optional): If given, this blob will be evaluated for each run and a list of the results is returned.
allowed_return_codes (list[int], default =[0]): A list of allowed return codes. If a command returns any other code, a warning is printed and the run is aborted. The empty list [] indicates that any return code should be accepted.
combinations_filter (function, default = always True): A function that filters the combinations of arguments. It should take a dictionary of arguments and decide whether it represents a valid combination by returning True or False. The default always returns True, i.e., a run is created for every combination.
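How arguments_descr and combinations_filter interact can be sketched as follows. This is a minimal illustration, not the module's code: the function name `combinations` and the example values are hypothetical, and the sketch ignores the evaluation of blobs (it treats every value as a final string).

```python
import itertools


def combinations(arguments_descr, combinations_filter=lambda args: True):
    """Hypothetical sketch: one argument dictionary per valid combination."""
    keys = list(arguments_descr)
    # A single value is treated like a list with one element.
    value_lists = [
        value if isinstance(value, list) else [value]
        for value in arguments_descr.values()
    ]
    result = []
    for values in itertools.product(*value_lists):
        args = dict(zip(keys, values))
        # Keep only the combinations accepted by the filter.
        if combinations_filter(args):
            result.append(args)
    return result


descr = {"algo": ["quick", "merge"], "n": ["10", "100"], "seed": "1"}


def skip_large_merge(args):
    """Reject the combination of merge with the largest input size."""
    return not (args["algo"] == "merge" and args["n"] == "100")


runs = combinations(descr, skip_large_merge)
for args in runs:
    print(args)
```

Without the filter, the description above would yield four runs (2 x 2 x 1); with it, the combination of merge and 100 is dropped and three remain.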

Returns

None or list[string]: See documentation of return_string.
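The one-argument versus two-argument form of stdout_mod described above could be distinguished along the following lines. This is a sketch under assumptions: the helper `apply_stdout_mod` is hypothetical, and inspecting the parameter count with `inspect.signature` is one plausible way to do it, not necessarily what the module does. A `subprocess.CompletedProcess` is constructed by hand to stand in for the result of subprocess.run().

```python
import subprocess
from inspect import signature


def apply_stdout_mod(stdout_mod, completed):
    """Hypothetical sketch: call stdout_mod with one or two arguments."""
    if len(signature(stdout_mod).parameters) == 1:
        return stdout_mod(completed.stdout)
    return stdout_mod(completed.stdout, completed)


def strip_only(out):
    """One-argument modifier: only sees the standard output."""
    return out.strip()


def with_return_code(out, res):
    """Two-argument modifier: also sees the result object."""
    return out.strip() + "," + str(res.returncode)


# Stand-in for what subprocess.run() would return for a finished command.
completed = subprocess.CompletedProcess(
    args=["echo", "42"], returncode=0, stdout="  42  \n"
)

print(apply_stdout_mod(strip_only, completed))        # 42
print(apply_stdout_mod(with_return_code, completed))  # 42,0
```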

def run(): View Source

def run():
    """Run the previously declared experiments.

    You should call this exactly once at the end of the file.

    If ``dry_run`` is given as command line parameter, then the runs are
    not executed but the commands printed to ``stdout``.

    """
    global _state
    _state.run_was_called = True
    _state.time_start_run = time.time()

    _print_runs()

    if _is_selected("dry_run"):
        _run_dry()
        _state = _State()
        return

    _print_section("\nrunning the experiments:")

    with ProcessingPool(nodes=_cores) as pool:
        for name, runs in _state.runs_by_name.items():
            if len(runs) == 0:
                continue
            # run in parallel
            orig_sigint_handler = signal.signal(signal.SIGINT, signal.SIG_IGN)
            signal.signal(signal.SIGINT, orig_sigint_handler)
            try:
                for _ in tqdm.tqdm(
                    pool.uimap(_run_run, runs),
                    desc=name.ljust(_max_name_len()),
                    total=len(runs),
                ):
                    pass
            except KeyboardInterrupt:
                _print_warning("aborted during experiment " + name)
        pool.close()
    _state.run_completed = True
    _state = _State()

Run the previously declared experiments.

You should call this exactly once at the end of the file.

If dry_run is given as a command line parameter, the runs are not executed; instead, the commands are printed to stdout.
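The listing for run() consumes the results of the runs as they finish rather than in submission order. The following sketch shows that pattern with the standard library's thread pool and `imap_unordered`, which is a stand-in chosen for illustration and not the pathos `ProcessingPool` used by the module. The function `fake_run` and the job list are hypothetical.

```python
from multiprocessing.pool import ThreadPool


def fake_run(job):
    """Hypothetical stand-in for a single run; returns a value per job."""
    return job * job


jobs = [1, 2, 3, 4]
finished = []

with ThreadPool(processes=2) as pool:
    # Results arrive in completion order, which need not match the
    # order of the jobs; each finished result advances the progress.
    for result in pool.imap_unordered(fake_run, jobs):
        finished.append(result)

print(sorted(finished))
```

Because the order of arrival is not fixed, code consuming such results should not rely on it; the sketch sorts before printing for that reason.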

def use_cores(nr_cores): View Source

def use_cores(nr_cores):
    """Set the number of cores used to run the experiments.

    Parameters
    ----------

    nr_cores: int

      The number of cores that should be used.

    """
    global _cores
    _cores = nr_cores

Set the number of cores used to run the experiments.

Parameters

nr_cores (int): The number of cores that should be used.

def group(group_name): View Source

def group(group_name):
    """Set the current group.

    Each experiment created with ``add()`` is added to the group for
    which this function was last called.

    Parameters
    ----------

    group_name: string

      The name of the group.

    """
    _state.group = group_name
    if group_name not in _state.groups:
        _state.groups[group_name] = []

Set the current group.

Each experiment created with add() is added to the group for which this function was last called.

Parameters

group_name (string): The name of the group.

def section(title): View Source

def section(title):
    """Print a section title.

    Parameters
    ----------

    title: string

      The title that should be printed.

    """
    _print_section(title)

Print a section title.

Parameters

title (string): The title that should be printed.

def deblob(blob, args): View Source

def deblob(blob, args):
    """Transforms a blob into a string.

    This function is meant for internal use but understanding what it
    does might be useful.  A blob is transformed into a string in the
    following steps.

    1. If ``blob`` is a function, it is called with ``args`` as parameter.

    1. The result (or ``blob`` itself, if 1. did not apply) is converted
    to a string (using ``str()``).

    1. Every pattern of the form ``[[key]]`` is replaced by the value of
    the corresponding argument in ``args``.

    Step 3 assumes that every pattern of the form ``[[key]]`` has a
    corresponding argument with this key.

    Parameters
    ----------

    blob: blob

      The blob that should be turned into a string.

    args: dictionary

      The named arguments of the current run.

    """
    if blob is None:
        return None

    if callable(blob):
        blob = blob(args)

    blob = str(blob)

    keys = [m.group(1) for m in re.finditer(r"\[\[([^\[\]]*)\]\]", blob)]
    result = blob
    for key in keys:
        if key not in args:
            _print_warning("No value for [[" + key + "]] found to replace in " + blob)
            continue
        result = result.replace("[[" + key + "]]", args[key])
    return result

Transforms a blob into a string.

This function is meant for internal use, but understanding what it does might be useful. A blob is transformed into a string in the following steps.

1. If blob is a function, it is called with args as parameter.
2. The result (or blob itself, if step 1 did not apply) is converted to a string (using str()).
3. Every pattern of the form [[key]] is replaced by the value of the corresponding argument in args.

Step 3 assumes that every pattern of the form [[key]] has a corresponding argument with this key. If no such argument exists, a warning is printed and the pattern is left unchanged. If blob is None, None is returned without any further processing.

Parameters

blob (blob): The blob that should be turned into a string.
args (dictionary): The named arguments of the current run.
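The edge cases of this transformation are easiest to see in a small experiment. The sketch below follows the source listing of deblob() but is not the library function: the name `deblob_sketch` is hypothetical, the warning is replaced by a returned list of missing keys, and values are passed through `str()` as a precaution (the listing hands `args[key]` to `str.replace` directly, so it expects string values).

```python
import re

WILDCARD = re.compile(r"\[\[([^\[\]]*)\]\]")


def deblob_sketch(blob, args):
    """Hypothetical sketch: return (string, missing_keys) for a blob."""
    # None is passed through unchanged.
    if blob is None:
        return None, []
    # Step 1: a function blob is called with the arguments.
    if callable(blob):
        blob = blob(args)
    # Step 2: whatever we have now is converted to a string.
    text = str(blob)
    # Step 3: replace every wildcard that has a matching argument.
    missing = []
    for key in WILDCARD.findall(text):
        if key in args:
            text = text.replace("[[" + key + "]]", str(args[key]))
        elif key not in missing:
            missing.append(key)
    return text, missing


print(deblob_sketch(None, {}))
print(deblob_sketch(42, {}))
print(deblob_sketch("[[a]]/[[b]]", {"a": "x"}))
print(deblob_sketch(lambda a: "n=[[n]]", {"n": "5"}))
```

The third call shows the behavior for a wildcard without a matching argument: the pattern `[[b]]` stays in the result, and the sketch reports `b` as missing where the library would print a warning.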