run
A module for running experiments that involve the execution of commands with different combinations of command line arguments.
This is a somewhat short reference documentation of the interface. There is also a longer explanation including various examples.
```python
def add(
    name,
    command,
    arguments_descr,
    creates_file=None,
    stdout_file=None,
    stdout_mod=_identity,
    stdout_res=None,
    header_string=None,
    header_command=None,
    header_mod=_identity,
    return_string=None,
    allowed_return_codes=[0],
    combinations_filter=lambda args: True,
):
    """Add a new experiment.

    Based on the experiment description, a set of individual runs is
    generated, where each individual run basically corresponds to a
    set of command line arguments.  The arguments are represented by a
    dictionary, i.e., each argument has a key and a value.

    To describe how the arguments of an individual run can be used in
    several places, we need the concept of a *blob*.  Ultimately, a
    blob is something that will be evaluated to a string by replacing
    wildcards of the form ``[[key]]`` with the value of the
    corresponding argument.  A blob can also be a function, in which
    case it is evaluated (with the arguments as parameter) before
    doing this kind of replacements.  For more details on how a blob
    turns into a string, see the function ``deblob()``.  Note that a
    blob is only defined in the context of an individual run.  Thus,
    whenever we talk about a blob, we implicitly have an individual run
    with a specific set of arguments in mind.

    Parameters
    ----------

    name: blob

      Name of the experiment.  An experiment is only run if its name
      (or the name of its ``group``) appears as command line parameter.
      It does not need to be unique among experiments.

    command: blob

      The command that will be called for each run.

    arguments_descr: dictionary

      Dictionary of arguments or lists of arguments.  In case of
      lists, one run for each combination of arguments is generated.
      Each individual argument is a blob, where the blobs are
      evaluated to strings in order of appearance in the dictionary.

    creates_file: blob, optional

      Describes the name of a file that is created by calling the
      command.  This is only used to skip the run if the file already
      exists (at the time when this function is called, not at the time
      when the command is actually run).

    stdout_file: blob, optional

      The filename to which the standard output of the run should be
      written.  There are three different cases, depending on whether
      this file already exists:

      1. If the file exists when calling this function (i.e., before
      performing any runs), the run is skipped.

      1. If the run is not skipped but the file exists when executing
      the run, the standard output is added at the end of the file.

      1. If the file does not exist, it is created beginning with the
      header (if specified) and then the standard output is added.

    stdout_mod: function, default =``identity``

      A function applied to the standard output of the run, before
      writing it to the file.  If the function takes one argument, it
      gets the standard output as string as input.  Otherwise, it
      should take two arguments, the standard output as string and the
      result of a call to ``subprocess.run()``.  The latter gives
      access to additional information such as the return code.  The
      function can return a blob using the special wildcard
      ``[[stdout]]`` (similar to ``stdout_res``).

    stdout_res: blob, optional

      If given, this blob is written to the file instead of the
      standard output itself.  This blob is somewhat special in the
      sense that it is evaluated after the run has finished with the
      special argument ``stdout``, i.e., the blob can contain the
      special wildcard ``[[stdout]]``, which will be replaced by the
      standard output (after it was modified by ``stdout_mod``).

    header_string: blob, optional

      A string specifying the header; see input parameter
      ``stdout_file``.

    header_command: blob, optional

      A command that is run to use its standard output as header.

    header_mod: function, default =``identity``

      A function that is applied to the header (specified by
      ``header_string`` or ``header_command``) before writing it to a
      file.  It should take one argument (a string) and return a
      string.

    return_string: blob, optional

      If given, this blob will be evaluated for each run and a list of
      the results is returned.

    allowed_return_codes: list[int], default =``[0]``

      A list of allowed return codes.  If a command returns any other
      code, a warning is printed and the run is aborted.  The empty
      list ``[]`` indicates that any return code should be accepted.

    combinations_filter: function, default = always ``True``

      A function that filters the combinations of arguments.  It
      should take a dictionary of arguments and decide whether it
      represents a valid combination by returning ``True`` or
      ``False``.  The default returns always ``True``, i.e., a run is
      created for every combination.

    Returns
    -------
    None or list[string]
      See documentation of ``return_string``.

    """
    if stdout_mod != _identity and stdout_file is None:
        _print_warning("stdout_mod has no effect if stdout_file is not " "specified")
    if stdout_res is not None and stdout_file is None:
        _print_warning("stdout_res has no effect if stdout_file is not " "specified")
    if header_string is not None and stdout_file is None:
        _print_warning("header_string has no effect if stdout_file is not " "specified")
    if header_command is not None and stdout_file is None:
        _print_warning(
            "header_command has no effect if stdout_file is not " "specified"
        )
    if header_string is not None and header_command is not None:
        _print_warning(
            "header_string and header_command specified" " - Which one should I use?"
        )
    if header_mod != _identity and header_string is None and header_command is None:
        _print_warning(
            "header_mod has no effect if not one of "
            "header_string or header_command are specified"
        )

    # generate the set of arguments
    arguments_descr = {
        k: v if isinstance(v, list) else [v] for k, v in arguments_descr.items()
    }
    arguments_set = [
        dict(zip(arguments_descr.keys(), vals))
        for vals in itertools.product(*arguments_descr.values())
    ]

    return_strings = []
    for args in arguments_set:
        if not combinations_filter(args):
            continue
        for key, val in args.items():
            args[key] = deblob(val, args)

        real_name = deblob(name, args)
        if real_name not in _state.groups[_state.group]:
            _state.groups[_state.group].append(real_name)

        run = _Run(
            name=real_name,
            command=deblob(command, args),
            args=args,
            creates_file=deblob(creates_file, args),
            stdout_file=deblob(stdout_file, args),
            stdout_mod=stdout_mod,
            stdout_res=stdout_res,
            header_string=deblob(header_string, args),
            header_command=deblob(header_command, args),
            header_mod=header_mod,
            allowed_return_codes=allowed_return_codes,
            is_selected=_is_selected(real_name),
        )
        _add_run(run)
        if return_string is not None:
            return_strings.append(deblob(return_string, args))

    if return_string is not None:
        return return_strings
```
Add a new experiment.
Based on the experiment description, a set of individual runs is generated, where each individual run corresponds to a set of command line arguments. The arguments are represented by a dictionary, i.e., each argument has a key and a value.
To describe how the arguments of an individual run can be used in several places, we need the concept of a blob. Ultimately, a blob is something that will be evaluated to a string by replacing wildcards of the form `[[key]]` with the value of the corresponding argument. A blob can also be a function, in which case it is evaluated (with the arguments as parameter) before these replacements are made. For more details on how a blob turns into a string, see the function `deblob()`. Note that a blob is only defined in the context of an individual run. Thus, whenever we talk about a blob, we implicitly have an individual run with a specific set of arguments in mind.
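To make the blob concept concrete, here is a minimal sketch of how a string blob and a function blob could each turn into a string for one run. The helper `expand` is a hypothetical stand-in written for this illustration, not the module's own function, and the command and file names are made up.

```python
def expand(blob, args):
    """Hypothetical stand-in for the module's blob evaluation."""
    # A function blob is first called with the arguments of the run.
    if callable(blob):
        blob = blob(args)
    text = str(blob)
    # Each [[key]] wildcard is replaced by the value of that argument.
    for key, value in args.items():
        text = text.replace("[[" + key + "]]", str(value))
    return text


# The arguments of one individual run.
args = {"algo": "quicksort", "n": "1000"}

# A plain string blob with two wildcards.
command = expand("./sort --algo [[algo]] --size [[n]]", args)


# A function blob: it inspects the arguments and returns a string
# that may itself still contain wildcards.
def output_name(a):
    prefix = "big" if int(a["n"]) >= 1000 else "small"
    return prefix + "_[[algo]].txt"


outfile = expand(output_name, args)

print(command)
print(outfile)
```

The same arguments drive both results, which is why a blob only has a meaning relative to one specific run.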
Parameters
- name (blob): Name of the experiment. An experiment is only run if its name (or the name of its `group`) appears as command line parameter. It does not need to be unique among experiments.
- command (blob): The command that will be called for each run.
- arguments_descr (dictionary): Dictionary of arguments or lists of arguments. In case of lists, one run for each combination of arguments is generated. Each individual argument is a blob, where the blobs are evaluated to strings in order of appearance in the dictionary.
- creates_file (blob, optional): Describes the name of a file that is created by calling the command. This is only used to skip the run if the file already exists (at the time when this function is called, not at the time when the command is actually run).
- stdout_file (blob, optional): The filename to which the standard output of the run should be written. There are three different cases, depending on whether this file already exists:
  1. If the file exists when calling this function (i.e., before performing any runs), the run is skipped.
  2. If the run is not skipped but the file exists when executing the run, the standard output is added at the end of the file.
  3. If the file does not exist, it is created beginning with the header (if specified) and then the standard output is added.
- stdout_mod (function, default = `identity`): A function applied to the standard output of the run before writing it to the file. If the function takes one argument, it receives the standard output as a string. Otherwise, it should take two arguments: the standard output as a string and the result of a call to `subprocess.run()`. The latter gives access to additional information such as the return code. The function can return a blob using the special wildcard `[[stdout]]` (similar to `stdout_res`).
- stdout_res (blob, optional): If given, this blob is written to the file instead of the standard output itself. This blob is somewhat special in the sense that it is evaluated after the run has finished, with the special argument `stdout`, i.e., the blob can contain the special wildcard `[[stdout]]`, which will be replaced by the standard output (after it was modified by `stdout_mod`).
- header_string (blob, optional): A string specifying the header; see input parameter `stdout_file`.
- header_command (blob, optional): A command that is run to use its standard output as header.
- header_mod (function, default = `identity`): A function that is applied to the header (specified by `header_string` or `header_command`) before writing it to a file. It should take one argument (a string) and return a string.
- return_string (blob, optional): If given, this blob will be evaluated for each run and a list of the results is returned.
- allowed_return_codes (list[int], default = `[0]`): A list of allowed return codes. If a command returns any other code, a warning is printed and the run is aborted. The empty list `[]` indicates that any return code should be accepted.
- combinations_filter (function, default = always `True`): A function that filters the combinations of arguments. It should take a dictionary of arguments and decide whether it represents a valid combination by returning `True` or `False`. The default always returns `True`, i.e., a run is created for every combination.
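The way `arguments_descr` and `combinations_filter` interact can be sketched as follows. The helper name `combinations` and the example arguments are assumptions made for this illustration; the expansion itself uses the Cartesian product, as the module's source does with `itertools.product`.

```python
import itertools


def combinations(arguments_descr, combinations_filter=lambda args: True):
    """Hypothetical helper: expand a description into one dict per run."""
    # Treat single values as one-element lists.
    as_lists = {
        key: value if isinstance(value, list) else [value]
        for key, value in arguments_descr.items()
    }
    # One dictionary per element of the Cartesian product.
    all_args = [
        dict(zip(as_lists.keys(), values))
        for values in itertools.product(*as_lists.values())
    ]
    # Keep only the combinations accepted by the filter.
    return [args for args in all_args if combinations_filter(args)]


descr = {"algo": ["bfs", "dfs"], "n": [10, 100, 1000], "seed": 42}

# 2 algorithms x 3 sizes x 1 seed
runs = combinations(descr)


# Skip the largest input for dfs only.
def not_too_big(args):
    return not (args["algo"] == "dfs" and args["n"] > 100)


filtered = combinations(descr, not_too_big)

print(len(runs), len(filtered))
```

Note that the filter sees the raw argument values (here integers), since it is applied before the arguments are evaluated to strings.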
Returns
- None or list[string]: See the documentation of `return_string`.
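The handling of `stdout_mod`, `stdout_file` and the header described in the parameter list can be sketched roughly like this. The helper names `apply_mod` and `write_output` and the class `FakeResult` are hypothetical, and the sketch leaves out the file locking that the module performs when several runs write to the same file.

```python
import os
import tempfile
from inspect import signature


def apply_mod(mod, stdout, result):
    """Call mod with one or two arguments, depending on its signature."""
    if len(signature(mod).parameters) == 1:
        return mod(stdout)
    return mod(stdout, result)


def write_output(filename, output, header=None):
    """Create the file with a header if needed, then append the output."""
    if header is not None and not os.path.isfile(filename):
        with open(filename, "w") as out:
            print(header, file=out)
    with open(filename, "a") as out:
        print(output, file=out)


class FakeResult:
    """Hypothetical stand-in for the object returned by subprocess.run()."""

    returncode = 0


# One-argument modifier: only sees the standard output.
first = apply_mod(lambda out: out.upper(), "ok", FakeResult())
# Two-argument modifier: can also look at the return code.
second = apply_mod(lambda out, res: out + "," + str(res.returncode), "ok", FakeResult())

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.csv")
    # First write creates the file and puts the header on top.
    write_output(path, "bfs,10,0.5", header="algo,n,time")
    # Second write finds the file and only appends.
    write_output(path, "dfs,10,0.7", header="algo,n,time")
    with open(path) as f:
        lines = f.read().splitlines()

print(first, second, lines)
```

Because the header is written only when the file is created, several runs can share one output file and still produce a single header line.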
```python
def run():
    """Run the previously declared experiments.

    You should call this exactly once at the end of the file.

    If ``dry_run`` is given as command line parameter, then the runs are
    not executed but the commands printed to ``stdout``.

    """
    global _state
    _state.run_was_called = True
    _state.time_start_run = time.time()

    _print_runs()

    if _is_selected("dry_run"):
        _run_dry()
        _state = _State()
        return

    _print_section("\nrunning the experiments:")

    with ProcessingPool(nodes=_cores) as pool:
        for name, runs in _state.runs_by_name.items():
            if len(runs) == 0:
                continue
            # run in parallel
            orig_sigint_handler = signal.signal(signal.SIGINT, signal.SIG_IGN)
            signal.signal(signal.SIGINT, orig_sigint_handler)
            try:
                for _ in tqdm.tqdm(
                    pool.uimap(_run_run, runs),
                    desc=name.ljust(_max_name_len()),
                    total=len(runs),
                ):
                    pass
            except KeyboardInterrupt:
                _print_warning("aborted during experiment " + name)
                pool.close()
    _state.run_completed = True
    _state = _State()
```
Run the previously declared experiments.
You should call this exactly once at the end of the file.
If `dry_run` is given as command line parameter, the runs are not executed; instead, their commands are printed to `stdout`.
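The difference between a dry run and a real run, together with the return code check controlled by `allowed_return_codes`, might look roughly like the following. The helper `run_commands` is a hypothetical simplification that executes the commands one after another, and the two example commands assume a shell that understands `echo` and `exit`.

```python
import subprocess
import sys


def run_commands(commands, dry_run=False, allowed_return_codes=(0,)):
    """Hypothetical helper: print the commands or execute them."""
    outputs = []
    for command in commands:
        if dry_run:
            # Dry run: show what would be executed and do nothing else.
            print(command)
            continue
        res = subprocess.run(
            command,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        # An empty collection of allowed codes accepts every return code.
        if allowed_return_codes and res.returncode not in allowed_return_codes:
            print("WARNING: unexpected return code", res.returncode, file=sys.stderr)
            continue
        outputs.append(res.stdout.strip())
    return outputs


commands = ["echo hello", "exit 3"]

dry = run_commands(commands, dry_run=True)  # prints both commands
real = run_commands(commands)  # warns about the second command
lenient = run_commands(commands, allowed_return_codes=())
```

A dry run is a cheap way to check the generated commands before spending time on the actual experiments.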
```python
def use_cores(nr_cores):
    """Set the number of cores used to run the experiments.

    Parameters
    ----------

    nr_cores: int

      The number of cores that should be used.

    """
    global _cores
    _cores = nr_cores
```
Set the number of cores used to run the experiments.
Parameters
- nr_cores (int): The number of cores that should be used.
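What the number of cores controls can be illustrated with a small worker pool. This sketch swaps in the standard library's `ThreadPoolExecutor` for the process pool the module uses, and `run_parallel` is a hypothetical helper; unlike the module, it returns the results in input order.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, nr_cores=4):
    """Hypothetical helper: run argument lists on nr_cores workers."""

    def execute(argv):
        res = subprocess.run(argv, stdout=subprocess.PIPE, universal_newlines=True)
        return res.stdout.strip()

    # At most nr_cores commands are in flight at the same time.
    with ThreadPoolExecutor(max_workers=nr_cores) as pool:
        # map() yields the results in the order of the inputs.
        return list(pool.map(execute, commands))


# Four tiny "experiments", each printing a square number.
commands = [[sys.executable, "-c", "print({0} * {0})".format(i)] for i in range(4)]
squares = run_parallel(commands, nr_cores=2)
print(squares)
```

Threads are sufficient here because each worker only waits for an external process; the actual work happens in the child processes.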
```python
def group(group_name):
    """Set the current group.

    Each experiment created with ``add()`` is added to the group for
    which this function was last called.

    Parameters
    ----------

    group_name: string

      The name of the group.

    """
    _state.group = group_name
    if group_name not in _state.groups:
        _state.groups[group_name] = []
```
Set the current group.
Each experiment created with `add()` is added to the group for which this function was last called.
Parameters
- group_name (string): The name of the group.
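Groups matter when selecting experiments on the command line: an experiment is run if its own name or the name of its group is given. A minimal sketch of that rule follows; the helper `is_selected`, the dictionary layout and the example names are assumptions, and the shell-style wildcard matching with `fnmatchcase` mirrors what the module's source does internally.

```python
from fnmatch import fnmatchcase


def is_selected(name, groups, argv):
    """Hypothetical helper mirroring the selection rule described above."""
    # Selected directly: the name matches one of the command line patterns.
    if any(fnmatchcase(name, pattern) for pattern in argv):
        return True
    # Selected indirectly: the experiment belongs to a matching group.
    for group_name, names in groups.items():
        if name in names and any(fnmatchcase(group_name, p) for p in argv):
            return True
    return False


# Group name -> names of the experiments added while it was current.
groups = {
    "ungrouped": [],
    "sorting": ["quicksort", "mergesort"],
    "plots": ["scatter"],
}

by_name = is_selected("quicksort", groups, ["script.py", "quicksort"])
by_group = is_selected("mergesort", groups, ["script.py", "sorting"])
by_pattern = is_selected("mergesort", groups, ["script.py", "sort*"])
not_chosen = is_selected("scatter", groups, ["script.py", "sort*"])
print(by_name, by_group, by_pattern, not_chosen)
```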
```python
def section(title):
    """Print a section title.

    Parameters
    ----------

    title: string

      The title that should be printed.

    """
    _print_section(title)
```
Print a section title.
Parameters
- title (string): The title that should be printed.
```python
def deblob(blob, args):
    """Transforms a blob into a string.

    This function is meant for internal use but understanding what it
    does might be useful.  A blob is transformed into a string in the
    following steps.

    1. If ``blob`` is a function, it is called with ``args`` as parameter.

    1. The result (or ``blob`` itself, if 1. did not apply) is converted
       to a string (using ``str()``).

    1. Every pattern of the form ``[[key]]`` is replaced by the value of
       the corresponding argument in ``args``.

    Step 3 assumes that every pattern of the form ``[[key]]`` has a
    corresponding argument with this key.

    Parameters
    ----------

    blob: blob

      The blob that should be turned into a string.

    args: dictionary

      The named arguments of the current run.

    """
    if blob is None:
        return None

    if callable(blob):
        blob = blob(args)

    blob = str(blob)

    keys = [m.group(1) for m in re.finditer(r"\[\[([^\[\]]*)\]\]", blob)]
    result = blob
    for key in keys:
        if key not in args:
            _print_warning("No value for [[" + key + "]] found to replace in " + blob)
            continue
        result = result.replace("[[" + key + "]]", args[key])
    return result
```
Transforms a blob into a string.
This function is meant for internal use, but understanding what it does might be useful. A blob is transformed into a string in the following steps.
1. If `blob` is a function, it is called with `args` as parameter.
2. The result (or `blob` itself, if step 1 did not apply) is converted to a string (using `str()`).
3. Every pattern of the form `[[key]]` is replaced by the value of the corresponding argument in `args`.
Step 3 assumes that every pattern of the form `[[key]]` has a corresponding argument with this key.
Parameters
- blob (blob): The blob that should be turned into a string.
- args (dictionary): The named arguments of the current run.
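The three steps can be traced on a few inputs with the following sketch. The function `to_string` is a hypothetical re-implementation written for illustration; it leaves unknown wildcards in place without printing a warning and converts argument values with `str()`, both of which are choices of this sketch.

```python
import re


def to_string(blob, args):
    """Hypothetical re-implementation of the three steps listed above."""
    if blob is None:
        return None
    # Step 1: a function blob is called with the arguments.
    if callable(blob):
        blob = blob(args)
    # Step 2: whatever we have now is converted to a string.
    text = str(blob)

    # Step 3: every [[key]] with a known key is replaced by its value.
    def substitute(match):
        key = match.group(1)
        return str(args[key]) if key in args else match.group(0)

    return re.sub(r"\[\[([^\[\]]*)\]\]", substitute, text)


args = {"graph": "road.txt", "k": "8"}

plain = to_string("partition [[graph]] -k [[k]]", args)
# A function blob returning a number, which step 2 turns into a string.
computed = to_string(lambda a: int(a["k"]) * 2, args)
# A wildcard without a matching argument is left as it is.
unknown = to_string("[[graph]] [[seed]]", args)
print(plain, computed, unknown)
```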