Command Line Interface

Tadah!MLIP’s CLI follows one simple pattern; once you know the conventions you can drive every tool—from dataset wrangling to hyper-parameter optimisation.

Basic invocation

tadah <command> [<subcommand>] [OPTIONS]
  • command – one of the top-level verbs listed in the map below (analysis, data …).

  • subcommand – some commands are families that expose specialist sub-tools, e.g. analysis bfunc or data dedup. Commands without children omit this level.

  • OPTIONS – long keys start with --; single-letter aliases start with -. Boolean switches are always written in CAPS for their short form (-F, -S, -A …).

Tasks vs. direct CLI

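Anything that can be written on the command line can be placed inside a task file and executed with --task tasks.tadah. A task file is simply the CLI translated into an INI-style block, with each long option written as an upper-case key (--verbose 2 becomes VERBOSE 2) and each task introduced by a TASK line: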

# global options
NUMERIC 14
VERBOSE 2

TASK predict
DBFILE   data.tadah
FORCE    true
ANALYTICS true
...
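
The translation from a command line to a task block can be sketched in a few lines of Python. This is only an illustration of the convention described above, not Tadah!MLIP code: the function name cli_to_task_lines is hypothetical, it handles long-form options only, and it assumes --analytics is the long form of the ANALYTICS key. How hyphenated keys are spelled in a task file is outside the scope of the sketch.

```python
def cli_to_task_lines(command, argv):
    """Translate long-form CLI tokens into task-file lines (illustrative only).

    A key followed by no values is treated as a Boolean switch and written
    as 'KEY true', mirroring 'FORCE true' in the example above.
    """
    options = []  # (key, [values]) pairs in the order they were given
    for token in argv:
        if token.startswith("--"):
            options.append((token[2:], []))
        elif options:
            options[-1][1].append(token)
        else:
            raise ValueError(f"value {token!r} appears before any --option")

    lines = [f"TASK {command}"]
    for key, values in options:
        lines.append(f"{key.upper()} {' '.join(values) if values else 'true'}")
    return lines


if __name__ == "__main__":
    for line in cli_to_task_lines(
        "predict", ["--dbfile", "data.tadah", "--force", "--analytics"]
    ):
        print(line)
```

Global options such as NUMERIC and VERBOSE sit above the first TASK line in the example, so a real converter would need to emit them separately.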

Short-hand rules

  • Boolean flags never take a value on the CLI: write --force, not --force true.

  • Lists accept space-separated tokens or comma/range syntax (1,3-5,10-20:2).

  • Mutually exclusive options are flagged in the help (Excludes); Tadah!MLIP aborts if you combine them.

  • Any multi-value option can be repeated: --dbfile file1 file2 or --dbfile file1 --dbfile file2.
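
To make the comma/range syntax concrete, here is a minimal Python sketch of how a pattern such as 1,3-5,10-20:2 might expand. The function name expand_indices is hypothetical. The sketch assumes that ranges include their stop value and that results come back sorted; the 1-based indexing and removal of repeated indices follow the --index description in the commands reference.

```python
def expand_indices(pattern):
    """Expand an index pattern into a sorted list of unique 1-based indices."""
    selected = set()  # a set drops repeated indices automatically
    for part in pattern.split(","):
        part = part.strip()
        if not part:
            continue
        # Optional ':step' suffix, e.g. '10-20:2'
        span, _, step_text = part.partition(":")
        step = int(step_text) if step_text else 1
        # Either a single index ('7') or a range ('3-5')
        start_text, dash, stop_text = span.partition("-")
        start = int(start_text)
        stop = int(stop_text) if dash else start
        if start < 1 or stop < start or step < 1:
            raise ValueError(f"invalid index pattern: {part!r}")
        selected.update(range(start, stop + 1, step))  # stop assumed inclusive
    return sorted(selected)


if __name__ == "__main__":
    print(expand_indices("1,3-5,10-20:2"))
```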

Top-level command map

analysis

Visualise basis / cutoff functions or compute descriptors

  • descriptor – calculate structure descriptors

  • bfunc – evaluate/plot basis functions

  • cutoff – evaluate/plot cutoff functions

  • check_bounds – validate HPO bounds at every corner of the OPTIM box

  • dataset_stats – compute dataset statistics for HPO bound validation

  • suggest_bounds – suggest data-driven OPTIM bounds

data

Create & manipulate datasets

  • convert – DFT → Tadah! format

  • print – echo structure info

  • write – export structures

  • dedup – remove duplicates from a database

  • merge – concatenate databases

  • split – partition a database

  • sample – sample a database

  • balance – re-weight / shift energies & forces

explain

Detailed help for any command/option (dot-notation)

hpo

Global hyper-parameter optimisation of a model

predict

Run a trained potential on datasets or raw structures

properties

Evaluate benchmark properties

  • pairwise – energy vs. distance curves

train

Fit a new potential (-c config.tadah)

Commands reference

analysis

DESCRIPTION:
  Visual analysis commands.

LONG DESCRIPTION:
  Visual analysis commands (e.g., descriptor, bfunc, cutoff).

OPTIONS:
  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis descriptor -d dataset.tadah -p pot.tadah -o out.txt

analysis bfunc

DESCRIPTION:
  Evaluate and plot basis functions.

LONG DESCRIPTION:
  Evaluate and plot basis functions.

OPTIONS:
  --type  <string> [<string> ...]
    Basis function type. "2b Y" or "mb N".
    Number of arguments: Min 1, Max 2
    Description:
      Choice of basis function. For example: "2b Y" or "mb N".
      Where Y/N controls computation of the cutoff function.
    Examples:
      - string1 string2
      - string1 /path/to/file

  --rescale
    Rescale the basis function by the cutoff function. This is useful for visualisation purposes.
    Default: false

  --outfile, -o  <string>
    Output file.
    Number of arguments: Min 1, Max 1
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - validString
      - /path/to/file

  --range, -r  <START> <STOP> <NPOINTS>
    Plotting range [start stop npoints].
    Number of arguments: Min 3, Max 3
    Examples:
      - 0.1 9.5 100

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --derivative
    Calculate derivative of the function.
    Default: false

  --index, -i  <index>[,<index>...]
               <start>-<stop>
               <start>-<stop>:<step>
    Index pattern.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Allows flexible selection of dataset indices. Supports single indices, ranges
      (e.g., start-stop), lists, or intervals (start-stop:step). Indices are 1-based.
      Repeated indices are removed automatically.
    Examples:
      - 1,3,5
      - 1-4,7,9
      - 1-10:2


OPTIONS::input
  Provide either a potential file, a configuration file, or a task file.
  --potential, -p  <file>
    Trained model file.
    Number of arguments: Min 1, Max 1
    Needs: range, type
    Excludes: task, config
    Examples:
      - pot.tadah

  --config, -c  <file>
    Path to a configuration file.
    Number of arguments: Min 1, Max 1
    Needs: range, type
    Excludes: task, potential
    Examples:
      - config.tadah
      - ../config.tadah
      - /path/to/config.tadah

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis bfunc -p pot.tadah -o bfunc.txt -r 0 5 500 --type 2b Y
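
The Needs and Excludes entries listed for the input options can be read as simple set constraints. The Python sketch below encodes the three input options of analysis bfunc and checks a set of supplied option names against them. It is an illustration only: the names RULES and check_options are hypothetical, and "Excludes: ALL" is interpreted literally as "no other option may be given", which is an assumption about how Tadah!MLIP applies it.

```python
# Constraints copied from the OPTIONS::input block of 'analysis bfunc'.
RULES = {
    "potential": {"needs": {"range", "type"}, "excludes": {"task", "config"}},
    "config": {"needs": {"range", "type"}, "excludes": {"task", "potential"}},
    "task": {"needs": set(), "excludes": "ALL"},
}


def check_options(given):
    """Return a list of constraint violations (empty if the set is valid)."""
    given = set(given)
    problems = []
    for name in sorted(n for n in given if n in RULES):
        rule = RULES[name]
        # Every option named under 'Needs' must also be present.
        for missing in sorted(rule["needs"] - given):
            problems.append(f"--{name} needs --{missing}")
        # 'ALL' is taken to mean every other supplied option clashes.
        if rule["excludes"] == "ALL":
            clashes = given - {name}
        else:
            clashes = given & rule["excludes"]
        for other in sorted(clashes):
            problems.append(f"--{name} excludes --{other}")
    return problems


if __name__ == "__main__":
    # Mirrors the example command: -p, -o, -r and --type together.
    print(check_options({"potential", "outfile", "range", "type"}))
    print(check_options({"potential", "config", "range", "type"}))
```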

analysis check_bounds

DESCRIPTION:
  Validate HPO bounds: at every corner of the OPTIM box, run the pre-flight audit. Returns non-zero if any corner trips a FAIL finding. CI-friendly.

LONG DESCRIPTION:
  Phase 7 of the HPO bound-validation pipeline. Reads the HPOTARGET file
  referenced by the seed config, parses every OPTIM <KEY> (range) <lo> <hi>
  line into a Bound, samples corners (full 2^k for k <= 12, otherwise a
  Latin-hypercube subsample of min(2^k, ceil(k log2(k)))), and runs the
  Phase 5 pre-flight audit (run_preflight_from_db) at each corner with
  the corner's parameter values substituted into the seed Context.
  Returns exit code 0 if all corners pass and 1 if any corner produces a
  FAIL finding.
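
The corner-sampling rule can be made concrete with a short Python sketch. It reproduces only the counting rule quoted above and a full enumeration of corners for small k. The Latin-hypercube subsampling and the audit itself are not reproduced, and the function names and parameter keys (P1, P2) are hypothetical.

```python
import itertools
import math


def corner_count(k, full_limit=12):
    """Number of corners audited for k OPTIM bounds, per the stated rule."""
    if k <= full_limit:
        return 2 ** k  # full enumeration
    return min(2 ** k, math.ceil(k * math.log2(k)))  # subsample size


def all_corners(bounds):
    """Yield every corner of the box as a {key: value} dict.

    bounds maps a parameter name to its (lo, hi) pair.
    """
    keys = list(bounds)
    for choice in itertools.product(*(bounds[key] for key in keys)):
        yield dict(zip(keys, choice))


if __name__ == "__main__":
    for k in (3, 12, 13, 16):
        print(k, corner_count(k))
    for corner in all_corners({"P1": (1.0, 2.0), "P2": (0.1, 0.5)}):
        print(corner)
```

Under this rule the audit cost drops sharply once k exceeds 12: 13 bounds give 49 sampled corners instead of 8192.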

OPTIONS:
  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --hpotarget  <file>
    HPO target file.
    Number of arguments: Min 1, Max 1
    Examples:
      - hpotargets.txt


OPTIONS::input
  Provide a training config and an HPOTARGET file.
  --config, -c  <file>
    Path to a configuration file.
    Number of arguments: Min 1, Max 1
    Excludes: task
    Examples:
      - config.tadah
      - ../config.tadah
      - /path/to/config.tadah

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis check_bounds -c train.tadah --hpotarget config.hpo

analysis cutoff

DESCRIPTION:
  Evaluate and plot cutoff functions.

LONG DESCRIPTION:
  Evaluate and plot cutoff functions.

OPTIONS:
  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --range, -r  <START> <STOP> <NPOINTS>
    Plotting range [start stop npoints].
    Number of arguments: Min 3, Max 3
    Examples:
      - 0.1 9.5 100

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --derivative
    Calculate derivative of the function.
    Default: false


OPTIONS::input
  Provide either a cutoff type or a task file.
  --type  <string> [<string> ...]
    Generic types.
    Number of arguments: Min 1, Max MAX_INT
    Needs: range
    Excludes: task
    Examples:
      - string1 string2
      - string1 /path/to/file

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis cutoff -o cutoff.txt -r 0 3.14 100 --type "CutCos" --derivative

analysis dataset_stats

DESCRIPTION:
  Compute dataset statistics (per-pair r_ij, neighbour counts, energy/force scales) for HPO bound validation.

LONG DESCRIPTION:
  Computes per-element-pair distance percentiles + KDE peaks, neighbour-count
  percentiles per element, energy-per-atom percentiles, force-norm percentiles,
  and a material-class label over the supplied DBFILEs / STRUCTUREs. Writes
  the result to a text file (default 'dataset_stats.tadah') consumed by
  tadah hpo --suggest-bounds and the pre-flight design-matrix audit.

OPTIONS:
  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2


OPTIONS::input
  Structure source or a task file. Select from the following options:
  --config, -c  <file>
    Path to a configuration file.
    Number of arguments: Min 1, Max 1
    Needs: outfile
    Excludes: task, dbfile, structure
    Examples:
      - config.tadah
      - ../config.tadah
      - /path/to/config.tadah

  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile
    Excludes: task, config
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --structure, -s  <string> [<string> ...]
    Unified structural input(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile
    Excludes: task, config
    Description:
      Supported file formats: .cif (Crystallographic Information File), VASP
      (POSCAR/CONTCAR), and CASTEP (.cell). The online option fetches structures
      from databases (MP, COD, NOMAD). Multiple structures can be space-separated
      or repeated. A mix of files and online sources is allowed.
    Examples:
      - crystal.cif
      - crystal1.cif crystal2.cell
      - mp-42 crystal.cif

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis dataset_stats -c train.tadah --outfile stats.tadah

analysis descriptor

DESCRIPTION:
  Calculate structure descriptors.

LONG DESCRIPTION:
  Calculate structure descriptors.

OPTIONS:
  --potential, -p  <file>
    Trained model file.
    Number of arguments: Min 1, Max 1
    Examples:
      - pot.tadah

  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --index, -i  <index>[,<index>...]
               <start>-<stop>
               <start>-<stop>:<step>
    Index pattern.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Allows flexible selection of dataset indices. Supports single indices, ranges
      (e.g., start-stop), lists, or intervals (start-stop:step). Indices are 1-based.
      Repeated indices are removed automatically.
    Examples:
      - 1,3,5
      - 1-4,7,9
      - 1-10:2

  --force, -F
    Include forces.
    Default: false

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --merge
    Merge deduplication results into one file.
    Default: false

  --append
    Append to the existing file.
    Needs: outfile
    Default: false


OPTIONS::input
  Structure source or a task file. Select from the following options:
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile, potential
    Excludes: task
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --structure, -s  <string> [<string> ...]
    Unified structural input(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile, potential
    Excludes: task, index
    Description:
      Supported file formats: .cif (Crystallographic Information File), VASP
      (POSCAR/CONTCAR), and CASTEP (.cell). The online option fetches structures
      from databases (MP, COD, NOMAD). Multiple structures can be space-separated
      or repeated. A mix of files and online sources is allowed.
    Examples:
      - crystal.cif
      - crystal1.cif crystal2.cell
      - mp-42 crystal.cif

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis descriptor -d dataset.tadah -p pot.tadah -o out.txt --index 1,3,5

analysis suggest_bounds

DESCRIPTION:
  Suggest data-driven OPTIM bounds for the configured descriptors.

LONG DESCRIPTION:
  Reads the user's training config (DBFILEs + TYPE2B / TYPEMB), computes
  dataset statistics (per-pair r_ij percentiles + KDE peaks), then walks
  each configured descriptor's meta() and proposes one OPTIM bound per
  parameter. Writes a compilable OPTIM block (or to stdout if --outfile
  is omitted). Phase 4 of the HPO bound-validation pipeline.

OPTIONS:
  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2


OPTIONS::input
  Source: a training config or a task file.
  --config, -c  <file>
    Path to a configuration file.
    Number of arguments: Min 1, Max 1
    Excludes: task
    Examples:
      - config.tadah
      - ../config.tadah
      - /path/to/config.tadah

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah analysis suggest_bounds -c train.tadah --outfile suggested.hpo

data

DESCRIPTION:
  Dataset management commands.

LONG DESCRIPTION:
  Dataset management commands (convert, print, write, merge, split, dedup, sample, balance).

OPTIONS:
  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data --task tasks.tadah

data balance

DESCRIPTION:
  Apply energy shifts and/or rescaling to a dataset.

LONG DESCRIPTION:
  Modify a dataset by applying element-specific energy shifts or rescaling
  the weights for energies, forces, and stresses. The procedure can include
  threshold checks, forcing near-zero values to a default weight, while large
  magnitudes are inversely weighted. The updated dataset is then written to
  one or more output files, which can be merged if desired. This approach
  enhances the dataset's consistency and is especially useful when refining
  potential parameters or emphasizing certain configurations during training.

OPTIONS:
  --force, -F
    Apply rescaling to forces.
    Default: false

  --stress, -S
    Apply rescaling to stresses.
    Default: false

  --threshold  <double> [<double> ...]
    Floating point thresholds for energy, force, and stress.
    Number of arguments: Min 1, Max 3
    Default: 1e-4, 1e-5, 1e-6
    Description:
      If the energy, the sum of force norms, or the stress-matrix norm exceeds the
      corresponding threshold, that quantity is rescaled using inverse
      weighting.
    Examples:
      - 2.0 -4.65 0.4
      - -1.0

  --rescale
    Rescale structure weights.
    Default: false
    Description:
      Applies inverse weighting to energies, forces, and stresses based on their
      magnitudes. Specifically:
      • Energies: 1 / | energy |
      • Total force: 1 / Σ‖ force_i ‖
      • Stress: 1 / ‖stress‖
      This ensures smaller magnitudes keep their weights, while large values are
      downweighted to mitigate numerical instability.
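
The weighting rule can be sketched in Python as follows. This is a reading of the description above, not the Tadah!MLIP implementation. The function names are hypothetical, the default weight of 1.0 for sub-threshold values and the use of the Frobenius norm for the stress matrix are assumptions, and the default thresholds are those listed for --threshold.

```python
import math


def inverse_weight(magnitude, threshold, default_weight=1.0):
    """Return 1/magnitude above the threshold, else the default weight.

    default_weight=1.0 is an assumption; the text only says near-zero
    values are forced to 'a default weight'.
    """
    return 1.0 / magnitude if magnitude > threshold else default_weight


def structure_weights(energy, forces, stress, thresholds=(1e-4, 1e-5, 1e-6)):
    """Return (energy, force, stress) weights for one structure."""
    e_thr, f_thr, s_thr = thresholds
    # Sum of per-atom force norms.
    force_sum = sum(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)
    # Frobenius norm of the stress matrix (assumed choice of norm).
    stress_norm = math.sqrt(sum(v * v for row in stress for v in row))
    return (
        inverse_weight(abs(energy), e_thr),
        inverse_weight(force_sum, f_thr),
        inverse_weight(stress_norm, s_thr),
    )


if __name__ == "__main__":
    zero_stress = [[0.0] * 3 for _ in range(3)]
    print(structure_weights(-4.0, [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)], zero_stress))
```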

  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --eshift  <double> [<double> ...]
    Per-atom reference energy to subtract from each configuration.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Per-element reference energies. If there are multiple species, the number of
      values must match the number of species (sorted by Z). At load time the total
      energy of each configuration is reduced by sum_Z N_Z * ESHIFT[Z], so an
      isolated-atom config with energy E_atom and ESHIFT[Z]=E_atom yields a
      post-shift energy of zero. Used by tadah train, tadah predict, tadah hpo,
      and tadah data balance. Persisted into pot.tadah for prediction round-trip.
    Examples:
      - 0.5
      - 0.5 -0.1
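
The shift described above is plain arithmetic, sketched here in Python. The helper names eshift_table and shifted_energy are hypothetical, and the energies are made-up numbers chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def eshift_table(species_z, values):
    """Pair --eshift values with species sorted by atomic number Z."""
    ordered = sorted(set(species_z))
    if len(values) != len(ordered):
        raise ValueError("number of ESHIFT values must match number of species")
    return dict(zip(ordered, values))


def shifted_energy(total_energy, atomic_numbers, eshift_by_z):
    """Return E - sum_Z N_Z * ESHIFT[Z] for one configuration."""
    counts = Counter(atomic_numbers)  # N_Z for each species present
    return total_energy - sum(n * eshift_by_z[z] for z, n in counts.items())


if __name__ == "__main__":
    # Two species given out of order; values are matched to Z = 1, then Z = 8.
    table = eshift_table([8, 1], [-0.5, -2.0])
    print(shifted_energy(-10.0, [1, 1, 8], table))  # two atoms of Z=1, one of Z=8
    print(shifted_energy(-0.5, [1], table))         # isolated atom
```

The second call illustrates the isolated-atom case from the description: when ESHIFT[Z] equals the atom's own energy, the shifted energy is zero.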

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --append
    Append to the existing file.
    Needs: outfile
    Default: false

  --merge
    Merge deduplication results into one file.
    Default: false

  --index, -i  <index>[,<index>...]
               <start>-<stop>
               <start>-<stop>:<step>
    Index pattern.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Allows flexible selection of dataset indices. Supports single indices, ranges
      (e.g., start-stop), lists, or intervals (start-stop:step). Indices are 1-based.
      Repeated indices are removed automatically.
    Examples:
      - 1,3,5
      - 1-4,7,9
      - 1-10:2


OPTIONS::input
  Select from the following options:
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile
    Excludes: task
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is simply a command to be executed or both command and subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data balance -d db.tadah --rescale --eshift 0.5 --numeric 16 --outfile balanced.tadah
  - tadah data balance -d db1.tadah db2.tadah --rescale --outfile balanced1.tadah balanced2.tadah

data convert

DESCRIPTION:
  Convert DFT output file(s) to Tadah! dataset format.

LONG DESCRIPTION:
  Convert DFT output file(s) to Tadah! dataset format.

OPTIONS:
  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --append
    Append to the existing file.
    Needs: outfile
    Default: false


OPTIONS::input
  Input sources for conversion. Select from the following options:
  --dft-file  <string> [<string> ...]
    Input DFT file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile
    Excludes: task
    Description:
      A single file or multiple files (space-separated). Used to extract reference
      data for training. Supported formats: VASP (OUTCAR, vasprun.xml), CASTEP
      (.castep, .md, .geom).
    Examples:
      - run1.outcar
      - run1.outcar run2.outcar

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data convert --dft-file run1.outcar -o output.tadah

data dedup

DESCRIPTION:
  Remove duplicate structures from a dataset.

LONG DESCRIPTION:
  Remove duplicate structures from Tadah! dataset(s). The --merge option combines
  the output into a single file.

OPTIONS:
  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --threshold  <double> [<double> ...]
    Floating point comparison threshold.
    Number of arguments: Min 1, Max MAX_INT
    Examples:
      - 1e-4

  --merge
    Merge deduplication results into one file.
    Default: false

  --append
    Append to the existing file.
    Needs: outfile
    Default: false


OPTIONS::input
  Select from the following options:
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile
    Excludes: task
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data dedup -d db1.tadah db2.tadah -o dedup1.tadah dedup2.tadah
  - tadah data dedup -d db1.tadah db2.tadah -o db_merged.tadah --merge

data merge

DESCRIPTION:
  Merge multiple dataset files into a single file.

LONG DESCRIPTION:
  Merge multiple dataset files into a single file.

OPTIONS:
  --outfile, -o  <string>
    Output file.
    Number of arguments: Min 1, Max 1
    Description:
      The output file to be written. This command writes a single output file.
    Examples:
      - merged.tadah
      - /path/to/file

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12


OPTIONS::input
  Sources for merging. Select from the following options:
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data merge -d db1.tadah db2.tadah -o merged.tadah

data print

DESCRIPTION:
  Print structure information to screen.

LONG DESCRIPTION:
  Print structure information from a dataset or from structure files.

OPTIONS:
  --index, -i  <index>[,<index>...]
               <start>-<stop>
               <start>-<stop>:<step>
    Index pattern.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Allows flexible selection of dataset indices. Supports single indices, ranges
      (e.g., start-stop), lists, or intervals (start-stop:step). Indices are 1-based.
      Repeated indices are removed automatically.
    Examples:
      - 1,3,5
      - 1-4,7,9
      - 1-10:2

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12


OPTIONS::input
  Input sources for printing. Select from the following options:
  --dbfile, -d  <string>
    Path to a Tadah! database file.
    Number of arguments: Min 1, Max 1
    Excludes: task
    Description:
      Absolute or relative path to the Tadah! database file. A relative path is
      interpreted relative to the current working directory. This command accepts
      a single database file.
    Examples:
      - dataset.tadah
      - /path/to/file

  --structure, -s  <string> [<string> ...]
    Unified structural input(s).
    Number of arguments: Min 1, Max MAX_INT
    Excludes: task, index
    Description:
      Supported file formats: .cif (Crystallographic Information File), VASP
      (POSCAR/CONTCAR), and CASTEP (.cell). The online option fetches structures
      from databases (MP, COD, NOMAD). Multiple structures can be space-separated
      or repeated. A mix of files and online sources is allowed.
    Examples:
      - crystal.cif
      - crystal1.cif crystal2.cell
      - mp-42 crystal.cif

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data print -d dataset.tadah
  - tadah data print -s crystal.cif
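
The --index pattern syntax can be illustrated with a small Python parser. This is a sketch of the documented grammar (single indices, start-stop ranges, start-stop:step intervals, comma-separated lists, 1-based, duplicates removed) and not Tadah!'s own parser. The function name parse_index_pattern is hypothetical; treating the stop value as inclusive and returning the indices sorted are assumptions.

```python
def parse_index_pattern(pattern):
    """Expand a pattern such as '1-4,7,9' into a list of unique 1-based indices."""
    indices = set()
    for token in pattern.split(","):
        token = token.strip()
        if not token:
            continue
        step = 1
        if ":" in token:
            # '<start>-<stop>:<step>'
            token, step_text = token.split(":", 1)
            step = int(step_text)
        if "-" in token:
            # '<start>-<stop>' (stop assumed inclusive)
            start_text, stop_text = token.split("-", 1)
            start, stop = int(start_text), int(stop_text)
        else:
            start = stop = int(token)
        if start < 1 or stop < start or step < 1:
            raise ValueError(f"invalid index token: {token!r}")
        indices.update(range(start, stop + 1, step))
    # A set drops repeated indices; sorting is a choice made for this sketch.
    return sorted(indices)


print(parse_index_pattern("1-4,7,9"))
print(parse_index_pattern("1-10:2"))
```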

data sample

DESCRIPTION:
  Sample configurations from a dataset.

LONG DESCRIPTION:
  Sample a subset of configurations from an existing dataset to create a new,
  single output dataset. For the --index option, ensure requested sample size
  does not exceed the number of available configurations.

OPTIONS:
  --outfile, -o  <string>
    Output file.
    Number of arguments: Min 1, Max 1
    Description:
      The output file to be written. This command writes a single output file.
    Examples:
      - sample.tadah
      - /path/to/file

  --uniform  <unsigned integer>
    Sample uniformly every N-th entry.
    Number of arguments: Min 1, Max 1
    Excludes: random, index
    Examples:
      - 10

  --random  <unsigned integer>
    Randomly sample N entries.
    Number of arguments: Min 1, Max 1
    Excludes: uniform, index
    Examples:
      - 5

  --index, -i  <index>[,<index>...]
               <start>-<stop>
               <start>-<stop>:<step>
    Index pattern.
    Number of arguments: Min 1, Max MAX_INT
    Excludes: random, uniform
    Description:
      Allows flexible selection of dataset indices. Supports single indices, ranges
      (e.g., start-stop), lists, or intervals (start-stop:step). Indices are 1-based.
      Repeated indices are removed automatically.
    Examples:
      - 1,3,5
      - 1-4,7,9
      - 1-10:2

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --append
    Append to the existing file.
    Needs: outfile
    Default: false


OPTIONS::input
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile
    Excludes: task
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data sample -d full.tadah -o sample.tadah
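
The difference between --uniform and --random can be sketched in Python. This is an illustration only: the function names are hypothetical, starting the uniform selection at the first entry is an assumption, and keeping the random picks in dataset order is a choice made for the sketch, not documented behaviour.

```python
import random


def sample_uniform(entries, n):
    """Keep every n-th entry, starting from the first one (assumed)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return entries[::n]


def sample_random(entries, n, seed=None):
    """Pick n distinct entries at random, returned in dataset order."""
    if n > len(entries):
        raise ValueError("sample size exceeds the number of configurations")
    rng = random.Random(seed)
    picked = sorted(rng.sample(range(len(entries)), n))
    return [entries[i] for i in picked]


configs = list(range(1, 31))          # stand-in for 30 configurations
print(sample_uniform(configs, 10))    # every 10th entry
print(sample_random(configs, 5, seed=0))
```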

data split

DESCRIPTION:
  Split a dataset into parts.

LONG DESCRIPTION:
  Split a dataset into parts.

OPTIONS:
  --even
    Split dataset into equal-size partitions.
    Excludes: chunk, percent
    Default: false
    Description:
      The number of partitions is determined by the number of output files
      provided. The last partition may be smaller if the dataset size is not
      divisible by the number of partitions.

  --chunk  <unsigned integer> [<unsigned integer> ...]
    Specify chunk sizes.
    Number of arguments: Min 1, Max MAX_INT
    Excludes: even, percent
    Examples:
      - 20 5 3
      - 10

  --percent  <unsigned integer> [<unsigned integer> ...]
    Specify percentage partition.
    Number of arguments: Min 1, Max MAX_INT
    Excludes: even, chunk
    Examples:
      - 20 5 3
      - 10

  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --shuffle
    Randomize entries before splitting.
    Default: false

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --append
    Append to the existing file.
    Needs: outfile
    Default: false


OPTIONS::input
  Select from the following options:
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - dbfile /path/to/dbfile
      - dbfile /path/to/dbfile1 /path/to/dbfile2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data split -d db.tadah -o part1.tadah part2.tadah --even
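
The partitioning rules for --even and --chunk can be sketched in Python. This is an illustration under stated assumptions, not Tadah!'s implementation: the function names are hypothetical, --even is modelled with a partition size of ceil(N / number of output files) so that only the last partition can be smaller, and entries left over after the requested --chunk sizes are simply ignored here.

```python
import math


def split_even(entries, n_parts):
    """Split into n_parts partitions of equal size; the last may be smaller."""
    if n_parts < 1:
        raise ValueError("at least one output file is required")
    size = math.ceil(len(entries) / n_parts)
    return [entries[i * size:(i + 1) * size] for i in range(n_parts)]


def split_chunks(entries, sizes):
    """Cut consecutive chunks of the requested sizes from the start."""
    parts, start = [], 0
    for size in sizes:
        parts.append(entries[start:start + size])
        start += size
    return parts


configs = list(range(10))                       # stand-in for 10 configurations
print([len(p) for p in split_even(configs, 3)])
print([len(p) for p in split_chunks(list(range(30)), [20, 5, 3])])
```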

data write

DESCRIPTION:
  Write structures into a chosen format.

LONG DESCRIPTION:
  Write structures from a dataset, a structure file, or an online source into a chosen format.

OPTIONS:
  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --index, -i  <index>[,<index>...]
               <start>-<stop>
               <start>-<stop>:<step>
    Index pattern.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Allows flexible selection of dataset indices. Supports single indices, ranges
      (e.g., start-stop), lists, or intervals (start-stop:step). Indices are 1-based.
      Repeated indices are removed automatically.
    Examples:
      - 1,3,5
      - 1-4,7,9
      - 1-10:2

  --format, -f  <fmt>
    Output format (e.g., vasp, castep, lammps).
    Number of arguments: Min 1, Max 1
    Examples:
      - castep
      - lammps
      - vasp

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2


OPTIONS::input
  Structure sources for writing. Select from the following options:
  --dbfile, -d  <string>
    Path to a Tadah! database file.
    Number of arguments: Min 1, Max 1
    Needs: outfile, index, format
    Excludes: task
    Description:
      Absolute or relative path to the Tadah! database file. A relative path is
      interpreted relative to the current working directory. This command accepts
      a single database file.
    Examples:
      - dataset.tadah
      - /path/to/file

  --structure, -s  <string> [<string> ...]
    Unified structural input(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: outfile, format
    Excludes: task, index
    Description:
      Supported file formats: .cif (Crystallographic Information File), VASP
      (POSCAR/CONTCAR), and CASTEP (.cell). The online option fetches structures
      from databases (MP, COD, NOMAD). Multiple structures can be space-separated
      or repeated. A mix of files and online sources is allowed.
    Examples:
      - crystal.cif
      - crystal1.cif crystal2.cell
      - mp-42 crystal.cif

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      The task file is a convenient way to specify multiple tasks without having to
      provide all the command-line arguments for each task. The task file should be
      in the same format as the configuration file, but it can also include additional
      information such as the task name and any specific parameters for that task.
      A task in a task file begins with the keyword 'TASK' followed by the task name.
      The task name is the command to execute, optionally followed by its subcommand.
      The lines following the TASK keyword should contain parameters required for
      the task specified above. For example, CLI --verbose 2 is 'VERBOSE 2' in the
      task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah data write -d dataset.tadah -o output.cell -f castep -i 7

explain

DESCRIPTION:
  Explain a tadah command or option in more detail.

LONG DESCRIPTION:
  Provides detailed help for a specific Tadah! command or option, describing its
  purpose and usage.

OPTIONS:
  OPTION  <command>
          <command>.<option>
          <command>.<subcommand>.<option>
    Command or option to explain in a dot separated format.
    Required: true
    Number of arguments: Min 1, Max 1
    Description:
      The command or option to explain. The format is dot-separated, where each
      part represents a level of the command hierarchy. For example, to explain
      the 'verbose' option of the 'train' command, you would use 'train.verbose'.
    Examples:
      - tadah explain train.task
      - tadah explain data.split.even


EXAMPLES:
  - tadah explain task
  - tadah explain train.verbose
  - tadah explain data.split.even

hpo

DESCRIPTION:
  Optimize the model architecture and hyperparameters.

LONG DESCRIPTION:
  Refine your model's architecture and hyperparameters using Tadah!'s nested
  fitting procedure, an iterative approach that goes beyond standard force-
  and energy-focused methods. By evaluating trial potentials with LAMMPS
  scripts, this framework allows you to incorporate performance constraints
  such as surface energies or phase stability, significantly enhancing model
  transferability. The global optimization algorithm systematically explores
  the parameter space, producing a robust interatomic potential tailored to
  your priorities for accuracy, speed, and other metrics. This flexible
  workflow enables users to define search space constraints, assign weights
  to different objectives, and tune performance for a wide range of
  applications.

OPTIONS:
  --force, -F
    Include forces.
    Default: false

  --stress, -S
    Include stresses.
    Default: false

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --lscale  <double>
    Uniform length rescale factor applied to atomic positions, cell, and reference forces at load time.
    Number of arguments: Min 1, Max 1
    Default: 1.0
    Description:
      Multiplies atomic positions and cell vectors by this factor at the moment a
      dataset is loaded for training, prediction, or HPO. Reference forces are
      divided by the factor (chain rule on E(r)); stresses (stored as virial in
      energy units) are invariant under uniform length rescaling. The chosen factor
      is persisted into pot.tadah so future tadah predict and tadah hpo runs
      apply the same transformation. Use --no-lscale at predict time to override.

      LSCALE is a training-side concept: the LAMMPS pair_style does NOT re-apply
      LSCALE. The user is expected to provide LAMMPS positions at the scale that
      matches the trained model (e.g. experimental lattice).
    Examples:
      - 1.0030
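
The LSCALE transformation described above can be sketched in Python: positions and cell vectors are multiplied by the factor, while reference forces are divided by it. The function name apply_lscale and the nested-list data layout are assumptions made for this illustration, not Tadah!'s internal representation.

```python
def apply_lscale(positions, cell, forces, lscale):
    """Return rescaled (positions, cell, forces) for one configuration."""
    if lscale <= 0:
        raise ValueError("lscale must be positive")
    # Lengths scale with the factor.
    new_positions = [[x * lscale for x in r] for r in positions]
    new_cell = [[x * lscale for x in v] for v in cell]
    # Forces are energy derivatives with respect to length, so they scale inversely.
    new_forces = [[f / lscale for f in row] for row in forces]
    return new_positions, new_cell, new_forces


pos, cell, frc = apply_lscale(
    positions=[[1.0, 2.0, 3.0]],
    cell=[[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
    forces=[[0.5, 0.0, -1.0]],
    lscale=2.0,
)
print(pos, cell[0], frc)
```

The energies are left untouched in this sketch, consistent with the description that only positions, cell, and reference forces are rescaled.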

  --eshift  <double> [<double> ...]
    Per-atom reference energy to subtract from each configuration.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Per-element reference energies. If there are multiple species, the number of
      values must match the number of species (sorted by Z). At load time the total
      energy of each configuration is reduced by sum_Z N_Z * ESHIFT[Z], so an
      isolated-atom config with energy E_atom and ESHIFT[Z]=E_atom yields a
      post-shift energy of zero. Used by tadah train, tadah predict, tadah hpo,
      and tadah data balance. Persisted into pot.tadah for prediction round-trip.
    Examples:
      - 0.5
      - 0.5 -0.1
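
The shift formula above, E' = E - sum_Z N_Z * ESHIFT[Z], can be written out in Python. This is a minimal sketch; the function name shift_energy, the dictionary layout keyed by atomic number, and the numerical values are illustrative assumptions.

```python
def shift_energy(total_energy, species_counts, eshift):
    """Subtract per-element reference energies from a configuration's total energy.

    species_counts maps atomic number Z -> number of atoms N_Z;
    eshift maps atomic number Z -> reference energy ESHIFT[Z].
    """
    return total_energy - sum(n * eshift[z] for z, n in species_counts.items())


eshift = {8: -2.0, 14: -1.5}   # made-up reference energies for O and Si

# An isolated atom whose energy equals its reference energy shifts to zero.
print(shift_energy(-1.5, {14: 1}, eshift))

# A configuration with two O atoms and one Si atom.
print(shift_energy(-10.0, {8: 2, 14: 1}, eshift))
```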

  --eshift-atom
    Derive ESHIFT from isolated-atom configurations in the dataset (mean per Z).
    Default: false
    Description:
      Scans the loaded dataset for single-atom configurations (natoms == 1), groups
      them by atomic number, and sets ESHIFT[Z] to the mean per-Z energy. If a
      species has no isolated-atom config in the dataset, ESHIFT[Z] = 0 for that
      species and a WARNING is logged. If multiple isolated-atom configs of the same
      Z disagree by more than 1e-3 eV, an INFO line records the spread.
      Mutually exclusive with explicit ESHIFT and ESHIFT_DBATOM.
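
The ESHIFT_ATOM derivation can be sketched in Python as follows. This is an illustration of the described steps and not Tadah!'s code: the function name, the (atomic_numbers, total_energy) tuple layout, and the interpretation of "disagree" as the max-min spread are assumptions.

```python
from collections import defaultdict


def derive_eshift_atom(configs, species, tolerance=1e-3):
    """Return (eshift, notes) from single-atom configurations.

    configs is a list of (atomic_numbers, total_energy) pairs.
    """
    per_z = defaultdict(list)
    for numbers, energy in configs:
        if len(numbers) == 1:                 # isolated-atom configuration
            per_z[numbers[0]].append(energy)

    eshift, notes = {}, []
    for z in sorted(species):
        energies = per_z.get(z)
        if not energies:
            eshift[z] = 0.0                   # no isolated atom for this species
            notes.append(("WARNING", z))
            continue
        eshift[z] = sum(energies) / len(energies)
        if max(energies) - min(energies) > tolerance:
            notes.append(("INFO", z))         # isolated-atom energies disagree
    return eshift, notes


configs = [([14], -1.0), ([14], -1.5), ([8, 8], -9.0)]
print(derive_eshift_atom(configs, species=[8, 14]))
```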

  --eshift-dbatom
    Derive ESHIFT by least-squares atomic-energy fit over the database.
    Default: false
    Description:
      Fits per-element reference energies by least squares: minimise
      ||y - M beta||^2 where y[i] is the total energy of configuration i and
      M[i, k] is the count of species k in configuration i. The fitted beta_k
      becomes ESHIFT[Z(k)]. More robust than ESHIFT_ATOM when the dataset has no
      isolated-atom configs but does have compositional diversity.
      Mutually exclusive with explicit ESHIFT and ESHIFT_ATOM.
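
The least-squares fit behind ESHIFT_DBATOM can be illustrated in pure Python. The sketch below minimises ||y - M beta||^2 by solving the normal equations (M^T M) beta = M^T y with Gauss-Jordan elimination; this solver choice is an assumption made to keep the example dependency-free, and Tadah! may use a different numerical method. The function name and the data are hypothetical.

```python
def fit_atomic_energies(counts, energies):
    """Fit per-species reference energies beta from species counts and total energies.

    counts[i][k] is the number of atoms of species k in configuration i;
    energies[i] is the total energy of configuration i.
    """
    n = len(counts[0])
    # Augmented normal equations: [M^T M | M^T y]
    aug = []
    for a in range(n):
        row = [sum(m[a] * m[b] for m in counts) for b in range(n)]
        row.append(sum(m[a] * y for m, y in zip(counts, energies)))
        aug.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("dataset lacks compositional diversity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[k][n] / aug[k][k] for k in range(n)]


# Three configurations of a two-species system (made-up numbers).
counts = [[2, 1], [1, 1], [1, 3]]
energies = [-5.5, -3.5, -6.5]
print(fit_atomic_energies(counts, energies))
```

The ValueError branch corresponds to the caveat in the description: without compositional diversity the count matrix is rank-deficient and the per-species energies cannot be separated.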

  --efilter  <E_min_per_atom> <E_max_per_atom>
    Drop configurations whose per-atom energy is outside [E_min, E_max] (eV).
    Number of arguments: Min 2, Max 2
    Description:
      Outlier filter applied at load time before any energy-shift derivation or
      training-weight assignment, so outliers do not poison ESHIFT_ATOM /
      ESHIFT_DBATOM / EWEIGHT_TEMP. The threshold is compared against E/N_atoms
      (per-atom energy). Both bounds must be supplied. To disable, omit the key.
    Examples:
      - -12.0 -2.0

  --ffilter  <double>
    Drop configurations where any atomic force magnitude exceeds this value (eV/Å).
    Number of arguments: Min 1, Max 1
    Description:
      Outlier filter applied at load time. A configuration is dropped if any single
      atom has ‖F‖ > FFILTER. Useful for catching unconverged SCF or otherwise
      broken DFT runs.
    Examples:
      - 20.0
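
The two load-time outlier filters, EFILTER and FFILTER, can be sketched together in Python. This is an illustration of the documented criteria only; the function name passes_filters and the data layout are assumptions.

```python
import math


def passes_filters(total_energy, forces, efilter=None, ffilter=None):
    """Return True if a configuration survives the energy and force filters.

    efilter is an (E_min, E_max) pair in eV per atom; ffilter is a force
    magnitude limit in eV/Angstrom. None disables the respective filter.
    """
    n_atoms = len(forces)
    if efilter is not None:
        e_min, e_max = efilter
        # The comparison uses the per-atom energy E / N_atoms.
        if not (e_min <= total_energy / n_atoms <= e_max):
            return False
    if ffilter is not None:
        # Drop the configuration if any single atom exceeds the limit.
        if any(math.sqrt(fx * fx + fy * fy + fz * fz) > ffilter
               for fx, fy, fz in forces):
            return False
    return True


forces = [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
print(passes_filters(-10.0, forces, efilter=(-12.0, -2.0), ffilter=20.0))
print(passes_filters(-2.0, forces, efilter=(-12.0, -2.0)))
```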

  --wdbfile  <double> [<double> ...]
    Per-dataset weight multipliers, one per DBFILE entry.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Multiplies eweight, fweight, and sweight of every configuration in the
      corresponding DBFILE by the given factor. Use to bias training toward or away
      from particular datasets. Composes multiplicatively with WDBFILE_AUTO.
    Examples:
      - 1.0 0.5 0.1

  --wdbfile-auto  <double>
    Auto size-balance datasets: per-config weight multiplied by 1/N_i^alpha.
    Number of arguments: Min 1, Max 1
    Default: 0.0
    Description:
      Rebalances per-dataset contributions to the training loss by multiplying each
      configuration's weight by N_i^(-alpha), where N_i is the number of (post-
      filter) configurations in dataset i. alpha=0 disables (default). alpha=0.5
      is the recommended starting point (sqrt-inverse, soft balance). alpha=1
      fully equalises aggregate dataset contribution. Composes multiplicatively
      with user-given WDBFILE.
    Examples:
      - 0.5
      - 1.0
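
The way WDBFILE and WDBFILE_AUTO compose can be shown with a short Python sketch: each configuration in dataset i has its weights multiplied by w_i * N_i^(-alpha). The function name and the dataset sizes are illustrative assumptions.

```python
def dataset_weight_factor(n_configs, wdbfile=1.0, alpha=0.0):
    """Multiplier applied to eweight, fweight and sweight of every configuration.

    n_configs is the post-filter number of configurations N_i in the dataset,
    wdbfile the user-given multiplier, alpha the WDBFILE_AUTO exponent.
    """
    return wdbfile * n_configs ** (-alpha)


sizes = [10000, 100]   # a large and a small dataset (made-up sizes)
for alpha in (0.0, 0.5, 1.0):
    factors = [dataset_weight_factor(n, alpha=alpha) for n in sizes]
    # Aggregate contribution of each dataset is N_i times its per-config factor.
    aggregate = [n * f for n, f in zip(sizes, factors)]
    print(alpha, factors, aggregate)
```

With alpha=1 the aggregate contribution N_i * N_i^(-1) is the same for every dataset, which is what "fully equalises" refers to above.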

  --eweight-temp  <double>
    Boltzmann reweighting temperature in Kelvin (multiplies eweight).
    Number of arguments: Min 1, Max 1
    Description:
      After ESHIFT is applied, multiplies each configuration's eweight by
      exp(-(E/N - E_min)/(kB * T)) where E_min is the minimum per-atom energy in
      the dataset and kB = 8.617333262e-5 eV/K. Emphasises low-energy
      configurations. Composes multiplicatively with the per-structure eweight
      already in the dataset file. Omit the key to disable.
    Examples:
      - 300
      - 1000
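
The Boltzmann factor above is easy to evaluate by hand. The following Python sketch does so for three made-up configurations; the function name and the energies are illustrative assumptions, while the formula and the value of kB are taken from the description.

```python
import math

KB = 8.617333262e-5  # Boltzmann constant in eV/K, as quoted above

def boltzmann_factors(energies, natoms, temperature):
    """Return exp(-(E/N - E_min) / (kB * T)) for every configuration."""
    per_atom = [e / n for e, n in zip(energies, natoms)]
    e_min = min(per_atom)
    return [math.exp(-(e - e_min) / (KB * temperature)) for e in per_atom]

# Hypothetical post-shift total energies (eV) and atom counts.
energies = [-40.0, -39.6, -38.0]
natoms = [8, 8, 8]

for temperature in (300, 1000):
    factors = boltzmann_factors(energies, natoms, temperature)
    print(temperature, [f"{f:.3e}" for f in factors])
```

The lowest-energy configuration always gets a factor of 1; a higher temperature flattens the weighting, a lower one concentrates it on configurations near E_min.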

  --zero-com-force
    Subtract per-config mean force so each configuration has zero net force.
    Default: false
    Description:
      Per configuration, subtracts the mean force from each atom so that the sum of
      forces over the configuration is exactly zero. Standard DFT post-processing
      trick to remove residual translational forces from incomplete relaxation/SCF.
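
As an illustration of the mean-force subtraction, here is a minimal Python sketch. The function name and force values are hypothetical; in floating-point arithmetic the net force is zero up to rounding.

```python
def zero_net_force(forces):
    """Subtract the mean force vector from every atom of one configuration."""
    n = len(forces)
    mean = [sum(f[k] for f in forces) / n for k in range(3)]
    return [tuple(f[k] - mean[k] for k in range(3)) for f in forces]

# Hypothetical forces (eV/Å) with a small residual net force.
forces = [(0.5, 0.0, -0.1), (-0.3, 0.2, 0.1), (0.1, 0.1, 0.3)]
corrected = zero_net_force(forces)

net = [sum(f[k] for f in corrected) for k in range(3)]
print(net)  # each component is zero to within rounding error
```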


OPTIONS::input
  Input sources for hpo. Select from the following options:
  --config, -c  <file>
    Path to a configuration file.
    Number of arguments: Min 1, Max 1
    Needs: validation, hpotarget
    Excludes: task
    Examples:
      - config.tadah
      - ../config.tadah
      - /path/to/config.tadah

  --validation  <string> [<string> ...]
    Validation dataset file(s).
    Number of arguments: Min 1, Max MAX_INT
    Examples:
      - valid.tadah

  --hpotarget  <file>
    HPO target file.
    Number of arguments: Min 1, Max 1
    Examples:
      - hpotargets.txt

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      A task file lets you run several tasks without repeating all the
      command-line arguments for each one. It uses the same format as the
      configuration file, with additional lines that name each task and set its
      parameters. Each task begins with the keyword 'TASK' followed by the task
      name, which is the command to execute, optionally followed by its
      subcommand (e.g. 'TASK predict' or 'TASK data print'). The lines after the
      TASK keyword supply the parameters for that task. For example, the CLI
      option --verbose 2 is written as 'VERBOSE 2' in the task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah
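
To make the task-file layout concrete, the Python sketch below splits such a file into global options and per-task options. It is a hypothetical reader written for illustration, not Tadah!'s parser. It assumes that '#' starts a comment, that tokens are whitespace-separated, and that repeated keys (such as the two DBFILE lines) accumulate their values.

```python
def parse_task_file(text):
    """Return (global_options, tasks); tasks is a list of (name, options)."""
    global_opts, tasks = {}, []
    current = global_opts              # keys before the first TASK are global
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        key, *values = line.split()
        if key == "TASK":
            current = {}
            tasks.append((" ".join(values), current))
        else:
            # Repeated keys extend the value list instead of overwriting it.
            current.setdefault(key, []).extend(values)
    return global_opts, tasks

example = """
NUMERIC 14 # output precision
VERBOSE 2  # verbosity level

TASK predict
DBFILE db1.tadah db2.tadah db3.tadah
DBFILE db4.tadah db5.tadah db6.tadah
FORCE true
ANALYTICS true

TASK data print
STRUCTURE crystal1.cif crystal2.cif
"""

global_opts, tasks = parse_task_file(example)
print(global_opts)
for name, options in tasks:
    print(name, options)
```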


EXAMPLES:
  - tadah hpo -c config.tadah --hpotarget hpotargets.txt --validation valid.tadah

predict

DESCRIPTION:
  Predict properties using a trained model.

LONG DESCRIPTION:
  Predict using an already trained model. Energy per atom is always calculated,
  while forces and stresses are optional. By default, energies are written to
  energy.pred; forces and stresses, if requested, are written to forces.pred and
  stress.pred, respectively. Inputs can be given in a task file, using a DBFILE
  key to list prediction datasets or a STRUCTURE key to list structure files or
  online structures (see tadah explain predict.structure for supported types).
  Alternatively, pass Tadah! datasets, individual structure files, or online
  sources directly on the command line.

OPTIONS:
  --analytics, -A
    Perform analytics.
    Excludes: structure
    Default: false

  --outfile, -o  <energy_file> [<force_file>] [<stress_file>]
    Output file name(s) for predicted values. The first file receives energies, followed by forces and then stresses if requested. Specify a file name for every requested output.
    Number of arguments: Min 1, Max 3
    Default: energy.pred, forces.pred, stress.pred
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --potential, -p  <file>
    Trained model file.
    Number of arguments: Min 1, Max 1
    Examples:
      - pot.tadah

  --force, -F
    Include forces.
    Default: false

  --stress, -S
    Include stresses.
    Default: false

  --error
    Generate error estimates.
    Default: false

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --lscale  <double>
    Uniform length rescale factor applied to atomic positions, cell, and reference forces at load time.
    Number of arguments: Min 1, Max 1
    Default: 1.0
    Description:
      Multiplies atomic positions and cell vectors by this factor at the moment a
      dataset is loaded for training, prediction, or HPO. Reference forces are
      divided by the factor (chain rule on E(r)); stresses (stored as virial in
      energy units) are invariant under uniform length rescaling. The chosen factor
      is persisted into pot.tadah so future tadah predict and tadah hpo runs
      apply the same transformation. Use --no-lscale at predict time to override.

      LSCALE is a training-side concept: the LAMMPS pair_style does NOT re-apply
      LSCALE. The user is expected to provide LAMMPS positions at the scale that
      matches the trained model (e.g. experimental lattice).
    Examples:
      - 1.0030
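
The load-time transformation described for LSCALE can be sketched in a few lines of Python. The function and the sample structure are hypothetical. The rule itself comes from the description: lengths are multiplied by the factor and reference forces are divided by it, since F = -dE/dr picks up a factor 1/s when r is replaced by s·r.

```python
def apply_lscale(positions, cell, forces, lscale):
    """Rescale lengths by lscale and divide reference forces by the same factor."""
    new_positions = [tuple(lscale * x for x in r) for r in positions]
    new_cell = [tuple(lscale * x for x in v) for v in cell]
    new_forces = [tuple(f / lscale for f in fvec) for fvec in forces]
    return new_positions, new_cell, new_forces

# Hypothetical cubic cell (Å) with one atom and its reference force (eV/Å).
positions = [(1.0, 2.0, 3.0)]
cell = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 4.0)]
forces = [(0.4, 0.0, -0.2)]

scaled_positions, scaled_cell, scaled_forces = apply_lscale(
    positions, cell, forces, 1.0030
)
print(scaled_cell[0], scaled_forces[0])
```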

  --eshift  <double> [<double> ...]
    Per-atom reference energy to subtract from each configuration.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Per-element reference energies. If there are multiple species, the number of
      values must match the number of species (sorted by Z). At load time the total
      energy of each configuration is reduced by sum_Z N_Z * ESHIFT[Z], so an
      isolated-atom config with energy E_atom and ESHIFT[Z]=E_atom yields a
      post-shift energy of zero. Used by tadah train, tadah predict, tadah hpo,
      and tadah data balance. Persisted into pot.tadah for prediction round-trip.
    Examples:
      - 0.5
      - 0.5 -0.1
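
A worked example of the shift E - sum_Z N_Z * ESHIFT[Z], as a Python sketch. The species, energies, and helper names are made up for illustration; the only conventions taken from the text are the formula and the ordering of values by atomic number Z.

```python
from collections import Counter

def eshift_table(species, values):
    """Map ESHIFT values onto species sorted by atomic number Z."""
    ordered = sorted(species)
    if len(values) != len(ordered):
        raise ValueError("need one ESHIFT value per species")
    return dict(zip(ordered, values))

def shifted_energy(total_energy, atomic_numbers, eshift):
    """Subtract sum_Z N_Z * ESHIFT[Z] from the total energy."""
    counts = Counter(atomic_numbers)
    return total_energy - sum(n * eshift[z] for z, n in counts.items())

# Hypothetical Ti-O system with "ESHIFT 0.5 -0.1": O (Z=8) -> 0.5, Ti (Z=22) -> -0.1.
eshift = eshift_table({22, 8}, [0.5, -0.1])

# One Ti and two O atoms with a total energy of -26.0 eV:
# the shift is 2 * 0.5 + 1 * (-0.1) = 0.9 eV.
print(shifted_energy(-26.0, [22, 8, 8], eshift))

# An isolated O atom whose energy equals ESHIFT[8] ends up at zero.
print(shifted_energy(0.5, [8], eshift))  # 0.0
```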

  --efilter  <E_min_per_atom> <E_max_per_atom>
    Drop configurations whose per-atom energy is outside [E_min, E_max] (eV).
    Number of arguments: Min 2, Max 2
    Description:
      Outlier filter applied at load time before any energy-shift derivation or
      training-weight assignment, so outliers do not poison ESHIFT_ATOM /
      ESHIFT_DBATOM / EWEIGHT_TEMP. The threshold is compared against E/N_atoms
      (per-atom energy). Both bounds must be supplied. To disable, omit the key.
    Examples:
      - -12.0 -2.0
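
The energy window can be illustrated with a short Python sketch. The helper and the sample configurations are hypothetical. It assumes the interval [E_min, E_max] is closed, so a configuration exactly on a bound is kept.

```python
def passes_efilter(total_energy, natoms, e_min, e_max):
    """Keep a configuration if its per-atom energy lies inside [e_min, e_max]."""
    return e_min <= total_energy / natoms <= e_max

# Hypothetical (total energy in eV, number of atoms) pairs.
configs = [(-64.0, 8), (-8.0, 8), (-120.0, 8)]

kept = [cfg for cfg in configs if passes_efilter(cfg[0], cfg[1], -12.0, -2.0)]
print(kept)  # [(-64.0, 8)]
```

Here the per-atom energies are -8, -1 and -15 eV, so only the first configuration falls inside the window from -12 to -2 eV.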

  --ffilter  <double>
    Drop configurations where any atomic force magnitude exceeds this value (eV/Å).
    Number of arguments: Min 1, Max 1
    Description:
      Outlier filter applied at load time. A configuration is dropped if any single
      atom has ‖F‖ > FFILTER. Useful for catching unconverged SCF or otherwise
      broken DFT runs.
    Examples:
      - 20.0

  --zero-com-force
    Subtract per-config mean force so each configuration has zero net force.
    Default: false
    Description:
      Per configuration, subtracts the mean force from each atom so that the sum of
      forces over the configuration is exactly zero. Standard DFT post-processing
      trick to remove residual translational forces from incomplete relaxation/SCF.

  --no-lscale
    (predict) Ignore any LSCALE recorded in the loaded potential file.
    Default: false
    Description:
      At predict time, override the LSCALE value stored in pot.tadah. Use when
      the dataset you are predicting on is already at the trained-model scale.

  --no-eshift
    (predict) Ignore any ESHIFT recorded in the loaded potential file.
    Default: false
    Description:
      At predict time, override the ESHIFT values stored in pot.tadah. Use when
      the dataset you are predicting on is already at the shifted baseline (or you
      just want raw model output without any reference energy subtraction).


OPTIONS::input
  Input sources for prediction. Select from the following options:
  --dbfile, -d  <string> [<string> ...]
    Path(s) to Tadah! database file(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: potential
    Description:
      Absolute or relative path to the Tadah! database file(s). The relative path
      is interpreted relative to the current working directory. Multiple dataset
      paths can be provided either as space-separated tokens or by repeating
      this key.
    Examples:
      - /path/to/dbfile
      - /path/to/dbfile1 /path/to/dbfile2

  --structure, -s  <string> [<string> ...]
    Unified structural input(s).
    Number of arguments: Min 1, Max MAX_INT
    Needs: potential
    Description:
      Supported file formats: .cif (Crystallographic Information File), VASP
      (POSCAR/CONTCAR), and CASTEP (.cell). The online option fetches structures
      from databases (MP, COD, NOMAD). Multiple structures can be space-separated
      or repeated. A mix of files and online sources is allowed.
    Examples:
      - crystal.cif
      - crystal1.cif crystal2.cell
      - mp-42 crystal.cif

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      A task file lets you run several tasks without repeating all the
      command-line arguments for each one. It uses the same format as the
      configuration file, with additional lines that name each task and set its
      parameters. Each task begins with the keyword 'TASK' followed by the task
      name, which is the command to execute, optionally followed by its
      subcommand (e.g. 'TASK predict' or 'TASK data print'). The lines after the
      TASK keyword supply the parameters for that task. For example, the CLI
      option --verbose 2 is written as 'VERBOSE 2' in the task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah predict -p pot.tadah -s crystal1.cif crystal2.cif --force
  - tadah predict -p pot.tadah -d db1.tadah --numeric 12
  - tadah predict -p pot.tadah -d db.tadah --stress --outfile predicted_energy.dat predicted_stresses.dat

properties

DESCRIPTION:
  Evaluate physical properties for MLIP target evaluation.

LONG DESCRIPTION:
  Evaluate physical properties for MLIP target evaluation (e.g., pairwise, ecurve, eos, defect, surface, mechanics).

OPTIONS:
  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      A task file lets you run several tasks without repeating all the
      command-line arguments for each one. It uses the same format as the
      configuration file, with additional lines that name each task and set its
      parameters. Each task begins with the keyword 'TASK' followed by the task
      name, which is the command to execute, optionally followed by its
      subcommand (e.g. 'TASK predict' or 'TASK data print'). The lines after the
      TASK keyword supply the parameters for that task. For example, the CLI
      option --verbose 2 is written as 'VERBOSE 2' in the task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah properties pairwise -p pot.tadah -o out.txt -r 0 10 100 -a "Kr Kr"

properties pairwise

DESCRIPTION:
  Compute E vs. r for a given potential.

OPTIONS:
  --eshift  <double>
    Shift the energy curve by the given value(s).
    Number of arguments: Min 1, Max 3
    Description:
      Shifts the computed energy curve by the given value. If zero is given, the
      curve is shifted by the value at its right-most point. Provide a single
      value if the potential is either two-body or many-body. If the potential
      has both two-body and many-body parts, provide three values: the first for
      the total energy, the second for the two-body part, and the third for the
      many-body part.
    Examples:
      - 2.0
      - -1.0

  --outfile, -o  <string> [<string> ...]
    Output file.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      The output file to be written. Multiple files can be specified if the command
      produces more than one output.
    Examples:
      - output.tadah

  --range, -r  <START> <STOP> <NPOINTS>
    Plotting range [start stop npoints].
    Number of arguments: Min 3, Max 3
    Examples:
      - 0.1 9.5 100

  --atompair, -a  <element1> <element2>
    Pair of chemical elements.
    Number of arguments: Min 1, Max 2
    Examples:
      - "Kr Kr"

  --force, -F
    Include forces.
    Default: false

  --numeric  <unsigned integer>
    Numeric output precision.
    Number of arguments: Min 1, Max 1
    Default: 12
    Description:
      Sets the number of decimal places for output.
    Examples:
      - 12

  --error
    Generate error estimates.
    Default: false

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --bondenergy
    Calculate the bond energy instead of the per-atom value.
    Default: false


OPTIONS::input
  Select from the following options:
  --potential, -p  <file>
    Trained model file.
    Number of arguments: Min 1, Max 1
    Needs: outfile, range, atompair
    Excludes: task
    Examples:
      - pot.tadah

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      A task file lets you run several tasks without repeating all the
      command-line arguments for each one. It uses the same format as the
      configuration file, with additional lines that name each task and set its
      parameters. Each task begins with the keyword 'TASK' followed by the task
      name, which is the command to execute, optionally followed by its
      subcommand (e.g. 'TASK predict' or 'TASK data print'). The lines after the
      TASK keyword supply the parameters for that task. For example, the CLI
      option --verbose 2 is written as 'VERBOSE 2' in the task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah properties pairwise -p pot.tadah -o out.txt -r 0 10 100 -a "Kr Kr"

train

DESCRIPTION:
  Train a model.

LONG DESCRIPTION:
  Train a model using either a configuration file or a task file. By default,
  the model is trained on energies only; forces and stresses can be included
  with --force and --stress. The result is a trained interatomic potential that
  can be used with Tadah! or for molecular dynamics simulations in LAMMPS via
  pair_style tadah.

OPTIONS:
  --outfile, -o  <file>
    Output file name for the trained model.
    Number of arguments: Min 1, Max 1
    Default: pot.tadah
    Description:
      Output file name for the trained model.
    Examples:
      - output.tadah

  --verbose, -v  <unsigned integer>
    Verbosity level. 0-2: ERROR, WARNING, INFO.
    Number of arguments: Min 1, Max 1
    Default: 1
    Description:
      Verbosity level. 0: ERROR, 1: WARNING, 2: INFO. The verbosity level controls
      the amount of information printed during execution. Higher levels provide
      more detailed output.
    Examples:
      - 2

  --force, -F
    Include forces.
    Default: false

  --stress, -S
    Include stresses.
    Default: false

  --uncertainty
    Output uncertainty estimates.
    Default: false

  --lscale  <double>
    Uniform length rescale factor applied to atomic positions, cell, and reference forces at load time.
    Number of arguments: Min 1, Max 1
    Default: 1.0
    Description:
      Multiplies atomic positions and cell vectors by this factor at the moment a
      dataset is loaded for training, prediction, or HPO. Reference forces are
      divided by the factor (chain rule on E(r)); stresses (stored as virial in
      energy units) are invariant under uniform length rescaling. The chosen factor
      is persisted into pot.tadah so future tadah predict and tadah hpo runs
      apply the same transformation. Use --no-lscale at predict time to override.

      LSCALE is a training-side concept: the LAMMPS pair_style does NOT re-apply
      LSCALE. The user is expected to provide LAMMPS positions at the scale that
      matches the trained model (e.g. experimental lattice).
    Examples:
      - 1.0030

  --eshift  <double> [<double> ...]
    Per-atom reference energy to subtract from each configuration.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Per-element reference energies. If there are multiple species, the number of
      values must match the number of species (sorted by Z). At load time the total
      energy of each configuration is reduced by sum_Z N_Z * ESHIFT[Z], so an
      isolated-atom config with energy E_atom and ESHIFT[Z]=E_atom yields a
      post-shift energy of zero. Used by tadah train, tadah predict, tadah hpo,
      and tadah data balance. Persisted into pot.tadah for prediction round-trip.
    Examples:
      - 0.5
      - 0.5 -0.1

  --eshift-atom
    Derive ESHIFT from isolated-atom configurations in the dataset (mean per Z).
    Default: false
    Description:
      Scans the loaded dataset for single-atom configurations (natoms == 1), groups
      them by atomic number, and sets ESHIFT[Z] to the mean per-Z energy. If a
      species has no isolated-atom config in the dataset, ESHIFT[Z] = 0 for that
      species and a WARNING is logged. If multiple isolated-atom configs of the same
      Z disagree by more than 1e-3 eV, an INFO line records the spread.
      Mutually exclusive with explicit ESHIFT and ESHIFT_DBATOM.
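
The derivation from isolated-atom configurations can be sketched as follows in Python. This is an illustration under stated assumptions, not Tadah!'s code: the data layout (a list of atomic-number lists paired with total energies) is hypothetical, and "disagree by more than 1e-3 eV" is read here as the max-min spread of the single-atom energies.

```python
from collections import defaultdict

def eshift_from_isolated_atoms(configs, species):
    """Return {Z: mean single-atom energy}, or 0.0 where none is available."""
    by_z = defaultdict(list)
    for atomic_numbers, energy in configs:
        if len(atomic_numbers) == 1:          # natoms == 1
            by_z[atomic_numbers[0]].append(energy)

    eshift = {}
    for z in sorted(species):
        energies = by_z.get(z, [])
        if not energies:
            print(f"WARNING: no isolated-atom config for Z={z}; using ESHIFT=0")
            eshift[z] = 0.0
            continue
        spread = max(energies) - min(energies)
        if spread > 1e-3:
            print(f"INFO: Z={z} isolated-atom energies spread by {spread:.4f} eV")
        eshift[z] = sum(energies) / len(energies)
    return eshift

# Hypothetical dataset: two isolated O atoms and one TiO2-like configuration.
configs = [([8], -1.90), ([8], -1.92), ([22, 8, 8], -26.0)]
eshift = eshift_from_isolated_atoms(configs, {8, 22})
print(eshift)
```

In this example oxygen gets the mean of its two single-atom energies, and titanium falls back to zero with a warning because no isolated Ti atom is present.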

  --eshift-dbatom
    Derive ESHIFT by least-squares atomic-energy fit over the database.
    Default: false
    Description:
      Fits per-element reference energies by least squares: minimise
      ||y - M beta||^2 where y[i] is the total energy of configuration i and
      M[i, k] is the count of species k in configuration i. The fitted beta_k
      becomes ESHIFT[Z(k)]. More robust than ESHIFT_ATOM when the dataset has no
      isolated-atom configs but does have compositional diversity.
      Mutually exclusive with explicit ESHIFT and ESHIFT_ATOM.
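
The least-squares fit can be reproduced on a toy problem. The Python sketch below solves the normal equations (M^T M) beta = M^T y with a small Gaussian elimination. This is one simple way to minimise ||y - M beta||^2 and is an assumption for illustration; the solver Tadah! uses is not stated here. The composition matrix and energies are made up.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough compositional diversity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_atomic_energies(counts, energies):
    """Least-squares beta for y ~ M beta, via the normal equations."""
    k = len(counts[0])
    mtm = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    mty = [sum(row[i] * y for row, y in zip(counts, energies)) for i in range(k)]
    return solve_linear(mtm, mty)

# M[i][k]: count of species k (columns sorted by Z) in configuration i.
counts = [[2, 1], [3, 2], [1, 1], [4, 1]]
# Total energies generated from reference energies of -2.0 and -5.0 eV per atom.
energies = [-9.0, -16.0, -7.0, -13.0]

beta = fit_atomic_energies(counts, energies)
print([round(b, 6) for b in beta])
```

If every configuration has the same composition ratio, M^T M is singular and the fit is not unique, which is why compositional diversity matters for this option.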

  --efilter  <E_min_per_atom> <E_max_per_atom>
    Drop configurations whose per-atom energy is outside [E_min, E_max] (eV).
    Number of arguments: Min 2, Max 2
    Description:
      Outlier filter applied at load time before any energy-shift derivation or
      training-weight assignment, so outliers do not poison ESHIFT_ATOM /
      ESHIFT_DBATOM / EWEIGHT_TEMP. The threshold is compared against E/N_atoms
      (per-atom energy). Both bounds must be supplied. To disable, omit the key.
    Examples:
      - -12.0 -2.0

  --ffilter  <double>
    Drop configurations where any atomic force magnitude exceeds this value (eV/Å).
    Number of arguments: Min 1, Max 1
    Description:
      Outlier filter applied at load time. A configuration is dropped if any single
      atom has ‖F‖ > FFILTER. Useful for catching unconverged SCF or otherwise
      broken DFT runs.
    Examples:
      - 20.0

  --wdbfile  <double> [<double> ...]
    Per-dataset weight multipliers, one per DBFILE entry.
    Number of arguments: Min 1, Max MAX_INT
    Description:
      Multiplies eweight, fweight, and sweight of every configuration in the
      corresponding DBFILE by the given factor. Use to bias training toward or away
      from particular datasets. Composes multiplicatively with WDBFILE_AUTO.
    Examples:
      - 1.0 0.5 0.1

  --wdbfile-auto  <double>
    Auto size-balance datasets: per-config weight multiplied by 1/N_i^alpha.
    Number of arguments: Min 1, Max 1
    Default: 0.0
    Description:
      Rebalances per-dataset contributions to the training loss by multiplying each
      configuration's weight by N_i^(-alpha), where N_i is the number of (post-
      filter) configurations in dataset i. alpha=0 disables (default). alpha=0.5
      is the recommended starting point (sqrt-inverse, soft balance). alpha=1
      fully equalises aggregate dataset contribution. Composes multiplicatively
      with user-given WDBFILE.
    Examples:
      - 0.5
      - 1.0

  --eweight-temp  <double>
    Boltzmann reweighting temperature in Kelvin (multiplies eweight).
    Number of arguments: Min 1, Max 1
    Description:
      After ESHIFT is applied, multiplies each configuration's eweight by
      exp(-(E/N - E_min)/(kB * T)) where E_min is the minimum per-atom energy in
      the dataset and kB = 8.617333262e-5 eV/K. Emphasises low-energy
      configurations. Composes multiplicatively with the per-structure eweight
      already in the dataset file. Omit the key to disable.
    Examples:
      - 300
      - 1000

  --zero-com-force
    Subtract per-config mean force so each configuration has zero net force.
    Default: false
    Description:
      Per configuration, subtracts the mean force from each atom so that the sum of
      forces over the configuration is exactly zero. Standard DFT post-processing
      trick to remove residual translational forces from incomplete relaxation/SCF.


OPTIONS::input
  Provide a configuration file for a single task or a complete tasks file.
  --config, -c  <file>
    Path to a configuration file.
    Number of arguments: Min 1, Max 1
    Examples:
      - config.tadah
      - ../config.tadah
      - /path/to/config.tadah

  --task, -t  <file>
    A file containing task(s) to be executed.
    Number of arguments: Min 1, Max 1
    Excludes: ALL
    Description:
      A task file lets you run several tasks without repeating all the
      command-line arguments for each one. It uses the same format as the
      configuration file, with additional lines that name each task and set its
      parameters. Each task begins with the keyword 'TASK' followed by the task
      name, which is the command to execute, optionally followed by its
      subcommand (e.g. 'TASK predict' or 'TASK data print'). The lines after the
      TASK keyword supply the parameters for that task. For example, the CLI
      option --verbose 2 is written as 'VERBOSE 2' in the task file.

      # Example TASK file containing two tasks:
      # Global options
      NUMERIC 14 # output precision
      VERBOSE 2  # verbosity level

      TASK predict
      DBFILE db1.tadah db2.tadah db3.tadah
      DBFILE db4.tadah db5.tadah db6.tadah
      FORCE true
      ANALYTICS true

      TASK data print
      STRUCTURE crystal1.cif crystal2.cif
    Examples:
      - path/to/tasks.tadah


EXAMPLES:
  - tadah train -c config.tadah
  - tadah train -c config.tadah --force --stress
  - tadah train -c config.tadah --verbose 2 --outfile potential.tadah
  - tadah train --task tasks.tadah