Descriptors
This page lists every two-body (D2_*) and many-body (DM_*) descriptor
compiled into Tadah!MLIP, together with their constructor arguments.
Because the tables are generated directly from the header files, the class
name you see here is exactly the keyword you must write in your configuration
file.
General workflow
Decide which descriptor families you need (two-body, many-body, or both).
Switch the family on with the corresponding INIT flag:
INIT2B true    # activate two-body block
INITMB false   # skip many-body block
At least one of INIT2B/INITMB must be true. If both are true, Tadah! will build one descriptor of each type unless you say otherwise with TYPE2B/TYPEMB.
Provide exactly one line per descriptor you want to build, using the formats below.
If you need to concatenate several descriptors of the same family, use the meta classes D2_mJoin or DM_mJoin and list the components in a block right after the meta keyword.
The bias term
The first component of the overall descriptor vector can be a constant 1 (“bias”). Add it with
BIAS true
Descriptor key syntax
Two-body:
TYPE2B D2_<Name> [param ...] <EL1> <EL2> ...
Many-body:
TYPEMB DM_<Name> L N_C N_S [N_CE N_SE] <EL1> <EL2> ...
where
<Name> is copied from the headings below (case-sensitive).
[param …] are the extra integers/doubles required by a given class (see its table for details).
<EL1> <EL2> are element symbols (use * for “any”). Multiple pairs are allowed, e.g. DM_EAD 1 4 4 Ti Ti Ti Nb.
L is the maximum angular momentum number needed by DM descriptors.
N_C, N_S are the sizes of the centre and width grids (CGRID*/SGRID*).
N_CE, N_SE appear only when a non-linear function specification is required (CEMBFUNC/SEMBFUNC).
Remember to supply matching auxiliary keys (RCTYPE*, RCUT*, CGRID*, SGRID* and, where required, CEMBFUNC/SEMBFUNC).
Ordering matters: we recommend writing the block
TYPE2B …
RCTYPE2B …
RCUT2B …
CGRID2B …
SGRID2B …
together before starting the next descriptor family, so the eye can verify that list lengths match.
Composite Descriptors
Tadah!MLIP lets you stitch several primitive descriptors together into one
feature vector – a composite descriptor – so you can embed physical
insight (e.g. “add a short–range ZBL shield to a Blip term”) without
editing source code. The idea is implemented via the meta-classes
D2_mJoin (two-body) and DM_mJoin (many-body).
Key points
Activate the relevant family first: INIT2B/INITMB.
Declare the meta descriptor: TYPE2B D2_mJoin or TYPEMB DM_mJoin.
Immediately follow with one TYPE line per constituent descriptor, in the order you want them concatenated.
Provide matching lists for every auxiliary key (RCTYPE*, RCUT*, CGRID*, SGRID*, and, if required, C/SEMBFUNC). List length must equal the number of constituents.
Each constituent can target its own element pair(s), cutoff type and distance.
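The length rule in the key points above can be checked mechanically before training. The following is a minimal sketch and not part of Tadah!: the helper names count_keys and check_mjoin are hypothetical, and the sketch assumes that # starts a comment and that only RCTYPE*/RCUT* are mandatory for every constituent (grid keys apply only to descriptors that use them).

```python
def count_keys(config_text):
    """Count how often each keyword starts a non-empty, non-comment line."""
    counts = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments (assumed syntax)
        if not line:
            continue
        key = line.split()[0]
        counts[key] = counts.get(key, 0) + 1
    return counts


def check_mjoin(config_text, family="2B"):
    """Return mismatch messages; an empty list means the counts agree."""
    counts = count_keys(config_text)
    # The first TYPE line is the meta descriptor itself, not a constituent.
    n_constituents = counts.get("TYPE" + family, 0) - 1
    problems = []
    for key in ("RCTYPE" + family, "RCUT" + family):
        found = counts.get(key, 0)
        if found != n_constituents:
            problems.append(f"{key}: expected {n_constituents}, found {found}")
    return problems


example = """
INIT2B true
TYPE2B D2_mJoin            # meta descriptor
TYPE2B D2_MIE 12 6 Ti Ti   # component 1
RCUT2B 5.0
RCTYPE2B Cut_Cos
TYPE2B D2_Blip 4 4 Ti Nb   # component 2
RCUT2B 7.5
RCTYPE2B Cut_Poly2
CGRID2B LIN 4 0.0 7.5
SGRID2B GEOM 4 0.05 0.70
"""

print(check_mjoin(example))
```

Passing family="MB" applies the same count to the TYPEMB/RCTYPEMB/RCUTMB keys.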
Quick examples
Single Lennard-Jones descriptor
# -- simple monatomic model -----------------------------------------
INIT2B true
TYPE2B D2_LJ Kr Kr # no extra parameters
RCTYPE2B Cut_Cos
RCUT2B 6.0
2-body BP + many-body EAD
# ----- two-body Behler–Parrinello ----------------------------------
INIT2B true
TYPE2B D2_BP 10 10 Kr Kr # 10 radial functions
RCTYPE2B Cut_Cos
RCUT2B 6.5
CGRID2B LIN 10 0.0 6.5 # matching grid of centres
SGRID2B GEOM 10 0.05 0.70 # matching grid of widths
# ----- many-body embedded density ----------------------------------
INITMB true
TYPEMB DM_EAD 1 7 7 Kr Kr # L 1, 7 centres, 7 widths
RCTYPEMB Cut_Poly2
RCUTMB 6.5
CGRIDMB LIN 7 0.0 6.5
SGRIDMB GEOM 7 0.05 0.70
Composite 2-body terms with D2_mJoin
INIT2B true
TYPE2B D2_mJoin # meta descriptor
TYPE2B D2_MIE 12 6 Ti Ti # --- component 1
RCUT2B 5.0
RCTYPE2B Cut_Cos
TYPE2B D2_Blip 4 4 Ti Nb # --- component 2
RCUT2B 7.5
RCTYPE2B Cut_Poly2
CGRID2B LIN 4 0.0 7.5
SGRID2B GEOM 4 0.05 0.70
TYPE2B D2_Blip 3 3 * * # --- component 3
RCUT2B 4.0
RCTYPE2B Cut_Cos
CGRID2B LIN 3 0.0 4.0
SGRID2B GEOM 3 0.05 0.70
Reading the tables below
Following this overview you will find the Two-Body and Many-Body sections. Each entry shows
the C++ signature,
a short description & equation,
Required config keys, i.e. the options you must specify in the configuration file.
Copy the class name into the TYPE2B / TYPEMB line, supply the required
keys, and you are ready to train.
Two-Body
Below is a list of all two-body descriptors supported by Tadah:
-
class D2_LJ : public tadah::models::D2_Base
Standard Lennard-Jones descriptor.
\[ V_i = \sum_{j \neq i} 4 \epsilon \Bigg(\Big(\frac{\sigma}{r_{ij}}\Big)^{12} - \Big(\frac{\sigma}{r_{ij}}\Big)^6\Bigg) f_c(r_{ij}) \]
or equivalently:
\[ V_i = \sum_{j \neq i} \Bigg(\frac{C_{12}}{r_{ij}^{12}} - \frac{C_6}{r_{ij}^6}\Bigg) f_c(r_{ij}) \]
Note that the machine-learned coefficients \(C_6\) and \(C_{12}\) correspond to \(\sigma\) and \(\epsilon\) through the following relations:
\[ \sigma = \Big(\frac{C_{12}}{C_6}\Big)^{1/6} \]
\[ \epsilon = \frac{1}{4} \frac{C_6^2}{C_{12}} w(Z) \]
where \(w(Z)\) is a species-dependent weight factor (the default is the atomic number).
The machine-learned \(\sigma\) and \(\epsilon\) are only meaningful (for example, for comparison with literature values) when BIAS false and NORM false are set and the system is monatomic. Setting either key to true is fine, but the numerical values will then differ.
Required tadah::core::Context Key: INIT2B
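The relations above between the fitted coefficients and the Lennard-Jones parameters can be sketched numerically as follows. This is an illustration only: the function names are hypothetical, the weight factor is passed in as w with an assumed value of 1.0 (rather than the atomic-number default mentioned above), and the input numbers are arbitrary.

```python
def lj_from_coefficients(c6, c12, w=1.0):
    """Convert fitted C6 and C12 into (sigma, epsilon) using the relations above.

    w stands in for the species weight factor w(Z); 1.0 is an assumption
    made here purely for illustration.
    """
    if c6 <= 0 or c12 <= 0:
        raise ValueError("C6 and C12 must both be positive")
    sigma = (c12 / c6) ** (1.0 / 6.0)
    epsilon = 0.25 * c6 ** 2 / c12 * w
    return sigma, epsilon


def lj_to_coefficients(sigma, epsilon):
    """Inverse mapping for w = 1: C6 = 4*eps*sigma^6 and C12 = 4*eps*sigma^12."""
    return 4.0 * epsilon * sigma ** 6, 4.0 * epsilon * sigma ** 12


# Round trip with arbitrary illustrative values
c6, c12 = lj_to_coefficients(sigma=3.6, epsilon=0.014)
sigma, epsilon = lj_from_coefficients(c6, c12)
print(sigma, epsilon)
```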
-
class D2_BP : public tadah::models::D2_Base
Behler-Parrinello two-body descriptor.
\[ V_i^{\eta,r_s} = \sum_{j \neq i} \exp{\Big(-\eta(r_{ij}-r_s)^2\Big)}f_c(r_{ij}) \]
CGRID2B parameters control the position \( r_s \) of the Gaussian basis function.
SGRID2B parameters control the width \( \eta \) of the Gaussian basis function.
This is essentially the \( G^1_i \) descriptor from the paper below, except that it can use any cutoff function defined in Ta-dah!:
Behler, J., Parrinello, M. (2007). Generalized neural-network representation of high-dimensional potential-energy surfaces. Physical Review Letters, 98(14), 146401. https://doi.org/10.1103/PhysRevLett.98.146401
Required tadah::core::Context keys: INIT2B CGRID2B SGRID2B
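As a rough illustration of the sum above, the sketch below evaluates the descriptor for one atom from a list of neighbour distances. Two things are assumptions rather than statements about Tadah!: the cosine cutoff is taken to be 0.5(1 + cos(pi r / r_c)), and the centre and width grids are paired element by element. The function names are hypothetical.

```python
import math


def cut_cos(r, r_cut):
    """Cosine cutoff (assumed form): smooth decay to zero at r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * r / r_cut))


def bp_two_body(distances, centres, widths, r_cut):
    """One value per (r_s, eta) pair: sum_j exp(-eta*(r_ij - r_s)^2) * f_c(r_ij)."""
    return [
        sum(math.exp(-eta * (r - r_s) ** 2) * cut_cos(r, r_cut) for r in distances)
        for r_s, eta in zip(centres, widths)
    ]


# One neighbour sitting exactly on the first Gaussian centre
values = bp_two_body(distances=[2.0], centres=[2.0, 3.0], widths=[1.0, 1.0], r_cut=4.0)
print(values)
```

Neighbours at or beyond r_cut contribute nothing, so the list of distances does not need to be pre-filtered.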
-
class D2_Blip : public tadah::models::D2_Base
Blip two-body descriptor.
\[ V_i^{\eta,r_s} = \sum_{j \neq i} \mathcal{B}(\eta(r_{ij}-r_s))f_c(r_{ij}) \]
where \( f_c \) is a cutoff function and \( \mathcal{B} \) is a blip basis function centred at \(r_s\) of width \(4/\eta\).
CGRID2B parameters control the position \( r_s \) of the blip centres.
SGRID2B parameters control the width \( \eta \) of the blips.
The blip basis function is built from third-degree polynomials on the four intervals [-2,-1], [-1,0], [0,1], [1,2] and is defined as:
\[ \mathcal{B}(r) = \begin{cases} 1-\frac{3}{2}r^2+\frac{3}{4}|r|^3 & \text{if} \qquad |r| \le 1\\ \frac{1}{4}(2-|r|)^3 & \text{if} \qquad 1<|r| \le 2\\ 0 & \text{if} \qquad |r|>2 \end{cases} \]
More details about the blip basis functions can be found in the following paper:
Hernández, E., Gillan, M., Goringe, C. (1997). Basis functions for linear-scaling first-principles calculations. Physical Review B - Condensed Matter and Materials Physics, 55(20), 13485–13493. https://doi.org/10.1103/PhysRevB.55.13485
Required keys: INIT2B CGRID2B SGRID2B
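The piecewise definition of the blip function above translates directly into code. The sketch below is illustrative only (the function name blip is hypothetical); it shows that the two polynomial pieces meet at |r| = 1 and that the function vanishes at |r| = 2.

```python
def blip(r):
    """Blip basis function: cubic pieces on [-2, 2], zero outside."""
    x = abs(r)
    if x <= 1.0:
        return 1.0 - 1.5 * x ** 2 + 0.75 * x ** 3
    if x <= 2.0:
        return 0.25 * (2.0 - x) ** 3
    return 0.0


# The descriptor argument is eta * (r_ij - r_s), so a blip of width 4/eta
# centred at r_s covers distances from r_s - 2/eta to r_s + 2/eta.
eta, r_s = 2.0, 3.0
for r_ij in (2.0, 2.5, 3.0, 3.5, 4.0):
    print(r_ij, blip(eta * (r_ij - r_s)))
```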
-
class D2_EAM : public tadah::models::D2_Base
Pair-wise part for the Embedded Atom Method descriptor.
\[ V_i = \frac{1}{2} \sum_{j \neq i} \phi(r_{ij}) \]
This descriptor loads tabulated values of the two-body potential \( \phi \) from the provided SETFL file.
This descriptor is usually used together with the many-body descriptor DM_EAM, although this is not required: it can be mixed with any other descriptor or used on its own.
This descriptor enforces the cutoff distance specified in the SETFL file. Set RCUT2B to the same value to suppress the warning message.
Required tadah::core::Context keys: INIT2B SETFL
-
class D2_MIE : public tadah::models::D2_Base
Mie descriptor.
\[ V_i = \sum_{j \neq i} C \epsilon \Bigg(\Big(\frac{\sigma}{r_{ij}}\Big)^{n} - \Big(\frac{\sigma}{r_{ij}}\Big)^m\Bigg) \]
where
\[ C=\frac{n}{n-m}\Big( \frac{n}{m} \Big)^{\frac{m}{n-m}} \]
Any cutoff function can be used.
Required tadah::core::Context Key: INIT2B TYPE2B
For example,
TYPE2B D2_MIE 12 6 ELEMENT1 ELEMENT2
results in a Lennard-Jones-type descriptor.
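The claim that the exponents 12 and 6 recover the Lennard-Jones form can be verified from the prefactor formula above, which gives C = 4 in that case. The sketch below is illustrative; the function names are hypothetical and the cutoff function is omitted.

```python
def mie_prefactor(n, m):
    """C = n/(n-m) * (n/m)^(m/(n-m)) for repulsive exponent n > attractive exponent m > 0."""
    if not n > m > 0:
        raise ValueError("require n > m > 0")
    return n / (n - m) * (n / m) ** (m / (n - m))


def mie_pair(r, sigma, epsilon, n, m):
    """Single pair term C*eps*((sigma/r)^n - (sigma/r)^m), without a cutoff."""
    return mie_prefactor(n, m) * epsilon * ((sigma / r) ** n - (sigma / r) ** m)


print(mie_prefactor(12, 6))

# With this prefactor the well depth is epsilon, reached at r = (n/m)^(1/(n-m)) * sigma
r_min = (12 / 6) ** (1 / (12 - 6))
print(mie_pair(r_min, sigma=1.0, epsilon=1.0, n=12, m=6))
```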
-
class D2_ZBL : public tadah::models::D2_Base
ZBL Descriptor.
The ZBL (Ziegler-Biersack-Littmark) potential is an empirical potential used to model short-range interactions between atoms.
The constant term \( \frac{e^2}{4 \pi \varepsilon_0 } \) is set to 1 and will be fitted as needed.
The simplified expression for the ZBL potential is given by:
\[ V(r) = \frac{Z_1 Z_2}{r} \phi\left(\frac{r}{a}\right) \]
where \( a \) is the screening length, expressed as:
\[ a = \frac{s_0 a_0}{Z_1^{p_0} + Z_2^{p_1}} \]
Here, \( a_0 \), \( s_0 \), \( p_0 \), and \( p_1 \) are adjustable hyperparameters. Setting any of these to -1 uses the default values:
\( a_0 = 0.52917721067 \, \text{Å} \)
\( s_0 = 0.88534 \)
\( p_0 = 0.23 \)
\( p_1 = 0.23 \)
The screening function \( \phi \) is defined as:
\[ \phi(x) = 0.1818 e^{-3.2x} + 0.5099 e^{-0.9423x} + 0.2802 e^{-0.4029x} + 0.02817 e^{-0.2016x} \]
Required tadah::core::Context Key: INIT2B TYPE2B
TYPE2B D2_ZBL \( a_0 \) \( s_0 \) \( p_0 \) \( p_1 \) ELEMENT1 ELEMENT2
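The expressions above for the screening length and the screening function can be evaluated with a few lines of code. This is a sketch under stated assumptions, not the library's implementation: the function names are hypothetical, the -1 convention for defaults is mimicked with keyword arguments, and the constant prefactor is left at 1 as in the text.

```python
import math

# Default hyperparameters quoted in the text above
DEFAULTS = {"a0": 0.52917721067, "s0": 0.88534, "p0": 0.23, "p1": 0.23}


def screening_length(z1, z2, a0=-1, s0=-1, p0=-1, p1=-1):
    """a = s0*a0 / (z1^p0 + z2^p1); a value of -1 selects the default."""
    given = {"a0": a0, "s0": s0, "p0": p0, "p1": p1}
    p = {k: (DEFAULTS[k] if v == -1 else v) for k, v in given.items()}
    return p["s0"] * p["a0"] / (z1 ** p["p0"] + z2 ** p["p1"])


def phi(x):
    """ZBL screening function: a sum of four decaying exponentials."""
    return (0.1818 * math.exp(-3.2 * x) + 0.5099 * math.exp(-0.9423 * x)
            + 0.2802 * math.exp(-0.4029 * x) + 0.02817 * math.exp(-0.2016 * x))


def zbl(r, z1, z2, **hyper):
    """V(r) = z1*z2/r * phi(r/a), with the constant prefactor set to 1."""
    return z1 * z2 / r * phi(r / screening_length(z1, z2, **hyper))


# Ti-Ti pair (Z = 22) with all hyperparameters left at their defaults
for r in (0.5, 1.0, 2.0):
    print(r, zbl(r, 22, 22))
```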
Examples:
-
class D2_Dummy : public tadah::models::D2_Base
Dummy two-body descriptor.
Use it to satisfy DescriptorsCalc requirements when a two-body descriptor is not required.
-
class D2_mJoin : public tadah::models::D2_Base, public tadah::models::D_mJoin
Meta two-body descriptor for combining multiple D2 descriptors.
This descriptor provides a convenient interface for concatenating multiple two-body descriptors. The resulting descriptor can then be used by Tadah! like any standard two-body descriptor.
Each descriptor must have a specified type in a configuration file, along with a cutoff function, cutoff distance, and optionally SGRID2B and CGRID2B values if applicable.
When listing descriptors under the TYPE2B key, you must include parameters relevant to this descriptor.
Here is an example of how to configure these descriptors:
TYPE2B D2_mJoin                 # <-- Meta descriptor for concatenating two-body descriptors
TYPE2B D2_MIE 11 6 Ti Ti        # <-- MIE exponents
RCTYPE2B Cut_Cos
RCUT2B 3.0
TYPE2B D2_Blip 6 6 Ti Nb Nb Nb  # <-- grid sizes
RCTYPE2B Cut_Tanh
RCUT2B 7.5
SGRID2B -2 6 0.1 10             # Grid for D2_Blip, blip widths, auto-generated
CGRID2B 0 0 0 0 0 0             # Grid for D2_Blip, blip centers
Note: Grids can be specified on a single line, and the order of the grids is important.
There is no limit to the number of descriptors that can be concatenated.
Ensure the types and grids are correctly specified in the configuration file.
The cutoff functions (RCTYPE2B) and distances (RCUT2B) must be defined for each descriptor.
Both SGRID2B and CGRID2B should be included if relevant, with their sizes matching the given descriptors.
Many-Body
-
class DM_Blip : public tadah::models::DM_Base
Blip Many Body Descriptor.
\[ V_i^{L,\eta,r_s} = \sum_{l_x,l_y,l_z}^{l_x+l_y+l_z=L} \frac{L!}{l_x!l_y!l_z!} \Big( \rho_i^{\eta,r_s,l_x,l_y,l_z} \Big)^2 \]
where the density \( \rho \) is calculated using modified Gaussian Type Orbitals (an expansion in the blip basis instead of the usual Gaussians):
\[ \rho_i^{\eta,r_s,l_x,l_y,l_z} = \sum_{j \neq i} x_{ij}^{l_x}y_{ij}^{l_y}z_{ij}^{l_z} \mathcal{B}{\Big(\eta(r_{ij}-r_s)\Big)}f_c(r_{ij}) \]
CGRIDMB parameters control the position \( r_s \) of the blip basis function.
SGRIDMB parameters control the width \( \eta \) of the blip basis function.
For example, \(L_{max}=2\) will calculate descriptors with \( L=0,1,2 \) (s, p, d orbitals).
More information about this descriptor:
Zhang, Y., Hu, C., Jiang, B. (2019). Embedded atom neural network potentials: efficient and accurate machine learning with a physically inspired representation. Journal of Physical Chemistry Letters, 10(17), 4962–4967. https://doi.org/10.1021/acs.jpclett.9b02037
Required tadah::core::Context keys: INITMB CGRIDMB SGRIDMB
-
class DM_EAD : public tadah::models::DM_Base
Embedded Atom Descriptor
\[ V_i^{L,\eta,r_s} = \sum_{l_x,l_y,l_z}^{l_x+l_y+l_z=L} \frac{L!}{l_x!l_y!l_z!} \Big( \rho_i^{\eta,r_s,l_x,l_y,l_z} \Big)^2 \]
where the density \( \rho \) is calculated using Gaussian Type Orbitals:
\[ \rho_i^{\eta,r_s,l_x,l_y,l_z} = \sum_{j \neq i} x_{ij}^{l_x}y_{ij}^{l_y}z_{ij}^{l_z} \exp{\Big(-\eta(r_{ij}-r_s)^2\Big)}f_c(r_{ij}) \]
CGRIDMB parameters control the position \( r_s \) of the Gaussian basis function.
SGRIDMB parameters control the width \( \eta \) of the Gaussian basis function.
For example, \(L_{max}=2\) will calculate descriptors with \( L=0,1,2 \) (s, p, d orbitals).
More information about this descriptor:
Zhang, Y., Hu, C., Jiang, B. (2019). Embedded atom neural network potentials: efficient and accurate machine learning with a physically inspired representation. Journal of Physical Chemistry Letters, 10(17), 4962–4967. https://doi.org/10.1021/acs.jpclett.9b02037
Required tadah::core::Context keys: INITMB CGRIDMB SGRIDMB
-
class DM_EAM : public tadah::models::DM_Base
Many-body part of the Embedded Atom Method descriptor.
\[ V_i = F\Bigg(\sum_{j \neq i} \rho(r_{ij}) \Bigg) \]
This descriptor loads tabulated values of the density \( \rho \) and the embedding energy \( F \) from the provided SETFL file.
This descriptor is usually used together with the two-body descriptor D2_EAM, although this is not required: it can be mixed with any other descriptor or used on its own.
This descriptor enforces the cutoff distance specified in the SETFL file. Set RCUTMB to the same value to suppress the warning message.
Required tadah::core::Context keys: INITMB SETFL
-
template<typename F>
class DM_mEAD : public tadah::models::DM_Base
Modified Embedded Atom Descriptor
REQUIRED KEYS: SGRIDMB, CGRIDMB, and KEYS OF THE EMBEDDING FUNCTION
This descriptor has a mathematical form very similar to DM_EAD but allows the usage of a custom-defined embedding function, \( \mathcal{F} \), in place of the default quadratic one.
Available implementations:
\[ V_i^{L,\eta,r_s} = \sum_{l_x,l_y,l_z}^{l_x+l_y+l_z=L} \frac{L!}{l_x!l_y!l_z!} \mathcal{F}\Big( \rho_i^{\eta,r_s,l_x,l_y,l_z} \Big) \]
where the density \( \rho \) is calculated using Gaussian Type Orbitals:
\[ \rho_i^{\eta,r_s,l_x,l_y,l_z} = \sum_{j \neq i} x_{ij}^{l_x} y_{ij}^{l_y} z_{ij}^{l_z} \exp{\Big(-\eta(r_{ij}-r_s)^2\Big)} f_c(r_{ij}) \]
CGRIDMB parameters control the position \( r_s \) of the Gaussian basis function.
SGRIDMB parameters control the width \( \eta \) of the Gaussian basis function.
e.g., \(L_{max}=2\) will calculate descriptors with \( L=0,1,2 \) (s, p, d orbitals).
# TYPEMB params: L, size(cgrid), size(sgrid),
#                size(cembfunc), size(sembfunc), list of element pairs
TYPEMB DM_mRLR 0 7 7 1 1 Ta Ta
RCTYPEMB Cut_Tanh
RCUTMB 7.5
SGRIDMB -2 7 0.1 10
CGRIDMB 0 0 0 0 0 0 0
SEMBFUNC 1.2
CEMBFUNC 0.5
# TYPEMB params: L, size(cgrid), size(sgrid),
#                size(cembfunc), size(sembfunc), list of element pairs
TYPEMB DM_mSQRT 1 5 5 0 1 * *
RCTYPEMB Cut_Cos
RCUTMB 3.0
CGRIDMB -1 5 0 3.0
SGRIDMB -2 5 1.0 10.0
SEMBFUNC 1.5
Required Config keys: INITMB CGRIDMB SGRIDMB
DM_mEAD functions
-
class F_RLR : public tadah::models::F_Base
Implements an embedding function of the form: \( s \rho \log(c \rho) \).
This class supports embedding functions characterized by two main parameters:
SEMBFUNC: Controls the depth, \( s \), of the embedding function.
CEMBFUNC: Sets the x-intercept of the embedding function, which lies at \( \rho = 1/c \).
Requires: size(SEMBFUNC)=size(CEMBFUNC)=size([C/S]GRIDMB)
The number of values given for these keys must match the corresponding entries on the mEAD TYPEMB line.
-
class F_SQ : public tadah::models::F_Base
-
class F_SQRT : public tadah::models::F_Base
Implements \( s \sqrt{\rho} \).
Optional parameter:
SEMBFUNC: Controls the strength, \( s \), of the embedding function. If no value is provided, the default is 1 for every s/cgrid point.
Requires: size(SEMBFUNC)=(0 or size([C/S]GRIDMB)) and size(CEMBFUNC)=0
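The two documented embedding forms, \( s \rho \log(c \rho) \) for F_RLR and \( s \sqrt{\rho} \) for F_SQRT, can be sketched as plain functions to see how SEMBFUNC and CEMBFUNC values act. This is an illustration with hypothetical function names; the handling of \( \rho = 0 \) (returning the limiting value 0) is a choice made for this sketch and is not taken from the library.

```python
import math


def f_rlr(rho, s, c):
    """s * rho * log(c * rho); crosses zero at rho = 1/c."""
    if rho < 0 or c <= 0:
        raise ValueError("require rho >= 0 and c > 0")
    if rho == 0:
        return 0.0  # limit of rho*log(rho) as rho -> 0 (choice made for this sketch)
    return s * rho * math.log(c * rho)


def f_sqrt(rho, s=1.0):
    """s * sqrt(rho); s defaults to 1 as described for SEMBFUNC above."""
    if rho < 0:
        raise ValueError("require rho >= 0")
    return s * math.sqrt(rho)


# Values taken from the configuration examples above: SEMBFUNC 1.2, CEMBFUNC 0.5
s, c = 1.2, 0.5
for rho in (0.5, 1.0 / c, 4.0):
    print(rho, f_rlr(rho, s, c), f_sqrt(rho, 1.5))
```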
DM_Dummy
-
class DM_Dummy : public tadah::models::DM_Base
Dummy many-body descriptor.
Use it to satisfy DescriptorsCalc requirements when a many-body descriptor is not required.
DM_mJoin
-
class DM_mJoin : public tadah::models::DM_Base, public tadah::models::D_mJoin
Meta many-body descriptor for combining multiple DM descriptors.
This descriptor provides an interface for concatenating various many-body descriptors. The resulting descriptor can then be used by Tadah! like any standard many-body descriptor.
Each descriptor must have a specified type in a configuration file, along with a cutoff function, cutoff distance, and other optional keys that are typically expected for this descriptor, such as SGRIDMB and CGRIDMB.
When listing descriptors under the TYPEMB key, include parameters relevant to this descriptor.
Here is an example of configuring these descriptors:
TYPEMB DM_mJoin             # Meta descriptor for concatenating many-body descriptors
TYPEMB DM_EAD 1 5 5 * *     # L number, cgrid, sgrid, list of element pairs
RCTYPEMB Cut_Cos
RCUTMB 3.0
CGRIDMB -1 5 0 3.0          # Grid for DM_EAD, Gaussian centers, auto-generated
SGRIDMB -2 5 1.0 10.0       # Grid for DM_EAD, Gaussian widths, auto-generated
TYPEMB DM_Blip 0 7 7 Ta Ta  # L number, cgrid, sgrid, list of element pairs
RCTYPEMB Cut_Tanh
RCUTMB 7.5
SGRIDMB -2 7 0.1 10         # Grid for DM_Blip, blip widths, auto-generated
CGRIDMB 0 0 0 0 0 0 0       # Grid for DM_Blip, blip centers
Note: Grids can be specified on a single line, and the order of the grids should match the order of descriptors.
There is no limit to the number of descriptors that can be concatenated.
Ensure the types and grids are correctly specified in the configuration file.
The cutoff functions (RCTYPEMB) and distances (RCUTMB) must be defined for each descriptor.
Both SGRIDMB and CGRIDMB should be included if relevant, with their sizes matching the given descriptors.