Universal ML models
New: we highly recommend checking out UAIQM, our ultimate solution for universal ML models.
MLatom supports a wide range of universal machine learning (ML)-based models, including ML potentials and hybrid ML-enhanced quantum mechanical (QM) methods. They can be used out of the box without training. The table below lists the available methods, with links to specific tutorials:
| method | model type | keywords in MLatom | elements supported | gradients | hessian | charged | radicals | excited states |
|---|---|---|---|---|---|---|---|---|
| | ML & ML/QM | | all elements | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | not available yet |
| | ML/QM | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) |
| | ML/QM | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) |
| | ML/QM | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) |
| | ML/QM | | H, C, N, O | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | ML/QM | | H, C, N, O | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | ML/QM | | H, C, N, O | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | ML/QM | | main-group elements | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | ML/QM | | main-group elements | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | ML/QM | | main-group elements | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | ML/QM | | main-group elements | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\times\) |
| | MLP | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\times\) |
| | MLP | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\times\) |
| | MLP | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\times\) |
| | MLP | | H, C, N, O, F, S, Cl | \(\checkmark\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\times\) |
| | MLP | | H, C, N, O, F, S, Cl | \(\checkmark\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\times\) |
| | MLP | | H, C, N, O | \(\checkmark\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\times\) |
| | MLP | | H, B, C, N, O, F, Si, P, S, Cl, As, Se, Br, I | \(\checkmark\) | \(\times\) | \(\checkmark\) | \(\times\) | \(\times\) |
| | MLP | | H, B, C, N, O, F, Si, P, S, Cl, As, Se, Br, I | \(\checkmark\) | \(\times\) | \(\checkmark\) | \(\times\) | \(\times\) |
In this tutorial, we first introduce how to use these universal methods to perform various tasks with MLatom in general. Then we go through each method in detail.
Using universal ML models
Below is a brief general overview of how to use universal ML models with MLatom.
Single-point calculations
A single-point calculation needs only 3-5 lines in the MLatom input file, as usual, with one of the methods above specified:
AIMNet2@b973c
xyzfile=sp.xyz
yestfile=energy.dat
where sp.xyz is a file with the XYZ geometries of the molecule(s). The geometries can also be defined explicitly in the input file:
AIMNet2@b973c
xyzfile='
2

H 0.000000 0.000000 0.363008
H 0.000000 0.000000 -0.363008
5

C 0.000000 0.000000 0.000000
H 0.627580 0.627580 0.627580
H -0.627580 -0.627580 0.627580
H -0.627580 0.627580 -0.627580
H 0.627580 -0.627580 -0.627580
'
yestfile=energy.dat
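For many molecules it can be easier to generate the XYZ text programmatically than to type it by hand. The snippet below is a minimal sketch in plain Python (no MLatom calls); the helper name `to_xyz_text` and the H2 coordinates are illustrative assumptions. It follows the standard XYZ layout: the number of atoms, a comment line (which may be empty), then one line per atom.

```python
def to_xyz_text(molecules):
    """Return multi-molecule XYZ text for a list of (comment, atoms) pairs.

    Each atoms entry is a list of (symbol, x, y, z) tuples in Angstrom.
    """
    lines = []
    for comment, atoms in molecules:
        lines.append(str(len(atoms)))  # line 1: number of atoms
        lines.append(comment)          # line 2: comment, may be empty
        for symbol, x, y, z in atoms:
            lines.append(f"{symbol} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


# Hypothetical H2 geometry, placed symmetrically about the origin
h2 = ("", [("H", 0.0, 0.0, 0.363008), ("H", 0.0, 0.0, -0.363008)])
xyz_text = to_xyz_text([h2])
print(xyz_text)
```

The returned string can be saved to a file such as sp.xyz or pasted between the quotes of the xyzfile keyword.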
With the keywords ygradxyzestfile and hessianestfile, the gradients and the Hessian are also saved to the specified files.
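When many similar inputs are needed, the input text can be assembled in a script. The following is a sketch under stated assumptions: `build_single_point_input` is a hypothetical helper, not part of MLatom, and the output file names are placeholders. Only the keywords already named in the text (xyzfile, yestfile, ygradxyzestfile, hessianestfile) are used.

```python
def build_single_point_input(method, xyzfile, energy_file,
                             gradient_file=None, hessian_file=None):
    """Assemble the text of a single-point input file, one keyword per line."""
    lines = [method, f"xyzfile={xyzfile}", f"yestfile={energy_file}"]
    if gradient_file is not None:
        lines.append(f"ygradxyzestfile={gradient_file}")  # request gradients
    if hessian_file is not None:
        lines.append(f"hessianestfile={hessian_file}")    # request Hessian
    return "\n".join(lines) + "\n"


# Placeholder file names; the method keyword is taken from the example above
input_text = build_single_point_input(
    "AIMNet2@b973c", "sp.xyz", "energy.dat", gradient_file="grad.dat")
print(input_text)
```

Leaving the gradient and Hessian arguments unset reproduces the three-line energy-only input shown earlier.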
Since the DM21 functionals are integrated in PySCF, the user needs to specify the method and qmprog keywords in the input file, in the same way as QM methods are defined in MLatom. Here is an example of the input file for using the DM21 functional with the 6-31G* basis set:
method=DM21/6-31G*
qmprog=pyscf
xyzfile=sp.xyz
yestfile=energy.dat
The Python API provides a flexible alternative for using methods in MLatom. Here, the user can define a method with the mlatom.models.methods module and specify the keywords mentioned above, e.g.
method = mlatom.models.methods(method='ANI-1xnr')
# method = mlatom.models.methods(method='DM21/6-31G*', program='pyscf')
Here is an example of calculating the energy, gradients and Hessian with ANI-1xnr.
import mlatom as ml
# read molecules from .xyz file
molDB = ml.data.molecular_database.from_xyz_file('sp.xyz')
# define method
model = ml.models.methods(method='ANI-1xnr')
# run the calculations; the results are stored in the molecule objects
model.predict(
    molecular_database=molDB,
    calculate_energy=True,
    calculate_energy_gradients=True,
    calculate_hessian=True)
print(f'Energy in Hartree for molecule 0: {molDB[0].energy}')
print(f'Gradients in Hartree/Angstrom for molecule 1: {molDB[1].get_energy_gradients()}')
print(f'Hessian in Hartree/Angstrom^2 for molecule 1: {molDB[1].hessian}')
For more details on how to perform single-point calculations with MLatom, please check our tutorial.
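Energies are reported in Hartree, so a common follow-up step is converting them to relative energies in kcal/mol, for example when the database holds several geometries of the same molecule. The sketch below uses plain Python; the helper name and the two energy values are hypothetical, and only the conversion factor (1 Hartree ≈ 627.5095 kcal/mol) is a fixed fact.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095


def relative_energies_kcal_per_mol(energies_hartree):
    """Return energies relative to the lowest one, converted to kcal/mol."""
    reference = min(energies_hartree)
    return [(e - reference) * HARTREE_TO_KCAL_PER_MOL for e in energies_hartree]


# Hypothetical energies (Hartree) of two geometries of the same molecule
energies = [-1.0, -0.99]
relative = relative_energies_kcal_per_mol(energies)
print(relative)
```

Relative energies are only meaningful between structures with the same composition, so this should not be applied across different molecules such as H2 and CH4.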
Geometry optimization and frequency calculations
Geometry optimization is a common task in studying chemical systems, and a subsequent frequency calculation on the optimized molecule, together with the thermochemical properties it yields, is also crucial for analysis.
To perform a geometry optimization with an MLatom input file, the user just needs to request the geomopt option in the first line, along with the method to be used and the initial geometry guess:
geomopt # 1. requests geometry optimization
ANI-1ccx # 2. universal MLP
xyzfile=' # 3. initial geometry guess
9

C 1.691449880 0.315985130 0.000000000
H 1.334777040 -0.188413060 0.873651500
H 1.334777040 -0.188413060 -0.873651500
H 2.761449880 0.315971940 0.000000000
C 1.178134160 1.767917280 0.000000000
H 1.534806620 2.272315330 0.873651740
H 1.534807450 2.272316160 -0.873650920
O -0.251865840 1.767934180 0.000001150
H -0.572301420 2.672876720 0.000175020
'
optxyz=opt.xyz # 4. (optional) file with optimized geometry.
optprog=geometric # 5. request geometric optimizer
Each optimization step is printed to the output file; since version 3.4.0, the amount of output can be controlled with the printall and printmin keywords. The user can also choose whether to dump the optimization trajectory with the keyword dumpopttrajs.
After geometry optimization, a frequency calculation can be performed with the freq option in the input file, as shown below:
freq # 1. requests frequency calculation
ANI-1ccx # 2. universal MLP
xyzfile=' # 3. optimized geometry
9

C 1.672571 0.341122 0.000001
H 1.307766 -0.181713 0.885095
H 1.307762 -0.181707 -0.885099
H 2.764560 0.305014 0.000003
C 1.188732 1.771664 0.000009
H 1.559124 2.298647 0.885998
H 1.559099 2.298653 -0.885987
O -0.237878 1.729915 0.000028
H -0.575701 2.626896 0.000135
'
In the output file, the user will find the vibrational analysis, including the frequency, reduced mass and force constant of each normal mode, as well as the thermochemistry results. The output file for this case is available for download.
For more details on these two tasks with MLatom, please check our tutorials on geometry optimization and frequency calculations.
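The three quantities reported per normal mode are related through the harmonic-oscillator formula, wavenumber = sqrt(k/μ) / (2πc). The sketch below evaluates it in plain Python. It assumes the force constant is given in mDyne/Å and the reduced mass in amu, a common convention in quantum chemistry output; the units in a particular output file should be checked before relying on this.

```python
import math

AMU_TO_KG = 1.66053906660e-27           # atomic mass unit in kg
MDYNE_PER_ANGSTROM_TO_N_PER_M = 100.0   # 1 mDyne/Angstrom = 100 N/m
SPEED_OF_LIGHT_CM_PER_S = 2.99792458e10


def harmonic_wavenumber(force_constant_mdyne_per_angstrom, reduced_mass_amu):
    """Harmonic wavenumber in cm^-1 from a force constant and a reduced mass."""
    if force_constant_mdyne_per_angstrom <= 0 or reduced_mass_amu <= 0:
        raise ValueError("force constant and reduced mass must be positive")
    k = force_constant_mdyne_per_angstrom * MDYNE_PER_ANGSTROM_TO_N_PER_M
    mu = reduced_mass_amu * AMU_TO_KG
    angular_frequency = math.sqrt(k / mu)  # rad/s
    return angular_frequency / (2.0 * math.pi * SPEED_OF_LIGHT_CM_PER_S)


# Hypothetical mode: k = 5.0 mDyne/Angstrom, reduced mass = 1.05 amu
print(harmonic_wavenumber(5.0, 1.05))
```

Modes with negative force constants (imaginary frequencies) are outside the scope of this helper, which is why it rejects non-positive input.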
Molecular dynamics
One of the advantages of machine learning potentials is their ultrafast speed: thousands of trajectories can be propagated within several hours, compared with a few weeks for commonly used DFT methods (that is, if you do not use DM21). MLatom provides an easy way to run MD, as well as quasi-classical MD, which is popular in simulations of chemical reactions.
With an MLatom input file, the only difference here is the keyword for the method. For example, if you want to use AIMNet2 targeting RKS B97-3c to run dynamics of the hydrogen molecule in the NVT ensemble with the Nosé–Hoover thermostat, the input file can look like this:
MD # 1. requests molecular dynamics
AIMNet2@b973c # 2. use AIMNet2@B97-3c method
initConditions=user-defined # 3. use user-defined initial conditions
initXYZ=h2_init.xyz # 4. file with initial geometry; Unit: Angstrom
initVXYZ=h2_init.vxyz # 5. file with initial velocity; Unit: Angstrom/fs
dt=0.3 # 6. time step; Unit: fs
trun=30 # 7. total time; Unit: fs
thermostat=Nose-Hoover # 8. use Nose-Hoover thermostat
ensemble=NVT # 9. NVT ensemble
temperature=300 # 10. Run MD at 300 Kelvin
The initial XYZ coordinates and velocities can be downloaded here: h2_init.xyz, h2_init.vxyz
We also provide below a snippet that runs the same task with the Python API. As usual, only the code that defines the method changes.
import mlatom as ml
# Use user-defined initial conditions
mol = ml.data.molecule.from_xyz_file('h2_init.xyz')
init_cond_db = ml.generate_initial_conditions(molecule=mol,
                                              generation_method='user-defined',
                                              file_with_initial_xyz_coordinates='h2_init.xyz',
                                              file_with_initial_xyz_velocities='h2_init.vxyz')
init_mol = init_cond_db[0]
# Initializing model
model = ml.models.methods(method='AIMNet2@b973c')
# Initializing thermostat
nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)
# Run molecular dynamics
dyn = ml.md(model=model,
            molecule_with_initial_conditions=init_mol,
            thermostat=nose_hoover,
            ensemble='NVT',
            time_step=0.3,
            maximum_propagation_time=30.0)
# Dump trajectory
traj = dyn.molecular_trajectory
traj.dump(filename='traj', format='plain_text')
traj.dump(filename='traj.h5', format='h5md')
print(f"Number of steps in the trajectory: {len(traj.steps)}")
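Before launching a run, it can help to sanity-check the settings: how many integration steps the chosen time step and total time imply, and what instantaneous temperature the initial velocities correspond to. The sketch below does both in plain Python, independently of MLatom. The helper names, the example velocities, and the default of 3N degrees of freedom are assumptions made for illustration; the units match those of the input above (fs, Angstrom/fs).

```python
AMU_TO_KG = 1.66053906660e-27
ANGSTROM_PER_FS_TO_M_PER_S = 1.0e5
BOLTZMANN_J_PER_K = 1.380649e-23


def number_of_steps(total_time_fs, time_step_fs):
    """Number of integration steps implied by the total time and time step."""
    return round(total_time_fs / time_step_fs)


def kinetic_temperature(masses_amu, velocities_angstrom_per_fs, n_dof=None):
    """Instantaneous temperature in K from T = 2*E_kin / (n_dof * k_B)."""
    kinetic_energy_joule = 0.0
    for mass, (vx, vy, vz) in zip(masses_amu, velocities_angstrom_per_fs):
        speed_squared = (vx * vx + vy * vy + vz * vz) * ANGSTROM_PER_FS_TO_M_PER_S ** 2
        kinetic_energy_joule += 0.5 * mass * AMU_TO_KG * speed_squared
    if n_dof is None:
        n_dof = 3 * len(masses_amu)  # assumption: no constraints removed
    return 2.0 * kinetic_energy_joule / (n_dof * BOLTZMANN_J_PER_K)


steps = number_of_steps(total_time_fs=30.0, time_step_fs=0.3)
# Hypothetical H2 velocities: atoms moving apart along z at 0.01 Angstrom/fs
temperature = kinetic_temperature([1.008, 1.008],
                                  [(0.0, 0.0, 0.01), (0.0, 0.0, -0.01)])
print(steps, temperature)
```

The number of degrees of freedom depends on the convention used (for example, whether translation and rotation are removed), so `n_dof` is left as a parameter.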
AIQM1
AIQM1 (artificial intelligence–quantum mechanical method 1) is a general-purpose method that approaches the accuracy of the gold-standard coupled cluster quantum mechanical method at the high computational speed of approximate low-level semiempirical quantum mechanical methods for ground-state, closed-shell species. It is also transferable, with good accuracy, to charged and radical species and to excited-state calculations. See the AIQM1 paper for more details. Please cite this paper alongside other required citations:
Peikun Zheng, Roman Zubatyuk, Wei Wu, Olexandr Isayev, Pavlo O. Dral. Artificial Intelligence-Enhanced Quantum Chemical Method with Broad Applicability. Nat. Commun. 2021, 12, 7022. DOI: 10.1038/s41467-021-27340-2.
Strengths: AIQM1 is especially good for energy calculations and geometry optimizations of closed-shell molecules in their ground state.
Limitations: This method is currently limited to compounds only containing H, C, N, and O elements.
The detailed tutorial is available.
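Because several universal models cover only a fixed set of elements, it can be useful to check a molecule's composition before running a calculation. The sketch below is plain Python and not an MLatom feature: the helper name and the dictionary keys are illustrative labels, and the element sets are the ones quoted in this document.

```python
# Element coverage as stated in this document; keys are illustrative labels
SUPPORTED_ELEMENTS = {
    "AIQM1": {"H", "C", "N", "O"},
    "ANI-1ccx": {"H", "C", "N", "O"},
    "ANI-2x": {"H", "C", "N", "O", "F", "S", "Cl"},
    "AIMNet2": {"H", "B", "C", "N", "O", "F", "Si", "P", "S", "Cl",
                "As", "Se", "Br", "I"},
}


def unsupported_elements(method, symbols):
    """Return the sorted element symbols that the given method does not cover."""
    return sorted(set(symbols) - SUPPORTED_ELEMENTS[method])


# Chloromethane contains Cl, which is outside the H, C, N, O set of AIQM1
chloromethane = ["C", "H", "H", "H", "Cl"]
print(unsupported_elements("AIQM1", chloromethane))
print(unsupported_elements("AIMNet2", chloromethane))
```

An empty list means every element in the molecule is covered by the chosen method.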
DM21
DM21 is an ML-enhanced DFT method published in Science by DeepMind (please cite it when you use this method). Our installation follows the GitHub page. There are four variants of DM21 (DM21 (default), DM21m, DM21mc, DM21mu); see the above GitHub page for details.
Using DM21 and its variants is similar to using common DFT functionals: the user needs to specify both the functional and the basis set. Note that DM21 is not stable and the SCF is not guaranteed to converge. Predictions take longer than with the previous methods; by default, MLatom starts from the relatively cheap B3LYP functional, as suggested by the official DM21 documentation, to make the SCF faster. In the current implementation, DM21 can only be used for single-point calculations (the interface program does not provide gradients or Hessians, and numerical derivatives are not yet implemented for this method).
Example of an input file:
method=DM21/6-31G*
qmprog=pyscf
xyzfile='
2

H 0.000000 0.000000 0.363008
H 0.000000 0.000000 -0.363008
5

C 0.000000 0.000000 0.000000
H 0.627580 0.627580 0.627580
H -0.627580 -0.627580 0.627580
H -0.627580 0.627580 -0.627580
H 0.627580 -0.627580 -0.627580
'
yestfile=energy.dat
In Python:
import mlatom as ml
# read molecules from .xyz file
molDB = ml.data.molecular_database.from_xyz_file('sp.xyz')
# define method
method = ml.models.methods(method='DM21/6-31G*', program='pyscf')
# only energies are available for DM21 (no gradients or Hessians)
method.predict(
    molecular_database=molDB,
    calculate_energy=True)
print(f'Energy in Hartree for molecule 0: {molDB[0].energy}')
print(f'Energy in Hartree for molecule 1: {molDB[1].energy}')
ANI models zoo
MLatom contains 3 public models from the ANI model zoo in TorchANI: ANI-1x, ANI-1ccx and ANI-2x. In addition, MLatom supports the D4-dispersion-corrected methods ANI-1x-D4 and ANI-2x-D4. Below are some useful notes on using these methods in MLatom.
ANI-1x and ANI-2x were trained on DFT-level data.
ANI-1ccx has the highest accuracy, targeting CCSD(T)*/CBS.
ANI-1ccx and ANI-1x are limited to the elements C, H, N and O, while ANI-2x can be used for C, H, N, O, F, Cl and S.
The D4 dispersion correction in ANI-1x-D4 and ANI-2x-D4 corresponds to the ωB97X functional.
These methods are limited to predicting energies and forces for neutral closed-shell compounds in their ground state.
MLatom reports uncertainties for calculations with these methods, based on the standard deviation between the neural network (NN) predictions.
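The idea behind the reported uncertainty can be illustrated with a few lines of plain Python: each network in an ensemble predicts an energy, the mean is taken as the prediction, and the spread between the networks serves as the uncertainty. This is a sketch of the general idea, not of MLatom's implementation; the four member energies are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def ensemble_prediction(member_energies):
    """Return (mean, sample standard deviation) of the ensemble members."""
    mean = statistics.mean(member_energies)
    spread = statistics.stdev(member_energies)  # assumption: sample std (n - 1)
    return mean, spread


# Hypothetical energies (Hartree) predicted by four networks for one geometry
members = [-40.50, -40.52, -40.51, -40.49]
energy, uncertainty = ensemble_prediction(members)
print(energy, uncertainty)
```

A large spread between the networks suggests that the geometry lies far from the training data, so the prediction should be treated with caution.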
Example of an input file:
ANI-1ccx
geomopt
xyzfile='
2

H 0.000000 0.000000 0.363008
H 0.000000 0.000000 -0.363008
5

C 0.000000 0.000000 0.000000
H 0.627580 0.627580 0.627580
H -0.627580 -0.627580 0.627580
H -0.627580 0.627580 -0.627580
H 0.627580 -0.627580 -0.627580
'
In Python:
import mlatom as ml
# read molecules from .xyz file
molDB = ml.data.molecular_database.from_xyz_file('sp.xyz')
# define method
method = ml.models.methods(method='ANI-1ccx')
# run the calculations; the results are stored in the molecule objects
method.predict(
    molecular_database=molDB,
    calculate_energy=True,
    calculate_energy_gradients=True,
    calculate_hessian=True)
print(f'Energy in Hartree for molecule 0: {molDB[0].energy}')
print(f'Gradients in Hartree/Angstrom for molecule 1: {molDB[1].get_energy_gradients()}')
print(f'Hessian in Hartree/Angstrom^2 for molecule 1: {molDB[1].hessian}')
Reactive ANI: ANI-1xnr
ANI-1xnr is a general reactive ANI-type NN trained on condensed-phase reactive data and capable of simulating real-world reactive systems containing the elements C, H, N and O; see the Nature Chemistry publication. It is implemented by interfacing to the model from the ani-1xnr GitHub repository.
Note
The first time any of the models is instantiated, it will be downloaded automatically from the ani-model-zoo repository to the local folder ./local. The user can choose to download the models beforehand.
The input is analogous to that of the other ANI models.
AIMNet2
AIMNet2 aims to solve the problems of ANI, which is less capable of dealing with non-local interactions and open-shell charged species. There are two pretrained models, targeting B97-3c and ωB97M-D3 accuracy (the user needs to choose one of them with the keyword aimnet2@b973c or aimnet2@wb97m-d3). It is applicable to 14 elements: H, B, C, N, O, F, Si, P, S, Cl, As, Se, Br and I. Currently, the Hessian is not supported in MLatom.
Note
The first time any of the models is instantiated, it will be downloaded automatically from the AIMNet2 GitHub repository to the local folder ./local. The user can choose to download the models beforehand.
Example of an input file:
AIMNet2@wb97m-d3
geomopt
xyzfile='
2

H 0.000000 0.000000 0.363008
H 0.000000 0.000000 -0.363008
5

C 0.000000 0.000000 0.000000
H 0.627580 0.627580 0.627580
H -0.627580 -0.627580 0.627580
H -0.627580 0.627580 -0.627580
H 0.627580 -0.627580 -0.627580
'
In Python:
import mlatom as ml
# read molecules from .xyz file
molDB = ml.data.molecular_database.from_xyz_file('sp.xyz')
# define method
method = ml.models.methods(method='AIMNet2@wb97m-d3')
# the Hessian is not supported for AIMNet2, so only energies and gradients are requested
method.predict(
    molecular_database=molDB,
    calculate_energy=True,
    calculate_energy_gradients=True)
print(f'Energy in Hartree for molecule 0: {molDB[0].energy}')
print(f'Gradients in Hartree/Angstrom for molecule 1: {molDB[1].get_energy_gradients()}')