This tutorial shows how to run molecular dynamics with MLatom.
We start with the example of running dynamics via input file/command line and later give a tutorial on how to use more versatile Python API. For the tutorial, you can download the files with the initial XYZ coordinates h2_md_kreg_init.xyz
, initial XYZ velocities h2_md_kreg_init.vxyz
and model file h2_kreg_energies.unf
.
The input file h2_md_kreg.inp
shows how to run dynamics for the hydrogen molecule in the NVT ensemble using the Nosé–Hoover thermostat:
# h2_md_kreg.inp
MD # 1. requests molecular dynamics
initConditions=user-defined # 2. use user-defined initial conditions
initXYZ=h2_md_kreg_init.xyz # 3. file with initial geometry; Unit: Angstrom
initVXYZ=h2_md_kreg_init.vxyz # 4. file with initial velocity; Unit: Angstrom/fs
dt=0.3 # 5. time step; Unit: fs
trun=30 # 6. total time; Unit: fs
thermostat=Nose-Hoover # 7. use Nose-Hoover thermostat
ensemble=NVT # 8. NVT ensemble
temperature=300 # 9. Run MD at 300 Kelvin
MLmodelType=KREG # 10. KREG model is used
MLprog=MLatomF # 11. use KREG implemented in the Fortran part of MLatom
MLmodelIn=h2_kreg_energies.unf # 12. file with the trained model
The initial coordinates are defiend in the XYZ format (Angstrom) as:
2
H 0.0 0.0 1.0
H 0.0 0.0 0.0
The initial velocities are also defined in a simplified XYZ format (Angstrom/fs):
2
0.0 0.0 -0.05
0.0 0.0 0.05
Run the MD simulation.
mlatom h2_md_kreg.inp &> h2_md_kreg.out
After the end of the MD, the trajectory will be saved in the files traj.xyz
with the XYZ coordinates, traj.vxyz
with XYZ velocities, traj.ekin
, traj.epot
, and traj.etot
with kinetic, potential, and totla energies (Hartree), respectively, and traj.grad
with energy gradients (Hartree/Angstrom).
Note
In MD via input file or command line, translational and angular momenta are removed by default. The user cannot keep angular momenta or change the degrees of freedom as in Python API. The linearity of the molecule is checked automatically to ensure correct degrees of freedom.
Apart from using user-defined initial conditions shown above, the user can also use random velocities via input file. initTemperature
is used to control the kinetic energy of the molecule (Ekin=NDOF*kB*Tinit/2).
MD # 1. requests molecular dynamics
initConditions=random # 2. generate random velocities
initXYZ=h2_md_kreg_init.xyz # 3. file with initial geometry; Unit: Angstrom
initTemperature=300 # 4. initial temperature
dt=0.3 # 5. time step; Unit: fs
trun=30 # 6. total time; Unit: fs
thermostat=Nose-Hoover # 7. use Nose-Hoover thermostat
ensemble=NVT # 8. NVT ensemble
temperature=300 # 9. Run MD at 300 Kelvin
MLmodelType=KREG # 10. KREG model is used
MLprog=MLatomF # 11. use KREG implemented in the Fortran part of MLatom
MLmodelIn=h2_kreg_energies.unf # 12. file with the trained model
MLatom currenly supports NVE and NVT ensembles which can be requested with the ensemble=NVE
and ensemble=NVT
keywords, respectively. In the case of the NVT ensemble, the user can choose between the Andersen thermostat (thermostat=Andersen
) and Nosé–Hoover (thermostat=Nose-Hoover
).
The user can benefit from MLatom’s many choices of the methods or models to be used for MD.
Above examples are shown for the KREG model trained by the user and defined with the following keywords:
...
MLmodelType=KREG # 10. KREG model is used
MLprog=MLatomF # 11. use KREG implemented in the Fortran part of MLatom
MLmodelIn=h2_kreg_energies.unf # 12. file with the trained model
See manuals and tutorials on how to train such models, e.g., for MACE.
The above lines can be replaced with some of the automatically recognized methods such as ANI-1ccx or AIQM1, e.g., simply as:
...
ANI-1ccx # or AIQM1
The general way to define the QM method is to use method
and the corresponding QM program providing its implementations (MLatom uses interfaces to such programs). For example, ab initio (e.g., HF or MP2) or DFT (e.g., B3LYP/6-31G*) can be requested with the input:
...
method=B3LYP/6-31G* # 2. request running DFT optimization with B3LYP
qmprog=PySCF # 3. request using PySCF for B3LYP calculations; qmprog=Gaussian can be also used
Note
In principle, our default recomendation is to use AIQM1 as it is usually the most accurate method for affordable cost. However, it is currently only applicable to the compounds with CHNO elements. Another limitation is that to unlock its full potential, we recommend to install the MNDO program for its QM part. The XACS cloud uses Sparrow instead, which currently only provides numerical gradients limiting its applicability for MD. ANI-1ccx is hence a good alternative on the XACS cloud if it can be applied (it has additional limitations to neutral closed-shell compounds in ground state).
Running molecular dynamics in PyAPI consists of 4 steps: generating initial conditions, initializing model, initializing thermostat and running dynamics. Here we use a pre-trained KREG model to run MD simulation for \(\text{H}_2\).
MLatom provides a function ml.generate_initial_conditions
to generate initial geometries and velocities for MD, which returns a ml.data.molecular_database
class with the ml.data.molecule
objects containing these properties. Example below uses initial conditions provided by the user. They can be initial conditions from previous calculations or generated by other softwares.
import mlatom as ml
# Use user-defined initial conditions
mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
init_cond_db = ml.generate_initial_conditions(molecule=mol,
generation_method='user-defined',
file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz',
file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz')
init_mol = init_cond_db[0]
The user can also generate random velocities with MLatom via generation_method='random'
. The initial_temperature
(in Kelvin) can be used to control the kinetic energy of the molecule. The kinetic energy is calculated by (NDOF*KB*Tinit/2), where NDOF is the degrees of freedom of the molecule, KB is the Boltzmann constant and Tinit is the initial temperature set by the user. You can also use initial_kinetic_energy
argument to directly set the initial kinetic energy in Hartree.
import mlatom as ml
# Generate random velocities given geometry
mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
init_cond_db = ml.generate_initial_conditions(molecule=mol,
generation_method='random',
initial_temperature=300, # Set initial temperature in K
#initial_kinetic_energy=4.75E-4, # Or set initial kinetic energy in Hartree
number_of_initial_conditions=1)
Note
In MLatom, if generation_method='random'
is used, translational and angular momenta (velocities) are removed by default NDOF=3Natoms−6 for non-linear molecules). If you want to keep angular momentum, you can use eliminate_angular_momentum=False
(NDOF=3Natoms−3). By default, for diatomic molecules, (NDOF=3Natoms−5), while for other linear molecules (Natoms>2), NDOF=3Natoms−6 (but MLatom will check the linearity of the molecule and print a warning message). You can use degrees_of_freedom
argument to override the default setting of degrees of freedom in MLatom.
A model or method should be provided to calculate forces for MD simulation. Here, a pre-trained ML model is provided h2_kreg_energies.unf
but other choices are possible, e.g., see the commented out lines:
# Initializing model
model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF')
# or another ML model, e.g., MACE:
#mymodel = ml.models.mace(model_file='mace.pt')
# or choose one of the predifined (automatically recognized) methods
#model = ml.models.methods(method='ANI-1ccx')
# or QM method, e.g., B3LYP with Gaussian
#model = ml.models.methods(method='B3LYP/6-31G*', program='Gaussian')
A thermostat should be specified unless NVE ensemble (micro-canonical ensemble, the system has constant particle N, constant volume V and constant energy E) is used. Here, the Nosé–Hoover thermostat is initialized in order to run MD in NVT ensemble (canonical ensemble, the system has constant particle N, constant volume V and temperature fluctuating around T).
# Initializing thermostat
nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)
Note
Degrees of freedom in Nosé–Hoover thermostat is basically the same as that in initial conditions except for linear molecules. The default degrees of freedom for all linear molecules are \(3N_\text{atoms}-5\). You can still use degrees_of_freedom
argument to override the default settings in MLatom.
Advanced settings for the Nosé–Hoover thermostat such as Nosé–Hoover chain length (nose_hoover_chain_length
), multiple time step (multiple_time_step
), the number of the Yoshida–Suzuki steps (number_of_yoshida_suzuki_steps
) and the Nosé–Hoover chain frequency (nose_hoover_chain_frequency
) can also be changed. Please refer to the manual for more information.
Andersen thermostat is also available in MLatom.
andersen = ml.md.Andersen_thermostat(temperature=300)
Note
The random collision in Andersen thermostat is applied to all 3Natoms−5 degrees of freedom.
You can use gamma
to modify the collision rate (in fs−1) in Andersen thermostat, which is 0.2 by default.
The last step is running molecular dynamics. The maximum propagation time (in fs) and time step (in fs) should be provided. Here, the maximum propagation time is 30 fs and the time step is 0.3 fs.
# Run molecular dynamics
dyn = ml.md(model=model,
molecule_with_initial_conditions=init_mol,
thermostat=nose_hoover,
ensemble='NVT',
time_step=0.3,
maximum_propagation_time=30.0)
The trajectory is saved as an attribute of ml.md
, which can be dumped into plain text files and h5md format file.
# Dump trajectory
traj = dyn.molecular_trajectory
traj.dump(filename='traj', format='plain_text')
traj.dump(filename='traj.h5', format='h5md')
After combining the 4 steps shown above, the code for the entire workflow from setting up the initial conditions to obtaining the dynamics trajectory files look like this:
import mlatom as ml
# Use user-defined initial conditions
mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
init_cond_db = ml.generate_initial_conditions(molecule=mol,
generation_method='user-defined',
file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz',
file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz')
init_mol = init_cond_db[0]
# Initializing model
model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF')
# Initializing thermostat
nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)
# Run molecular dynamics
dyn = ml.md(model=model,
molecule_with_initial_conditions=init_mol,
thermostat=nose_hoover,
ensemble='NVT',
time_step=0.3,
maximum_propagation_time=30.0)
# Dump trajectory
traj = dyn.molecular_trajectory
traj.dump(filename='traj', format='plain_text')
traj.dump(filename='traj.h5', format='h5md')
print(f"Number of steps in the trajectory: {len(traj.steps)}")
You will find XYZ coordinates in traj.xyz
, XYZ velocities in traj.vxyz
, kinetic energies in traj.ekin
, potential energies in traj.epot
, total energies in traj.etot
and energy gradients in traj.grad
. There are 101 steps in the trajectory (including the initial condition) and the geometry of the last step is:
2
H 0.0000000000000 0.0000000000000 0.8764323118216
H 0.0000000000000 0.0000000000000 0.1235676881784
You can define a stop function in MD so that it will stop when some conditions are met (only available via Python API). For example, we show below how to stop MD when the bond length is shorter than 0.9 Angstrom.
import mlatom as ml
# Use user-defined initial conditions
mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
init_cond_db = ml.generate_initial_conditions(molecule=mol,
generation_method='user-defined',
file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz',
file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz')
init_mol = init_cond_db[0]
# Initializing model
model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF')
# Initializing thermostat
nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)
# User-defined stop function
def stop_function(molecule):
dist_matrix = molecule.get_internuclear_distance_matrix()
# Get bond length
bond_length = dist_matrix[0,1]
if bond_length < 0.9:
return True
else:
return False
# Run molecular dynamics
dyn = ml.md(model=model,
molecule_with_initial_conditions=init_mol,
thermostat=nose_hoover,
ensemble='NVT',
time_step=0.3,
maximum_propagation_time=30.0,
stop_function=stop_function,
stop_function_kwargs={})
# Dump trajectory
traj = dyn.molecular_trajectory
traj.dump(filename='traj_stop_function', format='plain_text')
traj.dump(filename='traj_stop_function.h5', format='h5md')
print(f"Number of steps in the trajectory: {len(traj.steps)}")
There are 4 steps in the trajectory and the geometry of the last step is:
2
H 0.0000000000000 0.0000000000000 0.9417784217628
H 0.0000000000000 0.0000000000000 0.0582215782372
Note
In every step, the molecule will be passed to the stop function to check whether the molecular dynamics needs to stop. The first argument in the user-defined stop function has to take in a molecule
object.
If you have further questions, criticism, and suggestions, we would be happy to receive them in English or Chinese via email, Slack (preferred), or WeChat (please send an email to request to add you to the XACS user support group).