.. _tutorial_md: Molecular dynamics ===================== This tutorial shows how to run molecular dynamics with MLatom. We start with the example of running dynamics via input file/command line and later give a tutorial on how to use more versatile :ref:`Python API `. For the tutorial, you can download the files with the initial XYZ coordinates :download:`h2_md_kreg_init.xyz `, initial XYZ velocities :download:`h2_md_kreg_init.vxyz ` and model file :download:`h2_kreg_energies.unf `. The input file :download:`h2_md_kreg.inp ` shows how to run dynamics for the hydrogen molecule in the NVT ensemble using the Nosé--Hoover thermostat: .. code-block:: # h2_md_kreg.inp MD # 1. requests molecular dynamics initConditions=user-defined # 2. use user-defined initial conditions initXYZ=h2_md_kreg_init.xyz # 3. file with initial geometry; Unit: Angstrom initVXYZ=h2_md_kreg_init.vxyz # 4. file with initial velocity; Unit: Angstrom/fs dt=0.3 # 5. time step; Unit: fs trun=30 # 6. total time; Unit: fs thermostat=Nose-Hoover # 7. use Nose-Hoover thermostat ensemble=NVT # 8. NVT ensemble temperature=300 # 9. Run MD at 300 Kelvin MLmodelType=KREG # 10. KREG model is used MLprog=MLatomF # 11. use KREG implemented in the Fortran part of MLatom MLmodelIn=h2_kreg_energies.unf # 12. file with the trained model The initial coordinates are defined in the XYZ format (Angstrom) as: .. code-block:: 2 H 0.0 0.0 1.0 H 0.0 0.0 0.0 The initial velocities are also defined in a simplified XYZ format (Angstrom/fs): .. code-block:: 2 0.0 0.0 -0.05 0.0 0.0 0.05 Run the MD simulation. .. code-block:: mlatom h2_md_kreg.inp &> h2_md_kreg.out After the end of the MD, the trajectory will be saved in the files ``traj.xyz`` with the XYZ coordinates, ``traj.vxyz`` with XYZ velocities, ``traj.ekin``, ``traj.epot``, and ``traj.etot`` with kinetic, potential, and totla energies (Hartree), respectively, and ``traj.grad`` with energy gradients (Hartree/Angstrom). .. note:: In MD via input file or command line, translational and angular momenta are removed by default. The user cannot keep angular momenta or change the degrees of freedom as in Python API. The linearity of the molecule is checked automatically to ensure correct degrees of freedom. Initial conditions ++++++++++++++++++ Apart from using user-defined initial conditions shown above, the user can also use random velocities via input file. ``initTemperature`` is used to control the kinetic energy of the molecule (:math:`E_\text{kin}=N_\text{DOF}k_\text{B}T_\text{init}/2`). .. code-block:: MD # 1. requests molecular dynamics initConditions=random # 2. generate random velocities initXYZ=h2_md_kreg_init.xyz # 3. file with initial geometry; Unit: Angstrom initTemperature=300 # 4. initial temperature dt=0.3 # 5. time step; Unit: fs trun=30 # 6. total time; Unit: fs thermostat=Nose-Hoover # 7. use Nose-Hoover thermostat ensemble=NVT # 8. NVT ensemble temperature=300 # 9. Run MD at 300 Kelvin MLmodelType=KREG # 10. KREG model is used MLprog=MLatomF # 11. use KREG implemented in the Fortran part of MLatom MLmodelIn=h2_kreg_energies.unf # 12. file with the trained model The user can also use normal mode sampling to generate initial conditions. We provide two options: Wigner distribution (``initConditions=wigner``) and quantum harmonic Boltzmann distribution (``initConditions=quantum-harmonic-boltzmann``). .. code-block:: MD # 1. requests molecular dynamics method=AIQM1 # 2. use AIQM1 initConditions=wigner # 3. use Wigner distribution #initConditions=harmonic-quantum-boltzmann # 3. use harmonic quantum Boltzmann distribution normalModeFile=ethanol_freq.json # 4. molecule with frequencies and normal modes initTemperature=300 # 5. initial temperature trun=30 # 6. total time; Unit: fs dt=0.3 # 7. time step; Unit: fs ``normalModeFile`` specifies the json file that saves the molecule object, which should contain information needed for normal mode sampling, i.e., frequencies and normal modes. This file is automatically generated when you do :ref:`frequency calculations ` (check ``freq[your molecule number].json`` file). Choosing ensemble and thermostat ++++++++++++++++++++++++++++++++ MLatom currenly supports NVE and NVT ensembles which can be requested with the ``ensemble=NVE`` and ``ensemble=NVT`` keywords, respectively. In the case of the NVT ensemble, the user can choose between the Andersen thermostat (``thermostat=Andersen``) and Nosé--Hoover (``thermostat=Nose-Hoover``). Choosing the method or model ---------------------------- The user can benefit from MLatom's many choices of the methods or models to be used for MD. Above examples are shown for the KREG model trained by the user and defined with the following keywords: .. code-block:: ... MLmodelType=KREG # 10. KREG model is used MLprog=MLatomF # 11. use KREG implemented in the Fortran part of MLatom MLmodelIn=h2_kreg_energies.unf # 12. file with the trained model See manuals and tutorials on how to train such models, e.g., `for MACE `_. The above lines can be replaced with some of the automatically recognized methods such as ANI-1ccx or AIQM1, e.g., simply as: .. code-block:: ... ANI-1ccx # or AIQM1 The general way to define the QM method is to use ``method`` and the corresponding QM program providing its implementations (MLatom uses interfaces to such programs). For example, ab initio (e.g., HF or MP2) or DFT (e.g., B3LYP/6-31G*) can be requested with the input: .. code-block:: ... method=B3LYP/6-31G* # 2. request running DFT optimization with B3LYP qmprog=PySCF # 3. request using PySCF for B3LYP calculations; qmprog=Gaussian can be also used .. note:: In principle, our default recomendation is to use AIQM1 as it is usually the most accurate method for affordable cost. However, it is currently only applicable to the compounds with CHNO elements. Another limitation is that to unlock its full potential, we recommend to install the MNDO program for its QM part. The XACS cloud uses Sparrow instead, which currently only provides numerical gradients limiting its applicability for MD. ANI-1ccx is hence a good alternative on the XACS cloud if it can be applied (it has additional limitations to neutral closed-shell compounds in ground state). .. _md_via_pyapi: Using PyAPI ------------- Running molecular dynamics in PyAPI consists of 4 steps: :ref:`generating initial conditions `, :ref:`initializing model `, :ref:`initializing thermostat ` and :ref:`running dynamics `. Here we use a pre-trained KREG model to run MD simulation for :math:`\text{H}_2`. .. _initcond: Initial conditions +++++++++++++++++++ MLatom provides a function ``ml.generate_initial_conditions`` to generate initial geometries and velocities for MD, which returns a :class:`ml.data.molecular_database` class with the :class:`ml.data.molecule` objects containing these properties. Example below uses initial conditions provided by the user. They can be initial conditions from previous calculations or generated by other softwares. .. code-block:: python import mlatom as ml # Use user-defined initial conditions mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz') init_cond_db = ml.generate_initial_conditions(molecule=mol, generation_method='user-defined', file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz', file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz') init_mol = init_cond_db[0] The user can also generate random velocities with MLatom via ``generation_method='random'``. The ``initial_temperature`` (in Kelvin) can be used to control the kinetic energy of the molecule. The kinetic energy is calculated by :math:`N_\text{DOF}k_\text{B}T_\text{init}/2`, where :math:`N_\text{DOF}` is the degrees of freedom of the molecule, :math:`k_\text{B}` is the Boltzmann constant and :math:`T_\text{init}` is the initial temperature set by the user. You can also use ``initial_kinetic_energy`` argument to directly set the initial kinetic energy in Hartree. .. code-block:: python import mlatom as ml # Generate random velocities given geometry mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz') init_cond_db = ml.generate_initial_conditions(molecule=mol, generation_method='random', initial_temperature=300, # Set initial temperature in K #initial_kinetic_energy=4.75E-4, # Or set initial kinetic energy in Hartree number_of_initial_conditions=1) .. note:: In MLatom, if ``generation_method='random'`` is used, translational and angular momenta (velocities) are removed by default (:math:`N_\text{DOF}=3N_\text{atoms}-6` for non-linear molecules). If you want to keep angular momentum, you can use ``eliminate_angular_momentum=False`` (:math:`N_\text{DOF}=3N_\text{atoms}-3`). By default, for diatomic molecules, :math:`N_\text{DOF}=3N_\text{atoms}-5`, while for other linear molecules (:math:`N_\text{atoms}>2`), :math:`N_\text{DOF}=3N_\text{atoms}-6` (but MLatom will check the linearity of the molecule and print a warning message). You can use ``degrees_of_freedom`` argument to override the default setting of degrees of freedom in MLatom. Apart from random velocities, MLatom provides two normal mode sampling methods as well which are required for :ref:`quasi-classical MD `. One is based on the non-sharp Wigner distribution as implemented in Newton-X (see Eq. 20 in `the software's paper `_). The codes are modified and rewritten in Python. The other one is based on the harmonic quantum Boltzmann distribution as described in `this paper `_. Below we show how to use normal mode sampling and the molecule with frequencies and normal modes can be downloaded here :download:`ethanol_freq.json `. .. code-block:: python import mlatom as ml eqmol = ml.data.molecule() eqmol.load('ethanol_freq.json',format='json') # Sample in Wigner distribution init_cond_db = ml.generate_initial_conditions(molecule=eqmol, generation_method='wigner', initial_temperature=300, # Set initial temperature in K number_of_initial_conditions=1) # Sample in harmonic quantum Boltzmann distribution init_cond_db = ml.generate_initial_conditions(molecule=eqmol, generation_method='harmonic-quantum-boltzmann', initial_temperature=300, # Set initial temperature in K number_of_initial_conditions=1) .. note:: Normal mode sampling needs frequencies and normal modes. Make sure that the molecule object has these properties (basically, just run :ref:`frequency calculations `). For transition states, we recommend using harmonic quantum Boltzmann distribution (make sure your frequency calculations produced only one imaginary frequency). The reaction coordinate is treated seperately. .. _Model: Method or model +++++++++++++++ A model or method should be provided to calculate forces for MD simulation. Here, a pre-trained ML model is provided :download:`h2_kreg_energies.unf ` but other choices are possible, e.g., see the commented out lines: .. code-block:: python # Initializing model model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF') # or another ML model, e.g., MACE: #mymodel = ml.models.mace(model_file='mace.pt') # or choose one of the predifined (automatically recognized) methods #model = ml.models.methods(method='ANI-1ccx') # or QM method, e.g., B3LYP with Gaussian #model = ml.models.methods(method='B3LYP/6-31G*', program='Gaussian') .. _Thermostat: Thermostat +++++++++++ A thermostat should be specified unless NVE ensemble (micro-canonical ensemble, the system has constant particle N, constant volume V and constant energy E) is used. Here, the Nosé--Hoover thermostat is initialized in order to run MD in NVT ensemble (canonical ensemble, the system has constant particle N, constant volume V and temperature fluctuating around T). .. code-block:: python # Initializing thermostat nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol) .. note:: Degrees of freedom in Nosé--Hoover thermostat is basically the same as that in initial conditions except for linear molecules. The default degrees of freedom for all linear molecules are :math:`3N_\text{atoms}-5`. You can still use ``degrees_of_freedom`` argument to override the default settings in MLatom. Advanced settings for the Nosé--Hoover thermostat such as Nosé--Hoover chain length (``nose_hoover_chain_length``), multiple time step (``multiple_time_step``), the number of the Yoshida--Suzuki steps (``number_of_yoshida_suzuki_steps``) and the Nosé--Hoover chain frequency (``nose_hoover_chain_frequency``) can also be changed. Please refer to the manual for more information. Andersen thermostat is also available in MLatom. .. code-block:: python andersen = ml.md.Andersen_thermostat(temperature=300) .. note:: The random collision in Andersen thermostat is applied to all :math:`3N_\text{atoms}` degrees of freedom. You can use ``gamma`` to modify the collision rate (in :math:`\text{fs}^{-1}`) in Andersen thermostat, which is 0.2 by default. .. _Dynamics: Dynamics +++++++++ The last step is running molecular dynamics. The maximum propagation time (in fs) and time step (in fs) should be provided. Here, the maximum propagation time is 30 fs and the time step is 0.3 fs. .. code-block:: python # Run molecular dynamics dyn = ml.md(model=model, molecule_with_initial_conditions=init_mol, thermostat=nose_hoover, ensemble='NVT', time_step=0.3, maximum_propagation_time=30.0, dump_trajectory_interval=1 # e.g., you can dump every time step (by default nothing is dumped!) filename='mytraj.h5' # file where you want to save the traj format='h5md' # format of dumped traj (supported formats: plain_text, h5md, json) ) The trajectory is saved as an attribute of :class:`ml.md`, which can be dumped into plain text files and h5md format file. .. code-block:: python # Dump trajectory traj = dyn.molecular_trajectory traj.dump(filename='traj', format='plain_text') traj.dump(filename='traj.h5', format='h5md') Bringing it all together +++++++++++++++++++++++++++ After combining the 4 steps shown above, the code for the entire workflow from setting up the initial conditions to obtaining the dynamics trajectory files look like this: .. code-block:: python import mlatom as ml # Use user-defined initial conditions mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz') init_cond_db = ml.generate_initial_conditions(molecule=mol, generation_method='user-defined', file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz', file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz') init_mol = init_cond_db[0] # Initializing model model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF') # Initializing thermostat nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol) # Run molecular dynamics dyn = ml.md(model=model, molecule_with_initial_conditions=init_mol, thermostat=nose_hoover, ensemble='NVT', time_step=0.3, maximum_propagation_time=30.0) # Dump trajectory traj = dyn.molecular_trajectory traj.dump(filename='traj', format='plain_text') traj.dump(filename='traj.h5', format='h5md') print(f"Number of steps in the trajectory: {len(traj.steps)}") You will find XYZ coordinates in ``traj.xyz``, XYZ velocities in ``traj.vxyz``, kinetic energies in ``traj.ekin``, potential energies in ``traj.epot``, total energies in ``traj.etot`` and energy gradients in ``traj.grad``. There are 101 steps in the trajectory (including the initial condition) and the geometry of the last step is: .. code-block:: 2 H 0.0000000000000 0.0000000000000 0.8764323118216 H 0.0000000000000 0.0000000000000 0.1235676881784 Stop function +++++++++++++ You can define a stop function in MD so that it will stop when some conditions are met (only available via Python API). For example, we show below how to stop MD when the bond length is shorter than 0.9 Angstrom. .. code-block:: python import mlatom as ml # Use user-defined initial conditions mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz') init_cond_db = ml.generate_initial_conditions(molecule=mol, generation_method='user-defined', file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz', file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz') init_mol = init_cond_db[0] # Initializing model model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF') # Initializing thermostat nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol) # User-defined stop function def stop_function(molecule): dist_matrix = molecule.get_internuclear_distance_matrix() # Get bond length bond_length = dist_matrix[0,1] if bond_length < 0.9: return True else: return False # Run molecular dynamics dyn = ml.md(model=model, molecule_with_initial_conditions=init_mol, thermostat=nose_hoover, ensemble='NVT', time_step=0.3, maximum_propagation_time=30.0, stop_function=stop_function, stop_function_kwargs={}) # Dump trajectory traj = dyn.molecular_trajectory traj.dump(filename='traj_stop_function', format='plain_text') traj.dump(filename='traj_stop_function.h5', format='h5md') print(f"Number of steps in the trajectory: {len(traj.steps)}") There are 4 steps in the trajectory and the geometry of the last step is: .. code-block:: 2 H 0.0000000000000 0.0000000000000 0.9417784217628 H 0.0000000000000 0.0000000000000 0.0582215782372 .. note:: In every step, the molecule will be passed to the stop function to check whether the molecular dynamics needs to stop. The first argument in the user-defined stop function has to take in a :class:`molecule` object. Any questions or suggestions? ----------------------------- If you have further questions, criticism, and suggestions, we would be happy to receive them in English or Chinese via `email `_, `Slack `_ (preferred), or WeChat (please send an email to request to add you to the XACS user support group).