.. _tutorial_md:

Molecular dynamics
=====================
This tutorial shows how to run molecular dynamics with MLatom.

We start with the  example of running dynamics via input file/command line and later give a tutorial on how to use more versatile :ref:`Python API <md_via_pyapi>`. For the tutorial, you can download the files with the initial XYZ coordinates :download:`h2_md_kreg_init.xyz <files/h2_md_kreg_init.xyz>`, initial XYZ velocities :download:`h2_md_kreg_init.vxyz <files/h2_md_kreg_init.vxyz>` and model file :download:`h2_kreg_energies.unf <files/h2_kreg_energies.unf>`.

The input file :download:`h2_md_kreg.inp <files/h2_md_kreg.inp>` shows how to run dynamics for the hydrogen molecule in the NVT ensemble using the Nosé--Hoover thermostat:

.. code-block:: 

    # h2_md_kreg.inp
    MD                                # 1. requests molecular dynamics
    initConditions=user-defined       # 2. use user-defined initial conditions
    initXYZ=h2_md_kreg_init.xyz       # 3. file with initial geometry; Unit: Angstrom
    initVXYZ=h2_md_kreg_init.vxyz     # 4. file with initial velocity; Unit: Angstrom/fs
    dt=0.3                            # 5. time step; Unit: fs
    trun=30                           # 6. total time; Unit: fs
    thermostat=Nose-Hoover            # 7. use Nose-Hoover thermostat
    ensemble=NVT                      # 8. NVT ensemble
    temperature=300                   # 9. Run MD at 300 Kelvin
    MLmodelType=KREG                  # 10. KREG model is used
    MLprog=MLatomF                    # 11. use KREG implemented in the Fortran part of MLatom
    MLmodelIn=h2_kreg_energies.unf    # 12. file with the trained model

The initial coordinates are defined in the XYZ format (Angstrom) as:

.. code-block:: 

    2

    H 0.0 0.0 1.0
    H 0.0 0.0 0.0


The initial velocities are also defined in a simplified XYZ format (Angstrom/fs):

.. code-block::

    2

    0.0 0.0 -0.05
    0.0 0.0  0.05

Run the MD simulation.

.. code-block:: 

    mlatom h2_md_kreg.inp &> h2_md_kreg.out 

After the end of the MD, the trajectory will be saved in the files ``traj.xyz`` with the XYZ coordinates, ``traj.vxyz`` with XYZ velocities, ``traj.ekin``, ``traj.epot``, and ``traj.etot`` with kinetic, potential, and totla energies (Hartree), respectively, and ``traj.grad`` with energy gradients (Hartree/Angstrom).

.. note::

    In MD via input file or command line, translational and angular momenta are removed by default. The user cannot keep angular momenta or change the degrees of freedom as in Python API. The linearity of the molecule is checked automatically to ensure correct degrees of freedom.

Initial conditions 
++++++++++++++++++

Apart from using user-defined initial conditions shown above, the user can also use random velocities via input file. ``initTemperature`` is used to control the kinetic energy of the molecule (:math:`E_\text{kin}=N_\text{DOF}k_\text{B}T_\text{init}/2`).

.. code-block::

    MD                                # 1. requests molecular dynamics
    initConditions=random             # 2. generate random velocities
    initXYZ=h2_md_kreg_init.xyz       # 3. file with initial geometry; Unit: Angstrom
    initTemperature=300               # 4. initial temperature 
    dt=0.3                            # 5. time step; Unit: fs
    trun=30                           # 6. total time; Unit: fs
    thermostat=Nose-Hoover            # 7. use Nose-Hoover thermostat
    ensemble=NVT                      # 8. NVT ensemble
    temperature=300                   # 9. Run MD at 300 Kelvin
    MLmodelType=KREG                  # 10. KREG model is used
    MLprog=MLatomF                    # 11. use KREG implemented in the Fortran part of MLatom
    MLmodelIn=h2_kreg_energies.unf    # 12. file with the trained model

The user can also use normal mode sampling to generate initial conditions. We provide two options: Wigner distribution (``initConditions=wigner``) and quantum harmonic Boltzmann distribution (``initConditions=quantum-harmonic-boltzmann``).

.. code-block:: 

    MD                                            # 1. requests molecular dynamics
    method=AIQM1                                  # 2. use AIQM1 
    initConditions=wigner                         # 3. use Wigner distribution
    #initConditions=harmonic-quantum-boltzmann    # 3. use harmonic quantum Boltzmann distribution 
    normalModeFile=ethanol_freq.json              # 4. molecule with frequencies and normal modes 
    initTemperature=300                           # 5. initial temperature 
    trun=30                                       # 6. total time; Unit: fs
    dt=0.3                                        # 7. time step; Unit: fs

``normalModeFile`` specifies the json file that saves the molecule object, which should contain information needed for normal mode sampling, i.e., frequencies and normal modes. This file is automatically generated when you do :ref:`frequency calculations <tutorial_freq>` (check ``freq[your molecule number].json`` file).

Choosing ensemble and thermostat
++++++++++++++++++++++++++++++++

MLatom currenly supports NVE and NVT ensembles which can be requested with the ``ensemble=NVE`` and ``ensemble=NVT`` keywords, respectively. In the case of the NVT ensemble, the user can choose between the Andersen thermostat (``thermostat=Andersen``) and Nosé--Hoover (``thermostat=Nose-Hoover``).

Choosing the method or model
----------------------------

The user can benefit from MLatom's many choices of the methods or models to be used for MD. 

Above examples are shown for the KREG model trained by the user and defined with the following keywords:


.. code-block::

    ...
    MLmodelType=KREG                  # 10. KREG model is used
    MLprog=MLatomF                    # 11. use KREG implemented in the Fortran part of MLatom
    MLmodelIn=h2_kreg_energies.unf    # 12. file with the trained model

See manuals and tutorials on how to train such models, e.g., `for MACE <http://MLatom.com/mace>`_.

The above lines can be replaced with some of the automatically recognized methods such as ANI-1ccx or AIQM1, e.g., simply as:


.. code-block::

    ...
    ANI-1ccx # or AIQM1

The general way to define the QM method is to use ``method`` and the corresponding QM program providing its implementations (MLatom uses interfaces to such programs). For example, ab initio (e.g., HF or MP2) or DFT (e.g., B3LYP/6-31G*) can be requested with the input:

.. code-block:: 

    ...
    method=B3LYP/6-31G*     # 2. request running DFT optimization with B3LYP
    qmprog=PySCF            # 3. request using PySCF for B3LYP calculations; qmprog=Gaussian can be also used

.. note::
    In principle, our default recomendation is to use AIQM1 as it is usually the most accurate method for affordable cost. However, it is currently only applicable to the compounds with CHNO elements. Another limitation is that to unlock its full potential, we recommend to install the MNDO program for its QM part. The XACS cloud uses Sparrow instead, which currently only provides numerical gradients limiting its applicability for MD. ANI-1ccx is hence a good alternative on the XACS cloud if it can be applied (it has additional limitations to neutral closed-shell compounds in ground state).

.. _md_via_pyapi:

Using PyAPI 
-------------

Running molecular dynamics in PyAPI consists of 4 steps: :ref:`generating initial conditions <initcond>`, :ref:`initializing model <Model>`, :ref:`initializing thermostat <Thermostat>` and :ref:`running dynamics <Dynamics>`. Here we use a pre-trained KREG model to run MD simulation for :math:`\text{H}_2`.

.. _initcond:

Initial conditions
+++++++++++++++++++

MLatom provides a function ``ml.generate_initial_conditions`` to generate initial geometries and velocities for MD, which returns a :class:`ml.data.molecular_database` class with the :class:`ml.data.molecule` objects containing these properties. Example below uses initial conditions provided by the user. They can be initial conditions from previous calculations or generated by other softwares.


.. code-block:: python 

    import mlatom as ml 
    # Use user-defined initial conditions
    mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
    init_cond_db = ml.generate_initial_conditions(molecule=mol,
                                                  generation_method='user-defined',
                                                  file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz', 
                                                  file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz')
    init_mol = init_cond_db[0]

The user can also generate random velocities with MLatom via ``generation_method='random'``. The ``initial_temperature`` (in Kelvin) can be used to control the kinetic energy of the molecule. The kinetic energy is calculated by :math:`N_\text{DOF}k_\text{B}T_\text{init}/2`, where :math:`N_\text{DOF}` is the degrees of freedom of the molecule, :math:`k_\text{B}` is the Boltzmann constant and :math:`T_\text{init}` is the initial temperature set by the user. You can also use ``initial_kinetic_energy`` argument to directly set the initial kinetic energy in Hartree.

.. code-block:: python 

    import mlatom as ml 
    # Generate random velocities given geometry 
    mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz') 
    init_cond_db = ml.generate_initial_conditions(molecule=mol,
                                                  generation_method='random',
                                                  initial_temperature=300,           # Set initial temperature in K
                                                  #initial_kinetic_energy=4.75E-4,   # Or set initial kinetic energy in Hartree
                                                  number_of_initial_conditions=1)


.. note:: 

    In MLatom, if ``generation_method='random'`` is used, translational and angular momenta (velocities) are removed by default (:math:`N_\text{DOF}=3N_\text{atoms}-6` for non-linear molecules). If you want to keep angular momentum, you can use ``eliminate_angular_momentum=False`` (:math:`N_\text{DOF}=3N_\text{atoms}-3`). By default, for diatomic molecules, :math:`N_\text{DOF}=3N_\text{atoms}-5`, while for other linear molecules (:math:`N_\text{atoms}>2`), :math:`N_\text{DOF}=3N_\text{atoms}-6` (but MLatom will check the linearity of the molecule and print a warning message). You can use ``degrees_of_freedom`` argument to override the default setting of degrees of freedom in MLatom.

Apart from random velocities, MLatom provides two normal mode sampling methods as well which are required for :ref:`quasi-classical MD <tutorial_qct>`. One is based on the non-sharp Wigner distribution as implemented in Newton-X (see Eq. 20 in `the software's paper <https://doi.org/10.1021/acs.jctc.2c00804>`_). The codes are modified and rewritten in Python. The other one is based on the harmonic quantum Boltzmann distribution as described in `this paper <https://doi.org/10.1021/jp973273y>`_. Below we show how to use normal mode sampling and the molecule with frequencies and normal modes can be downloaded here :download:`ethanol_freq.json <files/ethanol_freq.json>`.

.. code-block:: python 

    import mlatom as ml 
    eqmol = ml.data.molecule()
    eqmol.load('ethanol_freq.json',format='json')
    # Sample in Wigner distribution
    init_cond_db = ml.generate_initial_conditions(molecule=eqmol,
                                                  generation_method='wigner',
                                                  initial_temperature=300,    # Set initial temperature in K
                                                  number_of_initial_conditions=1)

    # Sample in harmonic quantum Boltzmann distribution
    init_cond_db = ml.generate_initial_conditions(molecule=eqmol,
                                                  generation_method='harmonic-quantum-boltzmann',
                                                  initial_temperature=300,    # Set initial temperature in K
                                                  number_of_initial_conditions=1)

.. note:: 

    Normal mode sampling needs frequencies and normal modes. Make sure that the molecule object has these properties (basically, just run :ref:`frequency calculations <tutorial_freq>`). For transition states, we recommend using harmonic quantum Boltzmann distribution (make sure your frequency calculations produced only one imaginary frequency). The reaction coordinate is treated seperately.

.. _Model:

Method or model 
+++++++++++++++

A model or method should be provided to calculate forces for MD simulation. Here, a pre-trained ML model is provided :download:`h2_kreg_energies.unf <files/h2_kreg_energies.unf>` but other choices are possible, e.g., see the commented out lines:

.. code-block:: python 

    # Initializing model
    model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF')
    # or another ML model, e.g., MACE:
    #mymodel = ml.models.mace(model_file='mace.pt')
    # or choose one of the predifined (automatically recognized) methods
    #model = ml.models.methods(method='ANI-1ccx')
    # or QM method, e.g., B3LYP with Gaussian
    #model = ml.models.methods(method='B3LYP/6-31G*', program='Gaussian')

.. _Thermostat:

Thermostat 
+++++++++++

A thermostat should be specified unless NVE ensemble (micro-canonical ensemble, the system has constant particle N, constant volume V and constant energy E) is used. Here, the Nosé--Hoover thermostat is initialized in order to run MD in NVT ensemble (canonical ensemble, the system has constant particle N, constant volume V and temperature fluctuating around T).

.. code-block:: python 

    # Initializing thermostat
    nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)

.. note:: 

    Degrees of freedom in Nosé--Hoover thermostat is basically the same as that in initial conditions except for linear molecules. The default degrees of freedom for all linear molecules are :math:`3N_\text{atoms}-5`. You can still use ``degrees_of_freedom`` argument to override the default settings in MLatom.

Advanced settings for the Nosé--Hoover thermostat such as Nosé--Hoover chain length (``nose_hoover_chain_length``), multiple time step (``multiple_time_step``), the number of the Yoshida--Suzuki steps (``number_of_yoshida_suzuki_steps``) and the Nosé--Hoover chain frequency (``nose_hoover_chain_frequency``) can also be changed. Please refer to the manual for more information.

Andersen thermostat is also available in MLatom.

.. code-block:: python 

    andersen = ml.md.Andersen_thermostat(temperature=300)

.. note::

    The random collision in Andersen thermostat is applied to all :math:`3N_\text{atoms}` degrees of freedom.

You can use ``gamma`` to modify the collision rate (in :math:`\text{fs}^{-1}`) in Andersen thermostat, which is 0.2 by default.

.. _Dynamics:

Dynamics 
+++++++++

The last step is running molecular dynamics. The maximum propagation time (in fs) and time step (in fs) should be provided. Here, the maximum propagation time is 30 fs and the time step is 0.3 fs.

.. code-block:: python 

    # Run molecular dynamics 
    dyn = ml.md(model=model,
                molecule_with_initial_conditions=init_mol,
                thermostat=nose_hoover,
                ensemble='NVT',
                time_step=0.3,
                maximum_propagation_time=30.0,
                dump_trajectory_interval=1 # e.g., you can dump every time step (by default nothing is dumped!)
                filename='mytraj.h5'       # file where you want to save the traj
                format='h5md'              # format of dumped traj (supported formats: plain_text, h5md, json)
                )

The trajectory is saved as an attribute of :class:`ml.md`, which can be dumped into plain text files and h5md format file.

.. code-block:: python 

    # Dump trajectory 
    traj = dyn.molecular_trajectory 
    traj.dump(filename='traj', format='plain_text')
    traj.dump(filename='traj.h5', format='h5md')

Bringing it all together 
+++++++++++++++++++++++++++

After combining the 4 steps shown above, the code for the entire workflow from setting up the initial conditions to obtaining the dynamics trajectory files look like this:

.. code-block:: python 

    import mlatom as ml 
    # Use user-defined initial conditions
    mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
    init_cond_db = ml.generate_initial_conditions(molecule=mol,
                                                    generation_method='user-defined',
                                                    file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz', 
                                                    file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz')
    init_mol = init_cond_db[0]

    # Initializing model
    model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF')

    # Initializing thermostat
    nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)

    # Run molecular dynamics 
    dyn = ml.md(model=model,
                molecule_with_initial_conditions=init_mol,
                thermostat=nose_hoover,
                ensemble='NVT',
                time_step=0.3,
                maximum_propagation_time=30.0)

    # Dump trajectory 
    traj = dyn.molecular_trajectory 
    traj.dump(filename='traj', format='plain_text')
    traj.dump(filename='traj.h5', format='h5md')

    print(f"Number of steps in the trajectory: {len(traj.steps)}")

You will find XYZ coordinates in ``traj.xyz``, XYZ velocities in ``traj.vxyz``, kinetic energies in ``traj.ekin``, potential energies in ``traj.epot``, total energies in ``traj.etot`` and energy gradients in ``traj.grad``. There are 101 steps in the trajectory (including the initial condition) and the geometry of the last step is: 

.. code-block:: 

    2

    H             0.0000000000000           0.0000000000000           0.8764323118216
    H             0.0000000000000           0.0000000000000           0.1235676881784

Stop function 
+++++++++++++

You can define a stop function in MD so that it will stop when some conditions are met (only available via Python API). For example, we show below how to stop MD when the bond length is shorter than 0.9 Angstrom.

.. code-block:: python 

    import mlatom as ml 
    # Use user-defined initial conditions
    mol = ml.data.molecule.from_xyz_file('h2_md_kreg_init.xyz')
    init_cond_db = ml.generate_initial_conditions(molecule=mol,
                                                    generation_method='user-defined',
                                                    file_with_initial_xyz_coordinates='h2_md_kreg_init.xyz', 
                                                    file_with_initial_xyz_velocities='h2_md_kreg_init.vxyz')
    init_mol = init_cond_db[0]

    # Initializing model
    model = ml.models.kreg(model_file='h2_kreg_energies.unf',ml_program='MLatomF')

    # Initializing thermostat
    nose_hoover = ml.md.Nose_Hoover_thermostat(temperature=300, molecule=init_mol)

    # User-defined stop function 
    def stop_function(molecule):
        dist_matrix = molecule.get_internuclear_distance_matrix() 
        # Get bond length
        bond_length = dist_matrix[0,1]
        if bond_length < 0.9:
            return True 
        else: 
            return False

    # Run molecular dynamics 
    dyn = ml.md(model=model,
                molecule_with_initial_conditions=init_mol,
                thermostat=nose_hoover,
                ensemble='NVT',
                time_step=0.3,
                maximum_propagation_time=30.0,
                stop_function=stop_function,
                stop_function_kwargs={})

    # Dump trajectory 
    traj = dyn.molecular_trajectory 
    traj.dump(filename='traj_stop_function', format='plain_text')
    traj.dump(filename='traj_stop_function.h5', format='h5md')

    print(f"Number of steps in the trajectory: {len(traj.steps)}")

There are 4 steps in the trajectory and the geometry of the last step is:

.. code-block:: 

    2

    H             0.0000000000000           0.0000000000000           0.9417784217628
    H             0.0000000000000           0.0000000000000           0.0582215782372

.. note:: 

    In every step, the molecule will be passed to the stop function to check whether the molecular dynamics needs to stop. The first argument in the user-defined stop function has to take in a :class:`molecule` object.

Any questions or suggestions?
-----------------------------

If you have further questions, criticism, and suggestions, we would be happy to receive them in English or Chinese via `email <xacs@xmu.edu.cn>`_, `Slack <https://join.slack.com/t/xacs-support/shared_invite/zt-1gm1lpn68-pReQhfYGu813eCqwmvdGvA>`_ (preferred), or WeChat (please send an email to request to add you to the XACS user support group).