.. _tutorial_qct: Quasi-classical molecular dynamics ===================================== Quasi-classical molecular dynamics (also known as quasi-classical trajectories (QCT)) accounts for the zero-point energy (ZPE) in contrast to classical dynamics and is very popular in studying chemical reactions (see, e.g., works by Houk et al. in `PNAS 2012, 109, 12860 `__ and `J. Am. Chem. Soc. 2023, 145, 14446 `__). In this tutorial, we show how to perform such simulations on an :ref:`example ` reproducing the above PNAS paper. See also related :ref:`tutorial on MD `. The implementation details are given in the following work (please cite it alongside other required citations when using this feature): - Yi-Fan Hou, Lina Zhang, Quanhao Zhang, Fuchun Ge, Pavlo O. Dral. `Physics-informed active learning for accelerating quantum chemical simulations `__. *arXiv* **2024**, DOI: 10.48550/arXiv.2404.11811. Background ---------- The key to performing QCT is to make sure that the initial conditions (geometries and velocities) contain the required ZPE. MLatom supports two normal mode sampling methods of generating such conditions: - Wigner sampling (see Eq. 20 in `the corresponding paper `_), sampling only from the lowest vibrational level. It is also widely used in :ref:`surface-hopping dynamics ` and spectra generation - sampling from the harmonic quantum Boltzmann distribution (implemented following `this work `_, also called "thermal sampling" in the manual to its early implementation in the VENUS program). In contrast to Wigner sampling, it samples from higher vibrational levels. This sampling is commonly used for performing QCT investigation of transition states: for all the real normal modes that are perpendicular to the reaction coordinate, the quantum number :math:`n_{i}` of normal mode :math:`i` is first sampled from a harmonic quantum Boltzmann distribution: :math:`p(n_{i})=e^{- n_{i} h v_{i} / k_{\text{B}} T} (1 - e^{-h v_{i} / k_{\text{B}} T})` where :math:`h` is the Planck constant, :math:`v_{i}` is the vibrational frequency, :math:`k_{B}` is the Boltzmann constant and :math:`T` is the temperature. The coordinates :math:`Q_{i}` and momenta :math:`P_{i}` are then generated by :math:`Q_{i}=A_{i} \text{cos} (2 \pi R_{i})`, :math:`P_{i}=-\omega_{i}A_{i} \text{cos} (2 \pi R_{i})`, here, :math:`\omega_{i}=2 \pi v_{i}`, :math:`A_{i}= \sqrt{2E_{i}} / \omega_{i}` and :math:`R_{i}` is a uniform random number on :math:`[0,1]`. The reaction coordinate (corresponding to the normal mode with the imaginary frequency) is fixed while the mass-weighted momentum :math:`P` is calculated by :math:`\pm \sqrt{-2 k_{B} T \text{ln} (1-R)}`, where :math:`R` is also a uniform random number on :math:`[0,1]`. .. _tutorial_qct_da: Example -------------------------- Below we will show you how to run quasi-classical MD in MLatom and reproduce results in the `PNAS 2012, 109, 12860 `__ and `arXiv.2404.11811 `__ for the Diels-Alder reaction between 1,3-butadiene and ethene. The general procedure is: 1. optimize the geometry, in our case do TS optimization, 2. run :ref:`frequency calculations ` which would generate the molecule object file dumped in the JSON format (here we provide this file for you: :download:`da_eqmol.json `, generated for the TS), 3. use the normal modes in the molecule file (in our case ``da_eqmol.json``) to generate initial conditions, 4. run dynamics with these conditions, 5. analyze results. Here, we are going to calculate the average time to traverse the transition zone and the average time gap of C-C bond formation. See below a full version with Python API and a brief version with the input file (only showing how to do normal mode sampling and propagating trajectories). .. _tutorial_qct_cli: Example with input file +++++++++++++++++++++++++++++++++ Generating initial conditions and running a single trajectory can be done with the following input file example: .. code-block:: MD # 1. requests molecular dynamics method=UB3LYP/6-31G* qmprog=gaussian # 2. use DFT method (if you have Gaussian). You can also use, e.g., AIQM1 or ANI-1ccx, or B3LYP and qmprog=PySCF... initConditions=harmonic-quantum-boltzmann # 3. use harmonic quantum Boltzmann distribution #initConditions=wigner # 3. also possible - use Wigner sampling normalModeFile=da_eqmol.json # 4. molecule with frequencies and normal modes initTemperature=300 # 5. initial temperature trun=150 # 6. total time; Unit: fs dt=0.5 # 7. time step; Unit: fs You will get tons of output files for this trajectory (such as xyz coordinates which you can view with VMD, and also h5md trajectory file, etc). Above example will take you quite long time (30 min for one trajectory), but if you use ML it will be much faster. Here provide a pre-trained model for this reaction :download:`da_energy_ani.npz ` which would take you less than a minute for the same simulations using the following input file: .. code-block:: MD MLmodelType=ANI MLmodelIn=da_energy_ani.npz initConditions=harmonic-quantum-boltzmann normalModeFile=da_eqmol.json initTemperature=300 trun=150 dt=0.5 .. _tutorial_qct_api: Example with Python API +++++++++++++++++++++++++++++++++ A more complete example using Python API: .. code-block:: python import mlatom as ml import os import numpy as np # Load molecule with frequencies and normal modes eqmol = ml.data.molecule() eqmol.load('da_eqmol.json',format='json') # Generate initial conditions if not os.path.exists('init_cond_db.json'): init_cond_db = ml.generate_initial_conditions(molecule=eqmol, generation_method='harmonic-quantum-boltzmann', initial_temperature=298, number_of_initial_conditions=128) init_cond_db.dump('init_cond_db.json',format='json') else: init_cond_db = ml.data.molecular_database.load('init_cond_db.json',format='json') # Reverse the velocities to get the backward trajectories re_init_cond_db = init_cond_db.copy() for mol in re_init_cond_db: for atom in mol: atom.xyz_velocities = -atom.xyz_velocities # Define method method = ml.models.methods(method='UB3LYP/6-31G*',program='Gaussian',nthreads=8) # Run dynamics for imol in range(128): if not os.path.exists(f'forward_traj{imol+1}.xyz'): dyn = ml.md(model=method, molecule_with_initial_conditions = init_cond_db[imol], ensemble='NVE', time_step=0.5, maximum_propagation_time=150.0) traj = dyn.molecular_trajectory traj.dump(filename=f'forward_traj{imol+1}',format='plain_text') if not os.path.exists(f'backward_traj{imol+1}.xyz'): dyn = ml.md(model=method, molecule_with_initial_conditions = re_init_cond_db[imol], ensemble='NVE', time_step=0.5, maximum_propagation_time=150.0) traj = dyn.molecular_trajectory traj.dump(filename=f'backward_traj{imol+1}',format='plain_text') # Analyze trajectories def bond_length(molecule): d1 = molecule.internuclear_distance(0,10) d2 = molecule.internuclear_distance(6,11) return d1, d2 def check_traj(forward_traj,backward_traj): forward_label = 'ts' backward_label = 'ts' forward_time = -1 backward_time = -1 forward_distances_list = [] backward_distances_list = [] for imol in range(len(forward_traj)): d1,d2 = bond_length(forward_traj[imol]) min_dist = min(d1,d2) max_dist = max(d1,d2) distances = [d1,d2] forward_distances_list.append(distances) if min_dist > 5.0: forward_label = 'reactant' break if max_dist < 1.6: forward_label = 'product' break if (min_dist > 2.52 or max_dist < 2.02) and forward_time < 0: forward_time = imol*0.5 for imol in range(len(backward_traj)): d1,d2 = bond_length(backward_traj[imol]) min_dist = min(d1,d2) max_dist = max(d1,d2) distances = [d1,d2] backward_distances_list.append(distances) if min_dist > 5.0: backward_label = 'reactant' break if max_dist < 1.6: backward_label = 'product' break if (min_dist > 2.52 or max_dist < 2.02) and backward_time < 0: backward_time = imol*0.5 # Check time gap if (forward_label=='product' and backward_label=='reactant') or (forward_label=='reactant' and backward_label=='product'): if forward_label == 'product': distances_list = forward_distances_list else: distances_list = backward_distances_list bond1_formed = False bond2_formed = False for istep in range(len(distances_list)): distances = distances_list[istep] # print(distances) if min(distances) < 1.6 and not bond1_formed: gap = istep * 0.5 bond1_formed = True if max(distances) < 1.6 and not bond2_formed: gap = istep * 0.5 - gap bond2_formed = True break else: gap = None return forward_label,backward_label,forward_time+backward_time,gap reactive = 0 nonreactive = 0 traverse_time_list = [] gap_list = [] for imol in range(128): forward_traj = ml.data.molecular_database.from_xyz_file(f'forward_traj{imol+1}.xyz') backward_traj = ml.data.molecular_database.from_xyz_file(f'backward_traj{imol+1}.xyz') forward_label, backward_label, traverse_time, gap = check_traj(forward_traj,backward_traj) if (forward_label == 'product' and backward_label == 'reactant') or (forward_label == 'reactant' and backward_label == 'product'): reactive += 1 # print(f'{forward_time} + {backward_time} = {forward_time+backward_time}') traverse_time_list.append(traverse_time) gap_list.append(gap) else: nonreactive += 1 print(f"Number of reactive trajectories: {reactive}") print(f"Number of non-reactive trajectories: {nonreactive}") print(f"Average time: {np.mean(traverse_time_list)} fs") print(f"Median time: {np.median(traverse_time_list)} fs") print(f"Standard deviation: {np.std(traverse_time_list)} fs") print(f"Average time gap: {np.mean(gap_list)} fs") print(f"Median time gap: {np.median(gap_list)} fs") print(f"Standard deviation: {np.std(gap_list)} fs") Unfortunately, each DFT trajectory will take roughly 30 minutes and 128 trajectories will take more than 2 days. You definitely do not want to wait that long. Using machine learning model can greatly accelerate the simulations, so we provide a pre-trained model for this reaction :download:`da_energy_ani.npz `. You can simply replace the DFT method with this ANI model: .. code-block:: python method = ml.models.ani(model_file='da_energy_ani.npz',device='cpu') # If you want to use gpu, set device='cuda' We provide our initial conditions (:download:`init_cond_db.json `) so that you can reproduce our results. The simulations should take about 15 minutes (it will be much faster if you run them in parallel by using MLatom's ``simulation.run_in_parallel`` function!) and you will be able to get the following outputs: .. code-block:: Number of reactive trajectories: 121 Number of non-reactive trajectories: 7 Average time: 53.35537190082645 fs Median time: 53.0 fs Standard deviation: 11.622406049624619 fs Average time gap: 3.479338842975207 fs Median time gap: 3.0 fs Standard deviation: 2.885126215935735 fs