ML of excited states
There are several ways to learn and predict excited-state energies and forces with MLatom which are shown below. It is recommended that you get familiar with excited-state data format in MLatom which are required for training ML models. See the manuals for other cases such as application of ML for UV/vis one- and two-photon absorption calculations.
Multi-state ANI models
Multi-state learning model (MS-ANI) that has unrivaled accuracy for excited state properties (accuracy is often better than for models targeting only ground state!). We demonstrate that this model can be used for trajectory-surface hopping of multiple molecules (not just for a single molecule!) in:
Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti, Pavlo O. Dral. Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning. 2024. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-dtc1w.
Zip with tutorial materials
including Jupyter notebook:
#Import MLatom as a Python package.
import mlatom as ml
#First, we define a MS-ANI model
model = ml.models.msani(model_file='fulvene_tutorial.npz',nstates=2)
the trained MS-ANI model will be saved in fulvene_tutorial.npz
#then, we load the training data as a MLatom molecular database. This is a small dataset for
#demo purposes.
train_data = ml.data.molecular_database.load("tutorial_data.json", format="json")
#Now we can train the model.
model.train(molecular_database=train_data,
property_to_learn='energy',
xyz_derivative_property_to_learn='energy_gradients',
hyperparameters={'max_epochs': 100}) #100 epochs is not enough, only for demo.
#The model can be used to make single-point predictions
#First we need to load a fulvene molecule from .xyz
mol = ml.data.molecule()
mol.read_from_xyz_file('fulvene.xyz')
mol.view()
3Dmol.js failed to load for some reason. Please check your browser console for error messages.
#Now we can make a prediction
model.predict(molecule=mol, nstates=2, current_state=0,
calculate_energy=True, calculate_energy_gradients=True)
#And print the results
for state in mol.electronic_states:
print(state.energy)
gap = (mol.electronic_states[1].energy-mol.electronic_states[0].energy)*ml.constants.Hartree2eV
print("The S1-S0 gap is {} eV".format(gap))
-230.56873760870891 -230.49484480298358 The S1-S0 gap is 2.010727281361726 eV
#Finally, we can use the MS-ANI model to run NAMD.
#For that we will load a model trained in the AL loop.
model_namd = ml.models.msani(model_file='fulvene_energy_iteration_19.npz')
model loaded from fulvene_energy_iteration_19.npz
#Load an initial conditions database
init_db = ml.data.molecular_database.load("init_cond_db.json", format="json")
#And define NAMD arguments
timemax = 60 # fs
namd_kwargs = {
'model': model_namd,
'time_step': 0.1, # fs
'maximum_propagation_time': timemax,
'hopping_algorithm': 'LZBL',
'nstates': 2,
'reduce_kinetic_energy': True,
}
#And finally run the NAMD. This should take about a minute.
dyns = ml.simulations.run_in_parallel(molecular_database=init_db[:4], task=ml.namd.surface_hopping_md, task_kwargs=namd_kwargs, create_and_keep_temp_directories=False)
#Now we can plot the results
trajs = [d.molecular_trajectory for d in dyns]
ml.namd.plot_population(trajectories=trajs, time_step=0.1,
max_propagation_time=timemax, nstates=2)
#MLatom trajectories can be analyzed and visualized using simple Python code. An example below.
energies = [[],[]]
time = []
gap = []
for step in trajs[0].steps:
energies[0].append(step.molecule.electronic_states[0].energy)
energies[1].append(step.molecule.electronic_states[1].energy)
time.append(step.time)
gap.append(step.molecule.electronic_states[1].energy-step.molecule.electronic_states[0].energy)
import matplotlib.pyplot as plt
import numpy as np
plt.plot(time, np.transpose(energies))
plt.xlabel("Time (fs)")
plt.ylabel("Energy (Hartree)")
plt.legend(['S0', 'S1'])
<matplotlib.legend.Legend at 0x2adbeae2d940>
ML-NAMD with single-state models
In this tutorial, we show an example of running surface-hopping MD with single-state ML models. Please see a separate tutorial on machine learning potentials.
See our paper for more details (please also cite it if you use the corresponding features):
Lina Zhang, Sebastian Pios, Mikołaj Martyka, Fuchun Ge, Yi-Fan Hou, Yuxinxin Chen, Joanna Jankowska, Lipeng Chen, Mario Barbatti, Pavlo O. Dral. MLatom software ecosystem for surface hopping dynamics in Python with quantum mechanical and machine learning methods. J. Chem. Theory Comput. 2024, 20, 5043–5057. DOI: 10.1021/acs.jctc.4c00468. Preprint on arXiv: https://arxiv.org/abs/2404.06189.
You can download
the Jupyter notebook with the required initial conditions and ML models from this paper.
The tutorial calculations are very fast and you should be able to get the final population plot for 5 fs with 0.25 fs time step from 30 trajectories within a minute. Here is the Jupyter notebook code snippet:
import mlatom as ml
import os
import numpy as np
# Read initial conditions
init_cond_db = ml.data.molecular_database.load(filename='materials/init_cond_db_for_pyrazine.json', format='json')
# We need to create a class that accepts the specific arguments shown below and saves the calculated electronic state properties in the molecule object
class mlmodels():
def __init__(self, nstates = 5):
folder_with_models = 'materials/lz_models'
self.models = [None for istate in range(nstates)]
for istate in range(nstates):
self.models[istate] = [ml.models.ani(model_file=f'{folder_with_models}/ensemble{ii+1}s{istate}.pt') for ii in range(2)]
for ii in range(2): self.models[istate][ii].nthreads = 1
def predict(self,
molecule=None,
nstates=5,
current_state=0,
calculate_energy=True,
calculate_energy_gradients=True):
molecule.electronic_states = [molecule.copy() for ii in range(nstates)]
for istate in range(nstates):
moltmp = molecule.electronic_states[istate]
moltmpens = [moltmp.copy() for ii in range(2)]
for ii in range(2):
self.models[istate][ii].predict(molecule=moltmpens[ii], calculate_energy = True, calculate_energy_gradients = True)
moltmp.energy = np.mean([moltmpens[ii].energy for ii in range(2)])
moltmp.energy_gradients = np.mean([moltmpens[ii].energy_gradients for ii in range(2)], axis=0)
molecule.energy = molecule.electronic_states[current_state].energy
molecule.energy_gradients = molecule.electronic_states[current_state].energy_gradients
models = mlmodels()
# Arguments for running NAMD trajectories
timemax = 5 # fs
namd_kwargs = {
'model': models,
'time_step': 0.25, # fs
'maximum_propagation_time': timemax,
'dump_trajectory_interval': None,
'hopping_algorithm': 'LZBL',
'nstates': 5,
'random_seed': 1, # making sure that the hopping probabilities are the same (should not be used in actual calculations!)
'rescale_velocity_direction': 'along velocities',
'reduce_kinetic_energy': False,
}
# Run 30 trajectories
dyns = ml.simulations.run_in_parallel(molecular_database=init_cond_db[:30], task=ml.namd.surface_hopping_md, task_kwargs=namd_kwargs)
trajs = [d.molecular_trajectory for d in dyns]
ml.namd.analyze_trajs(trajectories=trajs, maximum_propagation_time=timemax)
# Dump the trajectories
for itraj in range(len(trajs)):
trajs[itraj].dump(filename=f'traj{itraj+1}.json', format='json')
# Prepare the population plot
ml.namd.plot_population(trajectories=trajs, time_step=0.25,
max_propagation_time=timemax, nstates=5, filename=f'pop.png')
Since we used the fixed random seed, you should get the following final population: