面跳跃动力学
In this tutorial we show how to perform nonadiabatic molecular dynamics (NAMD) with MLatom (and its interfaces to Newton-X whenever required). The simulations are only possible through the Python API. MLatom currently supports only NAC-free Landau–Zener–Belyev–Lebedev (LZBL) surface hopping.
Tutorials covered here:
how to use the universal AIQM1 method for excited-state simulations (ML-improved semi-empirical QM approach faster than DFT),
surface-hopping dynamics with ML models:
perform gapMD for efficient sampling of the region near conical intersections,
active learning for surface-hopping dynamics.
See our papers for more details (please also cite them if you use the corresponding features):
Lina Zhang, Sebastian Pios, Mikołaj Martyka, Fuchun Ge, Yi-Fan Hou, Yuxinxin Chen, Joanna Jankowska, Lipeng Chen, Mario Barbatti, Pavlo O. Dral. MLatom software ecosystem for surface hopping dynamics in Python with quantum mechanical and machine learning methods. J. Chem. Theory Comput. 2024, 20, 5043–5057. DOI: 10.1021/acs.jctc.4c00468. Preprint on arXiv: https://arxiv.org/abs/2404.06189.
Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti, Pavlo O. Dral. Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning. 2024. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-dtc1w.
设置
根据电子结构方法或机器学习模型的不同,您可能需要安装MLatom的开发版本以及外部程序,如 MNDO (用于AIQM1和半经验QM)、 COLUMBUS (用于CASSCF)、 Turbomole (用于ADC(2))、或 TorchANI (用于AIQM1和ANI势)。AIQM1还需要 dftd4 。
AIQM1
本教程包含两部分:
单点激发态计算
请下载教程文件( sp_aiqm1.zip
)
首先,我们在 cnh4+.xyz
文件中提供了 \(\text{CNH}_4^+\) 的初始几何结构,其中的坐标以埃为单位,内容如下:
6
symmetry c1
N 0.051443000 -0.125748000 0.596619000
C 0.067113000 -0.025698000 -0.683445000
H 0.002169000 0.695516000 1.199263000
H 0.087711000 -1.030762000 1.065358000
H 0.027120000 0.954772000 -1.143351000
H 0.120118000 -0.922908000 -1.288953000
然后我们可以使用这个几何结构来计算AIQM1/CIS水平的激发态性质(也可计算注释中的AIQM1/MRCI):
#!/usr/bin/env python
# coding: utf-8
import mlatom as ml
# get the molecule object
mol = ml.data.molecule()
mol.read_from_xyz_file('cnh4+.xyz')
mol.charge = 1
# define the method
aiqm1 = ml.models.methods(method='AIQM1')
# To run AIQM1/MRCI calculations, you can uncomment the following line and make sure that mndokw file is available (see below)
#aiqm1 = ml.models.methods(method='AIQM1',qm_program_kwargs={'read_keywords_from_file':'mndokw'})
# calculate electronic state properties
aiqm1.predict(molecule=mol,
nstates=3, # Number of electronic states to calculate
current_state=2, # Requests calculating properties for the S2 electronic state (the third one or second excited)
calculate_energy=True,
calculate_energy_gradients=[True, True, True], # requests calculating gradients for all three states
#calculate_energy_gradients=True # requests calculating gradients only for the current state for many methods (but not for AIQM1 yet)
calculate_nacv=False, # calculate nonadiabatic coupling vectors
read_density_matrix=False)
# show the energy and gradients of each state
for i in range(len(mol.electronic_states)):
print(f'Energy of state {i}, {mol.electronic_states[i].energy}')
print(f'Energy gradients of state {i}')
print(mol.electronic_states[i].energy_gradients)
# or alternatively:
print('Electronic state energies:', mol.state_energies)
print('Energy gradients in all electronic states')
print(mol.state_gradients)
# save the molecule object in file
mol.dump(filename="mol.json", format="json")
以上在Python脚本的输出如下所示:
Energy of state 0, -94.87746605828193
Energy gradients of state 0
[[-2.20979325e-05 -3.09923481e-05 4.19648750e-04]
[ 1.84361461e-05 7.32504232e-05 -9.31292919e-04]
[-2.19042984e-04 4.91462433e-03 -8.93827657e-04]
[ 2.61506161e-04 -4.71725762e-03 -1.64437563e-03]
[ 1.19805843e-04 -2.90049761e-03 1.30834139e-03]
[-1.58607238e-04 2.66087287e-03 1.74150332e-03]]
Energy of state 1, -94.63515481241645
Energy gradients of state 1
[[ 5.04737245e-04 3.19599783e-03 -4.08726944e-02]
[ 4.00318642e-04 2.59615245e-03 -3.32555938e-02]
[ 2.10040873e-04 2.67707897e-05 -1.31297761e-02]
[ 1.06425061e-04 2.01394042e-03 -1.29785531e-02]
[-1.28339570e-04 -1.37049107e-02 4.93587653e-02]
[-1.09318226e-03 5.87204910e-03 5.08778502e-02]]
Energy of state 2, -94.58438542616139
Energy gradients of state 2
[[-3.22384930e-04 -1.95587619e-03 2.50398870e-02]
[ 1.58487817e-03 1.00715972e-02 -1.28916150e-01]
[-9.11050390e-06 3.73598849e-03 -1.03835407e-02]
[ 2.78260000e-04 -2.07710696e-03 -1.08367181e-02]
[ 2.43944556e-03 -6.96100621e-02 5.75290677e-02]
[-3.97108828e-03 5.98354594e-02 6.75674517e-02]]
Electronic state energies: [-94.87746606 -94.63515481 -94.58438543]
Energy gradients in all electronic states
[[[-2.20979325e-05 -3.09923481e-05 4.19648750e-04]
...
所有计算结果将被存储在 mol.json
中,您可以检查该文件以更好地了解数据在MLatom中的存储方式,以及如何通过其Python API访问不同的属性。
如果您使用了AIQM1/MRCI,您需要选择活性空间并对传递给MNDO程序的关键字进行其他调整(请参阅 其文档 )。在这里,您需要使用 mndokw
文件,其内容如下:
jop=-2 +
iop=-22 immdp=-1 igeom=1 iform=1 nsav15=3 +
icuts=-1 icutg=-1 kitscf=9999 iscf=9 iplscf=9 +
iprint=-1 kprint=-5 lprint=-2 mprint=0 jprint=-1 +
kharge=1 imult=0 nprint=-1 +
kci=5 movo=-1 ici1=2 ici2=1 jci1=1 jci2=1 nciref=0 +
iroot=3 iuvcd=2 +
ncisym=-1 ioutci=3 ipubo=1 +
ncigrd=3 icross=1
输出为:
Energy of state 0, -94.88480497779682
Energy gradients of state 0
[[ 1.34242995e-04 9.67083232e-04 -1.23502153e-02]
[-1.78362286e-04 -1.17842971e-03 1.50824778e-02]
[-9.10435874e-05 2.63194636e-03 -2.00336109e-03]
[ 1.56325835e-04 -2.28966888e-03 -2.38851941e-03]
[ 3.31705685e-05 -9.16708517e-04 7.62840787e-04]
[-5.43335121e-05 7.85777537e-04 8.96776412e-04]]
Energy of state 1, -94.63484529385708
Energy gradients of state 1
[[ 0.00050423 0.00321188 -0.04107714]
[ 0.00044946 0.00290478 -0.0372039 ]
[ 0.00017054 0.00049241 -0.01193939]
[ 0.00012002 0.001369 -0.01187431]
[-0.00012679 -0.01405619 0.0502662 ]
[-0.00111746 0.00607812 0.05182854]]
Energy of state 2, -94.56976745482731
Energy gradients of state 2
[[ 6.21429387e-03 3.96821088e-02 -5.07668904e-01]
[-6.24036735e-03 -3.98880933e-02 5.10344195e-01]
[-9.20936201e-05 2.29106816e-03 -1.36149279e-03]
[ 1.28184773e-04 -2.05330005e-03 -1.70182967e-03]
[ 9.81727653e-05 -2.05300286e-03 3.51054010e-05]
[-1.08190451e-04 2.02121919e-03 3.52925482e-04]]
动力学
在这里,我们扩展了上述示例,展示了如何在传播LZBL面跳跃动力学中使用与之前相同的方法(AIQM1/MRCI)。请下载教程文件( namd_aiqm1.zip
)。
以下脚本展示了所有步骤:
优化几何构型
运行频率计算
使用Newton-X的例程从Wigner分布中生成初始条件。该示例可以扩展以支持按激发能量窗口进行筛选(有关更多详情,请参阅 手册 )。
并行传播多条轨迹(此处为16条轨迹,每条轨迹为5 fs,时间步长为0.1 fs)。
将轨迹储存为h5md格式
通过生成群体图来分析结果。
脚本如下:
import mlatom as ml
# Load the initial geometry of a molecule
mol = ml.data.molecule()
mol.charge=1
mol.read_from_xyz_file('cnh4+.xyz')
# Define methods
# .. for NAMD
aiqm1 = ml.models.methods(method='AIQM1',
qm_program_kwargs={'save_files_in_current_directory': True,
'read_keywords_from_file':'../materials/mndokw'})
# .. for optimization, frequencies and normal mode calculations
method_optfreq = ml.models.methods(method='B3LYP/Def2SVP', program='pyscf')
# Optimize geometry
geomopt = ml.simulations.optimize_geometry(model=method_optfreq,
initial_molecule=mol)
eqmol = geomopt.optimized_molecule
eqmol.write_file_with_xyz_coordinates('eq.xyz')
# Get frequencies
ml.simulations.freq(model=method_optfreq,
molecule=eqmol)
eqmol.dump(filename='eqmol.json', format='json')
# Get initial conditions
init_cond_db = ml.generate_initial_conditions(molecule=eqmol,
generation_method='wigner',
number_of_initial_conditions=16,
initial_temperature=0,
random_seed=1) # To ensure we always get the same initial conditions (should not be used in actual calculations)
init_cond_db.dump('test.json','json')
# Propagate multiple LZBL surface-hopping trajectories in parallel
# .. setup dynamics calculations
namd_kwargs = {
'model': aiqm1,
'time_step': 0.25,
'maximum_propagation_time': 5,
'hopping_algorithm': 'LZBL',
'nstates': 3,
'initial_state': 2, # Numbering of states starts from 0!
'random_seed': 1 # To ensure we always get the same initial conditions (should not be used in actual calculations)
}
# .. run trajectories in parallel
dyns = ml.simulations.run_in_parallel(molecular_database=init_cond_db,
task=ml.namd.surface_hopping_md,
task_kwargs=namd_kwargs,
create_and_keep_temp_directories=True)
trajs = [d.molecular_trajectory for d in dyns]
# Dump the trajectories
itraj=0
for traj in trajs:
itraj+=1
traj.dump(filename=f"traj{itraj}.h5",format='h5md')
# Analyze the result of trajectories and make the population plot
ml.namd.analyze_trajs(trajectories=trajs, maximum_propagation_time=5)
ml.namd.plot_population(trajectories=trajs, time_step=0.25,
max_propagation_time=5, nstates=3, filename=f'pop.png', pop_filename='pop.txt')
这是最终的群体图(由于初始条件和跃迁中的随机种子不同,您的图可能会有所不同):
您还将获得带有群体信息的文本文件 pop.txt
,其内容应如下:
0.000 0.0 0.0 1.0
0.250 0.0 0.0 1.0
0.500 0.0 0.0 1.0
0.750 0.0 0.0 1.0
...
机器学习
Multi-state ANI models
Multi-state learning model (MS-ANI) that has unrivaled accuracy for excited state properties (accuracy is often better than for models targeting only ground state!). We demonstrate that this model can be used for trajectory-surface hopping of multiple molecules (not just for a single molecule!) in:
Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti, Pavlo O. Dral. Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning. 2024. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-dtc1w.
Zip with tutorial materials
including Jupyter notebook:
#Import MLatom as a Python package.
import mlatom as ml
#First, we define a MS-ANI model
model = ml.models.msani(model_file='fulvene_tutorial.npz',nstates=2)
the trained MS-ANI model will be saved in fulvene_tutorial.npz
#then, we load the training data as a MLatom molecular database. This is a small dataset for
#demo purposes.
train_data = ml.data.molecular_database.load("tutorial_data.json", format="json")
#Now we can train the model.
model.train(molecular_database=train_data,
property_to_learn='energy',
xyz_derivative_property_to_learn='energy_gradients',
hyperparameters={'max_epochs': 100}) #100 epochs is not enough, only for demo.
#The model can be used to make single-point predictions
#First we need to load a fulvene molecule from .xyz
mol = ml.data.molecule()
mol.read_from_xyz_file('fulvene.xyz')
mol.view()
3Dmol.js failed to load for some reason. Please check your browser console for error messages.
#Now we can make a prediction
model.predict(molecule=mol, nstates=2, current_state=0,
calculate_energy=True, calculate_energy_gradients=True)
#And print the results
for state in mol.electronic_states:
print(state.energy)
gap = (mol.electronic_states[1].energy-mol.electronic_states[0].energy)*ml.constants.Hartree2eV
print("The S1-S0 gap is {} eV".format(gap))
-230.56873760870891 -230.49484480298358 The S1-S0 gap is 2.010727281361726 eV
#Finally, we can use the MS-ANI model to run NAMD.
#For that we will load a model trained in the AL loop.
model_namd = ml.models.msani(model_file='fulvene_energy_iteration_19.npz')
model loaded from fulvene_energy_iteration_19.npz
#Load an initial conditions database
init_db = ml.data.molecular_database.load("init_cond_db.json", format="json")
#And define NAMD arguments
timemax = 60 # fs
namd_kwargs = {
'model': model_namd,
'time_step': 0.1, # fs
'maximum_propagation_time': timemax,
'hopping_algorithm': 'LZBL',
'nstates': 2,
'reduce_kinetic_energy': True,
}
#And finally run the NAMD. This should take about a minute.
dyns = ml.simulations.run_in_parallel(molecular_database=init_db[:4], task=ml.namd.surface_hopping_md, task_kwargs=namd_kwargs, create_and_keep_temp_directories=False)
#Now we can plot the results
trajs = [d.molecular_trajectory for d in dyns]
ml.namd.plot_population(trajectories=trajs, time_step=0.1,
max_propagation_time=timemax, nstates=2)
#MLatom trajectories can be analyzed and visualized using simple Python code. An example below.
energies = [[],[]]
time = []
gap = []
for step in trajs[0].steps:
energies[0].append(step.molecule.electronic_states[0].energy)
energies[1].append(step.molecule.electronic_states[1].energy)
time.append(step.time)
gap.append(step.molecule.electronic_states[1].energy-step.molecule.electronic_states[0].energy)
import matplotlib.pyplot as plt
import numpy as np
plt.plot(time, np.transpose(energies))
plt.xlabel("Time (fs)")
plt.ylabel("Energy (Hartree)")
plt.legend(['S0', 'S1'])
<matplotlib.legend.Legend at 0x2adbeae2d940>
ML-NAMD with single-state models
在本教程中,我们展示了一个更高级的示例,演示了如何使用ML模型进行表面跃迁分子动力学模拟。请参阅有关 通用机器学习势 的单独教程,之后我们将提供有关如何专门为激发态训练MLP的教程。
您可以从本文 下载
具有所需初始条件和ML模型的Jupyter笔记本。
教程中的计算速度非常快,您应该能够在一分钟内从30条轨迹中获得5 fs的最终群体图,时间步长为0.25 fs。以下是Jupyter笔记本的代码片段:
import mlatom as ml
import os
import numpy as np
# Read initial conditions
init_cond_db = ml.data.molecular_database.load(filename='materials/init_cond_db_for_pyrazine.json', format='json')
# We need to create a class that accepts the specific arguments shown below and saves the calculated electronic state properties in the molecule object
class mlmodels():
def __init__(self, nstates = 5):
folder_with_models = 'materials/lz_models'
self.models = [None for istate in range(nstates)]
for istate in range(nstates):
self.models[istate] = [ml.models.ani(model_file=f'{folder_with_models}/ensemble{ii+1}s{istate}.pt') for ii in range(2)]
for ii in range(2): self.models[istate][ii].nthreads = 1
def predict(self,
molecule=None,
nstates=5,
current_state=0,
calculate_energy=True,
calculate_energy_gradients=True):
molecule.electronic_states = [molecule.copy() for ii in range(nstates)]
for istate in range(nstates):
moltmp = molecule.electronic_states[istate]
moltmpens = [moltmp.copy() for ii in range(2)]
for ii in range(2):
self.models[istate][ii].predict(molecule=moltmpens[ii], calculate_energy = True, calculate_energy_gradients = True)
moltmp.energy = np.mean([moltmpens[ii].energy for ii in range(2)])
moltmp.energy_gradients = np.mean([moltmpens[ii].energy_gradients for ii in range(2)], axis=0)
molecule.energy = molecule.electronic_states[current_state].energy
molecule.energy_gradients = molecule.electronic_states[current_state].energy_gradients
models = mlmodels()
# Arguments for running NAMD trajectories
timemax = 5 # fs
namd_kwargs = {
'model': models,
'time_step': 0.25, # fs
'maximum_propagation_time': timemax,
'dump_trajectory_interval': None,
'hopping_algorithm': 'LZBL',
'nstates': 5,
'random_seed': 1, # making sure that the hopping probabilities are the same (should not be used in actual calculations!)
'rescale_velocity_direction': 'along velocities',
'reduce_kinetic_energy': False,
}
# Run 30 trajectories
dyns = ml.simulations.run_in_parallel(molecular_database=init_cond_db[:30], task=ml.namd.surface_hopping_md, task_kwargs=namd_kwargs)
trajs = [d.molecular_trajectory for d in dyns]
ml.namd.analyze_trajs(trajectories=trajs, maximum_propagation_time=timemax)
# Dump the trajectories
for itraj in range(len(trajs)):
trajs[itraj].dump(filename=f'traj{itraj+1}.json', format='json')
# Prepare the population plot
ml.namd.plot_population(trajectories=trajs, time_step=0.25,
max_propagation_time=timemax, nstates=5, filename=f'pop.png')
由于我们使用了固定的随机种子,您应该得到以下最终的群体分布:
GapMD
GapMD is an efficient way to explore the region near the conical intersections, see for details:
Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti, Pavlo O. Dral. Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning. 2024. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-dtc1w.
Zip with tutorial materials
including Jupyter notebook:
import mlatom as ml
#Load some database of initial conditions.
initial_conditions = ml.data.molecular_database.load("init_cond_db.json", format="json")
#Define a model, we will use the AL-trained ML model for fulvene.
ml_model = ml.models.msani(model_file='fulvene_energy_iteration_19.npz')
model loaded from fulvene_energy_iteration_19.npz
#Initialize a gapMD run
dyn = ml.gap_md(
model = ml_model,
molecule = initial_conditions[-1],
ensemble='NVE',
time_step=0.5,
maximum_propagation_time = 100,
lower_state =0,
upper_state =1,
current_state=1,
nstates=2,
gap_threshold=0.03,
)
#View the gap dynamics. Notice the ethylenic-type CI twisting between the
#5-membered ring and =CH2.
dyn.molecular_trajectory.view()
3Dmol.js failed to load for some reason. Please check your browser console for error messages.
#GapMD can also be ran in parallel. For that we need to prepare parameters,
#like for NAMD...
gapmd_kwargs ={
'model':ml_model,
'ensemble':'NVE',
'time_step':0.5,
'maximum_propagation_time':100,
'lower_state':0,
'upper_state':1,
'current_state':1,
'nstates':2,
'gap_threshold':0.03}
#... and use the same method, run_in_parallel to run them all.
dyns = ml.simulations.run_in_parallel(molecular_database=initial_conditions[:2], task=ml.gap_md, task_kwargs=gapmd_kwargs, create_and_keep_temp_directories=False)
trajs = [d.molecular_trajectory for d in dyns]
for traj in trajs:
traj.view()
3Dmol.js failed to load for some reason. Please check your browser console for error messages.
3Dmol.js failed to load for some reason. Please check your browser console for error messages.
#MLatom trajectories can be analyzed and visualized using simple Python code. An example below.
energies = [[],[]]
time = []
gap = []
for step in trajs[0].steps:
energies[0].append(step.molecule.electronic_states[0].energy)
energies[1].append(step.molecule.electronic_states[1].energy)
time.append(step.time)
gap.append(step.molecule.electronic_states[1].energy-step.molecule.electronic_states[0].energy)
import matplotlib.pyplot as plt
import numpy as np
plt.plot(time, np.transpose(energies))
plt.xlabel("Time (fs)")
plt.ylabel("Energy (Hartree)")
plt.legend(['S0', 'S1'])
<matplotlib.legend.Legend at 0x2b520fd35880>
plt.plot(time, gap)
plt.xlabel("Time (fs)")
plt.ylabel("Gap (Hartree)")
Text(0, 0.5, 'Gap (Hartree)')
Active learning
Efficient and robust active learning for surface-hopping dynamics: often, you can do surface-hopping dynamics from start to finish within a couple of days on a single GPU! See for details:
Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti, Pavlo O. Dral. Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning. 2024. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-dtc1w.
Zip with tutorial materials
including Jupyter notebook:
import mlatom as ml
# Setting up the reference method. Note: you might want a better method than AIQM1/CIS for NAMD.
# Here, it is used for demo purposes.
aiqm1= ml.models.methods(method="AIQM1")
aiqm1_kwargs = {'nstates':2}
# Load the molecule for AL. Needs to have frequencies computed.
# It will be used in the normal mode sampling (Wigner sampling)
# you can generate this file with frequency calculations using MLatom
# i.e., ml.simulations.freq(molecule=..., ...) - see the dedicated tutorial
eqmol = ml.molecule.load('eqmol.json',format='json')
# Initial conditions for ML-NAMD in the AL loop
init_sampler_kwargs = {
'eqmol':eqmol,
'initial_temperature':0,
'number_of_initial_conditions': 16 # defines number of trajectories here
}
# Arguments for the AL ML-NAMD sampler
# you might want to use the defaults or stricter criteria
# this arguments are chosen to run the tutorial faster
sampler_kwargs = {
'maximum_propagation_time':10.0, # fs
'time_step':0.1, # fs
'nstates':2, # number of electronic state for TSH
'initial_state':1, # initial state for ML-NAMD
'prevent_back_hop':False, # prevent back hops?
'dump_trajectory_interval':10, # interval to dump trajectory.
'reduce_memory_usage':True, # do not keep entire trajectories in memory (dumps and cuts the trajectory as defined in dump_trajectory_interval, if dump_trajectory_interval=None then the trajectory is dumped at the end). If reduce_memory_usage=False then AL will not dump anything but you may run out of memory.
'min_surfuq_points': 12, # minimum points for AL-NAMD to be considered converged.
'ngapmd_trajs':16, # total number of gapMD trajectories. Default the same as the number of trajectories. If the specified value is greater than the nuber of ML-NAMD trajectories (number_of_initial_conditions), then it is changed to number_of_initial_conditions.
'max_hopprobuq_points':4, # maximum number of uncertain hopping probability points, default - 15.
'prob_uq_threshold':0.1, # threshold for probability-based sampling.
'plot_populations':True, # plot populations each iteration? It can take extra time in longer dynamics.
'stop_uncertain_md':False, # stop MD when it is uncertain, otherwise keep propagating (has no impact on sampling, is slower but allows for plotting populations)
'initcond_sampler_kwargs':init_sampler_kwargs
}
#Arguments for the initial points sampling procedure.
init_dataset_sampler_kwargs = {
'molecule':eqmol,
'initial_temperature': 0,
'number_of_initial_conditions': 50
}
# ML model arguments.
# again, you might want to use the default for gap_weight
ml_model_kwargs = {'nstates': 2, # number of states in the MS-ANI model
'gap_weight':1.0} # weight of the gap loss
# Running active learning.
# Some settings better leave default (like nthreads, initial_point_refinement).
# Here they are changed for faster calculations.
ml.al(molecule = eqmol,
reference_method = aiqm1, # reference method for labeling
ml_model = 'msani', # specify the MS-ANI model
sampler = 'lzsh', # use LZSH for sampling, with the AL-NAMD loop
initdata_sampler_kwargs=init_dataset_sampler_kwargs,
initial_points_refinement= 'one-shot', # use one-shot method for initial points sampling
initdata_sampler = 'wigner', # sample initial data from Wigner
sampler_kwargs = sampler_kwargs,
refmethod_kwargs = aiqm1_kwargs, # arguments of the reference method
nthreads = 2, # number of threads
ml_model_kwargs = ml_model_kwargs,# arguments for setting up the ML model
new_points = 50, # number of new points each iteration
)
Initial points sampling time: 16.064443078357726 s Active learning iteration 0 Points to label: 50 New labeled points: 50 0 points are abandoned due to failed calculation Number of points in the labeled data set: 50 New threshold: 0.02166365133151412 New threshold: 0.021648109401762487 Number of training points: 50 Number of subtraining points: 45 Number of validation points: 5 Main model Subtraining set: RMSE of values = 0.001697495306656166 Correlation coefficient = 0.9976927354489449 RMSE of gradients = 0.004054396915404656 Correlation coefficient = 0.9982806152425137 Validation set: RMSE of values = 4.964569583224559e-08 Correlation coefficient = 0.9999999999969875 RMSE of gradients = 3.5428167685899343e-08 Correlation coefficient = 0.9999999999998586 Auxiliary model Subtraining set: RMSE of values = 0.0026502502558855 Correlation coefficient = 0.9610827810297459 Validation set: RMSE of values = 0.01078232560595051 Correlation coefficient = 0.674217189362713 Trajectory 1 number of steps: 101 Adding molecule from trajectory 1 at time 5.00 fs Starting gapMD from trajectory 1 at time 4.00 fs Trajectory 2 number of steps: 101 Adding molecule from trajectory 2 at time 2.70 fs Starting gapMD from trajectory 2 at time 2.60 fs Trajectory 3 number of steps: 101 Adding molecule from trajectory 3 at time 0.00 fs Starting gapMD from trajectory 3 at time 0.00 fs Trajectory 4 number of steps: 101 Adding molecule from trajectory 4 at time 5.60 fs Starting gapMD from trajectory 4 at time 1.60 fs Trajectory 5 number of steps: 101 Adding molecule from trajectory 5 at time 0.00 fs Starting gapMD from trajectory 5 at time 0.00 fs Trajectory 6 number of steps: 101 Adding molecule from trajectory 6 at time 6.10 fs Starting gapMD from trajectory 6 at time 4.50 fs Trajectory 7 number of steps: 101 Adding molecule from trajectory 7 at time 5.20 fs Starting gapMD from trajectory 7 at time 3.80 fs Trajectory 8 number of steps: 101 Adding molecule from trajectory 8 at time 4.00 fs Starting gapMD from trajectory 8 at time 1.40 fs Trajectory 9 number of steps: 101 Adding molecule from trajectory 9 at time 1.10 fs Starting gapMD from trajectory 9 at time 0.20 fs Trajectory 10 number of steps: 101 Adding molecule from trajectory 10 at time 0.00 fs Starting gapMD from trajectory 10 at time 0.00 fs Trajectory 11 number of steps: 101 Adding molecule from trajectory 11 at time 5.50 fs Starting gapMD from trajectory 11 at time 3.80 fs Trajectory 12 number of steps: 101 Adding molecule from trajectory 12 at time 4.90 fs Starting gapMD from trajectory 12 at time 0.50 fs Trajectory 13 number of steps: 101 Adding molecule from trajectory 13 at time 0.00 fs Starting gapMD from trajectory 13 at time 0.00 fs Trajectory 14 number of steps: 101 Adding molecule from trajectory 14 at time 0.00 fs Starting gapMD from trajectory 14 at time 0.00 fs Trajectory 15 number of steps: 101 Adding molecule from trajectory 15 at time 5.30 fs Starting gapMD from trajectory 15 at time 2.00 fs Trajectory 16 number of steps: 101 Adding molecule from trajectory 16 at time 4.70 fs Starting gapMD from trajectory 16 at time 2.70 fs The surface stopping times are: 5.0 2.7 0.0 5.6000000000000005 0.0 6.1000000000000005 5.2 4.0 1.1 0.0 5.5 4.9 0.0 0.0 5.300000000000001 4.7 The model is 0.0 % converged Running gapMD trajs on lower surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 1.00 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 1.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 1.50 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 1.00 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 1.50 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 1.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 1.50 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 1.00 fs Running gapMD trajs on upper surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 0.00 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 4.50 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 2.00 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 1.00 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 2.00 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 1.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 1.50 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 0.50 fs 32 Trajectory 1 number of steps: 101 Adding molecule from trajectory 1 at time 0.00 fs Starting gapMD from trajectory 1 at time 0.00 fs Trajectory 2 number of steps: 101 Adding molecule from trajectory 2 at time 6.40 fs Starting gapMD from trajectory 2 at time 3.70 fs Trajectory 3 number of steps: 101 Adding molecule from trajectory 3 at time 6.60 fs Starting gapMD from trajectory 3 at time 0.70 fs Trajectory 4 number of steps: 101 Adding molecule from trajectory 4 at time 4.50 fs Starting gapMD from trajectory 4 at time 3.60 fs Trajectory 5 number of steps: 101 Adding molecule from trajectory 5 at time 0.00 fs Starting gapMD from trajectory 5 at time 0.00 fs Trajectory 6 number of steps: 101 Adding molecule from trajectory 6 at time 2.80 fs Starting gapMD from trajectory 6 at time 2.80 fs Trajectory 7 number of steps: 101 Adding molecule from trajectory 7 at time 1.90 fs Starting gapMD from trajectory 7 at time 0.70 fs Trajectory 8 number of steps: 101 Adding molecule from trajectory 8 at time 4.80 fs Starting gapMD from trajectory 8 at time 4.40 fs Trajectory 9 number of steps: 101 Adding molecule from trajectory 9 at time 4.10 fs Starting gapMD from trajectory 9 at time 1.80 fs Trajectory 10 number of steps: 101 Adding molecule from trajectory 10 at time 0.00 fs Starting gapMD from trajectory 10 at time 0.00 fs Trajectory 11 number of steps: 101 Adding molecule from trajectory 11 at time 2.10 fs Starting gapMD from trajectory 11 at time 0.10 fs Trajectory 12 number of steps: 101 Adding molecule from trajectory 12 at time 0.00 fs Starting gapMD from trajectory 12 at time 0.00 fs Trajectory 13 number of steps: 101 Adding molecule from trajectory 13 at time 4.80 fs Starting gapMD from trajectory 13 at time 0.70 fs Trajectory 14 number of steps: 101 Adding molecule from trajectory 14 at time 6.60 fs Starting gapMD from trajectory 14 at time 6.50 fs Trajectory 15 number of steps: 101 Adding molecule from trajectory 15 at time 5.20 fs Starting gapMD from trajectory 15 at time 1.00 fs Trajectory 16 number of steps: 101 Adding molecule from trajectory 16 at time 0.00 fs Starting gapMD from trajectory 16 at time 0.00 fs The surface stopping times are: 0.0 6.4 6.6000000000000005 4.5 0.0 2.8000000000000003 1.9000000000000001 4.800000000000001 4.1000000000000005 0.0 2.1 0.0 4.800000000000001 6.6000000000000005 5.2 0.0 The model is 0.0 % converged Running gapMD trajs on lower surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 0.50 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 0.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 0.00 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 3.00 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 1.50 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 1.00 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 0.50 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 0.00 fs Running gapMD trajs on upper surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 2.50 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 0.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 0.00 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 0.50 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 0.00 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 2.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 0.00 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 6.00 fs 32 Iteration 0 takes 565.5004676911049 s Labeling time: 15.793768778908998 s Training time: 420.94353390298784 s Sampling time: 128.76316500920802 s Active learning iteration 1 Points to label: 50 New labeled points: 50 0 points are abandoned due to failed calculation Number of points in the labeled data set: 100 Current threshold for state 0: 0.02166365133151412 Current threshold for state 1: 0.021648109401762487 Number of training points: 100 Number of subtraining points: 90 Number of validation points: 10 Main model Subtraining set: RMSE of values = 0.0009359529276005658 Correlation coefficient = 0.9994262114468375 RMSE of gradients = 0.003836567826145676 Correlation coefficient = 0.9989704130475721 Validation set: RMSE of values = 0.0019189207731975563 Correlation coefficient = 0.9977948236666376 RMSE of gradients = 0.0073077039742743466 Correlation coefficient = 0.9948677670569402 Auxiliary model Subtraining set: RMSE of values = 0.002208234503521621 Correlation coefficient = 0.9948439852171983 Validation set: RMSE of values = 0.004628894208882872 Correlation coefficient = 0.9825574728467348 Trajectory 1 number of steps: 101 Adding molecule from trajectory 1 at time 7.60 fs Starting gapMD from trajectory 1 at time 4.50 fs Trajectory 2 number of steps: 101 Adding molecule from trajectory 2 at time 0.00 fs Starting gapMD from trajectory 2 at time 0.00 fs Trajectory 3 number of steps: 101 Starting gapMD from trajectory 3 at time 1.90 fs Trajectory 4 number of steps: 101 Adding molecule from trajectory 4 at time 7.90 fs Starting gapMD from trajectory 4 at time 2.50 fs Trajectory 5 number of steps: 101 Adding molecule from trajectory 5 at time 7.20 fs Starting gapMD from trajectory 5 at time 0.20 fs Trajectory 6 number of steps: 101 Adding molecule from trajectory 6 at time 7.30 fs Starting gapMD from trajectory 6 at time 0.40 fs Trajectory 7 number of steps: 101 Adding molecule from trajectory 7 at time 1.30 fs Starting gapMD from trajectory 7 at time 0.10 fs Trajectory 8 number of steps: 101 Starting gapMD from trajectory 8 at time 4.80 fs Trajectory 9 number of steps: 101 Adding molecule from trajectory 9 at time 8.30 fs Starting gapMD from trajectory 9 at time 2.90 fs Trajectory 10 number of steps: 101 Adding molecule from trajectory 10 at time 6.30 fs Starting gapMD from trajectory 10 at time 1.30 fs Trajectory 11 number of steps: 101 Adding molecule from trajectory 11 at time 8.30 fs Starting gapMD from trajectory 11 at time 4.30 fs Trajectory 12 number of steps: 101 Starting gapMD from trajectory 12 at time 4.30 fs Trajectory 13 number of steps: 101 Adding molecule from trajectory 13 at time 6.30 fs Starting gapMD from trajectory 13 at time 6.00 fs Trajectory 14 number of steps: 101 Adding molecule from trajectory 14 at time 8.90 fs Starting gapMD from trajectory 14 at time 2.90 fs Trajectory 15 number of steps: 101 Adding molecule from trajectory 15 at time 6.00 fs Starting gapMD from trajectory 15 at time 2.10 fs Trajectory 16 number of steps: 101 Adding molecule from trajectory 16 at time 7.30 fs Starting gapMD from trajectory 16 at time 1.50 fs The surface stopping times are: 7.6000000000000005 0.0 7.9 7.2 7.300000000000001 1.3 8.3 6.300000000000001 8.3 6.300000000000001 8.9 6.0 7.300000000000001 The model is 18.75 % converged Running gapMD trajs on lower surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 2.50 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 4.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 2.00 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 4.00 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 4.00 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 4.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 5.50 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 1.00 fs Running gapMD trajs on upper surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 4.50 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 4.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 3.00 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 1.50 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 6.00 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 6.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 4.00 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 3.00 fs 29 Trajectory 1 number of steps: 101 Adding molecule from trajectory 1 at time 7.70 fs Starting gapMD from trajectory 1 at time 2.80 fs Trajectory 2 number of steps: 101 Adding molecule from trajectory 2 at time 8.00 fs Starting gapMD from trajectory 2 at time 2.70 fs Trajectory 3 number of steps: 101 Adding molecule from trajectory 3 at time 0.10 fs Starting gapMD from trajectory 3 at time 0.00 fs Trajectory 4 number of steps: 101 Adding molecule from trajectory 4 at time 6.70 fs Starting gapMD from trajectory 4 at time 5.80 fs Trajectory 5 number of steps: 101 Adding molecule from trajectory 5 at time 6.60 fs Starting gapMD from trajectory 5 at time 0.90 fs Trajectory 6 number of steps: 101 Adding molecule from trajectory 6 at time 7.10 fs Starting gapMD from trajectory 6 at time 6.50 fs Trajectory 7 number of steps: 101 Starting gapMD from trajectory 7 at time 2.90 fs Trajectory 8 number of steps: 101 Adding molecule from trajectory 8 at time 8.50 fs Starting gapMD from trajectory 8 at time 8.10 fs Trajectory 9 number of steps: 101 Adding molecule from trajectory 9 at time 0.00 fs Starting gapMD from trajectory 9 at time 0.00 fs Trajectory 10 number of steps: 101 Adding molecule from trajectory 10 at time 9.50 fs Starting gapMD from trajectory 10 at time 7.20 fs Trajectory 11 number of steps: 101 Adding molecule from trajectory 11 at time 9.50 fs Starting gapMD from trajectory 11 at time 1.50 fs Trajectory 12 number of steps: 101 Adding molecule from trajectory 12 at time 0.00 fs Starting gapMD from trajectory 12 at time 0.00 fs Trajectory 13 number of steps: 101 Adding molecule from trajectory 13 at time 6.50 fs Starting gapMD from trajectory 13 at time 0.10 fs Trajectory 14 number of steps: 101 Adding molecule from trajectory 14 at time 9.10 fs Starting gapMD from trajectory 14 at time 0.40 fs Trajectory 15 number of steps: 101 Adding molecule from trajectory 15 at time 8.00 fs Starting gapMD from trajectory 15 at time 4.90 fs Trajectory 16 number of steps: 101 Adding molecule from trajectory 16 at time 0.00 fs Starting gapMD from trajectory 16 at time 0.00 fs The surface stopping times are: 7.7 8.0 0.1 6.7 6.6000000000000005 7.1000000000000005 8.5 0.0 9.5 9.5 0.0 6.5 9.1 8.0 0.0 The model is 6.25 % converged Running gapMD trajs on lower surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 7.00 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 0.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 2.50 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 0.50 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 0.50 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 2.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 6.00 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 1.00 fs Running gapMD trajs on upper surfaces Trajectory 1 number of steps: 20 Adding molecule from trajectory 1 at time 4.50 fs Trajectory 2 number of steps: 20 Adding molecule from trajectory 2 at time 1.00 fs Trajectory 3 number of steps: 20 Adding molecule from trajectory 3 at time 3.00 fs Trajectory 4 number of steps: 20 Adding molecule from trajectory 4 at time 2.50 fs Trajectory 5 number of steps: 20 Adding molecule from trajectory 5 at time 3.00 fs Trajectory 6 number of steps: 20 Adding molecule from trajectory 6 at time 1.50 fs Trajectory 7 number of steps: 20 Adding molecule from trajectory 7 at time 6.00 fs Trajectory 8 number of steps: 20 Adding molecule from trajectory 8 at time 0.00 fs 31 Iteration 1 takes 683.7381067443639 s Labeling time: 12.741916863247752 s Training time: 540.8486425001174 s Sampling time: 130.14754738099873 s Active learning iteration 2 Points to label: 50 New labeled points: 50 0 points are abandoned due to failed calculation Number of points in the labeled data set: 150 Current threshold for state 0: 0.02166365133151412 Current threshold for state 1: 0.021648109401762487 Number of training points: 150 Number of subtraining points: 135 Number of validation points: 15 Main model Subtraining set: RMSE of values = 0.0005569757242605153 Correlation coefficient = 0.999893743202432 RMSE of gradients = 0.002572013976181595 Correlation coefficient = 0.9996511422987345 Validation set: RMSE of values = 0.0007755574094383158 Correlation coefficient = 0.9997931428803224 RMSE of gradients = 0.004419000198206317 Correlation coefficient = 0.9989716001762989 Auxiliary model Subtraining set: RMSE of values = 0.0019819420967923454 Correlation coefficient = 0.99851033686998 Validation set: RMSE of values = 0.005016435923752971 Correlation coefficient = 0.9912751692411135 Trajectory 1 number of steps: 101 Starting gapMD from trajectory 1 at time 0.90 fs Trajectory 2 number of steps: 101 Adding molecule from trajectory 2 at time 9.60 fs Starting gapMD from trajectory 2 at time 2.00 fs probUQ diverged probUQ diverged Trajectory 3 number of steps: 101 Starting gapMD from trajectory 3 at time 1.00 fs Trajectory 4 number of steps: 101 Adding molecule from trajectory 4 at time 8.90 fs Starting gapMD from trajectory 4 at time 1.40 fs Trajectory 5 number of steps: 101 Adding molecule from trajectory 5 at time 9.90 fs Starting gapMD from trajectory 5 at time 5.60 fs Trajectory 6 number of steps: 101 Starting gapMD from trajectory 6 at time 4.30 fs Trajectory 7 number of steps: 101 Starting gapMD from trajectory 7 at time 8.70 fs Trajectory 8 number of steps: 101 Starting gapMD from trajectory 8 at time 5.70 fs Trajectory 9 number of steps: 101 Starting gapMD from trajectory 9 at time 4.40 fs Trajectory 10 number of steps: 101 Starting gapMD from trajectory 10 at time 5.00 fs Trajectory 11 number of steps: 101 Starting gapMD from trajectory 11 at time 6.20 fs Trajectory 12 number of steps: 101 Starting gapMD from trajectory 12 at time 5.80 fs Trajectory 13 number of steps: 101 Starting gapMD from trajectory 13 at time 3.90 fs Trajectory 14 number of steps: 101 Adding molecule from trajectory 14 at time 9.10 fs Starting gapMD from trajectory 14 at time 2.30 fs Trajectory 15 number of steps: 101 Starting gapMD from trajectory 15 at time 8.30 fs Trajectory 16 number of steps: 101 Adding molecule from trajectory 16 at time 8.00 fs Starting gapMD from trajectory 16 at time 6.80 fs The surface stopping times are: 9.600000000000001 8.9 9.9 9.1 8.0 The model is 68.75 % converged 12 5 The ML-NAMD trajectories are converged, AL-NAMD converged. 0 Number of points to be labeled is less than 5, active learning converged Iteration 2 takes 1201.5959568759426 s Labeling time: 13.451935027725995 s Training time: 1136.9589645522647 s Sampling time: 51.18505729595199 s
<mlatom_alnamd.mlatom.al.al at 0x2b38020aa310>
<Figure size 1500x1200 with 0 Axes>
<Figure size 1500x1200 with 0 Axes>
<Figure size 1500x1200 with 0 Axes>
<Figure size 1500x1200 with 0 Axes>
<Figure size 1500x1200 with 0 Axes>
<Figure size 1500x1200 with 0 Axes>
Results above do not show any population as the settings were not strict enough to get to any hoppings and the trajectories are not propagated long enough