.. _releases:
Releases
=================
Current release is ``MLatom, version 3.18.2``.
Upgrade your MLatom to the latest one with:
.. code-block:: bash
pip install --upgrade mlatom
.. _release_3.18:
MLatom 3.18
---------------------
- 3.18.0 -- released on 09.06.2025
- 3.18.1 -- released on 26.06.2025
- 3.18.2 -- released on 02.07.2025
Check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.18.2``
What's new?
++++++++++++
- MDtrajNet model
- MDtrajNet combines equivariant neural networks with a Transformer-based architecture to achieve strong accuracy and transferability in predicting long-time trajectories for both known and unseen systems.
- See our preprint for more details: Fuchun Ge and Pavlo O. Dral*. Artificial intelligence for direct prediction of molecular dynamics across chemical space. *ChemRxiv*. **2025**. DOI: `10.26434/chemrxiv-2025-kc7sn `_.
- Click :ref:`here ` to check the tutorial.
- ECTS
- ECTS is an ultra-fast diffusion model for exploring chemical reactions with equivariant consistency.
- Related paper: Xu M, Li B, Dong Z, Dral P, Zhu T, Chen H. ECTS: An ultra-fast diffusion model for exploring chemical reactions with equivariant consistency. *ChemRxiv*. **2025**. DOI: `10.26434/chemrxiv-2025-f9vdp `_.
- Click :ref:`here ` to check the tutorial.
- Fewest-switches surface hopping (FSSH)
- MRCI or CASSCF through the interface to COLUMBUS.
- FSSH with any ML models that provide energies, forces, and NACs for the electronic states of interest.
- Click :ref:`here ` to check the tutorial.
- Kernel ridge regression in Julia
- We provide another implementation of KRR in Julia.
- Click :ref:`here ` to check the tutorial.
- Machine learning of nonadiabatic coupling vectors (ML-NAC)
- Related paper: Jakub Martinka, Lina Zhang, Yi-Fan Hou, Mikołaj Martyka, Jiří Pittner, Mario Barbatti, Pavlo O. Dral. A descriptor is all you need: accurate machine learning of nonadiabatic coupling vectors. *ChemRxiv*. **2025**. DOI: `10.26434/chemrxiv-2025-wzkst `_.
- Click here to check the tutorial.
- In 3.18.1: bug fixes.
- In 3.18.2: the user does not have to install e3nn unless MDtrajNet is used.
.. _release_3.17:
MLatom 3.17
---------------------
- 3.17.3 -- released on 21.05.2025
- 3.17.2 -- released on 16.04.2025.
- 3.17.1 -- released on 26.03.2025.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.17.3``
What's new?
+++++++++++
- More support for reaction exploration including:
- QST2 and QST3 via interface to Gaussian
- Stable NEB method via interface to ASE
- `reaction_database` as the mlatom data format for storing reaction data
- Improved data analysis and format transform including:
- Transfer between h5 file and `molecular_database`
- More options for calculating RMSD between two structures
- Improved uvvis plot
- Improved error handling
- Many bug fixes and code refactoring to make MLatom more efficient and lightweight
- in 3.17.2: fixed a severe bug, where the command line input was broken.
.. - in 3.17.3: fixed a severe bug with AIMNet2 models preventing its use in geometry optimization and frequency calculations.
.. _release_3.16:
MLatom 3.16
---------------------
- 3.16.2 -- released on 18.12.2024.
- 3.16.1 -- released on 11.12.2024.
- 3.16.0 -- released on 04.12.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.16.2``
What's new?
+++++++++++
- :ref:`TDDFT calculations with Gaussian and TDDFT and TDA calculations with PySCF `.
- parsing Gaussian output files into :ref:`MLatom data format `.
- new in 3.16.1:
- DFTB and TD-DFTB calculations via the interface to the DFTB+ program (see the :ref:`installation instructions ` on how to set it up with MLatom).
- fixed the speed of importing mlatom as a Python package: ``import mlatom`` now takes less than a second.
- new in 3.16.2:
- calculating bond lengths, angles, dihedral angles with respective functions (``molecule.bond_length(atom1_index, atom2_index)``, etc.).
- calculating RMSD between two structures as ``mlatom.xyz.rmsd(molecule1, molecule2)``.
- running ``geomopt`` (or ``ts``) and ``freq`` calculations in the same job, i.e., the input file can contain both keywords, and MLatom will run the frequency calculations after the geometry optimization of the minimum or transition-state structure.
.. _release_3.16_nutshell:
In a nutshell
~~~~~~~~~~~~~~~~
To run TDDFT and TDA calculations with PySCF, just define your method as:
.. code-block:: python
import mlatom as ml
tddft=ml.models.methods(method='TD-B3LYP/6-31G*', program='PySCF')
# or
tda=ml.models.methods(method='TDA-B3LYP/6-31G*', program='PySCF')
Interface to Gaussian supports TDDFT as:
.. code-block:: python
import mlatom as ml
tddft=ml.models.methods(method='TD-B3LYP/6-31G*', program='Gaussian')
# note that by default the Gaussian output files will not be saved. You can request to save them, e.g., in the current directory as:
tddft=ml.models.methods(method='TD-B3LYP/6-31G*', program='Gaussian', working_directory='.')
To run DFTB or TD-DFTB calculations:
.. code-block:: python
import mlatom as ml
dftb=ml.models.methods(method='DFTB') # that will work for both DFTB and TD-DFTB, see below the difference when used in predict
Once defined, you can use the methods as usual for calculating excited-state properties, e.g., for :ref:`UV/vis spectra simulations `. Here is an example for single-point calculations for a single molecule or a molecular database (e.g., to get properties required for nuclear-ensemble approach in UV/vis spectra simulations or to generate data for ML):
.. code-block:: python
tddft.predict(molecule=single_molecule, calculate_energy=True, nstates=10, current_state=5)
# or
tda.predict(molecular_database=my_molecular_database, calculate_energy_gradients=True, nstates=10, current_state=5)
# or for TD-DFTB
dftb.predict(molecular_database=my_molecular_database, calculate_energy_gradients=True, nstates=10, current_state=5)
# for just ground-state DFTB
dftb.predict(molecular_database=my_molecular_database, calculate_energy_gradients=True)
The nice thing is also that if you have Gaussian output files lying around, you can directly parse them with MLatom to get the molecules in its data format, e.g.:
.. code-block:: python
import mlatom as ml
mol = ml.molecule.load('gaussian_output_file.log', format='gaussian')
mol.dump('parsed_molecule.json', format='json')
# or
db = ml.molecular_database.load(filename='geomopt.log', format='gaussian')
# or
opttraj = ml.data.molecular_trajectory()
opttraj.load(filename='sample_outputs/gaussian/geomopt.log', format='gaussian')
# or
mol = ml.molecule.load(filename='sample_outputs/gaussian/geomopt.log', format='gaussian')
mol.optimization_trajectory
mol.molecular_database
MLatom also now provides ways to calculate RMSD between structures:
.. code-block:: python
#Example of the simple use:
rmsd = mlatom.xyz.rmsd(molecule1.xyz_coordinates, molecule2.xyz_coordinates)
#Example of using Hungarian algorithm to check for homonuclear atom permutation and reflections:
rmsd = mlatom.xyz.rmsd(molecule1, molecule2, reorder='Hungarian', check_reflection=True)
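For orientation, the quantity itself can be sketched in plain Python. This is only a minimal illustration of the RMSD between two structures that are already aligned and share the same atom ordering; it is not MLatom's implementation, and the helper name ``plain_rmsd`` is hypothetical. It performs no alignment, reordering, or reflection checks.

```python
import math


def plain_rmsd(coords1, coords2):
    """RMSD between two equally ordered, pre-aligned sets of XYZ coordinates."""
    if len(coords1) != len(coords2):
        raise ValueError("structures must have the same number of atoms")
    # Sum of squared Cartesian displacements over all atoms
    squared = sum(
        (a - b) ** 2
        for atom1, atom2 in zip(coords1, coords2)
        for a, b in zip(atom1, atom2)
    )
    return math.sqrt(squared / len(coords1))


# Two H2 geometries (Angstrom) that differ only in the bond length
h2_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.8)]
h2_b = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.9)]
print(plain_rmsd(h2_a, h2_b))
```

Because only one of the two atoms moves here, the result is the displacement divided by the square root of the number of atoms.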
Calculating structural parameters is now also easy:
.. code-block:: python
bond_length = mol.bond_length(0, 1) # bond length between atoms 1 and 2 (using Python indexing starting from zero)
bond_angle = mol.bond_angle(0, 1, 2, degrees=True) # angle in degrees. If you want it in radians, use degrees=False
dihedral_angle = mol.dihedral_angle(0, 1, 2, 4, degrees=True)
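The geometry behind these parameters can be sketched without MLatom. The functions below are hypothetical stand-ins written for illustration, not the library's code; they take raw Cartesian coordinates rather than atom indices, and the sign convention of the dihedral angle is an assumption.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def distance(p1, p2):
    """Distance between two points."""
    return math.dist(p1, p2)


def angle(p1, p2, p3, degrees=True):
    """Angle at p2 formed by the points p1-p2-p3."""
    v1, v2 = _sub(p1, p2), _sub(p3, p2)
    cos_theta = _dot(v1, v2) / (math.hypot(*v1) * math.hypot(*v2))
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp rounding noise
    return math.degrees(theta) if degrees else theta


def dihedral(p1, p2, p3, p4, degrees=True):
    """Torsion angle around the p2-p3 axis."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_unit = tuple(x / math.hypot(*b2) for x in b2)
    phi = math.atan2(_dot(b2_unit, _cross(n1, n2)), _dot(n1, n2))
    return math.degrees(phi) if degrees else phi


points = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(distance(points[0], points[1]))
print(angle(*points[:3]))
print(dihedral(*points))
```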
And finally, after many requests, you can request in the same input file geometry optimization followed by frequency calculations:
.. code-block:: bash
geomopt
freq
xyzfile=init.xyz
for TS optimization:
.. code-block:: bash
ts
freq
xyzfile=init.xyz
Here is an example of how the end of the output file would look:
.. code-block:: bash
Iteration Energy (Hartree)
1 -0.9037967543490
2 -0.9737420405670
3 -0.9675615854860
4 -0.9826833773690
5 -0.9826861683990
Final energy of molecule 1: -0.9826861683990 Hartree
==============================================================================
Vibration analysis for molecule 1
==============================================================================
Multiplicity: 1
This is a linear molecule
Mode Frequencies Reduced masses Force Constants IR intensities
(cm^-1) (AMU) (mDyne/A) (km/mol)
1 3755.3923 1.0078 8.3743 0.0000
==============================================================================
Thermochemistry for molecule 1
==============================================================================
ZPE-exclusive internal energy at 0 K: -0.98269 Hartree
Zero-point vibrational energy : 0.00856 Hartree
Internal energy at 0 K: -0.97413 Hartree
Enthalpy at 298 K: -0.97083 Hartree
Gibbs free energy at 298 K: -0.98570 Hartree
Atomization enthalpy at 0 K: 0.18717 Hartree 117.44811 kcal/mol
ZPE-exclusive atomization energy at 0 K: 0.19572 Hartree 122.81645 kcal/mol
Heat of formation at 298 K: -0.02252 Hartree -14.13239 kcal/mol
==============================================================================
Wall-clock time: 5.03 s (0.08 min, 0.00 hours)
MLatom terminated on 17.12.2024 at 21:30:24
==============================================================================
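The thermochemistry numbers in this listing can be cross-checked by hand. The sketch below assumes the harmonic approximation, in which the zero-point energy is half the sum of the vibrational wavenumbers, and uses the conversion factor of 219474.63 cm^-1 per Hartree; the variable names are illustrative and not part of MLatom.

```python
HARTREE_TO_WAVENUMBER = 219474.63  # cm^-1 per Hartree

frequencies = [3755.3923]     # cm^-1, the single mode from the listing
electronic_energy = -0.98269  # Hartree, ZPE-exclusive internal energy at 0 K

# Harmonic zero-point energy: half a quantum per vibrational mode
zpe = sum(0.5 * nu for nu in frequencies) / HARTREE_TO_WAVENUMBER
internal_energy_0k = electronic_energy + zpe

print(f"Zero-point vibrational energy: {zpe:.5f} Hartree")
print(f"Internal energy at 0 K: {internal_energy_0k:.5f} Hartree")
```

Both printed values agree with the corresponding lines of the output to the printed precision.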
Contributors
++++++++++++
- Pavlo O. Dral (TDDFT in Gaussian, parsing of Gaussian output files, DFTB+ interface, ``geomopt freq`` task)
- Vignesh Balaji Kumar (TDDFT and TDA in PySCF)
- Yi-Fan Hou (parsing of Gaussian output files)
- Xin-Yu Tong (DFTB and TD-DFTB calculations via DFTB+ interface)
- Mikolaj Martyka (bond lengths, bond angles, and dihedral angles)
.. _release_3.15.0:
MLatom 3.15.0
---------------------
- 3.15.0 -- released on 27.11.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.15.0``
What's new?
+++++++++++
- :ref:`fine-tuning universal ANI models `.
When using this feature, please cite:
* Seyedeh Fatemeh Alavi, Yuxinxin Chen, Yi-Fan Hou, Fuchun Ge, Peikun Zheng, Pavlo O. Dral. Towards Accurate and Efficient Anharmonic Vibrational Frequencies with the Universal Interatomic Potential ANI-1ccx-gelu and Its Fine-Tuning. 2024. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-c8s16 (2024-10-09).
Contributors
++++++++++++
- Yuxinxin Chen
- Fuchun Ge
.. _release_3.14.0:
MLatom 3.14.0
---------------------
- 3.14.0 -- released on 20.11.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.14.0``
What's new?
+++++++++++
- :ref:`UV/vis spectra ` from single-point convolution and nuclear-ensemble approach via Python API with improved plotting routines.
- Updated interface to MACE to support its latest 0.3.8 version.
- minor bug fixes.
Contributors
++++++++++++
- Pavlo O. Dral
- Fuchun Ge
- Matheus de Oliveira Bispo
.. _release_3.13.0:
MLatom 3.13.0
---------------------
- 3.13.0 -- released on 06.11.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.13.0``
What's new?
+++++++++++
- :ref:`IR spectra calculations ` with AIQM1, AIQM2, UAIQM with semi-empirical baseline, and a range of QM methods (DFT, semi-empirical, ab initio wavefunction), with empirical scaling for better accuracy, special spectra module with plotting routines in Python.
Contributors
++++++++++++
- Yi-Fan Hou
- Fuchun Ge
- Pavlo O. Dral
.. _release_3.12.0:
MLatom 3.12.0
---------------------
- 3.12.0 -- released on 08.10.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.12.0``
What's new?
+++++++++++
- `AIQM2 `__
- `ANI-1ccx-gelu `__.
Contributors
++++++++++++
- Yuxinxin Chen
.. _release_3.11.0:
MLatom 3.11.0
---------------------
- 3.11.0 -- released on 23.09.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.11.0``
What's new?
+++++++++++
- :ref:`DENS24 implementation for our ensembles of DFT functionals `.
- :ref:`IR spectra with harmonic oscillator approximation with AIQM1 and DFT methods `.
- simpler input for methods, e.g., ``B3LYP/6-31G*`` in input will work, no need for ``method=B3LYP/6-31G* program=PySCF`` and similarly in Python API, no need to specify the ``program`` if the default is suitable.
- major bug fixes, particularly in active learning.
Contributors
++++++++++++
- Yuxinxin Chen
- Yi-Fan Hou
- Mikolaj Martyka
- Fuchun Ge
.. _release_3.10.0:
MLatom 3.10.0
---------------------
- 3.10.1 -- released on 21.08.2024.
- 3.10.0 -- released on 21.08.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.10.1``
What's new?
+++++++++++
- big improvements for excited-state simulations and surface-hopping molecular dynamics:
- :ref:`active learning ` for surface-hopping dynamics. It is efficient and robust: often, you can do surface-hopping dynamics from start to finish within a couple of days on a single GPU!
- :ref:`multi-state learning model (MS-ANI) ` that has unrivaled accuracy for excited state properties (accuracy is often better than for models targeting only ground state!). We demonstrate that this model can be used for trajectory-surface hopping of multiple molecules (not just for a single molecule!)
- :ref:`gapMD ` for efficient sampling of the vicinity of conical intersection
- quality of life improvements:
- :ref:`visualizing molecules and their vibrations in Jupyter `, e.g., simply use ``mymol.view(normal_mode=1)``, etc.
- you can now load the molecule using both:
- ``mol = molecule(); mol.load(filename='mymol.json')``
- ``mol = molecule.load(filename='mymol.json')``
- now you can view the MD trajectory in Jupyter as ``mytraj.show()`` for quick checks
- same for molecular databases: ``moldb.show()``
When using the above improvements for surface-hopping dynamics, please cite:
- Mikołaj Martyka, Lina Zhang, Fuchun Ge, Yi-Fan Hou, Joanna Jankowska, Mario Barbatti, Pavlo O. Dral. Charting electronic-state manifolds across molecules with multi-state learning and gap-driven dynamics via efficient and robust active learning. **2024**. Preprint on ChemRxiv: https://doi.org/10.26434/chemrxiv-2024-dtc1w.
Contributors
++++++++++++
- Mikolaj Martyka (:red:`New contributor, welcome to the MLatom developers team!`)
- Lina Zhang
- Pavlo O. Dral
.. _release_3.9.0:
MLatom 3.9.0
---------------------
- 3.9.0 -- released on 23.07.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.9.0``
What's new?
+++++++++++
- :ref:`periodic boundary conditions `
Contributed to this release:
- Fuchun Ge
.. _release_3.8.0:
MLatom 3.8.0
---------------------
- 3.8.0 -- released on 17.07.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.8.0``
What's new?
+++++++++++
- :ref:`Directly learning dynamics with GICnet `
The implementation details are given in the following work (please cite it alongside other required citations when using this feature):
- Fuchun Ge, Lina Zhang, Yi-Fan Hou, Yuxinxin Chen, Arif Ullah, Pavlo O. Dral*. Four-dimensional-spacetime atomistic artificial intelligence models. *J. Phys. Chem. Lett*. **2023**, 14, 7732–7743. DOI: `10.1021/acs.jpclett.3c01592 `_.
Contributed to this release:
- Fuchun Ge
.. _release_3.7.0:
MLatom 3.7.0 -- 3.7.1
---------------------
- 3.7.1 -- released on 04.07.2024. Bug fix.
- 3.7.0 -- released on 03.07.2024.
:download:`Download zip `, check these versions on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.7.1``
What's new?
+++++++++++
- :ref:`Active learning `
- batch parallelization of MD (heavily used in AL).
The implementation details are given in the following work (please cite it alongside other required citations when using this feature):
- Yi-Fan Hou, Lina Zhang, Quanhao Zhang, Fuchun Ge, Pavlo O. Dral. `Physics-informed active learning for accelerating quantum chemical simulations `__. *arXiv* **2024**, DOI: 10.48550/arXiv.2404.11811.
Contributed to this release:
- Yi-Fan Hou (implementations and tests of AL, documentation, design of API)
- Pavlo O. Dral (implementation of earlier version of AL, design of API, documentation)
- Fuchun Ge (batch parallelization of MD)
.. include:: releases/license_3.7.0.rst
.. _release_3.6.0:
MLatom 3.6.0
------------
Released on 15.05.2024.
:download:`Download zip `, check this version on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.6.0``
What's new?
+++++++++++
This is a major release with the implementation of many new universal ML-based models (see :ref:`tutorial `):
- DM21
- AIMNet2
- ANI-1xnr
Contributed to this release: Yuxinxin Chen.
.. include:: releases/license_3.6.0.rst
.. _release_3.5.0:
MLatom 3.5.0
------------
Released on 08.05.2024.
:download:`Download zip `, check this version on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.5.0``
What's new?
+++++++++++
This is a major release with the implementation of the quasi-classical MD:
- sampling from the harmonic quantum Boltzmann distribution
- :ref:`quasi-classical molecular dynamics (quasi-classical trajectories (QCT)) `
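To give a feeling for what sampling from a harmonic quantum Boltzmann distribution involves, here is a minimal sketch of one ingredient: drawing the vibrational quantum number of a single normal mode. It assumes level energies of (n + 1/2)hν, so that the level populations form a geometric distribution. The function names are hypothetical, and this is not the MLatom implementation, which also has to turn the sampled levels into coordinates and velocities.

```python
import math
import random

HC_OVER_K = 1.438776877  # second radiation constant hc/k, in cm*K


def level_probability(n, wavenumber, temperature):
    """Boltzmann probability of harmonic level n for a mode given in cm^-1."""
    q = math.exp(-HC_OVER_K * wavenumber / temperature)
    return (1.0 - q) * q ** n


def sample_level(wavenumber, temperature, rng):
    """Draw a quantum number by inverting the geometric distribution."""
    q = math.exp(-HC_OVER_K * wavenumber / temperature)
    if q == 0.0:  # excited levels are numerically unpopulated
        return 0
    u = rng.random()  # uniform in [0, 1)
    return int(math.log(1.0 - u) / math.log(q))


rng = random.Random(0)  # fixed seed for reproducibility
levels = [sample_level(200.0, 300.0, rng) for _ in range(1000)]
print(max(levels), level_probability(0, 200.0, 300.0))
```

For a stiff mode at room temperature almost all samples fall into n = 0, which is where the quasi-classical treatment differs most from classical thermal sampling.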
The implementation details are given in the following work (please cite it alongside other required citations when using this feature):
- Yi-Fan Hou, Lina Zhang, Quanhao Zhang, Fuchun Ge, Pavlo O. Dral. `Physics-informed active learning for accelerating quantum chemical simulations `__. *arXiv* **2024**, DOI: 10.48550/arXiv.2404.11811.
Contributed to this release: Yi-Fan Hou.
.. include:: releases/license_3.5.0.rst
.. _release_3.4.0:
MLatom 3.4.0
------------
Released on 29.04.2024.
:download:`Download zip `, check this version on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.4.0``
What's new?
+++++++++++
This release is a major release with usability improvements:
- :ref:`simpler input ` of xyz coordinates (and other data) directly in the input file.
- :ref:`more informative output ` of geometry optimizations.
- :ref:`interface ` to `geomeTRIC `__ package for geometry optimizations.
- and :ref:`other improvements ` related to geometry optimizations.
Contributed to this release: Pavlo O. Dral, Yuxinxin Chen, Fuchun Ge, and Yi-Fan Hou.
.. _release_3.4.0_nutshell:
In a nutshell
~~~~~~~~~~~~~~~~
Input becomes much simpler. Here is an example of the geometry optimization job:
.. code-block::
geomopt
GFN2-xTB
XYZfile='
2
H 0 0 0
H 0 0 0.8
'
which would print out much more informative output, e.g., a snippet:
.. code-block::
------------------------------------------------------------------------------
Iteration 8
------------------------------------------------------------------------------
Molecule with 2 atom(s): H, H
XYZ coordinates, Angstrom
1 H 0.000000 0.000000 -0.138338
2 H 0.000000 0.000000 0.638338
Interatomic distance matrix, Angstrom
[[0. 0.77667652]
[0.77667652 0. ]]
Energy: -0.982686 Hartree
Energy gradients, Hartree/Angstrom
1 H 0.000000 0.000000 0.000020
2 H 0.000000 0.000000 -0.000020
Energy gradients norm: 0.000029 Hartree/Angstrom
.. _release_3.4.0_input:
Simpler input
~~~~~~~~~~~~~~~~
If you ever wanted to get rid of the auxiliary files, now it is possible. You just need to enclose the content of such files between ``'`` characters instead of providing the input file name, e.g., as above:
.. code-block::
geomopt
GFN2-xTB
XYZfile='
2
H 0 0 0
H 0 0 0.8
'
This also works for other files such as velocities, etc.
Optimized geometries will be saved in ``optgeoms.xyz`` if option ``optxyz=`` is not provided.
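The text between the quotes is ordinary XYZ content. As a rough sketch of how such a block can be read, the snippet below parses it in plain Python, assuming the standard XYZ layout (atom count, a comment line that may be empty, then one atom per line). The ``parse_xyz`` helper is hypothetical and is not MLatom's parser.

```python
import math

inline_xyz = """2

H 0 0 0
H 0 0 0.8
"""


def parse_xyz(text):
    """Return a list of (element, (x, y, z)) tuples from XYZ-formatted text."""
    lines = text.strip("\n").splitlines()
    number_of_atoms = int(lines[0])
    atoms = []
    # lines[1] is the comment line; the atoms start on the third line
    for line in lines[2:2 + number_of_atoms]:
        symbol, x, y, z = line.split()
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


atoms = parse_xyz(inline_xyz)
print(atoms)
print(math.dist(atoms[0][1], atoms[1][1]))  # initial H-H distance in Angstrom
```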
.. _release_3.4.0_output:
Informative output
~~~~~~~~~~~~~~~~~~
Now you can track the progress of your geometry optimization in the output file which would print out the information about geometries, energies, and gradients at each iteration (if you have no more than 10 molecules):
.. code-block::
------------------------------------------------------------------------------
Iteration 8
------------------------------------------------------------------------------
Molecule with 2 atom(s): H, H
XYZ coordinates, Angstrom
1 H 0.000000 0.000000 -0.138338
2 H 0.000000 0.000000 0.638338
Interatomic distance matrix, Angstrom
[[0. 0.77667652]
[0.77667652 0. ]]
Energy: -0.982686 Hartree
Energy gradients, Hartree/Angstrom
1 H 0.000000 0.000000 0.000020
2 H 0.000000 0.000000 -0.000020
Energy gradients norm: 0.000029 Hartree/Angstrom
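The interatomic distance matrix in this listing follows directly from the printed coordinates. A minimal check in plain Python (illustrative only; the variable names are not MLatom's):

```python
import math

# XYZ coordinates of iteration 8, in Angstrom
coordinates = [
    (0.000000, 0.000000, -0.138338),
    (0.000000, 0.000000, 0.638338),
]

# Symmetric matrix of pairwise distances with zeros on the diagonal
distance_matrix = [[math.dist(a, b) for b in coordinates] for a in coordinates]

for row in distance_matrix:
    print(" ".join(f"{value:.6f}" for value in row))
```

The off-diagonal element matches the printed value to the six decimals available in the coordinates.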
In addition, geometry optimizations will dump many useful files:
- optimized geometries in ``optgeoms.xyz`` or other file saved under name requested with ``optxyz=``.
- the optimization trajectories in XYZ format ``opttraj1.xyz`` and JSON format ``opttraj1.json`` and so on for each molecule.
- in case of optimizations with the Gaussian optimizer, you will also get the corresponding input and output files ``gaussian1.com`` and ``gaussian1.log`` etc for each molecule.
If you want to control how much information is saved (e.g., for big molecules and many molecules):
- ``printmin`` will not print information about every iteration.
- ``printall`` will print detailed information at each iteration.
- ``dumpopttrajs=False`` will not dump any optimization trajectories.
The corresponding controls are also available in the Python API, i.e., as arguments for ``ml.simulations.optimize_geometry``:
- ``print_properties`` (``None`` or ``str``, optional): properties to print. Default: ``None``. Possible ``'all'``.
- ``dump_trajectory_interval`` (``int``, optional): dump trajectory at every time step (1). Set to ``None`` to disable dumping (default).
- ``filename`` (``str``, optional): the file that saves the dumped trajectory. Default: ``None``.
- ``format`` (``str``, optional): format in which the dumped trajectory is saved. Default: ``'json'``.
.. _release_3.4.0_geometric:
geomeTRIC
~~~~~~~~~~~~~~~~~~
MLatom now supports geometry optimizations (including TS optimization) with `geomeTRIC `__. To install it, just run ``pip install geometric``.
To use it, add ``optprog=geomeTRIC`` in the command line or ``program='geometric'`` in the Python API.
If you use this program, please cite:
* L.-P. Wang, C. C. Song, *J. Chem. Phys.* **2016**, *144*, 214108.
.. _release_3.4.0_misc:
Other changes
~~~~~~~~~~~~~~~~~~
- overwrite (and print a warning) the XYZ file with the optimized geometry if it exists. Previously, MLatom would terminate and complain that the file exists. Practice showed that this behavior is annoying, as we often want to rerun calculations in the same folder and replace the old calculation result.
- more graceful handling of failed geometry optimizations.
.. include:: releases/license_3.4.0.rst
.. _release_3.3.0:
MLatom 3.3.0
------------
Released on 03.04.2024.
:download:`Download zip `, check this version on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.3.0``
What's new?
+++++++++++
This is a major release with:
* :ref:`surface-hopping dynamics ` (within the NAC-free Landau--Zener--Belyaev--Lebedev (LZBL) approximation)
* support of :ref:`excited-state calculations ` with AIQM1, *ab initio* methods through COLUMBUS (for CASSCF) and Turbomole (for ADC(2)), semi-empirical methods through the MNDO program, and machine-learning and hybrid QM/ML methods.
* :ref:`Wigner sampling ` (with and without filtering by excitation energy window) - useful for generating initial conditions for surface-hopping dynamics or spectra simulations. The routine for Wigner sampling is adapted to Python from `Newton-X `__. Newton-X does not need to be installed, but you must cite the corresponding `Newton-X paper `__ when using the Wigner sampling.
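The LZBL approximation is NAC-free because, in the Belyaev--Lebedev formulation of the Landau--Zener model, the hopping probability is evaluated where the adiabatic energy gap passes through a local minimum along the trajectory and depends only on the gap and its second time derivative there. The sketch below evaluates that expression in atomic units (ħ = 1). The function name is hypothetical, and this is an illustration of the formula rather than MLatom's code.

```python
import math


def lzbl_hopping_probability(gap, gap_second_derivative):
    """Landau-Zener hopping probability at a local minimum of the energy gap.

    gap: adiabatic energy gap at the minimum (Hartree)
    gap_second_derivative: second time derivative of the gap (atomic units)
    """
    if gap < 0 or gap_second_derivative <= 0:
        raise ValueError("expected a non-negative gap at a local minimum")
    exponent = -0.5 * math.pi * math.sqrt(gap ** 3 / gap_second_derivative)
    return math.exp(exponent)


# A narrower gap with the same curvature gives a more likely hop
print(lzbl_hopping_probability(0.010, 1e-6))
print(lzbl_hopping_probability(0.005, 1e-6))
```

In a surface-hopping run this probability would be compared with a uniform random number to decide whether the trajectory switches state.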
Minor change:
* Since this release, we fixed how the validation loss is calculated in the training of ANI networks. Now it is calculated as the overall mean squared error over all batches, while before it was calculated as averaged RMSE of validation batches. This might lead to different numerical results as the model training uses validation RMSE for, e.g., early stopping and learning rate.
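The two definitions generally give different numbers, in particular when batches have unequal sizes. A toy illustration in plain Python (the residuals are made up for the example and are not related to any MLatom model):

```python
import math

# Validation residuals (prediction minus reference), split into two batches
batches = [[1.0, -1.0], [3.0]]

# New behavior: mean squared error over all validation points
all_residuals = [r for batch in batches for r in batch]
overall_mse = sum(r ** 2 for r in all_residuals) / len(all_residuals)

# Old behavior: RMSE of each batch, averaged over the batches
batch_rmses = [math.sqrt(sum(r ** 2 for r in batch) / len(batch)) for batch in batches]
averaged_rmse = sum(batch_rmses) / len(batch_rmses)

print(f"overall MSE: {overall_mse:.4f} (RMSE {math.sqrt(overall_mse):.4f})")
print(f"averaged batch RMSE: {averaged_rmse:.4f}")
```

Here the overall RMSE is about 1.91, while the averaged batch RMSE is 2.00, so criteria such as early stopping can trigger at different epochs.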
Contributed to this release: Pavlo O. Dral, Lina Zhang, Fuchun Ge, Sebastian Pios, Yi-Fan Hou, and Yuxinxin Chen.
See our paper for more details (please also cite it if you use this feature):
* Lina Zhang, Sebastian Pios, Mikołaj Martyka, Fuchun Ge, Yi-Fan Hou, Yuxinxin Chen, Joanna Jankowska, Lipeng Chen, Mario Barbatti, `Pavlo O. Dral `__. `MLatom software ecosystem for surface hopping dynamics in Python with quantum mechanical and machine learning methods `__. *J. Chem. Theory Comput.* **2024**, *20*, 5043--5057. DOI: 10.1021/acs.jctc.4c00468. Preprint on *arXiv*: https://arxiv.org/abs/2404.06189.
The code snippet to give an idea how to use these new features (see the :ref:`dedicated tutorial `):
.. code-block:: python
import mlatom as ml
# Load the initial geometry of a molecule
mol = ml.data.molecule()
mol.charge=1
mol.read_from_xyz_file('cnh4+.xyz')
# Define model
aiqm1 = ml.models.methods(method='AIQM1',qm_program_kwargs={'save_files_in_current_directory': True, 'read_keywords_from_file':f'mndokw'})
method_optfreq = ml.models.methods(method='B3LYP/Def2SVP', program='pyscf') # You can also use AIQM1 (it is recommended for neutral species because of its better quality)
# Optimize geometry
geomopt = ml.simulations.optimize_geometry(model=method_optfreq,
initial_molecule=mol)
eqmol = geomopt.optimized_molecule
eqmol.write_file_with_xyz_coordinates('eq.xyz')
# Get frequencies
ml.simulations.freq(model=method_optfreq,
molecule=eqmol)
eqmol.dump(filename='eqmol.json', format='json')
# Get initial conditions
init_cond_db = ml.generate_initial_conditions(molecule=eqmol,
generation_method='wigner',
number_of_initial_conditions=16,
initial_temperature=0)
init_cond_db.dump('test.json','json')
# Propagate multiple LZBL surface-hopping trajectories in parallel
# .. setup dynamics calculations
namd_kwargs = {
'model': aiqm1,
'time_step': 0.25,
'maximum_propagation_time': 5,
'hopping_algorithm': 'LZBL',
'nstates': 3,
'initial_state': 2,
}
# .. run trajectories in parallel
dyns = ml.simulations.run_in_parallel(molecular_database=init_cond_db, task=ml.namd.surface_hopping_md, task_kwargs=namd_kwargs, create_and_keep_temp_directories=True)
trajs = [d.molecular_trajectory for d in dyns]
# Dump the trajectories
for itraj, traj in enumerate(trajs, start=1):
    traj.dump(filename=f"traj{itraj}.h5", format='h5md')
# Analyze the result of trajectories and make the population plot
ml.namd.analyze_trajs(trajectories=trajs, maximum_propagation_time=5)
ml.namd.plot_population(trajectories=trajs, time_step=0.25,
max_propagation_time=5, nstates=3, filename=f'pop.png')
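Conceptually, a population plot shows the fraction of trajectories residing in each electronic state at every time step. The sketch below illustrates that bookkeeping in plain Python under the assumption that each trajectory is reduced to a list of zero-based state indices; the ``state_populations`` helper and the toy histories are hypothetical and do not use MLatom's trajectory objects.

```python
def state_populations(state_histories, nstates):
    """Fraction of trajectories in each state at every common time step."""
    nsteps = min(len(history) for history in state_histories)
    ntrajs = len(state_histories)
    populations = []
    for step in range(nsteps):
        counts = [0] * nstates
        for history in state_histories:
            counts[history[step]] += 1
        populations.append([count / ntrajs for count in counts])
    return populations


# Four toy trajectories that start in state 2 and relax over four steps
histories = [
    [2, 2, 1, 0],
    [2, 1, 1, 1],
    [2, 2, 2, 1],
    [2, 2, 1, 1],
]
for step, row in enumerate(state_populations(histories, nstates=3)):
    print(step, row)
```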
.. include:: releases/license_3.3.0.rst
.. _release_3.2.0:
MLatom 3.2.0
------------
Released on 19.03.2024.
:download:`Download zip `, check this version on `PyPI `__ and `GitHub `__.
``pip install mlatom==3.2.0``
What's new?
+++++++++++
This is a major release with many new features, usability and performance improvements, and bug fixes.
In short, we:
* implemented :ref:`energy-weighted training ` of ANI machine learning potentials
* implemented :ref:`diffusion Monte Carlo `
* made AIQM1 on the XACS cloud :ref:`much faster and more stable `
* made :ref:`frequency calculations more robust `
* improved :ref:`ORCA interface `
* implemented CCSD(T)*/CBS calculations for :ref:`charged and open-shell species `
* implemented :ref:`geometry optimization with constraints `.
.. _release_3.2.0_weight_e:
Energy-weighted training of ANI-type machine learning potentials
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is hard to obtain machine learning potentials with balanced description of different PES regions when training on global PES data with many strongly distorted molecular geometries which have high deformation energies. Hence, we implemented training of ANI machine learning potentials using the energy weighting function that downweights the importance of PES regions with high deformation energies. We also provide recommendations, tutorials, and training scripts on an example of the glycine global PES.
See our paper for more details (please also cite it if you use this feature):
* F. Ge, R. Wang, C. Qu, P. Zheng, A. Nandi, R. Conte, P. L. Houston, J. M. Bowman, `P. O. Dral `__.
`Tell machine learning potentials what they are needed for: Simulation-oriented training exemplified for glycine `__. *J. Phys. Chem. Lett.* **2024**, *15*, 4451--4460. DOI: 10.1021/acs.jpclett.4c00746. Preprint on *arXiv*: https://arxiv.org/abs/2403.11216.
The code snippet to give an idea how to use this new feature (see the `dedicated tutorial `__):
.. code-block::
import torch
import mlatom as ml
# ...
# define the weighting function
def weighting_function(energy_reference, a):
    # a is a parameter defining the shape of the function
    global_minimum = -284.33376035
    reference_tensor = torch.tensor(energy_reference - global_minimum)
    x=a*reference_tensor
    x_tensor = torch.tensor(x)
    x_pow5 = x_tensor ** 5
    x_pow4 = x_tensor ** 4
    x_pow3 = x_tensor ** 3
    w = -6 * x_pow5 + 15 * x_pow4 - 10 * x_pow3 + 1
    w = torch.clamp(w, min=0.000001)
    return w
# train the ANI model with the energy weighting function
# get subsets subtrain_molDB and validate_molDB from somewhere (not shown here)
ani = ml.models.ani(model_file=f"glycine_ani_a2.15.pt")
ani.train(
molecular_database=subtrain_molDB,
validation_molecular_database=validate_molDB,
property_to_learn='energy',
xyz_derivative_property_to_learn='energy_gradients',
energy_weighting_function=weighting_function,
energy_weighting_function_kwargs={'a': 2.15},
hyperparameters={'lrReducePatience': 32}
)
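To see what the weighting polynomial does, it can be evaluated without PyTorch. The sketch below is a plain-Python restatement of the same expression for scalar energies. The ``energy_weight`` helper is hypothetical, and the energies are taken to be in the units of the reference data (Hartree, judging by the value of ``global_minimum`` above).

```python
def energy_weight(energy, global_minimum, a, floor=1e-6):
    """Smooth step that equals 1 at the minimum and is floored at high energies."""
    x = a * (energy - global_minimum)
    w = -6 * x ** 5 + 15 * x ** 4 - 10 * x ** 3 + 1
    return max(w, floor)


global_minimum = -284.33376035
for deformation in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    weight = energy_weight(global_minimum + deformation, global_minimum, a=2.15)
    print(f"{deformation:.1f} Hartree above the minimum -> weight {weight:.6f}")
```

The weight falls smoothly from 1 at the global minimum to the floor value once ``a`` times the deformation energy reaches 1, i.e., at about 0.47 Hartree for ``a = 2.15``, so a larger ``a`` restricts the emphasis to lower-energy regions.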
.. _release_3.2.0_dmc:
Diffusion Monte Carlo
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Diffusion Monte Carlo (DMC) is a computationally expensive way of obtaining accurate frequencies and ZPVEs, and hence, the use of machine learning potentials is indispensable. We interfaced MLatom to the great code `PyVibDMC `__ to enable the DMC simulations with different models.
See our paper for an example of DMC calculations for glycine conformers (please also cite it and PyVibDMC if you use this feature):
* F. Ge, R. Wang, C. Qu, P. Zheng, A. Nandi, R. Conte, P. L. Houston, J. M. Bowman, `P. O. Dral `__.
Tell machine learning potentials what they are needed for: Simulation-oriented training exemplified for glycine. *J. Phys. Chem. Lett.* **2024**, *15*, 4451--4460. DOI: 10.1021/acs.jpclett.4c00746. Preprint on *arXiv*: https://arxiv.org/abs/2403.11216.
The code snippet to give an idea how to use this new feature (see the `dedicated tutorial `__):
.. code-block::
import mlatom as ml
# ...
dmc=ml.simulations.dmc(model=model, initial_molecule=conf)
dmc.run(number_of_walkers=30000, number_of_timesteps=55000)
print(f'ZPVE: {(dmc.get_zpe(start_step=-1000) + 284.33355671) * 219474.63} cm-1')
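The arithmetic in the last line is a unit conversion: the DMC zero-point energy is a total energy in Hartree, so the energy of the potential minimum is subtracted first and the difference is multiplied by 219474.63 to obtain cm^-1. A plain-Python sketch of that step, with a hypothetical helper name and a made-up DMC energy:

```python
HARTREE_TO_WAVENUMBER = 219474.63  # cm^-1 per Hartree


def zpve_in_wavenumbers(dmc_energy, minimum_energy):
    """ZPVE as the DMC ground-state energy relative to the potential minimum."""
    return (dmc_energy - minimum_energy) * HARTREE_TO_WAVENUMBER


minimum_energy = -284.33355671  # Hartree, the value subtracted in the snippet above
dmc_energy = -284.25355671      # Hartree, made-up value for illustration
print(f"ZPVE: {zpve_in_wavenumbers(dmc_energy, minimum_energy):.2f} cm-1")
```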
.. _release_3.2.0_aiqm1:
AIQM1 on the XACS cloud: much faster and more stable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Yes, we know, AIQM1 has been too slow and numerically unstable on the XACS cloud. It is frustrating to both us and you. The reason is that, for license reasons, we cannot use on the cloud the MNDO program, which has analytical gradients, and rely instead on Sparrow, which is a great software for many semi-empirical approaches but unfortunately still has no analytical gradients for AIQM1 (it has them for many other methods...). Thus, as a poor man's solution, we tweaked the calculations based on numerical derivatives, and that greatly improved both the speed and stability of AIQM1 on the cloud for geometry optimizations and, to some extent, also for frequencies and thermochemical calculations (see below). With more CPUs available for parallelization, **AIQM1 on the cloud might be even faster than with the single-CPU MNDO program**! Geometries are good for the tested molecules when optimized with the Sparrow-based AIQM1. Frequencies and thermochemistry are OK for small molecules like hydrogen or methane but not for bigger molecules like vinylacetylene; for those, the output may have many negative frequencies, which are purely artifacts of errors introduced by numerical differentiation. Thus, when you use Sparrow-based AIQM1, please check the frequencies: if there are too many negative ones, the frequencies and thermochemistry are not reliable. Currently, the only solution is to use MNDO-based AIQM1, but we have some methods in the works that should improve the situation in future releases.
Please give another try to AIQM1 on the `XACS cloud `__!
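The frequency check recommended above can be sketched in a few lines of plain Python. The function names, the ``expected_negative`` parameter (0 for a minimum, 1 for a transition state), and the example frequencies are all hypothetical. The text does not define how many negative frequencies are "too many", so the threshold here is only an assumption.

```python
def count_negative_frequencies(frequencies_cm1, tolerance=0.0):
    """Count frequencies below -tolerance (negative values denote imaginary modes)."""
    return sum(1 for f in frequencies_cm1 if f < -tolerance)


def frequencies_look_reliable(frequencies_cm1, expected_negative=0, tolerance=0.0):
    """Return True if there are no more negative frequencies than expected.

    Assumption: a minimum should have none and a transition state one.
    """
    return count_negative_frequencies(frequencies_cm1, tolerance) <= expected_negative


# Made-up frequencies in cm-1, for illustration only.
frequencies = [-35.2, 612.4, 980.1, 3310.7]
print("Negative frequencies:", count_negative_frequencies(frequencies))
print("Reliable for a minimum:", frequencies_look_reliable(frequencies))
```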
.. _release_3.2.0_freqprog:
New keyword ``freqprog``. Frequencies and thermochemistry with PySCF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now you can run frequency and thermochemistry calculations with PySCF by using ``freqprog=pyscf`` in the command-line version of MLatom. In Python, you can use the analogous options ``mlatom.freq(program='PySCF', ...)`` and ``mlatom.thermochemistry(program='PySCF', ...)``. This release also introduces the new keyword ``freqprog`` for command-line use and, hence, resolves an inconsistency in previous releases: to choose the program for frequency calculations, users had to use the odd-looking ``optprog``, which, as its name suggests, should only be used for choosing the optimization program.
Using PySCF has several advantages: it is open-source and can be used on our XACS cloud (where it is now the default option), and it handles frequencies much more consistently than our previous implementation based on TorchANI (and ASE for thermochemistry). PySCF automatically removes the rotational and translational frequencies; these can be particularly erroneous and mess up thermochemistry if you use numerical gradients and Hessians (as we do on the cloud for AIQM1). If you want to see those, however, our old implementation is the way to go (use ``freqprog=ASE``).
The old workhorse Gaussian still works like a charm if you have it and, hence, remains the default option in MLatom. If no Gaussian is detected, PySCF is now the default program.
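As a rough sketch, an input file requesting frequencies with PySCF might look as follows. Only ``freqprog=pyscf`` comes from the text above. The ``AIQM1`` method line, the ``freq`` task keyword, and the file name ``opt.xyz`` are assumptions chosen for illustration, so please check the manual before relying on them.

```
AIQM1
freq
freqprog=pyscf
xyzfile=opt.xyz
```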
.. _release_3.2.0_orca:
ORCA interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ORCA is a popular program, and we use it to generate data for our ML models. Hence, we implemented what we have needed so far. The current interface works nicely for DFT calculations. If you want to do something else, the interface has some dirty tricks that you can find in the API manual (these might change in the future, though, if someone wants to improve the interface).
An example of using ORCA in the input file is as usual:
.. code-block::
method=B3LYP/6-31G*
qmprog=Orca
geomopt
xyzfile=init.xyz
optxyz=opt.xyz
And in the Python API:
.. code-block:: python
import mlatom as ml
# ...
dft_with_orca = ml.models.methods(method='B3LYP/6-31G*', program='Orca')
.. _release_3.2.0_ccsdt:
CCSD(T)*/CBS for charged and open-shell species
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We use CCSD(T)*/CBS to generate data and test ML models. It is an expensive but rather accurate approach. So far, our implementation only supported closed-shell, neutral molecules because we didn't need anything else for our research. With the new ORCA interface, we rewrote the old code, and the new one naturally supports charged and open-shell species.
If you want to try it out for this purpose, the input file would look like:
.. code-block::
CCSD(T)*/CBS
yestfile=energy.dat
xyzfile=init.xyz
charges=1,0
multiplicities=2,1
In Python:
.. code-block:: python
import mlatom as ml
# ...
ccsd = ml.models.methods(method='CCSD(T)*/CBS')
mol.charge = 1
mol.multiplicity = 2
ccsd.predict(molecule=mol)
print(mol.energy)
.. _release_3.2.0_geomopt_constraints:
Geometry optimization with constraints
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We use the ASE implementation of constraints, so this is only available when you use the ASE optimizer via the Python API. It needs one more argument, ``constraints``, which should be used like this: ``constraints={'bonds':[[target,[index0,index1]], ...],'angles':[[target,[index0,index1,index2]], ...],'dihedrals':[[target,[index0,index1,index2,index3]], ...]}`` (check the `FixInternals class in ASE `__ for more information). Bond lengths are in Angstrom, and angles and dihedrals are in degrees. Note that atom indices start from 0!
Below is an example of optimizing ethane with AIQM1 while constraining the C-C bond length to a target of 1.8 Angstrom.
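To make the dictionary layout concrete, here is a minimal sketch of a helper that assembles it and catches common mistakes, such as the wrong number of atom indices. The helper ``make_constraints`` is hypothetical and not part of MLatom or ASE, and the angle target below is an arbitrary illustrative value.

```python
def make_constraints(bonds=(), angles=(), dihedrals=()):
    """Build a constraints dictionary from (target, indices) pairs.

    Bond targets are in Angstrom; angle and dihedral targets are in degrees.
    Atom indices are 0-based. Constraint types with no entries are left out.
    """
    expected_sizes = {'bonds': 2, 'angles': 3, 'dihedrals': 4}
    given = {'bonds': bonds, 'angles': angles, 'dihedrals': dihedrals}
    constraints = {}
    for kind, items in given.items():
        entries = []
        for target, indices in items:
            indices = list(indices)
            if len(indices) != expected_sizes[kind]:
                raise ValueError(f"{kind} need {expected_sizes[kind]} atom indices, got {indices}")
            if min(indices) < 0:
                raise ValueError(f"atom indices start from 0, got {indices}")
            if len(set(indices)) != len(indices):
                raise ValueError(f"atom indices must be distinct, got {indices}")
            entries.append([float(target), indices])
        if entries:
            constraints[kind] = entries
    return constraints


# C-C bond of ethane fixed at 1.8 Angstrom plus an illustrative H-C-C angle.
constraints = make_constraints(bonds=[(1.8, (0, 4))], angles=[(109.5, (1, 0, 4))])
print(constraints)
```

The resulting dictionary has the same shape as the ``constraints`` argument described above and could be passed on unchanged.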
The initial XYZ coordinates are:
.. code-block::
8
C -3.41779278 -2.06081078 0.00000000
H -3.06113836 -3.06962078 0.00000000
H -3.06111994 -1.55641259 -0.87365150
H -4.48779278 -2.06079760 0.00000000
C -2.90445057 -1.33485451 1.25740497
H -1.83445237 -1.33656157 1.25838332
H -3.25950713 -0.32548149 1.25642758
H -3.26271953 -1.83812251 2.13105517
In a Python script:
.. code-block:: python
import mlatom as ml
mol = ml.data.molecule.from_xyz_file('ethane_initial.xyz')
print(f"Initial C-C bond length: {mol.get_internuclear_distance_matrix()[0][4]} Angstrom")
constraints = {'bonds':[[1.8,[0,4]]]}
aiqm1 = ml.models.methods(method='AIQM1',qm_program='sparrow')
geomopt = ml.optimize_geometry(model=aiqm1,initial_molecule=mol,program='ase',constraints=constraints)
optmol = geomopt.optimized_molecule
print(f"Final C-C bond length: {optmol.get_internuclear_distance_matrix()[0][4]} Angstrom")
print("XYZ coordinates:")
print(optmol.get_xyz_string())
The output should look like this:
.. code-block::
Initial C-C bond length: 1.5399999964612658 Angstrom
Step Time Energy fmax
LBFGS: 0 10:11:55 -2169.358192 0.6954
LBFGS: 1 10:11:56 -2168.493055 1.5444
LBFGS: 2 10:11:56 -2168.324507 2.1746
LBFGS: 3 10:11:57 -2168.702277 0.3925
LBFGS: 4 10:11:57 -2168.719565 0.3556
LBFGS: 5 10:11:58 -2168.744194 0.4116
LBFGS: 6 10:11:58 -2168.765605 0.3914
LBFGS: 7 10:11:58 -2168.779525 0.2010
LBFGS: 8 10:11:59 -2168.782649 0.0282
LBFGS: 9 10:11:59 -2168.782767 0.0237
LBFGS: 10 10:12:00 -2168.782733 0.0370
LBFGS: 11 10:12:00 -2168.782749 0.0236
LBFGS: 12 10:12:00 -2168.782623 0.0873
LBFGS: 13 10:12:01 -2168.782774 0.0203
LBFGS: 14 10:12:01 -2168.782777 0.0217
LBFGS: 15 10:12:01 -2168.782654 0.0988
LBFGS: 16 10:12:02 -2168.782762 0.0299
LBFGS: 17 10:12:02 -2168.782723 0.0372
LBFGS: 18 10:12:03 -2168.782762 0.0250
LBFGS: 19 10:12:03 -2168.777521 0.5284
LBFGS: 20 10:12:03 -2168.782764 0.0369
LBFGS: 21 10:12:04 -2168.782758 0.0397
LBFGS: 22 10:12:04 -2168.781062 0.1423
LBFGS: 23 10:12:04 -2168.782774 0.0375
LBFGS: 24 10:12:05 -2168.782774 0.0251
LBFGS: 25 10:12:05 -2168.782670 0.0433
LBFGS: 26 10:12:06 -2168.782769 0.0209
LBFGS: 27 10:12:06 -2168.782777 0.0177
Final C-C bond length: 1.799999998386536 Angstrom
XYZ coordinates:
8
C -3.4609594100000 -2.1223293000000 -0.1060679300000
H -3.0827425100000 -3.1376804600000 -0.0751292200000
H -3.0834089300000 -1.5865323800000 -0.9697472700000
H -4.5445733300000 -2.1041917000000 -0.0741819000000
C -2.8607234100000 -1.2737628100000 1.3635074000000
H -1.7772071700000 -1.2935363400000 1.3338165800000
H -3.2376294200000 -0.2576124300000 1.3304506900000
H -3.2417292800000 -1.8070164100000 2.2269711900000
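If you want to double-check the constraint without MLatom, the C-C distance can be recomputed directly from the printed coordinates. The sketch below uses only the Python standard library. The helper names ``read_xyz`` and ``distance`` are hypothetical, and the XYZ text is assumed to have the usual comment line (empty here) after the atom count.

```python
import math


def read_xyz(text):
    """Parse XYZ-format text into a list of (element, x, y, z) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    # Line 0 holds the atom count and line 1 the comment; atoms start at line 2.
    for line in lines[2:2 + n_atoms]:
        element, x, y, z = line.split()
        atoms.append((element, float(x), float(y), float(z)))
    return atoms


def distance(atoms, i, j):
    """Return the distance between atoms i and j (0-based) in the input units."""
    return math.dist(atoms[i][1:], atoms[j][1:])


# Optimized geometry copied from the output above (coordinates in Angstrom).
optimized_xyz = """8

C -3.4609594100000 -2.1223293000000 -0.1060679300000
H -3.0827425100000 -3.1376804600000 -0.0751292200000
H -3.0834089300000 -1.5865323800000 -0.9697472700000
H -4.5445733300000 -2.1041917000000 -0.0741819000000
C -2.8607234100000 -1.2737628100000 1.3635074000000
H -1.7772071700000 -1.2935363400000 1.3338165800000
H -3.2376294200000 -0.2576124300000 1.3304506900000
H -3.2417292800000 -1.8070164100000 2.2269711900000
"""

atoms = read_xyz(optimized_xyz)
# The two carbon atoms have indices 0 and 4, as in the constraints dictionary.
print(f"C-C bond length: {distance(atoms, 0, 4):.4f} Angstrom")
```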
.. include:: releases/license_3.2.0.rst
Older releases
--------------
You can find older releases on the `legacy page `_.