Releases

The current release is MLatom 3.5.0.

MLatom 3.5.0

Released on 08.05.2024.

Download zip, check this version on PyPI and GitHub.

pip install mlatom==3.5.0

What’s new?

This is a major release with the implementation of the quasi-classical MD:

The implementation details are given in the following work (please cite it alongside other required citations when using this feature):

Contributed to this release: Yi-Fan Hou.

License and citations for MLatom 3.5.0

License

MLatom is an open-source software under the MIT license (modified to request proper citations).

Copyright (c) 2013- Pavlo O. Dral (dr-dral.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. When this Software or its derivatives are used in scientific publications, it shall be cited as:

The citations for MLatom’s interfaces and features shall be eventually included too. See program output, header.py, ref.json, and MLatom.com.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Citations

Citations mentioned above should be included. For convenience, below we provide the citations in the Bibtex format and you can also download EndNote files.

  @article{MLatom3,
  author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Barbatti, Mario and Isayev, Olexandr and Wang, Cheng and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Ou, Yanchi},
  title = {MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows},
  journal = {J. Chem. Theory Comput.},
  volume = {20},
  number = {3},
  pages = {1193--1213},
  DOI = {10.1021/acs.jctc.3c01203},
  year = {2024},
  type = {Journal Article}
  }

  @article{MLatom2,
  author = {Dral, Pavlo O. and Ge, Fuchun and Xue, Bao-Xin and Hou, Yi-Fan and Pinheiro Jr, Max and Huang, Jianxing and Barbatti, Mario},
  title = {MLatom 2: An Integrative Platform for Atomistic Machine Learning},
  journal = {Top. Curr. Chem.},
  volume = {379},
  number = {4},
  pages = {27},
  DOI = {10.1007/s41061-021-00339-5},
  year = {2021},
  type = {Journal Article}
  }

  @article{MLatom1,
  author = {Dral, Pavlo O.},
  title = {MLatom: A Program Package for Quantum Chemical Research Assisted by Machine Learning},
  journal = {J. Comput. Chem.},
  volume = {40},
  number = {26},
  pages = {2339--2347},
  DOI = {10.1002/jcc.26004},
  year = {2019},
  type = {Journal Article}
  }

  @misc{MLatomProg,
author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Pios, Sebastian V. and Ou, Yanchi},
  title = {MLatom: A Package for Atomistic Simulations with Machine Learning, version 3.5.0},
  year = {2013--2024},
  type = {Computer Program}
  }

MLatom 3.4.0

Released on 29.04.2024.

Download zip, check this version on PyPI and GitHub.

pip install mlatom==3.4.0

What’s new?

This is a major release with usability improvements:

Contributed to this release: Pavlo O. Dral, Yuxinxin Chen, Fuchun Ge, and Yi-Fan Hou.

In a nutshell

Input becomes much simpler. Here is an example of the geometry optimization job:

geomopt
GFN2-xTB
XYZfile='
2

H 0 0 0
H 0 0 0.8
'

which prints much more informative output, e.g., this snippet:

------------------------------------------------------------------------------
Iteration 8
------------------------------------------------------------------------------

    Molecule with 2 atom(s): H, H

    XYZ coordinates, Angstrom

    1    H             0.000000           0.000000          -0.138338
    2    H             0.000000           0.000000           0.638338

    Interatomic distance matrix, Angstrom

    [[0.         0.77667652]
    [0.77667652 0.        ]]

    Energy:          -0.982686 Hartree

    Energy gradients, Hartree/Angstrom

    1    H             0.000000           0.000000           0.000020
    2    H             0.000000           0.000000          -0.000020

    Energy gradients norm:           0.000029 Hartree/Angstrom

Simpler input

If you ever wanted to get rid of the auxiliary files, now it is possible. You just need to enclose the content of such files between ' characters instead of providing the input file name, e.g., as above:

geomopt
GFN2-xTB
XYZfile='
2

H 0 0 0
H 0 0 0.8
'

This also works for other files such as velocities, etc.

Optimized geometries will be saved in optgeoms.xyz if option optxyz= is not provided.
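Such an input can also be assembled programmatically. The sketch below is plain Python and not part of MLatom; the helper name build_inline_input is a hypothetical one chosen for illustration. It only reproduces the input layout shown above, with the XYZ block enclosed between ' characters.

```python
def build_inline_input(task, method, atoms):
    """Return MLatom-style input text with the XYZ data embedded inline.

    `atoms` is a list of (element, x, y, z) tuples in Angstrom.
    """
    xyz_lines = [str(len(atoms)), ""]  # atom count, then the (empty) comment line
    for element, x, y, z in atoms:
        xyz_lines.append(f"{element} {x:g} {y:g} {z:g}")
    xyz_block = "\n".join(xyz_lines)
    return f"{task}\n{method}\nXYZfile='\n{xyz_block}\n'\n"


h2 = [("H", 0, 0, 0), ("H", 0, 0, 0.8)]
input_text = build_inline_input("geomopt", "GFN2-xTB", h2)
print(input_text)
```

The printed text can be saved to a file and passed to MLatom in the same way as a hand-written input.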

Informative output

Now you can track the progress of your geometry optimization in the output file, which prints the geometries, energies, and gradients at each iteration (if you have no more than 10 molecules):

------------------------------------------------------------------------------
Iteration 8
------------------------------------------------------------------------------

    Molecule with 2 atom(s): H, H

    XYZ coordinates, Angstrom

    1    H             0.000000           0.000000          -0.138338
    2    H             0.000000           0.000000           0.638338

    Interatomic distance matrix, Angstrom

    [[0.         0.77667652]
    [0.77667652 0.        ]]

    Energy:          -0.982686 Hartree

    Energy gradients, Hartree/Angstrom

    1    H             0.000000           0.000000           0.000020
    2    H             0.000000           0.000000          -0.000020

    Energy gradients norm:           0.000029 Hartree/Angstrom
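The distance matrix and the gradient norm in this printout follow directly from the listed coordinates and gradients. The sketch below recomputes them with the Python standard library only. It assumes that the norm is the Euclidean norm over all Cartesian gradient components, which is consistent with the printed value but is not taken from the MLatom source.

```python
import math

# Values copied from the iteration printout above
coordinates = [(0.0, 0.0, -0.138338), (0.0, 0.0, 0.638338)]  # Angstrom
gradients = [(0.0, 0.0, 0.000020), (0.0, 0.0, -0.000020)]    # Hartree/Angstrom

# Interatomic distance between the two hydrogen atoms
distance = math.dist(coordinates[0], coordinates[1])

# Euclidean norm over all gradient components (assumed definition)
gradient_norm = math.sqrt(sum(g * g for atom in gradients for g in atom))

print(f"H-H distance:  {distance:.6f} Angstrom")
print(f"Gradient norm: {gradient_norm:.6f} Hartree/Angstrom")
```

Because the printed gradients are rounded to six decimals, the recomputed norm can differ from the printed one in the last digit.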

In addition, geometry optimizations will dump many useful files:

  • optimized geometries in optgeoms.xyz or other file saved under name requested with optxyz=.

  • the optimization trajectories in XYZ format opttraj1.xyz and JSON format opttraj1.json and so on for each molecule.

  • in case of optimizations with the Gaussian optimizer, you will also get the corresponding input and output files gaussian1.com, gaussian1.log, etc., for each molecule.

If you want to control how much information is saved (e.g., for big molecules and many molecules):

  • printmin will not print information about every iteration.

  • printall will print detailed information at each iteration.

  • dumpopttrajs=False will not dump any optimization trajectories.

The corresponding controls are also available in the Python API, i.e., as arguments of ml.simulations.optimize_geometry:

  • print_properties (None or str, optional): properties to print. Default: None. Possible value: 'all'.

  • dump_trajectory_interval (int, optional): interval at which the trajectory is dumped; 1 dumps it at every step. Set to None to disable dumping (default).

  • filename (str, optional): the file that saves the dumped trajectory. Default: None.

  • format (str, optional): format in which the dumped trajectory is saved. Default: 'json'.

geomeTRIC

MLatom now supports geometry optimizations (including TS optimization) with geomeTRIC. To install it, just run pip install geometric.

To use it, add optprog=geomeTRIC on the command line or pass program='geometric' in the Python API.

If you use this program, please cite:

  • L.-P. Wang, C. C. Song, J. Chem. Phys. 2016, 144, 214108.

Other changes

  • the XYZ file with the optimized geometry is now overwritten (with a printed warning) if it already exists. Previously, MLatom would terminate and complain that the file exists. Practice showed that this behavior is annoying, as we often want to rerun calculations in the same folder and replace the old results.

  • more graceful handling of failed geometry optimizations.

License and citations for MLatom 3.4.0

License

MLatom is an open-source software under the MIT license (modified to request proper citations).

Copyright (c) 2013- Pavlo O. Dral (dr-dral.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. When this Software or its derivatives are used in scientific publications, it shall be cited as:

The citations for MLatom’s interfaces and features shall be eventually included too. See program output, header.py, ref.json, and MLatom.com.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Citations

Citations mentioned above should be included. For convenience, below we provide the citations in the Bibtex format and you can also download EndNote files.

  @article{MLatom3,
  author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Barbatti, Mario and Isayev, Olexandr and Wang, Cheng and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Ou, Yanchi},
  title = {MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows},
  journal = {J. Chem. Theory Comput.},
  volume = {20},
  number = {3},
  pages = {1193--1213},
  DOI = {10.1021/acs.jctc.3c01203},
  year = {2024},
  type = {Journal Article}
  }

  @article{MLatom2,
  author = {Dral, Pavlo O. and Ge, Fuchun and Xue, Bao-Xin and Hou, Yi-Fan and Pinheiro Jr, Max and Huang, Jianxing and Barbatti, Mario},
  title = {MLatom 2: An Integrative Platform for Atomistic Machine Learning},
  journal = {Top. Curr. Chem.},
  volume = {379},
  number = {4},
  pages = {27},
  DOI = {10.1007/s41061-021-00339-5},
  year = {2021},
  type = {Journal Article}
  }

  @article{MLatom1,
  author = {Dral, Pavlo O.},
  title = {MLatom: A Program Package for Quantum Chemical Research Assisted by Machine Learning},
  journal = {J. Comput. Chem.},
  volume = {40},
  number = {26},
  pages = {2339--2347},
  DOI = {10.1002/jcc.26004},
  year = {2019},
  type = {Journal Article}
  }

  @misc{MLatomProg,
author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Pios, Sebastian V. and Ou, Yanchi},
  title = {MLatom: A Package for Atomistic Simulations with Machine Learning, version 3.4.0},
  year = {2013--2024},
  type = {Computer Program}
  }

MLatom 3.3.0

Released on 03.04.2024.

Download zip, check this version on PyPI and GitHub.

pip install mlatom==3.3.0

What’s new?

This is a major release with:

  • surface-hopping dynamics (within the NAC-free Landau–Zener–Belyaev–Lebedev (LZBL) approximation)

  • support of excited-state calculations with AIQM1, ab initio methods through COLUMBUS (for CASSCF) and Turbomole (for ADC(2)), semi-empirical methods through the MNDO program, and machine-learning and hybrid QM/ML methods.

  • Wigner sampling (with and without filtering by excitation energy window), which is useful for generating initial conditions for surface-hopping dynamics or spectra simulations. The routine for Wigner sampling is adapted to Python from Newton-X. Newton-X does not need to be installed, but you must cite the corresponding Newton-X paper when using the Wigner sampling.
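For orientation, the LZBL approximation estimates the hopping probability at a local minimum of the energy gap Z between two adiabatic states as P = exp(-(π/2ħ)·sqrt(Z³/Z̈)), where Z̈ is the second time derivative of the gap, so no nonadiabatic couplings are required. The sketch below evaluates this textbook expression in atomic units with a three-point finite difference. The function name and the example numbers are illustrative assumptions, and this is not MLatom's implementation.

```python
import math

def lzbl_hopping_probability(gap_prev, gap_min, gap_next, time_step):
    """Belyaev-Lebedev hopping probability at a local gap minimum.

    Gaps are in Hartree and time_step in atomic units of time (hbar = 1).
    The second time derivative is a three-point central finite difference.
    """
    second_derivative = (gap_prev - 2.0 * gap_min + gap_next) / time_step**2
    if second_derivative <= 0.0:
        raise ValueError("gap_min is not a local minimum of the gap")
    return math.exp(-0.5 * math.pi * math.sqrt(gap_min**3 / second_derivative))

# Hypothetical gaps around a minimum, sampled every 10 a.u. of time
probability = lzbl_hopping_probability(0.02, 0.01, 0.02, time_step=10.0)
print(f"Hopping probability: {probability:.3f}")
```

In a surface-hopping run, such a probability would be compared with a uniform random number to decide whether the trajectory switches state.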

Minor change:

  • In this release, we fixed how the validation loss is calculated when training ANI networks. It is now calculated as the overall mean squared error over all batches, whereas before it was calculated as the average RMSE of the validation batches. This might lead to different numerical results, as model training uses the validation RMSE for, e.g., early stopping and learning-rate reduction.
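The two ways of aggregating the validation error generally give different numbers, in particular when batches differ in size. The following standalone sketch uses made-up prediction errors and does not call MLatom or TorchANI; it only illustrates the arithmetic behind the change described above.

```python
import math

def rmse(errors):
    """Root-mean-square of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical prediction errors for two validation batches of unequal size
batches = [[0.0, 2.0], [4.0]]

# Old behavior as described above: average of the per-batch RMSEs
averaged_batch_rmse = sum(rmse(batch) for batch in batches) / len(batches)

# New behavior: one mean squared error over all validation points
all_errors = [e for batch in batches for e in batch]
overall_mse = sum(e * e for e in all_errors) / len(all_errors)

print(f"average of batch RMSEs: {averaged_batch_rmse:.4f}")
print(f"overall MSE:            {overall_mse:.4f}")
print(f"root of overall MSE:    {math.sqrt(overall_mse):.4f}")
```

The average of batch RMSEs gives the single-point batch the same weight as the two-point batch, while the overall value weights every validation point equally.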

Contributed to this release: Pavlo O. Dral, Lina Zhang, Fuchun Ge, Sebastian Pios, Yi-Fan Hou, and Yuxinxin Chen.

See our paper for more details (please also cite it if you use this feature):

  • Lina Zhang, Sebastian Pios, Mikołaj Martyka, Fuchun Ge, Yi-Fan Hou, Yuxinxin Chen, Joanna Jankowska, Lipeng Chen, Mario Barbatti, Pavlo O. Dral. MLatom software ecosystem for surface hopping dynamics in Python with quantum mechanical and machine learning methods. 2024, 15, 4451–4460. Preprint on arXiv: https://arxiv.org/abs/2404.06189.

The code snippet to give an idea how to use these new features (see the dedicated tutorial):

import mlatom as ml

# Load the initial geometry of a molecule
mol = ml.data.molecule()
mol.charge=1
mol.read_from_xyz_file('cnh4+.xyz')

# Define model
aiqm1 = ml.models.methods(method='AIQM1', qm_program_kwargs={'save_files_in_current_directory': True, 'read_keywords_from_file': 'mndokw'})
method_optfreq = ml.models.methods(method='B3LYP/Def2SVP', program='pyscf') # You can also use AIQM1 (it is recommended for neutral species because of its better quality)

# Optimize geometry
geomopt = ml.simulations.optimize_geometry(model=method_optfreq,
                                        initial_molecule=mol)
eqmol = geomopt.optimized_molecule
eqmol.write_file_with_xyz_coordinates('eq.xyz')

# Get frequencies
ml.simulations.freq(model=method_optfreq,
                    molecule=eqmol)
eqmol.dump(filename='eqmol.json', format='json')

# Get initial conditions
init_cond_db = ml.generate_initial_conditions(molecule=eqmol,
                                    generation_method='wigner',
                                    number_of_initial_conditions=16,
                                    initial_temperature=0)
init_cond_db.dump('test.json','json')

# Propagate multiple LZBL surface-hopping trajectories in parallel
# .. setup dynamics calculations
namd_kwargs = {
            'model': aiqm1,
            'time_step': 0.25,
            'maximum_propagation_time': 5,
            'hopping_algorithm': 'LZBL',
            'nstates': 3,
            'initial_state': 2,
            }

# .. run trajectories in parallel
dyns = ml.simulations.run_in_parallel(molecular_database=init_cond_db, task=ml.namd.surface_hopping_md, task_kwargs=namd_kwargs, create_and_keep_temp_directories=True)
trajs = [d.molecular_trajectory for d in dyns]

# Dump the trajectories
for itraj, traj in enumerate(trajs, start=1):
    traj.dump(filename=f"traj{itraj}.h5", format='h5md')

# Analyze the result of trajectories and make the population plot
ml.namd.analyze_trajs(trajectories=trajs, maximum_propagation_time=5)
ml.namd.plot_population(trajectories=trajs, time_step=0.25,
                        max_propagation_time=5, nstates=3, filename=f'pop.png')
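State populations in a plot of this kind are typically the fraction of trajectories found in each electronic state at each time step. The sketch below computes that fraction from plain lists of active states. The input layout and the function name are assumptions for illustration and do not reflect how ml.namd.plot_population is implemented.

```python
def state_populations(active_states, nstates):
    """Fraction of trajectories in each state at each time step.

    active_states[i][t] is the (0-based) active state of trajectory i at step t;
    all trajectories are assumed to have the same number of steps.
    """
    ntraj = len(active_states)
    nsteps = len(active_states[0])
    populations = []
    for step in range(nsteps):
        counts = [0] * nstates
        for trajectory in active_states:
            counts[trajectory[step]] += 1
        populations.append([count / ntraj for count in counts])
    return populations

# Four hypothetical trajectories starting in state 2 and hopping downwards
active_states = [
    [2, 2, 1, 1],
    [2, 1, 1, 0],
    [2, 2, 2, 1],
    [2, 2, 1, 0],
]
for step, population in enumerate(state_populations(active_states, nstates=3)):
    print(step, population)
```

With only a handful of trajectories the populations are coarse, which is why many trajectories are usually propagated in parallel.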

License and citations for MLatom 3.3.0

License

MLatom is an open-source software under the MIT license (modified to request proper citations).

Copyright (c) 2013- Pavlo O. Dral (dr-dral.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. When this Software or its derivatives are used in scientific publications, it shall be cited as:

The citations for MLatom’s interfaces and features shall be eventually included too. See program output, header.py, ref.json, and MLatom.com.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Citations

Citations mentioned above should be included. For convenience, below we provide the citations in the Bibtex format and you can also download EndNote files.

  @article{MLatom3,
  author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Barbatti, Mario and Isayev, Olexandr and Wang, Cheng and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Ou, Yanchi},
  title = {MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows},
  journal = {J. Chem. Theory Comput.},
  volume = {20},
  number = {3},
  pages = {1193--1213},
  DOI = {10.1021/acs.jctc.3c01203},
  year = {2024},
  type = {Journal Article}
  }

  @article{MLatom2,
  author = {Dral, Pavlo O. and Ge, Fuchun and Xue, Bao-Xin and Hou, Yi-Fan and Pinheiro Jr, Max and Huang, Jianxing and Barbatti, Mario},
  title = {MLatom 2: An Integrative Platform for Atomistic Machine Learning},
  journal = {Top. Curr. Chem.},
  volume = {379},
  number = {4},
  pages = {27},
  DOI = {10.1007/s41061-021-00339-5},
  year = {2021},
  type = {Journal Article}
  }

  @article{MLatom1,
  author = {Dral, Pavlo O.},
  title = {MLatom: A Program Package for Quantum Chemical Research Assisted by Machine Learning},
  journal = {J. Comput. Chem.},
  volume = {40},
  number = {26},
  pages = {2339--2347},
  DOI = {10.1002/jcc.26004},
  year = {2019},
  type = {Journal Article}
  }

  @misc{MLatomProg,
author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Pios, Sebastian V. and Ou, Yanchi},
  title = {MLatom: A Package for Atomistic Simulations with Machine Learning, version 3.3.0},
  year = {2013--2024},
  type = {Computer Program}
  }

MLatom 3.2.0

Released on 19.03.2024.

Download zip, check this version on PyPI and GitHub.

pip install mlatom==3.2.0

What’s new?

This is a major release with many new features, usability and performance improvements, and bug fixes. In short, we:

Energy-weighted training of ANI-type machine learning potentials

It is hard to obtain machine learning potentials with balanced description of different PES regions when training on global PES data with many strongly distorted molecular geometries which have high deformation energies. Hence, we implemented training of ANI machine learning potentials using the energy weighting function that downweights the importance of PES regions with high deformation energies. We also provide recommendations, tutorials, and training scripts on an example of the glycine global PES.

See our paper for more details (please also cite it if you use this feature):

The code snippet to give an idea how to use this new feature (see the dedicated tutorial):

import torch
import mlatom as ml
# ...
# define the weighting function
def weighting_function(energy_reference, a):
    # a is a parameter defining the shape of the function
    global_minimum = -284.33376035
    # scaled deformation energy relative to the global minimum
    x = a * (torch.as_tensor(energy_reference) - global_minimum)
    w = -6 * x**5 + 15 * x**4 - 10 * x**3 + 1
    # keep the weights strictly positive
    w = torch.clamp(w, min=0.000001)
    return w

# train the ANI model with the energy weighting function
# get subsets subtrain_molDB and validate_molDB from somewhere (not shown here)
ani = ml.models.ani(model_file=f"glycine_ani_a2.15.pt")
ani.train(
    molecular_database=subtrain_molDB,
    validation_molecular_database=validate_molDB,
    property_to_learn='energy',
    xyz_derivative_property_to_learn='energy_gradients',
    energy_weighting_function=weighting_function,
    energy_weighting_function_kwargs={'a': 2.15},
    hyperparameters={'lrReducePatience': 32}
)
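To see what the weighting function does, it helps to evaluate the polynomial on its own. With x = a·(E − E_min), the weight w(x) = 1 − 10x³ + 15x⁴ − 6x⁵ decreases monotonically from 1 at x = 0 to 0 at x = 1 and is negative beyond that, so the clamp keeps it at the small positive floor. The sketch below is a plain-Python version without torch; the sample deformation energies are arbitrary and are given in the units of the reference energies.

```python
def energy_weight(x, floor=0.000001):
    """Polynomial weight w(x) = 1 - 10x^3 + 15x^4 - 6x^5, clamped from below."""
    w = -6 * x**5 + 15 * x**4 - 10 * x**3 + 1
    return max(w, floor)

a = 2.15  # shape parameter from the snippet above
for deformation_energy in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    x = a * deformation_energy
    print(f"dE = {deformation_energy:.1f}  x = {x:.3f}  w = {energy_weight(x):.6f}")
```

A larger a makes the weight drop to the floor at smaller deformation energies, since the cutoff x = 1 corresponds to a deformation energy of 1/a.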

Diffusion Monte Carlo

Diffusion Monte Carlo (DMC) is a computationally expensive way of obtaining accurate frequencies and ZPVEs, and hence, the use of machine learning potentials is indispensable. We interfaced MLatom to the great code PyVibDMC to enable DMC simulations with different models.

See our paper for an example of DMC calculations for glycine conformers (please also cite it and PyVibDMC if you use this feature):

  • F. Ge, R. Wang, C. Qu, P. Zheng, A. Nandi, R. Conte, P. L. Houston, J. M. Bowman, P. O. Dral. Tell machine learning potentials what they are needed for: Simulation-oriented training exemplified for glycine. 2024, 15, 4451–4460. Preprint on arXiv: https://arxiv.org/abs/2403.11216.

The code snippet to give an idea how to use this new feature (see the dedicated tutorial):

import mlatom as ml
# ...
dmc=ml.simulations.dmc(model=model, initial_molecule=conf)
dmc.run(number_of_walkers=30000, number_of_timesteps=55000)
print(f'ZPVE: {(dmc.get_zpe(start_step=-1000) + 284.33355671) * 219474.63} cm-1')
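The last line of the snippet converts an energy difference from Hartree to wavenumbers: the DMC zero-point energy minus the energy of the minimum, multiplied by 219474.63 cm-1 per Hartree. Below is a standalone sketch of that arithmetic, with a made-up DMC energy standing in for the value returned by dmc.get_zpe(...); the function name is hypothetical.

```python
HARTREE_TO_CM1 = 219474.63  # conversion factor used in the snippet above

def zpve_in_cm1(dmc_energy, minimum_energy):
    """Zero-point vibrational energy in cm-1 from two energies in Hartree."""
    return (dmc_energy - minimum_energy) * HARTREE_TO_CM1

minimum_energy = -284.33355671   # value used in the snippet above
dmc_energy = -284.25500000       # hypothetical DMC energy, for illustration only
print(f"ZPVE: {zpve_in_cm1(dmc_energy, minimum_energy):.1f} cm-1")
```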

AIQM1 on the XACS cloud: much faster and more stable

Yes, we know, AIQM1 has been too slow and numerically unstable on the XACS cloud. It is frustrating to both us and you. The reason is that, for license reasons, we cannot use the MNDO program, which has analytical gradients, on the cloud. Instead, we rely on Sparrow, which is great software for many semi-empirical approaches but unfortunately still has no analytical gradients for AIQM1 (it has them for many other methods…). Thus, as a poor man’s solution, we tweaked the calculations based on numerical derivatives, which greatly improved both the speed and stability of AIQM1 on the cloud for geometry optimizations and, to some extent, also for frequency and thermochemical calculations (see below). With more CPUs available for parallelization, AIQM1 on the cloud might be even faster than with the single-CPU MNDO program!

Geometries are good for the tested molecules when optimized with the Sparrow-based AIQM1. Frequencies and thermochemistry are OK for small molecules like hydrogen or methane but not for bigger molecules like vinylacetylene, where the output may have many negative frequencies which are purely an artifact of errors introduced by numerical differentiation. Thus, when you use Sparrow-based AIQM1, please check the frequencies: if there are too many negative ones, the frequencies and thermochemistry are not reliable. Currently, the only solution is to use MNDO-based AIQM1, but we have some methods in the works that should improve the situation in future releases.

Please give another try to AIQM1 on the XACS cloud!
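The frequency check recommended above is easy to automate once the frequencies are available as a list of numbers. The sketch below is generic Python; the function name, the tolerance argument, and the example frequencies are hypothetical, and how the frequencies are read from the MLatom output is left open.

```python
def count_imaginary_modes(frequencies, tolerance=0.0):
    """Count frequencies (cm-1) reported as negative, i.e. imaginary modes."""
    return sum(1 for frequency in frequencies if frequency < -tolerance)

# Hypothetical frequency list in cm-1; a true minimum should have none below zero
frequencies = [-312.4, -45.0, 88.1, 950.3, 1620.7, 3105.2]
n_imaginary = count_imaginary_modes(frequencies)
if n_imaginary > 0:
    print(f"{n_imaginary} negative frequencies: treat thermochemistry with caution")
```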

New keyword freqprog. Frequencies and thermochemistry with PySCF

Now you can run frequency and thermochemistry calculations with PySCF by using freqprog=pyscf in the command-line version of MLatom. In Python, you can use the similar options mlatom.freq(program='PySCF', ...) and mlatom.thermochemistry(program='PySCF', ...). This release also introduces the new keyword freqprog for command-line use and, hence, resolves the inconsistency of previous releases, where users had to choose the program for frequency calculations with the odd-looking optprog, which, as its name suggests, should only be used for choosing the optimization program.

Using PySCF has several advantages: it is open-source and can be used on our XACS cloud (where it is now the default option), and it handles frequencies much more consistently than our previous implementation based on TorchANI (and ASE for thermochemistry). PySCF automatically removes the rotational and translational frequencies; these can be particularly erroneous and mess up the thermochemistry if you use numerical gradients and Hessians (as we do on the cloud for AIQM1). If you want to see those, however, our old implementation is the way to go (use freqprog=ASE).

PySCF is now the default program if Gaussian is not detected. The old workhorse Gaussian still works like a charm and, hence, remains the default option in MLatom if you have it.

ORCA interface

ORCA is a popular program and we use it to generate data for our ML models. Hence, we implemented what we needed so far. The current interface works nicely for DFT calculations, but if you want to do something else, the interface has some dirty tricks which you can find in the API manual (these might change in the future, though, if someone wants to improve the interface).

An example of using ORCA in the input file:

method=B3LYP/6-31G*
qmprog=Orca
geomopt
xyzfile=init.xyz
optxyz=opt.xyz

And in Python API:

import mlatom as ml
# ...
dft_with_orca = ml.models.methods(method='B3LYP/6-31G*', program='Orca')

CCSD(T)*/CBS for charged and open-shell species

We use CCSD(T)*/CBS to generate data and test ML models. It is an expensive but rather accurate approach. So far, our implementation only supported closed-shell, neutral molecules because we didn’t need anything else for our research. With the new ORCA interface, we rewrote the old code, and the new one naturally supports charged and open-shell species.

If you want to try it out for this purpose, the input file would look like:

CCSD(T)*/CBS
yestfile=energy.dat
xyzfile=init.xyz
charges=1,0
multiplicities=2,1

In Python:

import mlatom as ml
# ...
ccsd = ml.models.methods(method='CCSD(T)*/CBS')
mol.charge=1 ; mol.multiplicity=2
ccsd.predict(molecule=mol)
print(mol.energy)

Geometry optimization with constraints

We use the ASE implementation of constraints, so this is only available when you use the ASE optimizer via the Python API. It needs one more argument, constraints, which should be used like this: constraints={'bonds':[[target,[index0,index1]], ...],'angles':[[target,[index0,index1,index2]], ...],'dihedrals':[[target,[index0,index1,index2,index3]], ...]} (check the FixInternals class in ASE for more information). Bond lengths are in Angstrom; angles and dihedrals are in degrees. Note that atom indices start from 0!

Below is an example of optimizing ethane with AIQM1 while constraining the C-C bond length to 1.8 Angstrom.

The initial XYZ coordinates are:

8

C                 -3.41779278   -2.06081078    0.00000000
H                 -3.06113836   -3.06962078    0.00000000
H                 -3.06111994   -1.55641259   -0.87365150
H                 -4.48779278   -2.06079760    0.00000000
C                 -2.90445057   -1.33485451    1.25740497
H                 -1.83445237   -1.33656157    1.25838332
H                 -3.25950713   -0.32548149    1.25642758
H                 -3.26271953   -1.83812251    2.13105517

In Python script:

import mlatom as ml
mol = ml.data.molecule.from_xyz_file('ethane_initial.xyz')
print(f"Initial C-C bond length: {mol.get_internuclear_distance_matrix()[0][4]} Angstrom")
constraints = {'bonds':[[1.8,[0,4]]]}
aiqm1 = ml.models.methods(method='AIQM1',qm_program='sparrow')
geomopt = ml.optimize_geometry(model=aiqm1,initial_molecule=mol,program='ase',constraints=constraints)
optmol = geomopt.optimized_molecule
print(f"Final C-C bond length: {optmol.get_internuclear_distance_matrix()[0][4]} Angstrom")
print("XYZ coordinates:")
print(optmol.get_xyz_string())

The output should look like this:

Initial C-C bond length: 1.5399999964612658 Angstrom
       Step     Time          Energy         fmax
LBFGS:    0 10:11:55    -2169.358192        0.6954
LBFGS:    1 10:11:56    -2168.493055        1.5444
LBFGS:    2 10:11:56    -2168.324507        2.1746
LBFGS:    3 10:11:57    -2168.702277        0.3925
LBFGS:    4 10:11:57    -2168.719565        0.3556
LBFGS:    5 10:11:58    -2168.744194        0.4116
LBFGS:    6 10:11:58    -2168.765605        0.3914
LBFGS:    7 10:11:58    -2168.779525        0.2010
LBFGS:    8 10:11:59    -2168.782649        0.0282
LBFGS:    9 10:11:59    -2168.782767        0.0237
LBFGS:   10 10:12:00    -2168.782733        0.0370
LBFGS:   11 10:12:00    -2168.782749        0.0236
LBFGS:   12 10:12:00    -2168.782623        0.0873
LBFGS:   13 10:12:01    -2168.782774        0.0203
LBFGS:   14 10:12:01    -2168.782777        0.0217
LBFGS:   15 10:12:01    -2168.782654        0.0988
LBFGS:   16 10:12:02    -2168.782762        0.0299
LBFGS:   17 10:12:02    -2168.782723        0.0372
LBFGS:   18 10:12:03    -2168.782762        0.0250
LBFGS:   19 10:12:03    -2168.777521        0.5284
LBFGS:   20 10:12:03    -2168.782764        0.0369
LBFGS:   21 10:12:04    -2168.782758        0.0397
LBFGS:   22 10:12:04    -2168.781062        0.1423
LBFGS:   23 10:12:04    -2168.782774        0.0375
LBFGS:   24 10:12:05    -2168.782774        0.0251
LBFGS:   25 10:12:05    -2168.782670        0.0433
LBFGS:   26 10:12:06    -2168.782769        0.0209
LBFGS:   27 10:12:06    -2168.782777        0.0177
Final C-C bond length: 1.799999998386536 Angstrom
XYZ coordinates:
8

C            -3.4609594100000          -2.1223293000000          -0.1060679300000
H            -3.0827425100000          -3.1376804600000          -0.0751292200000
H            -3.0834089300000          -1.5865323800000          -0.9697472700000
H            -4.5445733300000          -2.1041917000000          -0.0741819000000
C            -2.8607234100000          -1.2737628100000           1.3635074000000
H            -1.7772071700000          -1.2935363400000           1.3338165800000
H            -3.2376294200000          -0.2576124300000           1.3304506900000
H            -3.2417292800000          -1.8070164100000           2.2269711900000
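The bond lengths reported in this output can be cross-checked directly from the listed coordinates. The sketch below uses only the Python standard library and the carbon coordinates copied from the initial and optimized geometries above. It also unpacks the constraints dictionary to show how the target and the 0-based atom indices are laid out; the variable names are illustrative.

```python
import math

# Carbon atoms (indices 0 and 4, counting from 0) from the listings above
initial_carbons = [(-3.41779278, -2.06081078, 0.00000000),
                   (-2.90445057, -1.33485451, 1.25740497)]
optimized_carbons = [(-3.46095941, -2.12232930, -0.10606793),
                     (-2.86072341, -1.27376281, 1.36350740)]

initial_cc = math.dist(*initial_carbons)
final_cc = math.dist(*optimized_carbons)

# Same layout as the constraints argument: [target, [index0, index1]]
constraints = {'bonds': [[1.8, [0, 4]]]}
target, (i, j) = constraints['bonds'][0]

print(f"Initial C-C: {initial_cc:.4f} Angstrom")
print(f"Final C-C:   {final_cc:.4f} Angstrom (target {target}, atoms {i} and {j})")
```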

License and citations for MLatom 3.2.0

License

MLatom is an open-source software under the MIT license (modified to request proper citations).

Copyright (c) 2013- Pavlo O. Dral (dr-dral.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. When this Software or its derivatives are used in scientific publications, it shall be cited as:

The citations for MLatom’s interfaces and features shall be eventually included too. See program output, header.py, ref.json, and MLatom.com.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Citations

Citations mentioned above should be included. For convenience, below we provide the citations in the Bibtex format and you can also download EndNote files.

@article{MLatom3,
author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Barbatti, Mario and Isayev, Olexandr and Wang, Cheng and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Ou, Yanchi},
title = {MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows},
journal = {J. Chem. Theory Comput.},
volume = {20},
number = {3},
pages = {1193--1213},
DOI = {10.1021/acs.jctc.3c01203},
year = {2024},
type = {Journal Article}
}

@article{MLatom2,
author = {Dral, Pavlo O. and Ge, Fuchun and Xue, Bao-Xin and Hou, Yi-Fan and Pinheiro Jr, Max and Huang, Jianxing and Barbatti, Mario},
title = {MLatom 2: An Integrative Platform for Atomistic Machine Learning},
journal = {Top. Curr. Chem.},
volume = {379},
number = {4},
pages = {27},
DOI = {10.1007/s41061-021-00339-5},
year = {2021},
type = {Journal Article}
}

@article{MLatom1,
author = {Dral, Pavlo O.},
title = {MLatom: A Program Package for Quantum Chemical Research Assisted by Machine Learning},
journal = {J. Comput. Chem.},
volume = {40},
number = {26},
pages = {2339--2347},
DOI = {10.1002/jcc.26004},
year = {2019},
type = {Journal Article}
}

@misc{MLatomProg,
author = {Dral, Pavlo O. and Ge, Fuchun and Hou, Yi-Fan and Zheng, Peikun and Chen, Yuxinxin and Xue, Bao-Xin and Pinheiro Jr, Max and Su, Yuming and Dai, Yiheng and Chen, Yangtao and Zhang, Shuang and Zhang, Lina and Ullah, Arif and Zhang, Quanhao and Ou, Yanchi},
title = {MLatom: A Package for Atomistic Simulations with Machine Learning},
year = {2013--2024},
type = {Computer Program}
}

Older releases

You can find older releases on the legacy page.

Support and contact

If you have further questions, criticism, or suggestions, we would be happy to receive them in English or Chinese via email, Slack (preferred), or WeChat (please send an email to request being added to the XACS user support group).