Installation
The easiest way to use MLatom is to run it on the XACS cloud rather than install it locally. If you want to install it anyway, follow the instructions below. You can also watch a short video demonstrating how to install and use MLatom.
MLatom is a Python package and can be easily installed and upgraded using pip:
pip install --upgrade mlatom
You also need to install the required dependencies in your Python environment as described below.
Dependencies
Minimal setup
pip install numpy scipy torch torchani tqdm matplotlib statsmodels h5py pyh5md
Useful optional modules
pip install sgdml rmsd openbabel xgboost scikit-learn pyscf rdkit pandas \
ase fortranformat tensorflow geometric
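After running the pip commands above, it can help to confirm which modules are actually visible to your Python interpreter. The following is a minimal sketch, not part of MLatom: `MINIMAL_MODULES` and `missing_modules` are hypothetical names introduced here, and the list assumes that each import name equals its pip name. That holds for the minimal setup, but not for every optional module (scikit-learn, for instance, is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the "Minimal setup" pip command in this document.
# Assumption: import name == pip name (true for this list only).
MINIMAL_MODULES = ["numpy", "scipy", "torch", "torchani", "tqdm",
                   "matplotlib", "statsmodels", "h5py", "pyh5md"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MINIMAL_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All minimal-setup modules were found.")
```

`find_spec` only locates a top-level module without importing it, so the check stays fast even for heavy packages such as torch.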
Full Anaconda environment setup:
Download mlatom.yml, then run:
conda env create -n mlatom --file mlatom.yml
conda activate mlatom
Additional software packages
Many MLatom features rely on third-party software packages, many of which are not Python modules, that must be installed and set up separately as described here.
The third-party packages below are optional and can be installed separately to enable specific features. In alphabetical order:
ASE (can be used for geometry optimizations and thermochemistry)
COLUMBUS (required for CASSCF)
DeePMD-kit (enables several machine learning potentials implemented there)
dftd4 (required for the D4 dispersion correction used in AIQM1, ANI-2x-D4, and ANI-1x-D4)
GAP and QUIP (required for GAP-SOAP machine learning potential)
Gaussian (can be used for QM calculations, geometry optimizations, frequencies and thermochemistry, required for IRC and anharmonic frequencies)
hyperopt (can be used for hyperparameter optimization)
MACE (required for the MACE potential)
MNDO (can be used and is recommended for AIQM1 and many other semi-empirical QM methods)
Newton-X (required for UV/vis spectra simulations)
Orca (required for CCSD(T)*/CBS and can be used for DFT calculations)
PhysNet (required for PhysNet potential)
sGDML (required for sGDML potential)
Sparrow (can be used and is recommended for AIQM1 and many other semi-empirical QM methods)
TorchANI (required for AIQM1 and ANI potentials)
Turbomole (required for ADC(2) calculations)
COLUMBUS
COLUMBUS 7 is required for CASSCF calculations. It can be obtained and installed as described on the program website.
It must be made available to MLatom by setting the environment variable COLUMBUS, e.g.:
export COLUMBUS=[path to COLUMBUS directory with runc executable]
Turbomole
Turbomole is required for ADC(2) calculations. It can be obtained and installed as described on the program website.
It must be made available to MLatom by setting the environment variable TURBODIR.
Orca
Orca is required for CCSD(T)*/CBS calculations and can be used for DFT calculations. Here we use Orca 4.2.0. It can be obtained and installed as described on the program website.
It must be made available to MLatom by setting the environment variable orcabin, e.g.:
export orcabin=[path to Orca executable]
Newton-X
Newton-X is required for ML-NEA calculations.
Install Newton-X (NX), preferably version 2.2, with which our implementation was tested.
Define the $NX environment variable, e.g.:
export NX=/path/to/Newton-X
TorchANI
TorchANI is required for calculations with AIQM1 and ANI family of potentials.
Install NumPy and a nightly version of PyTorch (if you do not have them already):
pip install numpy tensorboard
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html
Install TorchANI:
pip install torchani
Visit https://aiqm.github.io/torchani/ for more information. The latest version of TorchANI used for testing was v2.2; if you run into problems with the newest version of TorchANI, install the tested version with pip install torchani==2.2. The CUDA extension for AEV calculation is currently not supported for the NN part of AIQM1 and ANI-1ccx.
DeePMD-kit
Required for DPMD and DeepPot-SE potentials.
Download the installer for DeePMD-kit from GitHub (tested with v1.2.2).
Run the installer.
Add the environment variable $DeePMDkit pointing to the directory where the dp binary is located (bin/ in your installation directory), e.g.:
export DeePMDkit=/export/home/fcge/deepmd-kit-1.2/bin
GAP and QUIP
Required for GAP-SOAP potentials.
Compile QUIP and GAP from source:
1.1 Install prerequisites
sudo apt-get install gcc gfortran python python-pip libblas-dev liblapack-dev # for systems that use apt; do the equivalent on your OS
pip install numpy ase f90wrap
1.2 Get the source code of QUIP and GAP
git clone --recursive https://github.com/libAtoms/QUIP.git
Get the source code of GAP from http://www.libatoms.org/gap/gap_download.html (form-filling required).
Then put the GAP source code in QUIP/src/.
1.3 Compile
cd QUIP
export QUIP_ARCH=linux_x86_64_gfortran_openmp # enables multi-threading; use 'export QUIP_ARCH=linux_x86_64_gfortran' if OpenMP is unavailable (no multi-threading)
export QUIPPY_INSTALL_OPTS=--user # omit for a system-wide installation
make config
Enter Y for GAP when prompted, or edit build/$QUIP_ARCH/Makefile.inc to set HAVE_GAP=1, then run: make.
The built binaries are QUIP/build/$QUIP_ARCH/quip and QUIP/build/$QUIP_ARCH/gap_fit, where $QUIP_ARCH is the architecture chosen above (e.g. linux_x86_64_gfortran).
Add the environment variables $quip and $gap_fit pointing to the quip and gap_fit binaries, e.g.:
export quip='/export/home/fcge/GAP-SOAP/QUIP/build/linux_x86_64_gfortran_openmp/quip'
export gap_fit='/export/home/fcge/GAP-SOAP/QUIP/build/linux_x86_64_gfortran_openmp/gap_fit'
Visit https://libatoms.github.io/GAP/index.html for more information.
PhysNet
Required for PhysNet models.
Clone PhysNet from its GitHub page:
git clone https://github.com/MMunibas/PhysNet.git
Install TensorFlow:
pip install tensorflow
If you use TensorFlow v2, run the command below in PhysNet’s directory to make the scripts compatible with TF v2:
for i in `find . -name '*.py'`; do sed -i -e 's/import tensorflow as tf/import tensorflow.compat.v1 as tf\ntf.disable_v2_behavior()/g' "$i"; done
Add the environment variable $PhysNet pointing to the PhysNet directory, e.g.:
export PhysNet=/export/home/fcge/PhysNet/
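The TensorFlow v2 compatibility step above relies on GNU sed behaviour (in-place editing with a bare -i and \n in the replacement), which may not be available on every system. As a portable alternative, the same textual substitution can be sketched in Python. This is only an illustration under that assumption: `patch_tf_imports` is a hypothetical helper written for this guide and is not shipped with PhysNet or MLatom.

```python
from pathlib import Path

OLD_IMPORT = "import tensorflow as tf"
NEW_IMPORT = "import tensorflow.compat.v1 as tf\ntf.disable_v2_behavior()"


def patch_tf_imports(root):
    """Rewrite the TensorFlow import in every .py file under root.

    Performs the same text substitution as the sed command in this
    section and returns the list of files that were changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text()
        if OLD_IMPORT in text:
            path.write_text(text.replace(OLD_IMPORT, NEW_IMPORT))
            changed.append(path)
    return changed

# Usage, from inside the PhysNet directory:
#     patch_tf_imports(".")
```

Because the replacement text no longer contains the original import line, running the function a second time leaves already patched files untouched.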
sGDML
Required for GDML and sGDML potentials.
Install sGDML:
pip install sgdml==0.4.4
Set the environment variable $sGDML to the path of the sgdml binary, e.g.:
export sGDML=/export/home/fcge/.linuxbrew/bin/sgdml
Visit http://quantum-machine.org/gdml/doc/ for more information.
Note
In our tests we found that installation is more stable with: pip install scipy==1.7.1.
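Several Python packages in this guide come with a version that was tested or found more stable (TorchANI 2.2, sGDML 0.4.4, SciPy 1.7.1). A quick way to see how your environment compares is sketched below. The names `TESTED_VERSIONS`, `installed_version`, and `compare_with_tested` are hypothetical helpers introduced for this example, and the keys are assumed to match the pip distribution names used in the commands of this document.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions this document reports as tested or more stable.
# Assumption: keys are the pip distribution names used in this guide.
TESTED_VERSIONS = {"torchani": "2.2", "sgdml": "0.4.4", "scipy": "1.7.1"}


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def compare_with_tested(tested=TESTED_VERSIONS):
    """Map each package to (installed version or None, tested version)."""
    return {name: (installed_version(name), wanted)
            for name, wanted in tested.items()}


if __name__ == "__main__":
    for name, (found, wanted) in compare_with_tested().items():
        print(f"{name}: installed {found or 'not installed'}, tested {wanted}")
```

The sketch only reports both versions side by side and does not decide whether a mismatch is a problem, since newer releases may work as well.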
MACE
Required for MACE potentials.
Clone the MACE repository from GitHub:
git clone https://github.com/ACEsuit/mace.git
Install MACE with pip:
pip install ./mace
Warning
We tested MACE version 0.3.2; please use this version if you encounter any problems.
MNDO
The MNDO program is required to provide the ODM2* part of AIQM1. Alternatively, a development version of SCINE Sparrow can be used. At the moment it has no analytical derivatives for the ODM2* part, so only single-point calculations are recommended; see the paper on Sparrow for details. The development version of Sparrow also implements single-point AIQM1 calculations.
The free binary and open-source code of the MNDO program are available from the official distributors of the MNDO code as described at https://mndo.kofo.mpg.de.
After the MNDO program is installed, set the environment variable mndobin to point to the MNDO executable (typically mndo99), e.g., in bash:
export mndobin=[path to the executable]/mndo99
Sparrow
The SCINE Sparrow program can be used to provide many of the semi-empirical methods. See its website for the installation instructions.
Note that a development version can also be used instead of the MNDO program to provide the ODM2* part of AIQM1; its biggest limitation is that it currently has no analytical gradients. See the paper on Sparrow for details. The development version of Sparrow also implements single-point AIQM1 calculations. Because this version is difficult to install, we recommend using it for AIQM1 single-point calculations on the XACS cloud.
dftd4
The dftd4 program is required to provide the D4 part of AIQM1.
The dftd4 program can be obtained both as an executable and as open-source code. We recommend using dftd4 v3.5.0 (dftd4 v2.5.0 for MLatom versions earlier than 3.0.1), which can calculate the Hessian needed for thermochemical calculations. To install the dftd4 program from source code, see the README.md file on the dftd4 GitHub page.
After the dftd4 program is installed, set the environment variable dftd4bin to point to the dftd4 executable, e.g., in bash:
export dftd4bin=[path to the executable]/dftd4
Gaussian
Required for geometry optimizations, frequency calculations, TS search, IRC, thermochemistry, and ML-NEA. For some of these tasks, ASE can be used instead; see below.
Our implementation works with both Gaussian 09 and Gaussian 16. Gaussian is a commercial program, which must be obtained and installed separately.
To use the Gaussian interface, make sure that your environment variable $GAUSS_EXEDIR points to the directory containing the Gaussian executables.
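Most of the external programs described in the sections above are located through environment variables, so a misspelled or stale variable is a common source of trouble. The sketch below collects the variable names given in this guide and reports which ones are unset or point to a path that does not exist. It is an illustration only: `MLATOM_ENV_VARS` and `check_env` are hypothetical names, and this is not how MLatom itself performs the lookup.

```python
import os

# Environment variables named in this guide, keyed by the program they locate.
MLATOM_ENV_VARS = {
    "COLUMBUS": "COLUMBUS",
    "Turbomole": "TURBODIR",
    "Orca": "orcabin",
    "Newton-X": "NX",
    "DeePMD-kit": "DeePMDkit",
    "QUIP": "quip",
    "GAP": "gap_fit",
    "PhysNet": "PhysNet",
    "sGDML": "sGDML",
    "MNDO": "mndobin",
    "dftd4": "dftd4bin",
    "Gaussian": "GAUSS_EXEDIR",
}


def check_env(variables=MLATOM_ENV_VARS, environ=None):
    """Return {program: status} with status 'unset', 'missing path', or 'ok'."""
    environ = os.environ if environ is None else environ
    report = {}
    for program, var in variables.items():
        value = environ.get(var)
        if not value:
            report[program] = "unset"
        # A value may hold several paths joined by os.pathsep;
        # accept it if at least one of them exists.
        elif any(os.path.exists(p) for p in value.split(os.pathsep) if p):
            report[program] = "ok"
        else:
            report[program] = "missing path"
    return report


if __name__ == "__main__":
    for program, status in check_env().items():
        print(f"{program:12s} {status}")
```

Since every package here is optional, an "unset" entry is only a problem if you need the corresponding feature.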
ASE
Required for geometry optimizations, frequency calculations, and thermochemistry. Alternatively, Gaussian can be used; see above.
ASE (Atomic Simulation Environment) is a set of Python modules that can be installed as described on the ASE website, e.g.:
pip install ase
hyperopt
To enable hyperopt, install the hyperopt package with pip install hyperopt.