.. _installation:

Installation
====================
The easiest way to use MLatom is not to install it locally but to run it on the :doc:`XACS cloud <cloud>`.
If you want to install it anyway, please follow the instructions below.
You can also watch a short video demonstrating how to install and use MLatom.

.. raw:: html

    <iframe src="https://player.bilibili.com/player.html?isOutside=true&bvid=BV1PC4y1q7gX&p=1&autoplay=0" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true" height=500 width=800></iframe>

MLatom is a Python package and can be easily installed and upgraded using pip:

.. code-block:: bash

   pip install --upgrade mlatom

You also need to install the required dependencies in your Python environment as described below.

Dependencies
------------

**Minimal setup**

.. code-block::
    
   pip install numpy scipy torch torchani tqdm matplotlib statsmodels h5py pyh5md

**Useful optional modules**

.. code-block::
   
   pip install sgdml rmsd openbabel xgboost scikit-learn pyscf rdkit pandas \
      ase fortranformat tensorflow geometric
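
To see which of the modules listed above are still missing from the current environment, you can ask Python whether it can locate them. The sketch below is not part of MLatom. It assumes that each import name equals the pip package name, except for scikit-learn, which is imported as ``sklearn``; the helper name ``missing_modules`` is made up for this example.

.. code-block:: python

   from importlib.util import find_spec

   # Import names are assumed to match the pip package names above,
   # except scikit-learn, which is imported as "sklearn".
   MINIMAL = ["numpy", "scipy", "torch", "torchani", "tqdm", "matplotlib",
              "statsmodels", "h5py", "pyh5md"]
   OPTIONAL = ["sgdml", "rmsd", "openbabel", "xgboost", "sklearn", "pyscf",
               "rdkit", "pandas", "ase", "fortranformat", "tensorflow",
               "geometric"]

   def missing_modules(names):
       """Return the names that cannot be located in this environment."""
       # find_spec only looks the module up; it does not import it.
       return [name for name in names if find_spec(name) is None]

   if __name__ == "__main__":
       print("missing (minimal): ", missing_modules(MINIMAL))
       print("missing (optional):", missing_modules(OPTIONAL))

An empty list for the minimal set means the basic setup is complete; entries in the optional list only matter if you need the corresponding feature.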

**Full Anaconda environment setup:**

Download :download:`mlatom.yml <files/mlatom.yml>`.

Then,

.. code-block::

   conda env create -n mlatom --file mlatom.yml
   conda activate mlatom

Additional software packages
----------------------------

Many MLatom features rely on third-party software packages that are not Python modules and must be installed and set up separately as `described here <http://mlatom.com/download/#Installation_of_third-party_packages>`_.

The third-party packages below are optional and can be installed separately to enable specific features. In alphabetical order:

- :ref:`ASE <install_ASE>` (can be used for geometry optimizations and thermochemistry)
- :ref:`COLUMBUS <install_COLUMBUS>` (required for CASSCF)
- :ref:`DeePMD-kit <install_DeePMD-kit>` (enables several machine learning potentials implemented there)
- :ref:`dftd4 <install_dftd4>` (required for the D4 dispersion correction, required for AIQM1, ANI-2x-D4, and ANI-1x-D4)
- :ref:`GAP and QUIP <install_GAP_QUIP>` (required for GAP-SOAP machine learning potential)
- :ref:`Gaussian <install_Gaussian>` (can be used for QM calculations, geometry optimizations, frequencies and thermochemistry, required for IRC and anharmonic frequencies)
- :ref:`hyperopt <install_hyperopt>` (can be used for hyperparameter optimization)
- :ref:`MACE <install_MACE>` (required for the MACE potential)
- :ref:`MNDO <install_MNDO>` (can be used & recommended for AIQM1 and many other semi-empirical QM methods)
- :ref:`Newton-X <install_Newton-X>` (required for UV/vis spectra simulations)
- :ref:`Orca <install_Orca>` (required for CCSD(T)*/CBS and can be used for DFT calculations)
- :ref:`PhysNet <install_PhysNet>` (required for PhysNet potential)
- :ref:`sGDML <install_sGDML>` (required for sGDML potential)
- :ref:`Sparrow <install_Sparrow>` (can be used & recommended for AIQM1 and many other semi-empirical QM methods)
- :ref:`TorchANI <install_TorchANI>` (required for AIQM1 and ANI potentials)
- :ref:`Turbomole <install_Turbomole>` (required for ADC(2) calculations)
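
Most of the packages below are located through environment variables. As a rough overview, the following sketch reports which of those variables are unset in the current shell. The variable names are the ones given in the sections below; the function name ``unset_variables`` and the dictionary layout are assumptions made for this example, and the script does not check that the paths themselves are valid.

.. code-block:: python

   import os

   # package -> environment variable, as named in the sections below
   VARIABLES = {
       "COLUMBUS": "COLUMBUS",
       "Turbomole": "TURBODIR",
       "Orca": "orcabin",
       "DFTB+": "skfiles",
       "Newton-X": "NX",
       "DeePMD-kit": "DeePMDkit",
       "QUIP": "quip",
       "GAP": "gap_fit",
       "PhysNet": "PhysNet",
       "sGDML": "sGDML",
       "MNDO": "mndobin",
       "dftd4": "dftd4bin",
       "Gaussian": "GAUSS_EXEDIR",
   }

   def unset_variables(variables, environ=None):
       """Return {package: variable} for variables that are unset or empty."""
       environ = os.environ if environ is None else environ
       return {pkg: var for pkg, var in variables.items()
               if not environ.get(var)}

   if __name__ == "__main__":
       for pkg, var in unset_variables(VARIABLES).items():
           print(f"{pkg}: ${var} is not set")

Since all of these packages are optional, an unset variable is only a problem if you plan to use the corresponding feature.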

.. _install_COLUMBUS:

COLUMBUS
++++++++

COLUMBUS 7 is required for CASSCF calculations.
It can be obtained and installed as described on the `program website <https://columbus-program-system.gitlab.io/columbus/>`__.

It must be made available to MLatom by setting the environmental variable ``COLUMBUS``, e.g.:

``export COLUMBUS=[path to COLUMBUS directory with runc executable]``

.. _install_Turbomole:

Turbomole
+++++++++

Turbomole is required for ADC(2) calculations.
It can be obtained and installed as described on the `program website <https://www.turbomole.org/>`__.

It must be made available to MLatom by setting the environmental variable ``TURBODIR``.

.. _install_Orca:

Orca
+++++++++

Orca is required for CCSD(T)*/CBS calculations and can be used for DFT calculations. Here we use Orca 4.2.0.
It can be obtained and installed as described on the `program website <https://www.kofo.mpg.de/en/research/services/orca>`__.

It must be made available to MLatom by setting the environmental variable ``orcabin``, e.g.:

``export orcabin=[path to Orca executable]``
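
A common mistake is to point such a variable at a directory or at a file without execute permission. The sketch below checks that ``orcabin`` holds the path of an executable file. It is an illustration only: the helper ``executable_from_env`` is hypothetical and says nothing about how MLatom itself validates the variable.

.. code-block:: python

   import os

   def executable_from_env(variable, environ=None):
       """Return the path stored in `variable` if it is an executable file."""
       environ = os.environ if environ is None else environ
       path = environ.get(variable, "")
       if path and os.path.isfile(path) and os.access(path, os.X_OK):
           return path
       return None  # unset, empty, a directory, or not executable

   if __name__ == "__main__":
       found = executable_from_env("orcabin")
       print("orcabin ->", found or "not set or not an executable file")

The same check can be applied to other variables that are documented as pointing to an executable, such as ``mndobin`` or ``dftd4bin``.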

.. _install_dftbplus:

DFTB+
+++++

Using the conda command is the recommended installation method:

``conda install conda-forge::dftbplus``

Alternatively, you can also install it by downloading the binary from https://www.dftbplus.org/download/stable.html or compiling from source:

.. code-block:: bash

   git clone https://github.com/dftbplus/dftbplus.git
   cd dftbplus
   cmake -B _build .   # configure step; see the DFTB+ install notes for options
   cmake --build _build -- -j
   cmake --install _build

After installation, you need to download the `DFTB+ parameter files <https://dftb.org/parameters>`__ (also called Slater--Koster files). Once downloaded, extract the files and point the ``skfiles`` environment variable to their location, e.g. in bash:

``export skfiles=/path/to/SK/files``
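
To check that ``skfiles`` points at the extracted parameter set, you can list the element pairs found in that directory. The sketch below assumes the common ``X-Y.skf`` file naming (for example ``C-H.skf``); parameter sets that use a different naming scheme will not be recognized. The helper name ``skf_pairs`` is made up for this example.

.. code-block:: python

   import os
   import re

   # Assumption: Slater-Koster files are named "<element>-<element>.skf".
   _SKF_NAME = re.compile(r"^([A-Z][a-z]?)-([A-Z][a-z]?)\.skf$")

   def skf_pairs(directory):
       """Return the sorted element pairs that have an X-Y.skf file."""
       pairs = []
       for name in os.listdir(directory):
           match = _SKF_NAME.match(name)
           if match:
               pairs.append((match.group(1), match.group(2)))
       return sorted(pairs)

   if __name__ == "__main__":
       directory = os.environ.get("skfiles")
       if directory and os.path.isdir(directory):
           print(skf_pairs(directory))
       else:
           print("skfiles is not set or is not a directory")

An empty list suggests that the variable points to the wrong level of the extracted archive.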

.. _install_Newton-X:

Newton-X
++++++++

`Newton-X <https://newtonx.org/>`__ is required for `ML-NEA <https://doi.org/10.1021/acs.jpca.0c05310>`_ calculations.

1. Install `Newton-X <https://amubox.univ-amu.fr/s/aBMnEq7dXeZPm2H>`__ (NX; preferably version 2.2, for which our implementation was tested).
2. Define ``$NX`` with ``export NX=/path/to/Newton-X``.

.. _install_TorchANI:

TorchANI
++++++++

TorchANI is required for calculations with AIQM1 and ANI family of potentials.

1. install NumPy, TensorBoard, and a nightly version of PyTorch (if you do not have them already):

.. code-block::

   pip install numpy tensorboard
   pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html

2. install TorchANI:

.. code-block::

   pip install torchani

Visit https://aiqm.github.io/torchani/ for more information. The latest version of TorchANI used for testing was v2.2;
if you run into problems with the newest version of TorchANI, you can install the tested one with ``pip install torchani==2.2``.
The CUDA extension for AEV calculation is currently not supported for the NN part of AIQM1 and ANI-1ccx.
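
Before downgrading, it can help to see which TorchANI release is actually installed. The following sketch reads the version from the package metadata and compares it with the tested release mentioned above. The helper names are hypothetical, and the comparison only looks at the leading version components.

.. code-block:: python

   from importlib.metadata import PackageNotFoundError, version

   TESTED_TORCHANI = "2.2"  # release named in the text above

   def installed_version(distribution):
       """Return the installed version string, or None if not installed."""
       try:
           return version(distribution)
       except PackageNotFoundError:
           return None

   def same_release(found, tested):
       """True if `found` starts with the dot-separated parts of `tested`."""
       parts = tested.split(".")
       return found.split(".")[:len(parts)] == parts

   if __name__ == "__main__":
       found = installed_version("torchani")
       if found is None:
           print("torchani is not installed")
       elif same_release(found, TESTED_TORCHANI):
           print(f"torchani {found} matches the tested release")
       else:
           print(f"torchani {found} differs from the tested {TESTED_TORCHANI}")

A differing version is not an error in itself; it only indicates that the pinned installation is worth trying if problems occur.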

.. _install_DeePMD-kit:

DeePMD-kit
++++++++++

Required for DPMD and DeepPot-SE potentials.

1. download installer for DeePMD-kit from `GitHub <https://github.com/deepmodeling/deepmd-kit/releases>`__ (tested v1.2.2)
2. run installer
3. add an environmental variable ``$DeePMDkit`` that points to the directory where the ``dp`` binary is located (``bin/`` in your installation directory),
   e.g. ``export DeePMDkit=/export/home/fcge/deepmd-kit-1.2/bin``.


.. _install_GAP_QUIP:

GAP and QUIP
++++++++++++

Required for GAP-SOAP potentials.

1. compile QUIP and GAP from source

1.1 install prerequisites

.. code-block::

   sudo apt-get install gcc gfortran python python-pip libblas-dev liblapack-dev # for systems that use apt; do the equivalent for your OS
   pip install numpy ase f90wrap

1.2 get source code of QUIP and GAP

.. code-block::

   git clone --recursive https://github.com/libAtoms/QUIP.git

Get source code of GAP from http://www.libatoms.org/gap/gap_download.html (form-filling required).

Then put source code in ``QUIP/src/``.

1.3 compile

.. code-block::

   cd QUIP
   export QUIP_ARCH=linux_x86_64_gfortran_openmp # enables multi-threading; use 'export QUIP_ARCH=linux_x86_64_gfortran' to build without OpenMP (no multi-threading)
   export QUIPPY_INSTALL_OPTS=--user # omit for a system-wide installation
   make config 

Enter ``Y`` for GAP, or edit ``build/$QUIP_ARCH/Makefile.inc`` to set ``HAVE_GAP=1``, then run ``make``.
The built binaries are ``quip`` and ``gap_fit`` in ``QUIP/build/$QUIP_ARCH/`` (i.e. ``QUIP/build/linux_x86_64_gfortran_openmp/`` for the setting above).

2. add the environmental variables ``$quip`` and ``$gap_fit`` pointing to the ``quip`` and ``gap_fit`` binaries, e.g.

.. code-block::

   export quip='/export/home/fcge/GAP-SOAP/QUIP/build/linux_x86_64_gfortran_openmp/quip'
   export gap_fit='/export/home/fcge/GAP-SOAP/QUIP/build/linux_x86_64_gfortran_openmp/gap_fit'

Visit https://libatoms.github.io/GAP/index.html for more information.
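
Because the build directory is named after ``QUIP_ARCH``, the paths for ``$quip`` and ``$gap_fit`` depend on the architecture chosen in step 1.3. The sketch below builds both paths from a QUIP source directory and an architecture string and prints matching ``export`` lines. The function name ``quip_binaries`` and the default directory ``QUIP`` are assumptions for illustration.

.. code-block:: python

   import os

   def quip_binaries(quip_root, arch):
       """Return the expected paths of the quip and gap_fit binaries."""
       build_dir = os.path.join(quip_root, "build", arch)
       return {name: os.path.join(build_dir, name)
               for name in ("quip", "gap_fit")}

   if __name__ == "__main__":
       # Falls back to the OpenMP architecture used in the example above.
       arch = os.environ.get("QUIP_ARCH", "linux_x86_64_gfortran_openmp")
       for name, path in quip_binaries("QUIP", arch).items():
           status = "found" if os.path.isfile(path) else "missing"
           print(f"export {name}='{os.path.abspath(path)}'  # {status}")

Run it from the directory that contains the ``QUIP`` checkout, and copy the printed lines into your shell configuration once both binaries are reported as found.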

.. _install_PhysNet:

PhysNet
+++++++

Required for PhysNet models.

1. clone PhysNet from its GitHub page

.. code-block::

   git clone https://github.com/MMunibas/PhysNet.git

2. install TensorFlow:

.. code-block::

   pip install tensorflow

3. if you use TensorFlow v2, you need to execute the command below in PhysNet's directory to make the scripts compatible with TFv2.

.. code-block::

   for i in `find . -name '*.py'`; do sed -i \
      -e 's/import tensorflow as tf/import tensorflow.compat.v1 as tf\ntf.disable_v2_behavior()/g' "$i"; done

4. set the environmental variable ``$PhysNet`` to the PhysNet directory, e.g.

.. code-block::

   export PhysNet=/export/home/fcge/PhysNet/
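
On systems where ``sed -i`` behaves differently (for example with a non-GNU ``sed``), the substitution from step 3 can also be done in Python. The sketch below performs the same text replacement on every ``*.py`` file under a directory. It is an alternative written for this page, not a script shipped with PhysNet or MLatom, and the function names are made up.

.. code-block:: python

   from pathlib import Path

   OLD = "import tensorflow as tf"
   NEW = "import tensorflow.compat.v1 as tf\ntf.disable_v2_behavior()"

   def patch_source(text):
       """Apply the same substitution as the sed command in step 3."""
       return text.replace(OLD, NEW)

   def patch_tree(root):
       """Patch every *.py file under `root`; return the files that changed."""
       changed = []
       for path in sorted(Path(root).rglob("*.py")):
           original = path.read_text()
           patched = patch_source(original)
           if patched != original:
               path.write_text(patched)  # modifies the file in place
               changed.append(path)
       return changed

Calling ``patch_tree(".")`` from PhysNet's directory patches the scripts in place. Running it a second time changes nothing, because the patched import line no longer matches the original text.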

.. _install_sGDML:

sGDML
+++++

Required for GDML and sGDML potentials.

1. install sGDML

.. code-block::

   pip install sgdml==0.4.4

2. add the path of sGDML binary to environmental variable ``$sGDML``, e.g.

.. code-block::

   export sGDML=/export/home/fcge/.linuxbrew/bin/sgdml

Visit http://quantum-machine.org/gdml/doc/ for more information.

.. note::
   In our tests we found that installation is more stable with: ``pip install scipy==1.7.1``.

.. _install_MACE:

MACE
++++

Required for MACE potentials.

.. code-block::

   pip install --upgrade mace-torch


.. warning::

   MLatom versions >3.13 work with MACE-Torch 0.3.8.
   If you encounter any problems, you can install this version via ``pip install mace-torch==0.3.8``.

   MLatom versions ≤3.13 work with version 0.3.2 of MACE according to our tests.
   If, for some reason, you need to downgrade MACE, it is not straightforward because you would need to find the corresponding commit on GitHub. The old installation instructions below will not work as they are, but you can try to adjust them:

   1. clone MACE git from GitHub
   
   .. code-block::
      
      git clone https://github.com/ACEsuit/mace.git
      
   2. install MACE with pip

   .. code-block::
      
      pip install ./mace


.. _install_MNDO:

MNDO
++++

The MNDO program is required to provide the ODM2* part of AIQM1. Alternatively, a `(development) version <https://zenodo.org/record/7362585>`_ 
of `SCINE Sparrow <https://scine.ethz.ch/download/sparrow>`_ can be used (at the moment it has no analytical derivatives for the ODM2* part and hence only single-point calculations are recommended; see a `paper <https://doi.org/10.1063/5.0136404>`_ on Sparrow; 
note that the development version of Sparrow also implements single-point AIQM1 calculations).

The free binary and open-source code of the MNDO program is available from the official distributors of the MNDO code as described at https://mndo.kofo.mpg.de.

After the MNDO program is installed, you need to set the environmental variable pointing to the MNDO executable (typically ``mndo99``), e.g., in bash:

.. code-block::

   export mndobin=[path to the executable]/mndo99

.. _install_Sparrow:

Sparrow
+++++++

The `SCINE Sparrow <https://scine.ethz.ch/download/sparrow>`_ program can be used to provide many of the semi-empirical methods. See its website for the installation instructions.

Note that a `(development) version <https://zenodo.org/record/7362585>`_ can also be used instead of the MNDO program to provide the ODM2* part of AIQM1, but its biggest limitation is that it currently has no analytical gradients.
See a `paper <https://doi.org/10.1063/5.0136404>`_ on Sparrow for details. Note that the development version of Sparrow also implements single-point AIQM1 calculations.
Because this development version is difficult to install, we recommend using it for AIQM1 single-point calculations on the XACS cloud.

.. _install_dftd4:

dftd4
+++++

The dftd4 program is required to provide the D4 part of AIQM1.

The dftd4 program can be `obtained <https://github.com/dftd4/dftd4>`_ as both executable and open-source code. We recommend using `dftd4 v3.5.0 <https://github.com/dftd4/dftd4/releases/tag/v3.5.0>`_ (`dftd4 v2.5.0 <https://github.com/dftd4/dftd4/releases/tag/v2.5.0>`_ for the MLatom versions earlier than 3.0.1), 
which can calculate the Hessian needed for thermochemical calculations. To install the dftd4 program from source code, please see the README.md file 
on the dftd4 GitHub page for more details.

After the dftd4 program is installed, you need to set the environmental variable pointing to the ``dftd4`` executable, e.g., in bash:
   
.. code-block::

   export dftd4bin=[path to the executable]/dftd4

.. _install_Gaussian:

Gaussian
++++++++

Required for geometry optimizations, frequency calculations, TS search, IRC, thermochemistry, and ML-NEA. For some of these tasks, ASE can be used instead; see below.

Our implementation works with both Gaussian 09 and Gaussian 16. Gaussian is a commercial program, which can be `obtained and installed separately <https://gaussian.com/>`_.

To use the Gaussian interface, make sure that your environmental variable ``$GAUSS_EXEDIR`` points to the right place.
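
To see what ``$GAUSS_EXEDIR`` currently resolves to, you can search its directories for a Gaussian executable. The sketch below assumes that the variable may hold several directories separated by the platform's path separator and that the executables are named ``g16`` and ``g09``. It only reports what it finds and does not reflect how MLatom selects the Gaussian version.

.. code-block:: python

   import os

   def find_gaussian(environ=None):
       """Return (name, path) of the first g16/g09 found, else None."""
       environ = os.environ if environ is None else environ
       raw = environ.get("GAUSS_EXEDIR", "")
       directories = [d for d in raw.split(os.pathsep) if d]
       for name in ("g16", "g09"):  # look for the newer version first
           for directory in directories:
               candidate = os.path.join(directory, name)
               if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                   return name, candidate
       return None

   if __name__ == "__main__":
       result = find_gaussian()
       if result is None:
           print("no g16 or g09 executable found via $GAUSS_EXEDIR")
       else:
           print(f"found {result[0]} at {result[1]}")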

.. _install_ASE:

ASE
+++

Required for geometry optimizations, frequency calculations, and thermochemistry. Alternatively, Gaussian can be used; see above.

ASE (Atomic Simulation Environment) is a set of Python modules, which can be installed as described on the `ASE website <https://wiki.fysik.dtu.dk/ase/>`_, i.e.:

.. code-block::

   pip install ase

.. _install_hyperopt:

hyperopt
++++++++

To enable hyperopt, please run ``pip install hyperopt`` to install the `hyperopt <http://hyperopt.github.io/hyperopt/>`__ package.