# Density Functional Theory (DFT)

DFT is widely used in physics, chemistry and materials science due to their decent balance between accuracy and computational cost. In MLatom, DFT is implemented via interfaces to Gaussian, ORCA and PySCF (follow the link for detailed information on functionals availble in each interface). MLatom uses PySCF as the default option. Below we provide information on how to define DFT method in MLatom and various tasks that can be performed with command line/input files and Python API including:

Note

At the moment, simpler input described in this documentation only works for the MLatom@XACS version on the XACS cloud computing.
If you install MLatom locally, you always need specify additional keywords such as `method=B3LYP/6-31G*`

and `qmprog=gaussian`

etc. to make it work. Local version still has some other limitations such as using only a single CPU with PySCF by default (you need specify `nthreads`

.

## Defining DFT methods

The format for the name of a DFT method follows the convention, i.e. `[functional]/[basis set]`

. For MLatom input file, the user just need to request the method in the first line. Here is an example to use B3LYP/6-31G* for a single-point calculation:

```
B3LYP/6-31G*
xyzfile='2
1 0.0000 0.0000 0.0000
1 0.7414 0.0000 0.0000
'
yestfile=enest.dat
```

To use other program for calculation, `qmprog`

keyword is needed, e.g.

```
B3LYP/6-31G*
qmprog=gaussian # qmprog=orca
xyzfile='2
1 0.0000 0.0000 0.0000
1 0.7414 0.0000 0.0000
'
yestfile=enest.dat
```

To properly use these interfaces, please refer to the installation information to set correct environment variables.

For Python API, the DFT method can be defined via `mlatom.models.methods`

module with `method`

and `program`

arguments:

```
import mlatom as ml
b3lyp = ml.models.methods(method='B3LYP/6-31G*')
# b3lyp = ml.models.methods(method='B3LYP/6-31G*', program='pyscf', nthreads=18)
# b3lyp = ml.models.methods(method='B3LYP/6-31G*', program='gaussian')
# b3lyp = ml.models.methods(method='B3LYP/6-31G*', program='orca')
```

`nthreads`

can be used to set number of CPUs used and by default all CPUs available on the machine will be used.

Note

By default, the calculations will assume that the molecule is closed-shell and neutral. You can use keywords `charges`

and `multiplicities`

in input file or set them in Python for your molecule object as `mol.charge`

and `mol.multiplicity`

. Then the calculations will invoke the unrestricted DFT automatically.

## Single-point calculations

For detailed information on how to use MLatom to perform single-point calculation, please refer to our tutorial.

In MLatom input file, after defining the DFT method, the XYZ geometries of molecules need to be provided with `xyzfile`

keyword. Energy, gradients and hessian can be obtained by using `yestfile`

, `ygradxyzestfile`

and `hessianestfile`

keywords:

```
wb97x/def2-tzvpp
xyzfile=ch4.xyz
yestfile=enest.dat
ygradxyzestfile=grad.dat
hessianestfile=hess.dat
```

where `ch4.xyz`

contains:

```
5
C 0.0000000000 0.0000000000 0.0000000000
H 1.0870000000 0.0000000000 0.0000000000
H -0.3623333220 -1.0248334322 -0.0000000000
H -0.3623333220 0.5124167161 -0.8875317869
H -0.3623333220 0.5124167161 0.8875317869
```

Python API provides more flexible control for calculation with DFT methods. After defining the method, user can use `mlatom.data.molecule`

or `mlatom.data.molecular_database`

to load molecule(s) and `predict()`

function to get energies, gradients and hessians, e.g.:

```
import mlatom as ml
mol = ml.data.molecule.from_xyz_file('ch4.xyz')
wb97xdz = ml.models.methods(method='wb97x/6-31G*')
wb97xdz.predict(
molecule = mol,
calculate_energy = True,
calculate_energy_gradients = True,
calculate_hessian = True
)
print(mol.energy)
print(mol.get_energy_gradients())
print(mol.hessian)
```

## Geometry optimization

For detailed information on how to use MLatom to perform geometry optimization, please refer to our tutorial.

In input file, keyword `geomopt`

is required after defining the DFT methods, i.e.

```
B3LYP/6-31G*
geomopt
xyzfile='2
1 0.0000 0.0000 0.0000
1 0.7414 0.0000 0.0000
'
```

User can use `optxyz`

to specify the file with optimized geometry (by default it is stored in `optgeoms.xyz`

).
MLatom also provides options to control how much information is saved with :

`printmin`

will not print information about every iteration.`printall`

will print detailed information at each iteration.`dumpopttrajs=False`

will not dump any optimization trajectories.

For Python API, MLatom uses `mlatom.optimize_geometry`

to perfrom geometry optimization, e.g.:

```
import mlatom as ml
# Get the initial guess for the molecules to optimize
initmol = ml.data.molecule.from_xyz_string('''5
C 0.0000000000 0.0000000000 0.0000000000
H 1.0870000000 0.0000000000 0.0000000000
H -0.3623333220 -1.0248334322 -0.0000000000
H -0.3623333220 0.5124167161 -0.8875317869
H -0.3623333220 0.5124167161 0.8875317869
''')
# define DFT method
b3lyp = ml.models.methods(method='b3lyp/6-31g*')
# Optimize the geometry with the choosen optimizer:
geomopt = ml.optimize_geometry(model=b3lyp, initial_molecule=initmol, program='geometric')
# Get the final geometry
final_mol = geomopt.optimized_molecule
# and its XYZ coordinates
final_mol.xyz_coordinates
# save the final geometry
final_mol.write_file_with_xyz_coordinates(filename='final.xyz')
```

Available optimizers are `gaussian`

, `ase`

and `geometric`

. User can request one of them with `optprog`

in input file and `program`

in Python API.

## Frequency calculations

For detailed information on how to use MLatom to perform frequency calculation, please refer to our tutorial.

In input file, keyword `freq`

is required after defining the DFT methods, i.e.

```
B3LYP/6-31G*
freq
xyzfile='2
1 0.0000 0.0000 0.0000
1 0.7414 0.0000 0.0000
'
```

After frequency calculation, vibration analysis and thermochemistry for the molecule are provided in the output file as :

```
==============================================================================
Vibration analysis for molecule 1
==============================================================================
Multiplicity: 1
This is a linear molecule
Mode Frequencies Reduced masses Force Constants
(cm^-1) (AMU) (mDyne/A)
1 4472.8365 1.0080 11.8817
==============================================================================
Thermochemistry for molecule 1
==============================================================================
ZPE-exclusive internal energy at 0 K: -1.17548 Hartree
Zero-point vibrational energy : 0.01019 Hartree
Internal energy at 0 K: -1.16529 Hartree
Enthalpy at 298 K: -1.16199 Hartree
Gibbs free energy at 298 K: -1.17678 Hartree
Atomization enthalpy at 0 K: 0.16475 Hartree 103.37947 kcal/mol
ZPE-exclusive atomization energy at 0 K: 0.17494 Hartree 109.77371 kcal/mol
Heat of formation at 298 K: -0.00010 Hartree -0.06397 kcal/mol
==============================================================================
```

For Python API, MLatom uses `mlatom.freq`

to perfrom frequency calculation and with `mlatom.thermochemistry`

additional themochemical analysis is provided. A simple example is shown below for workflow of optimizing geometry and calculating frequency:

```
import mlatom as ml
# Get the initial guess for the molecules to optimize
initmol = ml.data.molecule.from_xyz_string('''5
C 0.0000000000 0.0000000000 0.0000000000
H 1.0870000000 0.0000000000 0.0000000000
H -0.3623333220 -1.0248334322 -0.0000000000
H -0.3623333220 0.5124167161 -0.8875317869
H -0.3623333220 0.5124167161 0.8875317869
''')
# define DFT method
b3lyp = ml.models.methods(method='b3lyp/6-31g*')
# Optimize the geometry with the choosen optimizer:
geomopt = ml.optimize_geometry(model=b3lyp, initial_molecule=initmol, program='geometric')
# Get the final geometry
final_mol = geomopt.optimized_molecule
# Do vibration analysis and thermochemistry calculation
freq = ml.thermochemistry(model=b3lyp, molecule=final_mol, program='pyscf')
# or vibration analysis only
# freq = ml.freq(model=mymodel, molecule=final_mol, program='')
# Save the molecule with vibration analysis and thermochemistry results
final_mol.dump(filename='final_mol.json',format='json')
# Check vibration analysis
print("Mode Frequencies Reduced masses Force Constants")
print(" (cm^-1) (AMU) (mDyne/A)")
for ii in range(len(final_mol.frequencies)):
print("%d %13.4f %13.4f %13.4f"%(ii,final_mol.frequencies[ii],final_mol.reduced_masses[ii],final_mol.force_constants[ii]))
# Check thermochemistry results
print(f"Zero-point vibrational energy: {final_mol.ZPE} Hartree")
print(f"Enthalpy at 298 K: {final_mol.H} Hartree")
print(f"Gibbs Free energy at 298 K: {final_mol.G} Hartree")
print(f"Heat of formation at 298 K: {final_mol.DeltaHf298} Hartree")
```

Available programs to perform frequency calculations are `gaussian`

, `pyscf`

and `ase`

. User can request one of them with `freqprog`

in input file and `program`

in Python API.

## Any questions or suggestions?

If you have further questions, criticism, and suggestions, we would be happy to receive them in English or Chinese via email, Slack (preferred), or WeChat (please send an email to request to add you to the XACS user support group).