Density Functional Theory (DFT)

DFT is widely used in physics, chemistry and materials science due to their decent balance between accuracy and computational cost. In MLatom, DFT is implemented via interfaces to Gaussian, ORCA and PySCF (follow the link for detailed information on functionals availble in each interface). MLatom uses PySCF as the default option. Below we provide information on how to define DFT method in MLatom and various tasks that can be performed with command line/input files and Python API including:

Note

At the moment, simpler input described in this documentation only works for the MLatom@XACS version on the XACS cloud computing. If you install MLatom locally, you always need specify additional keywords such as method=B3LYP/6-31G* and qmprog=gaussian etc. to make it work. Local version still has some other limitations such as using only a single CPU with PySCF by default (you need specify nthreads.

Defining DFT methods

The format for the name of a DFT method follows the convention, i.e. [functional]/[basis set]. For MLatom input file, the user just need to request the method in the first line. Here is an example to use B3LYP/6-31G* for a single-point calculation:

B3LYP/6-31G*
xyzfile='2

1     0.0000    0.0000    0.0000
1     0.7414    0.0000    0.0000
'
yestfile=enest.dat

To use other program for calculation, qmprog keyword is needed, e.g.

B3LYP/6-31G*
qmprog=gaussian # qmprog=orca
xyzfile='2

1     0.0000    0.0000    0.0000
1     0.7414    0.0000    0.0000
'
yestfile=enest.dat

To properly use these interfaces, please refer to the installation information to set correct environment variables.

For Python API, the DFT method can be defined via mlatom.models.methods module with method and program arguments:

import mlatom as ml

b3lyp = ml.models.methods(method='B3LYP/6-31G*')
# b3lyp = ml.models.methods(method='B3LYP/6-31G*', program='pyscf', nthreads=18)
# b3lyp = ml.models.methods(method='B3LYP/6-31G*', program='gaussian')
# b3lyp = ml.models.methods(method='B3LYP/6-31G*', program='orca')

nthreads can be used to set number of CPUs used and by default all CPUs available on the machine will be used.

Note

By default, the calculations will assume that the molecule is closed-shell and neutral. You can use keywords charges and multiplicities in input file or set them in Python for your molecule object as mol.charge and mol.multiplicity. Then the calculations will invoke the unrestricted DFT automatically.

Single-point calculations

For detailed information on how to use MLatom to perform single-point calculation, please refer to our tutorial.

In MLatom input file, after defining the DFT method, the XYZ geometries of molecules need to be provided with xyzfile keyword. Energy, gradients and hessian can be obtained by using yestfile, ygradxyzestfile and hessianestfile keywords:

wb97x/def2-tzvpp
xyzfile=ch4.xyz
yestfile=enest.dat
ygradxyzestfile=grad.dat
hessianestfile=hess.dat

where ch4.xyz contains:

5

C         0.0000000000        0.0000000000        0.0000000000
H         1.0870000000        0.0000000000        0.0000000000
H        -0.3623333220       -1.0248334322       -0.0000000000
H        -0.3623333220        0.5124167161       -0.8875317869
H        -0.3623333220        0.5124167161        0.8875317869

Python API provides more flexible control for calculation with DFT methods. After defining the method, user can use mlatom.data.molecule or mlatom.data.molecular_database to load molecule(s) and predict() function to get energies, gradients and hessians, e.g.:

import mlatom as ml

mol = ml.data.molecule.from_xyz_file('ch4.xyz')

wb97xdz = ml.models.methods(method='wb97x/6-31G*')

wb97xdz.predict(
    molecule = mol,
    calculate_energy = True,
    calculate_energy_gradients = True,
    calculate_hessian = True
)

print(mol.energy)
print(mol.get_energy_gradients())
print(mol.hessian)

Geometry optimization

For detailed information on how to use MLatom to perform geometry optimization, please refer to our tutorial.

In input file, keyword geomopt is required after defining the DFT methods, i.e.

B3LYP/6-31G*
geomopt
xyzfile='2

1     0.0000    0.0000    0.0000
1     0.7414    0.0000    0.0000
'

User can use optxyz to specify the file with optimized geometry (by default it is stored in optgeoms.xyz). MLatom also provides options to control how much information is saved with :

  • printmin will not print information about every iteration.

  • printall will print detailed information at each iteration.

  • dumpopttrajs=False will not dump any optimization trajectories.

For Python API, MLatom uses mlatom.optimize_geometry to perfrom geometry optimization, e.g.:

import mlatom as ml

# Get the initial guess for the molecules to optimize
initmol = ml.data.molecule.from_xyz_string('''5

C         0.0000000000        0.0000000000        0.0000000000
H         1.0870000000        0.0000000000        0.0000000000
H        -0.3623333220       -1.0248334322       -0.0000000000
H        -0.3623333220        0.5124167161       -0.8875317869
H        -0.3623333220        0.5124167161        0.8875317869
''')

# define DFT method
b3lyp = ml.models.methods(method='b3lyp/6-31g*')

# Optimize the geometry with the choosen optimizer:
geomopt = ml.optimize_geometry(model=b3lyp, initial_molecule=initmol, program='geometric')

# Get the final geometry
final_mol = geomopt.optimized_molecule
# and its XYZ coordinates
final_mol.xyz_coordinates

# save the final geometry
final_mol.write_file_with_xyz_coordinates(filename='final.xyz')

Available optimizers are gaussian, ase and geometric. User can request one of them with optprog in input file and program in Python API.

Frequency calculations

For detailed information on how to use MLatom to perform frequency calculation, please refer to our tutorial.

In input file, keyword freq is required after defining the DFT methods, i.e.

B3LYP/6-31G*
freq
xyzfile='2

1     0.0000    0.0000    0.0000
1     0.7414    0.0000    0.0000
'

After frequency calculation, vibration analysis and thermochemistry for the molecule are provided in the output file as :

==============================================================================
                  Vibration analysis for molecule      1
==============================================================================
Multiplicity: 1
This is a linear molecule
Mode     Frequencies     Reduced masses     Force Constants
            (cm^-1)            (AMU)           (mDyne/A)
    1       4472.8365          1.0080            11.8817
==============================================================================
                    Thermochemistry for molecule      1
==============================================================================
ZPE-exclusive internal energy at      0 K:        -1.17548 Hartree
Zero-point vibrational energy            :         0.01019 Hartree
Internal energy                  at   0 K:        -1.16529 Hartree
Enthalpy                         at 298 K:        -1.16199 Hartree
Gibbs free energy                at 298 K:        -1.17678 Hartree

Atomization enthalpy             at   0 K:         0.16475 Hartree       103.37947 kcal/mol
ZPE-exclusive atomization energy at   0 K:         0.17494 Hartree       109.77371 kcal/mol
Heat of formation                at 298 K:        -0.00010 Hartree        -0.06397 kcal/mol

==============================================================================

For Python API, MLatom uses mlatom.freq to perfrom frequency calculation and with mlatom.thermochemistry additional themochemical analysis is provided. A simple example is shown below for workflow of optimizing geometry and calculating frequency:

import mlatom as ml

# Get the initial guess for the molecules to optimize
initmol = ml.data.molecule.from_xyz_string('''5

C         0.0000000000        0.0000000000        0.0000000000
H         1.0870000000        0.0000000000        0.0000000000
H        -0.3623333220       -1.0248334322       -0.0000000000
H        -0.3623333220        0.5124167161       -0.8875317869
H        -0.3623333220        0.5124167161        0.8875317869
''')

# define DFT method
b3lyp = ml.models.methods(method='b3lyp/6-31g*')

# Optimize the geometry with the choosen optimizer:
geomopt = ml.optimize_geometry(model=b3lyp, initial_molecule=initmol, program='geometric')

# Get the final geometry
final_mol = geomopt.optimized_molecule

# Do vibration analysis and thermochemistry calculation
freq = ml.thermochemistry(model=b3lyp, molecule=final_mol, program='pyscf')
# or vibration analysis only
# freq = ml.freq(model=mymodel, molecule=final_mol, program='')

# Save the molecule with vibration analysis and thermochemistry results
final_mol.dump(filename='final_mol.json',format='json')

# Check vibration analysis
print("Mode     Frequencies     Reduced masses     Force Constants")
print("           (cm^-1)            (AMU)           (mDyne/A)")
for ii in range(len(final_mol.frequencies)):
    print("%d   %13.4f   %13.4f   %13.4f"%(ii,final_mol.frequencies[ii],final_mol.reduced_masses[ii],final_mol.force_constants[ii]))

# Check thermochemistry results
print(f"Zero-point vibrational energy: {final_mol.ZPE} Hartree")
print(f"Enthalpy at 298 K: {final_mol.H} Hartree")
print(f"Gibbs Free energy at 298 K: {final_mol.G} Hartree")
print(f"Heat of formation at 298 K: {final_mol.DeltaHf298} Hartree")

Available programs to perform frequency calculations are gaussian, pyscf and ase. User can request one of them with freqprog in input file and program in Python API.

Any questions or suggestions?

If you have further questions, criticism, and suggestions, we would be happy to receive them in English or Chinese via email, Slack (preferred), or WeChat (please send an email to request to add you to the XACS user support group).