Estimate model accuracy

MLatom provides a host of powerful machine learning potentials via native implementations and interfaces to third-party programs:

  • kernel methods for single-molecule PES:
    • KREG (native). See tutorial. Can only be used for single-molecule PES.

    • KRR-CM (KRR with Coulomb matrix, native).

    • sGDML (through sGDML). Can only be used for single-molecule PES.

  • methods applicable to creating PES for different molecules (mostly neural networks; GAP-SOAP is kernel-based):
    • ANI (through TorchANI)

    • DeepPot-SE and DPMD (through DeePMD-kit)

    • GAP-SOAP (through GAP suite and QUIP)

    • PhysNet (through PhysNet)

The choice of a suitable potential is not trivial. MLatom helps by providing several ways to judge a model’s accuracy. To evaluate an ML model’s performance, it must be tested on data it has not seen during training. The estAccMLmodel task does this for you: it splits the data set into training and test sets, trains the model on the former, and evaluates it on the latter.
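
The idea behind such a hold-out test can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MLatom's implementation: the function names, the 20% test fraction, the synthetic energies, and the placeholder "model" (which simply predicts the mean training energy) are all hypothetical.

```python
import numpy as np

def split_indices(n_points, test_fraction=0.2, seed=0):
    """Randomly split point indices into training and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_points)
    n_test = int(round(test_fraction * n_points))
    return order[n_test:], order[:n_test]

def rmse(reference, predicted):
    """Root-mean-square error between reference and predicted values."""
    diff = np.asarray(predicted) - np.asarray(reference)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(reference, predicted):
    """Mean absolute error between reference and predicted values."""
    diff = np.asarray(predicted) - np.asarray(reference)
    return float(np.mean(np.abs(diff)))

# Synthetic stand-in for reference energies (hypothetical data).
energies = np.linspace(-1.0, 1.0, 100)
train_idx, test_idx = split_indices(len(energies), test_fraction=0.2)

# Placeholder "model": always predicts the mean of the training energies.
prediction = np.full(len(test_idx), energies[train_idx].mean())

# Errors are reported only on points the "model" never saw.
print("test RMSE:", rmse(energies[test_idx], prediction))
print("test MAE: ", mae(energies[test_idx], prediction))
```

The key point is that the test indices never enter the training step, so the reported errors approximate how the model behaves on new geometries.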

We will estimate the accuracy of two machine learning potential models for ethanol (see this paper for details):

  • KREG (based on kernel ridge regression and a global descriptor).

  • ANI (based on a neural network and a local descriptor).
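
To make "global descriptor" more concrete, here is a minimal sketch of a descriptor in the spirit of the RE descriptor used by KREG: one vector for the whole molecule, built from the ratios of equilibrium to current internuclear distances for all atom pairs. The function name and the three-atom toy geometry are hypothetical, and the code is an illustration rather than MLatom's own implementation.

```python
import numpy as np

def re_like_descriptor(coords, coords_eq):
    """Return r_eq / r for every unique atom pair as one global vector."""
    coords = np.asarray(coords, dtype=float)
    coords_eq = np.asarray(coords_eq, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)  # all unique atom pairs
    r = np.linalg.norm(coords[i] - coords[j], axis=1)
    r_eq = np.linalg.norm(coords_eq[i] - coords_eq[j], axis=1)
    return r_eq / r

# Toy three-atom geometry in arbitrary units (hypothetical values).
equilibrium = np.array([[0.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0]])
stretched = 2.0 * equilibrium  # every distance doubled

print(re_like_descriptor(equilibrium, equilibrium))
print(re_like_descriptor(stretched, equilibrium))
```

Because the vector length depends on the number of atom pairs and the pair order is fixed, a descriptor of this kind is tied to one molecule, which is consistent with KREG being limited to single-molecule PES. A local descriptor instead encodes the environment of each atom separately.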

KREG

Prepare the input file ethanol_estAcc_KREG.inp to test a KREG model. In addition, two auxiliary files are needed: ethanol_geometries.xyz with the geometries of ethanol and ethanol_energies.txt with the corresponding energies:

# ethanol_estAcc_KREG.inp
estAccMLmodel
MLmodelType=KREG
XYZfile=ethanol_geometries.xyz
Yfile=ethanol_energies.txt
sigma=opt
lambda=opt
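
Before running, it can be worth checking that the two auxiliary files describe the same number of points. The sketch below assumes the usual multi-geometry XYZ layout (atom count, comment line, then one line per atom) and one energy per line in the energies file; the helper names and the inline sample data are hypothetical.

```python
def count_xyz_geometries(text):
    """Count geometry blocks in multi-geometry XYZ text."""
    lines = text.splitlines()
    count, pos = 0, 0
    while pos < len(lines):
        if not lines[pos].strip():  # skip blank lines between blocks
            pos += 1
            continue
        n_atoms = int(lines[pos].split()[0])
        # A block is: atom-count line + comment line + n_atoms atom lines.
        if pos + 2 + n_atoms > len(lines):
            raise ValueError("truncated geometry block")
        count += 1
        pos += 2 + n_atoms
    return count

def count_values(text):
    """Count non-empty lines, checking that each starts with a number."""
    values = [float(line.split()[0]) for line in text.splitlines() if line.strip()]
    return len(values)

# Tiny inline sample standing in for the real files (hypothetical data).
sample_xyz = """3

O 0.000 0.000 0.000
H 0.957 0.000 0.000
H -0.240 0.927 0.000
3

O 0.000 0.000 0.000
H 0.970 0.000 0.000
H -0.250 0.940 0.000
"""
sample_energies = "-76.40\n-76.39\n"

n_geometries = count_xyz_geometries(sample_xyz)
n_energies = count_values(sample_energies)
print(n_geometries, n_energies, n_geometries == n_energies)
```

For the real files, the same functions can be applied to the contents of ethanol_geometries.xyz and ethanol_energies.txt, for example via pathlib.Path(...).read_text().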

Now run the input file.

mlatom ethanol_estAcc_KREG.inp
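
The options sigma=opt and lambda=opt in the input above ask for the two kernel ridge regression hyperparameters (kernel width and regularization strength) to be optimized rather than fixed by hand. The following toy sketch shows the general idea with a Gaussian kernel and a plain grid search on a validation subset. It is not MLatom's optimizer: the grids, the one-dimensional sin(x) data, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit(x, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression coefficients."""
    k = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, sigma):
    """Predict values at x_new from the fitted coefficients."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

def grid_search(x_sub, y_sub, x_val, y_val, sigmas, lambdas):
    """Return (validation RMSE, sigma, lambda) with the lowest RMSE."""
    best = None
    for sigma in sigmas:
        for lam in lambdas:
            alpha = krr_fit(x_sub, y_sub, sigma, lam)
            pred = krr_predict(x_sub, alpha, x_val, sigma)
            err = float(np.sqrt(np.mean((pred - y_val) ** 2)))
            if best is None or err < best[0]:
                best = (err, sigma, lam)
    return best

# Toy one-dimensional data set (hypothetical): y = sin(x).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0 * np.pi, size=(60, 1))
y = np.sin(x[:, 0])

# Sub-training and validation subsets taken from the training data.
x_sub, y_sub, x_val, y_val = x[:45], y[:45], x[45:], y[45:]

sigmas = [0.01, 0.1, 1.0, 10.0]
lambdas = [1e-8, 1e-6, 1e-4, 1e-2]
best_err, best_sigma, best_lambda = grid_search(
    x_sub, y_sub, x_val, y_val, sigmas, lambdas)
print("best sigma:", best_sigma, "best lambda:", best_lambda)
print("validation RMSE:", best_err)
```

Note that the validation subset is carved out of the training data, so the test set stays untouched until the final accuracy estimate.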

ANI

As above, to estimate the accuracy of an ANI model, prepare the input file ethanol_estAcc_ANI.inp and use the same auxiliary files:

# ethanol_estAcc_ANI.inp
estAccMLmodel
MLmodelType=ANI
XYZfile=ethanol_geometries.xyz
Yfile=ethanol_energies.txt

Run the input file.

mlatom ethanol_estAcc_ANI.inp      # this training will take a lot of time