Train and use models for H\ :sub:`2`
======================================

Here we show how to train and use different machine learning potential models, taking H\ :sub:`2` as an example:

- :ref:`KREG <KREG>`
- :ref:`TorchANI <TorchANI>`

.. _KREG:

KREG
~~~~~

First, we train a KREG model for H\ :sub:`2`. For the command line, prepare the input file :download:`h2_train_KREG.inp `. Two auxiliary files are needed: :download:`h2.xyz ` with XYZ geometries of hydrogen and :download:`E_FCI_451.dat ` with labels (reference FCI/aug-cc-pV6Z energies in Hartree). The trained model will be saved in ``energies.unf``.

.. code-block::

    # h2_train_KREG.inp
    createMLmodel            # Specify the task for MLatom
    MLmodelType=KREG         # Specify the model type
    MLmodelOut=energies.unf  # Save model in energies.unf
    XYZfile=h2.xyz           # File with XYZ geometries
    Yfile=E_FCI_451.dat      # File with FCI energies
    sigma=opt                # Optimize hyperparameter sigma
    lgSigmaL=-4              # Lower bound of log2(sigma)
    lambda=opt               # Optimize hyperparameter lambda

Then run it:

.. code-block::

    mlatom h2_train_KREG.inp

In Python, we also need to provide the same two auxiliary files.

.. code-block::

    import mlatom as ml

    # load data set
    molDB = ml.data.molecular_database.from_xyz_file('h2.xyz')
    molDB.add_scalar_properties_from_file('E_FCI_451.dat', 'energy')

    # define the model
    model = ml.models.kreg(model_file='energies')

    # split data set for optimizing hyperparameters
    [subtraining_molDB, validation_molDB] = ml.data.sample(
        number_of_splits=2,
        fraction_of_points_in_splits=[0.8, 0.2],
        molecular_database_to_split=molDB,
        sampling='random')

    # optimize hyperparameters
    model.hyperparameters["sigma"].minval = 2**-4
    model.optimize_hyperparameters(
        subtraining_molecular_database=subtraining_molDB,
        validation_molecular_database=validation_molDB,
        optimization_algorithm='nelder-mead',
        hyperparameters=['lambda', 'sigma'],
        training_kwargs={'property_to_learn': 'energy'},
        prediction_kwargs=None)
    lmbd = model.hyperparameters['lambda'].value
    sigma = model.hyperparameters['sigma'].value
    print(f'Optimized hyperparameters: lambda={lmbd}, sigma={sigma}')

    # train the final model
    model.train(molecular_database=molDB, property_to_learn='energy')

Now we can use the model. For the command line, prepare the input file :download:`h2_opt_KREG.inp ` for geometry optimization. We need to provide the initial geometry of H\ :sub:`2` (:download:`h2_init.xyz `) and the model trained in the previous step (``energies.unf``).

.. code-block::

    # h2_opt_KREG.inp
    geomopt                  # Request geometry optimization
    MLmodelType=KREG         # with a model of the KREG type
    MLmodelIn=energies.unf   # stored in the energies.unf file
    XYZfile=h2_init.xyz      # File with the initial guess
    optXYZ=eq_KREG.xyz       # Output file for the optimized geometry
    -------------------------------------------------------------------
    # h2_init.xyz
    2

    H    0.0000000000000    0.0000000000000    0.0000000000000
    H    0.0000000000000    0.0000000000000    0.8000000000000

Perform the geometry optimization.

.. code-block::

    mlatom h2_opt_KREG.inp

The optimized geometry is written to ``eq_KREG.xyz``.

.. code-block::

    cat eq_KREG.xyz

In Python, we need to provide the same auxiliary files.

.. code-block::

    import mlatom as ml

    # load initial geometry
    mol = ml.data.molecule.from_xyz_file('h2_init.xyz')
    print(mol.get_xyz_string())

    # load the model
    model = ml.models.kreg(model_file='energies')

    # run geometry optimization
    ml.optimize_geometry(model=model, molecule=mol, program='ASE')
    print(mol.get_xyz_string())

.. _TorchANI:

TorchANI
~~~~~~~~~

Besides the KREG model, we can also use other machine learning potential models, e.g., the ANI model. As above, for the command line, prepare the input file :download:`h2_train_ANI.inp ` and the auxiliary files (:download:`h2.xyz `, :download:`E_FCI_451.dat `). The trained model will be saved in ``energies_ani.pt``.

.. code-block::

    # h2_train_ANI.inp
    createMLmodel               # Specify the task for MLatom
    MLmodelType=ANI             # Specify the model type
    MLmodelOut=energies_ani.pt  # Save model in energies_ani.pt
    XYZfile=h2.xyz              # File with XYZ geometries
    Yfile=E_FCI_451.dat         # File with FCI energies (can be any other property)
    #ani.max_epochs=16          # Uncomment to train for only 16 epochs

Run it:

.. code-block::

    mlatom h2_train_ANI.inp

In Python, we need to prepare the same two auxiliary files.

.. code-block::

    import mlatom as ml

    # load data set
    molDB = ml.data.molecular_database.from_xyz_file('h2.xyz')
    molDB.add_scalar_properties_from_file('E_FCI_451.dat', 'energy')

    # define the model
    model = ml.models.ani(model_file='energies_ani_api.pt',
                          hyperparameters={'max_epochs': 16})

    # train the final model
    model.train(molecular_database=molDB, property_to_learn='energy')

Now we can use the model for geometry optimization. For the command line, prepare the input file :download:`h2_opt_ANI.inp ` and the auxiliary files: the initial geometry of H\ :sub:`2` (:download:`h2_init.xyz `) and the model trained in the previous step (``energies_ani.pt``).

.. code-block::

    # h2_opt_ANI.inp
    geomopt                     # Request geometry optimization
    MLmodelType=ANI             # with a model of the ANI type
    MLmodelIn=energies_ani.pt   # stored in the energies_ani.pt file
    XYZfile=h2_init.xyz         # File with the initial guess
    optXYZ=eq_ANI.xyz           # Output file for the optimized geometry

Perform the geometry optimization.

.. code-block::

    mlatom h2_opt_ANI.inp

The optimized geometry is written to ``eq_ANI.xyz``.

.. code-block::

    cat eq_ANI.xyz

In Python, we need to prepare the same auxiliary files.

.. code-block::

    import mlatom as ml

    # load initial geometry
    mol = ml.data.molecule.from_xyz_file('h2_init.xyz')
    print(mol.get_xyz_string())

    # load the model
    model = ml.models.ani(model_file='energies_ani_api.pt')

    # run geometry optimization
    ml.optimize_geometry(model=model, molecule=mol, program='ASE')
    print(mol.get_xyz_string())
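
To compare the geometries produced by different models, it can help to reduce each XYZ file to a single H-H distance. The following is a minimal sketch that uses only the Python standard library. The helper ``bond_length_from_xyz`` is a hypothetical name and not part of MLatom. It assumes the usual XYZ layout (atom count, comment line, then one atom per line) and returns the distance in the same units as the file.

.. code-block::

    import math


    def bond_length_from_xyz(xyz_text):
        """Return the distance between the first two atoms of an XYZ string."""
        lines = xyz_text.strip('\n').splitlines()
        n_atoms = int(lines[0].split()[0])
        if n_atoms < 2:
            raise ValueError('need at least two atoms')
        # Atom records start after the count line and the comment line.
        atoms = [line.split() for line in lines[2:2 + n_atoms]]
        coords = [tuple(float(x) for x in atom[1:4]) for atom in atoms]
        return math.dist(coords[0], coords[1])


    # Initial geometry from h2_init.xyz, embedded so the sketch runs on its own.
    h2_init = """2

    H 0.0000000000000 0.0000000000000 0.0000000000000
    H 0.0000000000000 0.0000000000000 0.8000000000000
    """

    print(f'Initial H-H distance: {bond_length_from_xyz(h2_init):.4f}')

Once the optimizations have been run, the same function can be applied to the contents of ``eq_KREG.xyz`` and ``eq_ANI.xyz``, for example with ``bond_length_from_xyz(open('eq_KREG.xyz').read())``.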