.. _abfe-workflow: The ABFE workflow ================== The *Absolute binding free energy* (ABFE) workflow provides a complete workflow that uses the energy files generated by MD engine as input and generates the binding free energy as well as the analysis plots. Fully Automatic analysis ------------------------ *Absolute binding free energy* (ABFE) calculations can be analyzed with two lines of code in a fully automated manner (similar to `Alchemical Analysis `_). In this case, any parameters are set when invoking :class:`~alchemlyb.workflows.abfe.ABFE` and reasonable defaults are chosen for any parameters not set explicitly. The two steps are to 1. initialize an instance of the :class:`~alchemlyb.workflows.abfe.ABFE` class 2. invoke the :meth:`~alchemlyb.workflows.ABFE.run` method to execute complete workflow. For a GROMACS ABFE simulation, executing the workflow would look similar to the following code (:ref:`See how to configure the logger `). :: >>> from alchemtest.gmx import load_ABFE >>> from alchemlyb.workflows import ABFE >>> # Obtain the path of the data >>> import os >>> dir = os.path.dirname(load_ABFE()['data']['complex'][0]) >>> print(dir) 'alchemtest/gmx/ABFE/complex' >>> workflow = ABFE(units='kcal/mol', software='GROMACS', dir=dir, >>> prefix='dhdl', suffix='xvg', T=298, outdirectory='./') >>> workflow.run(skiptime=10, uncorr='dhdl', threshold=50, >>> estimators=('MBAR', 'BAR', 'TI'), overlap='O_MBAR.pdf', >>> breakdown=True, forwrev=10) The workflow uses the :class:`~alchemlyb.parsing` to parse the data from the energy files, remove the initial unequilibrated frames and decorrelate the data with :class:`~alchemlyb.preprocessing.subsampling`. The decorrelated dataset :ref:`dHdl ` and :ref:`u_nk ` are then passed to :class:`~alchemlyb.estimators` for free energy estimation. The workflow will also perform a set of analysis that allows the user to examine the quality of the estimation. File Input ^^^^^^^^^^ This command expects the energy files to be structured in two common ways. It could either be :: simulation ├── lambda_0 │   ├── prod.xvg │   └── ... ├── lambda_1 │   ├── prod.xvg │   └── ... └── ... Where :code:`dir='simulation/lambda_*', prefix='prod', suffix='xvg'`. Or :: dhdl_files ├── dhdl_0.xvg ├── dhdl_1.xvg └── ... Where :code:`dir='dhdl_files', prefix='dhdl_', suffix='xvg'`. Output ^^^^^^ The workflow returns the free energy estimate using all of :class:`~alchemlyb.estimators.TI`, :class:`~alchemlyb.estimators.BAR`, :class:`~alchemlyb.estimators.MBAR`. For ABFE calculations, the alchemical transformation is usually done is three stages, the *bonded*, *coul* and *vdw* which corresponds to the free energy contribution from applying the restraint to restrain the ligand to the protein, decouple/annihilate the coulombic interaction between the ligand and the protein and decouple/annihilate the protein-ligand lennard jones interactions. The result will be stored in :attr:`~alchemlyb.workflows.ABFE.summary` as :class:`pandas.Dataframe`. :: MBAR MBAR_Error BAR BAR_Error TI TI_Error States 0 -- 1 0.065967 0.001293 0.066544 0.001661 0.066663 0.001675 1 -- 2 0.089774 0.001398 0.089303 0.002101 0.089566 0.002144 2 -- 3 0.132036 0.001638 0.132687 0.002990 0.133292 0.003055 ... 26 -- 27 1.243745 0.011239 1.245873 0.015711 1.248959 0.015762 27 -- 28 1.128429 0.012859 1.124554 0.016999 1.121892 0.016962 28 -- 29 1.010313 0.016442 1.005444 0.017692 1.019747 0.017257 Stages coul 10.215658 0.033903 10.017838 0.041839 10.017854 0.048744 vdw 22.547489 0.098699 22.501150 0.060092 22.542936 0.106723 bonded 2.374144 0.014995 2.341631 0.005507 2.363828 0.021078 TOTAL 35.137291 0.103580 34.860619 0.087022 34.924618 0.119206 Output Files ^^^^^^^^^^^^ For quality assessment, a couple of plots were generated and written to the folder specified by `outdirectory`. The :ref:`overlay matrix for the MBAR estimator ` will be plotted and saved to :file:`O_MBAR.pdf`, which examines the overlap between different lambda windows. The :ref:`dHdl for TI ` will be plotted to :file:`dhdl_TI.pdf`, allows one to examine if the lambda scheduling has covered the change of the gradient in the lambda space. The :ref:`dF states ` will be plotted to :file:`dF_state.pdf` in portrait model and :file:`dF_state_long.pdf` in landscape model, which allows the user to example the contributions from each lambda window. The forward and backward convergence will be plotted to :file:`dF_t.pdf` using :class:`~alchemlyb.estimators.MBAR` and saved in :attr:`~alchemlyb.workflows.ABFE.convergence`, which allows the user to examine if the simulation time is enough to achieve a converged result. Semi-automatic analysis ----------------------- The same analysis could also performed in steps allowing access and modification to the data generated at each stage of the analysis. :: >>> from alchemtest.gmx import load_ABFE >>> from alchemlyb.workflows import ABFE >>> # Obtain the path of the data >>> import os >>> dir = os.path.dirname(load_ABFE()['data']['complex'][0]) >>> print(dir) 'alchemtest/gmx/ABFE/complex' >>> # Load the data >>> workflow = ABFE(software='GROMACS', dir=dir, >>> prefix='dhdl', suffix='xvg', T=298, outdirectory='./') >>> # Set the unit. >>> workflow.update_units('kcal/mol') >>> # Read the data >>> workflow.read() >>> # Decorrelate the data. >>> workflow.preprocess(skiptime=10, uncorr='dhdl', threshold=50) >>> # Run the estimator >>> workflow.estimate(estimators=('mbar', 'bar', 'ti')) >>> # Retrieve the result >>> summary = workflow.generate_result() >>> # Plot the overlap matrix >>> workflow.plot_overlap_matrix(overlap='O_MBAR.pdf') >>> # Plot the dHdl for TI >>> workflow.plot_ti_dhdl(dhdl_TI='dhdl_TI.pdf') >>> # Plot the dF states >>> workflow.plot_dF_state(dF_state='dF_state.pdf') >>> # Convergence analysis >>> workflow.check_convergence(10, dF_t='dF_t.pdf') Parallelisation of Data Reading and Decorrelation ------------------------------------------------- The estimation step of the workflow is parallelized using JAX. However, the reading and decorrelation stages can be parallelized using `joblib`. This is achieved by passing the number of jobs to run in parallel via the `n_jobs` parameter to the following methods: - :meth:`~alchemlyb.workflows.ABFE.read` - :meth:`~alchemlyb.workflows.ABFE.preprocess` To enable parallel execution, specify the `n_jobs` parameter. Setting `n_jobs=-1` allows the use of all available resources. :: >>> workflow = ABFE(units='kcal/mol', software='GROMACS', dir=dir, >>> prefix='dhdl', suffix='xvg', T=298, outdirectory='./') >>> workflow.read(n_jobs=-1) >>> workflow.preprocess(n_jobs=-1) In a fully automated mode, the `n_jobs=-1` parameter can be passed directly to the :meth:`~alchemlyb.workflows.ABFE.run` method. This will implicitly parallelise the reading and decorrelation stages. :: >>> workflow = ABFE(units='kcal/mol', software='GROMACS', dir=dir, >>> prefix='dhdl', suffix='xvg', T=298, outdirectory='./') >>> workflow.run(n_jobs=-1) While the default `joblib` settings are suitable for most environments, you can customize the parallelisation backend depending on the infrastructure. For example, using the threading backend can be specified as follows. :: >>> import joblib >>> workflow = ABFE(units='kcal/mol', software='GROMACS', dir=dir, >>> prefix='dhdl', suffix='xvg', T=298, outdirectory='./') >>> with joblib.parallel_config(backend="threading"): >>> workflow.run(n_jobs=-1) API Reference ------------- .. autoclass:: alchemlyb.workflows.ABFE :members: :inherited-members: