Convergence API Reference
Functions for assessing convergence of free energy estimates and raw data.
The alchemlyb.convergence.convergence module contains building blocks that each perform a specific convergence analysis. They typically operate on lists of raw data and either run estimators on these data sets to obtain free energies as a function of the amount of data, or directly assess the convergence of the raw data.
Note
Read the original literature to learn the exact meaning of parameters and how to interpret the output of the convergence analysis.
All convergence functions are located in this submodule, but for convenience they are also available from alchemlyb.convergence, as shown here:
- alchemlyb.convergence.forward_backward_convergence(df_list, estimator='MBAR', num=10, error_tol: float = 3, **kwargs)
Forward and backward convergence of the free energy estimate.
Generate the free energy estimate as a function of time in both the forward and backward directions, at the specified number of equally spaced time points [Klimovich2015]. For example, with num set to 10, the forward convergence gives the free energy estimate from the first 10%, 20%, 30%, … of the data, and the backward convergence gives the estimate from the last 10%, 20%, 30%, … of the data.
- Parameters:
df_list (list) – List of DataFrames of either dHdl or u_nk.
estimator ({'MBAR', 'BAR', 'TI'}) –
Name of the estimator. See the important note below on the use of “MBAR”.
Deprecated since version 1.0.0: Lower case input is also accepted until release 2.0.0.
num (int) – The number of time points.
error_tol (float) –
The maximum tolerated analytic error. If the analytic error exceeds this tolerance, the bootstrap error is used instead.
Added in version 2.3.0.
kwargs (dict) – Keyword arguments to be passed to the estimator.
- Returns:
The DataFrame with convergence data.
    Forward  Forward_Error  Backward  Backward_Error  data_fraction
0  3.016442       0.052748  3.065176        0.051036            0.1
1  3.078106       0.037170  3.078567        0.036640            0.2
2  3.072561       0.030186  3.047357        0.029775            0.3
3  3.048325       0.026070  3.057527        0.025743            0.4
4  3.049769       0.023359  3.037454        0.023001            0.5
5  3.034078       0.021260  3.040484        0.021075            0.6
6  3.043274       0.019642  3.032495        0.019517            0.7
7  3.035460       0.018340  3.036670        0.018261            0.8
8  3.042032       0.017319  3.046597        0.017233            0.9
9  3.044149       0.016405  3.044385        0.016402            1.0
- Return type:
pandas.DataFrame
Added in version 0.6.0.
Changed in version 1.0.0: The estimator accepts uppercase input. The default for using estimator='MBAR' was changed from MBAR to AutoMBAR.
Changed in version 2.0.0: Use pymbar.MBAR instead of the AutoMBAR option.
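The forward and backward slicing described above can be sketched with the standard library alone. This is a minimal illustration, not alchemlyb's implementation: the helper name `forward_backward_means` is hypothetical, and the arithmetic mean stands in for a real free energy estimator such as MBAR.

```python
from statistics import fmean


def forward_backward_means(samples, num=10):
    """Hypothetical helper: mean of the first and last k/num of the data."""
    chunk = len(samples) // num  # samples added per time point
    if chunk == 0:
        raise ValueError("need at least `num` samples")
    rows = []
    for k in range(1, num + 1):
        n = k * chunk
        forward = fmean(samples[:n])    # first k/num of the data
        backward = fmean(samples[-n:])  # last k/num of the data
        rows.append((k / num, forward, backward))
    return rows


# Ten samples split into five time points (data_fraction 0.2, 0.4, ..., 1.0).
rows = forward_backward_means([float(x) for x in range(1, 11)], num=5)
```

When the number of samples is divisible by num, the last row uses all of the data in both directions, so the forward and backward values coincide there.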
- alchemlyb.convergence.fwdrev_cumavg_Rc(series, precision=0.01, tol=2)
Generate the convergence criterion \(R_c\) for a single simulation.
The input will be a pandas.Series generated by decorrelate_u_nk() or decorrelate_dhdl().
The output will be the float \(R_c\) [Fan2020] [Fan2021] and a pandas.DataFrame with the forward and backward cumulative average at precision fractional increments, as described below.
\(R_c = 0\) indicates that the system is well equilibrated right from the beginning, while \(R_c = 1\) signifies that the whole trajectory is not equilibrated.
- Parameters:
series (pandas.Series) – The input energy array.
precision (float) – The precision of the output \(R_c\). To speed up the calculation, the data are block-averaged first; the block size is controlled by the desired precision.
tol (float) – Tolerance (or convergence threshold \(\epsilon\) in [Fan2021]) in \(kT\).
- Returns:
float – Convergence time fraction \(R_c\) [Fan2021]
pandas.DataFrame – The DataFrame with moving average.
    Forward  Backward  data_fraction
0  3.016442  3.065176            0.1
1  3.078106  3.078567            0.2
2  3.072561  3.047357            0.3
3  3.048325  3.057527            0.4
4  3.049769  3.037454            0.5
5  3.034078  3.040484            0.6
6  3.043274  3.032495            0.7
7  3.035460  3.036670            0.8
8  3.042032  3.046597            0.9
9  3.044149  3.044385            1.0
Notes
This function computes \(R_c\) from equation 16 of [Fan2021]. The code is adapted from Shujie Fan’s (@VOD555) work. Zhiyi Wu (@xiki-tempula) improved the performance of the original algorithm.
Please cite [Fan2021] when using this function.
See also
Added in version 1.0.0.
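One simplified reading of the \(R_c\) description above is: the smallest time fraction after which both the forward and the backward cumulative averages stay within tol of the full-data average. The sketch below follows that reading only. It is not alchemlyb's code and not a transcription of equation 16; the name `rc_sketch` is hypothetical, and block averaging and unit conversion to \(kT\) are left out.

```python
from itertools import accumulate


def rc_sketch(values, tol=2.0):
    """Hypothetical helper: smallest fraction after which both cumulative
    averages stay within `tol` of the full-data average."""
    n = len(values)
    full_mean = sum(values) / n
    # Cumulative averages over the first i+1 and over the last i+1 samples.
    forward = [s / (i + 1) for i, s in enumerate(accumulate(values))]
    backward = [s / (i + 1) for i, s in enumerate(accumulate(reversed(values)))]
    for i in range(n):
        within = all(
            abs(f - full_mean) < tol and abs(b - full_mean) < tol
            for f, b in zip(forward[i:], backward[i:])
        )
        if within:
            return i / n  # 0.0 means within tolerance from the very start
    return 1.0  # never within tolerance


rc_flat = rc_sketch([1.0] * 10)             # constant series
rc_spike = rc_sketch([100.0] + [0.0] * 9)   # large outlier at the start
```

A constant series is within tolerance from the start, while a single large early outlier keeps the cumulative averages outside the tolerance until almost the end of the series.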
- alchemlyb.convergence.A_c(series_list, precision=0.01, tol=2)
Generate the ensemble convergence criterion \(A_c\) for a set of simulations.
The input is a list of pandas.Series generated by decorrelate_u_nk() or decorrelate_dhdl().
The output will be the float \(A_c\) [Fan2020] [Fan2021]. \(A_c\) is a number between 0 and 1 that can be interpreted as the ratio of the total equilibrated simulation time to the whole simulation time for a full set of simulations. \(A_c = 1\) means that all simulation time frames in all windows can be considered equilibrated, while \(A_c = 0\) indicates that nothing is equilibrated.
- Parameters:
series_list (list) – A list of pandas.Series energy arrays.
precision (float) – The precision of the output \(A_c\). To speed up the calculation, the data are block-averaged first; the block size is controlled by the desired precision.
tol (float) – Tolerance (or convergence threshold \(\epsilon\) in [Fan2021]) in \(kT\).
- Returns:
The area \(A_c\) under the curve of the convergence time fraction.
- Return type:
float
Notes
This function computes \(A_c\) from equation 18 of [Fan2021].
Please cite [Fan2021] when using this function.
See also
Added in version 1.0.0.
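The interpretation of \(A_c\) as an area can be illustrated with a short sketch. It assumes that \(A_c\) is obtained by integrating, over the time fraction from 0 to 1, the share of windows whose \(R_c\) does not exceed that fraction. That reading is an assumption based on the description above, not a transcription of alchemlyb's code or of equation 18, and the name `a_c_sketch` is hypothetical.

```python
def a_c_sketch(rc_values, precision=0.01):
    """Hypothetical helper: area under the curve 'share of windows with
    R_c <= x' for x on a regular grid over [0, 1)."""
    steps = int(round(1 / precision))
    area = 0.0
    for i in range(steps):
        x = i / steps
        # Share of windows already considered equilibrated at fraction x.
        area += sum(rc <= x for rc in rc_values) / len(rc_values)
    return area / steps


all_equilibrated = a_c_sketch([0.0, 0.0])   # every window has R_c = 0
none_equilibrated = a_c_sketch([1.0, 1.0])  # every window has R_c = 1
half_and_half = a_c_sketch([0.0, 1.0])
```

Under this reading, the two limiting cases reproduce the behaviour stated above: \(A_c = 1\) when every window is equilibrated from the start and \(A_c = 0\) when none is.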