Convergence API Reference

Functions for assessing convergence of free energy estimates and raw data.

The alchemlyb.convergence.convergence module contains building blocks that each perform a specific convergence analysis. They typically operate on lists of raw data. They either run estimators on these data sets to obtain free energies as a function of the amount of data, or they directly assess the convergence of the raw data.

Note

Read the original literature to learn the exact meaning of parameters and how to interpret the output of the convergence analysis.

All convergence functions are located in this submodule but for convenience they are also made available from alchemlyb.convergence, as shown here:

alchemlyb.convergence.forward_backward_convergence(df_list, estimator='MBAR', num=10, **kwargs)

Forward and backward convergence of the free energy estimate.

Generate the free energy estimate as a function of time in both directions, using the specified number of equally spaced time points [Klimovich2015]. For example, with num=10 the forward convergence is the free energy estimate from the first 10%, 20%, 30%, … of the data, and the backward convergence is the estimate from the last 10%, 20%, 30%, … of the data.
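The slicing described above can be sketched in plain Python. This only illustrates which samples would enter each forward and backward estimate, assuming equally sized fractions of a time-ordered list; the helper name forward_backward_slices is hypothetical and is not part of alchemlyb, and the library's own slicing may differ in detail.

```python
def forward_backward_slices(samples, num=10):
    """Return (data_fraction, forward_subset, backward_subset) for i = 1..num.

    Hypothetical helper for illustration only (not an alchemlyb function).
    """
    block = len(samples) // num  # samples per fraction; any remainder is ignored
    if block == 0:
        raise ValueError("need at least `num` samples")
    out = []
    for i in range(1, num + 1):
        forward = samples[: block * i]     # first i/num of the data
        backward = samples[-block * i:]    # last i/num of the data
        out.append((i / num, forward, backward))
    return out


samples = list(range(20))  # stand-in for 20 time-ordered frames
slices = forward_backward_slices(samples, num=10)
fraction, forward, backward = slices[0]
print(fraction, forward, backward)  # -> 0.1 [0, 1] [18, 19]
```

In the real function, an estimator is run on each subset; here the subsets are simply returned so the windowing is visible.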

Parameters
  • df_list (list) – List of DataFrames of either dHdl or u_nk.

  • estimator ({'MBAR', 'BAR', 'TI'}) –

    Name of the estimator. See the version notes below on the use of “MBAR”.

    Deprecated since version 1.0.0: Lower case input is also accepted until release 2.0.0.

  • num (int) – The number of time points.

  • kwargs (dict) – Keyword arguments to be passed to the estimator.

Returns

The DataFrame with convergence data.

    Forward  Forward_Error  Backward  Backward_Error  data_fraction
0  3.016442       0.052748  3.065176        0.051036            0.1
1  3.078106       0.037170  3.078567        0.036640            0.2
2  3.072561       0.030186  3.047357        0.029775            0.3
3  3.048325       0.026070  3.057527        0.025743            0.4
4  3.049769       0.023359  3.037454        0.023001            0.5
5  3.034078       0.021260  3.040484        0.021075            0.6
6  3.043274       0.019642  3.032495        0.019517            0.7
7  3.035460       0.018340  3.036670        0.018261            0.8
8  3.042032       0.017319  3.046597        0.017233            0.9
9  3.044149       0.016405  3.044385        0.016402            1.0

Return type

pandas.DataFrame

New in version 0.6.0.

Changed in version 1.0.0: The estimator argument accepts uppercase input. The estimator used for estimator='MBAR' was changed from MBAR to AutoMBAR.

Changed in version 2.0.0: Use pymbar.MBAR instead of the AutoMBAR option.
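As a rough illustration of how the returned columns relate to each other, the sketch below copies the rows of the example table above into plain tuples and checks whether the forward and backward estimates agree within their combined uncertainty at each data fraction. The agreement test, the helper name agrees and the n_sigma parameter are assumptions made for this example; they are not a criterion defined by alchemlyb, and [Klimovich2015] remains the reference for interpreting the analysis.

```python
from math import sqrt

# (Forward, Forward_Error, Backward, Backward_Error, data_fraction),
# copied from the example table above.
rows = [
    (3.016442, 0.052748, 3.065176, 0.051036, 0.1),
    (3.078106, 0.037170, 3.078567, 0.036640, 0.2),
    (3.072561, 0.030186, 3.047357, 0.029775, 0.3),
    (3.048325, 0.026070, 3.057527, 0.025743, 0.4),
    (3.049769, 0.023359, 3.037454, 0.023001, 0.5),
    (3.034078, 0.021260, 3.040484, 0.021075, 0.6),
    (3.043274, 0.019642, 3.032495, 0.019517, 0.7),
    (3.035460, 0.018340, 3.036670, 0.018261, 0.8),
    (3.042032, 0.017319, 3.046597, 0.017233, 0.9),
    (3.044149, 0.016405, 3.044385, 0.016402, 1.0),
]


def agrees(row, n_sigma=1.0):
    """Hypothetical check: |Forward - Backward| within n_sigma combined errors."""
    forward, f_err, backward, b_err, _ = row
    combined = sqrt(f_err ** 2 + b_err ** 2)  # errors added in quadrature
    return abs(forward - backward) <= n_sigma * combined


flags = [(row[4], agrees(row)) for row in rows]
for fraction, ok in flags:
    print(f"{fraction:.1f} {'agree' if ok else 'differ'}")
```

With a real result, the same check could be applied to the rows of the returned pandas.DataFrame.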

alchemlyb.convergence.fwdrev_cumavg_Rc(series, precision=0.01, tol=2)

Generate the convergence criterion \(R_c\) for a single simulation.

The input is a pandas.Series generated by decorrelate_u_nk() or decorrelate_dhdl().

The output is the float \(R_c\) [Fan2020] [Fan2021] and a pandas.DataFrame with the forward and backward cumulative averages at fractional increments of precision, as described below.

\(R_c = 0\) indicates that the system is well equilibrated right from the beginning while \(R_c = 1\) signifies that the whole trajectory is not equilibrated.
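The idea behind \(R_c\) can be sketched with plain Python lists. The sketch assumes that \(R_c\) is the smallest data fraction after which both the forward and the backward cumulative average stay within tol of the average over all data; this is a simplified reading of the description here, the names cumulative_average and rc_sketch are hypothetical, and equation 16 of [Fan2021] is the authoritative definition.

```python
from itertools import accumulate


def cumulative_average(values):
    """Running mean of values[0..i] for every index i."""
    return [total / (i + 1) for i, total in enumerate(accumulate(values))]


def rc_sketch(values, tol=2.0):
    """Hypothetical, simplified R_c: not the alchemlyb implementation."""
    forward = cumulative_average(values)
    backward = cumulative_average(values[::-1])  # averages taken from the end
    final = forward[-1]                          # average over all data
    n = len(values)
    within = [
        abs(f - final) < tol and abs(b - final) < tol
        for f, b in zip(forward, backward)
    ]
    for i in range(n):
        if all(within[i:]):  # both averages stay within tol from here on
            return i / n
    return 1.0


# A series that starts far from its long-time behaviour.
values = [8.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(rc_sketch(values, tol=1.0))      # -> 0.4
print(rc_sketch([1.0] * 10, tol=1.0))  # -> 0.0
```

The first series only settles after its initial outlier has been averaged down, so the sketch returns a non-zero fraction; a constant series returns 0.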

Parameters
  • series (pandas.Series) – The input energy array.

  • precision (float) – The precision of the output \(R_c\). To speed up the calculation, the data are block-averaged first; the block size is set by the desired precision.

  • tol (float) – Tolerance (or convergence threshold \(\epsilon\) in [Fan2021]) in \(kT\).
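The role of precision can be illustrated with a small block-averaging sketch. It assumes that a precision of p leaves roughly 1/p blocks of equal size; the helper name block_average is hypothetical, and how the library partitions the data internally may differ.

```python
def block_average(values, precision=0.01):
    """Average consecutive chunks so that roughly 1/precision blocks remain.

    Hypothetical helper for illustration only (not an alchemlyb function).
    """
    n_blocks = int(round(1 / precision))
    size = max(1, len(values) // n_blocks)  # frames per block, at least one
    blocks = [values[start:start + size] for start in range(0, len(values), size)]
    # The last block may be shorter when len(values) is not a multiple of size.
    return [sum(block) / len(block) for block in blocks]


values = [float(i) for i in range(1000)]
fine = block_average(values, precision=0.01)   # 100 blocks of 10 frames
coarse = block_average(values, precision=0.1)  # 10 blocks of 100 frames
print(len(fine), fine[0])      # -> 100 4.5
print(len(coarse), coarse[0])  # -> 10 49.5
```

A smaller precision keeps more blocks, so \(R_c\) can be resolved in finer steps at the cost of more work.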

Returns

  • float – Convergence time fraction \(R_c\) [Fan2021]

  • pandas.DataFrame

    The DataFrame with the forward and backward cumulative averages.

        Forward  Backward  data_fraction
    0  3.016442  3.065176            0.1
    1  3.078106  3.078567            0.2
    2  3.072561  3.047357            0.3
    3  3.048325  3.057527            0.4
    4  3.049769  3.037454            0.5
    5  3.034078  3.040484            0.6
    6  3.043274  3.032495            0.7
    7  3.035460  3.036670            0.8
    8  3.042032  3.046597            0.9
    9  3.044149  3.044385            1.0
    

Notes

This function computes \(R_c\) using equation 16 of [Fan2021]. The code is adapted from Shujie Fan’s (@VOD555) work. Zhiyi Wu (@xiki-tempula) improved the performance of the original algorithm.

Please cite [Fan2021] when using this function.

See also

A_c

New in version 1.0.0.

alchemlyb.convergence.A_c(series_list, precision=0.01, tol=2)

Generate the ensemble convergence criterion \(A_c\) for a set of simulations.

The input is a list of pandas.Series generated by decorrelate_u_nk() or decorrelate_dhdl().

The output is the float \(A_c\) [Fan2020] [Fan2021]. \(A_c\) is a number between 0 and 1 that can be interpreted as the ratio of the total equilibrated simulation time to the whole simulation time for a full set of simulations. \(A_c = 1\) means that all simulation time frames in all windows can be considered equilibrated, while \(A_c = 0\) indicates that nothing is equilibrated.
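The interpretation above can be sketched from a list of per-window \(R_c\) values. The sketch assumes that \(A_c\) is the area, over \(R\) from 0 to 1, under the step curve giving the fraction of windows with \(R_c \le R\); the name ac_sketch is hypothetical, and equation 18 of [Fan2021] is the authoritative definition.

```python
def ac_sketch(rc_values):
    """Area under F(R) = fraction of windows with R_c <= R, for R in [0, 1].

    Hypothetical helper for illustration only (not the alchemlyb implementation).
    """
    n = len(rc_values)
    grid = sorted(set(rc_values) | {0.0, 1.0})  # points where the step curve changes
    area = 0.0
    for left, right in zip(grid, grid[1:]):
        # F(R) is constant on [left, right), so evaluate it at the left edge.
        converged = sum(1 for rc in rc_values if rc <= left)
        area += (right - left) * converged / n
    return area


print(ac_sketch([0.0, 0.0]))  # -> 1.0, every window equilibrated from the start
print(ac_sketch([1.0, 1.0]))  # -> 0.0, no window equilibrated
mixed = ac_sketch([0.0, 0.5, 1.0])
```

For this step curve the area works out to one minus the mean of the \(R_c\) values, which matches the reading of \(A_c\) as the equilibrated share of the total simulation time when all windows have the same length.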

Parameters
  • series_list (list) – A list of pandas.Series energy arrays.

  • precision (float) – The precision of the output \(A_c\). To speed up the calculation, the data are block-averaged first; the block size is set by the desired precision.

  • tol (float) – Tolerance (or convergence threshold \(\epsilon\) in [Fan2021]) in \(kT\).

Returns

The area \(A_c\) under the curve of the convergence time fraction.

Return type

float

Notes

This function computes \(A_c\) using equation 18 of [Fan2021].

Please cite [Fan2021] when using this function.

See also

fwdrev_cumavg_Rc

New in version 1.0.0.