Skip to content

tests

Benchmark Markov State Model.

This submodule holds methods for validating Markov state models.

chapman_kolmogorov_test(trajs, lagtimes, tmax)

Calculate the Chapman-Kolmogorov equation.

This method evaluates both sides of the Chapman-Kolmogorov equation

\[T(\tau n) = T^n(\tau)\;.\]

So to compare the transition probability estimated based on the lag time \(n\tau\) (referred as "MD") with the transition probability estimated based on the lag time \(\tau\) and propagated \(n\) times (referred as "MSM"), we can use the Chapman-Kolmogorov test. If the model is Markovian, both sides are identical, and the deviation indicates how Markovian the model is. The Chapman-Kolmogorov test is commonly projected onto the diagonal (so limiting to \(T_{ii}\)). For more details, see the review by Prinz et al. 1.

The returned dictionary can be visualized using msmhelper.plot.plot_ck_test. An example can be found in the tutorial.


  1. Prinz et al., Markov models of molecular kinetics: Generation and validation, J. Chem. Phys., 134, 174105 (2011), doi:10.1063/1.3565032 

Parameters:

  • trajs (StateTraj or list or ndarray or list of ndarray) –

    State trajectory/trajectories. The states should start from zero and need to be integers.

  • lagtimes (list or ndarray int) –

    Lagtimes for estimating the markov model given in [frames].

  • tmax (int) –

    Longest time to evaluate the CK equation given in [frames].

Returns:

  • cktest ( dict ) –

    Dictionary holding for each lagtime the CK equation and with 'md' the reference.

Source code in src/msmhelper/msm/tests.py
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
@decorit.alias('ck_test')
def chapman_kolmogorov_test(trajs, lagtimes, tmax):
    r"""Calculate the Chapman-Kolmogorov equation.

    This method evaluates both sides of the Chapman-Kolmogorov equation

    $$T(\tau n) = T^n(\tau)\;.$$

    So to compare the transition probability estimated based on the lag time
    $n\tau$ (referred as "MD") with the transition probability estimated based
    on the lag time $\tau$ and propagated $n$ times (referred as "MSM"), we can
    use the Chapman-Kolmogorov test. If the model is Markovian, both sides are
    identical, and the deviation indicates how Markovian the model is. The
    Chapman-Kolmogorov test is commonly projected onto the diagonal (so
    limiting to $T_{ii}$). For more details, see the review by Prinz et al.
    [^1].

    The returned dictionary can be visualized using
    [msmhelper.plot.plot_ck_test][]. An example can be found in the
    [tutorial](/msmhelper/tutorials/msm/#chapman-kolmogorov-test).

    [^1]: Prinz et al., **Markov models of molecular kinetics: Generation and
        validation**, *J. Chem. Phys.*, 134, 174105 (2011),
        doi:[10.1063/1.3565032](https://doi.org/10.1063/1.3565032)

    Parameters
    ----------
    trajs : StateTraj or list or ndarray or list of ndarray
        State trajectory/trajectories. The states should start from zero and
        need to be integers.
    lagtimes : list or ndarray int
        Lagtimes for estimating the markov model given in [frames].
    tmax : int
        Longest time to evaluate the CK equation given in [frames].

    Returns
    -------
    cktest : dict
        Dictionary holding for each lagtime the CK equation and with 'md' the
        reference.

    """
    # format input
    trajs = StateTraj(trajs)
    lagtimes = np.atleast_1d(lagtimes)
    lagtimes = np.sort(lagtimes)

    # check that lag times are array of integers
    if not np.issubdtype(lagtimes.dtype, np.integer):
        raise TypeError(
            'Lagtimes needs to be integers but are {0}'.format(lagtimes.dtype),
        )
    if not (lagtimes > 0).all():
        raise TypeError('Lagtimes needs to be positive integers')

    if lagtimes.ndim != 1:
        raise TypeError(
            'Lagtimes needs to be maximal 1d, but {0}'.format(lagtimes),
        )

    if not isinstance(tmax, int) or tmax < 0:
        raise TypeError('tmax needs to be a positive integer')

    ckeqs = {}
    for lagtime in lagtimes:
        ckeqs[lagtime] = _chapman_kolmogorov_test(trajs, lagtime, tmax)
    ckeqs['md'] = _chapman_kolmogorov_test_md(
        trajs, tmin=lagtimes[0], tmax=tmax,
    )

    return ckeqs