dcTMD via Force

Introduction¶

Another option to directly analyze the constraint force time traces, is to calculate $\Delta G$ on-the-fly starting from the friction estimate ${\it{\Gamma}}$ which is calculated from the force $f_c(t)$ auto correlation via:

$$ \begin{align} \Gamma &= \beta \int_{0}^{t} \text{d}\tau \left<\delta f_c(t)\delta f_c(t-\tau)\right>_\text{N},\\ \Delta G(s) &= - v_c\int_{s_0}^{s} \text{d}s' \Gamma(s') + \int_{s_0}^{s} \text{d}s' \left<f_c(s')\right>_\text{N} \end{align} $$

This approach is computionally more demanding, since the full resolution of the force time traces is needed to determine friction and free energy estimates. Therefore, we advise the use of the WorkEstimator class, which is computationally less demanding, since it does not require the full resolution of the force time traces.

Workflow:¶

0. load packages and define variables¶

In [1]:

Copied!





import numpy as np
from dcTMD.dcTMD import ForceEstimator
from dcTMD.storing import ForceSet, load
import dcTMD
import numpy as np
from dcTMD.dcTMD import ForceEstimator
from dcTMD.storing import ForceSet, load
import dcTMD

In [2]:

Copied!





# define variables
velocity = 0.001
res = 1
verbose = True
temperature = 300
# define variables
velocity = 0.001
res = 1
verbose = True
temperature = 300

I. create forceset¶

To calculate free energy and friction estimates a forceset containing all the force time traces is needed.

an array containing the filenames is generated. This can be done via the function dcTMD.io.load_pullf() which takes either a glob pattern or a file containing the pullf file names as argument.

In [3]:

Copied!

pullf_files = '../../tests/testdata/pullf_filenames.dat'
pullf_files = '../../tests/testdata/*pullf.xvg'
filenames = dcTMD.io.load_pullf(pullf_files)

filenames
pullf_files = '../../tests/testdata/pullf_filenames.dat'
pullf_files = '../../tests/testdata/*pullf.xvg'
filenames = dcTMD.io.load_pullf(pullf_files)

filenames

file ../../tests/testdata/*pullf.xvg not found. using glob.glob(../../tests/testdata/*pullf.xvg)

Out[3]:

['../../tests/testdata/t_middle_32_pullf.xvg',
 '../../tests/testdata/t_middle_03_pullf.xvg',
 '../../tests/testdata/t_middle_34_pullf.xvg',
 '../../tests/testdata/t_middle_24_pullf.xvg',
 '../../tests/testdata/t_middle_21_pullf.xvg',
 '../../tests/testdata/t_middle_04_pullf.xvg',
 '../../tests/testdata/t_middle_29_pullf.xvg',
 '../../tests/testdata/t_middle_16_pullf.xvg',
 '../../tests/testdata/t_middle_30_pullf.xvg',
 '../../tests/testdata/t_middle_19_pullf.xvg',
 '../../tests/testdata/t_middle_01_pullf.xvg',
 '../../tests/testdata/t_middle_28_pullf.xvg',
 '../../tests/testdata/t_middle_26_pullf.xvg',
 '../../tests/testdata/t_middle_31_pullf.xvg',
 '../../tests/testdata/t_middle_09_pullf.xvg',
 '../../tests/testdata/t_middle_17_pullf.xvg',
 '../../tests/testdata/t_middle_25_pullf.xvg',
 '../../tests/testdata/t_middle_05_pullf.xvg']

the forceset is created by creating a ForceSet instance which is fitted with the filenames.

In [4]:

Copied!





# create ForceSet instance
forceset = ForceSet(velocity=velocity,
                  resolution=res,
                  verbose=verbose,
                  )
forceset
# create ForceSet instance
forceset = ForceSet(velocity=velocity,
                  resolution=res,
                  verbose=verbose,
                  )
forceset

Out[4]:

ForceSet(velocity=0.001, verbose=True)

In [5]:

Copied!





# fit/fill workset
forceset.fit(filenames)
# save workset
#dcTMD.storing.save('my_forceset', forceset)
# fit/fill workset
forceset.fit(filenames)
# save workset
#dcTMD.storing.save('my_forceset', forceset)

Using ../../tests/testdata/t_middle_32_pullf.xvg to initialize arrays.
length of pullf file is 20001
reduced length is 20001

Loading force files:  44%|████▍     | 8/18 [00:00<00:00, 79.37it/s]

Reading file ../../tests/testdata/t_middle_32_pullf.xvg
Reading file ../../tests/testdata/t_middle_03_pullf.xvg
Reading file ../../tests/testdata/t_middle_34_pullf.xvg
Reading file ../../tests/testdata/t_middle_24_pullf.xvg
Reading file ../../tests/testdata/t_middle_21_pullf.xvg
Reading file ../../tests/testdata/t_middle_04_pullf.xvg
Reading file ../../tests/testdata/t_middle_29_pullf.xvg
Reading file ../../tests/testdata/t_middle_16_pullf.xvg
Reading file ../../tests/testdata/t_middle_30_pullf.xvg
Reading file ../../tests/testdata/t_middle_19_pullf.xvg
Reading file ../../tests/testdata/t_middle_01_pullf.xvg
Reading file ../../tests/testdata/t_middle_28_pullf.xvg

Loading force files:  44%|████▍     | 8/18 [00:00<00:00, 79.37it/s]

Reading file ../../tests/testdata/t_middle_26_pullf.xvg
Reading file ../../tests/testdata/t_middle_31_pullf.xvg
Reading file ../../tests/testdata/t_middle_09_pullf.xvg

Loading force files:  44%|████▍     | 8/18 [00:00<00:00, 79.37it/s]

Reading file ../../tests/testdata/t_middle_17_pullf.xvg
Reading file ../../tests/testdata/t_middle_25_pullf.xvg

Loading force files: 100%|██████████| 18/18 [00:00<00:00, 80.68it/s]

Reading file ../../tests/testdata/t_middle_05_pullf.xvg

Out[5]:

ForceSet(velocity=0.001, verbose=True)

II. check normality of work distribution¶

One of the main conditions which need to be fulfilled for dcTMD is a normally distributed work.

This can be checked via different methods. e.g. plotting the work time traces, normality checks at different x positions, Kolmogorov-Smirnov Test, Shapiro-Wilk Test, Anderson-Darling Test,

CAUTION: if the work distribution is not normal you results are compromised. And a path separation is necessary. For the theory on path separation see...

In [6]:

Copied!





# plot workset
import matplotlib.pyplot as plt
from dcTMD.utils import plotting

fig, ax = plt.subplots()
plotting.plot_worklines(forceset, ax)
plt.show()
# plot workset
import matplotlib.pyplot as plt
from dcTMD.utils import plotting

fig, ax = plt.subplots()
plotting.plot_worklines(forceset, ax)
plt.show()

No description has been provided for this image

In [7]:

Copied!





# check if work distribution follows a normal distribution
from scipy.stats import kstest, shapiro, anderson
from dcTMD.utils import plotting

index = [5000, 15000]
x = forceset.position_

fig, axs = plotting.plot_worknormalitychecks(forceset, index)

for i, p in enumerate(index):
    # Shapiro-Wilk Test
    shapiro_test = shapiro(forceset.work_[:,p])
    print(f'shapiro wilkins results at x={x[p]} is {shapiro_test}')
    # Anderson-Darling Test
    # If the test statsitics is larger than the critical value of a given
    # significance_level in percent, the null hypothesis that the work
    # is normally distributed has to be rejected.
    anderson_test = anderson(forceset.work_[:,p], 'norm')
    print(f'anderson darling results at x={x[p]} is {anderson_test}.')
    # Kolmogorov-Smirnov Test (requires centering and scaling of input data)
    kstest_test = kstest(
        (forceset.work_[:,p]-np.mean(forceset.work_[:,p]))/np.std(forceset.work_[:,p]),
        'norm'
    )
    print(f'Kolmogorov-Smirnov results at x={x[p]} is {kstest_test}')
# check if work distribution follows a normal distribution
from scipy.stats import kstest, shapiro, anderson
from dcTMD.utils import plotting

index = [5000, 15000]
x = forceset.position_

fig, axs = plotting.plot_worknormalitychecks(forceset, index)

for i, p in enumerate(index):
    # Shapiro-Wilk Test
    shapiro_test = shapiro(forceset.work_[:,p])
    print(f'shapiro wilkins results at x={x[p]} is {shapiro_test}')
    # Anderson-Darling Test
    # If the test statsitics is larger than the critical value of a given
    # significance_level in percent, the null hypothesis that the work
    # is normally distributed has to be rejected.
    anderson_test = anderson(forceset.work_[:,p], 'norm')
    print(f'anderson darling results at x={x[p]} is {anderson_test}.')
    # Kolmogorov-Smirnov Test (requires centering and scaling of input data)
    kstest_test = kstest(
        (forceset.work_[:,p]-np.mean(forceset.work_[:,p]))/np.std(forceset.work_[:,p]),
        'norm'
    )
    print(f'Kolmogorov-Smirnov results at x={x[p]} is {kstest_test}')

shapiro wilkins results at x=0.5 is ShapiroResult(statistic=0.9809889793395996, pvalue=0.9596364498138428)
anderson darling results at x=0.5 is AndersonResult(statistic=0.1889729620402001, critical_values=array([0.503, 0.573, 0.687, 0.802, 0.954]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ])).
Kolmogorov-Smirnov results at x=0.5 is KstestResult(statistic=0.1222997987555296, pvalue=0.920955734836253)
shapiro wilkins results at x=1.5 is ShapiroResult(statistic=0.9350144863128662, pvalue=0.23768189549446106)
anderson darling results at x=1.5 is AndersonResult(statistic=0.38109849215264546, critical_values=array([0.503, 0.573, 0.687, 0.802, 0.954]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ])).
Kolmogorov-Smirnov results at x=1.5 is KstestResult(statistic=0.14917443864214264, pvalue=0.7646028245726414)

III. derive estimates from forceset¶

create ForceEstimator instance
fit ForceEstimator instance with previously created forceset

In [8]:

Copied!





## load force
# force = load('my_force_set')
# Instantiate a ForceEstimator instance and fit it with the ForceSet
# instance
forceestimator = ForceEstimator(temperature)
forceestimator.fit(forceset)
vars(forceestimator)
## load force
# force = load('my_force_set')
# Instantiate a ForceEstimator instance and fit it with the ForceSet
# instance
forceestimator = ForceEstimator(temperature)
forceestimator.fit(forceset)
vars(forceestimator)

Out[8]:

{'temperature': 300,
 'verbose': False,
 'force_set': ForceSet(velocity=0.001, verbose=True),
 'names_': array(['t_middle_32_pullf.xvg', 't_middle_03_pullf.xvg',
        't_middle_34_pullf.xvg', 't_middle_24_pullf.xvg',
        't_middle_21_pullf.xvg', 't_middle_04_pullf.xvg',
        't_middle_29_pullf.xvg', 't_middle_16_pullf.xvg',
        't_middle_30_pullf.xvg', 't_middle_19_pullf.xvg',
        't_middle_01_pullf.xvg', 't_middle_28_pullf.xvg',
        't_middle_26_pullf.xvg', 't_middle_31_pullf.xvg',
        't_middle_09_pullf.xvg', 't_middle_17_pullf.xvg',
        't_middle_25_pullf.xvg', 't_middle_05_pullf.xvg'], dtype='<U32'),
 'delta_force_array': array([[-8.73736883e+02,  2.06404027e+03, -1.64675083e+03, ...,
         -3.84514611e+02, -6.60017244e+02,  1.03823811e+03],
        [-1.21761218e+03, -1.68094973e+03, -5.91505833e+02, ...,
         -1.45385561e+03,  6.41780156e+02, -6.25663889e+02],
        [-2.50703183e+02,  6.28820267e+02, -7.42814833e+02, ...,
          1.87819439e+03,  6.96557756e+02, -1.87626089e+03],
        ...,
        [-2.08374518e+03,  1.03954027e+03,  2.25121917e+03, ...,
          1.37587439e+03,  1.52341276e+03, -5.30872889e+02],
        [-2.15387183e+02,  2.92739027e+03,  1.24891917e+03, ...,
         -2.93190561e+03,  1.06975556e+00, -6.28384889e+02],
        [ 2.67058167e+01, -4.78409733e+02, -7.94146833e+02, ...,
         -3.65687561e+03,  1.25278176e+03, -2.15804889e+02]]),
 'position_': array([0.0000e+00, 1.0000e-04, 2.0000e-04, ..., 1.9998e+00, 1.9999e+00,
        2.0000e+00]),
 'W_mean_': array([0.00000000e+00, 4.41837458e-02, 4.44572742e-02, ...,
        9.33136501e+01, 9.32887153e+01, 9.32516402e+01]),
 'W_diss_': array([0.00000000e+00, 1.31284676e-03, 5.44972880e-03, ...,
        1.03329455e+02, 1.03220321e+02, 1.03098458e+02]),
 'dG_': array([  0.        ,   0.0428709 ,   0.03900755, ..., -10.01580464,
         -9.93160537,  -9.846818  ]),
 'friction_': array([       0.        ,    26256.93524359,    56480.70560122, ...,
         1453172.40056802, -3635855.06739585,  1198605.76619241])}

Visualize results¶

In the package a couple of simple plot functions to get an overview of the results are implemented. e.g. plot_dcTMD_results()

In [9]:

Copied!

# plot dcTMD results
from dcTMD.utils import plotting

fig, axs = plotting.plot_dcTMD_results(forceestimator)

plt.show()
# plot dcTMD results
from dcTMD.utils import plotting

fig, axs = plotting.plot_dcTMD_results(forceestimator)

plt.show()

Smooth friction estimate¶

Finally, the friction estimate needs to be smoothed. This can be done via dcTMD.utils.smoothing.gaussfilter_friction() or dcTMD.WorkEstimator.smooth_friction(sigma, mode) sigma is the standard deviation of gaussian kernel in nm the mode parameter determines how the input array is extended beyond its boundaries.

Caution: this can lead to long computations using large datasets and a big smoothing window.

In [10]:

Copied!





# smooth friction and plot results
forceestimator.smooth_friction(sigma=0.1)

# if the friction is smoothed that value is automatically plotted
fig, axs = plotting.plot_dcTMD_results(forceestimator)
plt.show()

# but one can also specify the friction
#fig, axs = plotting.plot_dcTMD_results(
#    forceestimator,
#    friction=forceestimator.friction_
#)
# smooth friction and plot results
forceestimator.smooth_friction(sigma=0.1)

# if the friction is smoothed that value is automatically plotted
fig, axs = plotting.plot_dcTMD_results(forceestimator)
plt.show()

# but one can also specify the friction
#fig, axs = plotting.plot_dcTMD_results(
#    forceestimator,
#    friction=forceestimator.friction_
#)

Using different smoothing windows and modes changes the results significantly. Because of the different boundary handling in the modes. mode='nearest' leads to an overestimation at the right hand boarder (end of the simulation), while mode='reflect' leads to a wash out of the boarder on the left hand side.

Here, are some examples:

In [11]:

Copied!





fig, axs = plt.subplots()

x = forceestimator.position_
plotting.plot_Gamma(x,
                    forceestimator.friction_smooth_,
                    axs,
                    label='default (reflect) 0.1nm'
                    )

# using different smoothing windows and modes changes the results significantly
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   x,
                                                   sigma=.01,
                                                   mode='reflect',
                                                   )
axs.plot(x, smooth_friction, label='reflect .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   x,
                                                   sigma=.01, 
                                                   mode='nearest',
                                                   )
axs.plot(x, smooth_friction, label='nearest .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   x,
                                                   sigma=0.2,
                                                   mode='reflect',
                                                   )
                                                   
axs.plot(x, smooth_friction, label='reflect .2nm')
axs.legend()

plt.show()
fig, axs = plt.subplots()

x = forceestimator.position_
plotting.plot_Gamma(x,
                    forceestimator.friction_smooth_,
                    axs,
                    label='default (reflect) 0.1nm'
                    )

# using different smoothing windows and modes changes the results significantly
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   x,
                                                   sigma=.01,
                                                   mode='reflect',
                                                   )
axs.plot(x, smooth_friction, label='reflect .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   x,
                                                   sigma=.01, 
                                                   mode='nearest',
                                                   )
axs.plot(x, smooth_friction, label='nearest .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   x,
                                                   sigma=0.2,
                                                   mode='reflect',
                                                   )
                                                   
axs.plot(x, smooth_friction, label='reflect .2nm')
axs.legend()

plt.show()

IV. Save and load results¶

In [12]:

Copied!

# save forceestimator instance
# dcTMD.storing.save('my_forceestimator', forceestimator)

# save data as .npz and .dat file
outname = 'my_forceestimator_results'
dcTMD.io.write_output(outname, forceestimator)

results = np.load(f'{outname}_N{len(forceestimator.names_)}_dG.npz')

results.files
# save forceestimator instance
# dcTMD.storing.save('my_forceestimator', forceestimator)

# save data as .npz and .dat file
outname = 'my_forceestimator_results'
dcTMD.io.write_output(outname, forceestimator)

results = np.load(f'{outname}_N{len(forceestimator.names_)}_dG.npz')

results.files

save file my_forceestimator_results_N18_dG.dat
save file my_forceestimator_results_N18_dG.npz

Out[12]:

['x', 'Wmean', 'Wdiss', 'dG', 'Gamma', 'Gamma_smooth']