Usage
Command-line interface
Generate a default config file and run training:
```bash
# Generate template config
tempest --generate_config

# Train with your config
tempest --config my_config.yaml
```
Configuration parameters
| Parameter | Description | Default |
|---|---|---|
| data_path | Path to input coordinates file (whitespace-separated) | — |
| inducing_points_path | Path to inducing point timestamps | — |
| save_path | Output directory | — |
| dim_input | Number of input features | — |
| dim_latent | Latent space dimensionality | 2 |
| neurons_ae | Hidden layer sizes, e.g. [32, 32, 32] | [32, 32, 32] |
| epochs | Training epochs | 100 |
| batch_size | Batch size (larger is better, ≥512 recommended) | 1024 |
| learning_rate | AdamW learning rate | 1e-4 |
| weight_decay | AdamW weight decay | 1e-6 |
| beta | Weight of the GP regularization term | 50 |
| kernel_nu | Matérn smoothness: 0.5, 1.5, or 2.5 | 1.5 |
| kernel_scale | Time scale of the GP prior | 1000 |
| cuda | Use GPU if available | true |
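To see at a glance which parameters need a value before training, the table can be mirrored in a small Python helper. This is only a sketch: `DEFAULTS` and `resolve_config` are hypothetical names that are not part of the package, the file paths are placeholders, and it assumes that a dash in the Default column means the parameter must be set explicitly.

```python
# Defaults copied from the parameter table; None marks parameters without a
# documented default (assumed here to be required).
DEFAULTS = {
    "data_path": None,
    "inducing_points_path": None,
    "save_path": None,
    "dim_input": None,
    "dim_latent": 2,
    "neurons_ae": [32, 32, 32],
    "epochs": 100,
    "batch_size": 1024,
    "learning_rate": 1e-4,
    "weight_decay": 1e-6,
    "beta": 50,
    "kernel_nu": 1.5,
    "kernel_scale": 1000,
    "cuda": True,
}


def resolve_config(overrides):
    """Merge user overrides into the documented defaults and sanity-check them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    missing = [key for key, value in config.items() if value is None]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    if config["kernel_nu"] not in (0.5, 1.5, 2.5):
        raise ValueError("kernel_nu must be 0.5, 1.5, or 2.5")
    return config


# Hypothetical values for the four parameters without defaults.
config = resolve_config({
    "data_path": "data.dat",
    "inducing_points_path": "inducing_points.dat",
    "save_path": "results",
    "dim_input": 2,
})
```

The authoritative template is still the file written by `tempest --generate_config`; the helper only illustrates which entries typically need editing.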
Python API
```python
import numpy as np
import torch

from gptempest import TEMPEST, MaternKernel
from gptempest.utils import load_prepare_data

# Load data
dataset = load_prepare_data("data.dat", dtype=torch.float64)
inducing_points = np.loadtxt("inducing_points.dat")

# Build model
kernel = MaternKernel(nu=1.5, scale=1e3, dtype=torch.float64)
model = TEMPEST(
    cuda=torch.cuda.is_available(),
    kernel=kernel,
    dim_input=2,
    dim_latent=2,
    layers_hidden_encoder=[32, 32, 32],
    layers_hidden_decoder=[32, 32, 32],
    inducing_points=inducing_points,
    beta=50.0,
    N_data=len(dataset),
    dtype=torch.float64,
)

# Train
model.train_model(
    dataset,
    train_size=1,
    learning_rate=1e-4,
    weight_decay=1e-6,
    batch_size=1024,
    n_epochs=100,
)

# Extract embedding
embedding = model.extract_latent_space(dataset, batch_size=1024)
np.savetxt("embedding.dat", embedding, fmt="%.4f")
```
Choosing inducing points
Inducing points are timestamps that should cover the important events in your trajectory: the metastable states and the transitions between them. A simple starting point is a set of uniformly spaced time points:
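```python
import numpy as np

n_inducing = 50
# `dataset` is the object returned by load_prepare_data in the Python API example
inducing_points = np.linspace(0, len(dataset) - 1, n_inducing)
np.savetxt("inducing_points.dat", inducing_points)
```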
A uniform grid can miss short-lived transitions. For best results, place the points so that both the transitions and the metastable regions are sampled.
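One way to sample transitions more densely is to start from a uniform grid and add a few extra points in a window around each frame where a transition is known or suspected. The sketch below is an illustration, not part of the package: the function name, its arguments, and the example frame numbers are all hypothetical.

```python
import numpy as np


def refined_inducing_points(n_frames, n_uniform, event_frames, half_width, n_per_event):
    """Uniform grid over the trajectory plus extra points around chosen frames."""
    grid = np.linspace(0, n_frames - 1, n_uniform)
    # Extra points in a window around each event, clipped to the trajectory range.
    extra = [
        np.linspace(max(0, f - half_width), min(n_frames - 1, f + half_width), n_per_event)
        for f in event_frames
    ]
    # np.unique sorts the timestamps and removes duplicates.
    return np.unique(np.concatenate([grid, *extra]))


# Hypothetical example: 100,000 frames with transitions near frames 20,000 and 65,000.
inducing_points = refined_inducing_points(
    n_frames=100_000,
    n_uniform=40,
    event_frames=[20_000, 65_000],
    half_width=500,
    n_per_event=5,
)
```

The resulting array can be written with `np.savetxt("inducing_points.dat", inducing_points)`, as for the uniform grid. The window width and the number of points per event are tuning choices that depend on how long the transitions last.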
Kernel smoothness (ν)
| ν | Process | When to use |
|---|---|---|
| 0.5 | Ornstein–Uhlenbeck | Rough, fast dynamics |
| 1.5 | Once differentiable | General MD trajectories |
| 2.5 | Twice differentiable | Smoother, slower dynamics |
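To get a feel for how ν and the time scale shape the prior, the sketch below evaluates the textbook closed-form Matérn correlation for the three supported values of ν. The function `matern` is written here for illustration only; whether `MaternKernel` in the package uses exactly this parametrization (in particular how `scale` enters) is an assumption.

```python
import math


def matern(lag, nu=1.5, scale=1000.0):
    """Textbook Matérn correlation between two time points separated by `lag`."""
    x = abs(lag) / scale
    if nu == 0.5:
        # Exponential (Ornstein-Uhlenbeck) kernel
        return math.exp(-x)
    if nu == 1.5:
        a = math.sqrt(3.0) * x
        return (1.0 + a) * math.exp(-a)
    if nu == 2.5:
        a = math.sqrt(5.0) * x
        return (1.0 + a + a * a / 3.0) * math.exp(-a)
    raise ValueError("nu must be 0.5, 1.5, or 2.5")


# Correlation at a few time lags for each smoothness value.
for nu in (0.5, 1.5, 2.5):
    values = [round(matern(lag, nu), 3) for lag in (0, 100, 1000, 5000)]
    print(f"nu={nu}: {values}")
```

At lags well below the time scale, larger ν keeps neighbouring frames more strongly correlated, which corresponds to the smoother latent trajectories described in the table.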