Introduction

alchemrs is a CLI-first tool for alchemical free energy analysis.

The project is organized as a Cargo workspace with a core Rust package, a CLI binary, and a Python extension package. Inside the library, the code is split into focused modules for:

parsing supported AMBER and GROMACS outputs into typed Rust data structures
preprocessing time series by trimming and decorrelation
estimating free energies with TI, BAR, MBAR, IEXP, DEXP, NES, UWHAM, and ATM
computing overlap diagnostics
computing schedule advice for u_nk, TI, and NES workflows
exposing the workflow through a CLI, a clean top-level Rust API, and Python bindings

The intended user journey is:

Start with the alchemrs CLI for the standard workflows.
Drop to the Rust API when you need embedding, scripting, tighter control, or integration into a larger application.
Use the existing Python package when you want the same core implementation from Python or OpenMM-driven workflows.

At the moment, the practical workflow is:

Parse supported engine outputs into DhdlSeries, UNkMatrix, or SwitchingTrajectory, or construct AtmSampleSet / AtmLogQMatrix directly for ATM workflows.
Trim and decorrelate those data if needed.
Apply an estimator.
Inspect uncertainties, overlap diagnostics, or schedule advice.

This can be done from the CLI or the library API, but the CLI is the default front door.

This book documents the current codebase rather than a future roadmap.

Getting Started

The fastest way to use alchemrs is through the CLI. The Rust API is documented separately for embedding and custom workflows.

Prerequisites

Rust 1.85 or newer
AMBER output files or GROMACS dhdl.xvg files
Python 3.9 or newer plus maturin if you want the local Python package

The repository is a normal Cargo workspace and can be built and tested with standard Rust tooling.

Build the package

cargo build

Build the CLI binary

cargo build --release

Then run:

./target/release/alchemrs --help

Try a CLI workflow

./target/release/alchemrs ti \
  --temperature 300 \
  ./fixtures/amber/acetamide_tiny/*/acetamide.prod.out

Run tests

cargo test

Run the Rust examples

cargo run --example amber_ti -- 300 ./fixtures/amber/acetamide_tiny/*/acetamide.prod.out

cargo run --example amber_mbar -- 300 ./fixtures/amber/acetamide_tiny/*/acetamide.prod.out

cargo run --example openmm_u_kln_mbar

Run the Python package locally

Build and install the Python extension into the active environment:

maturin develop

Then run the Python tests and example scripts from the repository root:

python -m pytest python/tests -q
python python/examples/amber_fixture_analysis.py
python python/examples/atm_analysis.py

The OpenMM toy-system examples are also available when openmm is installed:

python python/examples/openmm_u_kln_mbar.py
python python/examples/openmm_nes.py
python python/examples/openmm_atm.py

See Python and OpenMM for the Python package layout, helper APIs, and the current local-usage workflow.

Build this book

This documentation uses mdBook.

Install it once:

cargo install mdbook

Preview the docs locally:

mdbook serve docs

Build static HTML:

mdbook build docs

CLI Guide

The CLI is the primary entry point for alchemrs. It ships as the alchemrs binary in the same package as the Rust library crate.

Top-level commands:

advise-schedule
ti
bar
mbar
nes
iexp
dexp

Build

cargo build --release

Top-Level Usage

alchemrs [--threads <N>] <COMMAND>

Top-level options:

--threads <THREADS>
- Number of worker threads to use for internal Rayon-backed parallelism.
-h, --help
- Print top-level help.

Common CLI Behavior

Most commands follow the same overall workflow:

load one simulation output per lambda window, or one AMBER nonequilibrium switching trajectory per file for nes
infer temperature from input files unless --temperature is provided
optionally trim initial samples with --remove-burnin
optionally run equilibration detection with --auto-equilibrate
optionally decorrelate retained samples with --decorrelate
fit the requested estimator or advisor
write text, JSON, CSV, or an HTML report depending on command and flags

Input files are positional arguments:

INPUTS...
- Required for every subcommand.
- Accepted file types are AMBER .out and GROMACS dhdl.xvg.
- Pass one file per lambda window for TI/BAR/MBAR/FEP.
- Pass one file per switching trajectory for nes.

Shared Option Semantics

These options recur across commands.

--temperature <TEMPERATURE>
- Temperature in kelvin.
- If omitted, the CLI attempts to infer it from the input files.
--decorrelate
- Apply decorrelation after burn-in removal / equilibration detection.
- ti decorrelates the dH/dλ series.
- bar, mbar, iexp, dexp, and advise-schedule decorrelate using the selected u_nk observable when operating on u_nk data.
- nes does not expose preprocessing flags because each file is already treated as one complete switching trajectory.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before any other analysis.
- Default: 0
--auto-equilibrate
- Automatically detect equilibration and remove burn-in.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Control conservative subsampling.
- Default: true
- Allowed values: true, false
- Examples:
  - --conservative
  - --conservative=true
  - --conservative=false
--nskip <NSKIP>
- Stride used during equilibration detection.
- Default: 1
--output-units <OUTPUT_UNITS>
- Available on estimator commands and advise-schedule.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write command output to a file instead of stdout.
--parallel
- Enable parallel processing.
- Available on ti, bar, mbar, iexp, and dexp.
--u-nk-observable <U_NK_OBSERVABLE>
- Available on bar, mbar, iexp, dexp, and advise-schedule.
- Hidden-but-recognized on ti only to provide a clearer error message.
- Allowed values:
  - epot
  - de
  - all
- Default: de
--overlap-summary
- Include overlap scalar and overlap eigenvalues in estimator output.
- Available on bar, mbar, iexp, and dexp.

Observable Selection

For u_nk-based estimators:

de
- Default.
- Matches the adjacent-state ΔE style observable used by alchemlyb.
all
- Uses the full u_nk row sum.
epot
- Uses an engine-provided potential-energy observable.
- AMBER: EPtot
- GROMACS dhdl.xvg: Potential Energy

Use epot when:

you want the CLI's external-observable path
the u_nk matrix contains positive infinity values that make de unusable
you need a preprocessing observable that does not depend on adjacent-state ΔE semantics

For multidimensional u_nk schedules:

bar, mbar, iexp, and dexp accept them
CLI output renders lambda states as JSON arrays or bracketed values in text / CSV output
parser-derived lambda component labels are included in provenance when available
de remains one-dimensional, so multidimensional schedules generally require all or epot

TI-Specific Behavior

Current TI-specific behavior:

ti --method auto automatically selects an integration method and records both ti_method and ti_method_reason in provenance.
For GROMACS files with multidimensional lambda schedules, the CLI currently rejects TI input because multiple dH/dλ components cannot yet be collapsed into one scalar TI series safely.
--u-nk-observable is intentionally not part of the public ti --help surface. If supplied, it is only accepted so the command can emit a domain-specific error instead of a generic unknown-flag parse error.

`advise-schedule`

Usage:

alchemrs advise-schedule [OPTIONS] <INPUTS>...

Purpose:

Run schedule diagnostics instead of computing a final scalar free-energy estimate.
Supports u_nk-based schedule analysis, TI-style dH/dλ schedule analysis, and AMBER nonequilibrium switching (NES) trajectory diagnostics.
In u_nk report mode, includes an MBAR-derived overlap-matrix view inspired by the overlap-matrix diagnostic discussed in Klimovich, Shirts, and Mobley, "Guidelines for the analysis of free energy calculations", J Comput Aided Mol Des 29, 397-411 (2015), doi:10.1007/s10822-015-9840-9.

Arguments:

INPUTS...
- Required.
- Simulation output files, one per lambda window in u_nk / TI modes.
- One AMBER switching-trajectory output per file in NES mode.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--estimator <ESTIMATOR>
- Estimator used for adjacent-edge estimates in u_nk advisor mode.
- Allowed values:
  - mbar
  - bar
- Default: mbar
--output-units <OUTPUT_UNITS>
- Output units for energy-valued advisor diagnostics.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--decorrelate
- Apply decorrelation to each window using the selected observable.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before analysis.
- Default: 0
--auto-equilibrate
- Detect equilibration and remove burn-in automatically.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Use conservative subsampling.
- Default: true
- Allowed values: true, false
--nskip <NSKIP>
- Equilibration-detection stride.
- Default: 1
--u-nk-observable <U_NK_OBSERVABLE>
- Observable used for u_nk auto-equilibration and decorrelation.
- Allowed values:
  - epot
  - de
  - all
- Default: de
--input-kind <INPUT_KIND>
- Force the advisor to treat inputs as u_nk, dH/dλ, or NES data.
- Allowed values:
  - auto
  - u-nk
  - dhdl
  - nes
- Default: auto
--overlap-min <OVERLAP_MIN>
- Minimum adjacent overlap before suggesting a new window.
- Default: 0.03
--block-cv-min <BLOCK_CV_MIN>
- Minimum block coefficient of variation before suggesting more sampling.
- Default: 0.15
--n-blocks <N_BLOCKS>
- Number of blocks used for block averaging.
- Default: 4
--no-midpoints
- Disable midpoint proposals for insert-window suggestions.
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write advisor output to a file.
--report <REPORT>
- Write a standalone HTML advisor report to a file.

Behavior notes:

u_nk mode reports overlap-driven diagnostics and window insertion / sampling suggestions.
dhdl mode reports TI spacing diagnostics such as means, SEMs, block CV, split-half drift, slope, curvature, trapezoid contributions, and interval uncertainty.
nes mode reports cumulative Jarzynski convergence, relative uncertainty, recent midpoint-to-final drift, an ensemble-mean dV/dλ(λ) switching profile, and ranked high-curvature lambda regions.
nes mode rejects --decorrelate, --auto-equilibrate, and nonzero --remove-burnin because each input file is already one full switching trajectory.
When --report is provided, the command writes a standalone HTML report in addition to normal structured output.

Examples:

cargo run --release -- advise-schedule \
  --temperature 300 \
  --decorrelate \
  --u-nk-observable de \
  --report schedule-report.html \
  --output-format json \
  path/to/*/prod.out

Force TI-style schedule diagnostics:

cargo run --release -- advise-schedule \
  --temperature 300 \
  --input-kind dhdl \
  --decorrelate \
  --report ti-schedule-report.html \
  --output-format json \
  path/to/*/prod.out

NES switching-trajectory advisor mode:

cargo run --release -- advise-schedule \
  --temperature 300 \
  --input-kind nes \
  --report nes-report.html \
  path/to/run_*/fwd.out

`ti`

Usage:

alchemrs ti [OPTIONS] <INPUTS>...

Purpose:

Perform thermodynamic integration on dH/dλ windows.

Arguments:

INPUTS...
- Required.
- Simulation output files, one per lambda window.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--method <METHOD>
- Integration method.
- Allowed values:
  - auto
  - trapezoidal
  - simpson
  - cubic-spline
  - pchip
  - akima
  - gaussian-quadrature
- Default: trapezoidal
--output-units <OUTPUT_UNITS>
- Output units.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write output to a file.
--parallel
- Enable parallel processing.
--decorrelate
- Apply decorrelation to each window using the dH/dλ series.
--input-stride <INPUT_STRIDE>
- For AMBER TI inputs, retain every input-strideth value from the final DV/DL summary block.
- If omitted, AMBER parsing uses the file's ntpr value.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before analysis.
- Default: 0
--auto-equilibrate
- Detect equilibration and remove burn-in automatically.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Use conservative subsampling.
- Default: true
- Allowed values: true, false
--nskip <NSKIP>
- Equilibration-detection stride.
- Default: 1

Notes:

ti does not publicly expose --u-nk-observable.
--method auto chooses a supported TI integration method after preprocessing.

Example:

cargo run --release -- ti \
  --temperature 300 \
  --method auto \
  --decorrelate \
  path/to/*/prod.out

`bar`

Usage:

alchemrs bar [OPTIONS] <INPUTS>...

Purpose:

Compute free-energy differences with BAR.

Arguments:

INPUTS...
- Required.
- Simulation output files, one per lambda window.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--method <METHOD>
- BAR root-finding method.
- Allowed values:
  - false-position
  - self-consistent-iteration
  - bisection
- Default: false-position
--output-units <OUTPUT_UNITS>
- Output units.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write output to a file.
--overlap-summary
- Include overlap scalar and eigenvalues in the output.
--parallel
- Enable parallel processing.
--decorrelate
- Apply decorrelation using the selected u_nk observable.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before analysis.
- Default: 0
--auto-equilibrate
- Detect equilibration and remove burn-in automatically.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Use conservative subsampling.
- Default: true
- Allowed values: true, false
--nskip <NSKIP>
- Equilibration-detection stride.
- Default: 1
--u-nk-observable <U_NK_OBSERVABLE>
- Observable used for u_nk auto-equilibration and decorrelation.
- Allowed values:
  - epot
  - de
  - all
- Default: de

Example:

cargo run --release -- bar \
  --temperature 300 \
  --decorrelate \
  --u-nk-observable de \
  --overlap-summary \
  path/to/*/prod.out

`mbar`

Usage:

alchemrs mbar [OPTIONS] <INPUTS>...

Purpose:

Compute free-energy differences with MBAR.

Arguments:

INPUTS...
- Required.
- Simulation output files, one per lambda window.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--decorrelate
- Apply decorrelation using the selected u_nk observable.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before analysis.
- Default: 0
--auto-equilibrate
- Detect equilibration and remove burn-in automatically.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Use conservative subsampling.
- Default: true
- Allowed values: true, false
--nskip <NSKIP>
- Equilibration-detection stride.
- Default: 1
--u-nk-observable <U_NK_OBSERVABLE>
- Observable used for u_nk auto-equilibration and decorrelation.
- Allowed values:
  - epot
  - de
  - all
- Default: de
--max-iterations <MAX_ITERATIONS>
- Maximum number of MBAR iterations.
- Default: 10000
--tolerance <TOLERANCE>
- Relative convergence tolerance for MBAR.
- Default: 1.0e-7
--fixed-point-mbar
- Use the fixed-point MBAR solver backend.
- Useful for reproducing older fixed-point results or comparing solver behavior.

Default MBAR behavior:

uses the L-BFGS backend
falls back to fixed-point automatically when the evaluated grid contains states with zero sampled counts
--no-uncertainty
- Disable uncertainty estimation.
--output-units <OUTPUT_UNITS>
- Output units.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write output to a file.
--overlap-summary
- Include overlap scalar and eigenvalues in the output.
--parallel
- Enable parallel processing.

Example:

cargo run --release -- mbar \
  --temperature 300 \
  --auto-equilibrate \
  --decorrelate \
  --u-nk-observable epot \
  --output-format json \
  path/to/*/prod.out

`nes`

Usage:

alchemrs nes [OPTIONS] <INPUTS>...

Purpose:

Compute a Jarzynski nonequilibrium switching estimate from AMBER fwd.out / rev.out trajectories.

Arguments:

INPUTS...
- Required.
- AMBER nonequilibrium switching output files, one per switching trajectory.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--n-bootstrap <N_BOOTSTRAP>
- Number of bootstrap replicates for uncertainty estimation.
- 0 switches to analytic delta-method uncertainty.
- Default: 0
--seed <SEED>
- Bootstrap random seed.
- Default: 0
--no-uncertainty
- Disable uncertainty estimation entirely.
--output-units <OUTPUT_UNITS>
- Output units.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write output to a file.

Behavior notes:

nes parses the final Summary of dvdl values over ... steps: block in each AMBER file.
Each file contributes one reduced work value to the estimator.
The default uncertainty is analytic and is computed across trajectory-level Jarzynski weights.
--n-bootstrap > 0 switches uncertainty estimation to bootstrap resampling over trajectories.

Example:

cargo run --release -- nes \
  --temperature 300 \
  --output-format json \
  path/to/run_*/fwd.out

FEP

See Estimators for background on naming conventions

`iexp`

Usage:

alchemrs iexp [OPTIONS] <INPUTS>...

Purpose:

Compute exponential averaging estimates, typically in the direction of particle insertion.

Arguments:

INPUTS...
- Required.
- Simulation output files, one per lambda window.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--decorrelate
- Apply decorrelation using the selected u_nk observable.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before analysis.
- Default: 0
--auto-equilibrate
- Detect equilibration and remove burn-in automatically.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Use conservative subsampling.
- Default: true
- Allowed values: true, false
--nskip <NSKIP>
- Equilibration-detection stride.
- Default: 1
--u-nk-observable <U_NK_OBSERVABLE>
- Observable used for u_nk auto-equilibration and decorrelation.
- Allowed values:
  - epot
  - de
  - all
- Default: de
--no-uncertainty
- Disable uncertainty estimation.
--output-units <OUTPUT_UNITS>
- Output units.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write output to a file.
--overlap-summary
- Include overlap scalar and eigenvalues in the output.
--parallel
- Enable parallel processing.

Example:

cargo run --release -- iexp --temperature 300 path/to/*/prod.out

`dexp`

Usage:

alchemrs dexp [OPTIONS] <INPUTS>...

Purpose:

Compute exponential averaging estimates, typically in the direction of particle deletion.

Arguments:

INPUTS...
- Required.
- Simulation output files, one per lambda window.

Options:

--temperature <TEMPERATURE>
- Temperature in kelvin.
- Inferred from inputs when omitted.
--decorrelate
- Apply decorrelation using the selected u_nk observable.
--remove-burnin <REMOVE_BURNIN>
- Skip this many initial samples before analysis.
- Default: 0
--auto-equilibrate
- Detect equilibration and remove burn-in automatically.
--fast
- Use the fast statistical inefficiency estimate.
--conservative[=<CONSERVATIVE>]
- Use conservative subsampling.
- Default: true
- Allowed values: true, false
--nskip <NSKIP>
- Equilibration-detection stride.
- Default: 1
--u-nk-observable <U_NK_OBSERVABLE>
- Observable used for u_nk auto-equilibration and decorrelation.
- Allowed values:
  - epot
  - de
  - all
- Default: de
--no-uncertainty
- Disable uncertainty estimation.
--output-units <OUTPUT_UNITS>
- Output units.
- Allowed values:
  - kt
  - kcal
  - kj
- Default: kt
--output-format <OUTPUT_FORMAT>
- Output format.
- Allowed values:
  - text
  - json
  - csv
- Default: text
--output <OUTPUT>
- Write output to a file.
--overlap-summary
- Include overlap scalar and eigenvalues in the output.
--parallel
- Enable parallel processing.

Example:

cargo run --release -- dexp --temperature 300 path/to/*/prod.out

Effective Settings with Auto-Equilibration

When --auto-equilibrate is enabled, the CLI reports the effective preprocessing policy in provenance:

fast = true
conservative = false

That override is deliberate and mirrors the alchemlyb / pymbar equilibration-detection workflow.

Repository Layout

The repository is organized as a Cargo workspace with two Rust packages:

alchemrs: the core library crate and CLI binary
alchemrs-py: the Python binding crate built as a cdylib

The surrounding top-level directories then hold tests, examples, fixtures, documentation, the python/ package wrapper, and pyproject.toml for the repo-local Python build workflow.

`alchemrs` library layout

The alchemrs crate is organized into modules rather than separate library crates:

data: domain types and scientific data-model structures, including ATM schedules and samples
error: shared crate-level error handling
parse: engine-specific parsers for AMBER outputs and GROMACS dhdl.xvg files
prep: time-series preparation such as duplicate cleanup, sorting, equilibration detection, and decorrelation
estimators: TI, BAR, MBAR, IEXP, DEXP, NES, UWHAM, and ATM implementations
analysis: overlap diagnostics, schedule advisors, and plot-ready convergence series
plot: optional SVG rendering helpers behind the plotting feature

The crate root also re-exports the most common types and functions directly for a flatter API.

`alchemrs-py` package

alchemrs-py is a separate Rust package in the same workspace. It provides the native Python extension layer and depends on the core alchemrs crate, but the dependency does not go the other way.

`python/` package wrapper

The python/ directory contains:

the importable python/alchemrs package
OpenMM helper utilities in python/alchemrs/openmm.py
runnable Python examples
Python-side tests

pyproject.toml points maturin at alchemrs-py/Cargo.toml and stages the extension into that package layout.

CLI binary

The alchemrs binary wraps parsing, preprocessing, estimation, and output formatting into a command-line workflow. The scientific logic lives in the library; the CLI is a thin consumer of that API.

Layering

The intended dependency direction inside the library is:

parse -> data + error
prep -> data + error
estimators -> data + error
analysis -> data + error + estimators

The CLI depends on the core library crate, and alchemrs-py depends on the same core library crate through the workspace.

Data Model

The alchemrs::data module defines the typed structures that the rest of the library operates on.

`StatePoint`

StatePoint stores:

one or more lambda values
temperature in Kelvin

Validation rules:

temperature must be finite and positive
every lambda value must be finite

The type itself can store multiple lambda dimensions.

Current estimator support is split:

UNkMatrix-based library estimators support multidimensional lambda states when windows share the same evaluated-state grid.
DhdlSeries-based TI remains one-dimensional.
The CLI supports multidimensional u_nk estimator inputs and renders multidimensional endpoints as lambda vectors.

`DhdlSeries`

DhdlSeries represents a time-indexed dH/dlambda series for one state.

It stores:

the associated StatePoint
time_ps
scalar values

Validation rules:

time_ps.len() == values.len()
all times must be finite
all values must be finite
time must be monotonic nondecreasing

This is the primary input type for TI.

`SwitchingTrajectory`

SwitchingTrajectory represents one nonequilibrium switching trajectory.

It stores:

the initial StatePoint
the final StatePoint
one reduced work value for the full trajectory
an optional lambda_path
an optional dvdl_path in reduced units

Validation rules:

the initial and final states must share the same temperature
lambda_path.len() == dvdl_path.len()
all stored values must be finite

This is the primary input type for NES. The reduced work is sufficient for the Jarzynski estimator itself, while lambda_path and dvdl_path support the NES advisor's profile and curvature diagnostics.

`UNkMatrix`

UNkMatrix stores reduced energies evaluated across states.

Conceptually:

rows are samples
columns are evaluated states
sampled_state identifies the state the trajectory was sampled from
evaluated_states identifies the column states
lambda_labels, when present, name the lambda-vector components in order

Validation rules:

data.len() == n_samples * n_states
evaluated_states.len() == n_states
time_ps.len() == n_samples
time must be monotonic nondecreasing
u_nk values may be finite or positive infinity, but not NaN or negative infinity

That last point matters for preprocessing:

some u_nk-derived observables fail on positive infinity
observable-based preprocessing using an external potential-energy series such as EPtot can still work if the external observable is finite

`AtmState` and `AtmSchedule`

ATM workflows use dedicated schedule types in addition to generic StatePoint.

AtmState stores:

state_id
leg direction (Forward or Reverse)
standard-softplus parameters lambda1, lambda2, alpha, u0, w0
optional multi-softplus parameters lambda3, u1
temperature in Kelvin

Validation rules:

all numeric parameters must be finite
temperature must be finite and positive
lambda3 and u1 must either both be present or both be absent

AtmSchedule stores one ATM leg as an ordered set of AtmState values.

Validation rules:

at least one state
unique state_id
one leg direction per schedule
one temperature per schedule
either all states use standard softplus or all use multi-softplus

Each ATM state can also be converted into a StatePoint. The resulting state vector is ordered as lambda1, lambda2, optional lambda3, alpha, u0, optional u1, w0, and those names are retained as lambda_labels when the ATM data is converted into matrix results.

`AtmSample` and `AtmSampleSet`

AtmSample represents one ATM observation.

It stores:

state_id
potential_energy_kcal_per_mol
perturbation_energy_kcal_per_mol

Validation rules:

both stored energies must be finite

AtmSampleSet combines:

one AtmSchedule
one or more AtmSample values assigned to that schedule

Validation rules:

at least one sample
every sample state_id must exist in the schedule

This is the most convenient library and Python entry point for ATM leg analysis, because it can be converted into a pooled AtmLogQMatrix automatically.

`AtmLogQMatrix`

AtmLogQMatrix stores pooled log unnormalized densities for ATM/UWHAM-style analysis.

It stores:

n_observations
n_states
dense log_q data in observation-major order
the ordered state list
sampled_counts per state
lambda_labels, when present

Validation rules:

log_q.len() == n_observations * n_states
states.len() == n_states
sampled_counts.len() == n_states
sum(sampled_counts) == n_observations
log_q values may be finite or negative infinity, but not NaN or positive infinity

`FreeEnergyEstimate`

Represents a scalar free energy result with:

delta_f
optional uncertainty
from_state
to_state

This is used by TI, NES, and ATM leg results.

`DeltaFMatrix`

Represents pairwise free energy differences between all states.

It stores:

a dense n_states x n_states matrix of values
optional uncertainties
the ordered state list
lambda_labels, when present, naming the lambda-vector components in state order

This is used by BAR, MBAR, IEXP, DEXP, UWHAM, and ATM matrix results.

`OverlapMatrix`

Represents the overlap matrix between states and is used by the diagnostics layer.

It stores:

a dense n_states x n_states matrix of overlap values
the ordered state list
lambda_labels, when present, naming the lambda-vector components in state order

Error model

The crate uses the shared CoreError type from alchemrs::error for:

shape mismatches
invalid state metadata
invalid time ordering
non-finite numerical values
parse failures converted into the shared error model
convergence failures
unsupported inputs

This keeps error handling consistent across parsing, preprocessing, estimation, and analysis.

Parsing Outputs

The parser implementation supports AMBER outputs in alchemrs::parse::amber and GROMACS dhdl.xvg outputs in alchemrs::parse::gromacs.

The top-level parser entry points alchemrs::extract_dhdl, alchemrs::extract_nes_trajectory, alchemrs::extract_u_nk, and alchemrs::extract_u_nk_with_potential auto-detect between the supported formats where appropriate.

Available entry points

extract_dhdl(path, temperature_k)
extract_nes_trajectory(path, temperature_k)
extract_u_nk(path, temperature_k)
extract_u_nk_with_potential(path, temperature_k)

`extract_dhdl`

Returned value:

DhdlSeries

AMBER parser behavior:

extracts temp0, clambda, dt, ntpr, the begin time coordinate, and DV/DL values from the final Summary of dvdl values over ... steps: block
reconstructs the sampled dH/dλ series on the saved-output stride rather than keeping every dense summary value
converts gradients to reduced units by multiplying by beta = 1 / (k_B T)

GROMACS parser behavior:

reads dhdl.xvg headers to identify the simulation temperature and lambda
extracts the dH/dlambda series from the legend tagged with dH when the file contains exactly one derivative component
keeps the x-axis values as the returned time_ps values
rejects multidimensional schedules with multiple dH/dlambda components because DhdlSeries is still scalar

Common failure modes:

missing temperature or lambda metadata
missing dH/dlambda samples
temperature mismatch between the file and the requested temperature

`extract_nes_trajectory`

Returned value:

SwitchingTrajectory

AMBER parser behavior:

extracts temp0, clambda, dynlmb, and ntave
parses the final Summary of dvdl values over ... steps: block
converts each dV/dλ sample to reduced units
reconstructs the lambda path using Δλ = dynlmb / ntave
integrates the reduced work as the discrete sum over the switching path
stores the full lambda-resolved profile in the returned SwitchingTrajectory

Common failure modes:

missing switching metadata such as dynlmb or ntave
missing or truncated DV/DL summary blocks
temperature mismatch between the file and the requested temperature

`extract_u_nk`

Returned value:

UNkMatrix

AMBER parser behavior:

reads MBAR output blocks and constructs a UNkMatrix
requires a valid MBAR lambda list in the file header
requires clambda to be present in the evaluated-state grid
keeps sample rows with positive infinity values
fails if no finite MBAR samples remain after filtering unusable rows

GROMACS parser behavior:

reads dhdl.xvg Delta H columns and maps them into reduced-energy rows, including multidimensional lambda tuples from legends like to (coul, vdw)
inserts the sampled lambda state explicitly with zero reduced energy
sorts the evaluated-state grid by lambda before building the matrix
requires at least one foreign-lambda Delta H column

`extract_u_nk_with_potential`

Returned value:

(UNkMatrix, Vec<f64>)

AMBER parser behavior:

does everything extract_u_nk does and additionally extracts EPtot from NSTEP blocks
requires the number of EPtot samples to match the retained MBAR sample count exactly

GROMACS parser behavior:

does everything extract_u_nk does and additionally extracts the Potential Energy legend from dhdl.xvg
requires that potential-energy observable to be present when this entry point is used

This function exists because an external energy observable is often useful for:

auto-equilibration
decorrelation

especially when the u_nk matrix contains positive infinity values that make DE-based preprocessing invalid.

Temperature validation

The parser checks the temperature in the AMBER file against the temperature passed by the caller and fails if they differ by more than a small tolerance.

This is deliberate: parser output is temperature-dependent because reduced energies depend on beta.

Current scope

The public parser surface in this repository supports:

AMBER equilibrium .out files for dH/dλ and u_nk
AMBER nonequilibrium switching .out files for NES trajectories
GROMACS dhdl.xvg files for dH/dλ and u_nk

The architecture still leaves room for additional engines later.

Preprocessing

Preprocessing lives in the alchemrs::prep module and is where most of the scientifically important workflow details live.

Core concepts

The prep module supports:

duplicate-time cleanup
sorting by time
optional time slicing
equilibration detection
decorrelation / subsampling

The main option struct is DecorrelationOptions.

Default values are:

drop_duplicates = true
sort = true
conservative = true
remove_burnin = false
fast = false
nskip = 1
lower = None
upper = None
step = None

Time cleanup

Before decorrelation or equilibration detection, the prep module can:

drop duplicate time values
sort by time

This matters because the timeseries logic assumes one contiguous series ordered in time.

For UNkMatrix, cleanup preserves row/state alignment by selecting or reordering entire rows, not individual values.

Time slicing

lower, upper, and step allow simple time-domain slicing before decorrelation.

These options are applied to:

the series itself for DhdlSeries
both the selected scalar observable and the aligned u_nk rows for UNkMatrix

Statistical inefficiency and `g`

Decorrelation is based on an estimate of statistical inefficiency g.

The implementation follows the same basic pymbar.timeseries logic:

center the series
estimate autocorrelation contributions at increasing lag
stop once the correlation function becomes non-positive after a minimum lag
clamp g to at least 1

`fast`

fast changes how g is estimated.

fast = false evaluates lag contributions one lag at a time
fast = true increases the lag increment as it goes, which is cheaper but less accurate

This affects:

plain decorrelation
equilibration detection

because equilibration detection repeatedly estimates g on suffixes of the series.

`conservative`

conservative changes how the code turns g into retained sample indices.

conservative = true uses a uniform stride of ceil(g)
conservative = false uses rounded multiples of fractional g

The conservative mode keeps fewer samples and is intentionally cautious.

Equilibration detection

Equilibration detection returns:

t0: the chosen start index for equilibrated data
g: statistical inefficiency for the retained suffix
neff_max: the estimated number of effectively uncorrelated samples

The heuristic scans possible start positions and chooses the one that maximizes:

Neff = (N - t0) / g

This is the same style of automated equilibration detection used by pymbar, but alchemrs counts the actual retained suffix length N - t0 when reporting Neff_max. That means Neff_max can differ by 1 from pymbar/alchemlyb, while the broader workflow remains the same. For further reading on the topic, see Chodera 2016.

Note: when nskip > 1, alchemrs maximizes Neff only over the sampled candidate time origins (0, nskip, 2*nskip, ...). This intentionally differs from pymbar's current implementation, which pre-fills arrays for all indices before taking the maximum, because the documented meaning of nskip is to try only every nskip-th sample as a potential origin.

`remove_burnin`

In the prep module, remove_burnin = true means:

run equilibration detection
discard all data before t0
then decorrelate the remaining suffix using the g estimated for that suffix

This is different from the CLI flag --remove-burnin <N>, which is a fixed-count trim done before any automated detection.

`u_nk` observables

u_nk decorrelation needs a scalar time series. The prep module supports two native derived observables and one external-observable path.

`UNkSeriesMethod::DE`

For each sample row:

identify the sampled-state column
use the next evaluated-state column if it exists
otherwise use the previous column for the last state
compute:

DE_t = u_nk[t, other] - u_nk[t, sampled]

This matches the alchemlyb dE convention.

Important detail:

“adjacent” means adjacent in evaluated-state order, not nearest by numeric lambda distance
DE preprocessing is currently only supported for one-dimensional lambda states

`UNkSeriesMethod::All`

For each sample row, sum all evaluated-state reduced energies in that row.

This is a generic matrix-derived scalar, but it is usually less targeted than DE.

All remains defined for multidimensional parsed states because it does not need to identify a sampled-state neighbor.

`decorrelate_u_nk_with_observable`

This path accepts an external scalar observable, most commonly a potential-energy observable such as EPtot.

The observable determines which sample indices are retained, and those retained indices are then applied back to the full u_nk rows.

This is useful when:

the matrix contains infinite energy values that make DE invalid

Non-finite `u_nk` behavior

decorrelate_u_nk rejects non-finite derived scalar series.

In practice:

DE or All can fail if the derived scalar is not finite
decorrelate_u_nk_with_observable can still succeed if the supplied observable itself is finite

That is why the CLI exposes epot as an observable choice instead of only supporting de.

CLI preprocessing order

For CLI commands, preprocessing order is:

fixed-count --remove-burnin <N>
--auto-equilibrate
--decorrelate

TI uses dH/dlambda as the scalar series.

BAR, MBAR, IEXP, and DEXP use the observable chosen by --u-nk-observable <de|all|epot>.

NES does not expose CLI preprocessing flags because each input file is treated as one complete switching trajectory rather than a windowed equilibrium timeseries.

Multidimensional GROMACS states can be parsed into u_nk and passed through the CLI estimators.

The remaining restriction is observable choice:

de is still one-dimensional only
all and epot remain valid for multidimensional schedules

CLI auto-equilibrate overrides

When --auto-equilibrate is enabled in the CLI, the effective preprocessing flags become:

fast = true
conservative = false

even if the user supplied different --fast or --conservative values.

This is intentional and mirrors the alchemlyb / pymbar automated-equilibration workflow:

detect_equilibration(..., fast=True)
then subsample_correlated_data(..., conservative=False)

Those effective values are still recorded in CLI provenance so the output remains auditable.

Estimators

Estimator implementations live in the alchemrs::estimators module.

The CLI currently exposes TI, BAR, MBAR, IEXP, DEXP, and NES. The library and Python bindings additionally expose UWHAM and ATM.

Shared assumptions

For UNkMatrix-based estimators, the current implementation assumes:

consistent lambda-vector dimensionality across windows
consistent evaluated-state grids across windows
sampled_state is set and present in evaluated_states
state temperatures are consistent across all windows
lambda_labels, when present, are consistent across windows

If those invariants are violated, the estimators fail early.

TI

Types:

TiEstimator
TiFit
TiOptions
IntegrationMethod::{Trapezoidal, Simpson, CubicSpline, Pchip, Akima, GaussianQuadrature}
recommend_ti_method(...)
TiMethodRecommendation
TiMethodAssessment
TiMethodRecommendationOptions

Input:

slice of DhdlSeries

Behavior:

sorts windows by lambda
computes the mean dH/dlambda for each window
integrates across lambda using trapezoidal, Simpson, a natural cubic spline, PCHIP, Akima interpolation, or Gaussian quadrature
fit(...) returns TiFit
result() materializes the FreeEnergyEstimate

Current uncertainty behavior:

trapezoidal mode reports an uncertainty estimate
Simpson mode reports an uncertainty estimate by propagating the Simpson-rule weights through the per-window SEMs
cubic-spline mode reports an uncertainty estimate by propagating the implied spline-integration weights through the per-window SEMs
PCHIP and Akima report an uncertainty estimate by numerically differentiating the integrated result with respect to the window means and propagating the per-window SEMs through that Jacobian
Gaussian quadrature reports an uncertainty estimate from the quadrature weights and per-window SEMs

Interpolation references:

Pchip follows the piecewise-cubic Hermite shape-preserving literature of Fritsch and Carlson, Monotone Piecewise Cubic Interpolation, SIAM Journal on Numerical Analysis 17(2), 1980, DOI: https://doi.org/10.1137/0717021, and Fritsch and Butland, A Method for Constructing Local Monotone Piecewise Cubic Interpolants, SIAM Journal on Scientific and Statistical Computing 5(2), 1984, DOI: https://doi.org/10.1137/0905021
Akima follows Akima, A New Method of Interpolation and Smooth Curve Fitting Based on Local Procedures, Journal of the ACM 17(4), 1970, DOI: https://doi.org/10.1145/321607.321609

Gaussian quadrature behavior:

only supports one-dimensional lambda schedules
only supports Gauss-Legendre schedules with 1..=16 sampled windows
validates that the sampled lambdas match the supported quadrature nodes
reports the free energy over the full [0, 1] interval, so result().from_state() is lambda=0 and result().to_state() is lambda=1 even though the sampled windows are interior quadrature nodes

Method recommendation:

recommend_ti_method(...) lives in alchemrs::analysis
evaluates all currently supported TI methods on the same preprocessed DhdlSeries
combines method eligibility, TI schedule diagnostics, monotonicity, and cross-method spread
returns the recommended IntegrationMethod plus per-method assessments and a human-readable rationale

BAR

Types:

BarEstimator
BarFit
BarOptions
BarMethod::{FalsePosition, SelfConsistentIteration, Bisection}

Input:

slice of UNkMatrix

Behavior:

expects windows mapped to sampled states on one evaluated-state grid
computes work values between adjacent states
fit(...) returns BarFit
result() materializes the full pairwise DeltaFMatrix
preserves lambda_labels() from the input windows when available

Uncertainty behavior:

adjacent BAR uncertainties come from the BAR implicit-root equation via a delta-method linearization of the Fermi weights
cumulative non-adjacent uncertainties include the neighboring-edge covariance induced by the shared intermediate window rather than assuming adjacent BAR edges are independent

MBAR

Types:

MbarEstimator
MbarOptions

Important options:

max_iterations
tolerance
initial_f_k
parallel
solver

Default solver:

MbarSolver::Lbfgs
falls back to fixed-point automatically when the evaluated grid contains states with zero sampled counts, because the LBFGS objective requires positive sampled counts
use MbarSolver::FixedPoint when reproducing older fixed-point results or comparing solver behavior

The MBAR L-BFGS solver path is inspired by the large-scale MBAR/UWHAM convex optimization approach described by Ding, Vilseck, and Brooks in "Fast Solver for Large Scale Multistate Bennett Acceptance Ratio Equations", Journal of Chemical Theory and Computation 15, 799-802 (2019), doi:10.1021/acs.jctc.8b01010. UwhamEstimator has a separate solver implementation; its Newton and L-BFGS backends are documented in the UWHAM section below.

Input:

slice of UNkMatrix

Behavior:

combines windows into a shared reduced-energy representation
solves for free energies iteratively
returns an MbarFit that can derive a DeltaFMatrix, overlap diagnostics, and MBAR log weights without resolving the same window set
preserves lambda_labels() from the input windows when available

Typical usage:

fit(...).result()
fit(...).result_with_uncertainty()
fit(...).overlap_matrix()
fit(...).overlap_scalar()

The analysis layer uses MBAR-derived log weights to compute overlap diagnostics.

UWHAM

Types:

UwhamEstimator
UwhamFit
UwhamOptions
UwhamSolver

Important options:

max_iterations
tolerance
parallel
solver

Solver behavior:

UwhamSolver::Newton is the default.
UwhamSolver::Lbfgs is available as an experimental comparison backend.
Current synthetic benchmarks show the Newton backend remains faster for the tested UWHAM workloads, so L-BFGS is not the default.

Input:

slice of UNkMatrix

Behavior:

pools all windows into log unnormalized densities (log_q)
accepts positive infinity in u_nk by mapping it to negative infinity in log_q
solves the UWHAM equations on the pooled observations
fit(...) returns UwhamFit
result() materializes a DeltaFMatrix
result_with_uncertainty() uses the analytical UWHAM covariance
weights(), covariance(), and variances() expose fitted internals without rerunning the solve
write_reference_inputs(...) exports logQ.csv, size.csv, delta_f.csv, ze.csv, and metadata for cross-checking against external UWHAM implementations

ATM

Types:

AtmEstimator
AtmFit
AtmOptions
AtmUncertaintyMethod::{Analytical, Bootstrap { .. }, None}
AtmBindingEstimate

Input:

fit(...): AtmLogQMatrix
fit_leg(...), estimate_leg(...), estimate_binding(...): AtmSampleSet

Behavior:

converts ATM schedule/sample data into pooled log_q
delegates the core statistical solve to the UWHAM implementation
result() on AtmFit returns a DeltaFMatrix across all ATM schedule states
leg_result() and estimate_leg(...) collapse that fit to one endpoint-to-endpoint FreeEnergyEstimate
estimate_binding(...), estimate_rbfe(...), and estimate_abfe(...) combine two opposite-direction legs and propagate leg uncertainties in quadrature
preserves ATM-derived lambda component labels when available

Uncertainty behavior:

Analytical uses UWHAM covariance on the fitted leg
Bootstrap { n_bootstrap, seed } is only available when starting from AtmSampleSet
None suppresses uncertainty reporting

IEXP and DEXP

Several different free energy methods utilize perturbation to estimate free energy differences, but when people say Free Energy Perturbation (FEP) they are typically referring to an exponential averaging scheme based on perturbing potential energies from one lambda state to another. This perturbation can take place in either the forward or the reverse direction. In accordance with Klimovich, Shirts, and Mobley's 2016 Guidelines for the analysis of free energy calculations, we define iexp as insertion exponential averaging, typically proceeding in the direction of decreasing entropy, and dexp as deletion exponential averaging, typically proceeding in the direction of increasing entropy.

Types:

IexpEstimator
IexpFit
IexpOptions

Input:

slice of UNkMatrix

Behavior:

computes pairwise adjacent exponential averaging along the lambda schedule
fit(...) returns IexpFit
result() materializes the full pairwise DeltaFMatrix
result_with_uncertainty() includes delta-method uncertainty estimates from the exponential weights exp(-W) using an unbiased finite-sample variance
preserves lambda_labels() from the input windows when available

CLI direction conventions:

iexp reports the forward direction
dexp reports the reverse direction

DEXP is not a separate estimator type in the library; it is a CLI convention over the same estimator results.

NES

Types:

NesEstimator
NesFit
NesOptions

Input:

slice of SwitchingTrajectory

Behavior:

expects all trajectories to share the same initial and final states
applies the Jarzynski equality to the reduced switching work values
fit(...) returns NesFit
result() materializes a scalar FreeEnergyEstimate

Uncertainty behavior:

analytic uncertainty is the default and is computed from the trajectory-level Jarzynski weights exp(-W) with an unbiased finite-sample variance estimate
setting NesOptions { n_bootstrap: N, .. } with N > 0 switches to bootstrap uncertainty across trajectories

Parsing / workflow expectations:

the intended CLI input is one AMBER nonequilibrium switching trajectory per file
each trajectory contributes one reduced work value
when present, the retained lambda_path and dvdl_path support NES advisor diagnostics, but the estimator itself only needs the reduced work

Parallel execution

Some estimators support a parallel option and use Rayon-backed parallel loops internally to speed up calculations.

Common API shape

The estimator interfaces follow one shared pattern:

fit(...) -> Fit
estimate(...) -> primary result
Fit::result() -> primary result

Estimators with an explicit uncertainty split also expose estimate_with_uncertainty() and Fit::result_with_uncertainty().

ATM extends that shape with:

fit_leg(...)
estimate_leg(...)
estimate_binding(...)
estimate_rbfe(...)
estimate_abfe(...)

Overlap Diagnostics

Overlap diagnostics live in the alchemrs::analysis module.

Available functions:

overlap_matrix
overlap_eigenvalues
overlap_scalar
ti_convergence
bar_convergence
mbar_convergence
exp_convergence
dexp_convergence
nes_convergence

`overlap_matrix`

This function:

computes MBAR log weights from the supplied windows
constructs the overlap matrix from those weights and state sample counts
returns an OverlapMatrix

When the input windows carry named lambda dimensions, the returned matrix preserves them via overlap.lambda_labels().

Input:

slice of UNkMatrix
optional MbarOptions

`overlap_eigenvalues`

Computes the eigenvalues of the overlap matrix and returns them sorted in descending order.

The implementation rejects eigenvalues with significant imaginary components.

`overlap_scalar`

Computes:

1 - second_largest_eigenvalue

This is the scalar overlap diagnostic currently surfaced by the CLI.

CLI usage

bar, mbar, iexp, and dexp can include overlap diagnostics with:

--overlap-summary

The CLI then reports:

overlap scalar
overlap eigenvalues

in text, JSON, or CSV output.

Convergence series

The same module also exposes cumulative convergence helpers for plot-ready library output.

Each function returns a Vec<ConvergencePoint>:

ti_convergence
bar_convergence
mbar_convergence
exp_convergence
dexp_convergence
nes_convergence

Each point includes:

the number of windows included in that cumulative estimate
the estimated free energy
optional uncertainty
the endpoint states for that estimate
optional lambda_labels() when they are available from the parsed windows

Schedule Advisor

The lambda-schedule advisor is a diagnostic CLI workflow exposed through advise-schedule. It supports the existing u_nk overlap-based path, a TI-native dH/dlambda spacing path, and an AMBER nonequilibrium switching (NES) trajectory path.

Command:

advise-schedule

It does not fit a final production free energy estimate. Instead, it inspects the sampled schedule and reports where the current setup looks weak.

What it uses

In u_nk mode, the current MVP combines:

adjacent overlap from the MBAR-derived overlap matrix
adjacent BAR or MBAR edge estimates
edge-local block-average variability

In TI mode (--input-kind dhdl), it combines:

per-window mean_dhdl and SEM
window-local block-average variability
split-half dH/dlambda drift within each window
interval-local slope, curvature, and trapezoid contribution
interval uncertainty propagated from neighboring window SEMs

In NES mode (--input-kind nes), it combines:

cumulative Jarzynski free energy estimates versus number of switching trajectories
the final analytic uncertainty estimate
the midpoint-to-final drift in the cumulative estimate
an ensemble-mean dV/dλ(λ) switching profile retained from the trajectories
finite-difference curvature magnitude along the common lambda path
a ranked list of distinct high-curvature lambda regions

The output is deterministic and threshold-driven in both modes.

Typical usage

cargo run --release -- advise-schedule \
  --temperature 300 \
  --decorrelate \
  --u-nk-observable de \
  --output-format json \
  path/to/*/prod.out

Useful flags:

--input-kind <auto|u-nk|dhdl|nes>
--estimator <mbar|bar>
--overlap-min <VALUE>
--block-cv-min <VALUE>
--n-blocks <N>
--no-midpoints
--report <PATH>

TI usage:

cargo run --release -- advise-schedule \
  --temperature 300 \
  --input-kind dhdl \
  --decorrelate \
  --output-format json \
  path/to/*/prod.out

NES usage:

cargo run --release -- advise-schedule \
  --temperature 300 \
  --input-kind nes \
  --report nes-report.html \
  path/to/run_*/fwd.out

Output

JSON output includes:

sample_counts
provenance
edges in u_nk mode
windows and intervals in TI mode
convergence, profile, curvature, and high_curvature_regions in NES mode
suggestions

If --report is provided, the CLI also writes a standalone HTML report with:

a configuration summary
for u_nk workflows, an MBAR-derived overlap-matrix heatmap with interactive size controls
for default MBAR-backed u_nk workflows, a cumulative free energy plot versus shared elapsed simulation time
for TI mode, a cumulative free energy plot versus shared elapsed simulation time
a top priority queue of the highest-risk edges, including their weakest components
ranked schedule suggestions
edge-level diagnostics with severity badges
for TI mode, an integration-method curve gallery for each applicable TI method on the current lambda grid
for NES mode, a free energy vs. number of switches scatterplot, a mean dV/dλ switching-path plot, a curvature-magnitude plot over lambda, and a high-curvature region summary
inline priority bars for quick scanning
inline SVG lambda-axis visuals for each edge and proposal
an in-report legend for source, target, proposal, delta-bar, and status semantics
per-component rows for multidimensional schedules showing which lambda components are bisected, held at the source state, dominant, or fixed
normalized delta bars so the largest component jump on each edge is immediately visible

The u_nk overlap-matrix panel is included as a visual overlap diagnostic in the same spirit as the overlap-matrix discussion and example shown in Fig. 7b of Klimovich, Shirts, and Mobley, "Guidelines for the analysis of free energy calculations", J Comput Aided Mol Des 29, 397-411 (2015), doi:10.1007/s10822-015-9840-9.

Each edge records:

adjacent lambda endpoints
lambda delta
forward and reverse overlap values
minimum directional overlap
optional relative overlap versus neighboring edges
edge delta_f
optional uncertainty
optional relative uncertainty versus neighboring edges
optional block mean / standard deviation / coefficient of variation
dominant changing lambda components
priority score
severity classification

Each suggestion records:

suggestion kind
target edge index
lambda endpoints
focus components
proposal strategy
optional midpoint proposal
priority score
reason string

In TI mode, each window records:

lambda state
mean_dhdl
optional sem_dhdl
optional block mean / standard deviation / coefficient of variation
optional split-half dH/dlambda delta

Each TI interval records:

lambda endpoints
interval width
left and right mean_dhdl
trapezoid contribution
slope and absolute slope
optional curvature
optional interval uncertainty
priority score
severity classification

In NES mode, the final advisor payload records:

cumulative convergence points
the final scalar free energy estimate
the ensemble-mean switching profile
curvature magnitude along lambda
distinct high-curvature hotspot regions
relative uncertainty
recent midpoint-to-final drift
the final suggestion

Current heuristics

The current u_nk advisor classifies edges as:

healthy
monitor
add_sampling
add_window

The current pass uses:

overlap_min < --overlap-min to suggest a new window
block_cv >= --block-cv-min to suggest more sampling
neighbor-aware comparisons so a uniformly weak schedule is not treated the same as one isolated bad edge
component-level labeling so multidimensional schedules can report which lambda components dominate the jump

If midpoint proposals are enabled, insert_window suggestions return one of:

midpoint: the component-wise midpoint between the two edge endpoints
focused_split: midpoint only on the dominant changing component while holding the others at the source state

The current TI advisor classifies intervals as:

healthy
monitor
add_sampling
add_window
add_window_and_sampling

The TI pass uses:

high slope / curvature z-scores to suggest more lambda resolution
high block CV or interval uncertainty to suggest more sampling
midpoint proposals for inserted TI windows
one-dimensional lambda schedules only

The current NES advisor classifies the run as:

no_change
extend_sampling

The NES pass currently recommends more switching trajectories when any of these hold:

the number of trajectories is below the minimum threshold
the final relative uncertainty exceeds the configured threshold
the recent cumulative-estimate drift is larger than the allowed sigma-scaled threshold

Limits

This is currently a local adjacent-edge advisor, not a full workflow planner.
The recommendation engine is intentionally simple and should be treated as a diagnostic starting point.
Multidimensional schedules are supported through vector endpoints and preserved lambda-component labels, but the midpoint proposal is still purely geometric.
TI mode is currently limited to one-dimensional dH/dlambda schedules.
NES mode requires AMBER nonequilibrium switching outputs with a common lambda path across trajectories.
NES mode is trajectory-based, so --decorrelate, --auto-equilibrate, and nonzero --remove-burnin are intentionally unsupported there.
--input-kind auto keeps the existing u_nk behavior; use --input-kind dhdl for TI diagnostics or --input-kind nes for switching-trajectory diagnostics.

Plotting

Native plotting is optional in alchemrs.

The crate exposes SVG rendering helpers behind the plotting Cargo feature:

cargo add alchemrs --features plotting

or for this repository:

cargo test --features plotting

Current scope

The first plotting surface is intentionally small:

render_convergence_svg
render_time_convergence_svg
render_overlap_matrix_svg
render_delta_f_state_svg
render_ti_dhdl_svg
render_block_average_svg

These consume plot-ready analysis values and return SVG strings:

render_convergence_svg for ConvergencePoint
render_time_convergence_svg for TimeConvergencePoint
render_overlap_matrix_svg for OverlapMatrix
render_delta_f_state_svg for DeltaFMatrix
render_ti_dhdl_svg for DhdlSeries
render_block_average_svg for BlockEstimate

Example figures

The docs include checked-in SVGs generated with:

cargo run --example generate_doc_plots --features plotting

Convergence trace:

MBAR convergence example

Overlap heatmap:

MBAR overlap example

Adjacent-state ΔF plot:

Adjacent-state ΔF example

TI dH/dlambda plot:

TI dHdl example

Block-average plot:

Block average example

Example

use alchemrs::{
    ConvergencePlotOptions, MbarOptions, extract_u_nk, mbar_convergence, render_convergence_svg,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
let windows = vec![
    extract_u_nk("lambda0.out", 300.0)?,
    extract_u_nk("lambda1.out", 300.0)?,
];
let points = mbar_convergence(&windows, Some(MbarOptions::default()))?;
let svg = render_convergence_svg(
    &points,
    Some(ConvergencePlotOptions {
        title: "MBAR Convergence".to_string(),
        ..ConvergencePlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
Ok(())
}

Time-convergence plots use shared per-window elapsed simulation time rather than the number of lambda windows included:

use alchemrs::{
    MbarOptions, TimeConvergencePlotOptions, extract_u_nk, mbar_time_convergence,
    render_time_convergence_svg,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
let windows = vec![
    extract_u_nk("lambda0.out", 300.0)?,
    extract_u_nk("lambda1.out", 300.0)?,
];
let points = mbar_time_convergence(&windows, Some(MbarOptions::default()), None)?;
let svg = render_time_convergence_svg(
    &points,
    Some(TimeConvergencePlotOptions {
        title: "MBAR Time Convergence".to_string(),
        ..TimeConvergencePlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
Ok(())
}

Overlap heatmaps can be rendered directly from the diagnostics layer:

use alchemrs::{
    MbarEstimator, MbarOptions, OverlapPlotOptions, extract_u_nk, render_overlap_matrix_svg,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
let windows = vec![
    extract_u_nk("lambda0.out", 300.0)?,
    extract_u_nk("lambda1.out", 300.0)?,
];
let fit = MbarEstimator::new(MbarOptions::default()).fit(&windows)?;
let overlap = fit.overlap_matrix()?;
let svg = render_overlap_matrix_svg(
    &overlap,
    Some(OverlapPlotOptions {
        title: "MBAR Overlap".to_string(),
        ..OverlapPlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
Ok(())
}

Adjacent-state ΔF plots can be rendered from estimator results:

use alchemrs::{
    DeltaFStatePlotOptions, MbarEstimator, MbarOptions, extract_u_nk, render_delta_f_state_svg,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
let windows = vec![
    extract_u_nk("lambda0.out", 300.0)?,
    extract_u_nk("lambda1.out", 300.0)?,
];
let fit = MbarEstimator::new(MbarOptions::default()).fit(&windows)?;
let delta_f = fit.result_with_uncertainty()?;
let svg = render_delta_f_state_svg(
    &delta_f,
    Some(DeltaFStatePlotOptions {
        title: "Adjacent ΔF".to_string(),
        ..DeltaFStatePlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
Ok(())
}

TI dH/dlambda plots can be rendered directly from parsed TI windows:

use alchemrs::{TiDhdlPlotOptions, extract_dhdl, render_ti_dhdl_svg};

fn main() -> Result<(), Box<dyn std::error::Error>> {
let series = vec![
    extract_dhdl("lambda0.out", 300.0)?,
    extract_dhdl("lambda1.out", 300.0)?,
];
let svg = render_ti_dhdl_svg(
    &series,
    Some(TiDhdlPlotOptions {
        title: "TI dH/dlambda".to_string(),
        ..TiDhdlPlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
Ok(())
}

Block-average plots can be rendered from library block analysis results:

use alchemrs::{
    BlockAveragePlotOptions, MbarEstimator, MbarOptions, extract_u_nk,
    render_block_average_svg,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
let windows = vec![
    extract_u_nk("lambda0.out", 300.0)?,
    extract_u_nk("lambda1.out", 300.0)?,
];
let blocks = MbarEstimator::new(MbarOptions::default()).block_average(&windows, 4)?;
let svg = render_block_average_svg(
    &blocks,
    Some(BlockAveragePlotOptions {
        title: "MBAR Block Average".to_string(),
        ..BlockAveragePlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
Ok(())
}

Why it is feature-gated

Plotting is presentation logic, not core analysis logic. Making it optional keeps:

the default dependency footprint smaller
the analysis and estimator surface independent from graphics code
library usage lightweight for users who only want parsing and estimation

Outputs and Provenance

The CLI supports three output formats:

text
JSON
CSV

Units

Output units can be selected with:

--output-units <kt|kcal|kj>

The CLI converts the reported free energy and uncertainty from reduced units into the selected output unit using the requested temperature.

Text output

Text output is intended for direct reading in the terminal.

It includes:

delta_f
uncertainty
lambda endpoints, which are scalars for one-dimensional schedules and vectors for multidimensional schedules
lambda_components when available
sample counts
u_nk_observable when relevant
ti_method when the estimator is TI
ti_method_reason when TI method auto-selection was used
overlap summary when requested

For nes, the scalar result still follows the same shape, but windows, samples_in, and samples_kept all count switching trajectories rather than equilibrium windows or samples.

JSON output

JSON output is suitable for shell pipelines and downstream tooling.

The payload contains:

the scalar result fields
optional overlap summary
a provenance object

from_lambda and to_lambda are encoded as:

numbers for one-dimensional schedules
arrays for multidimensional schedules

CSV output

CSV output appends provenance fields after the result columns so the row remains self-describing.

For multidimensional schedules, the endpoint columns are emitted as quoted bracketed vectors such as "[0;0;0.8]".

Provenance

In this project, "provenance" means the analysis settings and sample-count metadata that explain how a reported result was produced.

Current provenance fields include:

estimator name
temperature
whether decorrelation ran
fixed-count burn-in trimming
whether auto-equilibration ran
the effective fast value
the effective conservative value
nskip
u_nk_observable when applicable
ti_method for TI results
ti_method_reason when ti --method auto selected the integration method
lambda_components when the parser can identify named lambda dimensions
number of windows
number of samples before preprocessing
number of samples after burn-in trimming / auto-equilibration
number of retained samples after decorrelation

This matters because preprocessing choices directly change the data used by the estimators.

Example JSON payload

{
  "delta_f": -113.58,
  "uncertainty": 1.17,
  "from_lambda": [0.0, 0.0, 0.8],
  "to_lambda": [0.0, 0.0, 0.9],
  "units": "kT",
  "overlap": {
    "scalar": 0.0183,
    "eigenvalues": [1.0, 0.9817]
  },
  "provenance": {
    "estimator": "mbar",
    "temperature_k": 300.0,
    "decorrelate": true,
    "remove_burnin": 0,
    "auto_equilibrate": false,
    "fast": false,
    "conservative": true,
    "nskip": 1,
    "u_nk_observable": "de",
    "lambda_components": ["mass-lambda", "coul-lambda", "vdw-lambda"],
    "windows": 15,
    "samples_in": 300,
    "samples_after_burnin": 300,
    "samples_kept": 126
  }
}

Library Guide

The top-level alchemrs crate re-exports the main types and functions needed for direct Rust use.

Use the library when the CLI is not enough:

embedding alchemrs inside another Rust application
building custom preprocessing or estimation pipelines
sharing the same scientific core used by the existing Python bindings

Common imports

#![allow(unused)]
fn main() {
use alchemrs::{
    AtmDirection, AtmEstimator, AtmSample, AtmSampleSet, AtmSchedule, AtmState,
    DecorrelationOptions, MbarEstimator, MbarOptions, NesEstimator, NesOptions, TiEstimator,
    TiOptions, UwhamEstimator,
    decorrelate_dhdl, decorrelate_u_nk_with_observable, extract_dhdl,
    extract_nes_trajectory, extract_u_nk_with_potential, mbar_convergence, nes_convergence,
};
}

TI example

#![allow(unused)]
fn main() {
use alchemrs::{TiEstimator, TiOptions, extract_dhdl};

fn run_ti(paths: &[String], temperature_k: f64) -> Result<(), Box<dyn std::error::Error>> {
    let mut series = Vec::new();
    for path in paths {
        series.push(extract_dhdl(path, temperature_k)?);
    }

    let estimator = TiEstimator::new(TiOptions::default());
    let fit = estimator.fit(&series)?;
    let result = fit.result()?;
    println!("TI dG = {:.6}", result.delta_f());
    Ok(())
}
}

UWHAM example

#![allow(unused)]
fn main() {
use alchemrs::{UwhamEstimator, extract_u_nk};

fn run_uwham(paths: &[String], temperature_k: f64) -> Result<(), Box<dyn std::error::Error>> {
    let mut windows = Vec::new();
    for path in paths {
        windows.push(extract_u_nk(path, temperature_k)?);
    }

    let fit = UwhamEstimator::default().fit(&windows)?;
    let result = fit.result_with_uncertainty()?;
    println!("UWHAM dG = {:.6}", result.values()[result.n_states() - 1]);
    Ok(())
}
}

UwhamFit::write_reference_inputs(...) is also available when you want to export logQ.csv, size.csv, and the fitted reference outputs for comparison against external UWHAM implementations.

MBAR example with `EPtot` decorrelation

#![allow(unused)]
fn main() {
use alchemrs::{
    DecorrelationOptions, MbarEstimator, MbarOptions, MbarSolver, UNkMatrix,
    decorrelate_u_nk_with_observable, extract_u_nk_with_potential,
};

fn load_windows(
    paths: &[String],
    temperature_k: f64,
) -> Result<Vec<UNkMatrix>, Box<dyn std::error::Error>> {
    let mut windows = Vec::new();
    for path in paths {
        let (u_nk, epot) = extract_u_nk_with_potential(path, temperature_k)?;
        let decorrelated =
            decorrelate_u_nk_with_observable(&u_nk, &epot, &DecorrelationOptions::default())?;
        windows.push(decorrelated);
    }
    Ok(windows)
}

fn run_mbar(windows: &[UNkMatrix]) -> Result<(), Box<dyn std::error::Error>> {
    let estimator = MbarEstimator::new(MbarOptions::default());
    let fit = estimator.fit(windows)?;
    let result = fit.result_with_uncertainty()?;
    let delta_index = result.n_states() - 1;
    println!("MBAR dG = {:.6}", result.values()[delta_index]);
    if let Some(labels) = fit.lambda_labels() {
        println!("lambda components = {:?}", labels);
    }

    let overlap = fit.overlap_matrix()?;
    println!("overlap = {:.6}", fit.overlap_scalar()?);
    if let Some(labels) = overlap.lambda_labels() {
        println!("overlap components = {:?}", labels);
    }
    Ok(())
}
}

MbarOptions::default() uses MbarSolver::Lbfgs. It automatically falls back to fixed-point when the evaluated state grid contains zero sampled-count states. To force fixed-point explicitly:

#![allow(unused)]
fn main() {
let estimator = MbarEstimator::new(MbarOptions {
    solver: MbarSolver::FixedPoint,
    ..MbarOptions::default()
});
}

ATM leg example

#![allow(unused)]
fn main() {
use alchemrs::{
    AtmDirection, AtmEstimator, AtmSample, AtmSampleSet, AtmSchedule, AtmState,
};

fn run_atm_leg() -> Result<(), Box<dyn std::error::Error>> {
    let leg = AtmSampleSet::new(
        AtmSchedule::new(vec![
            AtmState::new(0, AtmDirection::Forward, 0.0, 0.0, None, 0.0, 0.0, None, 0.0, 300.0)?,
            AtmState::new(1, AtmDirection::Forward, 1.0, 1.0, None, 0.0, 0.0, None, 0.0, 300.0)?,
        ])?,
        vec![
            AtmSample::new(0, 10.0, 2.0)?,
            AtmSample::new(1, 12.0, 2.0)?,
        ],
    )?;

    let estimate = AtmEstimator::default().estimate_leg(&leg)?;
    println!("ATM leg dG = {:.6}", estimate.delta_f());
    println!("ATM leg sigma = {:?}", estimate.uncertainty());
    Ok(())
}
}

For paired binding legs, use estimate_binding(...), estimate_rbfe(...), or estimate_abfe(...) on two AtmSampleSet values with opposite directions.

For GROMACS multidimensional schedules, parser-derived component names are available from the parsed windows too:

#![allow(unused)]
fn main() {
if let Some(labels) = windows[0].lambda_labels() {
    println!("window components = {:?}", labels);
}
}

OpenMM `u_kln` conversion example

If you already have reduced potentials from an OpenMM workflow as a u_kln[k][l][n] tensor, you do not need a dedicated parser. Convert each sampled-state slice into a UNkMatrix window and pass the resulting windows to the existing estimators.

The runnable example in examples/openmm_u_kln_mbar.rs demonstrates this conversion end to end.

If you prefer Python, see Python and OpenMM for the bindings API, OpenMM helper functions, and the runnable Python toy-system examples.

The same analysis module also provides plot-ready convergence series for TI, BAR, MBAR, IEXP, DEXP, and NES:

#![allow(unused)]
fn main() {
let points = mbar_convergence(windows, Some(MbarOptions::default()))?;
for point in points {
    println!(
        "windows={} delta_f={:.6} from={:?} to={:?}",
        point.n_windows(),
        point.delta_f(),
        point.from_state().lambdas(),
        point.to_state().lambdas()
    );
}
}

For equilibrium TI and MBAR workflows, the analysis layer also provides time-convergence series. These recompute the full estimator after truncating every window to the same elapsed simulation time:

#![allow(unused)]
fn main() {
let points = mbar_time_convergence(windows, Some(MbarOptions::default()), None)?;
for point in points {
    println!(
        "time_ps={} delta_f={:.6}",
        point.elapsed_time_ps(),
        point.delta_f()
    );
}
}

NES example

#![allow(unused)]
fn main() {
use alchemrs::{extract_nes_trajectory, NesEstimator, NesOptions};

fn run_nes(paths: &[String], temperature_k: f64) -> Result<(), Box<dyn std::error::Error>> {
    let mut trajectories = Vec::new();
    for path in paths {
        trajectories.push(extract_nes_trajectory(path, temperature_k)?);
    }

    let estimator = NesEstimator::new(NesOptions::default());
    let result = estimator.estimate(&trajectories)?;
    println!("NES dG = {:.6}", result.delta_f());
    println!("NES sigma = {:?}", result.uncertainty());
    Ok(())
}
}

If you enable the optional plotting feature, those convergence points can be rendered directly to SVG:

#![allow(unused)]
fn main() {
#[cfg(feature = "plotting")]
{
use alchemrs::{ConvergencePlotOptions, render_convergence_svg};

let svg = render_convergence_svg(
    &points,
    Some(ConvergencePlotOptions {
        title: "MBAR Convergence".to_string(),
        ..ConvergencePlotOptions::default()
    }),
)?;
assert!(svg.contains("<svg"));
}
}

Choosing the right preprocessing entry point

Use decorrelate_dhdl for TI inputs.
Use decorrelate_u_nk(..., UNkSeriesMethod::DE, ...) when you want an internal u_nk-derived scalar series.
Use decorrelate_u_nk_with_observable when you already have a finite external observable such as EPtot.

Top-level re-exports

The root crate re-exports:

analysis
data
error
estimators
parse
prep

You can use either:

the direct re-exported functions and types
the crate namespaces for more explicit imports

depending on your style preference.

Python and OpenMM

The repository now includes Python bindings and pure-OpenMM example workflows in addition to the Rust CLI and library APIs.

This page focuses on local development usage from the repository checkout. A more polished installation workflow can be added later when the Python package build/release flow is finalized.

Current layout

Relevant paths:

pyproject.toml: maturin project configuration for the Python package
python/alchemrs: Python package wrapper around the native extension
python/tests: Python-side test coverage
python/examples/amber_fixture_analysis.py: bundled AMBER fixture analysis through the Python API
python/examples/atm_analysis.py: bundled ATM leg analysis through the Python API
python/examples/openmm_u_kln_mbar.py: pure-OpenMM equilibrium MBAR toy example
python/examples/openmm_nes.py: pure-OpenMM nonequilibrium switching toy example
python/examples/openmm_atm.py: pure-OpenMM ATM-style toy example

Local usage

Recommended workflow:

from the repository root, run maturin develop
run tests or examples against the installed local package

POSIX shell:

python -m pytest python/tests -q
python python/examples/amber_fixture_analysis.py
python python/examples/atm_analysis.py

PowerShell:

python -m pytest .\python\tests -q
python .\python\examples\amber_fixture_analysis.py
python .\python\examples\atm_analysis.py

The maturin develop workflow is preferred because it handles the platform-specific native extension name and import path for you. If you run tests directly from a checkout without installing the extension, make sure the rebuilt native module is staged in python/alchemrs and set PYTHONPATH=python for that shell.

Bundled AMBER fixture workflow

Use python/examples/amber_fixture_analysis.py for the simplest end-to-end Python example against real repository data. It:

parses the bundled fixtures/amber/acetamide_tiny dataset
runs TI directly from dH/dlambda
runs MBAR from u_nk plus EPtot decorrelation
prints dG values in kcal/mol

Bundled ATM fixture workflow

Use python/examples/atm_analysis.py for the smallest pure-Python ATM example. It:

builds forward and reverse ATM legs from arrays
converts them into AtmSampleSet
runs alchemrs.ATM().estimate_leg(...)
combines the two legs with estimate_rbfe(...)

Python API shape

The bindings mirror the Rust crate structure at a high level:

alchemrs.parse
alchemrs.prep
alchemrs.estimators
alchemrs.analysis
alchemrs.atm
alchemrs.openmm

Top-level estimator aliases are also available:

alchemrs.ATM
alchemrs.MBAR
alchemrs.BAR
alchemrs.TI
alchemrs.IEXP
alchemrs.NES

alchemrs.MBAR() uses the same default MBAR solver policy as the Rust API and CLI: L-BFGS by default, with the Rust core automatically falling back to fixed-point when the evaluated state grid has zero sampled-count states. Pass solver="fixed-point" when you need explicit fixed-point behavior.

OpenMM support model

alchemrs does not currently ship a dedicated OpenMM parser. Instead, the Python side exposes conversion helpers and runnable examples that turn OpenMM quantities into the validated alchemrs data model.

Current reusable helper module:

alchemrs.openmm.kt_in_kcal_per_mol(temperature_k)
alchemrs.openmm.windows_from_u_kln(...)
alchemrs.openmm.switching_trajectories_from_work_values(...)
alchemrs.openmm.atm_schedule_from_lambdas(...)
alchemrs.openmm.atm_sample_set_from_arrays(...)

For ATM-style workflows, the Python bindings expose a dedicated analysis model:

alchemrs.atm.AtmState
alchemrs.atm.AtmSchedule
alchemrs.atm.AtmSample
alchemrs.atm.AtmSampleSet
alchemrs.atm.ATM
alchemrs.atm.schedule_from_arrays(...)
alchemrs.atm.sample_set_from_arrays(...)

This keeps the core design simple:

equilibrium reduced potentials map to UNkMatrix
nonequilibrium work values map to SwitchingTrajectory
ATM schedule metadata plus per-sample state_id, potE, and pertE map to AtmSampleSet

Once those objects exist, the standard prep, estimators, and analysis APIs apply unchanged.

Equilibrium OpenMM workflow

Use python/examples/openmm_u_kln_mbar.py when you want a concrete reference for:

sampling equilibrium configurations at multiple lambda windows
evaluating all samples on all lambda states to form u_kln
converting u_kln into UNkMatrix windows
running alchemrs.MBAR()

The example prints dG in kcal/mol and reports the MBAR overlap scalar.

Nonequilibrium OpenMM workflow

Use python/examples/openmm_nes.py when you want a concrete reference for:

running forward-only switching trajectories
running separate forward and reverse switching campaigns
converting switching work into SwitchingTrajectory
running alchemrs.NES() and alchemrs.analysis.nes_convergence(...)

The current example uses a toy harmonic oscillator so it can run with plain openmm from pip install openmm.

ATM-style OpenMM workflow

Use python/examples/openmm_atm.py when you want a concrete reference for:

defining an ATM leg schedule from arrays
sampling separate thermodynamic states in OpenMM
recording the per-sample quantities used by AToM-style analysis:
- state_id
- potE
- pertE
converting those samples into alchemrs.atm.AtmSampleSet
running alchemrs.ATM().estimate_leg(...) and estimate_rbfe(...)

ATM uncertainty uses analytical UWHAM covariance by default. For Python workflows that want bootstrap instead, construct the estimator as:

estimator = ar.ATM(uncertainty="bootstrap", n_bootstrap=256, seed=7)

The example is intentionally analysis-focused. It does not parse a standard ATM output format because OpenMM-based ATM workflows do not currently share one. Instead, it documents the stable integration boundary for custom workflows:

schedule metadata
sampled state ids
potential energies
perturbation energies

That is the Python-level equivalent of the AToM analysis data that feeds its UWHAM step.

Scope and limitations

Current OpenMM support in this repository is example-driven, not parser-driven.

That means:

no dedicated src/parse/openmm.rs
no claim of drop-in support for arbitrary OpenMM output files
examples focus on converting known in-memory quantities into alchemrs

This is intentional. OpenMM is a simulation engine, not a single canonical free-energy file format, so the stable integration point is the validated alchemrs data model rather than ad hoc file parsing.

Compatibility and Limitations

This section describes the current scope of the codebase and where it intentionally matches external tools.

Current compatibility targets

The repository includes fixtures and tests intended to compare behavior against:

alchemlyb
pymbar
the R UWHAM package

Important matches include:

u_nk DE decorrelation semantics
pymbar-style statistical inefficiency estimation
pymbar-style automated equilibration detection
UWHAM free-energy estimates and pooled-input exports validated against committed R reference CSVs
AMBER NES Jarzynski estimates and preprocessing semantics validated against external reference scripts

Important intentional difference:

alchemrs reports equilibration Neff_max using the actual suffix length N - t0, so this scalar can differ by 1 from pymbar / alchemlyb

Current parser scope

The implemented parser surface supports AMBER outputs, including AMBER nonequilibrium switching outputs for NES, and GROMACS dhdl.xvg outputs.

AMBER still has the broader and more battle-tested workflow coverage, but the documented and tested parsing workflow now includes both engines. Multidimensional GROMACS schedules are supported for u_nk-based analysis; multidimensional TI is still unsupported.

ATM, UWHAM, and OpenMM workflows currently enter through the typed data model and Python helper APIs rather than through a dedicated file parser. The project does not currently expose UWHAM or ATM as CLI commands because MD engines do not yet emit a common, analysis-ready file format for those workflows.

One-dimensional lambda assumption

The data-model types can store more general state metadata, and UNkMatrix-based estimators now support multidimensional lambda states when windows share a consistent evaluated-state grid.

The remaining one-dimensional restriction is TI: DhdlSeries is still scalar, so multidimensional dH/dlambda workflows are not yet supported.

NES trajectories are also currently one-dimensional in the advisor path because the retained lambda_path / dvdl_path diagnostics assume one switching coordinate.

Non-finite `u_nk`

Positive infinity values are allowed in UNkMatrix, but not every preprocessing path can use them safely.

Practical consequence:

de and all observable choices can fail during decorrelation if the derived scalar series is non-finite
epot can remain usable if the external observable is finite

Documentation scope

This book documents the current implementation, not a frozen public-stability promise. The project is still early enough that:

APIs may still evolve
defaults may still be refined
more formal hosted documentation infrastructure may still change

What is not covered here

This repository does not currently provide:

a large general-purpose plotting/reporting surface beyond the current SVG helpers and advisor HTML reports
production-grade parser coverage for every major MD engine
a dedicated CLI surface for ATM or direct UWHAM analysis until native engine outputs or a common interchange format make that command-line workflow useful
a polished published Python distribution workflow beyond the current repo-local maturin setup
versioned public API guarantees

Those may come later, but they are not part of the implemented surface documented in this book.

Keyboard shortcuts

alchemrs