
Structure-Conditioned Sampling (ESM3)

Architecture Breakdown

Data: A PDB structure file (the conditioning input, not training data).

Models: ESM3 (pretrained generative model, structure-conditioned via set_condition_) → generative_modeling. Structure encoding uses the VQ-VAE encoder (runs once, cached).

Sampling: sample (discrete-time ancestral) → sampling

Evaluation: None in this example. For structural validation of generated sequences, see evaluation.

Generate sequences conditioned on a known protein backbone structure using ESM3.

Quick Start

uv run python examples/esm3_structure_conditioned_sampling.py

How It Works

ESM3 accepts atom37-format coordinates as conditioning input. The model's set_condition_() method runs the VQ-VAE structure encoder once, which is the expensive part; every subsequent sampling step reuses the cached structure tokens instead of re-encoding the structure.
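The encode-once, reuse-many pattern can be sketched with a toy stand-in. This is not the ESM3 class: `ToyStructureConditionedModel`, `_encode_structure`, `denoise_step`, and the `encoder_calls` counter are hypothetical names for illustration, and the "encoder" is a placeholder that emits one integer per residue. The only assumption taken from the atom37 convention is that coordinates hold 37 atom slots with x, y, z per residue.

```python
class ToyStructureConditionedModel:
    """Hypothetical stand-in that mimics caching in set_condition_()."""

    def __init__(self):
        self.encoder_calls = 0          # counts how often the encoder runs
        self._structure_tokens = None   # cache filled by set_condition_()

    def _encode_structure(self, coords):
        # Placeholder for the expensive structure encoder:
        # emits one dummy integer token per residue.
        self.encoder_calls += 1
        return [i % 7 for i in range(len(coords))]

    def set_condition_(self, coords):
        # Encode once and keep the result for all later steps.
        self._structure_tokens = self._encode_structure(coords)
        return self

    def denoise_step(self, sequence_tokens):
        if self._structure_tokens is None:
            raise RuntimeError("call set_condition_() before sampling")
        if len(sequence_tokens) != len(self._structure_tokens):
            raise ValueError("sequence length must match structure length")
        # A real model would predict residues here; the toy just pairs
        # each sequence token with its cached structure token.
        return list(zip(sequence_tokens, self._structure_tokens))


# Five residues, 37 atom slots each, xyz per slot (all zeros for the toy).
coords = [[[0.0, 0.0, 0.0] for _ in range(37)] for _ in range(5)]

model = ToyStructureConditionedModel().set_condition_(coords)
for _ in range(10):
    paired = model.denoise_step(["A"] * 5)
```

After ten steps the counter still reads one encoder call, which is the point of caching inside the conditioning method rather than inside the sampling loop.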

This is useful for inverse folding: given a desired 3D structure, generate sequences that are likely to fold into that shape.
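For intuition about what a discrete-time ancestral sampling loop does, here is a minimal sketch for a masked (absorbing-state) sequence model. It is one common form of the technique and may differ from the library's sample function. The names `toy_denoiser` and `ancestral_sample`, the `?` mask symbol, and the linear masking schedule are assumptions made for illustration, and the denoiser proposes residues uniformly at random instead of using a structure-conditioned model.

```python
import random

MASK = "?"                          # hypothetical mask symbol
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids


def toy_denoiser(tokens, rng):
    """Stand-in for a model: propose one residue per position (uniform)."""
    return [rng.choice(ALPHABET) for _ in tokens]


def ancestral_sample(length, num_steps, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * length        # start from a fully masked sequence
    for t in range(num_steps, 0, -1):
        proposal = toy_denoiser(tokens, rng)
        for i, tok in enumerate(tokens):
            # With a linear masking schedule, a still-masked position is
            # revealed with probability 1/t when moving from step t to t-1.
            # At t == 1 this probability is 1, so nothing stays masked.
            if tok == MASK and rng.random() < 1.0 / t:
                tokens[i] = proposal[i]
    return "".join(tokens)


sequence = ancestral_sample(length=12, num_steps=5, seed=1)
```

In the structure-conditioned setting, the denoiser would additionally see the cached structure tokens at every step, so the proposals favor residues compatible with the backbone.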

Fixed-length sequences

ESM3 structure conditioning requires every sequence in a batch to have the same length L as the conditioning structure. The structure tokens have shape (L+2,): L per-residue tokens plus one BOS and one EOS token.
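A small pre-flight check makes the length constraint concrete. This is a sketch with hypothetical helpers (`wrap_structure_tokens`, `check_batch_lengths`) and placeholder BOS/EOS IDs; the real token IDs and any built-in validation belong to the library and are not shown here.

```python
BOS, EOS = -1, -2  # placeholder special-token IDs, not the real vocabulary


def wrap_structure_tokens(residue_tokens):
    """Turn L per-residue tokens into an (L+2)-long list with BOS/EOS."""
    return [BOS, *residue_tokens, EOS]


def check_batch_lengths(sequences, structure_tokens):
    """Raise if any sequence length differs from the structure length L."""
    expected = len(structure_tokens) - 2  # strip BOS and EOS
    for i, seq in enumerate(sequences):
        if len(seq) != expected:
            raise ValueError(
                f"sequence {i} has length {len(seq)}, expected {expected}"
            )
    return expected


structure_tokens = wrap_structure_tokens([11, 42, 7, 19])  # L = 4
batch = ["ACDE", "GHIK"]
length = check_batch_lengths(batch, structure_tokens)
```

Running the check before sampling turns a shape mismatch deep inside the model into an early, readable error.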

See Models → ESM3 for details on structure conditioning.

Source: examples/esm3_structure_conditioned_sampling.py