StabilityPMPNN
ProteinMPNN-based stability predictor from the ProteinGuide paper. Trained on the Rocklin Megascale stability dataset to predict thermodynamic stability (ΔΔG) from structure + sequence.
- Encode/decode split: encode_structure() runs once per structure (expensive), decode() runs per sequence sample (cheap). This maps naturally to ProbabilityModel's preprocess_observations/forward pattern.
- Tokenizer: MPNNTokenizer with 21 tokens (20 standard AAs + UNK). When used with TAG, include_mask_token=True adds <mask> at idx 21.
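The encode-once, decode-many call pattern described above can be sketched as follows. This is a minimal illustration, not the library's implementation: ToyStabilityModel, its call counters, the dictionary-shaped encoding, and the toy scoring rule are all hypothetical stand-ins. Only the method names encode_structure and decode come from the text above.

```python
from dataclasses import dataclass


@dataclass
class ToyStabilityModel:
    """Hypothetical stand-in for the real predictor; only the call pattern matters."""

    encode_calls: int = 0
    decode_calls: int = 0

    def encode_structure(self, coords):
        # In the real model this is the expensive structure-encoder pass.
        # Here it is a toy summary of the coordinates.
        self.encode_calls += 1
        n = len(coords)
        centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
        return {"length": n, "centroid": centroid}

    def decode(self, encoding, sequence):
        # Cheap step: score one sequence against the cached encoding.
        self.decode_calls += 1
        if len(sequence) != encoding["length"]:
            raise ValueError("sequence length must match structure length")
        # Toy number, NOT a real stability prediction.
        return sum(ord(aa) for aa in sequence) / (100.0 * encoding["length"])


model = ToyStabilityModel()
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]

# Run the expensive step once per structure and keep the result.
encoding = model.encode_structure(coords)

# Reuse the cached encoding for every sequence sample.
scores = [model.decode(encoding, seq) for seq in ["ACD", "AAA", "GGG"]]
```

In a preprocess_observations/forward split, the cached encoding would presumably be produced in the preprocessing step and consumed on each forward call, so sampling many sequences for one structure pays the encoder cost only once.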
Cross-tokenizer behavior
The stability predictor overrides token_ohe_basis() so that the <mask> token maps to an all-zero one-hot encoding (OHE) row. This preserves the original PMPNN masking semantics while making mask behavior explicit in the interface, and it is a key integration point with TAG's GuidanceProjection.
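To make the all-zero mask row concrete, here is a small sketch under stated assumptions. The function toy_token_ohe_basis, the helper project, the alphabet ordering, and the shape of the basis (one row per token id, one column per original 21-token vocabulary entry) are hypothetical; the actual return type of token_ohe_basis() and the way GuidanceProjection consumes it may differ.

```python
# Assumed ordering: 20 standard amino acids, with X standing in for UNK.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYX"
MASK_IDX = len(ALPHABET)  # 21, the <mask> index stated in the text


def toy_token_ohe_basis(include_mask_token=True):
    """Hypothetical basis: row i is the one-hot vector for token id i."""
    n = len(ALPHABET)
    rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if include_mask_token:
        # <mask> maps to an all-zero row, so it adds no one-hot signal.
        rows.append([0.0] * n)
    return rows


def project(token_ids, basis):
    """Look up the basis row for each token id."""
    return [basis[t] for t in token_ids]


basis = toy_token_ohe_basis()
tokens = [ALPHABET.index("A"), MASK_IDX, ALPHABET.index("W")]
features = project(tokens, basis)
```

With this layout a masked position yields a zero feature vector in the original 21-dimensional space, which is one way to read "preserving original PMPNN masking semantics": the extra token id exists in the tokenizer but never introduces a new input dimension.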