Collective Variable-Free Transition Path Sampling
with Generative Flow Network

ICML 2024 SPIGM Workshop

Kiyoung Seong | Seonghyun Park | Seonghwan Kim | Woo Youn Kim | Sungsoo Ahn


Abstract

Understanding transition paths between meta-stable states in molecular systems is fundamental for material design and drug discovery. However, sampling these paths via unbiased molecular dynamics simulations is computationally prohibitive due to the high energy barriers between the meta-stable states. Recent machine learning approaches are often restricted to simple systems or rely on collective variables (CVs) extracted from expensive domain knowledge. In this work, we propose to leverage generative flow networks (GFlowNets) to sample transition paths without relying on CVs. We reformulate the problem as amortized energy-based sampling over transition paths and train a neural bias potential by minimizing the squared log-ratio between the target distribution and the generator, derived from the flow matching objective of GFlowNets. Our evaluation on three proteins (Alanine Dipeptide, Polyproline Helix, and Chignolin) demonstrates that our approach, called TPS-GFN, generates more realistic and diverse transition paths than the previous CV-free machine learning approach.
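The squared log-ratio objective mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the per-path log-probabilities, and the scalar `log_z` (a learnable log-normalizer, as in trajectory-balance-style GFlowNet losses) are assumptions made for illustration, and the exact form of the paper's objective may differ.

```python
def squared_log_ratio_loss(log_p_generator, log_target_unnorm, log_z):
    """Squared log-ratio for a single sampled path.

    log_p_generator:   log-probability of the path under the biased generator
                       (hypothetical input; in practice from the biased dynamics).
    log_target_unnorm: log of the unnormalized target density of the path
                       (hypothetical input).
    log_z:             assumed learnable scalar estimating the log-normalizer.
    """
    return (log_z + log_p_generator - log_target_unnorm) ** 2


def mean_loss(batch, log_z):
    """Average the per-path loss over (log_p_generator, log_target_unnorm) pairs."""
    return sum(
        squared_log_ratio_loss(log_p, log_t, log_z) for log_p, log_t in batch
    ) / len(batch)


# Toy numbers only: each path's generator log-probability is 2.0 below its target.
batch = [(-3.0, -1.0), (-5.0, -3.0)]
loss_matched = mean_loss(batch, log_z=2.0)     # normalizer absorbs the gap
loss_mismatched = mean_loss(batch, log_z=0.0)  # gap of 2.0 remains on every path
```

The loss vanishes exactly when the generator matches the target up to the normalizing constant on every sampled path, which is why minimizing it drives the generator toward the target distribution.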

Results

Alanine Dipeptide

Alanine initial state

C5

Alanine conformation change

Conformation Change

Alanine target state

C7ax


Polyproline Helix

Poly start state

Left-handed (PP-II)

Poly isomerization

Isomerization

Poly target state

Right-handed (PP-I)


Chignolin

Chignolin initial state

Unfolded

Chignolin folding process

Folding Process

Chignolin target state

Folded

Transition paths generated by TPS-GFN. (Top) A conformation change of Alanine Dipeptide. (Middle) An isomerization of Polyproline Helix from a left-handed to a right-handed helix. (Bottom) A folding process of Chignolin.

Alanine Dipeptide


64 sampled paths for each method on the Ramachandran plot of Alanine Dipeptide. White circles indicate meta-stable states, and stars indicate transition states. (a) Paths from unbiased MD simulations fail to escape the initial meta-stable region. (b) Paths generated by PIPS pass through only one transition state. (c) Paths generated by TPS-GFN pass through both transition states. For clarity, 10 paths are highlighted.

Polyproline Helix

Isomerization of Polyproline

An isomerization of Polyproline from the meta-stable region PP-II to PP-I, generated by TPS-GFN. (Top) 3D views of three states: initial, transition, and final. The backbone of the Polyproline Helix is highlighted in green. (Middle) The potential energy of the states over time. (Bottom) The handedness of the states over time. The red line at y=0 separates PP-II from PP-I.

Chignolin

Folding process of Chignolin

A folding process of Chignolin generated by TPS-GFN. (Top) 3D views of three states: initial, transition, and final. (Middle) The potential energy over time. (Bottom) The donor-acceptor distances of the two key hydrogen bonds, ASP3OD-THR6OG and ASP3N-THR8O, over time. For a hydrogen bond to form, its donor-acceptor distance must fall below 3.5 Å, marked by the red line.
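The hydrogen-bond criterion in the caption can be checked with a few lines of code. The sketch below is illustrative only: the 3.5 Å threshold comes from the caption, while the function name and the coordinates are hypothetical and are not taken from the paper's data or code.

```python
import math

HBOND_CUTOFF_ANGSTROM = 3.5  # donor-acceptor threshold stated in the caption


def hbonds_formed(donor_acceptor_pairs, cutoff=HBOND_CUTOFF_ANGSTROM):
    """Return True if every donor-acceptor pair is closer than the cutoff.

    donor_acceptor_pairs: iterable of (donor_xyz, acceptor_xyz) tuples,
    with coordinates in angstroms (hypothetical inputs).
    """
    return all(
        math.dist(donor, acceptor) < cutoff
        for donor, acceptor in donor_acceptor_pairs
    )


# Made-up coordinates standing in for the two key donor-acceptor pairs.
folded_like = [
    ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 2.9, 0.0)),
]
unfolded_like = [
    ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0)),
    ((0.0, 0.0, 0.0), (0.0, 6.0, 0.0)),
]
folded_ok = hbonds_formed(folded_like)
unfolded_ok = hbonds_formed(unfolded_like)
```

Applying such a check to each frame of a generated path would give a simple indicator of when both key hydrogen bonds are within range, under the distance-only criterion used in the plot.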


CITE

        @article{seong2024collective,
          title={Collective Variable Free Transition Path Sampling with Generative Flow Network},
          author={Seong, Kiyoung and Park, Seonghyun and Kim, Seonghwan and Kim, Woo Youn and Ahn, Sungsoo},
          journal={arXiv preprint arXiv:2405.19961},
          year={2024}
        }