Prediction of dissociation rate constants using machine learning


Dissociation rate constants (koff) can in principle be estimated from trajectories obtained from molecular dynamics (MD) simulations. In practice, however, observing unbinding events in conventional MD simulations is challenging: dissociation of a ligand from a protein is a rare event that typically takes milliseconds or longer, whereas MD simulations are usually limited to the microsecond timescale. Moreover, many events must be sampled to obtain a reliable estimate of the dissociation rate and, for ligand design, hundreds of molecules may need to be tested against a target protein.
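To illustrate why many unbinding events are needed, the following minimal Python sketch estimates koff from a set of observed residence times. It assumes first-order (single-exponential) dissociation and that every trajectory was run until the ligand unbound; the function name `estimate_koff` and the residence times are hypothetical and serve only as an illustration.

```python
import math

def estimate_koff(residence_times_s):
    """Maximum-likelihood koff (1/s), assuming exponentially distributed unbinding times."""
    n = len(residence_times_s)
    if n == 0:
        raise ValueError("need at least one observed unbinding event")
    # For an exponential distribution, the maximum-likelihood rate is n / (sum of times).
    koff = n / sum(residence_times_s)
    # The relative standard error of this estimate is approximately 1 / sqrt(n).
    rel_err = 1.0 / math.sqrt(n)
    return koff, rel_err

# Hypothetical residence times in seconds (illustrative values, not simulation results)
times = [0.8, 1.5, 0.4, 2.1, 1.2]
koff, rel_err = estimate_koff(times)
print(f"koff ~ {koff:.3f} 1/s, relative uncertainty ~ {rel_err:.0%}")
```

Under these assumptions the relative uncertainty shrinks only as 1/sqrt(n), so reaching 10% precision would take on the order of 100 observed unbinding events per ligand.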

Approaches similar to those used to derive quantitative structure-activity relationships (QSAR) could be used to predict koff values. Computational approaches such as Comparative Binding Energy (COMBINE) analysis and the linear interaction energy (LIE) method, which were developed to predict binding affinities by weighting computed interaction energies, could be adapted to predict koff values. Other machine learning methods, such as support vector machines and random forests, can also be used. Such methods could help predict koff values reliably and at low computational cost.
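The idea of weighting interaction energies can be sketched as a small linear regression. The example below is a simplified, LIE-like model that assumes log10(koff) = alpha * V_elec + beta * V_vdw + gamma and fits the three weights by ordinary least squares. The functional form, the function names, and the training data are assumptions made for illustration: the data are synthetic values generated from made-up weights, not experimental measurements.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_weights(features, log_koff):
    """Least-squares fit of [alpha, beta, gamma] via the normal equations."""
    rows = [[v_elec, v_vdw, 1.0] for v_elec, v_vdw in features]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_koff)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict_log_koff(weights, v_elec, v_vdw):
    """Predicted log10(koff) for one complex from its two interaction energies."""
    alpha, beta, gamma = weights
    return alpha * v_elec + beta * v_vdw + gamma

# Synthetic training set: (V_elec, V_vdw) in kcal/mol and log10(koff) values
# generated from made-up weights (0.05, 0.04, 1.0). Illustrative only.
train_features = [(-30.0, -45.0), (-12.0, -50.0), (-25.0, -38.0),
                  (-8.0, -42.0), (-18.0, -55.0)]
train_log_koff = [-2.3, -1.6, -1.77, -1.08, -2.1]

weights = fit_weights(train_features, train_log_koff)
print("fitted weights:", [round(w, 4) for w in weights])
print("predicted log10(koff):", round(predict_log_koff(weights, -20.0, -40.0), 3))
```

With real data, the weights would be parameterized on complexes with known experimental koff values and validated on held-out complexes; non-linear learners such as support vector machines or random forests could be trained on the same feature table.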

The aims of this project are to develop data-driven models to predict koff values for ligand-protein complexes and to test which features of the protein and ligand yield the most predictive models.


Schematic representation of the COMBINE analysis and LIE. Interaction energies (electrostatic, Velec, and van der Waals, VvdW) from a ligand-protein complex are used to predict koff values, using weights parameterized on complexes with known experimental koff values. In the COMBINE analysis, in contrast to LIE, the interaction energies are further decomposed on a residue-wise basis.
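The difference between the two feature sets can be sketched in a few lines of Python. In this illustration, a complex is described by hypothetical per-residue interaction energies; the LIE-like description sums them into two global terms, whereas the COMBINE-like description keeps one electrostatic and one van der Waals term per residue. The residue labels, energy values, and function names are assumptions chosen for the example.

```python
# Hypothetical per-residue ligand-protein interaction energies (kcal/mol)
# for one complex: residue label -> (V_elec, V_vdw). Illustrative values only.
per_residue = {
    "ASP25": (-12.4, -1.1),
    "GLY27": (-4.0, -2.2),
    "ILE50": (-0.3, -3.8),
}

def lie_features(per_residue):
    """Two global features: total electrostatic and total van der Waals energy."""
    v_elec = sum(elec for elec, _ in per_residue.values())
    v_vdw = sum(vdw for _, vdw in per_residue.values())
    return [v_elec, v_vdw]

def combine_features(per_residue, residue_order):
    """Two features per residue, in a residue order shared by all complexes."""
    features = []
    for residue in residue_order:
        # Residues without a recorded interaction contribute zero energy.
        v_elec, v_vdw = per_residue.get(residue, (0.0, 0.0))
        features.extend([v_elec, v_vdw])
    return features

residue_order = ["ASP25", "GLY27", "ILE50"]
print("LIE-like features:    ", lie_features(per_residue))
print("COMBINE-like features:", combine_features(per_residue, residue_order))
```

Because the residue-wise description has two features per residue, it usually contains far more variables than there are complexes with measured koff values, so the weights need a regression method that tolerates many correlated features; COMBINE analysis is commonly paired with partial least squares for this reason.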