Exploring the Conformational Space and Classification of Proteins Using Robotics-based and Machine Learning Methods

Speaker: Fatemeh Afrasiabi

Committee Members: Prof. Nurit Haspel (Chair), Prof. Dan Simovici, Prof. Marc Pomplun, Prof. Kourosh Zarringhalam

GPD: Prof. Dan Simovici

Date: Thursday, April 20th, 2023, at 10:00 AM

Zoom Link: https://umassboston.zoom.us/j/93286955920

Abstract: Proteins are essential molecules in living organisms that perform a broad range of functions, such as catalyzing biochemical reactions, providing structural support, and acting as signaling molecules. Understanding the structure and dynamics of proteins is crucial for elucidating their functions and developing therapies for various diseases. However, the conformational space of proteins is vast, and exploring it experimentally is challenging and time-consuming. Computational methods are therefore becoming increasingly important for investigating protein conformational changes and dynamics. These methods have considerable limitations of their own, which this dissertation discusses alongside potential techniques to address them.

The goal of this Ph.D. research is to study the literature on protein conformational changes and to develop efficient and effective methods for exploring the protein conformational space, which is paramount to understanding how proteins function.

The first two projects focus on using Rapidly-exploring Random Trees (RRT) and Monte Carlo (MC) simulations to efficiently sample and explore the protein conformational space. In the first project, we propose a new RRT*-based search algorithm that outperforms previous methods in terms of exploration efficiency. We further improve the search by integrating rigidity analysis information into the exploration process to help guide the search toward low-energy conformations. In the second project, we use topological data analysis to gain insights into the shape of protein conformational spaces and to develop more practical exploration strategies. These methods are demonstrated on several benchmark protein systems and show significant improvements over existing techniques.

The final two projects explore new directions in using machine learning techniques to analyze protein conformations. In the third project, we design a method for classifying protein families based on learned compressed representations. We show that these compact representations, which we call fingerprints, capture the relevant features of protein families. In the last project, we apply Variational Autoencoders (VAEs) to molecular dynamics simulation data to explore the conformational space of proteins.

Together, this work contributes significantly to advancing our understanding of protein conformational dynamics and to developing new tools for studying protein structure-function relationships.