High Performance Computing Solutions to Problems on Coding Theory, Distance Learning, Mars Image Crater Detection and Cancer Structural Variations and Fusions Identification
When: 01:00PM - 02:00PM, February 14, 2020
Speaker: Hamidreza Mohebbi
Over the past 20 years, the volume of data has grown dramatically across many fields, and researchers face new challenges in working with it. My passion for high-performance computing has led me to span my research across different fields of computer science. In this dissertation, we introduce high-performance solutions to problems in machine learning and bioinformatics, such as crater detection in Mars images and structural variation detection in DNA/RNA genome data. We first present several parallel heterogeneous implementations of the Berlekamp-Massey algorithm, which plays an important role in error correction for NAND flash memories, found almost everywhere from flash drives to large-scale enterprise servers. Then, we introduce a semi-supervised learning technique for learning a novel weighted distance metric, named WDM, which learns labeled information from the training set and identifies groups among the samples of the test set to form a metric space. The effectiveness of this metric is evaluated extensively on classification tasks and on both CPU and GPU parallel platforms. The dissertation continues with a bidirectional context-based deep learning framework that learns bidirectional context-based features from both craters and their surroundings, using deep convolutional classification and segmentation models to efficiently identify sub-kilometer craters in high-resolution panchromatic images. Finally, we design two computational methods to detect cancer inter-chromosomal rearrangements and gene fusions in DNA and RNA sequencing data, respectively. These methods split candidate reads into windows, represent the windows in a binary format to reduce the huge dimension of the search space, and then search for the location of the breakpoint in the reference genome using the Jaccard distance. This dissertation, in its entirety, offers efficient solutions to a few problems in coding theory, machine learning, and bioinformatics.
In the empirical study, our approaches achieved state-of-the-art performance on multiple real-world datasets.
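As a rough illustration of the window-based breakpoint search mentioned in the abstract, the sketch below compares a read window against reference windows using the Jaccard distance. It is not the dissertation's implementation: the binary representation is assumed here to be the set of k-mers present in a window, and the function names (`kmer_set`, `split_windows`, `best_match`) and the parameters `k`, `size`, and `step` are hypothetical choices made for this example.

```python
def kmer_set(seq, k=3):
    """Assumed binary representation: the set of k-mers present in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def split_windows(seq, size, step=1):
    """Return (start, window) pairs for fixed-size windows over seq."""
    return [(start, seq[start:start + size])
            for start in range(0, len(seq) - size + 1, step)]


def best_match(read_window, reference, size, step=1, k=3):
    """Return (distance, start) of the reference window closest to read_window.

    Ties are broken in favor of the smallest start position.
    """
    query = kmer_set(read_window, k)
    return min((jaccard_distance(query, kmer_set(window, k)), start)
               for start, window in split_windows(reference, size, step))


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCA"  # toy reference, not real genome data
    print(best_match("TTGACC", reference, size=6))
```

A set-based representation keeps each comparison cheap, which is one plausible reason a binary encoding would shrink the search space; the actual encoding used in the dissertation may differ.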
Directed By: Prof. Nurit Haspel
Dissertation Committee: Prof. Nurit Haspel, Prof. Wei Ding, Prof. Dan Simovici, Prof. Kourosh Zarringhalam