Privacy-Aware Genome Mining: Server-Assisted Protocols for Private Set Intersection and Pattern Matching


The Human Genome Project has generated a wealth of information. Almost the entire human genome has now been sequenced, and the next task is to identify the function of each gene. The genome comprises approximately 3 billion base pairs. While many efficient algorithms and implementations exist for mining this information, doing so privately remains a major challenge. State-of-the-art privacy-preserving methods have become more efficient, but they are not yet practical.

In this article, we introduce several protocols that drastically improve the performance of genome mining while guaranteeing privacy, thus enabling practical implementations. We describe how to solve the private set intersection problem and how to answer a set of pattern matching queries privately. The proposed protocols are server-assisted, and we prove that they are secure in the semi-honest model. We evaluate our solution on synthetic datasets and demonstrate its efficiency.
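
To make the server-assisted setting concrete, the following is a minimal sketch of one generic way to compute a private set intersection with the help of a semi-honest server. It is an illustration under stated assumptions, not the protocols proposed in this article: the two clients are assumed to share a secret key that the server never sees, each client sends only keyed HMAC-SHA256 tags of its items, and the server intersects the tags. All function names (`make_tags`, `upload`, `server_intersect`, `psi_demo`) and the sample items are hypothetical. In this simplified form the server still learns the set sizes and the size of the intersection.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets


def make_tags(key: bytes, items: set[str]) -> dict[bytes, str]:
    """Client side: map the keyed tag of each item back to the item."""
    return {
        hmac.new(key, item.encode(), hashlib.sha256).digest(): item
        for item in items
    }


def upload(tags: dict[bytes, str]) -> list[bytes]:
    """Client side: release only the tags, in a random order."""
    out = list(tags)
    secrets.SystemRandom().shuffle(out)
    return out


def server_intersect(tags_a: list[bytes], tags_b: list[bytes]) -> set[bytes]:
    """Server side: intersect opaque tags without knowing the key."""
    return set(tags_a) & set(tags_b)


def psi_demo(items_a: set[str], items_b: set[str]) -> set[str]:
    # Assumption: the clients agreed on this key over a private channel.
    key = secrets.token_bytes(32)
    tags_a = make_tags(key, items_a)
    tags_b = make_tags(key, items_b)
    common = server_intersect(upload(tags_a), upload(tags_b))
    # Each client maps the returned tags back to its own items.
    return {tags_a[tag] for tag in common}


if __name__ == "__main__":
    # Hypothetical short sequences standing in for genomic items.
    print(psi_demo({"ACGT", "GGCA", "TTAG"}, {"GGCA", "TTAG", "CCCC"}))
```

Because the server sees only keyed tags, it cannot test candidate items on its own. That protection depends on the server following the protocol and not colluding with either client, which is the kind of assumption the semi-honest model captures.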