National Supercomputer Center in Guangzhou: The Deep Learning of Biomedical Big Data on Tianhe-2
AI/Machine Learning/Deep Learning
Big Data Analytics
Clouds and Distributed Computing
Time: Monday, June 17th, 5pm - 5:20pm
Description: With the revolutionary development of next-generation sequencing techniques, the amount of biological sequence data is growing exponentially and is estimated to reach the zettabyte (ZB) level within 10 years. Because biological data is high-dimensional and noisy, with relatively few samples, it is challenging to learn directly from it. Fortunately, extensive effort has gone into studying the molecular mechanisms underlying life, and the accumulated knowledge provides a reliable basis for efficiently mining biomedical big data. At the same time, learning from big data can rapidly expand the available knowledge in the life sciences. Using the “Tianhe-2” supercomputer, we have employed deep learning techniques (CNNs, RNNs, and their combinations) to develop a series of bioinformatics algorithms for accurate prediction of protein structures, functions, and interactions. These predictions were further integrated to analyze noisy sequencing data for biological applications, including annotation of disease-causing mutations, cancer prognosis, and drug discovery and repositioning. Moreover, we are developing a biomedical cloud platform that integrates biomedical databases and bioinformatics tools for data analysis and prediction. Such a platform will provide a one-stop site for both biological and medical applications.
Professor, National Supercomputer Center in Guangzhou