Characterizing novel ncRNAs by integrating high-throughput data and structure predictions
Topic: Characterizing novel ncRNAs by integrating high-throughput data and structure predictions
Speaker: Zhi John Lu (MOE Key Lab of Bioinformatics, School of Life Science, Tsinghua University)
Time: 2013-11-26 14:00-16:00
Venue: Room 1114, Science Building No. 1 (Statistics Center event)
I will present incRNA, an integrative machine-learning method for genome-wide identification of non-coding RNAs (ncRNAs). It combines large-scale expression data, epigenetic data, RNA secondary-structure stability, and evolutionary conservation at both the protein and nucleic-acid levels. Using this model, we separated known ncRNAs from coding sequences and other genomic elements with high accuracy (AUC > 93% on an independent validation set) and identified thousands of novel ncRNA candidates in C. elegans and Arabidopsis.
In addition, we characterized the novel ncRNA candidates and found that they show distinct expression patterns across developmental stages, tend to fall into novel RNA structural families, and are targeted by specific transcription factors. Overall, our study identifies many new potential ncRNAs in different systems.
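For readers unfamiliar with this kind of approach, the following is a minimal sketch of how several feature types can be combined in one classifier and scored by AUC on held-out data. It is not the incRNA implementation: the abstract does not name the classifier, so plain logistic regression is used as a stand-in, and the feature names, the synthetic data, and all function names are assumptions made for illustration only.

```python
import math
import random

# Hypothetical feature names, loosely mirroring the data types in the abstract.
FEATURES = ["expression", "epigenetic_signal", "structure_stability", "conservation"]


def make_toy_data(n, rng):
    """Synthetic examples (label 1 = ncRNA-like, 0 = other). Not real data."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Positives are shifted by +1 on every feature, purely for illustration.
        x = [rng.gauss(float(label), 1.0) for _ in FEATURES]
        data.append((x, label))
    return data


def predict(weights, bias, x):
    """Logistic score in (0, 1); z is clamped to keep exp() well-behaved."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=200, lr=0.1):
    """Batch gradient descent on the logistic loss (stand-in classifier)."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(FEATURES)
        grad_b = 0.0
        for x, y in data:
            err = predict(weights, bias, x) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
train_set = make_toy_data(400, rng)
test_set = make_toy_data(200, rng)  # held out, analogous to an independent validation set

weights, bias = train(train_set)
scores = [predict(weights, bias, x) for x, _ in test_set]
labels = [y for _, y in test_set]
print(f"held-out AUC on toy data: {auc(scores, labels):.3f}")
```

The AUC printed here reflects only how the toy data were generated and says nothing about the performance reported in the talk.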