SEQ; transcriptome expression; multi-mapping; non-uniformity

Published: 2017-01-02 08:39

  Keywords for this article: Improved Transcriptome Expression Analysis for RNA-Seq Data; compiled and published by 笔耕文化传播.


Improved Transcriptome Expression Analysis for RNA-Seq Data


Shi Xinxin, Liu Xuejun, Zhang Li (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China)

Abstract: RNA-Seq (RNA sequencing), based on high-throughput sequencing, is a new technique for transcriptome research. To address two difficulties in analyzing transcript expression from RNA-Seq data, namely the multi-mapping of reads to isoforms and the non-uniform distribution of reads along the reference, an improved method, LDASeqII (improvement of latent Dirichlet allocation for sequencing data), is proposed. LDASeqII uses the known gene-isoform annotation (the splice-isoform structure) to constrain the hyperparameters and normalizes the read count of each individual exon by exon length, which resolves multi-mapping under a non-uniform read distribution. A "pseudo-exon" and a "pseudo-transcript" are introduced to handle junction reads and noise reads, respectively. LDASeqII is validated on two real datasets for gene and transcript expression calculation and compared with the original LDASeq (latent Dirichlet allocation for sequencing data) and with two other popular methods, Cufflinks and RSEM (RNA-Seq by expectation maximization). The results show that LDASeqII obtains more accurate transcript and gene expression measurements than the other approaches.
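As a small illustration of the per-exon normalization mentioned in the abstract, the sketch below divides each exon's read count by the exon's length. It is not the authors' implementation: the function name `normalize_exon_counts`, the dictionary layout, the exon IDs, the toy numbers, and the per-kilobase scaling are all assumptions made for this example.

```python
def normalize_exon_counts(read_counts, exon_lengths, scale=1000.0):
    """Return reads per `scale` bases for each exon.

    Hypothetical helper: `read_counts` and `exon_lengths` are dicts keyed
    by exon ID; the per-kilobase scale is an assumption, not from the paper.
    """
    if read_counts.keys() != exon_lengths.keys():
        raise ValueError("read_counts and exon_lengths must share exon IDs")
    normalized = {}
    for exon_id, count in read_counts.items():
        length = exon_lengths[exon_id]
        if length <= 0:
            raise ValueError(f"exon {exon_id!r} has non-positive length")
        # Dividing by length removes the bias toward longer exons.
        normalized[exon_id] = scale * count / length
    return normalized


# Toy data (made up): a long exon with many reads, a short exon with few.
counts = {"exon1": 200, "exon2": 50}
lengths = {"exon1": 1000, "exon2": 250}
densities = normalize_exon_counts(counts, lengths)
print(densities)
```

In this toy input the raw counts differ by a factor of four, but the length-normalized densities are equal, which is the kind of comparison that raw read counts alone would obscure.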

Keywords: gene expression; RNA-Seq; transcript expression; multi-mapping; non-uniformity

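Funding: supported by the National Natural Science Foundation of China (61170152) and the Fundamental Research Funds for the Central Universities (CXZZ11_0217).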

 

 



Article link: https://www.wllwen.com/shoufeilunwen/benkebiyelunwen/231369.html

