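Abstract (translated from the Chinese):
|
Objective To build a recurrence model for patients with hepatocellular carcinoma by combining gene mutation data from the TCGA database with the machine learning software RapidMiner. Methods Clinical data and whole-genome sequencing mutation data for 316 patients with hepatocellular carcinoma were first collected from the TCGA database. R and SPSS 19.0 were then used to screen out the 127 most frequently mutated genes and 12 frequently mutated genes significantly associated with disease-free survival (DFS). Using RapidMiner 8.0, the mutation data of the 316 patients were used to train decision tree and support vector machine (SVM) models. Results The decision tree model built from the genes screened from the TCGA database had an accuracy of 77.42%, and the SVM model, built to corroborate the decision tree model, had a maximum accuracy of 77.42%. Conclusions A recurrence model for patients with hepatocellular carcinoma built from a public database can be used clinically to analyze a patient's genetic testing report. Besides providing information on drug treatment targets, it can give a preliminary indication of prognosis. In addition, patients with limited financial means can have testing focused on the genes in the decision tree to predict prognosis and the likelihood of recurrence.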
|
Abstract: Objective To construct a recurrence model for patients with hepatocellular
carcinoma (HCC) by combining gene mutation data from the TCGA database with the machine
learning software RapidMiner. Methods Clinical data and whole-genome sequencing mutation
data for 316 patients with HCC were collected from the TCGA database. The 127 most
frequently mutated genes, and 12 frequently mutated genes significantly associated with
disease-free survival (DFS), were screened using R and SPSS 19.0. The mutation data of the
316 patients were then used to train decision tree and support vector machine (SVM) models
in RapidMiner 8.0. Results The decision tree model built from the genes screened from the
TCGA database had an accuracy of 77.42%, and the SVM model, built to corroborate the
decision tree model, had a maximum accuracy of 77.42%. Conclusions A recurrence model for
patients with HCC built from a public database can be used in clinical practice to analyze
a patient's genetic testing report. In addition to providing information on drug treatment
targets, it can give a preliminary indication of prognosis. Patients with limited financial
means can have testing focused on the genes in the decision tree to predict prognosis and
recurrence.
|
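The screening step described in the Methods (keeping the most frequently mutated genes and encoding each patient by whether those genes are mutated) can be illustrated with a minimal sketch. This is not the authors' R/SPSS or RapidMiner workflow; the function names `top_mutated_genes` and `mutation_matrix`, the patient IDs, and the gene labels are all hypothetical, and the toy data are made up for illustration only.

```python
from collections import Counter


def top_mutated_genes(mutations, k):
    """Return the k genes mutated in the most patients.

    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    counts = Counter()
    for genes in mutations.values():
        counts.update(set(genes))  # count each gene at most once per patient
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:k]]


def mutation_matrix(mutations, genes):
    """Encode each patient as a 0/1 vector over the selected genes."""
    return {
        patient: [1 if gene in set(mutated) else 0 for gene in genes]
        for patient, mutated in mutations.items()
    }


# Made-up toy input: hypothetical patient ID -> list of mutated genes.
toy_mutations = {
    "P1": ["GENE_A", "GENE_B"],
    "P2": ["GENE_A", "GENE_C"],
    "P3": ["GENE_B", "GENE_A", "GENE_A"],
    "P4": ["GENE_C"],
}

selected = top_mutated_genes(toy_mutations, k=2)
features = mutation_matrix(toy_mutations, selected)
```

A binary table of this kind (one row per patient, one column per selected gene) is the sort of input a decision tree or SVM learner could be trained on, with recurrence status as the label; the study itself reports doing that training in RapidMiner 8.0.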