
Journal of Shandong University (Health Sciences) ›› 2011, Vol. 49 ›› Issue (5): 140-.

• Article •

Applications of the kernel principal component analysis-based logistic regression model on nonlinear association study

GAO Qing-song, XUE Fu-zhong

  1. Institute of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan 250012, China
  • Received: 2011-03-22 Online: 2011-05-10 Published: 2011-05-10
  • Corresponding author: XUE Fu-zhong (1964- ), male, PhD, doctoral supervisor; research focus: experimental design and statistical analysis methods for gene mapping of complex diseases. E-mail: xuefzh@sdu.edu.cn
  • About the author: GAO Qing-song (1986- ), male, master's student; research focus: experimental design and statistical analysis methods for gene mapping of complex diseases.
  • Supported by:

    National Natural Science Foundation of China (30871392).

Abstract:

Objective    To combine kernel principal component analysis (KPCA) with the logistic regression model and propose a KPCA-based logistic regression model for nonlinear association analysis in complex disease gene mapping. Methods    For association analysis in a case-control design, KPCA was performed on the single nucleotide polymorphisms (SNPs) within a candidate gene region, and a logistic regression model was built with the kernel principal components as independent variables. The PTPN22 and RNF186 gene regions in the rheumatoid arthritis (RA) data from GAW16 were then analyzed to verify the effectiveness and practicability of the KPCA-based logistic regression model. Results    Analysis of the PTPN22 and RNF186 gene regions showed that the KPCA-based logistic regression model detected both a region that a single-locus test could detect (PTPN22) and a region that a single-locus test could not detect (RNF186). Conclusion    The KPCA-based logistic regression model is an effective nonlinear association analysis method that can identify more susceptibility regions.

Key words: Kernel principal component analysis; Logistic regression; Complex disease gene mapping; Association study
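
The two-step procedure described in the Methods (KPCA on the SNPs of a region, then logistic regression on the kernel principal components) might be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the kernel, the number of components retained, or the test statistic, so the RBF kernel, the default `gamma`, `n_components=3`, and the likelihood-ratio statistic are all assumptions. The function names are hypothetical and the genotypes are simulated, not the GAW16 data.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """RBF (Gaussian) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pcs(X, n_components, gamma):
    """Scores of the leading kernel principal components."""
    n = X.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    Kc = J @ rbf_kernel(X, gamma) @ J         # kernel centered in feature space
    vals, vecs = np.linalg.eigh(Kc)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def fit_logistic(Z, y, n_iter=50, ridge=1e-8):
    """Logistic regression with intercept via Newton-Raphson.

    Returns (coefficients, log-likelihood). The tiny ridge term only
    stabilises the linear solve.
    """
    A = np.hstack([np.ones((len(y), 1)), Z])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))
        H = A.T @ (A * (p * (1.0 - p))[:, None]) + ridge * np.eye(A.shape[1])
        step = np.linalg.solve(H, A.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(A @ beta))), 1e-12, 1.0 - 1e-12)
    return beta, float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def kpca_logistic_lr(X, y, n_components=3, gamma=None):
    """Likelihood-ratio statistic: kernel-PC model vs. intercept-only model."""
    if gamma is None:
        gamma = 1.0 / X.shape[1]              # assumed default bandwidth
    Z = kernel_pcs(X, n_components, gamma)
    _, ll_full = fit_logistic(Z, y)
    _, ll_null = fit_logistic(np.empty((len(y), 0)), y)
    return Z, 2.0 * (ll_full - ll_null)

# Simulated region: 200 subjects, 10 SNPs coded 0/1/2 (hypothetical data).
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 10)).astype(float)
# Case status depends nonlinearly on two SNPs (an arbitrary toy model).
logit = -0.5 + 1.5 * (X[:, 0] == X[:, 1])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit))).astype(float)

Z, lr_stat = kpca_logistic_lr(X, y, n_components=3)
print(Z.shape, lr_stat)
```

Under the null hypothesis of no association, the statistic would usually be compared with a chi-square distribution whose degrees of freedom equal the number of retained components. In practice, the kernel and the number of components would need to be chosen for the data at hand.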

CLC Number:

  • R195-1