Title: Analyses and comparison of K-nearest neighbour and AdaBoost algorithms for genotype imputation
Authors: Abbas Mikhchi, Mahmood Honarvar, Nasser Emam Jomeh Kashan, Saeed Zerehdaran, Mehdi Aminafshar
Abstract
Genomic selection has become a standard tool in dairy cattle breeding. However, for other animal species, implementation of this technology is hindered by the high cost of genotyping. Genotypic imputation is defined as the prediction of genotypes, for both unrelated individuals and parent-offspring trios, at single nucleotide polymorphism (SNP) locations for which assays are not directly available in a sample of individuals. Several imputation methods designed for livestock populations are available. Machine learning methods have been used in genetic studies to build models capable of predicting missing values of a marker. In this study, strategies and factors affecting the imputation accuracy of parent-offspring trios were compared using two machine learning methods, namely K-nearest neighbour (KNN) and AdaBoost (AB). The methods were applied to simulated data to impute the un-typed SNPs in parent-offspring trios. Two datasets, D1 (100 trios with 5k SNPs) and D2 (500 trios with 5k SNPs), were simulated. The methods were compared in terms of imputation accuracy, computation time, and a factor affecting imputation accuracy (sample size). KNN outperformed AB in imputation accuracy. Computation time differed between the methods, and KNN was the faster algorithm. Imputation accuracy increased with the number of trios. Results on the simulated datasets showed that the methods performed very well for imputation of un-typed SNPs and can be used as an alternative to other methods for imputation of parent-offspring trios.
Keywords
Trios; machine learning methods; imputation accuracy; computation time
@article{paperid:1053648,
author = {Abbas Mikhchi and Mahmood Honarvar and Nasser Emam Jomeh Kashan and Zerehdaran, Saeed and Mehdi Aminafshar},
title = {Analyses and comparison of K-nearest neighbour and AdaBoost algorithms for genotype imputation},
journal = {Research Opinions in Animal and Veterinary Sciences},
year = {2015},
volume = {5},
number = {7},
month = {September},
issn = {2221-1896},
pages = {295--299},
numpages = {5},
keywords = {Trios; machine learning methods; imputation accuracy; computation time},
}
%0 Journal Article
%T Analyses and comparison of K-nearest neighbour and AdaBoost algorithms for genotype imputation
%A Abbas Mikhchi
%A Mahmood Honarvar
%A Nasser Emam Jomeh Kashan
%A Zerehdaran, Saeed
%A Mehdi Aminafshar
%J Research Opinions in Animal and Veterinary Sciences
%@ 2221-1896
%D 2015