CLASSIFICATION OF METAGENOME FRAGMENTS USING PRINCIPAL COMPONENT ANALYSIS AND K-NEAREST NEIGHBOR

  • Surianti Surianti STMIK Umel Mandiri
Keywords: Classification, Metagenome fragments, n-mers

Abstract

Metagenomics is the study of genetic material obtained directly from environmental samples. Metagenome sequencing produces fragments from a mixture of organisms, so assembling the fragments directly generates chimeric contigs. A binning process is therefore required to classify these fragments at a particular taxonomic level. In this study, features of metagenome fragments were extracted using n-mer frequencies, reduced in dimension using principal component analysis (PCA), and classified using k-nearest neighbor (KNN). The experiments were conducted on fragment lengths ranging from 0.5 Kbp to 10 Kbp. The best results were obtained using KNN with k=7 and 4-mer frequencies. With PCA 95%, the accuracy of classifying known organisms ranged from 91.6% to 99.9%. Accuracy decreased slightly when classifying unknown organisms, ranging from 89.64% to 99.32%.
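
The feature extraction and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names `kmer_frequencies` and `knn_predict`, the Euclidean distance, the majority vote, and the toy sequences are all assumptions made for illustration. The PCA reduction step is left out to keep the example free of third-party libraries; in a full pipeline it would sit between the two functions.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, n=4):
    """Return normalized frequencies of all 4**n n-mers over A, C, G, T.

    Windows containing other symbols (e.g. N) are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=n)]
    seq = seq.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[k] / total for k in kmers]


def knn_predict(train_vectors, train_labels, query, k=7):
    """Label a query vector by majority vote among its k nearest neighbors.

    Assumption: Euclidean distance; the abstract does not name the metric.
    """
    neighbors = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy fragments; real fragments are 0.5 Kbp to 10 Kbp long.
    training = [
        ("ATATTTAATATAATTA", "taxon_A"),
        ("TTAATATATTTAATAT", "taxon_A"),
        ("AATATTATATAATTTA", "taxon_A"),
        ("GCGGCCGCGCCGGCGC", "taxon_B"),
        ("CCGCGGCGCCGCGGCC", "taxon_B"),
        ("GGCCGCGCGGCGCCGC", "taxon_B"),
    ]
    vectors = [kmer_frequencies(s) for s, _ in training]
    labels = [label for _, label in training]
    query = kmer_frequencies("TATAATTTATATTAAT")
    print(knn_predict(vectors, labels, query, k=3))
```

With 4-mers each fragment becomes a 256-dimensional vector, which is why a dimension reduction step such as PCA is attractive before the distance computations in KNN.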

Published
2020-07-27
How to Cite
Surianti, S. (2020). CLASSIFICATION FRAGMEN METAGENOM MENGGUNAKAN PRINCIPAL COMPONENT ANALYSIS NEIGHBOR. Jurnal Ilmiah Matrik, 22(2), 170–176. https://doi.org/10.33557/jurnalmatrik.v22i2.921
Section
Articles