K-Means Clustering and Analyze of SARS-CoV 2 DNA based on Multiple Encoding Vector and K-Mer Method

Evander Banjarnahor, Alhadi Bustamam, Titin Siswantining, Wibowo Mangunwardoyo

Abstract

According to WHO data, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) had affected more than 172.6 million people worldwide by early June 2021. The virus targets the human respiratory system, causing lung infections and, in some cases, death. It is therefore vital to investigate the kinship among coronavirus sequences in order to limit the spread of the virus. This study groups the sequences with the K-Means clustering method and analyzes them with the Multiple Encoding Vector method. The sequence analysis yields an 18-dimensional multiple encoding vector, which is compared with the K-Mer method based on the translation of DNA codons into amino acids. DNA sequences of SARS-CoV-2 were collected from numerous affected countries for this investigation. The simulation results show that the SARS-CoV-2 DNA sequences form two clusters, with the second cluster containing the most members. The results also indicate that the method separates the data well, with a ratio of between-cluster sum of squares to total sum of squares (between SS / total SS) of 81.4%.
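As a rough illustration of the pipeline described above, the following sketch counts k-mer frequencies, clusters them with a plain Lloyd-style K-Means, and reports the between SS / total SS ratio. It is a minimal sketch under stated assumptions, not the authors' implementation: the 18-dimensional multiple encoding vector and the codon-to-amino-acid step are not reproduced, normalized 2-mer frequencies stand in as features, and the function names and toy sequences are hypothetical.

```python
from itertools import product
import random


def kmer_frequencies(seq, k=2):
    """Normalized k-mer frequencies over the ACGT alphabet (stand-in features)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k=2, iters=100, seed=0):
    """Lloyd's algorithm; initial centers are k distinct input points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = mean(members)
    return labels, centers


def between_over_total(points, labels, centers):
    """between SS / total SS, using between SS = total SS - within SS."""
    grand = mean(points)
    total_ss = sum(sq_dist(p, grand) for p in points)
    within_ss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return (total_ss - within_ss) / total_ss if total_ss else 0.0


# Hypothetical toy sequences (not data from the study).
toy_seqs = ["ATATATATAT", "AATTAATTAA", "TATATTATAT",
            "GCGCGCGCGC", "GGCCGGCCGG", "CGCGCCGCGC"]
toy_points = [kmer_frequencies(s, k=2) for s in toy_seqs]
toy_labels, toy_centers = kmeans(toy_points, k=2)
print(toy_labels, between_over_total(toy_points, toy_labels, toy_centers))
```

A ratio close to 1 means most of the variance lies between clusters rather than within them, which is how the 81.4% figure can be read.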


 

How to Cite
Evander Banjarnahor, Alhadi Bustamam, Titin Siswantining, Wibowo Mangunwardoyo. (2021). K-Means Clustering and Analyze of SARS-CoV 2 DNA based on Multiple Encoding Vector and K-Mer Method. Annals of the Romanian Society for Cell Biology, 18647–18658. Retrieved from https://annalsofrscb.ro/index.php/journal/article/view/8355