A Deductive Learning of Heart Disease Dataset by using K Means Clustering
Abstract
Cardiovascular diseases (CVDs) are the leading cause of death globally, accounting for 17.9 million deaths each year. Hypertension, diabetes, overweight and unhealthy lifestyles jointly contribute to CVDs. Exploratory Data Analysis (EDA) is a pre-processing step used to understand the data. There are numerous methods and steps for performing EDA; however, most of them are specific, focusing on visualization and distribution. This work applies K-means clustering to a heart disease dataset and reports the share of instances assigned to each cluster for the full training set and for a 66% training set. With 2 clusters, the model assigns 43% and 57% of instances on the full training set and 46% and 54% on the 66% training set. With 3 clusters, it assigns 18%, 48% and 34% on the full training set and 25%, 50% and 25% on the 66% training set. With 4 clusters, it assigns 21%, 40%, 10% and 28% on the full training set and 24%, 13%, 26% and 37% on the 66% training set. With 5 clusters, it assigns 17%, 31%, 11%, 19% and 21% on the full training set and 23%, 14%, 20%, 33% and 11% on the 66% training set. With 6 clusters, it assigns 10%, 31%, 15%, 20%, 6% and 18% on the full training set and 16%, 18%, 15%, 22%, 13% and 15% on the 66% training set. The system proposes the optimal results for building the deductive learning model. In terms of time consumption, the models with 2, 3 and 5 clusters take 0 seconds to build on the 66% training set, compared with 0.01 seconds for 6 clusters and 0.03 seconds for 4 clusters. The models with 5 and 6 clusters have a lower sum of squared errors than the other models on both the full training set and the 66% training set.
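The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data below are synthetic two-dimensional points standing in for the heart disease dataset, the helper names (`sq_dist`, `kmeans`, `cluster_shares`) are hypothetical, and the "66% training set" is assumed to mean fitting K-means on a random 66% subset of the instances. For each number of clusters from 2 to 6, the sketch reports the share of instances per cluster, the sum of squared errors, and the build time.

```python
import random
import time


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct randomly chosen instances.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    # Within-cluster sum of squared errors for the final assignment.
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, sse


def cluster_shares(labels, k):
    """Percentage of instances assigned to each of the k clusters."""
    n = len(labels)
    return [100.0 * labels.count(c) / n for c in range(k)]


# Hypothetical stand-in data: 300 synthetic points, NOT the real dataset.
rng = random.Random(42)
centers = [(0.0, 0.0), (6.0, 6.0), (12.0, 0.0)]
data = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in centers for _ in range(100)]
rng.shuffle(data)

# Assumed meaning of "66% training set": a random 66% subset.
split = round(0.66 * len(data))
train = data[:split]

for k in range(2, 7):
    for name, subset in (("full", data), ("66%", train)):
        start = time.perf_counter()
        _, labels, sse = kmeans(subset, k)
        elapsed = time.perf_counter() - start
        shares = ", ".join(f"{s:.0f}%" for s in cluster_shares(labels, k))
        print(f"k={k} {name:>4}: shares=[{shares}] "
              f"SSE={sse:.1f} time={elapsed:.3f}s")
```

The sum of squared errors generally shrinks as more clusters are added, so a lower value for 5 or 6 clusters does not by itself show that those models fit the data better. It is usually read alongside the cluster sizes and the build time, as the abstract does.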