Heart Disease Prediction using Data Mining Techniques
Heart disease affects a huge number of people; it can become critical and sometimes leads to death, so it should be predicted before it becomes dangerous. We propose a web application for heart disease prediction using data mining. The application processes user-specific details, analyses the data supplied by the user or patient, and reports the name of the likely heart disease. This gives patients instant guidance, and doctors also receive more patients. Patients can use the details provided to contact a doctor immediately. The web application offers free online consultation: doctors can see patients' details, and patients can likewise see doctors' details.
Key words: Data Mining, Heart disease, Intelligent Data Mining, Data Mining Techniques, Neural Networks
The heart is the most significant muscular organ in humans; it pumps blood through the blood vessels of the circulatory system. Human life depends on the proper functioning of the heart. Improper functioning of the heart affects other parts of the body, such as the brain and kidneys. If blood circulation in the body is inefficient, both the heart and the brain are affected. Generally, an arrest of blood flow in the heart is called a heart attack, and an arrest of blood flow in the brain is called a stroke. Human life is absolutely reliant on the efficient working of the heart and brain. The rest of this paper is organized as follows: Section 2 describes cardiovascular disease and its prevalence. Section 3 describes the advantages of a decision support system for the prediction of heart disease. Sections 4 and 5 describe various data mining and hybrid intelligent techniques used for the prediction of heart disease.
2. Cardiovascular Disease
Cardiovascular disease is one of the principal causes of death for both men and women. The term heart disease covers a number of medical conditions that directly affect the heart and all its parts. Different types of heart-related cardiovascular diseases, with their descriptions, are given in Table 1.
Table 1: Types of Cardiovascular diseases
Heart-related cardiovascular disease: Description
Acute coronary syndromes: Blood supply to the heart muscle is suddenly obstructed
Angina: Chest pain due to a lack of blood to the heart muscle
Arrhythmia: Atypical heart rhythm
Cardiomyopathy: Heart muscle disease
Congenital heart disease: Heart malformations that are present at birth
Coronary heart disease: Arteries supplying blood to the heart muscle become obstructed
Heart failure: Heart is not pumping enough blood
Inflammatory heart disease: Inflammation of the heart muscle and/or the surrounding tissue
Ischaemic heart disease: Plaque builds up inside the coronary arteries
Rheumatic heart disease: Disease of the valves
Various risk factors that contribute to heart attack, along with its symptoms, are presented in Table 2.
Table 2. Risk Factors and Symptoms of Heart Attack
Risk Factors
· Blood cholesterol levels
· Physical inactivity
· Work stress
Symptoms of Heart Attack
· Chest discomfort
· Crushing chest pain
· Dyspnoea (shortness of breath)
2.1. Prevalence of Heart Disease
According to World Life Expectancy, India ranks 39th among all countries in the world for coronary heart disease. In India the death rate is 138.36 per 100,000. The population suffering from coronary heart disease in India by age, gender and region is presented in Table 3.
Table 3. Coronary Heart Disease in India by Age, Gender and Region
From the above table it is evident that in India women suffer from coronary heart disease (CHD) more than men. Until 2010 more people suffered from CHD in the rural areas of India than in the urban areas, whereas from 2010 onwards the reverse holds. In 2015 there is a drastic difference between the rural and urban populations suffering from CHD. In India the 40 to 49 age group is heavily affected by heart disease. The population suffering from heart disease in all age groups has doubled in the last fifteen years.
3. Decision Support System for Heart Disease Prediction and Diagnosis
Medical diagnosis can be improved by computer-based systems and algorithms that take decisions at the appropriate stages. Such systems are called decision support systems (DSSs). Intelligence also plays a role here. These systems help to predict and diagnose disease based on patient information and domain knowledge. A DSS helps to improve the quality of healthcare by providing an effective and reliable diagnosis. It can decrease the cost of treatment by providing a more specific and faster diagnosis, and it reduces the time required compared to traditional procedures. Once deployed in the cloud, these services can be used by any health organization.
3.1. Knowledge Discovery in Database
Decision support systems are developed using a knowledge base. Knowledge discovery in databases uses a data mining process that extracts useful information from a data set and transforms it into an understandable structure for further use. Data mining combines statistical analysis, machine learning and database technology to extract hidden patterns and relationships from large databases [22]. Reference [19] defines data mining as “a process of nontrivial extraction of implicit, previously unknown and potentially useful information from the data stored in a database”. Data mining uses two strategies: supervised and unsupervised learning. In supervised learning a labelled training set is used to learn the model parameters, whereas unsupervised learning uses no labelled training set.
[Figure 4 stage labels: Selection, Pre-processing, Transformation]
Figure 4. Basic Methodology for Knowledge Discovery in Database
3.2 Data Mining Algorithms
3.2.1. Neural Networks (NN)
A neural network is a parallel, distributed information-processing structure consisting of a large number of processing elements called nodes, which are interconnected via unidirectional signal channels called connections. Each processing element has a single output connection that branches into many connections, each carrying the same signal. Neural networks can be classified into two main groups according to the way they learn: supervised learning and unsupervised learning. In supervised learning the network computes a response to each input and then compares it with the target value. If the computed response differs from the target value, the weights of the network are adapted according to a learning rule. Examples of supervised learning are the single-layer perceptron and the multi-layer perceptron. In unsupervised learning the network learns by identifying special features in the problems it is exposed to. An example of unsupervised learning is the self-organizing feature map.
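The supervised learning rule described above can be illustrated with a single-layer perceptron. The following is a minimal sketch rather than the system's actual implementation: the function names, the step activation, the learning rate and the toy AND data are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Compute the network response: 1 if the weighted sum is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(
    samples: Sequence[Sequence[float]],
    targets: Sequence[int],
    learning_rate: float = 1.0,  # assumed value, not taken from the paper
    epochs: int = 20,            # assumed value, not taken from the paper
) -> Tuple[List[float], float]:
    """Adapt the weights whenever the computed response differs from the target."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            error = target - predict(weights, bias, x)
            if error != 0:
                # Perceptron learning rule: move the weights toward the target.
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
    return weights, bias


# Hypothetical toy data (logical AND), used only to show the rule converging.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
```

In a heart disease setting the inputs would be the patient attributes rather than two binary values, and a multi-layer perceptron would replace the single layer when the classes are not linearly separable.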
3.2.2. Naïve Bayesian Classifier
Naïve Bayes [1] is a classification algorithm based on Bayes' theorem, which calculates a probability by counting the frequency of values and combinations of values in historical data. Bayes' theorem finds the probability of an event occurring given the probability of another event that has already occurred.
P(B | A) = P(A and B) / P(A)
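The frequency-counting idea behind Naïve Bayes can be sketched as follows. This is an illustrative sketch under stated assumptions, not the classifier used by the system: the function names and the toy records are hypothetical, and the add-one (Laplace) smoothing is an addition that the text does not mention, included only so that unseen values do not produce a zero probability.

```python
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    feature_values = defaultdict(set)    # feature index -> distinct values seen
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def posterior(model, record):
    """Return P(label | record), assuming features are independent given the label."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for i, value in enumerate(record):
            # Add-one smoothing (an assumption, not stated in the paper).
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score *= numerator / denominator
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}


# Hypothetical historical data: (chest pain type, exercise-induced angina).
records = [("typical", "yes"), ("typical", "yes"), ("atypical", "no"),
           ("none", "no"), ("none", "no"), ("typical", "no")]
labels = ["disease", "disease", "healthy", "healthy", "healthy", "disease"]

model = train_naive_bayes(records, labels)
result = posterior(model, ("typical", "yes"))
predicted = max(result, key=result.get)
```

Here `predicted` is the label with the highest posterior probability for the queried record; the probabilities in `result` sum to 1 because they are normalised over the two classes.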
Heart disease is a term that applies to a large number of medical conditions related to the heart. These conditions describe abnormal states of health that directly affect the heart and all its parts. Heart disease is a major health problem today. This paper aims to analyse the various data mining techniques introduced in recent years for heart disease prediction. Table 1 shows different data mining techniques used in the diagnosis of heart disease over different heart disease datasets. Some papers use only one technique for the diagnosis of heart disease, as in Shadab et al. [12] and Carlos et al. [5], while other research work uses more than one data mining technique. Data mining is used in various fields in various forms. Many organizations now use data mining as a tool for data analysis in order to deal with a competitive environment. Using mining tools and techniques, various fields of business benefit by easily evaluating market trends and patterns and by producing quick and effective market trend analysis. Data mining is also a very useful tool for the diagnosis of diseases.
This paper presents an analysis of various data mining techniques that can help medical analysts or practitioners diagnose heart disease accurately. Due to resource constraints and the nature of the paper itself, the main methodology used was a survey of journals and publications in the fields of medicine, computer science and engineering.
The system comprises two major modules, as follows:
Admin Module
1. Add Training Data
2. Add Doctor Details
3. View User Details
4. View Feedback
5. View Doctor Details
6. View Training Data
User Module
1. Register (with details such as age, sex, etc.)
2. Check Heart, by providing details such as:
· Age in years
· Chest pain type
· Fasting blood sugar
· Resting electrocardiographic results (Restecg)
· Exercise-induced angina (Exang)
· Slope of the peak exercise ST segment
· CA: number of major vessels colored by fluoroscopy
· Resting blood pressure (Trestbps)
· Serum cholesterol
· Maximum heart rate achieved (Thalach)
· ST depression induced by exercise (Oldpeak)
3. The system will accordingly show a doctor to consult.
4. Give Feedback
· View Doctor
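The details collected by the Check Heart step could be held in a single record before being passed to a classifier. The sketch below is one possible shape for that record, not the application's actual data model: the class name, field names, field types and validation rules are all assumptions, since the paper lists the attributes but not their encodings or valid ranges.

```python
from dataclasses import astuple, dataclass
from typing import List


@dataclass(frozen=True)
class HeartCheckInput:
    """Hypothetical container for the attributes listed under Check Heart."""
    age_years: int
    chest_pain_type: int       # encoding is an assumption
    fasting_blood_sugar: int   # assumed flag: 1 if elevated, else 0
    restecg: int               # resting electrocardiographic result code
    exang: int                 # assumed flag: 1 if exercise-induced angina
    slope: int                 # slope of the peak exercise ST segment
    ca: int                    # number of major vessels colored by fluoroscopy
    trestbps: float            # resting blood pressure
    chol: float                # serum cholesterol
    thalach: float             # maximum heart rate achieved
    oldpeak: float             # ST depression induced by exercise

    def __post_init__(self) -> None:
        # Minimal sanity checks; the real valid ranges are not given in the paper.
        if self.age_years <= 0:
            raise ValueError("age_years must be positive")
        if self.fasting_blood_sugar not in (0, 1) or self.exang not in (0, 1):
            raise ValueError("fasting_blood_sugar and exang must be 0 or 1")
        if self.ca < 0:
            raise ValueError("ca must not be negative")

    def to_feature_vector(self) -> List[float]:
        """Flatten the record into the numeric vector a classifier would consume."""
        return [float(value) for value in astuple(self)]


# Hypothetical example values, for illustration only.
record = HeartCheckInput(
    age_years=54, chest_pain_type=2, fasting_blood_sugar=0, restecg=1,
    exang=0, slope=2, ca=0, trestbps=130.0, chol=246.0, thalach=150.0,
    oldpeak=1.0,
)
features = record.to_feature_vector()
```

Validating at construction time keeps malformed form input away from the prediction step; the checks shown are deliberately loose because the source does not specify the permitted values.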
In this paper the focus is on using different algorithms and combinations of several target attributes for effective heart attack prediction using data mining. The Decision Tree outperformed the other techniques, reaching 99.62% accuracy using 15 attributes. The accuracy of the Decision Tree and Bayesian Classification improves further after a genetic algorithm is applied to reduce the data size to an optimal subset of attributes sufficient for heart disease prediction.
The association classification technique Apriori was used along with a newer algorithm, MAFIA. Straight Apriori-based algorithms count all of the 2^k subsets of each k-item set they discover, and thus do not scale for long item sets. Other approaches use “look-aheads” to reduce the number of item sets to be counted. MAFIA is an improvement when the item sets in the database are very long.
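The level-wise counting that makes Apriori expensive on long item sets can be seen in a small sketch. This is a simplified illustration of plain Apriori, not of MAFIA and not of the code used in the surveyed work; the function name, the support threshold and the transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return every item set that occurs in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate item set.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent item sets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical patient "transactions": the findings recorded for each patient.
transactions = [
    {"high_chol", "chest_pain", "angina"},
    {"high_chol", "chest_pain"},
    {"high_chol", "angina"},
    {"chest_pain", "angina"},
    {"high_chol", "chest_pain", "angina"},
]
frequent = apriori(transactions, min_support=3)
```

Because every frequent k-item set forces all of its subsets to be counted at earlier levels, the work grows with 2^k, which is the scaling problem the paragraph above attributes to straight Apriori-based algorithms.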
1. Ankita Dewan, Meghna Sharma, “Prediction of Heart Disease Using a Hybrid Technique in Data Mining Classification”, 2nd International Conference on Computing for Sustainable Global Development, IEEE, 2015, pp. 704-706.
2. R. Alizadehsani, J. Habibi, B. Bahadorian, H. Mashayekhi, A. Ghandeharioun, R. Boghrati, et al., “Diagnosis of coronary arteries stenosis using data mining”, J Med Signals Sens, vol. 2, pp. 153-9, Jul 2012.
3. Carlos Ordonez, Edward Omincenski and Levien de Braal, “Mining Constraint Association Rules to Predict Heart Disease”, Proceedings of the 2001 IEEE International Conference on Data Mining, IEEE Computer Society, ISBN 0-7695-1119-8, 2001, pp. 433-440.
4. Deepika. N, “Association Rule for Classification of Heart Attack patients”, IJAEST, Vol 11(2), pp. 253-257, 2011.
5. Usha. K Dr, “Analysis of Heart Disease Dataset using neural network approach”, IJDKP, Vol 1(5), Sep 2011.
6. R. Setthukkarase, Kannan, “An Intelligent System for mining Temporal rules in Clinical database using Fuzzy neural network”, European Journal of Scientific Research, ISSN 1450-216, Vol 70(3), pp. 386-395, 2012.
7. M Akhil Jabbar, BL Deekshatulu, Priti Chandra, “Heart disease classification using nearest neighbor classifier with feature subset selection”, Anale. Seria Informatica, 11, 2013.
8. Chaitrali S Dangare, Sulabha S Apte, “Improved study of heart disease prediction system using data mining classification techniques”, International Journal of Computer Applications, 47(10):44–48, 2012.
9. Shadab Adam Pattekari and Asma Parveen, “Prediction System for Heart Disease Using Naive Bayes”, International Journal of Advanced Computer and Mathematical Sciences, ISSN 2230-9624, Vol 3, Issue 3, 2012, pp. 290-294.