Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Education and Culture Executive Agency (EACEA). Neither the European Union nor EACEA can be held responsible for them.


Heart Disease Dataset

This database contains 76 attributes, but all published experiments use a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to date. The "goal" field refers to the presence of heart disease in the patient. It is integer-valued, from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0). The names and social security numbers of the patients were recently removed from the database and replaced with dummy values. One file, the one containing the Cleveland database, has been "processed". All four unprocessed files also exist in this directory. To see Test Costs (donated by Peter Turney), please see the folder "Costs".
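
The presence/absence split described above can be sketched in Python as follows. This is a minimal illustration, not the method used in the published experiments. The column names are an assumption (short names commonly used for the 14-attribute subset, with the last column taken as the "goal" field), the helper names `load_rows` and `binarize_goal` are hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io

# Assumed short names for the 14-attribute subset; the last column is
# treated as the "goal" field (0 = absence, 1..4 = presence).
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "goal"]


def load_rows(text_stream):
    """Parse comma-separated records into dicts keyed by COLUMNS."""
    rows = []
    for record in csv.reader(text_stream):
        if len(record) != len(COLUMNS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(COLUMNS, record)))
    return rows


def binarize_goal(rows):
    """Map goal 0 to 0 (absence) and goal 1..4 to 1 (presence)."""
    labels = []
    for row in rows:
        goal = int(float(row["goal"]))
        if not 0 <= goal <= 4:
            raise ValueError(f"goal out of range: {goal}")
        labels.append(0 if goal == 0 else 1)
    return labels


# Made-up rows for illustration only; not taken from the dataset.
# The "?" stands in for a missing value (an assumed convention).
SAMPLE = """\
60,1,4,130,250,0,2,140,1,1.5,2,1,7,2
45,0,3,120,210,0,0,165,0,0.0,1,0,3,0
52,1,2,125,230,0,0,155,0,0.4,1,?,3,4
"""

rows = load_rows(io.StringIO(SAMPLE))
labels = binarize_goal(rows)
print(labels)
```

To apply the same idea to the processed Cleveland file, the `io.StringIO(SAMPLE)` argument could be replaced with an opened file handle, assuming the file is comma-separated with the goal field in the last column.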

Scientific Area:
Machine Learning

MATLAB, Python

Target Group:

Cite as:
Aha, David W. (2019), “Heart Disease Data Set”, UCI Machine Learning Repository.

Author of the review:
Inês Sena
Research Centre in Digitalization and Intelligent Robotics (CeDRI) - Instituto Politécnico de Bragança

