Decision Tree Simulator

In information theory, entropy is a measure of impurity, uncertainty, or randomness in a dataset. In datasets with binary classes, where the target variable can take only two possible values, the entropy value lies between 0 and 1, inclusive. The higher the entropy, the more impure the dataset.


The entropy of a dataset with respect to a target/class variable is calculated using the following formula:

\(E(X) = - \sum_{i=1}^{n} p_i \log_2(p_i)\)

Where:

  • \(X\) is the dataset
  • \(n\) is the number of classes
  • \(p_i\) is the proportion of instances (data points/rows) belonging to class \(i\)
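The formula above can be sketched in a few lines of Python. The function name and the convention that a zero-count class contributes nothing (since \(p \log_2 p \to 0\) as \(p \to 0\)) are assumptions for illustration, not part of the simulator itself.

```python
import math

def entropy(counts):
    """Entropy of a dataset given the instance count of each class."""
    total = sum(counts)
    e = 0.0
    for c in counts:
        if c > 0:                 # 0 * log2(0) is taken as 0 by convention
            p = c / total
            e -= p * math.log2(p)
    return e

# A pure dataset has entropy 0; an even two-class split has entropy 1.
print(entropy([20, 0]))   # → 0.0
print(entropy([10, 10]))  # → 1.0
```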

Binary entropy function

Note: the binary entropy graph represents the entropy curve for two classes only. Results for datasets with more than two classes cannot be displayed on it.
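For a two-class dataset, entropy reduces to a function of a single proportion \(p\), which is the curve the graph draws. A minimal sketch of this binary entropy function (the name `binary_entropy` is an assumption for illustration):

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) = -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):       # the limit p*log2(p) -> 0 at the endpoints
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# The curve is 0 at p = 0 and p = 1, and peaks at 1 when p = 0.5.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, round(binary_entropy(p), 3))
```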

Entropy calculator

The calculator accepts only positive integer instance counts: every input field must be filled, and at least one class must have a non-zero number of instances.
Example attribute (two classes with 10 instances each):

  p(Class 1) = 0.50
  p(Class 2) = 0.50
  E(Attribute) = 1.00
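The calculator example above can be reproduced directly from the entropy formula. This is a standalone sketch; the variable names are illustrative only.

```python
import math

counts = [10, 10]                    # two classes with 10 instances each
total = sum(counts)
probs = [c / total for c in counts]  # class proportions p_i
e = -sum(p * math.log2(p) for p in probs)

print([round(p, 2) for p in probs])  # → [0.5, 0.5]
print(round(e, 2))                   # → 1.0
```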