Decision Tree Simulator

In information theory, entropy is a measure of the impurity, uncertainty, or randomness in a dataset. For a binary target, where the class variable can take only two values, the entropy (measured in bits, using a base-2 logarithm) lies between 0 and 1, inclusive: it is 0 when all instances belong to one class and 1 when the two classes are equally represented. The higher the entropy, the more impure the dataset.

The entropy of a dataset with respect to a target/class variable is calculated using the following formula:

\(H(X) = - \sum_{i=1}^{n} (p_i \log_2(p_i)) \)

Where:

  • \(X\) is the dataset
  • \(n\) is the number of classes
  • \(p_i\) is the proportion of instances (data points/rows) belonging to class \(i\)
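The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's own code: the function name `entropy` and the choice to pass class counts (rather than proportions) are assumptions made here, and classes with zero instances are skipped under the usual convention that \(0 \log_2(0)\) is treated as 0.

```python
import math

def entropy(counts):
    """Base-2 entropy of a class distribution given as instance counts.

    `counts` is a hypothetical input format: one non-negative integer per class.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one class needs a positive instance count")
    h = 0.0
    for c in counts:
        if c > 0:  # empty classes contribute nothing (0 * log2(0) taken as 0)
            p = c / total          # proportion p_i of class i
            h -= p * math.log2(p)  # accumulate -p_i * log2(p_i)
    return h

print(round(entropy([10, 10]), 2))  # two equally sized classes -> 1.0
print(round(entropy([20, 0]), 2))   # a pure dataset -> 0.0
```

With more than two classes the same function applies unchanged; the maximum then becomes \(\log_2(n)\) rather than 1.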

Binary entropy function

The binary entropy graph shows the entropy curve for 2 classes only. With more than 2 classes, the results cannot be displayed on it.
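For reference, with exactly two classes the general formula reduces to a function of a single proportion. Writing \(p\) for the proportion of one class (so the other class has proportion \(1 - p\)), and using \(H_b\) as a label chosen here purely for illustration:

\(H_b(p) = -p \log_2(p) - (1 - p) \log_2(1 - p)\)

This curve is 0 at \(p = 0\) and \(p = 1\), and reaches its maximum of 1 at \(p = 0.5\).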

Entropy calculator

Example attribute (two classes with equal numbers of instances):

p(Class 1): 0.50
p(Class 2): 0.50
H(Attribute): 1.00
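
As a check, substituting the two proportions of 0.50 into the entropy formula gives the same result:

\(H = -\left(0.5 \log_2(0.5) + 0.5 \log_2(0.5)\right) = -\left(0.5 \cdot (-1) + 0.5 \cdot (-1)\right) = 1\)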