MachineX: Jaccard Index for evaluating an ML model

In this blog, we are going to learn about the Jaccard Index, one of the metrics used to evaluate a classification ML model. But first, let’s see what evaluation metrics are.

Evaluation Metrics

Evaluation metrics tell us how well our ML models perform. Many of them measure a model’s accuracy, i.e., how well the model is likely to perform on unseen data samples, based on what it learned from the training set. To evaluate an ML model, we need a test set, usually kept separate from the training set, whose correct outputs are already known. We feed the test set into the model and compare the model’s outputs with those known outputs. Now that we are clear on what evaluation metrics are, let’s move on to the actual topic of our blog, the Jaccard Index.
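To make the idea of comparing a model’s outputs with already known outputs concrete, here is a minimal sketch in plain Python. The function name `accuracy` and the two small label lists are made up for illustration and are not part of any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the known labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    # Count the positions where the prediction agrees with the known label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical test-set labels and model predictions (illustrative only).
known = [1, 0, 1, 1]
predicted = [1, 0, 0, 1]
print(accuracy(known, predicted))  # 0.75
```

Three of the four predictions match the known labels here, so the score is 3 / 4.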

Jaccard Index

The Jaccard Index is one of the simplest ways to measure the accuracy of a classification ML model. Let’s understand it with an example. Suppose we have a labelled test set with the following labels –

y = [0,0,0,0,0,1,1,1,1,1]

And our model has predicted the labels as –

y1 = [1,1,0,0,0,1,1,1,1,1]

[Figure: Venn diagram of the test-set labels y and the predicted labels y1]

The above Venn diagram shows us the labels of the test set and the labels of the predictions, and their intersection and union.

The Jaccard Index is defined as the size of the intersection divided by the size of the union of the two labelled sets, given by the formula –

$$J(y, y_1) = \frac{|y \cap y_1|}{|y \cup y_1|} = \frac{|y \cap y_1|}{|y| + |y_1| - |y \cap y_1|}$$

For our example, the intersection of the two sets has size 8 (since eight values are predicted correctly), and the union has size 10 + 10 – 8 = 12, because each set holds 10 labels and the 8 shared ones would otherwise be counted twice. So, the Jaccard index gives us the accuracy as –

$$J(y, y_1) = \frac{8}{10 + 10 - 8} = \frac{8}{12} = \frac{2}{3}$$

So, the accuracy of our model, according to the Jaccard Index, becomes 8/12 ≈ 0.67, or about 67%.
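The calculation above can be reproduced with a few lines of plain Python. This is only a sketch of the definition used in this blog (matching positions as the intersection, and the two list sizes minus the matches as the union); the function name `jaccard_index` is our own choice and not a library API.

```python
def jaccard_index(y_true, y_pred):
    """Jaccard index as defined in this blog: matches / (|y_true| + |y_pred| - matches)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    # Intersection: positions where the predicted label equals the true label.
    intersection = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    # Union: all labels in both lists, minus the matches that were counted twice.
    union = len(y_true) + len(y_pred) - intersection
    return intersection / union


y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y1 = [1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
score = jaccard_index(y, y1)
print(round(score, 2))  # 0.67
```

If scikit-learn is installed, `sklearn.metrics.jaccard_score(y, y1, average="micro")` should give the same 8/12 for this example, since micro-averaging counts true positives, false positives and false negatives over both classes. Its default binary mode scores only the positive class, so it would report a different number.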

That is all there is to know about the Jaccard Index. Hope this blog was helpful to you. Thanks for reading.
