P-values Explained By Data Scientist

 3 years ago
source link: https://www.tuicool.com/articles/uQfAfyr

For Data Scientists


I remember my first overseas internship, as a summer student at CERN. Most people there were still talking about the discovery of the Higgs boson, which had been confirmed once it met the “five sigma” threshold (which corresponds to a p-value of about 0.0000003).
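As a quick sanity check on that number, here is a minimal Python sketch. It assumes the one-sided convention, where the p-value is the upper-tail probability of a standard normal distribution beyond five standard deviations. The function name `one_sided_p_value` is my own and not from any library.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond n_sigma.

    Assumption: a one-sided test, so only the upper tail is counted.
    """
    # For a standard normal Z, P(Z > x) = 0.5 * erfc(x / sqrt(2)).
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


p_five_sigma = one_sided_p_value(5.0)
print(f"{p_five_sigma:.2e}")  # roughly 2.87e-07
```

That is about 1 in 3.5 million, which rounds to the 0.0000003 quoted above.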

Back then I knew nothing about p-values, hypothesis testing or even statistical significance.

And you’re right.

I went and googled the term “p-value”, and what I found on Wikipedia made me even more confused…

In statistical hypothesis testing, the p-value or probability value is, for a given statistical model, the probability that, when the null hypothesis is true, the statistical summary (such as the absolute value of the sample mean difference between two compared groups) would be greater than or equal to the actual observed results.


Well done Wikipedia.

Okay. I ended up not really understanding what a p-value meant.

It was only after moving into data science that I finally began to appreciate what a p-value means and how it can be used as a decision-making tool in certain experiments.

So I decided to write this article to explain p-values and how they are used in hypothesis testing, and hopefully give you a better, more intuitive understanding of them.

While we can’t skip the definition of the p-value or the concepts it builds on, I promise to keep this explanation intuitive, without bombarding you with all the technical terms I ran into.

There are four sections in this article. Together they give you the full picture, from constructing a hypothesis test to understanding the p-value and using it to guide our decision making. I strongly encourage you to go through all of them:

  1. Hypothesis Testing
  2. Normal Distribution
  3. What is P-value?
  4. Statistical Significance

It’ll be fun. 😉

Let’s get started!
