Concepts You Have to Know for Data Science Interviews — Part II. Probability
Most frequently asked questions in data scientist interviews
In the last article in the series (Concepts You Have to Know for Data Science Interviews — Part I: Distribution), I touched on the basics of distributions — the most important distributions and their characteristics that might show up in data science interviews. In this article, I want to continue the tutorial with common probability questions that companies like to ask DS candidates.
Probability is a complicated subject that might be hard to master in a short period of time if you really want to understand it in depth. In fact, it is a required course that spans a whole semester for math majors in college. So this article will definitely NOT make you an expert in probability, but instead will give you an idea of the most commonly tested areas within this topic.
Conditional Probability / Bayes’ Theorem
Conditional probability is THE most tested type of probability problem in DS interviews. It’s a helpful skill because a lot of the analytics questions we run into in our day-to-day job involve conditional probability: for example, what’s the probability of having a disease given that the test for the disease is positive? Conditional probability is usually calculated with Bayes’ Theorem:
P(A|B) = P(B|A) * P(A) / P(B)
A, B → events
P(A|B) → probability of A given that B is true
P(B|A) → probability of B given that A is true
P(A), P(B) → the marginal probabilities of A and B (the probability of each event on its own, without conditioning on the other)
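To make the disease-test question concrete, here is a minimal Python sketch of Bayes’ Theorem. The numbers (1% prevalence, 95% of sick people test positive, 5.9% of all tests come back positive) are hypothetical values chosen for illustration, and the function name bayes is my own.

```python
def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs, not taken from any real test:
p_disease = 0.01            # P(A): prevalence of the disease
p_pos_given_disease = 0.95  # P(B|A): positive test given disease
p_pos = 0.059               # P(B): overall rate of positive tests

p_disease_given_pos = bayes(p_pos_given_disease, p_disease, p_pos)
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")  # 0.161
```

Under these assumed numbers, a positive result still leaves only about a 16% chance of having the disease, because the disease itself is rare.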
Independence is an important concept when learning about conditional probability. It’s worth noting that if P(A|B) = P(A), meaning “the probability of A given B” is the same as “probability of A”, then the two events A and B are independent.
A good example of independent events is the result of each coin flip; for a fair coin:
A: flip a head in the 2nd flip → P(A) = 1/2
B: flip a tail in the 1st flip → P(B) = 1/2
A|B: flipping a head in the 2nd flip given that you flipped a tail in the 1st flip → P(A|B) = 1/2
Knowing that you flipped a tail in the 1st flip does NOT change the probability of flipping a head next, so those two events are independent.
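The coin example can be checked by listing all four equally likely outcomes of two fair flips. This is a small sketch using exact fractions, with A = head on the 2nd flip and B = tail on the 1st flip; the helper names are my own.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair flips: (1st flip, 2nd flip)
outcomes = list(product("HT", repeat=2))


def prob(event) -> Fraction:
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))


def head_second(o) -> bool:  # event A
    return o[1] == "H"


def tail_first(o) -> bool:  # event B
    return o[0] == "T"


p_a = prob(head_second)
p_b = prob(tail_first)
p_a_and_b = prob(lambda o: head_second(o) and tail_first(o))
p_a_given_b = p_a_and_b / p_b  # definition of conditional probability

print(p_a, p_a_given_b)    # 1/2 1/2
print(p_a_given_b == p_a)  # True, so A and B are independent
```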
In interviews, key phrases like “given that” or “conditioning on” are a strong hint that you should consider using Bayes’ Theorem.
Total Probability
The law of total probability is stated below:
P(A) = P(A|Bx) * P(Bx) + P(A|By) * P(By)
“Bx” and “By” are disjoint events that together cover the whole sample space. Suppose a group consists of people from either SF or Seattle. We know that 40% of the group are from SF (P(Bx)) and 60% are from Seattle (P(By)); the probability of Seattle being rainy today is 70% (P(A|By)) and the probability of SF being rainy today is 20% (P(A|Bx)). If we randomly call a person in the group, what is the probability that it is rainy in their city today (P(A))?
Using the law of total probability, we can easily get P(A) = 0.4*0.2 + 0.6*0.7 = 0.08 + 0.42 = 0.5
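The same calculation as a short Python sketch. The dictionary layout and the function name total_probability are my own choices; the probabilities are the ones from the SF/Seattle example.

```python
def total_probability(priors: dict, likelihoods: dict) -> float:
    """Return P(A) = sum over i of P(A|B_i) * P(B_i).

    priors[i] is P(B_i); likelihoods[i] is P(A|B_i).
    The B_i must be disjoint and cover the whole sample space.
    """
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("P(B_i) must sum to 1")
    return sum(likelihoods[i] * priors[i] for i in priors)


p_city = {"SF": 0.4, "Seattle": 0.6}             # P(Bx), P(By)
p_rain_given_city = {"SF": 0.2, "Seattle": 0.7}  # P(A|Bx), P(A|By)

p_rain = total_probability(p_city, p_rain_given_city)
print(round(p_rain, 2))  # 0.5
```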
The law of total probability is often used when calculating conditional probabilities, because in Bayes’ Theorem, P(A|B) = P(B|A) * P(A) / P(B), the denominator P(B) is often not given directly and has to be calculated using the law of total probability.
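As an illustration of the two working together, take a follow-up question that is my own extension of the SF/Seattle example: given that it is rainy where the person we called lives, what is the probability they are from Seattle? Here the denominator of Bayes’ Theorem is P(rainy), which comes from the law of total probability. A minimal sketch:

```python
p_city = {"SF": 0.4, "Seattle": 0.6}             # P(city)
p_rain_given_city = {"SF": 0.2, "Seattle": 0.7}  # P(rainy | city)

# Denominator via the law of total probability: P(rainy)
p_rain = sum(p_rain_given_city[c] * p_city[c] for c in p_city)

# Bayes' Theorem for each city: P(city | rainy)
p_city_given_rain = {
    c: p_rain_given_city[c] * p_city[c] / p_rain for c in p_city
}

print(round(p_city_given_rain["Seattle"], 2))  # 0.84
print(round(p_city_given_rain["SF"], 2))       # 0.16
```

The two results sum to 1, which is a quick sanity check worth doing out loud in an interview.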
Binomial Probability
We talked about the binomial distribution in my previous article about distributions. Binomial probability uses the binomial distribution to calculate the probability of getting exactly k successes in n trials of an experiment that has only two outcomes (a good example is flipping a coin):
P(X = k) = n! / (k! * (n-k)!) * p^k * (1-p)^(n-k)
k: number of successful trials
n: total number of trials
p: probability of success in a single trial
n!: n factorial, calculated as n*(n-1)*(n-2)*…*3*2*1
k!: k factorial, calculation similar to above
(n-k)!: you get the point
Recognizing that the question involves an event with only two outcomes is the key to deciding when to use the binomial formula to calculate probabilities.
One good example of this is the probability of getting exactly 4 heads in 10 coin flips (assuming it’s an unfair coin, and for every flip there’s a 1/4 probability of getting a head and a 3/4 probability of getting a tail).
In this example:
n = 10, k=4, p=1/4
The calculation for this would be 10!/(4!*6!)*(1/4)⁴*(3/4)⁶ = 210*(1/4)⁴*(3/4)⁶ ≈ 0.146
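To double-check the arithmetic, here is a small Python sketch of the binomial formula. The function name binomial_pmf is my own; math.comb needs Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Unfair coin from the example: P(head) = 1/4, exactly 4 heads in 10 flips
p_four_heads = binomial_pmf(k=4, n=10, p=0.25)

print(comb(10, 4))            # 210
print(f"{p_four_heads:.4f}")  # 0.1460
```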
How these are tested and how to prepare for them
These concepts are tested in a lot of different ways, but the general theme is that they are usually tested in combination rather than in isolation. As mentioned above, the Law of Total Probability can easily be combined with Bayes’ Theorem in interview questions.
It’s important to note that just memorizing these formulas will not be enough for interviews, because nobody will tell you which formula to use when. Determining the correct one to use is arguably the hardest part. The best way to master this is to work through a large number of sample questions and try to solve them yourself, NOT look at the solution first. Getting familiar with the types of questions that require these formulas is key. For sample questions, a simple Google search will give you plenty of results for each. If you want a more centralized “database” of probability questions, I highly recommend the book Ace the Data Science Interview by Nick Singh and Kevin Huo.
Looking for more interview guides? Here are some articles that might be helpful!
Concepts You Have to Know for Data Science Interviews — Part I: Distribution
Most frequently asked questions in data scientist interviews (towardsdatascience.com)
Take-Home Exercises can Make or Break Your DS Interviews
How to tackle your take-home exercise, arguably the most important part of the interview and the part that’s most in… (towardsdatascience.com)
The Ultimate Interview Prep Guide for Data Scientists and Data Analysts
What helped me interview successfully with FANG as well as unicorns (towardsdatascience.com)
The Ultimate Interview Prep Guide for Your Next Dream Data Job
What helped me interview successfully with FANG and unicorns for jobs ranging from product manager to data scientist (towardsdatascience.com)