The Ultimate Guide to Understanding Conditional Probability

Data science combines statistics and programming, and you need both even to begin learning its fundamentals. The good news is that you don't have to be an expert in either; a solid understanding of the basics is enough to get started. In this tutorial, you will explore one of the core concepts of statistics: conditional probability.

Important Terminologies

Before turning to conditional probability itself, review the terms and concepts it builds on.

Random Experiment

A random experiment (or probabilistic experiment) is one that, when repeated under identical conditions, does not produce the same outcome every time; each trial yields one of several possible outcomes.

When a die is rolled, there are only six possible outcomes: 1, 2, 3, 4, 5, or 6. However, it is not possible to predict which one will occur on any given roll.

Event

An event is an outcome, or a set of outcomes, of a random experiment. Getting heads when you toss a coin is an event. Getting a 4 when you roll a fair die is an event.

Sample Space

The sample space is the collection of all possible outcomes of an experiment. For example, a sample space for a single throw of a die will be {1,2,3,4,5,6}.

Mutually Exclusive and Exhaustive Events

Mutually Exclusive Events: Two or more events associated with a random experiment are said to be mutually exclusive (or disjoint) if the occurrence of any one of them prevents the occurrence of all the others.

Heads and tails are two mutually exclusive events when tossing a fair coin. This is because if one of them turns up, the other cannot turn up in the same toss.

Exhaustive Events: Two or more events associated with a random experiment are exhaustive if their union is the sample space.

Thus, a set of events associated with a random experiment is exhaustive if at least one of them necessarily occurs whenever the experiment is performed.

Consider the experiment of drawing a card from a well-shuffled deck of playing cards. Let A be the event "card is red" and B be the event "card is black". A and B are exhaustive events because A ∪ B = S. They are also mutually exclusive, since no card is both red and black.
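The two definitions can be checked with ordinary set operations. The following is a minimal sketch, assuming a standard 52-card deck modeled as (rank, suit) pairs; the helper names mutually_exclusive and exhaustive are hypothetical, introduced only for illustration.

```python
# Hypothetical model of a standard deck: each card is a (rank, suit) pair.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suit_colors = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}
deck = {(rank, suit) for rank in ranks for suit in suit_colors}  # sample space S

red = {card for card in deck if suit_colors[card[1]] == "red"}      # event A
black = {card for card in deck if suit_colors[card[1]] == "black"}  # event B


def mutually_exclusive(a, b):
    """Two events are mutually exclusive if they share no outcome."""
    return a.isdisjoint(b)


def exhaustive(sample_space, *events):
    """Events are exhaustive if their union is the whole sample space."""
    return set().union(*events) == sample_space


print(mutually_exclusive(red, black))  # True
print(exhaustive(deck, red, black))    # True
```

Here "red" and "black" are both mutually exclusive and exhaustive, whereas "red" alone is not exhaustive because it leaves out every black card.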

Algebra of Events

Let A and B be two events associated with a random experiment with sample space S. You define the event “A or B” which is said to occur if an elementary event favorable to either A or B or both is an outcome. In other words, the event “A or B” occurs if either A or B or both occur. 

The event “A or B” is the subset A ∪ B of the sample space S, read as “A union B”.

For example, in a single throw of a die, if you define 

A = Getting an even number, B = Getting a multiple of 3.

Then,

A = { 2, 4, 6 } and B = { 3, 6 }

A ∪ B = { 2, 3, 4, 6 }
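The same union can be reproduced with Python sets. This is a small sketch of the die example above; the variable names are chosen here for illustration.

```python
# Sample space for a single throw of a die.
sample_space = set(range(1, 7))

A = {x for x in sample_space if x % 2 == 0}  # getting an even number
B = {x for x in sample_space if x % 3 == 0}  # getting a multiple of 3

a_or_b = A | B  # set union corresponds to the event "A or B"
print(sorted(a_or_b))  # [2, 3, 4, 6]
```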

The event “A and B” is said to occur if an elementary event favorable to both A and B is an outcome. In other words, the event “A and B” occurs if A and B both occur. The event A and B is denoted by A ∩ B. This can also be called A intersection B.

For example, in a single throw of a pair of dice, if you define

A = Getting an even number on first-die

B =  Getting 8 as the sum of the numbers on two dice,

Then, 

 A ∩ B = Getting an even number on the first die such that the sum of the numbers is 8

= { (2, 6), (6, 2), (4, 4) }
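The intersection can be confirmed by listing all 36 outcomes of a pair of dice. A minimal sketch, assuming each outcome is written as an ordered pair (first die, second die):

```python
from itertools import product

# All 36 ordered outcomes for a pair of dice.
sample_space = set(product(range(1, 7), repeat=2))

A = {(d1, d2) for (d1, d2) in sample_space if d1 % 2 == 0}   # even number on the first die
B = {(d1, d2) for (d1, d2) in sample_space if d1 + d2 == 8}  # sum of the two dice is 8

a_and_b = A & B  # set intersection corresponds to the event "A and B"
print(sorted(a_and_b))  # [(2, 6), (4, 4), (6, 2)]
```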

Probability of an Event

If there are n elementary events associated with a random experiment and m of them are favorable to an event A, then the probability of occurrence of A is denoted by P(A) and is defined as the ratio m/n.

Example: Find the probability of getting heads in a toss of an unbiased coin.

The sample space associated with the random experiment is S = { H, T }

There are two elementary events, H and T, associated with this random experiment. Of these two, only one, H, is favorable.

Hence, probability = 1/2
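The ratio m/n is easy to compute when outcomes are equally likely. The sketch below assumes events are represented as Python sets and uses a hypothetical helper named probability; Fraction keeps the result exact.

```python
from fractions import Fraction


def probability(event, sample_space):
    """Return m/n, where m counts favorable outcomes and n counts all outcomes.

    Assumes every outcome in sample_space is equally likely.
    """
    favorable = event & sample_space  # ignore anything outside the sample space
    return Fraction(len(favorable), len(sample_space))


coin = {"H", "T"}
print(probability({"H"}, coin))  # 1/2

die = set(range(1, 7))
print(probability({2, 4, 6}, die))  # 1/2
```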

Conditional Probability

Let A and B be two events associated with a random experiment. The probability that A occurs given that B has already occurred, where P(B) ≠ 0, is called the conditional probability of A given B. It is denoted by P(A/B); many texts write P(A | B) instead. Thus, you have

P(A/B) = Probability of occurrence of A given that B has already occurred.

P(B/A) = Probability of occurrence of B given that A has already occurred.
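These verbal definitions correspond to the standard ratio formula for conditional probability. It is written out below in LaTeX, keeping this article's P(A/B) notation; n(E), the number of outcomes in an event E, is notation assumed here for the equally-likely case.

```latex
\[
P(A/B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) \neq 0
\]
\[
P(B/A) = \frac{P(A \cap B)}{P(A)}, \qquad P(A) \neq 0
\]
% When all outcomes are equally likely, the ratio reduces to a count:
\[
P(A/B) = \frac{n(A \cap B)}{n(B)}
\]
```

The counting form is the one that applies whenever outcomes favorable to B and to "A and B" are counted directly.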

Example 1: Let there be a bag containing 5 white and 4 red balls. Two balls are drawn from the bag, one after the other, without replacement. Consider the following events:

A = Drawing of a white ball in the first draw, B = Drawing a red ball in the second draw.

Now, after drawing the white ball in the first draw, you are left with 8 balls, out of which 4 are red.

P(B/A) = Probability of drawing a red ball in the second draw given that a white ball has already been drawn in the first draw.

= 4/8 = 1/2
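The result of Example 1 can be cross-checked by listing every ordered pair of draws. This is a sketch under the assumption that each ball is equally likely to be drawn; the labels W1 to W5 and R1 to R4 are hypothetical names used only to tell the balls apart.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels for 5 white and 4 red balls.
balls = [f"W{i}" for i in range(1, 6)] + [f"R{i}" for i in range(1, 5)]

# Ordered pairs of distinct balls model two draws without replacement (9 * 8 = 72).
draws = list(permutations(balls, 2))

# A: the first ball drawn is white.
A = [d for d in draws if d[0].startswith("W")]
# A and B: the first ball is white and the second ball is red.
A_and_B = [d for d in A if d[1].startswith("R")]

# P(B/A) = n(A and B) / n(A) for equally likely outcomes.
p_b_given_a = Fraction(len(A_and_B), len(A))
print(p_b_given_a)  # 1/2
```

Counting gives 20 favorable pairs out of 40, which agrees with the 4/8 obtained by reasoning about the 8 balls left after the first draw.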

Example 2: Consider the random experiment of throwing a pair of dice and two events associated with it given by:

A = The sum of the numbers on two dice is 8 = { (2,6), (6,2), (3,5), (5,3), (4,4) }

B = There is an even number on the first die = { (2,1), (2,2), ..., (2,6), (4,1), ..., (4,6), (6,1), ..., (6,6) }

In this case, events A and B are subsets of the same sample space, so P(A/B) can be found by counting outcomes.

Now,

Outcomes favorable to B = 18

Outcomes favorable to both A and B = 3, namely (2,6), (4,4), and (6,2)

P(A/B) = Probability of occurrence of A when B occurs

= 3/18 = 1/6
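Example 2 can be verified the same way, by counting outcomes. The sketch below assumes equally likely outcomes and uses a hypothetical helper named conditional; it also computes P(B/A) to show that the two conditional probabilities generally differ.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes for a pair of dice.
sample_space = set(product(range(1, 7), repeat=2))

A = {o for o in sample_space if sum(o) == 8}     # sum of the numbers is 8
B = {o for o in sample_space if o[0] % 2 == 0}   # even number on the first die


def conditional(a, b):
    """Return P(a/b) = n(a and b) / n(b), assuming equally likely outcomes."""
    if not b:
        raise ValueError("conditioning event must contain at least one outcome")
    return Fraction(len(a & b), len(b))


p_a_given_b = conditional(A, B)
p_b_given_a = conditional(B, A)
print(p_a_given_b)  # 1/6
print(p_b_given_a)  # 3/5
```

Swapping the roles of A and B changes the denominator from 18 outcomes to 5, so P(A/B) and P(B/A) are not interchangeable.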

Data Science and Conditional Probability

Statistical inferences are commonly used in Data Science to predict or analyze trends from data. Statistical inferences use probability distributions of data. To work effectively on data science problems, knowing probability and its applications is essential.

Conditional probability is central to this work. Data scientists need a thorough understanding of probability to solve complex problems, and a strong foundation in probability and conditional probability to understand and implement the relevant algorithms.

Conclusion

After reading this article, we hope you have a much better understanding of Conditional Probability and how it is used in statistical theory. 

About the Author

Aryan Gupta

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning.
