Quartiles are used to summarize a group of numbers. Instead of looking at a big list of numbers, you look at just a few numbers that give you an overall idea of the big list. Quartiles are great for reporting on a set of data and for making box-and-whisker plots. Quartiles are especially useful when you’re working with data that isn’t symmetrically distributed, or a data set that has outliers.
All numerical summaries—like mean, median, and mode—give you a few numbers to summarize a large group of data. What’s special about quartiles is that they split the data up into four equal-size groups.
Quartiles are special cases of percentiles. A percentile tells you what number is higher than a certain percent of the rest of the dataset. For example, the 90th percentile is the number that is higher than 90% of the other numbers in the group.
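To make the percentile idea concrete, here is a minimal Python sketch. The function name `percent_below` and the list `scores` are made up for illustration, and the sketch counts the share of the whole list that falls strictly below a value, which is one simple reading of the definition above.

```python
def percent_below(values, x):
    """Return the percentage of numbers in `values` that are strictly less than x."""
    return 100 * sum(v < x for v in values) / len(values)

# Hypothetical example data: ten test scores.
scores = [55, 61, 64, 70, 72, 75, 80, 84, 90, 97]

print(percent_below(scores, 97))  # 90.0, so 97 sits at about the 90th percentile
print(percent_below(scores, 72))  # 40.0
```

Statistical software often interpolates between data points, so its percentile values can differ slightly from this simple count.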
Five numbers make up the “quartiles,” although some of them have more common names. Together (the set is often called the five-number summary) they are the numbers you need to split a group of numbers into four equal-size groups. Here they are, from lowest to highest:
- Minimum, or (rarely) “0th percentile”—the smallest number in the group
- 1st quartile, Q1, or 25th percentile—the number that separates the lowest 25% of the group from the highest 75% of the group
- Median, or 50th percentile—the number in the middle of the group, when arranged from smallest to largest
- 3rd quartile, Q3, or 75th percentile—the number that separates the lowest 75% of the group from the highest 25% of the group
- Maximum, or (rarely) “100th percentile”—the largest number in the group
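The five numbers listed above can be computed with Python's standard `statistics` module. This is a sketch under stated assumptions: the helper name `five_number_summary` and the sample `data` are invented for illustration, and `method="inclusive"` is only one of several accepted conventions for locating Q1 and Q3, so other tools may report slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of at least two numbers."""
    ordered = sorted(values)
    # n=4 asks for the three cut points that split the data into four groups.
    # method="inclusive" is an assumed convention; "exclusive" is the module default.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical example data.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]
print(five_number_summary(data))  # (1, 4.0, 7.0, 10.0, 15)
```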
Finally, one other related statistic is the interquartile range, or IQR. The IQR is the distance between the 1st and 3rd quartiles (Q3 minus Q1), so it is the width of the central 50% of the data. The IQR is useful for identifying outliers. By a common rule of thumb, any data value that is more than 1.5 times the IQR below Q1 or above Q3 is called an outlier.
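The 1.5 times IQR rule can be sketched in a few lines of Python. The function name `find_outliers` and the sample `data` are hypothetical, the quartiles use the `"inclusive"` convention of `statistics.quantiles` (an assumption, since other conventions give slightly different cutoffs), and the distance is measured from Q1 on the low side and from Q3 on the high side.

```python
import statistics

def find_outliers(values):
    """Return the values lying more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                   # width of the central 50% of the data
    lower_fence = q1 - 1.5 * iqr    # anything below this counts as an outlier
    upper_fence = q3 + 1.5 * iqr    # anything above this counts as an outlier
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical example data: Q1 = 4, Q3 = 10, IQR = 6, fences at -5 and 19.
data = [1, 3, 4, 6, 7, 8, 10, 12, 40]
print(find_outliers(data))  # [40]
```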
Quartiles let us quickly divide a set of data into four groups, making it easy to see which of the four groups a particular data point is in. Some statistics only tell us about the center of the data, or a typical value. Other statistics, like range and standard deviation, tell us something about the spread of the data values. Quartiles do both!