Saturday, November 23, 2024

A closer look at Bing's box and whisker plots to analyze CPC data

Today, to build upon our working knowledge of data distributions, we’re going to be analyzing CPC data using box and whisker plots. If you missed the first installment, get caught up on histograms and meet us back here.

[Image: data distribution chart, with a histogram on the left and a box and whisker plot on the right]

If you’ve finished part one of this series, then the histogram on the left should look familiar. The plot on the right is a box and whisker plot, created from the very same set of CPCs that we used in part one. Hooray for continuity!

First, let’s ground ourselves in some basics. Because we are not segmenting our data in any way, we are working with a single distribution: the CPC values are plotted along the y-axis, and the x-axis is left empty.

Now, let’s go through the components of the box and whisker plot. First off, the x.

This x represents the mean value of the distribution, which you’ll recognize as the simple average often associated with your search data. For the purposes of this exercise, the x is your average CPC. The line in the middle of the box, meanwhile, represents the median.
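To make the two markers concrete, here is a minimal sketch that computes both from a list of CPCs. The values are made up for illustration and are not the data behind the charts in this series.

```python
from statistics import mean, median

# Hypothetical CPCs in dollars; illustrative values, not the article's data.
cpcs = [0.40, 0.55, 0.60, 0.75, 0.80, 0.95, 1.10, 1.30, 2.40, 5.15]

avg_cpc = mean(cpcs)       # the "x" on the box and whisker plot
median_cpc = median(cpcs)  # the line in the middle of the box

print(f"mean CPC:   {avg_cpc:.2f}")
print(f"median CPC: {median_cpc:.2f}")
```

In this sample, a couple of expensive clicks pull the mean well above the median, which is why it helps to have both on the same plot.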

While getting both the mean and median of the distribution in the visualization is a wonderful feature of the box and whisker plot, the four quartiles can help divine a lot of information that we can’t get at through a histogram.

The bottom threshold of the box (or left-most threshold for a horizontally oriented plot) is the lower quartile, or first quartile, or Q1, and it represents the number such that 25 percent of observations are less than it and 75 percent are larger. In this context, think of an “observation” as a single data point.

The top threshold of the box (or right-most threshold for a horizontally oriented plot) is the upper quartile, or third quartile, or Q3, and it represents the number such that 75 percent of observations are less than it, and 25 percent are larger.

Following this same notation, you can also infer that the median serves as the second quartile, or Q2, given that 50 percent of observations are greater than it and 50 percent are less.
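The three quartiles can be computed directly. The sketch below uses Python's standard library on made-up CPCs; the choice of `method="inclusive"` is an assumption on our part, since different tools interpolate quartiles slightly differently and may report slightly different numbers for the same data.

```python
from statistics import quantiles

# Hypothetical CPCs in dollars; illustrative values, not the article's data.
cpcs = [0.40, 0.55, 0.60, 0.75, 0.80, 0.95, 1.10, 1.30, 2.40]

# n=4 asks for the three cut points that split the data into quarters.
# method="inclusive" is one common convention (an assumption here);
# other tools may interpolate differently.
q1, q2, q3 = quantiles(cpcs, n=4, method="inclusive")

print(f"Q1 (bottom of the box): {q1:.2f}")
print(f"Q2 (median line):       {q2:.2f}")
print(f"Q3 (top of the box):    {q3:.2f}")
```

With these sample values, the quartiles come out to 0.60, 0.80, and 1.10.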

This can admittedly become a little confusing to keep track of. We’ve found that it helps intuition to think of the quartiles as possessing ranges, and to remember that each range contains roughly a quarter of the total data points in the data set. Perhaps this pursuit would be frowned upon by the statistician purists of the world, but we take a bright view of whatever helps you learn. Hopefully the visual below helps you conceptualize this.

[Image: the four quartile ranges of the box and whisker plot]

Now we’re getting somewhere, right? We can observe that the first three quartile ranges of this distribution have a pretty comparable range of values. But the fourth quartile range is a much broader stroke. For this advertiser to lower their CPCs, a focused and precise tactic would be to isolate keywords that fall within that fourth quartile range, and modify the attendant bids.
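As a rough sketch of that tactic, the snippet below measures the width of each quartile range and lists the keywords that fall above Q3. The keyword names and CPCs are hypothetical, and treating the fourth range as running from Q3 up to the maximum is our assumption.

```python
from statistics import quantiles

# Hypothetical keyword-level CPCs in dollars (made-up names and values).
keyword_cpcs = {
    "keyword a": 0.40, "keyword b": 0.55, "keyword c": 0.60,
    "keyword d": 0.75, "keyword e": 0.80, "keyword f": 0.95,
    "keyword g": 1.10, "keyword h": 1.30, "keyword i": 2.40,
}

values = sorted(keyword_cpcs.values())
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

# Width of each quartile range, from the minimum up to the maximum.
edges = [values[0], q1, q2, q3, values[-1]]
widths = [round(hi - lo, 2) for lo, hi in zip(edges, edges[1:])]
print("quartile range widths:", widths)

# Keywords whose CPC sits above Q3, i.e. in the fourth quartile range,
# listed from most to least expensive.
fourth_range = sorted(
    (kw for kw, cpc in keyword_cpcs.items() if cpc > q3),
    key=keyword_cpcs.get,
    reverse=True,
)
print("bid review candidates:", fourth_range)
```

In this sample the first three ranges are each 0.20 to 0.30 wide while the fourth spans 1.30, mirroring the lopsided shape described above.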

Alright, but what about those dots?

Data points that render as individual dots can be considered statistical outliers in the context of a data distribution. In our hypothetical scenario, the advertiser is looking for tactics to reduce CPCs. In addition to the fourth quartile range, this advertiser should investigate the keywords responsible for these outlier values, and act accordingly.
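We haven't spelled out which rule decides what gets drawn as a dot. A common convention, which we assume here, is to flag any value more than 1.5 times the interquartile range (Q3 minus Q1) beyond the box. The sketch below applies that rule to hypothetical keyword CPCs; your plotting tool may use a different threshold.

```python
from statistics import quantiles

# Hypothetical keyword-level CPCs in dollars (made-up names and values).
keyword_cpcs = {
    "keyword a": 0.40, "keyword b": 0.55, "keyword c": 0.60,
    "keyword d": 0.75, "keyword e": 0.80, "keyword f": 0.95,
    "keyword g": 1.10, "keyword h": 1.30, "keyword i": 2.40,
    "keyword j": 5.15,
}

q1, _, q3 = quantiles(keyword_cpcs.values(), n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the height of the box

# Assumed convention: anything beyond 1.5 * IQR from the box is an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = {
    kw: cpc
    for kw, cpc in keyword_cpcs.items()
    if cpc < lower_fence or cpc > upper_fence
}
print(f"upper fence: {upper_fence:.2f}")
print("outlier keywords:", sorted(outliers))
```

Here the two most expensive keywords land above the upper fence, so they are the first candidates for a bid review.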

Hearken back to part one of this series for a moment, and recall that our distribution is right-tailed, meaning that the skew is towards values that are greater than the median. Knowing what you know now about both histograms and box and whisker plots, you should be able to intuit the relationship between these two visualizations of the same data.
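One way to check that intuition numerically: in a right-tailed distribution the mean tends to sit above the median, and the values above Q3 stretch further from the box than the values below Q1. The sketch below tests both on made-up CPCs; it is a quick sanity check, not a formal skewness statistic.

```python
from statistics import mean, median, quantiles

# Hypothetical CPCs in dollars; illustrative values, not the article's data.
cpcs = [0.40, 0.55, 0.60, 0.75, 0.80, 0.95, 1.10, 1.30, 2.40]

q1, q2, q3 = quantiles(cpcs, n=4, method="inclusive")

# Sign 1: the mean (the "x") sits above the median line.
mean_above_median = mean(cpcs) > median(cpcs)

# Sign 2: the data reaches further above the box than below it.
upper_reach = max(cpcs) - q3
lower_reach = q1 - min(cpcs)

print("mean above median:", mean_above_median)
print(f"reach above Q3: {upper_reach:.2f}, reach below Q1: {lower_reach:.2f}")
```

Both signs point the same way for this sample, which is the box and whisker counterpart of the long right tail on the histogram.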

In the final part of this series, we’ll explore using distributions to identify changes in your data over time.

 

 

