Revisiting Google's 2 Percent Or Less Click Fraud Rate

Earlier this week, we reported how MarketingPilgrim covered news widely cited
as Google confirming a click fraud rate of 2 percent or less.
Matt Cutts, Google’s Shuman Ghosemajumder, business product manager for trust &
safety, clarifies things on his own blog in "Google, Click Fraud, and Invalid
Clicks." Below, a look at that, along with some definitions of clicks I think
will help and how the detected click fraud rate for Google might be as low as
0.2 percent of all clicks.

From what Shuman wrote:

What I said is that the quantity of invalid clicks which we detect as a
result of reactive investigations is a "negligible proportion" of the total
number of invalid clicks. Andy asked me if that percentage is less than 2%. I
told him that I was not able to provide a bound, but yes, "negligible"
certainly means less than 2% of invalid clicks.

However, more significantly, this is quite a different thing than saying
that our "click fraud rate" is less than 2%

So what is the rate? Shuman anticipates that question:

So what is our overall "click fraud rate"? As noted in the diagram in the
story, it is virtually impossible to know the intent of every click. However,
we can do a very effective job using statistical techniques to detect
potentially malicious behavior, and the total number of invalid clicks we
detect – whether for suspected malicious or non-malicious intent – is in the
single digit percentages. So third-party estimates which say that click fraud
is 15% or higher appear to clearly be substantial exaggerations.

In short, no answer. That sends me right back to Andy’s story and the idea
that it does support Google at least saying the click fraud rate is less than 2
percent. From my own reading of Andy’s story and other things Google has said:

So to be clear, two percent of all clicks are investigated by Google as
being possibly fraudulent after billing has happened. Some amount of
fraudulent clicks, of course, are never reported. That will take the figure
up. However, some amount of the investigated clicks will be cleared of fraud,
taking the amount down.

Now let me back up and reestablish some definitions, which I think will help
with the confusion when we start talking percentages.

  • Invalid Clicks: Of ALL the clicks Google ads generate, some
    unstated percentage of those are deemed "invalid." I’ll define these as clicks
    that Google NEVER bills for. Not all invalid clicks are necessarily fraudulent
    (such as in cases of quick double-clicks that aren’t billed).
  • Fraudulent Clicks: These are the percentage of clicks that an
    advertiser is charged for when the person (or bot) clicking is doing so to
    purposely cost the advertiser money or make money for themselves through an ad
  • Overall Click Fraud Rate: This is the number of fraudulent clicks
    as a percentage of ALL clicks that happen, regardless of whether Google issues
    a refund.
  • Detected Click Fraud Rate: This is the number of fraudulent clicks
    for which Google issues a refund after an investigation requested by an
    advertiser (a reactive investigation), as a percentage of ALL clicks that
    happen.
  • Undetected Click Fraud Rate: This is the number of fraudulent
    clicks as a percentage of ALL clicks that happen where no refund has happened
    because neither Google nor the advertiser spotted the fraud.
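
To keep the denominators straight, here is a minimal Python sketch of the last
three rates as I've defined them above. The counts are made-up numbers purely
for illustration, not figures from Google or anyone else, and the function
name is my own. Note that under these definitions, the overall rate is simply
the detected and undetected rates added together.

```python
def rate_percent(part: int, all_clicks: int) -> float:
    """Return `part` as a percentage of ALL clicks."""
    if all_clicks <= 0:
        raise ValueError("all_clicks must be positive")
    return 100.0 * part / all_clicks


# Hypothetical counts, for illustration only (not real data).
all_clicks = 100_000
detected_fraud = 150    # fraudulent clicks refunded after a requested investigation
undetected_fraud = 350  # fraudulent clicks nobody spotted, so never refunded

detected_rate = rate_percent(detected_fraud, all_clicks)
undetected_rate = rate_percent(undetected_fraud, all_clicks)
# Overall rate counts every fraudulent click, refunded or not.
overall_rate = rate_percent(detected_fraud + undetected_fraud, all_clicks)

print(f"detected:   {detected_rate:.2f}% of all clicks")
print(f"undetected: {undetected_rate:.2f}% of all clicks")
print(f"overall:    {overall_rate:.2f}% of all clicks")
```

Every rate uses ALL clicks as the base, which is exactly what gets lost when
a percentage of invalid clicks is quoted as if it were a percentage of all
clicks.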

Of everything I’ve listed, the most important rates are the last two.

What’s the detected click fraud rate? Again, that’s the percentage of
clicks that Google itself knows are getting past the three proactive filters it
has in place. On the one hand, the detected rate isn’t a big deal, since people
are getting refunds. On the other hand, if this is a very high rate, it suggests
a lot more might be getting past Google.

What’s the undetected click fraud rate? Tough question, because to
know the answer, you’d have to have detected the click fraud in the first place. It
could be that there’s a lot of click fraud escaping the notice of both Google
and advertisers. And that could also be why third party firms report higher
figures — because unlike average advertisers, they might comb more closely
through click audit trails. But as Google has noted, those companies also might
have incentive to inflate figures or might count things like unbilled invalid
clicks as actual billed click fraud.

Now what do we know about the rates, based on what everyone’s been saying?

  • Invalid Clicks Less Than 10 percent: Google says that invalid
    clicks make up a single digit percentage of ALL clicks. Reasonably, that means
    invalid clicks should be from 0.1 to 9.4 percent of all clicks, depending on
    how you like to round things.
  • Detected Click Fraud Rate Less Than 2 percent: This comes from
    Andy’s initial write-up, which has him saying Google confirmed his estimate
    that requested investigations cover 2 percent or less of ALL clicks. Not all
    of the investigated clicks would have produced refunds, but ANY detected click
    fraud would have been within the requested investigation chunks of clicks.
  • Or Detected Click Fraud Rate Less Than 0.2 percent: No, I didn’t
    accidentally copy the above point. There’s also now a figure that detected
    click fraud is less than 0.2 percent (zero point two percent) of ALL clicks.
    How? Shuman’s quote above has him saying that investigated clicks are less
    than 2 percent of INVALID clicks, not ALL clicks. So if I’m doing the math
    correctly, that’s 9.4 percent (or less, see above) for the percentage of
    invalid clicks, and 2 percent of those clicks are investigated. Rounded,
    that’s 0.2 percent.
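
The math in that last point can be checked with a short Python sketch. The
9.4 percent and 2 percent inputs are the upper bounds discussed above, not
figures Google has published as rates, and the function name is just my own
label for the calculation.

```python
def share_of_all_clicks(invalid_pct: float, investigated_pct_of_invalid: float) -> float:
    """Convert a percentage OF INVALID clicks into a percentage of ALL clicks.

    invalid_pct: invalid clicks as a percentage of all clicks.
    investigated_pct_of_invalid: investigated clicks as a percentage of invalid clicks.
    """
    return invalid_pct * investigated_pct_of_invalid / 100.0


# Upper bounds from the discussion: invalid clicks at most 9.4 percent of all
# clicks, investigated clicks at most 2 percent of invalid clicks.
upper_bound = share_of_all_clicks(9.4, 2.0)

print(f"unrounded: {upper_bound:.3f} percent of all clicks")
print(f"rounded:   {round(upper_bound, 1)} percent of all clicks")
```

The unrounded result is 0.188 percent, which rounds to the 0.2 percent
figure. Since both inputs are ceilings, the real number could only be lower.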

It would be a lot easier if Google or Shuman would just give us the detected
click fraud rate. Come on gang, just give us the detected rate, the percentage
of refunds you’re issuing. If it’s really so small, that should be reassuring.
It won’t stop third party firms and others from saying you’re missing stuff
that’s undetected, but I still think it would be a heck of a lot better than all
these word games and guesses.

