Please consider these three scenarios:
So as an analytics professional, how would you efficiently detect these fraudulent actions? One effective method is to apply Benford's Law.
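Benford's Law (also known as the first-digit law) is a principle about the frequency distribution of leading digits. Specifically, in many naturally occurring collections of numbers, the leading digit is a 1 about 30% of the time, and each higher digit is progressively less common, down to 9 at under 5%. The expected share of numbers with leading digit d is log10(1 + 1/d). Please see this graph.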
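If the graph is not handy, the expected percentages are easy to reproduce. The short Python sketch below computes them from the standard Benford formula, log10(1 + 1/d); the function name `benford_expected` is just an illustrative choice.

```python
import math

def benford_expected():
    """Return {digit: probability} for leading digits 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Print the expected share of each leading digit as a percentage.
for digit, p in benford_expected().items():
    print(f"{digit}: {p:.1%}")
```

The nine probabilities sum to 1, with digit 1 at roughly 30% and digit 9 at under 5%.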
This method is effective because people falsifying data tend to spread their numbers evenly across the digits (graph on the left). However, physicist Frank Benford, building upon the work of Simon Newcomb, confirmed that in natural data the frequency of a first digit is inversely related to its value: the lowest digits are the most frequent and the highest digits are the least frequent (second graph).
So what does this mean regarding fraud detection in the three scenarios? It means that: (1) the amount of solicited cash received by the homeless, (2) the meter readings recorded by the technician, and (3) the mileage on the odometers could all be tested with a frequency distribution of the first digits.
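One way such a test could look is sketched below. The article does not name a specific statistical test, so the chi-square goodness-of-fit statistic used here is an assumption, as are the function name `benford_chi_square` and the made-up counts. The input is a tally of first digits (digit mapped to count).

```python
import math

def benford_chi_square(counts):
    """Chi-square statistic comparing observed first-digit counts to Benford's Law.

    counts: dict mapping each digit 1-9 to how many values start with it.
    """
    total = sum(counts.get(d, 0) for d in range(1, 10))
    if total == 0:
        raise ValueError("no observations")
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)  # Benford expected count
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Hypothetical tallies: one evenly spread, one shaped like Benford's Law.
even_counts = {d: 100 for d in range(1, 10)}
benford_like = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}

print(benford_chi_square(even_counts))
print(benford_chi_square(benford_like))
```

With nine digit categories there are 8 degrees of freedom, so a statistic above about 15.51 would reject Benford conformity at the 5% level. A large value is a reason to look closer, not proof of fraud.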
The beauty of this old-school method of fraud detection is three-fold. First, the concept is easy to understand. People intuitively believe that all natural, social, and behavioral patterns are evenly distributed. For certain numeric distributions, that is not the case. Second, the concept is easy to calculate. Parse the first digit of each number in a series and count how often each digit appears. Third, the concept does not require large financial investments in analytics software or training. You might need to do some programming (i.e., transform the numbers into character strings and then parse the first digit), but nothing that requires spending a lot of money on software or a class.
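As a rough illustration of how little programming is involved, here is a minimal Python sketch of the string-based approach described above. The function names and the sample readings are hypothetical, and skipping zeros and leading "0." prefixes is my own assumption about how to handle those cases.

```python
from collections import Counter

def first_digit(value):
    """Return the first nonzero digit of a number, or None if it has none (e.g. 0)."""
    # Convert to a character string, then scan for the first digit from 1 to 9.
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

def first_digit_counts(values):
    """Count how many values start with each digit 1-9."""
    counts = Counter(first_digit(v) for v in values)
    counts.pop(None, None)  # drop values with no leading digit, such as 0
    return {d: counts.get(d, 0) for d in range(1, 10)}

# Hypothetical meter readings, for illustration only.
readings = [1203, 187, 2450, 130, 96, 1.75, 0.034, 310]
print(first_digit_counts(readings))
```

The resulting tally can then be compared against the expected Benford percentages.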
Does Benford's Law have limitations? Sure. It does not work well for numeric series (1) where the numbers have been assigned sequentially, (2) that have constructed minimum and maximum values, (3) that consist of square roots, or (4) where the range of numbers is otherwise not natural and has fixed end points. But for accounting, election data, economic data, or, as in the scenarios, revenue, meter readings, or odometer readings, Benford's Law can be very effective.
So, having seen this old-school fraud detection tool, can you think of any other scenarios where it could be effective? Please share.