Benford's Law in Excel (for Accounting Forensic), with Case Study
(This analysis is covered by the functions in Excel Power Expander.)
Benford's Law, also called the First-Digit Law, refers to the frequency distribution of digits in many (but not all) real-life sources of data. In this distribution, the number 1 occurs as the leading digit about 30% of the time, while larger numbers occur in that position less frequently: 9 as the first digit less than 5% of the time. This distribution of first digits can be described in a logarithmic scale:
With this function, the distrubution of 1 to 9 is as follows:
d P(d) Excel Power Expander function 1 30.10% =BENFORD(1) 2 17.60% =BENFORD(2) 3 12.50% =BENFORD(3) 4 9.70% =BENFORD(4) 5 7.90% =BENFORD(5) 6 6.70% =BENFORD(6) 7 5.80% =BENFORD(7) 8 5.10% =BENFORD(8) 9 4.60% =BENFORD(9)
- This is a real case I audited in China.
- The objective was to find out potential fraudulent transactions in the expenses of a listed company.
- There are over a million transactions. The traditional "random sampling" or "sampling by amount" of auditors won't work. As those were tests of luck. We needed a more logical approach to reduce or effective sample size.
Step 1 - Extract the first digit of all numbers
- First, download all the expenses data. (I assume this is a sufficiently large sample. Don't give get 10 or 20 numbers.)
- Extract the first digit of every number.
- Base on your situation, decide the treatment of negative numbers. (Skip all negative numbers? Or ignore the sign?)
The follow figure shows a way to extract the first digit with Excel functions:
Excel Power Expander has more powerful functions to extract the first, the first 2, and the first 3 digits for more advanced Benford's Law analysis. The following figure demonstrates using the FD() and F2D functions.
Note the "First 2 Digits" column. It is not really that straight forword when extracting the first 2 digits with simple approaches. Numbers smaller than 10 will be causing wrong result if not extracted properly. (For example, "9.4" will become "9." if we just take the first 2 characters from the left, and they'll distorb the total frequency of "9" when we later do the statistics.) The F2D() function of Excel Power Expander will take care of this by completely ignoring these numbers. (See the yellow cell below.)
Step 2 - Calculate the Cummulative Frequency of the Numbers
We can now use Excel's COUNTIF() function to work out the cummulative frequency of 1 to 9.
We can set up a range like E1:G10 in the figutre below.
The formula in column F count the number of 1,2,3,...,8,9 in column C, and then divide it by the total count. (The COUNTA() function counts the total count in column C. It won't be affected by the column heading, as it counts only numbers.)
We put the Benford's Law estimates in column G.
The result looks like this:
Step 3 - Plotting the Frequency Histogram / Line Chart combo chart
The final step is to create a combo chart. We want to plot the "Count" column as vertical bar chart, and we want the Benford's Law estimates to be plot as a line chart. (You can see the result in the last picture in this page.)
- Select the "Count" and the "Benfords Law Estimate" columns.
- Go to the INSERT tab of the Ribbon. Click the Chart button, and click "More Column Charts".
- Clikc "Combo" on the left, and click the first chart icon (see below)
Now, the Chart looks like this:
Interpreting the Result
In the Chart we plot, the bars for 1, 2, 3 significantly exceeded the Benford's Law estimates. And 7 is below the estimate. Now, we can go straight to study the details of the the expenses which begin with 1, 2 and 3. For 7, depends on the available resource, it will be the next target.
This result is in fact a real case I encountered during my previous audit of a list company in China. I can share my findings with you:
- There are plenty of RMB 300 expense claim of phone cards. (People in China commonly use phone cards to put credits into their phone number, and the maximum phone card is RMB 300 each.) Some staff continually claim expenses on RMB 300 phone cards and they sell the serial code on the card to someone at discount.
- There are some regular RMB 200 ~ RMB 280 expense claims on cigerrets everyday. (In China, each carton of "nice" cigerrettes costs around RMB 200.) I then found out the company has a "habit" of giving out cigerettes to police officers and customs officers for "better relationships".
- For the "1", also some regular daily expenses to buy various small items like water, food, etc. Those are also consumed by customs officers.
- Then, the next question during my audit was: why are there so many real receipts of those expenses? How can people consume so many cigerrettes? I checked and found out all those items were purchased from a store nearby. I went to that store to have a look, and found out that store was actually a dark room INSIDE the customs office building... Interesting...
This is Benford's Law in practice. :-)