Internal auditors are increasingly relying on data analytics to better understand the programs and processes they review. While many auditors use bar charts to examine and present information, they sometimes forget that bar charts are limited to categorical data, where the variables represent data that can be organized or divided into groups. Examples include gender, products, error types, and time periods like months and years. For continuous data, which is information that can be measured on a continuous scale and can often be subdivided into smaller increments depending on the level of precision sought, internal auditors should consider using histograms.

Histograms are a powerful tool for analyzing data because they show the distribution of a continuous variable in a diagram that looks similar to a bar graph. Because they show how values are distributed over a range, they are useful for understanding how the data varies (its “fluidity”). They help summarize data from a process that was collected over, or represents, a period of time. With this tool, internal auditors can present large amounts of data that are difficult to interpret in tabular form, show how frequently the various data values occur, and quickly illustrate the distribution of the data.

The range of values of the continuous variable is divided into a series of intervals on the x-axis. Examples of data suited to this treatment include sales revenue per hour, the number of calls received in a call center, requests for technical support, customers requesting service, vehicles being serviced in a maintenance unit or going through a toll plaza, and airplanes landing at or taking off from an airport over a period of time (e.g., half-hour increments during a workday).

The count of how many values fall into each interval is then shown as bars on the chart, so the height of each bar gives the number of cases in that interval of the variable on the horizontal axis. The intervals are consecutive, adjacent (i.e., there are no gaps between the bars), and generally of equal size.
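
As a minimal sketch of this grouping, the Python snippet below counts event times in half-hour intervals. The call times, the 09:00 to 11:00 window, and the helper name `half_hour_bin` are all hypothetical and serve only as an illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical call arrival times for one morning (illustrative data only).
call_times = [
    "09:02", "09:17", "09:28", "09:31", "09:44",
    "10:05", "10:12", "10:13", "10:29", "10:58",
]

def half_hour_bin(hhmm: str) -> str:
    """Return the start of the half-hour interval that contains hh:mm."""
    t = datetime.strptime(hhmm, "%H:%M")
    start_minute = 0 if t.minute < 30 else 30
    return f"{t.hour:02d}:{start_minute:02d}"

# Count how many calls fall into each half-hour interval.
counts = Counter(half_hour_bin(t) for t in call_times)

# List every interval in the assumed 09:00-11:00 window, including empty
# ones, so the intervals stay consecutive and adjacent.
bin_starts = [f"{h:02d}:{m:02d}" for h in (9, 10) for m in (0, 30)]
for start in bin_starts:
    print(start, "#" * counts[start])
```

Each row of `#` characters stands in for one bar. A `Counter` returns zero for an interval with no calls, so an empty interval still appears as a row.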

One of the benefits of this analysis is that the internal auditor can compare the information on the histogram with other attributes, such as the number of people working during the same time period or the number of errors that occurred in each time period. This way, the auditor can pinpoint low-staffing situations that cause delays or errors, as well as errors caused by operators who may lack the necessary training. This information can enhance root cause analysis significantly.
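
One way to sketch such a comparison in Python is to line up the counts per interval with staffing and error figures for the same intervals. All figures below are made up, and the function name `workload_report` is a hypothetical helper, not part of any library.

```python
# Hypothetical figures for four half-hour intervals (illustrative only).
intervals = ["09:00", "09:30", "10:00", "10:30"]
calls = {"09:00": 30, "09:30": 42, "10:00": 55, "10:30": 28}
staff_on_duty = {"09:00": 5, "09:30": 6, "10:00": 5, "10:30": 4}
errors = {"09:00": 1, "09:30": 2, "10:00": 6, "10:30": 1}

def workload_report(intervals, calls, staff, errors):
    """Return (interval, calls per staff member, error rate) for each interval."""
    rows = []
    for interval in intervals:
        calls_per_staff = calls[interval] / staff[interval]
        error_rate = errors[interval] / calls[interval]
        rows.append((interval, calls_per_staff, error_rate))
    return rows

report = workload_report(intervals, calls, staff_on_duty, errors)

# The interval with the heaviest load per person and the one with the
# highest error rate are natural starting points for follow-up questions.
busiest = max(report, key=lambda row: row[1])
most_error_prone = max(report, key=lambda row: row[2])

for interval, calls_per_staff, error_rate in report:
    print(f"{interval}  calls/staff={calls_per_staff:.1f}  error rate={error_rate:.1%}")
```

When the busiest interval and the most error-prone interval coincide, as they do in this invented data, that is a lead worth investigating rather than proof of a staffing problem.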

How to Prepare a Histogram

The following are the steps to prepare a histogram:

  1. Calculate the range (maximum value minus minimum value).

  2. Divide the range of values into a series of intervals. This will constitute the x-axis. Ten intervals are a good rule of thumb to begin with, and dividing the range by the number of intervals gives the width of each one. The class intervals must be mutually exclusive, so every data point fits into only one interval.

  3. Count how many values fall into each interval. Make sure the intervals (or bins) are consecutive and adjacent, with no overlaps. They are generally of equal size.

  4. Create a rectangle over each interval (bin) whose height represents the frequency (how many items are in that interval). These values will be plotted on the y-axis.
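
The four steps above can be sketched in plain Python as follows. The function name `build_histogram`, the sample processing times, and the choice of five intervals are assumptions made for illustration. Placing the maximum value in the last interval is one common convention, not the only one.

```python
def build_histogram(values, num_bins=10):
    """Return (edges, counts) for equal-width bins covering the data."""
    lo, hi = min(values), max(values)
    data_range = hi - lo                      # Step 1: range = max - min
    if data_range == 0:
        raise ValueError("all values are identical; nothing to bin")
    width = data_range / num_bins             # Step 2: equal-width intervals
    edges = [lo + i * width for i in range(num_bins + 1)]

    counts = [0] * num_bins
    for v in values:                          # Step 3: count values per bin
        idx = int((v - lo) / width)
        idx = min(idx, num_bins - 1)          # the maximum goes in the last bin
        counts[idx] += 1
    return edges, counts

# Hypothetical invoice processing times in days (illustrative data only).
processing_days = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 11]
edges, counts = build_histogram(processing_days, num_bins=5)

# Step 4: draw one bar per interval; the bar length is the frequency.
for i, c in enumerate(counts):
    print(f"{edges[i]:>5.1f} to {edges[i + 1]:>5.1f} | {'#' * c}")
```

Each value lands in exactly one bin because a bin includes its lower edge and excludes its upper edge, except for the last bin, which also includes the maximum.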

Understanding the Shapes

There are three elements to consider when interpreting a histogram:

  1. Centering: Is the shape aligned with the expected or target value? If the shape is shifted to the right of it, the process and its related data may be running too high. If it is shifted to the left, the process and its data may be running too low.

  2. Variation: What is the spread of the data?  When requirements are in place, the expectation is that all values fall within the established limits. 

  3. Shape: The terms used to describe the patterns (or shapes) of histograms are symmetric (or bell-shaped), bimodal, skewed right, skewed left, uniform, and random (or multimodal).

    A. Symmetric or Bell-Shaped: These histograms have a single central peak with values tapering off evenly on both sides, which suggests a roughly normal distribution.

    B. Bimodal: These histograms have two peaks and could indicate that the information came from two different systems, processes, shifts, people, machines, or other sources. If that is the case, the source data should be separated and analyzed accordingly.

    C. Skewed Right: These histograms are said to be positively skewed. The distribution has a large number of values on the lower side of the x-axis (left side) and few on the upper side (right side), so the tail extends to the right.

    D. Skewed Left: These histograms are said to be negatively skewed. The distribution has many values on the higher side of the x-axis (right side) and fewer on the lower side (left side), so the tail extends to the left.

    E. Uniform: These histograms are roughly flat, with bars of similar height, and provide little information about the program, process or system. (A unimodal histogram, by contrast, is one with a single peak.) If this occurs, check whether several sources of variation have been combined and, if so, analyze them separately. A uniform distribution can also mean that the number of intervals (i.e., classes) is too small.

    F. Random or Multimodal: These histograms have no discernible pattern and may indicate a distribution that has several modes or peaks. If this occurs, check whether several classes or sources of variation were combined. If that is the case, analyze them separately. If multiple sources of variation are not the cause of this pattern, try different groupings. For example, change the starting and ending points of the intervals, or change the number of intervals. A random distribution often means there are too many groups or classes.
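
The three elements can also be checked numerically as a complement to reading the chart. The sketch below is one possible approach, not a standard audit procedure. The data, the target of 5 days, the limits of 3 and 9 days, and the helper names are all hypothetical. The peak count is a rough heuristic that ignores flat-topped peaks.

```python
from statistics import mean, stdev

def centering_and_variation(values, target, lower_limit, upper_limit):
    """Summarize where the data sits relative to a target and its limits."""
    avg = mean(values)
    outside = [v for v in values if v < lower_limit or v > upper_limit]
    return {
        "mean": avg,
        "offset_from_target": avg - target,  # > 0 runs high, < 0 runs low
        "std_dev": stdev(values),            # sample standard deviation
        "share_outside_limits": len(outside) / len(values),
    }

def skewness(values):
    """Moment coefficient of skewness: > 0 skewed right, < 0 skewed left."""
    n = len(values)
    avg = sum(values) / n
    m2 = sum((v - avg) ** 2 for v in values) / n
    m3 = sum((v - avg) ** 3 for v in values) / n
    return m3 / m2 ** 1.5  # assumes the values are not all identical

def count_peaks(bin_counts):
    """Count bins that are strictly taller than both neighboring bins."""
    padded = [0] + list(bin_counts) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )

# Hypothetical processing times in days (illustrative data only).
processing_days = [4, 5, 5, 6, 6, 6, 7, 7, 8, 12]
summary = centering_and_variation(processing_days, target=5,
                                  lower_limit=3, upper_limit=9)
print(summary)
print("skewness:", skewness(processing_days))

# Hypothetical bin counts with two separate peaks (a bimodal pattern).
bimodal_counts = [2, 6, 3, 1, 4, 7, 2]
print("peaks:", count_peaks(bimodal_counts))
```

A positive offset from the target corresponds to a shape shifted to the right, and a positive skewness to a tail on the right. Two or more peaks suggest checking whether separate sources were combined. These numbers support the visual reading of the histogram; they do not replace it.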

Histograms are similar to bar charts and are sometimes confused with them. However, bar charts use categorical data and have gaps between the rectangles. Histograms, on the other hand, use continuous data, so there are no gaps between the rectangles.
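
The difference shows up in how the data is prepared before charting. In this short Python sketch, the error types, invoice amounts, and 500-unit interval width are made-up values chosen only to contrast the two cases.

```python
from collections import Counter

# Hypothetical audit data (illustrative only).
error_types = ["pricing", "duplicate", "pricing", "missing approval", "pricing"]
invoice_amounts = [120.0, 480.0, 515.0, 950.0, 1020.0, 1480.0]

# Categorical data: one bar per distinct label. The labels have no numeric
# order, so a bar chart can list them in any sequence with gaps between bars.
bar_chart_data = Counter(error_types)

# Continuous data: values are grouped into adjacent, equal-width intervals
# along the number line. Each key is the lower edge of a 500-unit interval.
bin_width = 500
histogram_data = Counter(int(a // bin_width) * bin_width for a in invoice_amounts)

print(dict(bar_chart_data))
print(dict(sorted(histogram_data.items())))
```

The bar chart keys are labels that could be reordered freely. The histogram keys are interval edges, and their order along the x-axis carries meaning.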

Histograms show the centering, variation and shape of the data, so they can quickly reveal the dynamics underlying the data and help predict the future performance of the process. As such, they are useful for root cause analysis and for providing deeper insights about a program or process under review.

When performing data analysis, internal auditors should consider preparing histograms, as they can provide deeper insights into the characteristics of processes and the issues affecting their performance. With the clearer picture histograms provide, internal auditors can convey important points more easily than with tables and text-heavy reports, pinpoint anomalies more precisely, and encourage discussion with process owners.

Hernan Murdock
Vice President, Audit Division
Dr. Hernan Murdock is Vice President, Audit Division for MIS Training Institute. Before joining MIS Training Institute he was the Director of Training at Control Solutions International, where he oversaw the company's training and employee development program. Previously he was a Senior Project Manager leading audit and consulting projects for clients in the manufacturing, transportation, high tech, education, insurance and power generation industries. Dr. Murdock also worked at Arthur Andersen, Liberty Mutual and KeyCorp. Dr. Murdock is a senior lecturer at Northeastern University where he teaches management, leadership and ethics. He is the author of Operational Auditing: Principles and Techniques for a Changing World, 10 Key Techniques to Improve Team Productivity, and Using Surveys in Internal Audits.