WHAT IS "GOOD" DATA?
Donald A. Flory
January 2000


David Cehrs and William C. Bianchi (GROUND WATER, Vol. 34, No. 6, November-December 1996, p. 961) have discussed the importance of gathering sufficient data to support good environmental decisions in today’s highly competitive marketplace, which places a premium on minimizing costs. The tendency is more and more toward collecting less data. Most of the data used to support environmental consulting is laboratory-generated.

William Pipkin (TODAY’S CHEMIST AT WORK, March 1997, p. 22) points out that environmental testing in the U.S. today is big business (which translates to ‘bottom-line profit is the driving force’). The laboratory service industry is very competitive, and analysis costs will continue to drop as labs battle for contracts with government agencies and environmental engineering consulting firms. We’ll give you one guess what this can mean in terms of data quality.

We have been stressing for many years the importance of known and documented data quality whenever laboratory analysis data is used to make decisions about environmental site assessments, remediation alternatives investigations, hazardous waste remediation verification, environmental compliance monitoring, and litigation support. The combination of the move toward gathering less data and the increased potential for quality control failures in a very cost-conscious laboratory climate further underscores the need for careful scrutiny of all laboratory data.

What do we mean by “good” laboratory data? The term “legally defensible” is often stated as the desired objective for laboratory analysis data. What do we mean by this term? A simple answer is that the data quality is suitable for the stated purpose of the analysis and that sufficient documentation is available to verify that suitability. What is data quality, and how do we achieve it? If the stated purpose of the analysis is to comply with a state or federal environmental regulation, the factors that determine the data quality (referred to as data quality objectives in current jargon) are often stated in the regulation itself, but only to a degree. (That’s good, because it makes work for us chemists.) EPA regulations and guidance documents tend to be much more complete in defining data quality than, for example, NIOSH regulations. Forensic situations involving arson and personal injury are often even more nebulous.
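
To make the idea concrete, data quality objectives can be thought of as a small set of measurable targets attached to a project. The sketch below is a minimal, hypothetical illustration in Python; the specific objectives, field names, and numeric limits are our own assumptions for illustration, not values taken from any regulation or method.

    from dataclasses import dataclass

    @dataclass
    class DataQualityObjectives:
        """Hypothetical project-level data quality objectives."""
        reporting_limit_ug_l: float       # lowest concentration that must be reliably reported
        max_relative_percent_diff: float  # precision target for duplicate analyses
        recovery_window_pct: tuple        # accuracy target for spiked samples (low, high)
        min_completeness_pct: float       # fraction of planned results that must be usable

    # Example objectives for a hypothetical groundwater monitoring project.
    dqos = DataQualityObjectives(
        reporting_limit_ug_l=5.0,
        max_relative_percent_diff=20.0,
        recovery_window_pct=(75.0, 125.0),
        min_completeness_pct=90.0,
    )
    print(dqos)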

Thus, "legally defensible" data is both well documented and technically valid. It is easy to understand the need for documentation, i.e. "if it ain't written down, it ain't been done". However, the question of technical validity of laboratory analysis data is often overlooked. What do we mean by technical validity? A simple answer here is that a technically valid analysis procedure is one that can adequately identify and/or measure the chemical compounds of interest with the necessary accuracy and precision. Note that by procedure, we mean the complete gamut of activities related to the measurement including sample collection, transport, storage, and preservation as well as the actual analysis method used. It is possible to have data that meets all the stated data quality objectives, but is not technically valid. This is especially true for non-EPA regulations that have minimal data quality requirements. Examples include selecting a procedure which cannot “see” the compound(s) of interest (gives a false negative result), is subject to too much positive interference (gives a false positive result), is not capable of the desired precision and accuracy, or can be misinterpreted by operating personnel of limited experience. "Junk scientists" will quickly label laboratory data with a mountain of supporting documentation as defensible; when indeed, both the documentation and technical validity of the data are inadequate.

Now, let’s take a quick look at the principal ingredients that make up suitable data quality, or “good data”. These are:
 


We refer to the process of verifying the suitability of the data as data quality assessment: a determination of whether the data is suitable for the intended use. It includes four major tasks: data management, data validation, data qualification/review (flagging), and, finally, the determination of suitability. Data management includes determining the completeness of the data documentation. Environmental data validation primarily covers checking whether the quality control requirements of the method have been met. Data qualification is the application of flags to the data that reflect the failures found during validation. The final determination of suitability must consider the technical validity of the data as well as the data qualifiers, and it must be consistent with the intended use of the analytical data. It can range from rejection to unqualified acceptance and requires a very experienced chemist.
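
The four tasks can be pictured as a short pipeline. The Python sketch below is a simplified, hypothetical rendering of that flow; the document names, QC checks, and flag assignments are illustrative assumptions, and in practice the final suitability call rests with an experienced chemist rather than a script.

    # Flags assigned to common QC failures (illustrative; real flagging follows
    # the governing method or the EPA National Functional Guidelines).
    QC_FLAGS = {"blank_contamination": "B", "holding_time": "J", "calibration": "J"}

    REQUIRED_DOCS = {"chain_of_custody", "calibration_records", "qc_summary"}

    def assess(result):
        """Walk one sample result through data management, validation,
        qualification, and a final suitability determination."""
        # 1. Data management: is the documentation package complete?
        missing_docs = REQUIRED_DOCS - set(result["documents"])

        # 2. Data validation: which method QC requirements failed?
        failures = [name for name, passed in result["qc_checks"].items() if not passed]

        # 3. Data qualification: translate failures into qualifying flags.
        flags = sorted({QC_FLAGS.get(name, "J") for name in failures})

        # 4. Suitability: reject, accept with qualification, or accept.
        if missing_docs:
            return f"rejected: missing documentation {sorted(missing_docs)}"
        if flags:
            return f"usable with qualification {flags}"
        return "acceptable as reported"

    result = {
        "documents": {"chain_of_custody", "calibration_records", "qc_summary"},
        "qc_checks": {"holding_time": False, "calibration": True},
    }
    print(assess(result))  # usable with qualification ['J']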

Let’s examine two hypothetical situations: first, an EPA regulation where all of the above “good data” ingredients are fairly well defined, and second, a personal injury case based on exposure to some airborne organic contaminant. In the EPA case, the user will most likely receive the results and a documentation package containing all the sample data management and quality control data specified by the regulation (or method). The method requires that the laboratory apply data qualifiers to sample results affected by certain laboratory-controlled data quality issues. These include a B if contamination was found in the laboratory method blank, an E if a positive result is above the calibration range of the measuring instrument, and a J if a positive result is below the calibration range of the instrument. These laboratory qualifiers will affect the suitability of the data. If examined, the documentation package may (and most likely will) reveal many other issues that affect the data quality and, subsequently, the suitability or “goodness” of the data. Common data quality failures in our experience involve holding times, calibration agreement, retention time agreement (qualitative identification), and the accuracy and precision of spiked quality control samples. The National Functional Guidelines published by the EPA stipulate a set of qualifying flags to be applied to the data for various quality issues, and their application sometimes results in rejection of the data. None of these issues, beyond the few indicated by the laboratory qualifiers, will be known to the user if the documentation is never examined. Unfortunately, that is becoming the more common case with the present emphasis on cost cutting.
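
The laboratory qualifiers just described lend themselves to a simple illustration. The Python sketch below is a hypothetical rendering of that flag logic only; the calibration range, the numeric example, and the function layout are assumptions for illustration, not values from any particular method.

    def lab_qualifiers(result, cal_low, cal_high, analyte_in_blank):
        """Return the qualifier flags a laboratory might attach to one result:
        B for method-blank contamination, E for a result above the calibration
        range, J for a positive result below it."""
        flags = []
        if analyte_in_blank:
            flags.append("B")   # analyte also found in the laboratory method blank
        if result > cal_high:
            flags.append("E")   # result above the instrument calibration range
        elif 0 < result < cal_low:
            flags.append("J")   # positive result below the calibration range
        return flags

    # Example: a 0.8 ug/L result on an instrument calibrated from 1 to 100 ug/L,
    # with the same analyte detected in the method blank.
    print(lab_qualifiers(0.8, cal_low=1.0, cal_high=100.0, analyte_in_blank=True))  # ['B', 'J']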

Now let’s take a look at the second situation, a personal injury case due to airborne organic chemicals. Suppose that you are the defendant and want to assess the “goodness” of the data. The situation here is more complicated because NIOSH and the forensic societies do not have clear definitions of “good data”. We believe that the EPA model of “good data” is the minimum that should be acceptable for these cases. The samples have often been collected and analyzed without formal standard operating procedures, and we commonly find that they are reported with little or no data quality documentation. Where documentation is present, it is usually deficient in many areas. There is almost a direct correlation between the “badness” of the data and the lack of documentation. Obtaining adequate documentation is often a prolonged process. A complete data quality assessment often results in rejection of a major fraction of the data, and in some cases we have rejected all of it. Alternatively, if you need laboratory data to support your own case, you would do well to devise a sampling and analysis plan that will produce “good data”.

In closing, let’s summarize the importance of performing a comprehensive data quality assessment. The key is known and documented data quality, and a comprehensive data quality assessment is the only way to achieve it. The most compelling reasons for establishing the data quality include:
 


In conclusion, in our experience laboratory data is not necessarily “good”, and a comprehensive data quality assessment is the only way to determine the suitability of the data.

 
