David Cehrs and William C. Bianchi (GROUND WATER, Vol. 34, No. 6, November-December 1996, p. 961) have discussed the importance of gathering sufficient data to support good environmental decisions in today’s highly competitive marketplace, which places a premium on minimizing costs. The tendency seems to be more and more toward minimizing the collection of data. Most of the data used to support environmental consulting is laboratory-generated.
William Pipkin (TODAY’S CHEMIST AT WORK, March 1997, p. 22) points out that environmental testing in the U.S. today is big business (which translates to ‘the bottom-line profit is the driving force’). The laboratory service industry is very competitive, and analysis costs will continue to drop as labs battle for contracts from government agencies and environmental engineering consulting firms. We’ll give you one guess what this can mean in terms of data quality.
We have been stressing for many years the importance of known and documented data quality when using laboratory analysis data for making decisions about environmental site assessments, remediation alternatives investigations, hazardous waste remediation verification, environmental compliance monitoring, and litigation support. The combination of the move toward gathering less data and the increased potential for quality control failures in a very cost-conscious laboratory climate further underscores the need for careful scrutiny of all laboratory data.
What do we mean by “good” laboratory data? The term “legally defensible” is often stated as the desired objective for laboratory analysis data. What do we mean by this term? A simple answer is that the data quality is suitable for the stated purpose of the analysis and that sufficient documentation is available to verify the suitability. What is data quality and how do we achieve it? If the stated purpose of the analysis is to comply with a state or federal environmental regulation, the factors that determine the data quality (referred to as data quality objectives in current jargon) are often stated in the regulation. This is only true to a degree. (That’s good because it makes work for us chemists.) EPA regulations and guidance documents tend to be much more complete in defining data quality than, for example, NIOSH regulations. Forensic situations involving arson and personal injury are often even more nebulous.
Thus, "legally defensible" data is both well documented and technically valid. It is easy to understand the need for documentation, i.e. "if it ain't written down, it ain't been done". However, the question of technical validity of laboratory analysis data is often overlooked. What do we mean by technical validity? A simple answer here is that a technically valid analysis procedure is one that can adequately identify and/or measure the chemical compounds of interest with the necessary accuracy and precision. Note that by procedure, we mean the complete gamut of activities related to the measurement including sample collection, transport, storage, and preservation as well as the actual analysis method used. It is possible to have data that meets all the stated data quality objectives, but is not technically valid. This is especially true for non-EPA regulations that have minimal data quality requirements. Examples include selecting a procedure which cannot “see” the compound(s) of interest (gives a false negative result), is subject to too much positive interference (gives a false positive result), is not capable of the desired precision and accuracy, or can be misinterpreted by operating personnel of limited experience. "Junk scientists" will quickly label laboratory data with a mountain of supporting documentation as defensible; when indeed, both the documentation and technical validity of the data are inadequate.
Now, let’s take a quick look at the principal ingredients that make up suitable data quality or “good data”. These are:
- Clearly stated measurement purposes: Must include the chemical compounds to be analyzed; the sample matrices to be submitted; the intended use of the data; and the associated detection limits, accuracy, and precision required.
- Data management: Refers to sample tracking (chain-of-custody) and associated activities that guarantee the laboratory results are associated with the correct sample.
- Sampling: Includes a technically valid sampling plan that is correctly implemented to properly collect, identify, preserve, store and prepare samples for analysis.
- Analysis method: Must have sufficient selectivity, detection limits, accuracy and precision to be technically valid.
- Quality control samples: Must include sufficient quality control samples to support the necessary statements of accuracy, precision, and detection limits. These include blanks (field, trip, laboratory, reagent), duplicate measurements, matrix spikes, laboratory control samples, and performance evaluation samples.
- Quality control limits: Includes clearly stated acceptable limits for quality control samples such as allowable blank contamination; precision of duplicate samples; and accuracy of matrix spikes, performance evaluation samples, and laboratory control samples. Calibration frequency and linearity may also be included. A sketch of how such limits might be checked follows this list.
- Documentation: Must allow a third party evaluator to verify the suitability of the sample data.
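To make the quality control items above concrete, here is a minimal Python sketch of checking reported QC results against stated limits. The limit values, field names, and helper functions are hypothetical illustrations of ours, not regulatory requirements; actual limits come from the analysis method or the project data quality objectives.

```python
# Hypothetical acceptance limits; real limits come from the method or the
# project's data quality objectives, not from this sketch.
QC_LIMITS = {
    "matrix_spike_recovery": (70.0, 130.0),  # percent recovery window
    "duplicate_rpd_max": 20.0,               # max relative percent difference
    "blank_max_conc": 0.5,                   # max allowable blank result (ug/L)
}

def spike_recovery(measured, native, spiked_amount):
    """Percent recovery of a matrix spike."""
    return 100.0 * (measured - native) / spiked_amount

def relative_percent_difference(a, b):
    """RPD between duplicate measurements."""
    return 100.0 * abs(a - b) / ((a + b) / 2.0)

def check_qc(qc):
    """Return a list of QC limit failures for one batch's QC data."""
    failures = []
    low, high = QC_LIMITS["matrix_spike_recovery"]
    rec = spike_recovery(qc["spike_measured"], qc["native_result"], qc["spike_added"])
    if not low <= rec <= high:
        failures.append(f"matrix spike recovery {rec:.1f}% outside {low}-{high}%")
    rpd = relative_percent_difference(qc["dup1"], qc["dup2"])
    if rpd > QC_LIMITS["duplicate_rpd_max"]:
        failures.append(f"duplicate RPD {rpd:.1f}% exceeds limit")
    if qc["blank_result"] > QC_LIMITS["blank_max_conc"]:
        failures.append(f"method blank contamination at {qc['blank_result']} ug/L")
    return failures

# Example with made-up numbers: a 160% spike recovery and a 26% RPD both fail
print(check_qc({"spike_measured": 18.0, "native_result": 2.0, "spike_added": 10.0,
                "dup1": 5.0, "dup2": 6.5, "blank_result": 0.1}))
```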
We refer to the process of verifying the suitability of the data as data quality assessment. Data quality assessment is a determination of the suitability of the data for the intended use. It includes the four major tasks of data management, data validation, data qualification/review (flagging), and finally, the determination of suitability. Data management includes determining the completeness of the data documentation. Environmental data validation primarily covers checking to see if the quality control requirements of the method have been met. Data qualification is the application of flags to the data that reflect the failures found during validation. The final determination of suitability must consider the technical validity of the data as well as the data qualifiers and be consistent with the intended use of the analytical data. It can range from rejection to unqualified acceptance and requires a very experienced chemist.
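For illustration, here is a schematic Python sketch of those four tasks, using plain dictionaries to stand in for a real documentation package. The field names and checks are our own assumptions rather than any standard interface, and the final suitability determination deliberately remains a human judgment.

```python
def assess_data_package(package, required_docs, qc_checks):
    """Run the four data quality assessment tasks on one laboratory package."""
    report = {}

    # Task 1 - data management: is the documentation complete?
    report["missing_documents"] = [d for d in required_docs
                                   if d not in package["documents"]]

    # Task 2 - validation: did each QC requirement of the method pass?
    report["qc_failures"] = [name for name, check in qc_checks.items()
                             if not check(package)]

    # Task 3 - qualification: flag every result touched by a failed QC check
    for result in package["results"]:
        result["flags"] = [f for f in report["qc_failures"]
                           if result["analyte"] in package["qc_scope"].get(f, [])]

    # Task 4 - suitability: not automated; an experienced chemist must weigh
    # the flags and technical validity against the intended use of the data
    report["suitability"] = "pending chemist review"
    return report

# Hypothetical package with one result and one deliberately failing QC check
package = {
    "documents": ["chain_of_custody", "raw_data"],
    "results": [{"analyte": "benzene", "value": 12.0}],
    "qc_scope": {"holding_time": ["benzene"]},  # analytes each check covers
}
report = assess_data_package(
    package,
    required_docs=["chain_of_custody", "raw_data", "calibration_records"],
    qc_checks={"holding_time": lambda p: False},  # pretend holding time failed
)
print(report["missing_documents"])     # ['calibration_records']
print(package["results"][0]["flags"])  # ['holding_time']
```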
Let’s examine two hypothetical situations, one being an EPA regulation where all of the above “good data” ingredients are pretty well defined and the second, a personal injury case based on exposure to some airborne organic contaminant. In the EPA case, the user will most likely receive the results and a documentation package containing all the regulation (or method) specified sample data management and quality control data. The method requires that the laboratory apply data qualifiers to sample results affected by certain laboratory-controlled data quality issues. These would include a B if contamination was found in the laboratory method blank, an E if a positive result is above the calibration range of the measuring instrument, and a J if a positive result is below the calibration range of the instrument. These laboratory qualifiers will affect the suitability of the data. If examined, the documentation package may (and most likely will) reveal many other issues that affect the data quality and subsequently the suitability or “goodness” of the data. Common data quality failures in our experience include holding time, calibration agreement, retention time agreement (qualitative identification), and accuracy and precision of spiked quality control samples. The National Functional Guidelines published by the EPA stipulate a set of qualifying flags to be applied to the data for various quality issues and sometimes call for rejection of the data. None of these issues, beyond the few indicated by the laboratory qualifiers, will be known to the user if the documentation is never examined. Unfortunately, this is becoming the more common case with the present emphasis on cost cutting.
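As a minimal sketch, the three laboratory qualifiers just described could be applied mechanically as follows; the calibration range and blank result are hypothetical inputs, and real flagging rules are method- and laboratory-specific.

```python
def laboratory_qualifiers(result, blank_result, cal_low, cal_high):
    """Return B/E/J flags for one positive sample result."""
    flags = []
    if blank_result > 0:   # contamination found in the laboratory method blank
        flags.append("B")
    if result > cal_high:  # above the instrument calibration range
        flags.append("E")
    elif result < cal_low: # below the calibration range
        flags.append("J")
    return flags

# Example: a 125 ug/L result against a 5-100 ug/L calibration, clean blank
print(laboratory_qualifiers(125.0, 0.0, 5.0, 100.0))  # -> ['E']
```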
Now let’s take a look at the second situation of personal injury due to airborne organic chemicals. Suppose that you are the defendant and want to assess the “goodness” of the data. The situation here is more complicated because NIOSH and forensic societies do not have clear definitions of “good data”. We believe that the EPA model of “good data” is the minimum that should be acceptable for these cases. The samples have often been collected and analyzed without formal standard operating procedures. We commonly find that these samples are reported with little or no data quality documentation. Where documentation is present, it is usually deficient in many areas. There is almost a direct correlation between the “badness” of the data and the lack of documentation. Delays in obtaining adequate documentation are often prolonged. A complete data quality assessment often results in rejection of a major fraction of the data, and we have rejected all of the data in some cases. Alternatively, if you need laboratory data to support your case, you would do well to devise a sampling and analysis plan that will produce “good data”.
In closing, let’s summarize the importance of performing a comprehensive data quality assessment. The key here is known and documented data quality. A comprehensive data quality assessment is the only way to achieve known and fully documented data quality. The most compelling reasons for establishing the data quality include:
- No laboratory achieves 100 percent completeness (the ratio of QC tests passed to QC tests performed). Normal completeness ranges from 80 to 90 percent, depending on the test; see the short worked example after this list. It is essential that you know which samples are associated with the failed QC.
- Your data may be used against you; therefore, you must know how “good” it is to avoid making unwarranted claims about what it means.
- Establishing a policy of thorough data quality assessment enhances your reputation and public image.
- Many experts (toxicologists, hydrogeologists, engineers, etc.) are dependent on the data. We have seen a great deal of engineering and scientific interpretation wasted on worthless data.
- A properly managed sampling and analysis plan followed by comprehensive data quality assessment, laboratory coordination, and appropriate corrective action will usually result in cost savings that more than recover the cost of the quality assurance effort.
- Laboratories often commit errors of omission and have been known to actually produce fake data.
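As a worked example of the completeness figure in the first point above (with made-up numbers), a laboratory that passed 176 of 200 QC tests has 88 percent completeness, leaving 24 failed tests whose associated samples must be traced:

```python
# Completeness = QC tests passed / QC tests performed (hypothetical numbers)
tests_passed, tests_performed = 176, 200
completeness = 100.0 * tests_passed / tests_performed
failed = tests_performed - tests_passed
print(f"{completeness:.0f}% complete; {failed} failed QC tests to trace to samples")
```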
In conclusion, in our experience, laboratory data is not necessarily “good”, and comprehensive data quality assessment is the only way to determine the suitability of the data.