DATA QUALITY ASSESSMENT
IT'S NOT JUST PUSHING PAPER
In recent years the environmental arena has expanded dramatically, pushing technology to new limits and encouraging innovation in an ever-changing regulatory atmosphere. One environmental issue dominates all aspects of this dynamic arena -- the science of identifying and quantitating regulated chemical species, or laboratory analysis. Data from laboratory analysis is essential to risk assessments, extent-of-contamination studies, feasibility studies, and compliance monitoring. A tremendous amount of resources and capital is expended based on laboratory analytical results. This factor, plus the legal aspects surrounding environmental issues, makes it imperative that these results be "of known and acceptable quality" for the intended use.
Quality Assurance (QA) is the total integrated process for assuring the reliability and defensibility of decisions based on analytical data. This process is extremely rigorous due to the complexity of laboratory analyses, the number of different analytical methods each with different criteria, the various agency standards for validating results, the abundance of laboratories each with varying strengths and weaknesses, and the diverse scope of measurement purposes. In other words, it takes a lot of paper pushing. Few will dispute that this is a necessary evil in order to ensure laboratory data is legally defensible. However, many miss the fact that it is also necessary to ensure data is technically valid -- a determination that is highly interpretive and requires a thorough understanding of analytical chemistry. Many data users do not realize that it is the norm, rather than the exception, for there to be some bias associated with laboratory data. If not accounted for, this bias can lead to disastrous decisions.
It is important to recognize that it doesn't necessarily take a failure on the part of the laboratory to introduce bias. In fact, in many cases, bias is introduced solely by the nature of the sample matrix, even when the laboratory performs the analysis exactly as required in the analytical method. Of course, laboratory data should never be blindly accepted -- laboratories today are more and more automated, which means less and less reliance on experienced analytical chemists. Laboratory QA reviews are by necessity generic for all clients and thus are often inadequate for a particular intended use. EPA guidelines make it clear that the user of the data has the final responsibility for the quality of the data:
"If the data are collected by a contract laboratory, it is the permittee's responsibility to see that all of the requirements in the method are met by the contract laboratory and that all (associated) data are provided." - Guidance on Evaluation, Resolution, and Documentation of Analytical Problems Associated with Compliance Monitoring (1).
The fundamental principle of quality assurance is data quality assessment. Third-party data quality assessment is the best way to ensure laboratory data is legally defensible and technically valid -- it eliminates potential conflicts of interest and is performed more efficiently (in terms of cost and time) and more reliably by highly experienced specialists. A complete data quality assessment includes four major tasks: data validation, data qualification, usability determination, and data rescue.
Quality Control (QC) is a set of measures within a sampling and analysis methodology to assure that the process is in control. These measures may be procedural (e.g., samples must be analyzed within 12 hours of instrument calibration) or numerical (e.g., the recovery of a spiked analyte must be 80%-120%). Numerical control limits are dictated by the analytical method (which generally requires laboratories to develop and regularly update statistical limits for each sample matrix), the regulatory program (which may recommend minimum limits), and/or the project work plan or QAPP (which sets limits based on the accuracy needed to meet data quality objectives). Data validation is the process of verifying that the field sampler and the laboratory have complied with all QC requirements of the specified analytical method, program, and project. The first step in this process is identifying the data quality objectives for the project, which determine the numerical control limits. The validation is then performed, normally using a checklist, by making QC Checks that cover the necessary procedural and numerical requirements.
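The two kinds of QC Checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method-specified procedure: the function names are hypothetical, the 12-hour window and the 80%-120% limits are the examples from the text, and the recovery formula assumes the common definition of spike recovery (spiked result minus unspiked result, divided by the amount added).

```python
from datetime import datetime, timedelta


def percent_recovery(spiked_result, unspiked_result, amount_added):
    """Percent recovery of a spiked analyte (assumed spike-recovery formula)."""
    if amount_added <= 0:
        raise ValueError("amount_added must be positive")
    return 100.0 * (spiked_result - unspiked_result) / amount_added


def numerical_check(recovery, lower=80.0, upper=120.0):
    """Numerical QC Check: recovery must fall within the control limits."""
    return lower <= recovery <= upper


def procedural_check(calibrated_at, analyzed_at, window_hours=12):
    """Procedural QC Check: analysis must fall within the calibration window."""
    elapsed = analyzed_at - calibrated_at
    return timedelta(0) <= elapsed <= timedelta(hours=window_hours)


# Hypothetical example values, for illustration only.
recovery = percent_recovery(spiked_result=14.0, unspiked_result=4.0, amount_added=10.0)
in_limits = numerical_check(recovery)
in_window = procedural_check(datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 19, 30))
```

In practice the limits would come from the analytical method, the regulatory program, or the project QAPP rather than from defaults in the code.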
The QC Checks are made by examining the field notes and the data reported by the laboratory and comparing them against the QC requirements. Calculations from raw data are also made to verify the laboratory's reported values. (Raw data is unprocessed data taken directly from the instrument printouts instead of the laboratory QC reports.) Though tedious, the process is rote and can be performed relatively easily once the applicable QC Checks and numerical control limits have been identified. Applicable QC Checks are usually determined by the QC Level, which is a measure of the reliability required. A high-profile remediation verification will normally require all the QC Checks specified in the sampling and analytical methods. Progress monitoring may require only a fraction of these. Numerical QC Checks are generally more important than procedural checks, since they are performance-based. Another variable is the number of raw data calculations to perform. This decision is based on the QC Level and the project staff's confidence in and familiarity with the laboratory. Raw data calculations are highly labor-intensive and normally are not made on more than 20% of the data. Of course, if errors are found, more calculations are required.
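As a rough sketch of the raw data calculations described above, the following Python selects a fraction of the results for recalculation and compares a recomputed concentration with the reported value. The function names, the simple linear response-factor calculation, and the 1% agreement tolerance are all assumptions made for illustration; the actual calculation depends on the analytical method and instrument.

```python
import math
import random


def select_for_recalculation(result_ids, fraction=0.20, seed=None):
    """Pick a random subset of results for raw data recalculation."""
    ids = list(result_ids)
    if not ids:
        return []
    count = max(1, math.ceil(fraction * len(ids)))
    rng = random.Random(seed)
    return sorted(rng.sample(ids, count))


def recalculate(raw_response, response_factor, dilution_factor=1.0):
    """Recompute a concentration from raw data (assumed linear calibration)."""
    return raw_response / response_factor * dilution_factor


def agrees(reported, recalculated, rel_tol=0.01):
    """True if the reported value matches the recalculation within rel_tol."""
    return math.isclose(reported, recalculated, rel_tol=rel_tol)
```

If any recalculated value disagrees with the reported one, the fraction can simply be raised and the selection repeated, which mirrors the rule that errors call for more calculations.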
Laboratories normally apply qualifiers (i.e., flags) to the analytical data that directly relate to the concentration value reported. For example, a "U" flag is applied to the concentration value (in this case, the detection limit) of analytes that are not detected in the sample. Likewise, the laboratory applies "J" and "E" flags to concentration values that are below or above the calibration range of the instrument, respectively. For most analysis types, though, the laboratory does not apply any flags that relate to the QC requirements of the analytical method, and it never applies flags for those of the project. Data qualification (sometimes called data review) is the process of flagging analytical data (both detects and non-detects), according to a set of pre-established functional guidelines, to reflect any QC failures. The procedure includes flagging each sample to reflect any failures for the sample itself (e.g., extended holding time) and any failure of a QC sample referenced to the sample (e.g., blank contamination). Normally, the procedures used are those in the USEPA Contract Laboratory Program National Functional Guidelines for Organic Data Review (2) and Guidelines for Inorganic Data Review (3). These procedures, commonly referred to as NFG, produce flags to indicate if a concentration value is only an estimate of the actual value or if it is completely rejected for use. The Region III Modifications to NFG (4) additionally produce flags to indicate if estimated values are biased low or high. Alternatively, functional guidelines specific to the project can be developed and used. Like data validation, data qualification can be standardized -- as it should be to maintain a consistent, thorough QA/QC program.
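The flagging logic described above might look something like the sketch below. It is not a transcription of NFG or of any project guideline: the flag codes ("J", "UJ", "R", "B"), the doubled holding time threshold, and the five-times-blank rule are placeholder assumptions chosen only to show one failure of the sample itself and one failure of a referenced QC sample.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    analyte: str
    value: float          # reported concentration (detection limit if not detected)
    detected: bool
    flags: list = field(default_factory=list)


def qualify(result, holding_days, holding_limit_days, blank_value=0.0):
    """Apply illustrative qualifier flags for two kinds of QC failure."""
    # Failure of the sample itself: holding time exceeded.
    if holding_days > 2 * holding_limit_days and not result.detected:
        # Placeholder rule: grossly exceeded holding time rejects non-detects.
        result.flags.append("R")
    elif holding_days > holding_limit_days:
        # Placeholder rule: value is only an estimate.
        result.flags.append("J" if result.detected else "UJ")
    # Failure of a referenced QC sample: blank contamination.
    # Placeholder rule: a detect below 5x the blank level is qualified.
    if result.detected and blank_value > 0 and result.value < 5 * blank_value:
        result.flags.append("B")
    return result
```

A real implementation would load its rules from the functional guidelines selected for the project, so that the same data can be qualified consistently from one sampling event to the next.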
Depending on the circumstances surrounding a particular project, analytical data that is qualified, for example as biased low, may or may not be suitable for use. Likewise, a reporting limit that is above the action level for a project may render data for non-detects unsuitable for use. A usability determination is the process of determining the suitability of analytical data for the intended use. It is the most critical step of data quality assessment and requires a thorough understanding of both the analysis procedure and the environmental project. The determination is made by examining the reporting limits and the qualified data for each individual sample along with the QC Check completeness (percentage of samples passing each check) for the entire data set. In some cases, it may be possible simply to use the NFG flags applied during data qualification to indicate data usability. For example, the usability determination may conclude that all data qualified as biased high should be used as reported and all data qualified as biased low should be used with a +50% correction. In other cases, the NFG flags may not be appropriate and additional flagging is required. Several documents provide general guidelines on data usability or specific information for certain types of environmental projects (5-8).
Even with such guidelines, many of which are outdated, this process is highly technical and interpretive and it requires an experienced data quality assessor to make accurate, consistent, and concise usability determinations.
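The mechanical parts of a usability determination can be sketched as follows, using the +50% correction example given earlier. The function names and the decision rules are hypothetical simplifications; they are meant to show the arithmetic, not to replace the interpretive judgment the determination requires.

```python
def completeness(passed, total):
    """QC Check completeness: percentage of samples passing a check."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * passed / total


def usable_value(value, detected, bias, reporting_limit, action_level):
    """Return the value to use for decisions, or None if the data is unusable."""
    if not detected:
        # A non-detect cannot show compliance if the reporting limit
        # is above the action level.
        return None if reporting_limit > action_level else reporting_limit
    if bias == "low":
        return value * 1.5   # the +50% correction from the example
    return value             # biased high or unqualified: use as reported
```

For instance, a detect of 10.0 qualified as biased low would be carried forward as 15.0, while a non-detect with a reporting limit of 10.0 against an action level of 5.0 would be returned as unusable.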
Based on a data validation, qualification, and usability determination, some analytical data simply cannot be used as reported by the laboratory. Of course, it can be quite expensive to re-collect and re-analyze the samples. An analytical chemist with broad interpretive experience, asking the right questions, can often uncover ways to rescue the current analytical data and/or improve the quality of future analytical data. This may involve applying correction factors or calculating sample-specific detection limits for the data. Modifying the sampling or analysis technique or selecting a more appropriate analytical method may also be warranted. Simply establishing regular communication with the laboratory can often drastically improve data quality. In any case, the sooner an experienced data quality assessor becomes involved in a project, the better the quality of the analytical data obtained.
As an ultimate goal, data quality assessment procedures should be standardized and uniform throughout the environmental industry and applied to all environmental analysis data. Certain aspects of data quality assessment, namely data validation and qualification, lend themselves to automation, and the only practical approach to this goal is an automated software application that processes electronic format laboratory data. Any such application must be able to process raw laboratory data (quantitation reports and instrument printouts), validate to any set of control limits, accommodate different QC protocols, and easily communicate with other software. A logical step would be to include data management functions -- namely preparation of sampling documents, tracking of the data quality assessment, and storage and retrieval of technically valid, legally defensible data of known and documented quality. Once such a system had been set up to read a particular laboratory's data, it would be possible to process all analysis data for that laboratory automatically at a QC Level that included all QC Checks and 100% raw data calculations, for less than it costs to process the same data manually with fewer QC Checks and only 20% raw data calculations. Additionally, because data for both environmental and QC samples would reside in the system, it would provide a perfect environment for evaluation of historical data by either the engineer/scientist or QA personnel.
Third-party data quality assessment, whether performed manually or electronically, is essential to the success of environmental projects relying on laboratory analysis data. It is easy to imagine the consequences of using analytical data without the benefit of a data validation or qualification to identify the associated QC problems -- data that is only an estimate of the actual value or completely inaccurate might unknowingly be used to make important environmental decisions. This can lead to poor decisions that cost a lot of time and money. To take it one step further, without the benefit of a usability determination the user may tend to take data at face value, not understanding the implications of QC deficiencies found by the validator. Without data rescue, the user can be stuck with worthless analytical data. Without the involvement of an experienced analytical chemist, future sampling events will likely yield more of the same. By contrast, a thorough data quality assessment based on sound analytical chemistry provides legally defensible, technically valid data for environmental projects. And even for the most complex scenario, a data quality assessment runs only about 5-10% of the sampling and analysis cost -- an expense more than justified by avoiding decisions based on invalid data and by savings in re-sampling and re-analysis costs.
(1) Guidance on Evaluation, Resolution, and Documentation of Analytical Problems Associated with Compliance Monitoring. USEPA Office of Water - Engineering and Analysis Division. U.S. Government Printing Office: Washington, DC, June 1993.
(2) USEPA Contract Laboratory Program National Functional Guidelines for Organic Data Review. USEPA Office of Solid Waste and Emergency Response. U.S. Government Printing Office: Washington, DC, October 1999.
(3) USEPA Contract Laboratory Program National Functional Guidelines for Inorganic Data Review. USEPA Office of Solid Waste and Emergency Response. U.S. Government Printing Office: Washington, DC, October 2004.
(4) Region III Modifications to National Functional Guidelines for Organic Data Review Multi-Media, Multi-Concentration (OLM01.0-OLM01.9). USEPA Region III Central Regional Laboratory - Quality Assurance Branch. U.S. Government Printing Office: Washington, DC, September 1994.
(5) Guidance on Environmental Data Verification and Data Validation (G-8). EPA/240/B-02/004, November 2002.
(6) Guidance for Data Usability in Risk Assessment. Office of Emergency and Remedial Response. U.S. Government Printing Office: Washington, DC, April 1992.
(7) Superfund: Data Quality Objectives for Remedial Response Activities - Development Process and Example Scenario. USEPA. U.S. Government Printing Office: Washington, DC, March 1987.
(8) QA/QC Guidance for Removal Activities: Sampling QA/QC Plan and Data Validation Procedures. USEPA. U.S. Government Printing Office: Washington, DC, April 1990.
Taryn G. Scholz holds a B.S. in chemical engineering from the University of Houston and is currently the president of Quality Assurance Associates (QAA), a small environmental consulting firm in College Station, Texas. She has been involved with data quality assessments for thousands of samples including her role as assistant quality assurance manager at a major Superfund site. She currently serves as project manager for data quality assessment projects and is working on an in-house research and development project called ADAM. The Automated Data Assurance Manager (ADAM) is a software tool that performs data management and data quality assessment functions using electronic format laboratory data.
Louise McGinley holds a B.S. in chemistry from Southwest Texas State University and is now employed by a software firm in Austin, Texas. She has been involved in the design, technical support, and customized add-ons of laboratory data management systems for scores of projects.
Dr. Donald A. Flory holds a Ph.D. in analytical chemistry and geology from the University of Houston and is currently a principal of QAA. He has served as quality assurance manager for several different laboratories and a major Superfund project and has published numerous papers in the fields of radiation chemistry, organic geochemistry, and environmental chemistry. He is currently responsible for quality assurance oversight on all data quality assessment projects and for devising data rescue procedures.