Air Quality Sensors and Data Adjustment Algorithms: When Is It No Longer a Measurement?

Sensor technology to measure outdoor air pollution is becoming ubiquitous. Sensors are currently developed and deployed by a wide range of start-up technology companies, academic institutions, government organizations, community groups, traditional air quality instrument manufacturers, and other commercial entities. Developers seek to maximize the quality and quantity of information from sensor technologies while minimizing the cost to build and maintain them. The original equipment manufacturer (OEM) sensor components used for detection of atmospheric gases and particles generally trade off measurement selectivity, sensitivity, and reproducibility for miniaturization, lower power consumption, and lower price. Additionally, performance targets for OEM sensors or integrated sensor devices have not yet been established. Air quality sensors therefore have a variety of known measurement artifacts that those developing and applying the technology seek to overcome.

A growing trend in air sensor applications is to improve sensor data quality by applying multiple linear regression, machine learning, or other complex mathematical algorithms. To develop a data adjustment method, the sensor device is usually collocated with a reference-grade monitor in an environment that is representative of the sampling conditions. This collocation time frame serves as the training period during which a correction algorithm is developed that takes the raw sensor data and adjusts them to most closely match the reference-grade data. Thereafter, the sensor device is relocated to another environment for ongoing use and the correction algorithm is applied, on the presumption that the ongoing sampling conditions remain within the range encountered during the calibration period. In some approaches, sensor data at one location are adjusted based upon measurements made in other places, assuming homogeneity in air pollution concentrations over a specific geographic area and time frame; for example, this approach appears to be supported by commercially available software (e.g., Advanced Normalization Tool for AirVision; http://agilaire.com/pdfs/ANT.pdf).
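To make the collocation-and-adjustment workflow concrete, the sketch below fits a multiple linear regression during a collocation (training) period, using the raw sensor signal plus temperature and relative humidity as predictors of the reference measurement, and then applies the fitted coefficients to data collected after relocation. This is a minimal illustration under stated assumptions: the variable names, the choice of predictors, the synthetic data, and the use of ordinary least squares are all illustrative, not a prescribed or endorsed method.

```python
import numpy as np

def fit_correction(sensor_raw, temp_c, rh_pct, reference):
    """Fit a multiple linear regression correction over a collocation
    (training) period: reference ~ b0 + b1*sensor + b2*T + b3*RH."""
    X = np.column_stack([np.ones_like(sensor_raw), sensor_raw, temp_c, rh_pct])
    coeffs, *_ = np.linalg.lstsq(X, reference, rcond=None)
    return coeffs

def apply_correction(coeffs, sensor_raw, temp_c, rh_pct):
    """Apply the trained coefficients to data collected after relocation,
    presuming conditions stay within the range seen during training."""
    X = np.column_stack([np.ones_like(sensor_raw), sensor_raw, temp_c, rh_pct])
    return X @ coeffs

# Hypothetical hourly collocation data (low-cost sensor vs. reference monitor).
rng = np.random.default_rng(0)
n = 500
temp_c = rng.uniform(5, 35, n)      # ambient temperature, deg C
rh_pct = rng.uniform(20, 95, n)     # relative humidity, %
reference = rng.uniform(2, 60, n)   # reference concentration (illustrative)
sensor_raw = 1.4 * reference + 0.15 * rh_pct - 0.3 * temp_c + rng.normal(0, 2, n)

coeffs = fit_correction(sensor_raw, temp_c, rh_pct, reference)
corrected = apply_correction(coeffs, sensor_raw, temp_c, rh_pct)
print("fitted coefficients [b0, b1, b2, b3]:", np.round(coeffs, 3))
```

Whether predictors such as temperature and relative humidity belong in such a model at all is exactly the design question raised in the remainder of this piece: their inclusion should rest on a demonstrated measurement artifact rather than on whatever happens to improve the training-period fit.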
These emerging strategies raise a number of questions for debate, such as: How confident are we in the approach of calibrating sensors at one location for a short period of time, then deploying them at other locations under potentially differing conditions and for longer time spans? What are the appropriate parameters to include in sensor data postprocessing? At what point do sensor data cease to be an independent measurement and become, to some degree, a model output, and does this distinction matter?

A measurement purist would argue that the only parameters appropriate for inclusion in a sensor data adjustment algorithm are those that are definitively proven to cause measurement response error or bias. For example, optical particle sensors often display artifacts under increasing humidity. This effect is due to the condensation of water onto the particles, which alters their light-scattering properties and introduces inaccuracy in the estimated particulate matter mass concentrations. Optical particle sensors also have lower particle size limits on their detection capability (e.g., 300 nm). Numerous gas-phase sensors have known cross-sensitivities, whereby an electrochemical or metal oxide sensor identified as sensing a specific gas may also respond to some degree to another gas. Complicating this further, gas sensors may also have measurement artifacts related to temperature and humidity. Finally, some low-cost sensors drift in their measurement response over time. These complex factors collectively create a multidimensional problem, which a variety of groups attempt to solve through sophisticated data postprocessing.

A critical issue for debate in the scientific community is the appropriate design of sensor postprocessing algorithms. Of chief concern is the inclusion of parameters for which there is no demonstrated measurement artifact, or that rely upon untested assumptions about the state of the atmosphere. In the era of big data, it is tempting to maximize the ability of sensors to