Data Mining and LCA: A Survey of Possible Marriages

Matthew Pietrzykowski*, General Electric Global Research

A survey of data mining techniques is presented, together with possible applications that help the analyst evaluate a life cycle assessment (LCA) model. The core of an LCA is an empirical model that requires a variety of input data. The model outputs are functions of the data, the transformations applied to them, the analyst's subjective choices and inherent error. This presentation focuses on tools from statistics, artificial intelligence, computer science and other disciplines that are designed to work with, manage and interrogate LCA data and models.

The LCA modeling process is iterative and requires the analyst to challenge the data and the model frequently with focused questions. During the data gathering stage, were there collection errors, data entry errors or missing data? Once an inventory is built, what inherent structure is present? How do the data naturally aggregate? Did the data collection produce enough representative data for the scope of the study? Given a data inventory adequate to address the question asked of the model, how certain is the information, and how will its uncertainty manifest itself in the results? Can credible assessments be made from the model's output? Data mining techniques can help answer these questions.
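As one possible illustration of the first of these questions, the sketch below screens a single inventory column for missing entries and suspicious values. It is a minimal example under stated assumptions, not a method taken from the presentation: the function name `screen_records`, the cutoff of 3.5 and the sample values are all hypothetical, and the outlier rule used here is the modified z-score based on the median absolute deviation (MAD).

```python
import statistics


def screen_records(values, threshold=3.5):
    """Return (missing_indices, outlier_indices) for a list of measurements.

    None marks a missing entry. Outliers are flagged with the modified
    z-score, 0.6745 * |x - median| / MAD, which is not distorted by the
    extreme values it is trying to detect. The threshold is an assumed,
    commonly used cutoff and should be tuned to the data at hand.
    """
    missing = [i for i, v in enumerate(values) if v is None]
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    if not present:
        return missing, []

    median = statistics.median(v for _, v in present)
    mad = statistics.median(abs(v - median) for _, v in present)
    if mad == 0:
        # No spread to judge against, so nothing can be flagged.
        return missing, []

    outliers = [
        i for i, v in present
        if 0.6745 * abs(v - median) / mad > threshold
    ]
    return missing, outliers


# Hypothetical electricity use per unit of product (kWh); values are invented.
energy_kwh = [12.1, 11.8, None, 12.4, 120.0, 11.9, 12.0]
missing, outliers = screen_records(energy_kwh)
print("missing:", missing, "outliers:", outliers)
```

A flagged value is only a prompt for the analyst to check the source record; a value such as 120.0 may be a unit or data entry error, or it may be a genuine but unusual process.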

Tools such as clustering, signal processing and data transformations can address concerns in LCA data inspection. Neural networks, supervised and unsupervised discrimination, and multivariate data reduction help with exploration of the life cycle inventory (LCI). Uncertainty and sensitivity analyses can be approached with stochastic methods, experimental design, Bayesian approaches, genetic algorithms and more. The goal of this discussion is to present a variety of techniques available to the LCA practitioner that may not have been considered, and to give some simple examples of their application.
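A simple example of the stochastic approach to uncertainty analysis is Monte Carlo propagation, sketched below for a toy model. Everything in it is an assumption made for illustration: the three flows, their amounts, the geometric standard deviations, the emission factors and the choice of a lognormal distribution are invented, and the impact is taken to be a plain sum of amount times factor.

```python
import math
import random
import statistics

# Hypothetical inventory: flow -> (median amount, geometric std. dev., emission factor).
# All numbers are invented for illustration only.
INVENTORY = {
    "electricity_kwh": (10.0, 1.2, 0.5),
    "transport_tkm": (2.0, 1.5, 0.1),
    "steel_kg": (1.0, 1.1, 2.0),
}


def sample_impact(rng):
    """Draw one impact score with every flow amount sampled independently."""
    total = 0.0
    for median, gsd, factor in INVENTORY.values():
        # lognormvariate takes the mean and std. dev. of the underlying
        # normal distribution, hence the logarithms.
        amount = rng.lognormvariate(math.log(median), math.log(gsd))
        total += amount * factor
    return total


def run_monte_carlo(n_draws=10_000, seed=42):
    """Summarise the distribution of the impact score over n_draws samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [sample_impact(rng) for _ in range(n_draws)]
    cuts = statistics.quantiles(draws, n=20)  # 19 cut points: 5 %, 10 %, ..., 95 %
    return {
        "mean": statistics.fmean(draws),
        "p05": cuts[0],
        "median": statistics.median(draws),
        "p95": cuts[-1],
    }


summary = run_monte_carlo()
print(summary)
```

With these assumed inputs the deterministic score is 7.2, and the spread between the 5th and 95th percentiles shows how input uncertainty carries through to the result. Repeating the run with one flow held at its median is a crude way to see which input drives most of that spread.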


*