After many years of immersion in technical work, I still marvel at how an organization can become mired in raw data. Smart people can easily succumb to the notion that data equals knowledge, especially in circumstances where data is accumulated faster than it can be assimilated.

It is relatively easy to collect data in a chemical lab. You take a set of samples and prep them for testing, load the sample vials into the sample tray, and let the automated sampling widget move through its paces. In a few minutes or hours the software has accumulated files bulging with data points. It is even possible to construct graphs with all sorts of statistical manipulations on the data and still not morph the data into usable knowledge. I’ve been to meetings where graphs were presented but not backed up with interpretation. What was the presenter’s point in showing the graph?

Computerized chromatography stations will spew data all day long onto hard drives based on selections from a cafeteria-style menu. With hyphenated instrumentation, an innocent-looking 2-dimensional chromatogram is actually just one part of a higher-dimensional data set with corresponding mass spectra or UV/Vis spectra.

The task for the technical manager is to get control of this stream of data and render some of it into higher-level knowledge that will help people run the organization and get product or research out the door. This is the true work product of the experimental scientist: knowledge woven from a data cross-fire and supported by accepted theory.

I do not know what others do when confronted by a data tsunami. I can only speak for myself on this. When the data flow gets ahead of me, it usually means that I am spread too thin. It indicates that I am not taking enough time to properly devise experiments for maximum impact and am skimping on the analysis in favor of other duties.

Another issue in managing diverse data output is storing accumulated data and knowledge for easy retrieval. It is easy to throw things into folders and file them away. But in a few months, the taxonomy used for filing a given bundle of data becomes murky. Soon you are forced to rummage through many files to find data because you’ve forgotten the details of how you organized the filing system.

There are ways around this problem. Laboratory Information Management Systems (LIMS) are offered by numerous vendors. A good LIMS package goes a long way towards managing data and distributing knowledge. We have a homebrew LIMS (built in MS Access) that seems to work rather well for analytical data. However, it was not constructed with process safety information in mind.

What I have constructed for my process safety work is an Access-based application that structures various kinds of information graphically into regions on a form. Within each region is a set of data fields that are subordinate to a given heading or context. The form is devised to prompt the user to consider many types of thermokinetic experiments and provides fields that are links to specific documents. The form provides both actual data and links to source documents. It can be used to enter data or to retrieve it.

This is what Access is designed to do, so I have described absolutely nothing conceptually new. Access allows me to aggregate related kinds of experimental results, reports (the knowledge part), and source documents in one field of view, giving the user’s visual processing capability the chance to browse more efficiently.

An example of “related kinds of experimental data” would be DSC, TGA, ARC, and RC1 reports. What connects these fields is the domain of thermal sensitivity of a compound or reaction mixture.
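The grouping described above can be sketched outside of Access as well. What follows is only a minimal Python illustration, not the actual application: the class names, field names, and the document path are all hypothetical, and the list of expected methods is simply the four report types named above. Each record holds the interpreted result alongside a link to the source document, and a small helper plays the role of the form prompting the user about experiments not yet on file.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TestReport:
    """One report for a material (all field names are hypothetical)."""
    method: str         # e.g. "DSC", "TGA", "ARC", "RC1"
    summary: str        # the interpreted result, i.e. the knowledge part
    document_link: str  # path or URL of the source document


@dataclass
class ThermalSensitivityRecord:
    """Aggregates the reports that bear on one material's thermal sensitivity."""
    material: str
    reports: List[TestReport] = field(default_factory=list)

    def add(self, report: TestReport) -> None:
        self.reports.append(report)

    def by_method(self, method: str) -> List[TestReport]:
        return [r for r in self.reports if r.method == method]

    def missing_methods(
        self, expected: Tuple[str, ...] = ("DSC", "TGA", "ARC", "RC1")
    ) -> List[str]:
        # Mimics the form prompting the user about experiments not yet on file.
        return [m for m in expected if not self.by_method(m)]


# Placeholder entry for illustration only; not real data.
record = ThermalSensitivityRecord(material="example material")
record.add(TestReport("DSC", "placeholder summary", "reports/placeholder_dsc.pdf"))
print(record.missing_methods())
```

The point of the sketch is the shape of the record: data, interpretation, and a pointer to the source document travel together under one heading.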

Another aggregation of fields would be the conditions related to an incident. I like to attach key descriptors to each incident to aid in incident-type studies at a later date. It is useful to be able to sort incidents by cause or outcome: a blown rupture disk, a spill, a fire, a triangulated drum, etc.
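As a rough illustration of why descriptors pay off later, here is a small Python sketch that builds an index from descriptor to incident. It is an assumption-laden toy, not the Access implementation: the function name, the dictionary keys, and the incident identifiers are all made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def index_by_descriptor(incidents: List[dict]) -> Dict[str, List[str]]:
    """Map each descriptor to the ids of the incidents tagged with it."""
    index = defaultdict(list)
    for incident in incidents:
        for tag in incident["descriptors"]:
            index[tag].append(incident["id"])
    return dict(index)


# Hypothetical incidents for illustration only; ids and tags are invented.
incidents = [
    {"id": "INC-A", "descriptors": ["spill"]},
    {"id": "INC-B", "descriptors": ["blown rupture disk", "fire"]},
    {"id": "INC-C", "descriptors": ["spill", "fire"]},
]

index = index_by_descriptor(incidents)
print(index["fire"])
```

Because an incident can carry several descriptors, the same incident shows up under each one, which is what makes a later study of, say, every fire a single lookup rather than a rummage through folders.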

A database is rather like a garden. In order to be useful it must be planted and then cultivated. Ignore it and it will lose its comprehensiveness, casting into doubt its continued use.

Next up is the development of an in-house, Wikipedia-style browser application for aggregating product, process, and safety information. This offers the best opportunity yet for making information and diverse data available to employees. It can be written in narrative form so as to impart knowledge and history. Why was a particular vendor chosen, or how did we decide on that specification? What was the rationale for the process change in step 4.2? The ability to explain and link to in-house source documents from a single, familiar point of access is key to potential success.
