Ethical Dilemmas Associated with Data and Image Manipulation
“Manipulation—that term sounds so negative…”
Long before the days of Photoshop® or even personal computers(1), I had my first experience with image manipulation and ethics. A friend and I spent a summer together at an honors program for high school students. My research using fuming acids was fun, but a complete failure. He had been developing chemicals that would attract cockroaches toward a trap (that would kill them). A key photograph showed a large group of Blattella germanica all heading toward the scented piece of wood. One lone cockroach faced away from the scent. My friend airbrushed the dissenter out of the photograph that would soon be front and center on the Science Fair poster, simply to eliminate what he considered irrelevant questions. The top prize was won and the scholarships awarded. I switched from Chemistry to Physics, but the roach’s disappearance bothered me. The stray cockroach had vanished from the photo like someone standing next to Stalin in a photograph after a “purge.”
The guidelines for scientific handling of images have been well established; however, studies of images submitted to scientific journals show that a large number have been manipulated in ways that violate ethical guidelines, even if the manipulations do not alter the conclusions of the study(2). Interestingly, many scientists submitting the altered images seemed unaware of the issues involved. Some typical guidelines can be found at Elsevier(3), Rossner(4) and Yamada(5).
Acceptable image manipulation includes such harmless changes as adjusting contrast and brightness or cropping to focus on the desired subject. From there, the possibilities escalate to a wide range of practices that may be acceptable with appropriate disclosure, and to others that may be ethical violations. Cropping can be used to eliminate contradictory results like the stray cockroach. Nonlinear filters can overly enhance attractive features. Other manipulations, such as merging multiple images, may be justified if sources are clearly described. Smudging edges, cloning, deleting data and many similar ways of touching up images made possible by sophisticated computer technology are almost always unacceptable.
Data manipulation is an even bigger problem in scientific research. It can include fabrication of results and falsification of data (omitting measurements, changing data or altering processes). Society of Petroleum Evaluation Engineers members must be aware of and avoid ethical violations of any type; some violations may seem inconsequential, but they are more common than most people might recognize. Here is one example “inspired by actual events” along with another recent example from my personal experience.
Example 1: Just the facts
In a hearing before the Texas Railroad Commission, an expert witness presents two large exhibits. One marks about a dozen gas wells drilled in Field A over the last few years with large red dots. Dates and pressures are annotated. The expert says, “Here are some wells I looked at in Field A. The G-1 shut-in pressures are noted. On the next exhibit I have plotted those pressures over time. From these wells it appears that the initial pressure in the field encountered by these wells has declined significantly over time.” Figure 1 is his second exhibit.
Figure 1 Exhibit B G-1 Pressures from selected wells
Hard to argue with the facts, right?
On cross examination the expert concedes that the field has “tight gas” and that the short buildup times may not in fact represent true reservoir pressures; to him, however, the results seem incontrovertible. But if every well in the field is plotted, the results are more like Figure 2. In this figure the first expert’s “selected wells” are shown connected with a light red dashed line, and the suggestion from “all of the wells” is that there is essentially no change over time, as indicated by the very slight negative slope of the “best fit” line. This illustration portrays the first expert’s analysis in a very dim light indeed. But perhaps some of the data points in Figure 2, purported to be from the entire field, include perforations over a different interval or depth. Maybe they are overly sampled from one part of the field? Of course, the “best fit” line shows essentially no correlation, and the ordinate scale of this graph has been expanded to minimize the appearance of data variability.
Figure 2 Complete field data G-1 pressures
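The contrast between the two exhibits can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the years and pressures are made-up numbers, not the hearing data, and the function name least_squares_slope is hypothetical. It fits an ordinary least-squares line to every well and then to a hand-picked subset (high readings early, low readings late).

```python
# Hypothetical (year, shut-in pressure in psi) pairs, two wells per year.
# These values are invented for illustration; they are NOT the hearing data.
all_wells = [
    (2005, 4100), (2005, 3800),
    (2006, 4095), (2006, 3795),
    (2007, 4090), (2007, 3790),
    (2008, 4085), (2008, 3785),
    (2009, 4080), (2009, 3780),
    (2010, 4075), (2010, 3775),
]

# A "selected wells" exhibit: the high reading in each early year,
# the low reading in each late year.
selected_wells = [
    (2005, 4100), (2006, 4095), (2007, 4090),
    (2008, 3785), (2009, 3780), (2010, 3775),
]


def least_squares_slope(points):
    """Return the ordinary least-squares slope (psi per year) through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


full_slope = least_squares_slope(all_wells)
selected_slope = least_squares_slope(selected_wells)

print(f"All wells:      {full_slope:7.1f} psi/year")
print(f"Selected wells: {selected_slope:7.1f} psi/year")
```

With these invented numbers the full dataset declines by about 5 psi per year, while the selected subset appears to decline by roughly 82 psi per year. Every point in the subset is a real member of the full set; only the selection creates the trend.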
The use of logarithmic scales, selective times and similar manipulation of data can be misleading and unethical. But clearly there are gray areas. Consider the following plot from a recent analysis of the frequency of earthquakes in a certain area over time. From this analysis one could conclude that earthquake frequency in this area has not changed significantly in recent years (and specifically has not changed as a result of an increase in hydraulic fracturing).
Selected data for earthquakes of magnitude 3.0 and above are shown because smaller earthquakes generally cannot be felt at the surface and do not cause any sort of damage. Another research group plotted just 1.0 to 2.0 magnitude earthquakes over an overlapping time period from a smaller number of measurement locations and showed significant increases in such small earthquakes. A reader might reach very different conclusions from looking at the two figures we generated even though both of us clearly pointed out all of the assumptions we had made. If we look at all of the increases in earthquakes spatially, there is some correlation with oilfields, more with regional faults, and quite a bit with the locations of larger historical earthquakes. In some areas there is a smaller but not negligible correlation with areas corresponding to high levels of hydraulic fracturing activity.
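How much the choice of magnitude band matters can be shown with a small counting exercise. The sketch below uses a made-up catalog, not the data from either study, and the helper name counts_per_year is hypothetical. It varies only the magnitude cutoff, whereas the two real analyses also differed in time period and measurement locations.

```python
from collections import Counter

# Hypothetical catalog entries as (year, magnitude); illustrative only.
catalog = [
    (2010, 1.2), (2010, 3.1),
    (2011, 1.5), (2011, 1.8), (2011, 3.2),
    (2012, 1.1), (2012, 1.4), (2012, 1.9), (2012, 3.0),
    (2013, 1.3), (2013, 1.6), (2013, 1.7), (2013, 1.9), (2013, 3.4),
]


def counts_per_year(events, min_mag, max_mag=None):
    """Count events per year with min_mag <= magnitude (<= max_mag if given)."""
    years = (
        year
        for year, mag in events
        if mag >= min_mag and (max_mag is None or mag <= max_mag)
    )
    return dict(sorted(Counter(years).items()))


felt = counts_per_year(catalog, 3.0)        # magnitude 3.0 and above
small = counts_per_year(catalog, 1.0, 2.0)  # magnitude 1.0 to 2.0

print("M >= 3.0:      ", felt)
print("1.0 <= M <= 2.0:", small)
```

In this invented catalog the count of magnitude 3.0 and larger events is flat at one per year, while the count of small events rises from one to four per year. Both summaries are accurate, and both require the cutoff to be disclosed for a reader to interpret them.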
As scientists and engineers we have to analyze data and display our analyses. In some ways our ability to make sense of data is what we are paid for. Unfiltered, unedited, uncorrected data are meaningless if not misleading. Some editing simply makes it possible to compare apples to apples, such as correcting pressure to a common subsea depth. Other edits involve judgment calls. Hundreds, if not thousands, of unconventional wells to analyze? Maybe we want to include those with “similar completion practices.” Volumes? Fluids? Numbers of stages? Approaches to spacing? Rates? Well-intentioned engineers attempting to elucidate statistical data from large datasets may well reach disturbingly different results. I am curious to hear your stories of ethical dilemmas associated with either data or image manipulation.
For some insights into ethics and compliance at Baker Hughes, see http://www.bakerhughes.com/company/corporate-social-responsibility/compliance-and-ethics and http://www.bakerhughes.com/company/corporate-social-responsibility/compliance-and-ethics/corporate-governance
At Baker Hughes, corporate social responsibility (CSR) is central to our core values. We conduct our business in an ethical and responsible way. We’ve invested in a number of initiatives to maintain our strong CSR position for the future.
We believe maintaining high standards will result in a range of sustainable benefits to all our stakeholders. Our efforts help to motivate employees, support community growth and development, minimize risk, reduce cost, and increase shareholder value.
Originally appeared as an ethics column for the Society of Petroleum Evaluation Engineers