GEOG 323 Reflection 2: GIS as Reproducible Science
This week we read a piece by Wright, Goodchild, and Proctor (1997) on whether GIS is a "tool" or a "science." More accurately, the piece lays out valid arguments for both positions, which depend on the context in which GIS is being used. We also read chapters 2 and 3 of Reproducibility and Replicability in Science (2019) from the National Academies of Sciences, Engineering, and Medicine, which discuss the production of scientific knowledge as well as issues of reproducibility and replicability. I found it interesting to consider my GIS work within the realm of "science" because I have always considered geography, and by association GIS, part of the social sciences and therefore not "real science," which in my mind has always been a domain reserved for the hard sciences: biology, chemistry, physics, and the like. Although I enjoy math and computer science and the modes of precise, analytical thinking required in these fields, for most of college I never considered myself a "STEM person" because I was majoring in geography and spent most of my time outside the geography department studying non-STEM subjects like economics, history, and French. However, as a result of my recent work in GIS and this week's topic of GIS as reproducible science, I am beginning to rethink that view.
Wright et al. propose three classifications of GIS as it relates to science: "GIS as tool," "GIS as toolmaking," and "the science of GIS" (p. 354). They argue that these are not three completely separate categories but rather "three positions along a continuum from tool to science" (p. 354, emphasis in original). Although I had never thought about classifying GIS work in such a way before this week's readings and discussion, I find that this description fits very well with my experience "doing GIS," as Wright et al. put it, and with what I have read about others' work in the realms of geographic information systems and geographic information science.
As I see it, the GIS work I have done in classroom and internship settings falls into the categories of "GIS as tool" and "GIS as toolmaking." Over the course of my Middlebury career, I've moved from executing point-and-click workflows in QGIS to solve both simple and complex problems, to authoring scripts that automate very specific workflows and visualize their results, to writing models and processing scripts that are adaptable to a variety of situations, and even to packaging some of them into a QGIS plugin. In the realm of environmental geography and remote sensing, I've tackled increasingly complex analyses in Google Earth Engine (GEE), writing and adapting code to harness the functionalities of GEE and the wealth of data in its imagery libraries. As I've developed a growing interest in computer science over the past year, I think I've shifted my internal approach to GIS from tool to toolmaking. I've become increasingly curious about the QGIS Python API and about creating custom tools to increase efficiency and, just as importantly, to challenge myself as a programmer. With a program like GEE, the line between tool and toolmaking is blurrier than in a desktop GIS like QGIS or ArcGIS because, instead of a point-and-click GUI, GEE is used through its JavaScript and Python APIs. Depending on the complexity of the analysis or script, working in GEE can be seen either as employing its built-in tools to achieve a result or as developing a new tool that extends its functionality in new and innovative ways.
Although I wouldn't say I have personally contributed to the "science" aspect of GIS, none of what I have accomplished in GIS would have been possible without advancements in the field of GIS as science. As Wright et al. put it, "The term 'science' may be viewed as shorthand for a logical and systematic approach to problems that seek generalizable answers" (p. 353). As we discussed in class, geographic information science isn't necessarily dependent on the existence of geographic information systems because spatial theories can be developed using just pencil and paper. In my mind, the realm of GIS as science comprises both the development of geospatial theory and the work of embedding that theory into useful computer programs that can solve geospatial problems at much larger scales and with a high level of precision. I don't think I have personally done either of these things, but I have certainly built on the work of others who have, analyzing and engaging with their theories and tools along the way. In this way I see GIS as tool and GIS as toolmaking as distinct from, yet critically dependent upon, GIS as science.
As the authors of the National Academies report put it, "science is a communal enterprise" (p. 32). Thus it is crucial that researchers' methods and findings be clear: "Researchers have to be able to understand others' research in order to build on it. When research is communicated with clear, specific, and complete accounting of the materials and methods used, the results found, and the uncertainty associated with the results, other scientists can know how to interpret the results" (p. 32). Clear and detailed documentation is as important in the realm of GIS as it is in any physical science because specific choices of tools, parameters, and data modifications and transformations can have significant impacts on results and the conclusions drawn from them.
The authors of the report discuss growing concerns regarding "the lack of data, code, and detailed description of methods in individual studies or a set of studies" (p. 42), which limits the reproducibility and replicability of research because it leaves others unsure how the research was actually performed in the first place. Although the potential for GIS researchers to gloss over seemingly minor steps and focus on the "big picture" of their analyses is great, so is the opportunity to carefully document decisions and workflows, particularly in the world of open-source GIS. As the authors explain, providing detailed documentation enables other researchers to reproduce and replicate results, which increases overall confidence in the findings. The realm of open-source GIS is well suited to such reproducibility, given that source code and, in the truest sense of open source, data too are available to all. With appropriate documentation, others should be able to perform identical analyses and validate the results, contributing to major advancements in geographic information science. It is only through confidence in GIS as science that future geographers, computer scientists, and others can engage productively with GIS as tool and GIS as toolmaking, and such confidence is inextricably linked to the reproducibility and replicability of geographic information science research.