
To what extent does open-source GIS help solve the problems of the reproducibility crisis for geography? How?

Before diving into the question, we need to define some terms:

  1. Reproducibility vs Replicability

• Reproducibility: Obtaining consistent results using the same input data, computational steps, methods and code, and conditions of analysis (NASEM 2019)

• Replicability: Obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data (NASEM 2019)

  2. Open-Source and Free Software: What is Freedom?

• Free as in the Free Software Movement: In this context, “free” refers to freedom, not price. Free software, as defined by the Free Software Foundation, grants users four essential freedoms: the freedom to run the program for any purpose, modify it to suit their needs, redistribute copies (either for free or for a fee), and distribute modified versions to benefit the community (Rey 2009). This concept emphasizes the user’s right to control and manipulate the software, ensuring that it remains open and customizable. The focus is on user empowerment, the availability of source code, and the ability to collaborate and innovate.

• Free as in $0.00: On the other hand, when something is described as “free” in terms of cost, it means that users do not have to pay any money to access or use it. This type of “free” does not necessarily guarantee the four freedoms associated with free software (Rey 2009). It may include software that is provided without charge, but it might not grant users the rights to modify, redistribute, or access its source code. In this case, the emphasis is on affordability rather than user empowerment and control.

• Open source: a term for free software, adopted to shed the misconception that free software is of inferior quality and unfit for corporate use (Rey 2009)

• Free software: freely available source code that is a necessity for the innovation and progress of the field (Rey 2009)


Open-source GIS plays a significant role in addressing some of the problems related to the reproducibility crisis in geography, as discussed in both the National Academies (NASEM) report Reproducibility and Replicability in Science and Sergio Rey’s article on spatial analysis and open source, “Show me the code.”

Here’s how open-source GIS helps tackle these issues:

Transparency: Open-source GIS software allows researchers in geography to share not only their research findings but also the underlying data and the code used to conduct their analyses. This transparency is a key aspect of reproducibility, as it enables other researchers to rerun the same analysis on the same data and verify the results. This aligns with the principles of the Honor Code at Middlebury College, where students are dedicated to properly attributing sources and respecting licensing terms when using all resources.

Code Availability: In Rey’s article, the phrase “show me the code” emphasizes the importance of sharing the code used in spatial analysis. Open-source GIS software, by its nature, provides access to the source code, allowing researchers not only to see how analyses were performed but also to modify and adapt the code for their specific research needs. This availability of code facilitates reproducibility, as it ensures that the computational aspects of the research are transparent and accessible to others.

Collaboration: Open-source GIS fosters a collaborative environment where researchers can work together to improve and refine spatial analysis methods. This collaborative aspect is crucial for addressing reproducibility challenges, as it allows for the identification and correction of errors or inconsistencies in analyses.

Reducing Cost Barriers: Open-source GIS software is typically available at no cost, which helps reduce barriers to access for researchers and students. An example of this is QGIS, which I have been using since my introduction to human geography with GIS. As someone with limited resources, I was able to set foot in the world of GIS and to engage in and contribute to reproducible research in geography. Hence, this affordability aligns with the broader reproducibility goal of making research accessible to a wider audience.

Flexibility and Customization: Open-source GIS tools are often highly customizable, allowing researchers to tailor their analyses to specific research questions or geographic contexts. This malleability enables researchers to adapt and reproduce analyses in different settings, contributing to the replicability of findings across various geographic regions or studies. Because the tool can be adapted, the research design need not be dictated by what the software happens to offer.
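To make the transparency and code-availability points above a little more concrete, here is a minimal sketch in Python of what a reproducibility check could look like when both the data and the code are shared. It is only an illustration under my own assumptions: the coordinates are made up, and the function names (`fingerprint`, `mean_center`, `run_analysis`) are hypothetical rather than taken from NASEM, Rey, or any GIS package.

```python
import hashlib
import json


def fingerprint(obj):
    """Return a SHA-256 hex digest of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mean_center(points):
    """Compute the mean center (average x, average y) of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def run_analysis(points):
    """Run the analysis and record which input data it was run on."""
    return {
        "input_sha256": fingerprint(points),
        "mean_center": mean_center(points),
    }


# Hypothetical input data: made-up coordinates, not from any real study.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

# Same data + same code should give the same result (reproducibility).
first = run_analysis(points)
second = run_analysis(points)
reproduced = first == second
print("Reproduced:", reproduced)
```

If someone else downloads the same data, the matching input hash tells them they are starting from the same place, and the shared code lets them check whether they arrive at the same result. Replicability would be a different test: collecting new data and seeing whether the conclusion still holds.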

These aspects align with the principles of reproducibility and replicability and help ensure that geographic research can be more transparent, accessible, and verifiable. This kind of environment not only works in open science but also in the liberal arts education that emphasizes critical thinking, interdisciplinary exploration, and ethical engagement. The community I live in provides me with valuable skills and a mindset for addressing complex, real-world issues.


Are there problems with reproducibility and replicability in geography that open-source GIS cannot help solve?

In his article, Sergio Rey discusses how open-source GIS has several advantages, including community collaboration, user-led innovation, and accessibility, but there are still some problems related to reproducibility and replicability in geography that it may not fully address.

Technological Elitism: The developer-centric nature of open-source projects may exclude those without programming skills, fostering technological elitism. This may also lead to path dependency in knowledge, where some researchers use only one kind of software because it is all they know how to use. This is something I hope does not happen to me. Currently, I only know how to use open-source GIS software (QGIS), as I cannot afford paid proprietary software like ArcGIS.

Interface Design: Program interfaces may be designed primarily for engineers and developers, making them less user-friendly.

Documentation Quality: Some open-source projects lack proper documentation, which can be problematic for non-technical users and discourage new developers from participating.

Rapid Evolution: Open-source projects can change rapidly, making it challenging to create course curricula around a moving target or conduct long-term research projects that rely on specific software versions.

Forks: Occasionally, open-source projects undergo “forks” where developers split off to start independent projects, potentially causing issues for projects that relied on the original software.
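For the rapid-evolution problem described above, one partial mitigation is to record the exact software versions alongside the results, so that a later reader at least knows which "moving target" the analysis was run against. The sketch below is a minimal Python example using only the standard library. The function name `describe_environment` is hypothetical, and the package names are just examples of Python geospatial libraries that a project might use, not ones mentioned by Rey or NASEM.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Collect interpreter, OS, and package versions for a methods note."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


# Example package names (an assumption); swap in whatever the project uses.
env = describe_environment(["geopandas", "shapely", "pyproj"])
print(json.dumps(env, indent=2))
```

Saving this output next to the data and code does not stop the software from changing, but it gives future researchers a starting point for rebuilding the same environment.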


But what about open-source GIS in…

Science?

Open-source GIS enhances reproducibility and transparency in research, fosters collaboration among scientists, and enables the sharing of data and methodologies. Its challenges include the need for technical expertise to use open-source tools effectively and potential issues with data quality and compatibility.

The Government?

Open-source GIS reduces costs, promotes data sharing among agencies, and enhances public access to government-generated geospatial data. However, security concerns and the need for robust data governance may arise, and government agencies may require additional resources for implementing and maintaining open-source solutions.

Private Businesses?

Open-source GIS can lower licensing costs, provide flexibility for custom solutions, and foster innovation by leveraging a broader developer community. Still, businesses may face challenges in terms of support and maintenance for open-source tools, potential intellectual property issues, and the need for in-house technical expertise.

While open-source GIS is a powerful tool for improving reproducibility and replicability in geography and offers valuable opportunities for learning, it does not eliminate all challenges. However, open-source GIS serves as a foundation for transparency and collaboration, fostering a culture of reproducibility and openness within the geographic research community. I believe that it is essential to recognize the value of open-source GIS while acknowledging its limitations and working toward addressing these challenges for the benefit of reproducibility and replicability in geography.

Find more information about the class I am taking here!


Bibliography

NASEM. 2019. Reproducibility and Replicability in Science. Washington, D.C.: National Academies Press. Chapter 3, Understanding reproducibility and replicability, 31–43.

Rey, S. J. 2009. Show me the code: Spatial analysis and open source. Journal of Geographical Systems 11 (2):191–207.