Madeleine Tango's Portfolio

a collection of open source GIScience work

Reproducibility vs. Ethics

5/13/21

 

In her talk, Mei-Po Kwan discusses the tension between COVID-19 research and the protection of personal privacy. She focuses particularly on how we can protect vulnerable populations, whom research can either help or greatly harm. She emphasizes the importance of showing the experiences often omitted from records and data: those of low-income families, sex workers, and other vulnerable populations. However, she also acknowledges that collecting sensitive health data in vulnerable communities can increase their vulnerability, so it is important that researchers use data responsibly, protect it well, and destroy it after use. While I am uncomfortable with the idea of destroying data that can never be recovered, since doing so makes it difficult to reproduce studies or correct errors in analysis, I also understand that destroying data is the surest way to prevent leaks. Kwan discusses technology that can identify where a person lives and works, tracking people at the building level. Scale is therefore important to take into account; aggregated data make it much more difficult to connect personal data to individual identities. Similarly, Bluetooth-based COVID-19 tracing relies on proximity rather than coordinate locations, which makes it difficult to determine where individuals actually are.
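To make the point about scale concrete, here is a minimal sketch of spatial aggregation. It is my own illustration, not something from Kwan's talk: the function name `aggregate_to_grid`, the square grid, and the coordinates are all hypothetical. It snaps point locations to grid cells and keeps only a count per cell, so a coarser cell size hides individual points inside larger groups.

```python
import math
from collections import Counter


def aggregate_to_grid(points, cell_size):
    """Count points per square grid cell.

    points: iterable of (x, y) pairs in projected units (e.g., meters).
    cell_size: side length of each cell, in the same units.
    Returns a dict mapping (column, row) cell indices to point counts.
    """
    counts = Counter()
    for x, y in points:
        # Integer cell index; floor keeps negative coordinates consistent.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return dict(counts)


# Made-up coordinates for illustration only (not real data).
points = [(12.0, 7.5), (48.2, 91.0), (51.7, 95.3), (130.4, 20.1)]

fine = aggregate_to_grid(points, 10)     # near building-level resolution
coarse = aggregate_to_grid(points, 100)  # neighborhood-level resolution
```

With these made-up points, the 10-unit grid leaves every point alone in its own cell, while the 100-unit grid merges three of them into one cell. Aggregation alone does not guarantee anonymity (a cell with a count of one still singles someone out), so this only illustrates the trade-off between spatial detail and privacy.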

Kwan emphasizes the importance of guidelines on data use that the government follows, given that people tend to be more afraid of the government’s use of their data than of private companies’ use of it (even though private companies tend to have more leeway in how they use data). She makes clear that researchers need to understand the region they are studying, as cultural differences and local realities can affect how one should use data (e.g., differences in the role of restaurants during COVID-19). Moreover, understanding local people and practices helps build trust between researchers and those providing data.

The use of personal data for research is complicated because humans are complicated. It is important to be comfortable with complexity and to work toward best practices that ensure research does not harm the populations studied. Humility and transparency in methods (#8 in the recommendations of Zook et al. 2017), even when it is not possible to be transparent about the data itself, are important because they leave room for improvement and critique. However, protecting data will inherently make it more difficult to reproduce work, even if researchers document their methods as best they can.

 

Questions for class debate about curtailing vs. expanding the use/capabilities/applications of spatial big data research:

 
