Global Data Quality Assessment and the Situated Nature of "Best" Research Practices in Biology
Data Science Journal
This paper reflects on the relation between international debates around data quality assessment and the diversity characterising research practices, goals and environments within the life sciences. Since the emergence of molecular approaches, many biologists have focused their research, and related methods and instruments for data production, on the study of genes and genomes. While this trend is now shifting, prominent institutions and companies with stakes in molecular biology continue to set standards for what counts as ‘good science’ worldwide, resulting in the use of specific data production technologies as a proxy for assessing data quality. This is problematic considering (1) the variability in research cultures, goals and the very characteristics of biological systems, which can give rise to countless different approaches to knowledge production; and (2) the existence of research environments that produce high-quality, significant datasets despite not availing themselves of the latest technologies. Ethnographic research carried out in such environments reveals a widespread fear among researchers that providing extensive information about their experimental setup will affect the perceived quality of their data, making their findings vulnerable to criticism by better-resourced peers. These fears can make scientists resistant to sharing data or describing their provenance. To counter this, debates around Open Data need to include critical reflection on how data quality is evaluated, and the extent to which that evaluation requires a localised assessment of the needs, means and goals of each research environment.
This research was funded by the European Research Council grant award 335925 (“The Epistemology of Data Science”), the Leverhulme Trust grant number RPG-2013-153 (“Beyond the Digital Divide”), and the Australian Research Council Discovery Project DP160102989 (“Organisms and Us”).
This is the author accepted manuscript. The final version is available from Ubiquity Press via the DOI in this record.
Vol. 16, p. 32