Journal of Global Health's Guidelines for Reporting Analyses of Big Data Repositories Open to the Public (GRABDROP): preventing 'paper mills', duplicate publications, misuse of statistical inference, and inappropriate use of artificial intelligence.
Journal:
Journal of Global Health
Published Date:
Jul 1, 2025
Abstract
In recent years, global accessibility to large 'big data' repositories that enable 'open research' - such as the UK Biobank, National Health and Nutrition Examination Survey (NHANES), and Global Burden of Disease (GBD) datasets - has created unprecedented opportunities for researchers worldwide to conduct secondary data analyses. This development is particularly beneficial for early-career researchers in low- and middle-income countries (LMICs), as it lets them access large and otherwise costly datasets without the need for local infrastructure, potentially curbing brain drain. However, through our work at the Journal of Global Health (JoGH), we have identified emerging concerns that must be addressed to help preserve the integrity and scientific value of this otherwise positive trend. These include: the risk of 'paper mills' mass-producing superficial papers with questionable authorship practices; duplicate publications produced through republishing already available results or by multiple groups testing the same hypothesis using identical datasets and methods without awareness of each other's work; proliferation of false-positive findings due to inadequate adjustment for multiple testing in large datasets; and the inappropriate or undisclosed use of artificial intelligence (AI) tools in generating manuscripts. To counter these issues while continuing to support legitimate and innovative secondary data analyses, JoGH is introducing guidelines for authors submitting such work for consideration and peer review. These guidelines require authors to declare transparently: their previous published work based on similar datasets or hypotheses; the originality of their research question and design in the context of other similar research; their awareness of related published studies using the same dataset; how they addressed multiple testing statistically; and the role of AI, if any, in manuscript preparation or data analysis. 
A new, mandatory section in such submitted manuscripts - 'Adherence to JoGH's Guidelines for Reporting Analyses of Big Data Repositories Open to the Public (GRABDROP)' - will summarise these declarations, with full details provided in a supplemental file. This proactive editorial policy aims to safeguard scientific quality while empowering global researchers. By improving transparency and accountability, JoGH seeks to ensure that the benefits of open big data are not undermined by unethical or careless practices. We suggest that other publishers engage in an open discussion on how to address these challenges and consider adopting JoGH's GRABDROP guidelines or similar measures to maintain trust in scientific outputs derived from secondary analyses. Through these steps, JoGH remains committed to fostering reproducible and equitable global health research.