Predictability-Computability-Stability workflow for veridical data science in the age of artificial intelligence.

Journal: Philosophical transactions. Series A, Mathematical, physical, and engineering sciences

Abstract

Data science is a pillar of artificial intelligence (AI), which is transforming nearly every domain of human activity, from the social and physical sciences to engineering and medicine. While data-driven findings in AI offer unprecedented power to extract insights and guide decision-making, many are difficult or impossible to replicate. A key reason for this challenge is the uncertainty introduced by the many choices made throughout the data science life cycle (DSLC). Traditional statistical frameworks often fail to account for this uncertainty. The Predictability-Computability-Stability (PCS) framework for veridical (truthful) data science (VDS) offers a principled approach to addressing this challenge throughout the DSLC. This paper presents an updated and streamlined PCS workflow, tailored for practitioners and enhanced with guided use of generative AI (GenAI). We include a running example to display the PCS framework in action and conduct a related case study that showcases the uncertainty in downstream predictions caused by judgement calls in the data cleaning stage. This article is part of the theme issue 'Statistical workflow'.
