Code_Transformed: The Influence of Large Language Models on Code
Journal: arXiv
Published Date: Jun 13, 2025
Abstract
Coding remains one of the most fundamental modes of interaction between
humans and machines. With the rapid advancement of Large Language Models
(LLMs), code generation capabilities have begun to significantly reshape
programming practices. This development prompts a central question: Have LLMs
transformed code style, and how can such transformation be characterized? In
this paper, we present a pioneering study that investigates the impact of LLMs
on code style, with a focus on naming conventions, complexity, maintainability,
and similarity. By analyzing code from over 19,000 GitHub repositories linked
to arXiv papers published between 2020 and 2025, we identify measurable trends
in the evolution of coding style that align with characteristics of
LLM-generated code. For instance, the proportion of snake_case variable names
in Python code increased from 47% in Q1 2023 to 51% in Q1 2025. Furthermore, we
investigate how LLMs approach algorithmic problems by examining their reasoning
processes. Given the diversity of LLMs and usage scenarios, among other
factors, it is difficult or even impossible to precisely estimate the
proportion of code generated or assisted by LLMs. Our experimental results
provide the first large-scale empirical evidence that LLMs affect real-world
programming style.
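
To make the naming-convention statistic concrete, the following is a minimal sketch of how a snake_case share could be measured for a single Python file. It is not the authors' pipeline, which the abstract does not describe. The helper names `assigned_names` and `snake_case_share` are hypothetical, and the sketch assumes that "snake_case" means a lowercase identifier containing at least one underscore and that only assignment targets count as variable names.

```python
import ast
import re

# Assumption: a snake_case name is lowercase and contains at least one
# underscore separating non-empty parts (e.g. "total_count").
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")


def assigned_names(source: str) -> list[str]:
    """Collect identifiers that are written to (assignment and loop targets)."""
    tree = ast.parse(source)
    return [
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    ]


def snake_case_share(source: str) -> float:
    """Return the fraction of assigned names that match SNAKE_CASE."""
    names = assigned_names(source)
    if not names:
        return 0.0
    matches = sum(1 for name in names if SNAKE_CASE.match(name))
    return matches / len(names)


if __name__ == "__main__":
    sample = "total_count = 0\nmaxValue = 1\nx = 2\nuser_name = 'a'\n"
    print(snake_case_share(sample))
```

Under these assumptions, the sample contains four assigned names, two of which are snake_case. Aggregating such per-file shares by repository and by quarter would yield a trend line comparable in spirit to the 47% and 51% figures quoted above, although the exact values depend on how names are extracted and classified.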