A machine learning multimodal profiling of Per- and Polyfluoroalkyls (PFAS) distribution across animal species organs via clustering and dimensionality reduction techniques.
Journal:
Food research international (Ottawa, Ont.)
PMID:
40356129
Abstract
Per- and polyfluoroalkyl substances (PFAS) contamination in aquatic and terrestrial organisms poses significant environmental and health risks. This study quantified 15 PFAS compounds in a range of tissues (liver, kidney, gill, muscle, skin, lung, blood, breast, feather) from fish (Clarias gariepinus, Oreochromis niloticus, Lates niloticus, Tilapia zilli), livestock (camel, cow, sheep, ram, goat), and birds (pigeon, chicken, turkey). Among the fish, C. gariepinus exhibited the highest PFAS accumulation, with PFOA (46.5 ng/g) and PFTrDA (50.1 ng/g) dominant in liver and kidney, while O. niloticus showed elevated PFTrDA (56.87 ng/g) and PFUnDA (29.43 ng/g). In livestock, camel liver contained high PFNA (9.22 ng/g), and cow liver had the highest PFOS (8.1 ng/g). Among the birds, pigeon liver showed the highest PFNA (7.83 ng/g). To analyze PFAS distribution patterns, dimensionality reduction and clustering techniques were employed. Principal Component Analysis (PCA) captured 68.28% of the total variance and revealed two distinct clusters: fish species were strongly associated with higher PFAS concentrations, while poultry showed PFAS profiles distinct from those of the other meat types. Clustering of PFOS, PFOA, and other PFAS compounds near the center indicated their influence across meat types in general, and fish species in particular, while t-Distributed Stochastic Neighbor Embedding (t-SNE) confirmed clear separations in high-dimensional space. Clustering analyses, including K-means, hierarchical clustering, DBSCAN, and Gaussian Mixture Models (GMM), identified well-defined patterns, with DBSCAN and GMM detecting overlapping categories and outliers. Feature importance analysis using a Random Forest model highlighted total PFAS as the most significant predictor, with PFHxA and PFDoDA also contributing strongly, while organ type and species played a lesser role.
These findings demonstrate the effectiveness of unsupervised learning techniques in characterizing PFAS bioaccumulation patterns across species and tissues, providing valuable information for ecological and toxicological risk assessments.
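The workflow summarized above (PCA, then clustering, then Random Forest feature importance) can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic stand-ins rather than the study's measurements, and the log transform, standardization, number of clusters, and the use of a two-group label as the Random Forest target are all assumptions made for illustration, using scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's measurements):
# 60 hypothetical tissue samples x 15 PFAS compounds, in ng/g.
n_per_group, n_compounds = 30, 15
high = rng.lognormal(mean=3.0, sigma=0.4, size=(n_per_group, n_compounds))
low = rng.lognormal(mean=1.0, sigma=0.4, size=(n_per_group, n_compounds))
X = np.vstack([high, low])
# Hypothetical group label: 1 = high-burden samples, 0 = low-burden samples.
group = np.array([1] * n_per_group + [0] * n_per_group)

# Assumed preprocessing: log-transform skewed concentrations, then standardize.
X_scaled = StandardScaler().fit_transform(np.log1p(X))

# Dimensionality reduction: project onto the first two principal components.
pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# Clustering: K-means with two clusters on the PCA scores (k=2 is an assumption).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

# Feature importance: Random Forest on the 15 compounds plus their sum
# ("total PFAS"), predicting the hypothetical group label.
total_pfas = X.sum(axis=1, keepdims=True)
features = np.hstack([X, total_pfas])
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(features, group)
importances = forest.feature_importances_

print(f"Variance explained by 2 PCs: {explained:.2%}")
print("Cluster sizes:", np.bincount(labels))
print("Importance of total PFAS column:", round(float(importances[-1]), 3))
```

With real data, `X` would hold the measured concentrations per sample, and categorical variables such as organ and species would need to be encoded before being passed to the Random Forest.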