Patient Similarity Computation for Clinical Decision Support: An Efficient Use of Data Transformation, Combining Static and Time Series Data
Journal:
arXiv
Published Date:
Jun 8, 2025
Abstract
Patient similarity computation (PSC) is a fundamental problem in healthcare
informatics. The aim of the patient similarity computation is to measure the
similarity among patients according to their historical clinical records, which
helps to improve clinical decision support. This paper presents a novel
distributed patient similarity computation (DPSC) technique based on data
transformation (DT) methods, utilizing an effective combination of time series
and static data. Time series data are sensor-collected patients' information,
including metrics like heart rate, blood pressure, Oxygen saturation,
respiration, etc. The static data are mainly patient background and demographic
data, including age, weight, height, gender, etc. Static data has been used for
clustering the patients. Before feeding the static data to the machine learning
model adaptive Weight-of-Evidence (aWOE) and Z-score data transformation (DT)
methods have been performed, which improve the prediction performances. In
aWOE-based patient similarity models, sensitive patient information has been
processed using aWOE which preserves the data privacy of the trained models. We
used the Dynamic Time Warping (DTW) approach, which is robust and very popular,
for time series similarity. However, DTW is not suitable for big data due to
the significant computational run-time. To overcome this problem, distributed
DTW computation is used in this study. For Coronary Artery Disease, our DT
based approach boosts prediction performance by as much as 11.4%, 10.20%, and
12.6% in terms of AUC, accuracy, and F-measure, respectively. In the case of
Congestive Heart Failure (CHF), our proposed method achieves performance
enhancement up to 15.9%, 10.5%, and 21.9% for the same measures, respectively.
The proposed method reduces the computation time by as high as 40%.