BACKGROUND: The specification of metadata in clinical and epidemiological study projects absorbs significant expense. The validity and quality of the collected data depend heavily on the precise and semantical correct representation of their metadata...
Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess en...
The international journal of biostatistics
27227715
Consider one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target we partition the sample in V equal size sub-samples...
Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs....
Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth),...
Binary classification problems are ubiquitous in health and social sciences. In many cases, one wishes to balance two competing optimality considerations for a binary classifier. For instance, in resource-limited settings, an human immunodeficiency v...