A comprehensive benchmarking of different outlier detection techniques for univariate datasets by Monte Carlo simulations

ERGİN, Malik; Aksu, Yunus; KOŞKAN, Özgür

doi:10.1080/03610918.2025.2571982

Communications in Statistics: Simulation and Computation, 2025 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Publication Date: 2025
DOI: 10.1080/03610918.2025.2571982
Journal Name: Communications in Statistics: Simulation and Computation
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, MathSciNet, zbMATH
Keywords: Analysis of variance, Monte Carlo simulation, Outlier detection, Sample size, Univariate data
Affiliated with Isparta Uygulamalı Bilimler Üniversitesi: Yes

Abstract

In scientific studies, outliers are among the major factors that can alter experimental results, particularly in analysis of variance (ANOVA). To address this problem, researchers have developed various outlier detection techniques. Selecting a suitable technique for practical use is critical to ensure accurate and reliable statistical inference. The aim of this simulation study was to benchmark nine outlier detection techniques (2σ, Z score, Modified Z score, the Median Absolute Deviation rules 2MADE and 3MADE, IQR, Grubbs, Rosner, and HMSDHM) in terms of type I error probability in ANOVA and their ability to detect injected outliers. A Monte Carlo simulation program was constructed using random numbers generated from a normal distribution. Small, medium, and large sample sizes, various numbers of injected outliers, and various outlier magnitudes were considered. The results showed that the 3MADE and Grubbs tests outperformed the others in small samples. For medium samples, the 3MADE and Rosner tests gave reliable results. In large samples, the Z score, Modified Z score, and Rosner tests performed best in terms of both detection accuracy for injected outliers and type I error probability. Overall, the results indicated that the best outlier detection technique depends on the sample size, with 3MADE, Grubbs, and Rosner showing superior performance.
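To make the kind of experiment described in the abstract concrete, the following is a minimal sketch in Python with NumPy, not the authors' simulation program. It covers only three of the nine rules (a kσ/Z score rule, the kMADE rule, and IQR fences) and only the detection side, not the ANOVA type I error analysis. All function names are hypothetical, and the cut-offs (k = 2 or 3, the 1.4826 scaling constant for MADE, 1.5 × IQR fences) and the outlier shift of 6 standard deviations are conventional choices assumed here rather than values taken from the paper.

```python
import numpy as np


def z_score_flags(x, k=3.0):
    """Flag points more than k sample standard deviations from the mean.

    k=2 gives a 2-sigma rule, k=3 a common Z score cut-off (assumed values).
    """
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) > k * x.std(ddof=1)


def made_flags(x, k=3.0):
    """kMADE rule: flag points more than k * MADe from the median.

    MADe = 1.4826 * median(|x - median(x)|); the constant is the usual
    normal-consistency factor, assumed here.
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    made = 1.4826 * np.median(np.abs(x - med))
    return np.abs(x - med) > k * made


def iqr_flags(x, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (fence factor assumed)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def simulate(detector, n=30, n_outliers=2, shift=6.0, reps=2000, seed=0):
    """Monte Carlo estimate of (detection rate, false-flag rate).

    Each replicate draws n standard normal values and shifts the first
    n_outliers of them by `shift` (a hypothetical outlier magnitude).
    """
    rng = np.random.default_rng(seed)
    hits = 0
    false_flags = 0
    for _ in range(reps):
        x = rng.standard_normal(n)
        x[:n_outliers] += shift  # inject outliers
        flags = detector(x)
        hits += int(flags[:n_outliers].sum())
        false_flags += int(flags[n_outliers:].sum())
    detection_rate = hits / (reps * n_outliers)
    false_flag_rate = false_flags / (reps * (n - n_outliers))
    return detection_rate, false_flag_rate
```

For example, `simulate(lambda x: made_flags(x, k=3.0), n=10)` returns the two rates for a 3MADE-style rule in a small sample, and changing `n`, `n_outliers`, or `shift` mimics the sample-size, outlier-count, and magnitude factors the abstract lists.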