From Regression to Machine Learning: Modeling Height–Diameter Relationships in Crimean Juniper Stands Without Calibration Overhead


Diamantopoulou M. J., ÖZÇELİK R., Eler Ü., KOPARAN B.

Forests, vol.16, no.6, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 16 Issue: 6
  • Publication Date: 2025
  • DOI Number: 10.3390/f16060972
  • Journal Name: Forests
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Environment Index, Geobase
  • Keywords: calibration, extreme gradient boost, mixed-effects, quantile regression, random forest, shallow multilayer perceptron, tree height
  • Isparta University of Applied Sciences Affiliated: Yes

Abstract

Accurate modeling of height–diameter (h–d) relationships is critical for forest inventory and management, particularly in complex forest ecosystems such as natural and pure Crimean juniper (Juniperus excelsa Bieb.) stands. This study evaluates both traditional parametric and modern machine learning (ML) approaches to develop reliable h–d models based on 2135 sample trees measured in southern Türkiye. The modeling approaches include fixed-effects (FE), mixed-effects (ME), three quantile regression (QR) models based on three, five, and nine quantile levels, and non-parametric ML methods: shallow multilayer perceptron (S_MLP), extreme gradient boost (XGBoost), and random forest (RF). According to the assessment metrics for the fitting and test datasets, the XGBoost modeling approach achieved the most accurate performance. For the fitting dataset, it achieved root mean square error values of 1.11 m and 1.21 m. For the test dataset, the corresponding error values were 1.16 m and 1.24 m, resulting in the highest accuracy among all models, closely followed by the RF and S_MLP models. A key practical advantage of ML approaches is that they do not depend on calibration scenarios, meaning they can operate without preliminary parameter configuration. In contrast, the ME model showed the highest accuracy among the parametric methods when calibration was applied. When applying ME models, the study recommends calibrating the model by measuring four randomly selected trees per plot to balance prediction accuracy and field sampling effort.
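
The abstract compares models by root mean square error (RMSE) on a fitting and a test dataset. The sketch below illustrates that kind of evaluation with a simple fixed-effects baseline. It is not the authors' method: the power-type curve h = 1.3 + a·d^b, the helper names, the 70/30 split, and the synthetic data are all assumptions made for illustration, since the abstract does not state the equations or the data split used in the study.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (metres here)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def fit_power_hd(d, h, breast_height=1.3):
    """Fit h = breast_height + a * d**b by least squares on the log scale.

    Hypothetical baseline curve, chosen because it can be linearized:
    log(h - breast_height) = log(a) + b * log(d).
    """
    d = np.asarray(d, dtype=float)
    h = np.asarray(h, dtype=float)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept]
    b, log_a = np.polyfit(np.log(d), np.log(h - breast_height), 1)
    return float(np.exp(log_a)), float(b)


def predict_power_hd(d, a, b, breast_height=1.3):
    """Predict total height (m) from diameter at breast height (cm)."""
    return breast_height + a * np.asarray(d, dtype=float) ** b


# Synthetic trees only; these are NOT the study's measurements.
rng = np.random.default_rng(0)
n_trees = 500
d = rng.uniform(8.0, 60.0, size=n_trees)  # diameter, cm
h = 1.3 + 1.2 * d ** 0.55 * np.exp(rng.normal(0.0, 0.1, size=n_trees))  # height, m

# Assumed 70/30 split into fitting and test datasets.
idx = rng.permutation(n_trees)
fit_idx, test_idx = idx[:350], idx[350:]

a, b = fit_power_hd(d[fit_idx], h[fit_idx])
rmse_fit = rmse(h[fit_idx], predict_power_hd(d[fit_idx], a, b))
rmse_test = rmse(h[test_idx], predict_power_hd(d[test_idx], a, b))
print(f"a={a:.3f}, b={b:.3f}, RMSE fit={rmse_fit:.2f} m, RMSE test={rmse_test:.2f} m")
```

Reporting RMSE separately for the fitting and test datasets, as the abstract does, shows whether a model's accuracy carries over to trees it was not fitted on.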