Evaluation of Global Data for National-Scale Soil Depth Mapping in Data-Scarce Regions: A Case Study from Sri Lanka
Journal article, 2026
High-resolution soil depth maps are valuable for environmental modelling, yet reliable data remain scarce in the tropics. This study evaluates the feasibility of mapping depth to bedrock (DTB) in Sri Lanka using a legacy dataset (n = 88) and global environmental covariates (n = 247). A machine learning workflow comprising feature selection, hyperparameter tuning, and a stacked ensemble of four algorithms (Random Forest, XGBoost, Cubist, SVM) was used to test the limits of global data for local mapping. Despite extensive optimization, the final ensemble model achieved only R² = 0.197 (RMSE = 35.4 cm) under spatial cross-validation. Although modest, this result significantly outperforms existing global products and quantifies the “prediction gap” inherent in using ~1 km resolution global covariates to model micro-scale soil variability. Log-transforming the target variable was explored initially; after testing, however, the untransformed depth was modelled directly to avoid bias in back-transformation. In a further robustness experiment, reducing the predictors from 24 to 12 degraded performance, confirming that the model captures complex, physically meaningful climatic interactions rather than fitting noise. The study concludes that while global covariates can capture regional meso-scale trends (explaining ~20% of variance), they are insufficient for resolving local micro-relief (<50 m). The resulting map and uncertainty products provide a critical “baseline” for national planning, and they demonstrate that future improvements will require investment in higher-resolution local covariates (e.g., LiDAR) rather than more complex algorithms.
depth-to-bedrock
machine learning
soil depth
Sri Lanka
digital soil mapping
ensemble model
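The abstract reports R² and RMSE under spatial cross-validation. As an illustration of that evaluation idea only, the following is a minimal Python sketch and not the authors' workflow. It assumes square spatial blocks of a user-chosen size, assigns whole blocks (rather than individual points) to folds, and uses a one-covariate least-squares line as a stand-in for the stacked ensemble. All function names, the block size, and the synthetic data are hypothetical.

```python
import math
import random


def block_id(x, y, block_size):
    """Return the ID of the square block that contains point (x, y)."""
    return (math.floor(x / block_size), math.floor(y / block_size))


def spatial_folds(coords, block_size, n_folds, seed=0):
    """Assign whole spatial blocks to folds so nearby points stay together."""
    blocks = sorted({block_id(x, y, block_size) for x, y in coords})
    if len(blocks) < n_folds:
        raise ValueError("fewer occupied blocks than folds")
    random.Random(seed).shuffle(blocks)
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    return [fold_of_block[block_id(x, y, block_size)] for x, y in coords]


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r2(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """One-covariate least squares; a placeholder for the real model."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def spatial_cv(coords, covariate, depth, block_size, n_folds=5, seed=0):
    """Pool out-of-fold predictions across spatial folds and score them."""
    folds = spatial_folds(coords, block_size, n_folds, seed)
    predictions = [0.0] * len(depth)
    for k in range(n_folds):
        train = [i for i, f in enumerate(folds) if f != k]
        test = [i for i, f in enumerate(folds) if f == k]
        slope, intercept = fit_line([covariate[i] for i in train],
                                    [depth[i] for i in train])
        for i in test:
            predictions[i] = intercept + slope * covariate[i]
    return r2(depth, predictions), rmse(depth, predictions), folds


if __name__ == "__main__":
    # Synthetic data only; these are NOT the study's observations.
    rng = random.Random(42)
    coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(88)]
    covariate = [rng.uniform(0, 1) for _ in coords]
    depth = [60 + 40 * c + rng.gauss(0, 35) for c in covariate]
    score, error, _ = spatial_cv(coords, covariate, depth, block_size=25.0)
    print(f"R2 = {score:.3f}, RMSE = {error:.1f} cm")
```

Holding out whole blocks keeps spatially clustered samples from appearing in both the training and test sets, which is why spatially cross-validated scores are typically lower than those from random splits.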