DOI: 10.5194/hess-24-2505-2020
Title:
Systematic comparison of five machine-learning models in classification and interpolation of soil particle size fractions using different transformed data
Authors: Zhang M.; Shi W.; Xu Z.
Journal: Hydrology and Earth System Sciences
ISSN: 1027-5606
Publication year: 2020
Volume: 24, Issue: 5
First page: 2505
Last page: 2526
Language: English
Scopus keywords: Adaptive boosting ; Classification (of information) ; Decision trees ; Economic and social effects ; Forecasting ; Interpolation ; Mean square error ; Metadata ; Multilayer neural networks ; Nearest neighbor search ; Particle size ; Silt ; Support vector machines ; Textures ; K nearest neighbours (k-NN) ; Log-ratio transformations ; Machine learning models ; Multilayer perceptron neural networks ; Residual sum of squares ; Root mean square errors ; Soil particle-size fractions ; Spearman rank correlation ; Learning systems ; accuracy assessment ; classification ; computer simulation ; data assimilation ; machine learning ; particle size ; prediction ; size structure ; soil texture
Abstract: Soil texture and soil particle size fractions (PSFs) play an increasing role in physical, chemical, and hydrological processes. Many previous studies have used machine-learning and log-ratio transformation methods for soil texture classification and soil PSF interpolation to improve the prediction accuracy. However, few reports have systematically compared their performance with respect to both classification and interpolation. Here, five machine-learning models - K-nearest neighbour (KNN), multilayer perceptron neural network (MLP), random forest (RF), support vector machines (SVM), and extreme gradient boosting (XGB) - combined with the original data and three log-ratio transformation methods - additive log ratio (ALR), centred log ratio (CLR), and isometric log ratio (ILR) - were applied to evaluate soil texture and PSFs using both raw and log-ratio-transformed data from 640 soil samples in the Heihe River basin (HRB) in China. The results demonstrated that the log-ratio transformations decreased the skewness of soil PSF data. For soil texture classification, RF and XGB showed better performance with a higher overall accuracy and kappa coefficient. They were also recommended to evaluate the classification capacity of imbalanced data according to the area under the precision-recall curve (AUPRC). For soil PSF interpolation, RF delivered the best performance among five machine-learning models with the lowest root-mean-square error (RMSE; sand had an RMSE of 15.09 %, silt was 13.86 %, and clay was 6.31 %), mean absolute error (MAE; sand had an MAE of 10.65 %, silt was 9.99 %, and clay was 5.00 %), Aitchison distance (AD; 0.84), and standardized residual sum of squares (STRESS; 0.61), and the highest Spearman rank correlation coefficient (RCC; sand was 0.69, silt was 0.67, and clay was 0.69). STRESS was improved by using log-ratio methods, especially for CLR and ILR.
Prediction maps from both direct and indirect classification were similar in the middle and upper reaches of the HRB. However, indirect classification maps using log-ratio-transformed data provided more detailed information in the lower reaches of the HRB. There was a pronounced improvement of 21.3 % in the kappa coefficient when using indirect methods for soil texture classification compared with direct methods. RF was recommended as the best strategy among the five machine-learning models, based on the accuracy evaluation of the soil PSF interpolation and soil texture classification, and ILR was recommended for component-wise machine-learning models without multivariate treatment, considering the constrained nature of compositional data. In addition, XGB was preferred over other models when the trade-off between the accuracy and runtime was considered. Our findings provide a reference for future work with respect to the spatial prediction of soil PSFs and texture using machine-learning models with skewed distributions of soil PSF data over a large area.
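The abstract names three log-ratio transformations (ALR, CLR, ILR) and the Aitchison distance, but this record does not reproduce their formulas. The following is a minimal sketch based on the standard textbook definitions for compositional data, not the authors' code. The function names, the choice of clay as the ALR denominator, the pivot basis used for ILR, and the sample values are all assumptions made for illustration; the paper may have used different choices.

```python
import numpy as np

def closure(x):
    """Rescale each composition so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def alr(x):
    """Additive log ratio: log of each part over the last part.
    Assumption: parts are ordered (sand, silt, clay), so clay is the denominator."""
    x = closure(x)
    return np.log(x[..., :-1] / x[..., -1:])

def clr(x):
    """Centred log ratio: log of each part over the geometric mean of all parts."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def ilr(x):
    """Isometric log ratio with a pivot (sequential binary partition) basis.
    Assumption: this is one common orthonormal basis; others are equally valid."""
    logx = np.log(closure(x))
    d = logx.shape[-1]
    out = np.empty(logx.shape[:-1] + (d - 1,))
    for i in range(1, d):
        # Balance between the geometric mean of the first i parts and part i+1.
        gm_log = logx[..., :i].mean(axis=-1)
        out[..., i - 1] = np.sqrt(i / (i + 1)) * (gm_log - logx[..., i])
    return out

def aitchison_distance(a, b):
    """Aitchison distance: Euclidean distance between CLR-transformed compositions."""
    return np.linalg.norm(clr(a) - clr(b), axis=-1)

# Hypothetical PSF samples in percent (sand, silt, clay); not data from the paper.
# Zero parts would need special handling before taking logs.
observed = np.array([[40.0, 40.0, 20.0], [70.0, 20.0, 10.0]])
predicted = np.array([[45.0, 35.0, 20.0], [60.0, 25.0, 15.0]])

print("ALR:", alr(observed))
print("CLR:", clr(observed))
print("ILR:", ilr(observed))
print("Aitchison distance per sample:", aitchison_distance(observed, predicted))
```

Because ILR coordinates come from an orthonormal basis, the Euclidean distance between two ILR vectors equals the Aitchison distance between the original compositions, which is one reason ILR suits models that treat each coordinate separately.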
Citation statistics:
Times cited [WOS]: 32
Resource type: Journal article
Identifier: http://119.78.100.158/handle/2HF3EXSE/162696
Appears in Collections: Climate Change and Strategy

Author affiliations: Zhang, M., Key Laboratory of Land Surface Pattern and Simulation, State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, 100101, China, School of Earth Sciences and Resources, China University of Geosciences, Beijing, 100083, China; Shi, W., Key Laboratory of Land Surface Pattern and Simulation, State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, 100101, China, College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China; Xu, Z., State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China

Recommended Citation:
Zhang M., Shi W., Xu Z. Systematic comparison of five machine-learning models in classification and interpolation of soil particle size fractions using different transformed data[J]. Hydrology and Earth System Sciences, 2020-01-01, 24(5)