IRIS Institutional Research Catalogue of the Università degli Studi del Molise

Methods for variable selection in LiDAR-assisted forest inventories

Moser, Paolo; Vibrans, Alexander C.; McRoberts, Ronald E.; Næsset, Erik; Gobakken, Terje; Chirici, Gherardo; Mura, Matteo; Marchetti, Marco

2017-01-01

Abstract

Estimation of wood volume and biomass is an important task of any National Forest Inventory. However, the estimation process is often expensive, laborious and sometimes imprecise because of small sample sizes relative to population variability. Remote sensing techniques are an option to assist in surveying large areas by providing data that can be related to the forest attribute of interest through mathematical models of relationships. Light Detection and Ranging (LiDAR) is a technology that can provide data that are closely related to forest wood volume and biomass. With these data, linear regression is often used to estimate forest attributes. If the relationship provides evidence of nonlinearity, a transformation of the variables can be considered. However, modern computation allows fitting nonlinear regression models without transformations of the variables. Nonlinear least squares (NLS) techniques also give more freedom to ensure that natural conditions such as non-negativity and/or lower and upper asymptotes are satisfied. Like any estimation technique, NLS is subject to overfitting when using a large number of predictor variables. Because NLS is more computationally intensive than linear regression, stepwise selection techniques may require considerable programming effort. We compared three methods to select predictor variables for nonlinear models of relationships between forest attributes and LiDAR metrics, two of them based on genetic algorithms (GAs) and one based on random forest (RF). GAs were implemented to optimize a cost function that yields root mean square error or the Akaike Information Criterion (AIC), while RF was based on variable importance in decision trees. A model with the predictor variable most correlated with the response variable was also considered. We compared the results of overall estimation for two datasets using the model-assisted, generalized regression estimator and concluded that the combination of GAs and AIC was the most efficient and stable procedure for selection of variables. We attribute this result to the penalty that AIC applies to models with large numbers of variables, which leads to a more efficient model with a minimum loss of information.

	UT ISI code: WOS:000391387500011
	DOI: https://dx.doi.org/10.1093/forestry/cpw041
	Scopus code: 2-s2.0-85016045269
	Appears in types: 1.1 Journal article

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11695/61303
