Events

Statistics and Data Science seminars: Prof Duncan Lee (University of Glasgow)

A spatial autoregressive random forest algorithm for small-area spatial prediction


Event details

When spatial areal unit data contain missing or suppressed values, models are needed that can predict the unavailable observations. This is typically done with statistical spatial smoothing models fitted in a Bayesian hierarchical framework. These models capture any unexplained residual spatial autocorrelation in the data through conditional autoregressive (CAR) or spatial autoregressive (SAR) priors applied to a set of random effects. In contrast, typical machine learning approaches such as random forests or neural networks ignore this residual autocorrelation and instead base predictions on complex non-linear feature-target relationships. In this paper we propose SPAR-Forest, a novel spatial prediction algorithm that fuses random forests with spatial smoothing models. By iteratively refitting a random forest combined with a Bayesian CAR or SAR model in one algorithm, SPAR-Forest can incorporate flexible feature-target relationships while still accounting for the residual spatial autocorrelation. Our results, based on a Scottish property price data set and multiple simulated data sets, show that SPAR-Forest outperforms Bayesian CAR / SAR models, random forests, and state-of-the-art hybrid approaches, including geographical random forests, for small-area spatial prediction.
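
The abstract does not give the details of SPAR-Forest, so the following is only a rough sketch of the general idea of alternating a random forest with a spatial smoother. It is not the authors' algorithm. It assumes scikit-learn and NumPy are available, and it replaces the Bayesian CAR / SAR model with a simple neighbour-averaging smoother of the forest's out-of-bag residuals. All function names, the toy lattice and the simulated data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def grid_adjacency(n_side):
    """Binary neighbourhood matrix for an n_side x n_side lattice (shared edges)."""
    n = n_side * n_side
    W = np.zeros((n, n))
    for r in range(n_side):
        for c in range(n_side):
            k = r * n_side + c
            if r + 1 < n_side:
                W[k, k + n_side] = W[k + n_side, k] = 1
            if c + 1 < n_side:
                W[k, k + 1] = W[k + 1, k] = 1
    return W


def smooth_residuals(resid, W, observed, n_sweeps=50):
    """Crude stand-in for a CAR/SAR fit (an assumption, not a Bayesian model).

    Observed areas average their own residual with their neighbours' mean
    effect; unobserved areas borrow the neighbours' mean effect only.
    """
    phi = np.zeros(len(resid))
    n_neigh = np.maximum(W.sum(axis=1), 1)
    for _ in range(n_sweeps):
        neigh_mean = W @ phi / n_neigh
        phi = np.where(observed, 0.5 * resid + 0.5 * neigh_mean, neigh_mean)
    return phi


def alternating_forest_sketch(X, y, W, n_iter=5, seed=0):
    """Alternate a random forest fit with residual smoothing; NaN in y = missing."""
    observed = ~np.isnan(y)
    phi = np.zeros(len(y))
    for _ in range(n_iter):
        rf = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=seed)
        # The forest models the target after removing the current spatial effect.
        rf.fit(X[observed], y[observed] - phi[observed])
        # Out-of-bag residuals avoid the near-zero in-sample residuals of a forest.
        resid = np.zeros(len(y))
        resid[observed] = y[observed] - rf.oob_prediction_
        phi = smooth_residuals(resid, W, observed)
    return rf.predict(X) + phi


# Toy usage: simulated data on an 8 x 8 lattice with 10 suppressed values.
rng = np.random.default_rng(0)
n_side = 8
W = grid_adjacency(n_side)
X = rng.normal(size=(n_side**2, 2))
rows = np.repeat(np.arange(n_side), n_side)  # induces a north-south spatial trend
y = np.sin(X[:, 0]) + 0.3 * rows + rng.normal(scale=0.1, size=n_side**2)
y_missing = y.copy()
y_missing[rng.choice(n_side**2, size=10, replace=False)] = np.nan
pred = alternating_forest_sketch(X, y_missing, W)
```

Here `pred` holds a prediction for every area, including those whose value was set to missing. A faithful implementation would replace `smooth_residuals` with a fitted Bayesian CAR or SAR model.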