Multi-scenario species distribution modeling.
Correlative species distribution models (SDMs) are increasingly used to predict suitable insect habitats. They are also widely criticized for prediction discrepancies among different SDMs for the same species and for the lack of effective communication about SDM prediction uncertainty. In this paper, we undertook a factorial study to investigate the effects of various modeling components (species training datasets, predictor variables, dimension-reduction methods, and model types) on the accuracy of SDM predictions, with the aim of identifying sources of discrepancy and uncertainty. Among the modeling components tested, model type was the major factor causing variation in species-distribution predictions. Different combinations of modeling components could also significantly increase or decrease the performance of a model. This result indicates the importance of keeping the other modeling components constant when comparing SDM results. With all other modeling components held constant, machine-learning models seemed to outperform other model types. On average, the Hierarchical Non-Linear Principal Components Analysis dimension-reduction method improved model performance more than the other methods tested. The widely used confusion-matrix-based model-performance indices, such as the area under the receiver operating characteristic curve (AUC), sensitivity, and Kappa, do not necessarily help select the best model from a set of models when the differences in performance among them are small. To conclude, discrepancies among model results do not necessarily indicate a lack of robustness in correlative modeling, as they can also arise from inappropriate selection of modeling components. In addition, more research on model performance evaluation is required to develop robust and sensitive model evaluation methods.
Undertaking multi-scenario species-distribution modeling, where possible, is likely to mitigate errors arising from inappropriate selection of modeling components and to provide end users with better information on the uncertainty of the resulting model predictions.
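
As an illustration only, the factorial (multi-scenario) design and two of the confusion-matrix-based indices mentioned above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: the factor levels (`dataset_A`, `model_P`, and so on) and the function names are hypothetical placeholders, since only the four factors themselves are named here. Sensitivity and Cohen's Kappa follow their standard definitions for a 2x2 presence/absence confusion matrix.

```python
from itertools import product

# Hypothetical factor levels: the four factors come from the text,
# but the level names below are placeholders for illustration.
FACTORS = {
    "training_dataset": ["dataset_A", "dataset_B"],
    "predictor_set": ["predictors_1", "predictors_2"],
    "dimension_reduction": ["none", "reduction_X"],
    "model_type": ["model_P", "model_Q", "model_R"],
}


def enumerate_scenarios(factors):
    """Return every combination of factor levels (a full factorial design)."""
    names = list(factors)
    return [
        dict(zip(names, combo))
        for combo in product(*(factors[name] for name in names))
    ]


def sensitivity(tp, fn):
    """Proportion of observed presences that were predicted as presences."""
    return tp / (tp + fn)


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's Kappa: agreement between prediction and observation,
    corrected for the agreement expected by chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (observed - expected) / (1 - expected)


# One scenario per combination of modeling components.
scenarios = enumerate_scenarios(FACTORS)
```

Fitting and scoring a model for each entry of `scenarios`, then comparing scores only among scenarios that differ in a single factor, would be one way to keep the remaining modeling components constant during comparison.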