Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting
J.A. Fernandes, X. Irigoien, J.A. Lozano, I. Inza, A. Perez
Environmental Modelling & Software, 40, pp. 245-254, (2013)
Keywords
Supervised classification, Multi-dimensional classification, Bayesian networks, Missing imputation, Discretization, Feature subset selection, Environmental modelling, Recruitment forecasting
Abstract
A multi-species approach to fisheries management requires taking into account the interactions between species in order to improve recruitment forecasting of the fish species. Recent advances in Bayesian networks make it possible to learn models in which several interrelated variables are forecast simultaneously. These models are known as multi-dimensional Bayesian network classifiers (MDBNs). Pre-processing steps are critical for the subsequent learning of the model in these kinds of domains. Therefore, in the present study, a set of state-of-the-art uni-dimensional pre-processing methods, within the categories of missing data imputation, feature discretization and feature subset selection, is adapted for use with MDBNs. A framework that includes the proposed multi-dimensional supervised pre-processing methods, coupled with an MDBN classifier, is tested with synthetic datasets and the real domain of fish recruitment forecasting. The rate at which all three fish species (anchovy, sardine and hake) are forecast correctly at the same time rises from 17.3% to 29.5%, a roughly 1.7-fold increase, when the multi-dimensional approach is used instead of mono-species models. The probability assessments also improve markedly, with the average error (estimated by means of the Brier score) falling from 0.35 to 0.27. Finally, these improvements are larger than those obtained when the species are forecast in pairs.
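
The two evaluation measures quoted in the abstract can be illustrated with a minimal Python sketch. It assumes that "correctly forecasting three species simultaneously" means an instance counts as a hit only when every class variable is predicted correctly, and it uses the original multi-class form of the Brier score (sum of squared differences over classes, averaged over instances). The abstract does not state which Brier variant the authors used, and the function names and toy data below are hypothetical, not taken from the paper.

```python
from typing import Sequence


def joint_accuracy(y_true: Sequence[Sequence[str]],
                   y_pred: Sequence[Sequence[str]]) -> float:
    """Fraction of instances in which ALL class variables are predicted correctly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return hits / len(y_true)


def brier_score(probs: Sequence[Sequence[float]],
                true_idx: Sequence[int]) -> float:
    """Multi-class Brier score: mean over instances of sum_k (p_k - o_k)^2,
    where o_k is 1 for the observed class and 0 otherwise (an assumed variant)."""
    total = 0.0
    for p, k in zip(probs, true_idx):
        total += sum((pk - (1.0 if j == k else 0.0)) ** 2
                     for j, pk in enumerate(p))
    return total / len(probs)


# Hypothetical toy data: one tuple per year, ordered (anchovy, sardine, hake).
y_true = [("low", "high", "medium"), ("high", "high", "low"),
          ("low", "low", "low"), ("medium", "low", "high")]
y_pred = [("low", "high", "medium"), ("high", "low", "low"),
          ("low", "low", "low"), ("medium", "low", "low")]

# Hypothetical predicted probabilities for one species over three recruitment levels.
probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
true_idx = [0, 2]

acc = joint_accuracy(y_true, y_pred)   # 2 of the 4 years match on all three species
bs = brier_score(probs, true_idx)      # approximately 0.5 for this toy input
```

Because a single wrong species makes the whole instance count as a miss, joint accuracy is a much stricter measure than per-species accuracy, which is why figures such as 17.3% and 29.5% can look low at first glance.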
DOI: 10.1016/j.envsoft.2012.10.001