Marine big data-driven ensemble learning for estimating global phytoplankton group composition over two decades (1997–2020)
Zhang, Y; Shen, F; Sun, X; et al.; Tan, K
Date: 12 May 2023
Article
Journal: Remote Sensing of Environment
Publisher: Elsevier
Publisher DOI
Abstract
Accurate monitoring of the spatial-temporal distribution and variability of phytoplankton group (PG) composition is vital to a better understanding of marine ecosystem dynamics and biogeochemical cycles.
While existing bio-optical algorithms provide valuable information, relying solely on satellite ocean color data
remains insufficient to obtain high-precision retrieval of PG due to the intricate nature of the bio-optical signal
and PG composition itself. An interdisciplinary approach combining advancements in machine learning with big
data from ocean observations and simulations offers a promising avenue for more accurate quantification of PG
composition. In this study, an ensemble learning approach, called the spatial-temporal-ecological ensemble
(STEE) model, is developed to construct a robust prediction model for eight distinct phytoplankton groups (i.e.,
Diatoms, Dinoflagellates, Haptophytes, Pelagophytes, Cryptophytes, Green Algae, Prokaryotes, and Prochlorococcus). The proposed method draws on multiple data sources simultaneously: ocean color, physical oceanographic, biogeochemical, and spatial and temporal information. An ensemble strategy is applied to increase the
performance of the model by merging three advanced machine-learning algorithms. The combined validation of
multiple cross-validation (CV) strategies (i.e., standard, spatial block, and temporal block CVs) shows that the
proposed STEE model has superior robustness and generalization ability. In addition, the analysis shows a high
degree of concordance between the independent datasets and the modeled estimations for long-time series sites,
indicating that the STEE model is capable of effectively monitoring long-term trends in phytoplankton group
composition. Finally, the proposed model was utilized to retrieve global monthly phytoplankton group products
(STEE-PG) over an extended period (September 1997 to May 2020), and comparisons showed a more plausible spatio-temporal distribution than existing satellite-derived phytoplankton group products. Hence,
this new model comprehensively integrates multiple types of observation data and yields long-term global PG products
with high accuracy, which will enhance our understanding of the response of marine ecosystems to environmental and climate change.
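The validation scheme named in the abstract (spatial block and temporal block CV) and the merging of several models can be illustrated with a minimal sketch. This is not the authors' implementation. The 10-degree grid, the one-year blocks, the greedy fold assignment, and the unweighted averaging of model outputs are all assumptions made here for illustration, and every function name is hypothetical.

```python
from collections import defaultdict


def spatial_block_id(lat, lon, block_deg=10.0):
    """Map a coordinate to a coarse grid cell (assumed 10-degree blocks)."""
    row = int((lat + 90.0) // block_deg)
    col = int((lon + 180.0) // block_deg)
    return (row, col)


def temporal_block_id(year, block_years=1):
    """Map a sampling year to a time block (assumed one-year blocks)."""
    return year // block_years


def grouped_folds(groups, n_folds=5):
    """Yield (train_idx, test_idx) with each block kept entirely in one fold."""
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    # Greedy balancing: place the largest blocks first into the smallest fold.
    folds = [[] for _ in range(n_folds)]
    for g in sorted(members, key=lambda g: (-len(members[g]), g)):
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(members[g])
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j in range(n_folds) if j != k for i in folds[j])
        yield train, test


def ensemble_mean(predictions):
    """Merge per-sample predictions from several models by unweighted averaging."""
    n_models = len(predictions)
    return [sum(vals) / n_models for vals in zip(*predictions)]


# Usage: hypothetical sample locations grouped into spatial blocks.
samples = [(10.0, 20.0), (12.0, 22.0), (-45.0, 100.0), (-44.0, 101.0)]
groups = [spatial_block_id(lat, lon) for lat, lon in samples]
for train_idx, test_idx in grouped_folds(groups, n_folds=2):
    print("train:", train_idx, "test:", test_idx)
```

Keeping every block wholly inside a single fold means the test samples come from regions (or years) the model has not seen during training, which is what distinguishes block CV from standard random CV. Passing `temporal_block_id` values as `groups` gives the temporal variant.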
Earth and Environmental Science
Faculty of Environment, Science and Economy
Except where otherwise noted, this item's licence is described as © 2023. This version is made available under the CC-BY-NC-ND 4.0 license: https://creativecommons.org/licenses/by-nc-nd/4.0/