Marine big data-driven ensemble learning for estimating global phytoplankton group composition over two decades (1997–2020)
dc.contributor.author | Zhang, Y | |
dc.contributor.author | Shen, F | |
dc.contributor.author | Sun, X | |
dc.contributor.author | Tan, K | |
dc.date.accessioned | 2023-05-16T16:00:23Z | |
dc.date.issued | 2023-05-12 | |
dc.date.updated | 2023-05-15T16:54:56Z | |
dc.description.abstract | Accurate monitoring of the spatial-temporal distribution and variability of phytoplankton group (PG) composition is of vital importance in better understanding of marine ecosystem dynamics and biogeochemical cycles. While existing bio-optical algorithms provide valuable information, relying solely on satellite ocean color data remains insufficient to obtain high-precision retrieval of PG due to the intricate nature of the bio-optical signal and PG composition itself. An interdisciplinary approach combining advancements in machine learning with big data from ocean observations and simulations offers a promising avenue for more accurate quantification of PG composition. In this study, an ensemble learning approach, called the spatial-temporal-ecological ensemble (STEE) model, is developed to construct a robust prediction model for eight distinct phytoplankton groups (i.e., Diatoms, Dinoflagellates, Haptophytes, Pelagophytes, Cryptophytes, Green Algae, Prokaryotes, and Prochlorococcus). The proposed method introduces multiple data simultaneously: ocean color, physical oceanographic, biogeochemical, and spatial and temporal information. An ensemble strategy is applied to increase the performance of the model by merging three advanced machine-learning algorithms. The combined validation of multiple cross-validation (CV) strategies (i.e., standard, spatial block, and temporal block CVs) shows that the proposed STEE model has superior robustness and generalization ability. In addition, the analysis shows a high degree of concordance between the independent datasets and the modeled estimations for long-time series sites, indicating that the STEE model is capable of effectively monitoring long-term trends in phytoplankton group composition. Finally, the proposed model was utilized to retrieve global monthly phytoplankton group products (STEE-PG) over an extended period (September 1997 to May 2020), and comparisons demonstrated better rationality of spatio-temporal distribution than existing satellite-derived phytoplankton group products. Hence, this new model comprehensively integrates all kinds of observation data and yields long-term global PG products with high accuracy, which will enhance our understanding of the response of marine ecosystems to environmental and climate change. | en_GB |
dc.description.sponsorship | National Natural Science Foundation of China | en_GB |
dc.format.extent | 113596-113596 | |
dc.identifier.citation | Vol. 294, article 113596 | en_GB |
dc.identifier.doi | https://doi.org/10.1016/j.rse.2023.113596 | |
dc.identifier.grantnumber | 42076187 | en_GB |
dc.identifier.grantnumber | 42271348 | en_GB |
dc.identifier.uri | http://hdl.handle.net/10871/133153 | |
dc.identifier | ORCID: 0000-0003-4855-6692 (Sun, Xuerong) | |
dc.language.iso | en_US | en_GB |
dc.publisher | Elsevier | en_GB |
dc.rights.embargoreason | Under embargo until 12 May 2024 in compliance with publisher policy | en_GB |
dc.rights | © 2023. This version is made available under the CC-BY-NC-ND 4.0 license: https://creativecommons.org/licenses/by-nc-nd/4.0/ | en_GB |
dc.subject | Phytoplankton group composition | en_GB |
dc.subject | HPLC pigments | en_GB |
dc.subject | Marine big data | en_GB |
dc.subject | Artificial intelligence | en_GB |
dc.subject | Ensemble learning | en_GB |
dc.title | Marine big data-driven ensemble learning for estimating global phytoplankton group composition over two decades (1997–2020) | en_GB |
dc.type | Article | en_GB |
dc.date.available | 2023-05-16T16:00:23Z | |
dc.identifier.issn | 0034-4257 | |
exeter.article-number | 113596 | |
dc.description | This is the author accepted manuscript. The final version is available from Elsevier via the DOI in this record | en_GB |
dc.description | Data availability: Data will be made available on request. | en_GB |
dc.identifier.journal | Remote Sensing of Environment | en_GB |
dc.relation.ispartof | Remote Sensing of Environment, 294 | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | en_GB |
dcterms.dateAccepted | 2023-04-17 | |
rioxxterms.version | AM | en_GB |
rioxxterms.licenseref.startdate | 2023-05-12 | |
rioxxterms.type | Journal Article/Review | en_GB |
refterms.dateFCD | 2023-05-16T15:56:05Z | |
refterms.versionFCD | AM | |
refterms.dateFOA | 2024-05-11T23:00:00Z | |
refterms.panel | B | en_GB |
refterms.dateFirstOnline | 2023-05-12 | |