|Matt Amos||Lancaster University||EDS infrastructure|
|Katie Awty-Carroll||Plymouth Marine Laboratory||EDS infrastructure, Extremes, Big (environmental) data analysis|
|Oscar Brousse||University College London||Extremes|
|Jeremy Carter||Lancaster University||EDS infrastructure|
|Amulya Chevuturi||UK Centre for Ecology & Hydrology||Extremes|
|Dan Clarkson (not presenting)||Lancaster University||Extremes|
|Elena Fillola Mayoral||Department of Engineering Mathematics, University of Bristol||EDS infrastructure|
|Matt Fry||UK Centre for Ecology and Hydrology||Net zero|
|Sarah Jones||Lancaster University||Net zero, Resilience|
|Doran Khamis||UKCEH||Net zero|
|Kathryn Leeming||British Geological Survey||EDS infrastructure|
|George Linney||Lancaster University and UKCEH||Resilience|
|David Parkes||Lancaster University||EDS infrastructure, Conceptual/philosophical challenges in representing the environment|
|Joe Phillips||Lancaster University||Extremes, Resilience|
|Peter Radvanyi||University of Glasgow, School of Mathematics and Statistics||EDS infrastructure|
|Michael Tso||UKCEH||Net zero|
|Craig Wilkie||University of Glasgow||EDS infrastructure|
|Kate Wright||Lancaster University||EDS infrastructure|
Ensembles of climate models are routinely used to generate projections of the changing climate. By using multiple models, we can improve the accuracy of projections whilst quantifying projection uncertainty. However, methods for assimilating ensemble projections often fail to propagate uncertainties through the model in a principled manner. Here we present an end-to-end Bayesian statistical model that combines Gaussian processes and Wasserstein barycentres to probabilistically ensemble climate models. This method produces better-constrained climate projections, with realistic and principled uncertainty estimates that are crucial for fairly communicating the possible impacts of important projections. We show that our method compares favourably with commonly used methods, such as weighted means and multi-model means, particularly for uncertainty quantification. Although demonstrated for a climate-specific application, the method is generalisable and applicable to the wider group of ensembles of environmental models.
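As a rough illustration of why a Wasserstein barycentre is attractive for ensembling, the sketch below uses the closed-form result for one-dimensional Gaussians: the 2-Wasserstein barycentre averages the quantile functions, so both the means and the standard deviations are averaged. This is not the end-to-end Bayesian model described in the abstract. The three model projections, their spreads and the equal weights are hypothetical numbers chosen only for the example.

```python
import math

def gaussian_w2_barycentre(means, stds, weights):
    """2-Wasserstein barycentre of 1D Gaussians N(mean_i, std_i^2).

    In one dimension the barycentre averages quantile functions, so the
    result is Gaussian with weighted-average mean and weighted-average std.
    """
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wi * m for wi, m in zip(w, means))
    std = sum(wi * s for wi, s in zip(w, stds))
    return mean, std

def mixture_moments(means, stds, weights):
    """Mean and std of the weighted mixture of the same Gaussians."""
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wi * m for wi, m in zip(w, means))
    second = sum(wi * (s * s + m * m) for wi, m, s in zip(w, means, stds))
    return mean, math.sqrt(second - mean * mean)

# Hypothetical projections from three models (illustrative values only).
MEANS = [2.0, 3.0, 4.0]
STDS = [0.5, 0.5, 0.5]
WEIGHTS = [1.0, 1.0, 1.0]

bary_mean, bary_std = gaussian_w2_barycentre(MEANS, STDS, WEIGHTS)
mix_mean, mix_std = mixture_moments(MEANS, STDS, WEIGHTS)
print(bary_mean, bary_std, mix_mean, mix_std)
```

The barycentre keeps the typical spread of the individual models, whereas a simple mixture also absorbs the disagreement between model means into its variance. Which behaviour is preferable depends on how inter-model spread should be interpreted.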
Katie Awty-Carroll, Pete Bunting, Dan Clewley
A Deep Learning framework for large-scale mangrove monitoring
Mangrove forests are of high biological, ecological, and economic importance, and are some of the most biodiverse and carbon-rich ecosystems on the planet. Despite their significance, mangrove ecosystems are threatened globally by over-exploitation, expansion of aquaculture, and the effects of anthropogenic climate change. These pressures make assessment and monitoring of the world’s mangrove forests a major concern. Multispectral Earth Observation data (e.g. Landsat, Sentinel-2) provide an opportunity for global mangrove monitoring over decadal time scales. However, satellite monitoring efforts are constrained by challenges of data quality and quantity. Mangrove ecosystems are highly dynamic, experiencing rapid tidal and geomorphological change, and cloud contamination of imagery can substantially reduce data frequency. As a result, developing a globally consistent methodology for mangrove mapping is difficult, and there are currently no global estimates of mangrove extent which make use of the full satellite data archive. Utilising the Open Data Cube framework, this project performs an integrated analysis of thirteen study sites chosen to represent a range of mangrove ecosystems. Using the computational capabilities and Artificial Intelligence expertise provided by NEODAAS, a Convolutional Long Short-Term Memory Neural Network has been trained to classify pixels based on spatiotemporal context, decreasing reliance on individual pixel observations and reducing the impact of ephemeral change. The feasibility of transfer learning between Landsat and Sentinel-2 to decrease model training time will also be investigated. In addition to generating more accurate mangrove extent maps, this project demonstrates a transferable pipeline for large-scale mangrove monitoring using Deep Learning.
Cities impact the local climate, and in particular local near-surface air temperatures. The urban heat island (UHI) describes the hotter temperatures generally observed in cities when compared to surrounding rural areas. The UHI can have numerous negative impacts, for example on energy consumption, biodiversity, heat-related morbidity and mortality, and infectious diseases. The UHI is a well-studied phenomenon, yet, to provide mitigative and adaptive urban planning strategies that deal with these issues, a better understanding of the spatio-temporal heterogeneity of the UHI is necessary.
A major obstacle to achieving the latter goal is the accessibility, temporal continuity and spatial density of data. Standards for weather data collection require official monitoring stations to be placed in open areas to ensure consistency, so these stations cannot fully capture the urban environment. As an illustration, the most urbanized active weather station in London is located in St James’s Park, and therefore records a milder UHI than observed during field campaigns in more densely built areas of the city. Urban climatologists are well aware of these issues and have proposed different solutions. In fact, numerous studies have: i) developed denser meteorological station networks during specific months or years to acquire data in more heterogeneous urban environments; ii) designed field surveys to record urban meteorological data through mobile sensors and get snapshots of the spatial heterogeneity of the urban climate; and iii) used satellite remotely sensed climatological data to relate the evolution of land surface characteristics to surface climate variability. However, such studies are always limited by financial, human, and computational resources.
Recently, the concept of Smart Cities has gained attention. Smart Cities continuously record data from citizens, devices, and buildings, thereby enabling analytics that can guide urban development and services towards sustainability. Within this paradigm, personal weather stations have been sold to users primarily for recording outdoor and indoor meteorological data at their homes, to help them understand and react to their home thermal environment, e.g., by changing their heating or cooling hours. But these data are also shared on the cloud and offered to other users, including urban climatologists.
We show the potential of crowdsourced data from citizen weather stations for studying the urban climate, and more specifically urban heat, in London and south-east England. Our study focuses on six recent years (2015-2020), when the citizen weather station network started densifying in south-east England, with a special focus on the summer 2018 heatwave. We first show that crowdsourced weather data can help with studying the urban climate in south-east England for these six years. Then, through innovative data filtering and treatment, we highlight the usability of citizen weather stations for studying the temporal variation of urban heat related to wind-borne heat transport. Finally, we demonstrate that such a network can improve urban heat studies by providing a reliable source of data to bias-correct an urban climate model’s representation of the UHI, using the example of the 2018 summer in London.
Our study provides key perspectives on the utility of crowdsourced weather data for heat-related epidemiological studies in cities, and could lead to the assessment of various interventions designed to reduce urban overheating by combining urban climate modelling with crowdsourced weather data.
Regional Climate Models (RCMs) are the primary source of climate data available for impact studies over Antarctica. These climate models exhibit significant, large-scale biases over Antarctica for variables such as snowfall, surface temperature and melt. Correcting for these biases is desirable for impact models that are driven by meteorological data and aim to produce optimal estimates of, for example, surface run-off and ice discharge. Typical approaches to bias correction often neglect the handling of uncertainties in parameter estimates and do not account for the different supports of climate-model and observed data. Here a fully Bayesian approach using latent Gaussian processes is proposed for bias correction, where parameter uncertainties are propagated through the model. Advantages of this approach are demonstrated by bias-correcting RCM output for snowfall over the Antarctic Peninsula.
Chevuturi A., Tanguy M., Svensson C., Hannaford J.
Drivers of extreme UK drought
Droughts are estimated to be one of the major geophysical hazards under the changing climate. Studies show an increase in future droughts over the UK, but with large uncertainties. Our study aims to understand the drivers of extreme UK droughts and, subsequently, to identify predictors for forecasting droughts, which is essential for improving drought early warning systems and identifying current and future water resource availability. We analyze the driving processes behind drought conditions by evaluating the concurrent and lagged relationships between the UK standardized precipitation index (SPI) and global-scale variables (e.g., sea surface temperature, SST) over a long study period, 1862 to 2015. Using cluster analysis, we identify two regions of the UK (northwest and southeast) with distinct drought characteristics. We analyze the drivers of droughts in these two regions on annual and seasonal timescales. Results show that the northern Atlantic region stands out as a strong concurrent driver of both northwest and southeast UK droughts. The Indian Ocean region also shows a strong relationship with UK SPI, lagged by a few months, for both regions of the UK. The equatorial Pacific shows up as a strong driver of southeast UK droughts but not of northwest droughts. Building upon these results, we are working towards identifying the teleconnection pathways between global SSTs and UK droughts and identifying predictors that could be applied in predictive models to forecast droughts. Our work will be beneficial for seasonal forecasting of drought and long-term water resources planning applications.
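The concurrent and lagged relationships mentioned above can be illustrated with a simple lagged Pearson correlation. The sketch below is a minimal example under stated assumptions: the two series are synthetic stand-ins for an SST index and a regional SPI series, not real data, and the function names are hypothetical. The abstract does not specify which correlation measure the study uses.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_correlation(driver, response, max_lag):
    """Correlate response[t] with driver[t - lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        shifted_driver = driver[:len(driver) - lag]
        aligned_response = response[lag:]
        result[lag] = pearson(shifted_driver, aligned_response)
    return result

# Synthetic stand-ins: the "response" repeats the "driver" three steps later.
driver = [math.sin(0.7 * t) + 0.3 * math.cos(2.1 * t) for t in range(120)]
response = [0.0] * 3 + driver[:-3]

corrs = lagged_correlation(driver, response, max_lag=6)
best_lag = max(corrs, key=lambda lag: corrs[lag])
print(best_lag, round(corrs[best_lag], 3))
```

In practice the same calculation would be repeated for each SST grid cell and each season, and the significance of each correlation would need to account for autocorrelation in the series.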
Daniel Clarkson (unable to attend)
To assess future risk from temperature extremes in the presence of climate change, statistical extreme value models can be used to estimate the frequency, magnitude, and spatio-temporal extent of future extreme events. Standard extreme value methods assume an underlying density function that is monotonically decreasing in the tail, which is not always the case for environmental data sets due to their inherently complex structure. In the context of ice surface temperatures, the data are problematic due to a spike in observations close to, but just below, zero, and a non-negligible number of positive observations. To account for this, we use a Gaussian mixture model as a marginal distribution for a Spatial Conditional Extremes model. Although the Gaussian mixture model is not an approach taken from the traditional extreme value toolkit, it allows estimation of the melt point as a probability rather than using a fixed melt threshold, reflecting uncertainty from both the process and the data. This approach gives us a definition of an extreme temperature that considers the context of melt, and the model shows considerable improvements in fit and predictions over standard extreme value models. The full Spatial Conditional Extremes model allows us to simulate melt events given that we observe an extreme temperature at a particular location, allowing us to analyse the size and magnitude of melt events across the ice sheet.
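One way to read "melt as a probability" is as the posterior probability that an observed temperature belongs to a mixture component concentrated near zero. The sketch below illustrates that reading with a two-component Gaussian mixture. The component weights, means and standard deviations are hypothetical, and the abstract does not state how many components the fitted model uses or how melt is assigned to them.

```python
import math

def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def melt_probability(temp, components, melt_index):
    """Posterior probability that temp came from the 'melt' component.

    components is a list of (weight, mean, std) tuples.
    """
    weighted = [w * normal_pdf(temp, m, s) for w, m, s in components]
    return weighted[melt_index] / sum(weighted)

# Hypothetical mixture: a broad cold component and a narrow spike near 0 degC.
COMPONENTS = [
    (0.8, -12.0, 5.0),  # cold, non-melting surface
    (0.2, -0.5, 0.6),   # spike just below zero, treated here as melt
]

p_cold_day = melt_probability(-15.0, COMPONENTS, melt_index=1)
p_near_zero = melt_probability(-0.5, COMPONENTS, melt_index=1)
print(p_cold_day, p_near_zero)
```

Unlike a fixed threshold, this gives temperatures slightly below zero a graded probability of melt, which is the kind of soft definition the abstract describes.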
Elena Fillola Mayoral, Raul Santos-Rodriguez, Matt Rigby
A machine learning emulator for Lagrangian atmospheric dispersion models
Lagrangian particle dispersion models are commonly used in top-down or inverse greenhouse gas emission inference. In time-reversed mode, these models calculate footprints, which describe the sensitivity of the atmospheric observation to emissions in model grid cells. However, the high computational cost of traditional dispersion models means that the growing volume of GHG data available, in particular from satellites, is underutilised. Here, we develop a machine learning emulator for NAME, the Met Office’s dispersion model, which outputs footprints for a small (~100km x 100km) region around the observation point. The footprint value at each cell is modelled independently using gradient-boosted regression trees (GBRTs) with meteorological variables as inputs. Our emulator can predict the full footprint around five orders of magnitude faster than running the dispersion model (or three orders of magnitude faster when training time is taken into account), while achieving a Pearson’s R of over 0.7 for the predicted CH4 concentration over the area. Moreover, as GBRTs rank the most relevant features for predicting each cell value, the model sheds light on the underlying factors that drive concentration. Our emulator is presented as a proof of concept of the capabilities of machine learning to speed up atmospheric dispersion modelling and could be expanded to 3D outputs to make efficient use of satellite observations.
Land InSight: a digital twin of the UK’s soil carbon and water
Understanding the interactions at the land surface, in particular the roles of soils for water and carbon retention, is critical to the UK’s response to the climate emergency. Soil moisture has a huge influence on both flooding and drought, predicted to be the greatest direct impacts of climate change on the UK. Increasing the uptake of soil carbon through land management will be critical to achieving net-zero targets. We need to improve understanding of land surface processes and predictions of soil behaviour under land management scenarios as we move to warmer summers and wetter winters.
Digital Twins of the Environment (DTEs) are a genuinely new paradigm offering a pathway for the discovery of new knowledge about environmental systems, improving our ability to model and predict the functioning of these systems and provide information for decision making in real-time or for constructed scenarios. Traditionally, there has been a tendency for process-based modelling and data-driven modelling to be isolated activities. Digital twins provide an opportunity for data-driven understanding to challenge process-based models and, conversely, for process understanding to inform data-driven analyses.
The Land InSight digital twin of the UK’s soil carbon and water is addressing a number of challenges in the creation of DTEs including integration of sensor data within real-time environmental modelling, development of hybrid modelling approaches, use of modern IT infrastructures for the implementation of digital twins, and the delivery of useful data and information products to end users.
Exploring pathways to achieving nature positive and carbon neutral land use and food systems in Wales.
The devolved nations of the UK are at a significant juncture regarding environmental, agricultural and land use policy following the UK’s exit from the EU and international and national commitments on climate change and net-zero. How we use and manage land has the potential to contribute to climate targets through acting as a carbon sink. However, the configuration of land that would best achieve nature positive and carbon neutral land use and food systems in Wales remains unclear. Using the FABLE calculator, four pathways towards achieving this were explored. The results indicate that continuing as usual, or making only slight improvements in policy, will not be sufficient for achieving climate targets, and that achieving these targets will require land to become a carbon sink. The results demonstrate that alternative approaches to achieving nature positive and carbon neutral land use and food systems are possible, but can have other consequences for biodiversity. The pathways were co-created with colleagues at the Welsh Government, and it is anticipated that this work will aid further modelling work.
Data-driven approaches to linking soil moisture and soil carbon
Soil moisture has a close relationship with soil carbon. The carbon content of soil affects the response of the moisture content to meteorological input; the moisture content of soil affects the amount and type of respiration that can occur. This work seeks to elucidate this relationship by building a predictive data-driven model of soil moisture using data from the COSMOS-UK sensor network and using it as a predictor of soil organic carbon. We roll this model out across the UK and compare results to the Countryside Survey carbon map. Further, using flux tower observations of night time net ecosystem exchange (of CO2), a model linking soil moisture to respiration is created and used to propose functional forms for respiration in land surface models.
Using Functional Data Analysis for Environmental Time Series
Trends in environmental time series are often investigated using the Mann-Kendall test: a non-parametric approach identifying a general increase or decrease in values. We demonstrate applying functional data analysis to questions of trend and change in time series, allowing for much more flexibility than the typical monotonic approach. We use the funFEM R package to cluster time series recorded at different locations based on their functional representations and show the relevance of the spatial patterns this generates.
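For readers unfamiliar with the Mann-Kendall test, the sketch below implements the basic statistic using the normal approximation and no tie correction, and applies it to two synthetic series. It is a minimal illustration rather than a substitute for a maintained implementation, and it does not reproduce the functional data analysis or the funFEM clustering described above.

```python
import math

def mann_kendall(values):
    """Basic Mann-Kendall trend test (normal approximation, no tie correction).

    Returns (S, Z, two-sided p-value).
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value

# A steadily increasing series: the test detects a clear monotonic trend.
rising = [0.1 * t for t in range(10)]
s_rise, z_rise, p_rise = mann_kendall(rising)

# A V-shaped series: a strong change over time, but no monotonic trend.
v_shaped = [abs(t - 5) for t in range(11)]
s_v, z_v, p_v = mann_kendall(v_shaped)

print(s_rise, round(z_rise, 2), p_rise)
print(s_v, z_v, p_v)
```

The V-shaped series changes substantially over time, yet the increasing and decreasing pairs cancel and S is zero. This is the kind of non-monotonic behaviour that motivates a more flexible functional representation.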
Ecosystem service provision maps are used to show the capacity of a region to provide an ecosystem service. Mapping the provision of ecosystem services is important to inform management and policy decisions to help avoid further loss of these vital services. A wide range of evidence types can be used to map ecosystem service provision. We created ecosystem service provision maps for Europe using evidence from an integrated modelling platform, expert opinion and literature synthesis, for the ecosystem services of timber production, carbon sequestration and aesthetic landscapes. We compared the similarity between the maps produced from the different evidence types and investigated how this similarity changes depending on the ecosystem service mapped. Future ecosystem service provision maps for different climate change and socioeconomic scenario combinations were created using each evidence type, and we investigated how their similarity changes depending on the future scenario mapped. This study highlights the potential impact that the type of evidence underpinning ecosystem service provision maps may have on decision making.
Representing shape in environmental data science using fractal dimension
Shapes in nature are almost never simple. They exhibit complexity on many different levels, resisting reduction to a single length scale or representation with a (reasonably) finite set of simple geometric objects. In scientific representations, shapes are often no more complicated than they need to be to capture the features of an object or process on a scale that is of greatest interest to a given study. With limitations on processing power, observational resolution, data set size, and model complexity, some degree of abstraction is necessary to make processing spatial information tractable. However, while we do (and must) accept simplifications of a complex reality in order to study environmental problems, and typically account for some of the uncertainty that arises from datasets or models with a given resolution, it is rarer that we account for the way these simplifications affect scaling relationships arising from shape in the properties we observe or model, and the biases these impart. In my presentation I intend to visualise this abstract problem in a tangible way with examples of non-integer dimensionality in natural processes and illustrate how this can impact the results of science which uses simplifications that do not respect this dimensionality.
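A box-counting estimate is one common way to make non-integer dimensionality concrete. The sketch below builds a discrete Sierpinski triangle on a grid, counts the occupied boxes at several box sizes, and fits the slope of log(count) against log(1/size). The choice of shape, the grid level and the function names are assumptions made for illustration. They are not taken from the presentation.

```python
import math

def sierpinski_cells(level):
    """Grid cells (x, y) of a discrete Sierpinski triangle on a 2^level grid."""
    size = 2 ** level
    return [(x, y) for x in range(size) for y in range(size) if x & y == 0]

def box_counts(cells, level):
    """Return (box size as a fraction of the domain, occupied boxes) pairs."""
    counts = []
    for j in range(level):
        boxes = {(x >> j, y >> j) for x, y in cells}
        counts.append((2 ** j / 2 ** level, len(boxes)))
    return counts

def fit_dimension(counts):
    """Least-squares slope of log(count) against log(1 / box size)."""
    xs = [math.log(1.0 / size) for size, _ in counts]
    ys = [math.log(n) for _, n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

LEVEL = 7
cells = sierpinski_cells(LEVEL)
dimension = fit_dimension(box_counts(cells, LEVEL))
print(len(cells), round(dimension, 3))
```

The fitted dimension lies between 1 and 2 (log 3 / log 2 for this shape), so quantities that depend on the shape scale differently with resolution than they would for a simple line or filled area.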
Since the 1990s, the melting of Earth’s Polar ice sheets has contributed approximately one-third of global sea level rise. As Earth’s climate warms, this contribution is expected to increase further, leading to the potential for social and economic disruption on a global scale. If we are to begin mitigating these impacts, it is essential that we better understand how Earth’s ice sheets evolve over time.
Currently, our understanding of ice sheet change is largely informed by satellite observations, with the longest continuous record coming from the technique of satellite altimetry. These instruments provide high-resolution measurements of ice sheet surface elevation through time, allowing for estimates of ice sheet volume change and mass balance to be derived. Satellite radar altimeters work by transmitting a microwave pulse towards Earth’s surface and listening to the returned echo, which is recorded in the form of discrete waveforms that encode information about both the ice sheet surface topography and its electromagnetic scattering characteristics. Current methods for converting these waveforms into elevation measurements typically rely on a range of assumptions that are designed to reduce the dimensionality and complexity of the data. As a result, subtle, yet important, information can be lost.
A potential alternative approach for information extraction comes in the application of deep learning algorithms, which have seen enormous success in diverse fields such as oceanography and radar imaging. Such approaches allow for the development of single, data-driven methodologies that can bypass the many successive, human-engineered steps in current processing workflows. Despite this, deep learning has yet to see application in the context of ice sheet altimetry. We are therefore interested in exploring the potential of deep learning to extract deep and subtle information directly from the raw altimeter waveforms themselves, in order to drive new understanding of the contribution of polar ice sheets to global sea level rise. We will present first results from our preliminary analysis, together with a roadmap for the planned activities ahead.
Groundwater Monitoring Network Optimisation: Well Redundancy Analysis
Large-scale monitoring of groundwater quality requires the establishment of well networks. Data from a network are used to model the extent and behaviour of constituents of potential concern (CoPC) in space and time. The fieldwork associated with drilling and sampling of wells incurs health and safety risks, as well as significant costs. In many long-term monitoring operations, a reduced number of wells would be sufficient for supporting robust statistical models. Thus, the optimisation of monitoring networks is important for maximising the value of collected data whilst minimising the number of samples being collected. One approach for spatial optimisation is adjusting the resolution of the monitoring network by identifying and removing wells that provide redundant data. Sampling at these wells can then be reduced or ceased. A well is considered ‘redundant’ if its removal does not significantly affect model predictions. This presentation aims to introduce methods by which the influence of individual wells on the predictions of a groundwater quality model involving CoPC can be quantified. The analysis is based on a spatiotemporal model, fitted to simulated data using a P-splines approach. The procedure ranks the wells within the network based on their individual influence on model predictions. Wells with low influence should be prioritised when deciding where to reduce monitoring efforts. This information can be used to optimise the groundwater monitoring network spatially without significantly decreasing monitoring quality and prediction accuracy.
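The idea of ranking wells by their influence on model predictions can be sketched with a leave-one-out comparison. In the example below the spatiotemporal P-splines model is replaced by a much simpler purely spatial inverse-distance-weighted interpolator, so only the ranking logic carries over. The well locations, concentrations, prediction grid and function names are all hypothetical.

```python
import math

def idw_predict(wells, point, power=2.0):
    """Inverse-distance-weighted prediction at point from (x, y, value) wells."""
    num = 0.0
    den = 0.0
    for x, y, value in wells:
        dist = math.hypot(x - point[0], y - point[1])
        if dist == 0.0:
            return value
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

def well_influence(wells, grid):
    """RMS change in grid predictions when each well is left out in turn."""
    full = [idw_predict(wells, p) for p in grid]
    scores = {}
    for k in range(len(wells)):
        reduced = wells[:k] + wells[k + 1:]
        preds = [idw_predict(reduced, p) for p in grid]
        sq_diff = sum((a - b) ** 2 for a, b in zip(full, preds))
        scores[k] = math.sqrt(sq_diff / len(grid))
    return scores

# Hypothetical network: (x, y, concentration). Well 4 has an unusual value.
WELLS = [
    (0.0, 0.0, 1.0),
    (0.1, 0.0, 1.0),
    (5.0, 0.0, 4.0),
    (0.0, 5.0, 2.0),
    (5.0, 5.0, 8.0),
]
GRID = [(i + 0.5, j + 0.5) for i in range(5) for j in range(5)]

scores = well_influence(WELLS, GRID)
ranking = sorted(scores, key=scores.get)  # least influential well first
print(ranking)
```

Wells near the start of the ranking are candidates for reduced sampling. With a real spatiotemporal model the same loop would involve refitting, or approximating the refit, for each omitted well.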
Comparison and integration of UK soil moisture data products
The status of soil moisture at a national scale is of interest to a wide range of environmental science applications. For instance, it controls the amount of carbon that can be sequestered in soils and thus affects a country’s ability to meet ‘Net Zero’ targets. In the UK, national scale soil moisture monitoring is provided by the COSMOS-UK network, which has also been used to calibrate statistical models to yield UK-wide gridded soil moisture. Soil moisture is also returned as a by-product of river flow and land surface models. Finally, gridded UK soil moisture information is provided by global satellite products (e.g. SMAP and ASCAT) and global models. In this study, we compare and evaluate the soil moisture response of different products, such as their drydown behaviour, soil water deficit estimates, and representative depths. Critically, we seek to understand the factors influencing the similarities and differences between these products (e.g. driving data and soil properties) and the suitability of each product for various applications. We then explore potential ways to integrate soil moisture data from various products to provide more robust and holistic estimates of UK soil moisture status. This work contributes to ongoing efforts to improve readiness to provide UK soil moisture information to the scientific community.
Craig Wilkie, Surajit Ray, Marian Scott, Claire Miller
Statistical downscaling for fusion of satellite and in-river water quality data: application to the Ramganga river
Satellite remote sensing is increasingly used to understand the health of inland freshwater lakes. However, the high spatial resolution of data from Sentinel-2 and commercial satellite programmes means that satellite data can also be used to improve river water quality monitoring. Our motivation is the Ramganga river in northern India, which is vital to millions of people but suffers from pollution from agriculture and other industries. There is limited routine monitoring, but satellite data fill in gaps in our knowledge of spatiotemporal patterns of water quality. As part of the Ramganga Water Data Fusion Project, in-river data were collected. We present and discuss methods for spatiotemporal fusion of in-river and satellite water quality data within rivers, focussing on an application to chlorophyll data. We use nonparametric statistical downscaling to model the relationship between the in-river data and satellite data, with the in-river data assumed to be accurate within measurement error. We model the satellite data at each spatial location as observations of smooth functions over time, allowing the basis coefficients to vary smoothly over space, enabling prediction at any spatial location or timepoint. This allows the creation of a calibrated smooth spatial and temporal surface of chlorophyll data along the river length, which can be used to identify and understand patterns in water quality. Acknowledgements: Rajiv Sinha, Manudeo Singh, Umar Farooq, Bharat Choudhary (Indian Institute of Technology Kanpur) provided the in-river data; Andrew Tyler, Peter Hunter, Veloisa Mascarenhas (University of Stirling) provided the remote sensing data. Funding: EPSRC reference EP/T003669/1.
Data to Decision: uncertainties in environmental data science
Uncertainties are an inherent feature of scientific research; these can be ‘aleatory’ due to the random nature of the world; or ‘epistemic’ due to limited knowledge or ignorance, and thus could be reduced with further research. Alongside these, language ambiguities can occur due to the collaboration of researchers with different disciplinary backgrounds. Unquestionably, scientific uncertainties impact on actions taken by stakeholders, and additionally, researchers need to consider that interpretation of – and response to – uncertainty differs between individuals. Understanding the many sources of uncertainty along the data to decision pipeline will aid provision of robust scientific evidence to underpin decision-making. This evidence, accompanied by transparency of uncertainties, will enable the decision maker to understand the level of risk they are taking. Grounded in data collected from interviews and focus groups, this poster will discuss the uncertainties experienced by experts from environmental science, computer science, and statistics, to provide a new typology of uncertainty for environmental data science.
This work is being carried out as part of the Data Science of the Natural Environment (DSNE) Project at Lancaster University.