Plot netCDF data using the R leaflet package
Visualize TROPOMI netCDF data over your study area.
I spent last Tuesday morning trying to extract values of TROPOMI nitrogen dioxide measurements at points that correlated with air quality monitors across Ontario. Ontario has 38 air quality monitoring stations, and I wanted to see if there was a decent correlation between surface monitors and satellite monitors.
When I finished running my code, I had a table with more than 43,000 rows (one for every hour from 2018 to August 2021). However, when I filtered out the “NaN” values, I had less than 16,000! Losing more than 50% of my training data due to unavailable satellite measurements is not ideal, and I wanted to see what was going on before moving forward with my work.
This post provides a method I used to visualize the satellite data over Ontario using R and the leaflet package.
If you’re unfamiliar with leaflet, you can read more about it here.
It’s important to note that I’ve already preprocessed my Sentinel-5P data using the harp package in Python, which - in my humble opinion - is the reigning champion of all python packages. I highly recommend you regrid your data before using this tutorial, it makes the entire process of working with TROPOMI Level 2 data much, much easier. I’ll write a blog post on how to use harp over the next couple of weeks.
I wanted to visualize my netCDF data in order to understand why there was so much missing data, but not just in a raster plot like I usually do. I wanted to see it overlaid on a map of my study area. Here’s how I did it.
Import the appropriate libraries into your R workspace.
```{r}
library(raster)
library(leaflet)
library(rgdal)
```
We’re importing the raster package to manage the netCDF, leaflet to access their map functions, and rgdal so that everything works smoothly.
Then, create a variable with the name of the netCDF file.
netCDF_file <- "./TROPOMIdata/converted_20210630.nc"
Then, we can use the raster package to create a rasterLayer for the tropospheric nitrogen dioxide layer.
no2 <- raster(netCDF_file,varname = "tropospheric_NO2_column_number_density")
This is what the output looks like:
> no2
class : RasterLayer
dimensions : 1599, 1999, 3196401 (nrow, ncol, ncell)
resolution : 0.01, 0.01 (x, y)
extent : -95, -75.01, 41, 56.99 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs
source : converted_20210630.nc
names : tropospheric_NO2_column_number_density
zvar : tropospheric_NO2_column_number_density
As you can see, we’ve converted the netCDF data into a RasterLayer, which will allow us to turn it into a layer suitable for leaflet. I also changed the colours to a red - yellow range using the colorNumeric function.
no2_leaflet <- projectRasterForLeaflet(no2,method = "bilinear")
color_pal <- colorNumeric("OrRd", values(test), na.color = "transparent")
And then we can plot it! It’s really that simple.
leaflet() %>% addTiles() %>%
addRasterImage(test, colors = color_pal, opacity = .7) %>% addLegend(pal = color_pal, values = values(no2_leaflet), title = "NO2")
Here is what the entire netCDF data set looks like:
And now let’s zoom into the Toronto region.
You can see large areas of satellite data missing over the Toronto region. There are many reasons why data could be missing to this degree (glare, albedo, errors, etc), and I won’t go into the specifics of that here. This is just one example, and there are many more files I’ve taken a look at to spot check my work. Give it a try and let me know how it goes!
If you want to see more technical coding content, leave a comment on what you’d like to see!
Want to learn more? Check out this course that I made where we dive deeper into spatial data science concepts using R.