Mapping Methods

Dean Evans and David Iles

Summary

To generate standardized, province-wide maps of breeding bird relative abundance in Saskatchewan, we conducted statistical modelling using data collected during the Saskatchewan Breeding Bird Atlas. Because survey coverage and effort varied among atlas squares—and many northern squares were not surveyed at all—statistical modelling was used to correct for this variation and to interpolate abundance estimates across unsampled areas.

Our approach involved four main steps:

  1. Filtering species-specific data appropriate for abundance analysis.
  2. Fitting statistical models to estimate spatial abundance patterns and habitat associations, accounting for differences in survey effort and timing.
  3. Generating model-based predictions of relative abundance and probability of observation across the province, for a standardized amount of survey effort.
  4. Reviewing and validating model outputs, both quantitatively and qualitatively, to ensure reliability.

All analyses were conducted in R version 4.4.0. Code and documentation are available at https://github.com/deanrobertevans/SaskAtlas/.

Data Sources and Filtering

We used three main sources of species observation data:

  • Point Counts: Standardized 3-, 5-, and 10-minute counts where observers recorded all birds seen or heard.
  • Acoustic Recordings: Up to 10 minutes in duration, transcribed by experts to yield species counts similar to point counts (though without visual detections).
  • Stationary Count and Linear Transect Checklists: Completed checklists within each 10 × 10 km atlas square, indicating which species were and were not detected during a survey. Only complete checklists (i.e., those in which observers indicated all species were reported) were included.  Linear transect checklists imply the atlas participant moved through the atlas square while recording birds, and included associated measures of distance traveled and time taken.  Stationary counts are similar to point counts, but included non-standardized survey durations.

All data were extracted from NatureCounts and WildTrax platforms and cross-referenced to remove duplicates. Surveys were included in the abundance analysis if they:

  • Occurred during a given species expected breeding season and during the time that species was expected to be the most vocally active;
  • Were spatially and temporally referenced (i.e., included GPS coordinates and date/time information);
  • Represented valid and complete survey protocols (e.g., 3-10 minute point counts or complete checklists);
  • Met minimum quality assurance thresholds (e.g., not flagged as outliers);

Even if a survey was not used for relative abundance modelling, it still contributed to other atlas products (e.g., species presence and breeding summaries).

Covariates

We modelled bird abundance using a suite of environmental covariates, summarized within a 1 km radius of each survey location. These included:

  • Land Cover: Percent cover of “Urban,” “Cropland,” “Wetland,” “Needleleaf Forest,” and “Grass/Shrub” from the 2020 Land Cover of Canada dataset (Natural Resources Canada 2020).
  • Forest Characteristics: Percent conifer and broadleaf forest cover, and canopy closure from the SCANFI dataset (Guindon et al. 2024).
  • Topography: Mean elevation and slope derived from digital elevation maps of Saskatchewan retrieved using the ‘elevatr’ R package (Hollister 2023).
  • Climate: Mean annual temperature and precipitation (1991–2020 normals; Fick & Hijmans 2017; Wang et al. 2016).
  • Hydrology: Mean distance to water and wetland recurrence (Pekel et al. 2016).

In total our analysis contained 14 biophysical variables. To reduce multicollinearity and simplify interpretation, we conducted a Principal Component Analysis (PCA) of these environmental variables. The top seven principal component axes (PCs) were retained as predictors in the models, which together explained 84% of the variation in habitat among sites.  Each PC axis described different gradients of habitat across the study area.  For example, PC axis 1 described the transition from closed forest (high canopy closure, high percent broadleaf and needleleaf/conifer forest, lower temperature) to open landscapes (low canopy closure, high crop and grassland cover, high annual temperature).The second PC axis described the contrast between areas with high crop cover and those with high grass/shrub cover.  The third PC axis largely described landscape “wetness recurrence”, with a gradient from lower-lying areas with high wetness and wetland recurrence to those with higher elevation and lower wetland recurrence. Other PC axes had more complex interpretations that described other habitat gradients across survey locations (e.g., precipitation gradients, urban development, etc).  All continuous predictors were modelled using quadratic terms, allowing for nonlinear, intermediate optimum relationships where supported by the data.

We also included observer-related covariates to account for spatio-temporal variation in survey effort and timing.  These effort and timing corrections included:

  1. A survey-level “time since sunrise” variable (TSS) to account for variation in bird vocal activity over the course of a day, included with a flexible curvilinear spline effect on relative abundance.
  2. For point counts and ARUs, we included detectability offsets calculated using the QPAD approach (Solymos et al. 2013), which account for variation in survey duration (e.g., 3, 5, or 10-minute surveys) and effective detection radii of birds. We used the same offset calculations for both human point counts and ARUs, as several studies have shown that counts are similar from both methods (i.e., that ARU surveys are largely equivalent to human point counts for most species).
  3. For linear transect checklists, we included flexible curvilinear spline effects of both survey duration and distance travelled.
  4. For stationary checklists, we included a flexible curvilinear spline effect of survey duration.

Statistical Modelling of Relative Abundance

Relative abundance of each bird species during the breeding season was estimated using an integrated spatial modelling framework that simultaneously incorporates multiple survey types (i.e., point counts and ARU surveys, linear transect checklists, and stationary count checklists).  In this joint modelling approach, all data sources contributed to the estimation of species–habitat relationships, range limits, and spatial structure. This was implemented using the ‘inlabru’ package in R (Bachl et al. 2019), which supports Bayesian integrated spatial models through multiple likelihoods sharing common model components.

Joint Model Structure

The model accounted for:

  • Spatial Covariates – the set of environmental variables hypothesized to influence bird distribution as measured by principal component axes (described above).
  • Spatial Autocorrelation – modelled using Gaussian Random Fields (GRF) using the SPDE approach, allowing the model to capture residual spatial dependencies in the data that were not directly tied to habitat covariates.
  • Effort corrections – to account for differences in bird counts resulting from surveys of different duration (for stationary surveys) and distance travelled (for linear transect checklists).
  • Time since sunrise – modelled using spline-type effects to account for variation in bird detectability over the course of a day
  • Detectability corrections – estimated using time-removal and distance sampling, following the QPAD approach as outlined by Solymos et al. (2013) and Edwards et al. (2023).
  • Informative Priors: Species’ estimated range limits from eBird (Fink et al. 2023) were used as informative priors, providing species-specific spatial constraints.

All three data types (point counts/acoustic counts, stationary checklists, linear transect checklists) were linked through a shared predictor structure, where the linear predictor for each dataset included:

  • A dataset-specific intercept
  • Range edge effect: modelled as a linear function of distance beyond the species’ estimated breeding range edge
  • Environmental covariates: based on principal component axes described above, included as linear and quadratic terms
  • A shared spatial random field, estimated using a Stochastic Partial Differential Equation (SPDE) approach

The response variable and error distribution used in modelling varied by dataset:

  1. Point Counts and ARUs (PC): modelled as counts using negative binomial error and a log link function.
  2. Linear Transect Checklists (LT): modelled as binary presence/absence records using binomial error and a complementary log-log link function.
  3. Stationary Count Checklists (SC): also modelled as presence/absence data using a binomial distribution and complementary log–log link function.

Although complete checklists contained counts of birds, preliminary analyses indicated that inclusion of total count information from checklists introduced a large amount of uncertainty into predictions and reduced cross-validation accuracy compared to models that only used presence/absence information from checklists.  We therefore reduced counts to presence/absence by treating any counts greater than 0 as presence.

Likelihood Equations

The likelihood for counts derived from point counts and ARUs was therefore modelled as:


The terms in this equation are:

  • : Global intercept for point count surveys.
  • : Offset correcting for species- and effort-specific detectability from QPAD models.
  • : Spline effect of time since sunrise for survey .
  • : Coefficient capturing abundance change near species’ range edge.
  • : Distance from the edge of the species’ range for survey .
  • : Spatial random effect (latent Gaussian field) for location of survey .
  • : Additive fixed effects for standardized environmental covariates and their quadratic terms, for each of the principal component axes included as predictors.

Binary presence/absence observations from linear transects were modelled using a Bernoulli likelihood, with a complementary log–log link corresponding to a latent Poisson detection process as follows:


The additional terms in this equation are:

  • : Global intercept for linear transect checklists.
  • : Log-duration of the survey as an offset to account for effort.
  • : Log-distance of the survey as an offset to account for effort.

Similarly, stationary count checklists were modelled using:


where:

  • : Global intercept for stationary count checklists.
  • : Log of survey duration (in minutes), accounting for effort.

Validation and Mapping Relative Abundance and Probability of Observation

To produce spatial predictions of relative abundance, we generated model-based estimates across all of Saskatchewan using a 1 km resolution grid. For each species, we predicted the expected number of individuals detected across 15 standardized 5-minute point counts per grid cell, based on the remotely sensed covariates in that cell. These predictions were derived from the fitted model using posterior sampling (1,000 draws), and were initially on the log scale. We then exponentiated the predictions and summarized them using the median and the 90% credible interval (i.e., the 5th and 95th percentiles), providing a measure of both central tendency and uncertainty in predicted abundance. We also calculated the width of the 90% credible interval as an indicator of prediction uncertainty.

To make the predictions more interpretable for end users, we also computed the probability of observing the species at least once across 15 point counts, based on the predicted abundance and fitted parameters of the negative binomial model. This was calculated as one minus the probability of observing zero individuals across 15 point counts, allowing users to identify areas where they would be likely to observe the species.

To assess the quality and reliability of each species’ map, we compared predicted relative abundances to observed counts and detection/non-detection patterns in the raw data. This included qualitative review by expert ornithologists familiar with species-specific distribution patterns across the province, as well as visual checks for overfitting or implausible extrapolation.

For some species, multiple iterations of model refinement and data filtering were necessary to ensure that prediction maps were consistent with known patterns of occurrence. In some cases, this involved adjusting the model structure and removing outlier data. Only prediction maps that passed both data-driven and expert-based review were considered suitable for publication in the atlas.

References

Bachl, F. E., Lindgren, F., Borchers, D. L., & Illian, J. B. (2019). inlabru: an R package for Bayesian spatial modelling from ecological survey data. Methods in Ecology and Evolution, 10(6), 760-766.

Edwards, B. P., Smith, A. C., Docherty, T. D., Gahbauer, M. A., Gillespie, C. R., Grinde, A. R., … & Zlonis, E. J. (2023). Point count offsets for estimating population sizes of north American landbirds. Ibis, 165(2), 482-503.

Fink, D., T. Auer, A. Johnston, M. Strimas-Mackey, S. Ligocki, O. Robinson, W. Hochachka, L. Jaromczyk, C. Crowley, K. Dunham, A. Stillman, C. Davis, M. Stokowski, P. Sharma, V. Pantoja, D. Burgin, P. Crowe, M. Bell, S. Ray, I. Davies, V. Ruiz-Gutierrez, C. Wood, A. Rodewald. 2023. eBird Status and Trends, Data Version: 2023; Released: 2024. Cornell Lab of Ornithology, Ithaca, New York. https://doi.org/10.2173/WZTW8903

Fick, S.E. and R.J. Hijmans, 2017. WorldClim 2: new 1km spatial resolution climate surfaces for global land areas. International Journal of Climatology 37 (12): 4302-4315.

Guindon, L., Manka, F., Correia, D. L., Villemaire, P., Smiley, B., Bernier, P., … & Boulanger, Y. (2024). A new approach for Spatializing the Canadian National Forest Inventory (SCANFI) using Landsat dense time series. Canadian Journal of Forest Research, 54(7), 793-815.

Hollister, J.W. (2023). elevatr: Access Elevation Data from Various APIs. R package version 0.99.0. https://CRAN.R-project.org/package=elevatr/

Natural Resources Canada.  2020 land cover of Canada.  Downloaded from: https://open.canada.ca/data/en/dataset/ee1580ab-a23d-4f86-a09b-79763677eb47

Pekel, J. F., Cottam, A., Gorelick, N., & Belward, A. S. (2016). High-resolution mapping of global surface water and its long-term changes. Nature, 540(7633), 418-422. Downloaded from: https://global-surface-water.appspot.com/download

Sólymos, P., Matsuoka, S. M., Bayne, E. M., Lele, S. R., Fontaine, P., Cumming, S. G., … & Song, S. J. (2013). Calibrating indices of avian density from non‐standardized survey data: making the most of a messy situation. Methods in Ecology and Evolution, 4(11), 1047-1058.

Wang T., Hamann A., Spittlehouse D., & Carroll C. (2016) Locally Downscaled and Spatially Customizable Climate Data for Historical and Future Periods for North America. PLoS ONE 11, e0156720. doi:10.1371/journal.pone.0156720. Downloaded from: https://adaptwest.databasin.org/pages/adaptwest-climatena/

 

 

Birds Canada Privacy Policy | Accessibility Policy
Saskatchewan Breeding Bird Atlas, Birds Canada, 115 Perimeter Road Saskatoon, SK, S7N 0X4 Canada
Phone: 1-306-249-2894 E-mail: skatlas@birdscanada.org Banner photo: May Haga