| Type: | Package | 
| Title: | Functions and Data for the Book 'Applied Hierarchical Modeling in Ecology' Vols 1 and 2 | 
| Version: | 0.2.10 | 
| Date: | 2024-09-11 | 
| Depends: | R (≥ 2.10) | 
| Imports: | grDevices, graphics, methods, stats, utils, unmarked (≥ 0.12.2), mvtnorm | 
| Suggests: | coda, fields, raster, sp, spdep | 
| Description: | Provides functions to simulate data sets from hierarchical ecological models, including all the simulations described in the two volume publication 'Applied Hierarchical Modeling in Ecology: Analysis of distribution, abundance and species richness in R and BUGS' by Marc Kéry and Andy Royle: volume 1 (2016, ISBN: 978-0-12-801378-6) and volume 2 (2021, ISBN: 978-0-12-809585-0), https://www.mbr-pwrc.usgs.gov/pubanalysis/keryroylebook/. It also has all the utility functions and data sets needed to replicate the analyses shown in the books. | 
| License: | GPL (≥ 3) | 
| Copyright: | file COPYRIGHTS | 
| NeedsCompilation: | no | 
| URL: | https://www.mbr-pwrc.usgs.gov/pubanalysis/keryroylebook/ | 
| BugReports: | https://github.com/hmecology/AHMbook/issues | 
| Encoding: | UTF-8 | 
| Language: | en-US | 
| Packaged: | 2024-09-11 15:21:15 UTC; ken | 
| Author: | Marc Kéry [aut], Andy Royle [aut], Mike Meredith [aut], Ken Kellner [ctb, cre] (simIDS), Urs Breitenmoser [dtc] (EurasianLynx), Richard Chandler [ctb], Bob Dorazio [ctb], Evan Grant [dtc] (duskySalamanders), John-Andren Henden [dtc] (Finnmark), Roland Kays [dtc] (MesoCarnivores), David King [dtc] (cswa), Xavier Lambin [dtc] (waterVoles), Jeremy Mizel [ctb, dtc] (treeSparrow), Anja Molinari-Jobin [dtc] (EurasianLynx), René-Jean Monneret [dtc] (FrenchPeregrines), Arielle Parsons [dtc] (MesoCarnivores), René Ruffinoni [dtc] (FrenchPeregrines), Michael Schaub [ctb], Rahel Solmann [ctb], Nicolas Strebel [dtc], Chris Sutherland [ctb, dtc] (waterVoles), Mathias Tobler [ctb], Gesa von Hirschheydt [ctb] (simOccCat), Fridolin Zimmermann [dtc] (EurasianLynx), Aargau Biodiversity Monitoring Program (LANAG) [dtc], British Ornithological Trust [dtc] (willowWarbler, UKmarbledWhite, <https://www.bto.org/>), Butterfly Conservation [dtc, cph] (UKmarbledWhite, <https://butterfly-conservation.org>), Centre for Ecology & Hydrology [dtc, cph] (UKmarbledWhite, <https://www.ceh.ac.uk>), Dutch Centre for Field Ornithology (Sovon) [dtc] (wagtail, <https://www.sovon.nl>), eMammal [dtc] (MesoCarnivores, <https://emammal.si.edu/>), Groupe Pèlerin Jura [dtc] (FrenchPeregrines), Hubbard Brook Ecosystem Study [dtc] (HubbardBrook, <http://data.hubbardbrook.org/data/dataset.php?id=178>), Joint Nature Conservation Committee (JNCC) [dtc, cph] (UKmarbledWhite, <https://jncc.gov.uk/>), North Carolina Museum of Natural Sciences [dtc] (mesoCarnivores), North Carolina State University [dtc] (mesoCarnivores), Progetto Lince Italia [dtc] (EurasianLynx, <https://www.progettolinceitalia.it>), Swiss Federal Statistical Office [dtc, cph] (BerneseOberland, <http://www.bfs.admin.ch>), Swiss Foundation KORA Carnivore Ecology and Wildlife Management (SCALP project) [dtc] (EurasianLynx, <https://www.kora.ch>), Swiss Ornithological Institute [dtc] (<https://www.vogelwarte.ch>), UK Butterfly Monitoring Scheme (UKBMS) [dtc, cph] (UKmarbledWhite, <https://www.ceh.ac.uk/our-science/projects/uk-butterfly-monitoring-scheme>) | 
| Maintainer: | Ken Kellner <contact@kenkellner.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-09-11 18:00:07 UTC | 
Functions and data for the Book “Applied Hierarchical Modeling in Ecology” Volumes 1 and 2
Description
Provides functions to simulate data sets from hierarchical ecological models, including all the simulations described by Marc Kéry and Andy Royle in the two-volume publication, Applied Hierarchical Modeling in Ecology: analysis of distribution, abundance and species richness in R and BUGS, Academic Press (Vol 1, 2016; Vol 2, 2021), plus new models developed after publication of the books.
It also has all the utility functions and data sets needed to replicate the analyses shown in the books.
SIMULATION FUNCTIONS
Introduction
- sim.fn
- Simulate a homogeneous Poisson point process and illustrate the fundamental relationships between intensity, abundance and occurrence (AHM1 - section 1.1) 
- data.fn
- Simulate count data that are replicated in space and in time according to the binomial N-mixture model of Royle (2004) (this is for much simpler cases than is possible with function - simNmixin Chapter 6 below) (AHM1 - 4.3)
Abundance from Counts: Binomial N-Mixture models
- simNmix
- Simulate count data and individual detection histories for binomial and multinomial mixture models respectively under a wide range of conditions (AHM1 - 6.9.3) 
- simpleNmix
- Simulate count data under a very simple version of the binomial mixture model, with time for space substitution (AHM1 - 6.12) 
- playRN
- Play Royle-Nichols (RN) model: generate count data under the binomial N-mixture model of Royle (2004), then 'degrade' the data to detection/nondetection and fit the RN model using unmarked and estimate site-specific abundance (AHM1 - 6.13.1) 
Abundance from Hierarchical Distance Sampling
- sim.ldata
- Simulate data under a non-hierarchical line transect distance sampling model (AHM1 - 8.2.3) 
- sim.pdata
- Simulate data under a non-hierarchical point transect (= point count) distance sampling model (AHM1 - 8.2.5.1) 
- simHDS
- Simulate data under a hierarchical distance sampling protocol (line or point) (AHM1 - 8.5.1) 
- simHDSpoint
- A simplified version of - simHDSfor point transects only.
- simHDSg
- Simulate data under a hierarchical distance sampling (HDS) protocol with groups (AHM1 - 9.2.1) 
- simHDStr
- Simulate data under a time-removal/distance sampling design (AHM1 - 9.3.2) 
- simHDSopen
- Simulate open hierarchical distance sampling data (AHM1 - 9.5.4) 
- issj.sim
- Simulate data under the open distance sampling protocol for the Island Scrub Jays (AHM1 - 9.7.1) 
- sim.spatialDS
- Simulate data under a basic spatial distance sampling model (AHM1 - 9.8.3) 
- sim.spatialHDS
- Simulate data under a spatial hierarchical distance sampling model (AHM1 - 9.8.5) 
- simIDS
- Simulate data for an integrated distance sampling, point count and occupancy study 
Static Occurrence using Site-Occupancy Models
- simOcc
- Simulate detection/nondetection data under static occupancy models under a wide range of conditions (AHM1 - 10.5) 
- simOccCat
- As above, but allows simulation of categorical covariates 
- sim3Occ
- Simulate detection/nondetection data under a static 3-level occupancy model (AHM1 - 10.10) 
- simOccttd
- Simulate 'timing data' under a static time-to-detection occupancy design (AHM1 - 10.12.1) 
- wigglyOcc
- Simulate detection/nondetection data under a static occupancy model with really wiggly covariate relationships in occupancy and detection probability (AHM1 - 10.14) 
Hierarchical Models for Communities
- simComm
- Simulate detection/nondetection or count data under a community occupancy or abundance model respectively (AHM1 - 11.2) 
Relative Abundance Models for Population Dynamics
- simNpC
- Simulate data on abundance (N), detection probability (p) and resulting counts (C) under a counting process with imperfect detection (AHM2 - 1.2) 
- simPOP
- Simulate count data under a demographic state-space, or Dail-Madsen, model (no robust design) (AHM2 - 1.7.1) 
- simPH
- Simulate count data with phenological curves within a year (AHM2 - 1.8.1) 
Modeling Population Dynamics with Count Data
- simDM0
- Simulate count data from a Dail-Madsen model under a robust design, no covariates (AHM2 - 2.5.1) 
- simDM
- Simulate count data from a Dail-Madsen model under a robust design, with covariates (AHM2 - 2.5.5) 
- simMultMix
- Simulate “removal” count data from a multinomial-mixture model (AHM2 - 2.7.1) 
- simFrogDisease
- Simulate detection data for diseased frogs (AHM2 - 2.9.1) 
Hierarchical Models of Survival
- simCJS
- Simulate individual capture history data under a Cormack-Jolly-Seber (CJS) survival model (AHM2 - 3.2.2) 
Dynamic Occupancy Models
- simDynocc
- Simulate detection/nondetection data under a dynamic occupancy model under a wide range of conditions (AHM2 - 4.4) 
- simDemoDynocc
- Simulate detection/nondetection data under a demographic dynamic occupancy model (AHM2 - 4.12) 
Dynamic Community Models
- simDCM
- Simulate detection/nondetection data under a general dynamic community model (site-occupancy variant) (AHM2 - 5.2) 
Spatial Models of Distribution and Abundance
- simDynoccSpatial
- Simulate detection/nondetection data under a dynamic occupancy model with spatial covariate and spatial autocorrelation (AHM2 - 9.6.1.1). See also - simDynoccSpatialData
- simExpCorrRF
- Simulate data from a Gaussian random field with negative exponential correlation function (AHM2 - 9.2) 
- simOccSpatial
- Simulate detection/nondetection data under a spatial, static occupancy model for a real landscape in the Bernese Oberland (Swiss Alps) (AHM2 - 9.2) 
- simNmixSpatial
- Simulate counts under a spatial, static binomial N-mixture model for a real landscape in the Bernese Oberland (Swiss Alps) (AHM2 - 9.2) 
Integrated Models for Multiple Types of Data
- simPPe
- Simulate a spatial point pattern in a heterogeneous landscape and show aggregation to abundance and occurrence ('e' for educational version) (AHM2 - 10.2) 
- simDataDK
- Simulate data for an integrated species distribution model (SDM) of Dorazio-Koshkina (AHM2 - 10.6.1) 
Spatially Explicit Distance Sampling
- simSpatialDSline
- Simulate line transect distance sampling data with spatial variation in density (AHM2 - 11.5) 
- simSpatialDSte
- Simulate data for replicate line transect distance sampling surveys with spatial variation in density and temporary emigration (AHM2 - 11.8.1) 
- simDSM
- Simulate line transect data for density surface modeling (AHM2 - 11.10.1) 
DATA SETS
- BerneseOberland
- Landscape data for the Bernese Oberland around Interlaken, Switzerland (AHM2 - 9.2) 
- crestedTit
- Crested Tit data from the Swiss Breeding Bird Survey MHB (Monitoring Häufige Brutvögel) for 1999 to 2015 (AHM2 - 1.3) 
- cswa
- Chestnut-sided Warbler data for point counts and spot-mapping from White Mountain National Forest (AHM2 - 2.4.3) 
- crossbillAHM
- Crossbill data from the Swiss Breeding Bird Survey for 2001 to 2012 (AHM2 - 4.9) 
- dragonflies
- Toy data set used in AHM1 - 3.1 
- duskySalamanders
- Counts of juvenile vs adult salamanders over 7 years (AHM2 - 2.9.2) 
- EurasianLynx
- Data for Eurasian Lynx in Italy and Switzerland (AHM2 - 7.3.2) 
- Finnmark
- Data from surveys of birds in Finnmark in NE Norway (AHM2 - 5.7) 
- FrenchPeregrines
- Detection data for peregrines in the French Jura (AHM2 - 4.11) 
- greenWoodpecker
- Count data for Green Woodpeckers in Switzerland from the MHB (AHM2 - 2.2) 
- HubbardBrook
- Point count data for warblers from Hubbard Brook, New Hampshire (AHM2 - 8.2) 
- jay
- The European Jay data set (from the MHB) is now included in unmarked (AHM1 - 7.9) 
- MesoCarnivores
- Camera trap data for 3 species of meso-carnivores (AHM2 - 8.2) 
- MHB2014
- Complete data from the Swiss Breeding Bird Survey MHB (Monitoring Häufige Brutvögel) for the year 2014 (AHM1 - 11.3) 
- spottedWoodpecker
- Data for Middle Spotted Woodpeckers in Switzerland (AHM2 - 4.11.2) 
- SwissAtlasHa
- A 1ha-scale subset of the count data from the Swiss Breeding Bird Atlas (AHM2 - 8.4.2) 
- SwissEagleOwls
- Territory-level, multi-state detection/nondetection data for Eagle Owls in Switzerland (AHM2 - 6.4) 
- SwissMarbledWhite
- Data from the Biodiversity Monitoring Program (LANAG) in the Swiss Canton of Aargau for Marbled White butterfly (AHM2 - 1.8.2) 
- SwissSquirrels
- Count data for Red Squirrels in Switzerland from the Swiss breeding bird survey MHB (AHM1 - 10.9) 
- SwissTits
- Data for 6 species of tits in Switzerland from from the Swiss breeding bird survey MHB during 2004 to 2013 (AHM1 - 6.13.1) 
- treeSparrow
- Data for Tree Sparrows in Alaska (AHM2 - 11.8.4) 
- ttdPeregrine
- Time-to-detection data for Peregrines (AHM1 - 10.12.2) 
- UKmarbledWhite
- Data from the UK Butterfly Monitoring Scheme (UKBMS) for Marbled White butterfly (AHM2 - 1.8.2) 
- wagtail
- Distance sampling data for Yellow Wagtails in The Netherlands (AHM1 - 9.5.3) 
- waterVoles
- Detection/nondetection data for the Mighty Water Vole of Scotland (AHM2 - 7.2.2) 
- wigglyLine
- Coordinates for a wiggly transect line (AHM2 - 11.9) 
- willowWarbler
- Capture-history (survival) data for Willow Warblers in Britain (AHM2 - 3.4.1) 
UTILITY FUNCTIONS
- ppc.plot
- Plot results from posterior predictive checks in section AHM1 - 6.8, for a fitted binomial N-mixture model object with JAGS 
- plot_Nmix_resi
- Do diagnostic plots for one binomial N-mixture model fitted with all three mixture distributions currently available in unmarked: Poisson, negative binomial and zero-inflated Poisson (AHM1 - 6.9.3) 
- map.Nmix.resi
- Produce a map of the residuals from a binomial N-mixture model (see Section AHM1 - 6.9.3) 
- instRemPiFun,- crPiFun,- crPiFun.Mb,- MhPiFun
- Define the relationship between the multinomial cell probabilities and the underlying detection probability parameters (i.e., a pi function) in various designs (AHM1 - 7.8 and AHM2 - Chapter 2) 
- spline.prep
- Prepare input for BUGS model when fitting a spline for a covariate (AHM1 - 10.14) 
- graphSSM
- Plot trajectories of counts and latent abundance from a fitted Gaussian state-space model (AHM2 - 1.6.1) 
- ch2marray
- Convert capture history data to the m-array aggregation (AHM2 - 3.4.1) 
- valid_data
- Partial validation of simulated data with false positives (AHM2 - 7.6.2) 
- getLVcorrMat
- Compute the correlation matrix from an analysis of a latent variable occupancy or binomial N-mixture model (AHM2 - 8.4.2) 
- zinit
- Generate starting values for fitting survival models (introduced in AHM2 - 3.2.3). 
- standardize
- Standardize covariates to mean 0, SD 1. 
- fitstats,- fitstats2
- Calculate fit-statistics used in parboot GOF tests throughout the book (eg, Sections AHM1 - 7.5.4, AHM1 - 7.9.3, AHM2 - 2.3.3) 
- e2dist
- Compute a matrix of Euclidean distances 
- image_scale
- Draw scale for image (introduced in chapter AHM1 - 9.8.3) 
- bigCrossCorr
- Report cross-correlations above a given threshold 
- Color_Ramps
- Color ramps for use with image or raster plots 
Author(s)
Marc Kéry, Andy Royle, Mike Meredith
Landscape data for the Bernese Oberland around Interlaken, in the Alps of central Switzerland
Description
Spatially-referenced data on elevation, forest cover, and water cover at a 1km-sq resolution for a 50km x 50km region in Switzerland, the Bernese Oberland. This is a small subset of the Switzerland data in the unmarked package. See the Examples for a plot of the location.
Usage
data(BerneseOberland)Format
A data frame with 2500 observations on the following 5 variables.
- x
- Easting (m) 
- y
- Northing (m) 
- elevation
- a numeric vector (m) 
- forest
- a numeric vector (percent cover) 
- water
- a numeric vector (percent cover) 
Details
Forest and water coverage (in percent area) was computed using the 1992-97 landcover dataset of the Swiss Federal Statistical Office (http://www.bfs.admin.ch). Median elevation (in meters) was computed using a median aggregation of the digital elevation model of the Swiss Federal Statistical Office.
x and y are the coordinates of the center of each 1km2 pixel.
The coordinate reference system is intentionally not specified.
These data can only be used for non-profit projects. Otherwise, written permission must be obtained from the Swiss Federal Statistical Office.
Source
Swiss Federal Statistical Office (http://www.bfs.admin.ch)
Examples
data(BerneseOberland)
str(BerneseOberland)
library(raster)
data(Switzerland, package="unmarked")
r1 <- with(Switzerland, rasterFromXYZ(data.frame(x = x, y = y, z = elevation)))
mapPalette1 <- colorRampPalette(c("grey", "yellow", "orange", "red"))
plot(r1, col = mapPalette1(100), axes = FALSE, box = FALSE, zlim = c(0, 4500),
  main = "Location of Bernese Oberland data set")
with(BerneseOberland, rect(min(x), min(y), max(x), max(y), lwd=2))
Color Ramps
Description
These create vectors of n contiguous colors for use in image and raster plots. They are similar to heat.colors or terrain.colors but with extra arguments to give greater control of colors. Currently included ramps are:
* sequential yellow-orange-red
* sequential light to dark green
* sequential light to dark gray
* diverging blue-yellow-red
* diverging green-brown
Sequential ramps go from light to dark, diverging ramps from dark through light to dark. All these ramps are color-blind safe and based on color palettes from https://colorbrewer2.org/ by Cynthia Brewer and colleagues.
Usage
rampYOR(n=5, range=1:9, bias=1, ...)
rampGreens(n=5, range=1:9, bias=1, ...)
rampGreys(n=5, range=1:9, bias=1, ...)
rampBYR(n=5, range=1:11, bias=1, ...)
rampGBr(n=5, range=1:11, bias=1, ...)
Arguments
| n | the number of colors to be in the palette. | 
| range | the range of colors from the underlying palette to include: each ramp is based on a palette of 9 (if sequential) or 11 (if diverging) colors; range allows the extreme colors to be excluded. | 
| bias | a positive number: values >1 give more widely spaced colors at the high end. | 
| ... | additional arguments passed to  | 
Value
Return a vector of color values in hexadecimal format.
Author(s)
Mike Meredith.
Examples
# uses the built-in volcano data set
require(grDevices) # for colours
require(graphics)
image(volcano, col=rampYOR(225), main="Sequential yellow-orange-red")
image(volcano, col=rampBYR(225), main="Divergent blue-yellow-red")
image(volcano, col=rampGBr(225), main="Divergent green-brown")
points(runif(500), runif(500), pch=16, cex=0.7) # add points
title(sub="Points are hard to see on the darkest colors", line=2)
image(volcano, col=rampGBr(225, range=2:10), main="Green-brown without darkest colors")
points(runif(500), runif(500), pch=16, cex=0.7)
# Try with a raster
require(raster)
r <- raster(system.file("external/test.grd", package="raster"))
plot(r, main="Default colors") # default is rev(terrain.colors(225))
plot(r, col=rampGBr(225), main="Divergent green-brown")
plot(r, col=rampGBr(225, bias=3), main="Divergent green-brown, bias=3")
plot(r, col=rampGBr(225, bias=0.5), main="Divergent green-brown, bias=0.5")
Data for Eurasian lynx from Switzerland and Italy (only Alps).
Description
The data are observations of Eurasian lynx (Lynx lynx) confirmed by experts and by the general public (and not confirmed by experts), summarized to a 10 x 10 km grid and for 3 occasions during a 'winter' (Nov-Dec, Jan-Feb, Mar-Apr). The observations are classified as 'certain' or 'uncertain', and only the latter are assumed to be contaminated with false positives (but both are subject to false-negative errors). This data set covers the Alps in Switzerland and Italy from 1994 to 2016.
Usage
data("EurasianLynx")Format
EurasianLynx is a data frame with 43,332 rows corresponding to observations, and 10 columns:
- type
- factor: 'certain' or 'uncertain'. 
- site.nr
- site (10 x 10 km cell) identifier. 
- y.1, y.2, y.3
- detection (1)/non-detection (0) for the three occasions. 
- Year
- the year of the observation. 
- Cntry
- the country, 'Italy' or 'Switzerland'. 
- xcoord
- the x coordinate of the cell, km E of the origin. 
- ycoord
- the y coordinate of the cell, km N of the origin. 
- forest
- the percentage forest cover in the cell. 
Source
The Foundation KORA (SCALP Project) and the Progetto Lince Italia (Anja Molinari-Jobin, Urs Breitenmoser, Fridolin Zimmermann).
References
Molinari-Jobin, A., Kéry, M., Marboutin, E., Molinari, P., Koren, I., Fuxjäger, C., Breitenmoser-Würsten, C., Wölfl, S., Fasel, M., Kos, I., Wölfl, M., & Breitenmoser, U. (2012) Monitoring in the presence of species misidentification: the case of the Eurasian lynx in the Alps. Animal Conservation, 15, 266-273.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 7.3.2 and 9.6.1.3.
Examples
data(EurasianLynx)
lynx <- EurasianLynx # Shorter name
str(lynx)
# Create additional columns needed for the analysis in 'unmarked':
lynx$occ.1 <- 1
lynx$occ.2 <- 2
lynx$occ.3 <- 3
lynx$sYear <- standardize(lynx$Year)
str(lynx)
Data from surveys of birds in Finnmark in NE Norway
Description
A total of 37 plots were placed spanning the existing variation in extent and fragmentation of willow thickets. Each plot was visited 3-5 times in early July each year from 2005 to 2008. Birds observed within 100m during a 15 min period were recorded.
Usage
data("Finnmark")Format
Finnmark is a list with 4 elements:
- species
- a data frame with a row for each species and the following columns: - species : the English name. 
- latin : the Latin name. 
- assemblage : the guild to which each species belongs. 
 
- sites
- a data frame with a row for each plot and the following columns: - region : a factor, 3 levels, Ilford, Komag, and Vestre Jakobselv. 
- catchment : a factor, 11 levels. 
- plot : a factor, the alphanumeric ID of each plot. 
- plotnr : the ID number of each plot. 
- area : the percentage of a 400 x 400m quadrat centered on the sampling point covered by tall willow thickets. 
- edge : the length (m) of the edge of the willow thickets in the quadrat. 
- height : the mean height of willows at 4 measuring points. 
- density : a measure of thicket density at 4 measuring points. 
 
- counts
- a sites x replicates x years x species array of counts 
- timeOfDay
- a sites x replicates x years array with the time of day that the survey was carried out. 
Source
Data generously provided by John-André Henden.
References
Henden, J.-A., Yoccoz, N.G., Ims, R.A., Langeland, K. (2013) How Spatial Variation in Areal Extent and Configuration of Labile Vegetation States Affect the Riparian Bird Community in Arctic Tundra. PLoS ONE 8(5): e63312.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 5.
Examples
data(Finnmark)
str(Finnmark)
# Create the objects needed for the analysis:
# Extract the numeric covariates and standardise
str(Finnmark$sites)
scov <- scale(Finnmark$sites[, 5:8])
str(scov)
# Convert the 'counts' array to detection/nondetection data:
y <- Finnmark$counts > 0
storage.mode(y) <- "integer"
str(y)
# Get the guild information for each species
guild <- Finnmark$species$assemblage
# Standardise the timeOfDay and replace NAs with 0
tod <- with(Finnmark, (timeOfDay - mean(timeOfDay, na.rm=TRUE))/ sd(timeOfDay, na.rm=TRUE))
tod[is.na(tod)] <- 0
str(tod)
Data for observations of peregrines from the French Jura mountains
Description
The data are detection/nondetection data of the Peregrine Falcon (Falco peregrinus) from the wild and wonderful French Jura between 1964 and 2016 for 284 cliff sites (= territories, or sites in the context of a site-occupancy model) where a pair had been detected at least once in this period. A large proportion of sites are visited multiple times per year, but unfortunately only the aggregate results are available in each year, i.e., whether a pair was detected at least once, or never.
Usage
data("FrenchPeregrines")Format
FrenchPeregrines is a data frame with 284 rows and 56 columns:
- site
- cliff (or site) identifier. 
- department
- factor, the administrative area (Ain, Jura or Doubs). 
- height
- factor, height of the cliff, low, medium, or tall. 
- yr1964 to yr2016
- detection histories for each year: 1 if a pair of peregrines was detected during at least one survey, 0 if no pair was detected, NA if no survey was carried out in that year. 
Source
Groupe Pèlerin Jura (René-Jean Monneret, René Ruffinoni, and colleagues)
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 4.11.
Examples
data(FrenchPeregrines)
str(FrenchPeregrines)
# Extract the capture history data:
ch <- as.matrix(FrenchPeregrines[, 4:56])
dim(ch)
range(ch, na.rm=TRUE)
Warbler point count data from the Hubbard Brook Experimental Forest
Description
Three replicated point counts of 10 min each, out to a maximum distance of 100 m, were conducted in each spring between 1999 and 2018 at a total of 373 locations in the Hubbard Brook Experimental Forest in New Hampshire. This data set contains the data for the following 13 species:
- AMRE - American Redstart 
- BAWW - Black-and-white Warbler 
- BHVI - Blue headed Vireo 
- BLBW - Blackburnian Warbler 
- BLPW - Blackpoll Warbler 
- BTBW - Black-throated Blue Warbler 
- BTNW - Black-throated Green Warbler 
- CAWA - Canada Warbler 
- MAWA - Magnolia Warbler 
- MYWA - Myrtle Warbler 
- NAWA - Nashville Warbler 
- OVEN - Ovenbird 
- REVI - Red-eyed Vireo 
Usage
data("HubbardBrook")Format
HubbardBrook is a list with 4 elements:
- counts
- a sites x replicates x years x species array of counts, the number of unique individuals detected within 50m of the point count location. 
- sitecov
- a data frame with rows for 373 sites and the following columns: - PlotID : a numeric site identifier. 
- Longitude : the longitude of the point (WGS84); 2 sites have missing data. 
- Latitude : the latitude of the point (WGS84); 2 sites have missing data. 
- Elev : the elevation of the point, m. 
- Aspect : aspect of the sample location (degrees). 
- Slope : slope of the sample location. 
- Forest : general forest cover type of sample location. 
 
- dates
- a sites x replicates x years array of ordinal day of count (1 Jan = 1); NA entries occur when surveys were not carried out. 
- times
- a sites x replicates x years array with the start time of the survey, hours after midnight. 
Source
Rodenhouse N.L. & Sillett, T.S. (2019) Valleywide Bird Survey, Hubbard Brook Experimental Forest, 1999-2016 (ongoing). Environmental Data Initiative. <https://doi.org/10.6073/pasta/faca2b2cf2db9d415c39b695cc7fc217>. Dataset accessed 2020-01-07.
References
Betts, M.G., Rodenhouse, N.L., Sillett, T.S., Doran, P.J. & Holmes, R.T. (2008). Dynamic occupancy models reveal within-breeding season movement up a habitat quality gradient by a migratory songbird. Ecography 31:592–600.
Goetz, S.J., Steinberg, D., Betts, M.G., Holmes, R.T., Doran, P.J., Dubayah, R., & Hofton, M. (2010). Lidar remote sensing variables predict breeding habitat of a Neotropical migrant bird. Ecology 91:1569–1576.
Van Tatenhove, A., Filiberti, E., Sillett, T.S., Rodenhouse, N.L. & Hallworth, M. T. (2019). Climate-related distribution shifts of migratory songbirds and sciurids in the White Mountain National Forest. Forests 10:84.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 8.2.
Examples
data(HubbardBrook)
str(HubbardBrook)
Data from the Swiss Breeding Bird Survey MHB 2014
Description
The Swiss breeding bird survey ("Monitoring Häufige Brutvögel" MHB) has monitored the populations of approximately 150 common species since 1999. The MHB sample consists of 267 1-km squares that are laid out as a grid across Switzerland. Fieldwork is conducted by about 200 skilled birdwatchers annually, most of them volunteers. Avian populations are monitored using a simplified territory mapping protocol (also called spot-mapping West of the Atlantic), where each square is surveyed up to three times during the breeding season (only twice above the tree line around 2000 m elevation). Surveys are conducted along a transect that does not change over the years and is 4-6 kms in length on average.
The list MHB2014 has the full survey data for the year 2014.
Usage
data("MHB2014")Format
MHB2014 is a list with 5 elements:
- species
- a data frame with rows for 158 species, including 15 species not recorded in the year 2014, and the following columns: - specid : a numeric species ID based on phylogeny. 
- latabb : a 6-letter abbreviation of the Latin name. 
- engname : the English name. 
- latname : the Latin name. 
- body.length : body length in cm. 
- body.mass : body mass in g. 
- wing.span : wing span in cm. 
 
- sites
- a data frame with rows for 267 1x1 km quadrat, including 1 quadrat not surveyed in 2014, and the following columns: - siteID : an alphanumeric site identifier. 
- coordx : the x coordinate of the center of the quadrat; the coordinate reference system is intentionally not specified. 
- coordy : the y coordinate of the center of the quadrat. 
- elev : the mean elevation of the quadrat in m. 
- rlength : the length of the route walked in the quadrat in km. 
- nsurvey : the number of replicate surveys planned in the quadrat; above the tree-line 2, otherwise 3. 
- forest : percentage forest cover. 
- obs14 : identifying number of the observer. 
 
- counts
- a sites x replicates x species array of counts 
- date
- a sites x replicates matrix with Julian dates of the surveys, 1 April = 1 
- dur
- a sites x replicates matrix with the duration of each survey, mins 
Note
Section 11.3 of the AHM1 book has code to read in data from a CSV file, "MHB_2014.csv". This is a huge file, because the site data are repeated for all 158 species and the species data are repeated for all 267 sites. The MHB2014 list has all the same data, but in a more compact format. See Examples for ways to generate the objects used in the book from the list.
Source
Swiss Ornithological Institute
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 11.3.
Examples
data(MHB2014)
str(MHB2014)
# Create the objects at foot of p.644:
( nsite <- nrow(MHB2014$sites) )    # number of sites in Swiss MHB
nrep <- 3                           # maximum number of replicate surveys per season
( nspec <- nrow(MHB2014$species) )  # 158 species occur in the 2014 data
# Check the dimensions of the 'count' array:
dim(MHB2014$count) == c(nsite, nrep, nspec)
# Create the detection/nondetection matrix 'y':
y <- MHB2014$count > 0        # this is logical, convert to integer
storage.mode(y) <- "integer"  # don't use 'as.integer', that strips out dimensions and names
str(y)
# Pull out and check the data for common chaffinch, p.645:
head(tmp <- y[, , "Common Chaffinch"])
tail(tmp)
Camera trap data for Bobcat, Coyote and Red Fox
Description
Camera trap (detection/nondetection) data for Bobcat (Lynx rufus), Coyote (Canis latrans) and Red Fox (Vulpes vulpes) from six Mid-Atlantic states in the eastern United States analyzed by Rota et al (2016).
Usage
data("MesoCarnivores")Format
MesoCarnivores is a list with 4 elements:
- bobcat, coyote, redfox
- 1/0 detection data for the respective species: matrices with rows for 1437 sites x 3 replicates, where each replicate corresponds to 1 week of observations. 
- sitecov
- a data frame with rows for 1437 sites and the following columns: - Dist_5km : the proportion of disturbed land in the surrounding 5km. 
- HDens_5km : housing density in the surrounding 5km. 
- Latitude : the latitude of the point, decimal degrees divided by 100. 
- Longitude : the longitude, decimal degrees divided by 100. 
- People_site : number of people photographed at a site divided by 1000. 
- Trail : 1 if the camera was located on a trail, 0 otherwise. 
 
Source
Data courtesy of eMammal, Roland Kays, Arielle Parsons, and their group at the North Carolina Museum of Natural Sciences and North Carolina State University.
References
Rota, C.T., Ferreira, M.A.R., Kays, R.W., Forrester, T.D., Kalies, E.L., McShea, W.J., Parsons, A.W., & Millspaugh, J.J. (2016) A multispecies occupancy model for two or more interacting species. Methods in Ecology and Evolution, 7, 1164-1173.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 8.2.
Examples
data(MesoCarnivores)
str(MesoCarnivores)
Data for 1ha quadrats from the Swiss Breeding Bird Atlas
Description
A 1ha-scale subset of the count data from the most recent Swiss Breeding Bird Atlas (Knaus et al. 2018). The data consists of replicate counts of birds from 2,318 1ha quadrats for 78 passerine species. Each quadrat is surveyed up to three times during the breeding season (only twice above the tree line).
Usage
data("SwissAtlasHa")Format
SwissAtlasHa is a list with 3 elements:
- counts
- a sites x replicates x species array of counts 
- sites
- a data frame with rows for 2318 1ha quadrats and the following columns: - x : the x coordinate of the center of the quadrat; the coordinate reference system is intentionally not specified. 
- y : the y coordinate of the center of the quadrat. 
- elevation : the mean elevation of the quadrat, m. 
- slope : the slope. 
- north : the aspect, with due south = -1, due north = 1. 
- forest : proportion of forest cover. 
- nsurvey : the number of replicate surveys carried out in the quadrat. 
 
- date
- a sites x replicates matrix with Julian dates of the surveys, 1 January = 1 
Source
Swiss Ornithological Institute
References
Knaus, P., S. Antoniazza, S. Wechsler, J. Guélat, M. Kéry, N. Strebel, & T. Sattler (2018) Brutvogelatlas 2013–2016. Bestandsentwicklung der Brutvögel der Schweiz und des Fürstentums Liechtensteins (Swiss Breeding Bird Atlas 2013–2016). Schweizerische Vogelwarte, Sempach.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 8.4.2.
Examples
data(SwissAtlasHa)
str(SwissAtlasHa)
# display the species names:
dimnames(SwissAtlasHa$counts)[[3]]
Territory-level, multi-state detection/nondetection data for Eagle Owls in Switzerland
Description
Multi-state detection/nondetection data stem from more or less opportunistic surveys in a total of 274 territories of the Eurasian Eagle Owl (Bubo bubo), the largest owl in the world, between 2007 and 2016 (10 years). Sites are all over Switzerland and represent the complete list of sites where a territorial Eagle Owl was ever detected during this period. For site/year combinations with more than 20 observations, 20 were randomly selected for inclusion.
The data set recognizes four observed states: 'nondetection', 'detection of single bird', 'detection of a pair without breeding evidence' and 'detection of a pair with confirmed breeding'; these being the classical criteria of bird atlases for possible, probable or confirmed breeding. For the analysis in the book, the last two are aggregated into a single state 'Pair'.
Usage
data("SwissEagleOwls")Format
SwissEagleOwls is a list with 2 elements:
- obs
- a data frame with a row for each of the 5974 observations and the following columns: - site_name : a numeric site ID. 
- year : the year of the observation. 
- jdate : the Julian date of the observation (1 = Jan 1st). 
- y : the observed state: 0 = species not detected; 1 = detection of a single bird; 2 = detection of a pair without evidence of breeding; 3 = detection of a pair with evidence of breeding. 
 
- sites
- a data frame with rows for 274 1x1km quadrats containing the sites, and the following columns: - site_name : the numeric site ID, corresponding to site_name in the obs data frame. 
- region : codes for six regions of Switzerland. 
- elev : the elevation of the center of the quadrat, m. 
- forest : proportion of the quadrat covered by forest. 
- slope : the slope of the quadrat in degrees. 
- roads : total road length in the quadrat, m. 
 
Source
Swiss Ornithological Institute
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 6.
Examples
data("SwissEagleOwls")
str(SwissEagleOwls)
Data for Swiss Marbled White butterflies
Description
Detection/nondetection data for the Marbled White butterfly (Melanargia galathea) collected in the canton of Aargau, Switzerland, as part of the Aargau Biodiversity Monitoring Program (LANAG). 519 study plots were each sampled once every five years from 1998 to 2010 with 11 visits each year, resulting in 1337 observed detection histories.
Usage
data("SwissMarbledWhite")Format
A data frame with 1337 rows and the following columns:
- site
- identification number for site 
- year
- the year of the record 
- Day1, Day2, Day3, Day4, Day5, Day6, Day7, Day8, Day9, Day10, Day11
- Julian date of the survey, 1 = 1st April. NA indicates that the date of the survey was not recorded; see Examples for a way to impute dates. 
- y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11
- 1 if the species was recorded during the survey, 0 otherwise. 
Source
Biodiversity Monitoring Program in the canton of Aargau, Switzerland (Courtesy Isabelle Flöss, Abteilung Landschaft und Gewässer des Kanton Aargau). See Appendix C of Roth et al. (2014).
References
Roth, T., Strebel, N. and Amrhein, V. (2014), Estimating unbiased phenological trends by adapting site-occupancy models. Ecology, 95: 2144-2154
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 4.13.
See Also
Examples
data(SwissMarbledWhite)
str(SwissMarbledWhite)
# Data preparation for AHM2 section 4.13
y <- as.matrix(SwissMarbledWhite[,14:24])    # Grab detection/nondetection data
DATE <- as.matrix(SwissMarbledWhite[,3:13])  # Grab survey dates
for(t in 1:11) {               # Mean-impute date (but don't transform)
  DATE[is.na(DATE[,t]),t] <- mean(DATE[,t], na.rm=TRUE)
}
year <- SwissMarbledWhite$year
nsites <- length(unique(SwissMarbledWhite$site))
nyears <- length(unique(SwissMarbledWhite$year))
nsurveys <- ncol(y)
nobs <- nrow(y)
Data for Red Squirrels in Switzerland from the Swiss breeding bird survey MHB
Description
A file with detection/nondetection data for the Red Squirrel (Sciurus vulgaris) in 265 1 km2 survey quadrats in Switzerland for 2007, together with covariates (data for the two remaining among the total of 267 quadrats in that monitoring program were not available). See Examples for code to load the data.
Format
The file SwissSquirrels.txt is a tab-delimited text file with 265 rows and the following columns:
- spec.name : the species name. 
- coordx : the x coordinate of the center of the quadrat; the coordinate reference system intentionally not specified. 
- coordy : the y coordinate of the center of the quadrat. 
- ele : the mean elevation of the quadrat, m. 
- route : the length of the route walked in the quadrat, km. 
- forest : percentage forest cover. 
- det071, det072, det073 : 1/0 detection data for 3 survey occasions in 2007. 
- date071, date072, date073 : Julian date for the 3 survey occasions (1 = 1st April). 
- dur071, dur072, dur073 : duration of the 3 survey occasions (mins). 
Source
Swiss Ornithological Institute
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.9.
Examples
# To read in the text file
fn <- file.path(find.package("AHMbook"), "extdata", "SwissSquirrels.txt")
data <- read.table(fn, header = TRUE)
str(data)
Data from the Swiss Breeding Bird Survey for 6 species of tits
Description
The Swiss breeding bird survey ("Monitoring Häufige Brutvögel" MHB) has monitored the populations of approximately 150 common species since 1999. The MHB sample consists of 267 1-km squares that are laid out as a grid across Switzerland. Fieldwork is conducted by about 200 skilled birdwatchers annually, most of them volunteers. Avian populations are monitored using a simplified territory mapping protocol (also called spot-mapping West of the Atlantic), where each square is surveyed up to three times during the breeding season (only twice above the tree line around 2000 m elevation). Surveys are conducted along a transect that does not change over the years and is 4-6 kms in length on average.
The list SwissTits has the data for six species of tits from 2004 to 2013. There are some missing values: see Details.
Usage
data("SwissTits")Format
SwissTits is a list with 5 elements:
- species
- a data frame with a row for each species and the following columns: - specid : a numeric species ID based on phylogeny. 
- latname : a 6-letter abbreviation of the Latin name. 
- name : the English name. 
 
- sites
- a data frame with a row for each 1x1 km quadrat and the following columns: - siteID : an alphanumeric site identifier. 
- coordx : the x coordinate of the center of the quadrat; the coordinate reference system intentionally not specified. 
- coordy : the y coordinate of the center of the quadrat. 
- AQ : an identifier for the 10km x 10km block called "Atlas Quadrat" within which the site falls. 
- AQ.coordx : the x coordinate of the center of the AQ. 
- AQ.coordy : the y coordinate of the center of the AQ. 
- elev : the mean elevation of the quadrat, m. 
- rlength : the length of the route walked in the quadrat, km. 
- forest : percentage forest cover. 
 
- counts
- a sites x replicates x years x species array of counts 
- date
- a sites x replicates x years array with Julian dates of the surveys, 1 April = 1 
- dur
- a sites x replicates x years array with the duration of each survey, mins 
Details
Missing values in the date array indicate that the corresponding survey was not carried out.
On 26 occasions when surveys were carried out, the duration was not recorded, resulting in additional NAs in the dur array.
A new method for recording breeding territories was introduced in 2004, but the old protocol was in use at some sites until 2013. Surveys with the old protocol have the counts shown as NA in the count array.
Note
Sections 6.9.1 and 6.13.1 of the AHM1 book have code to read in data from a CSV file, "SwissTits_mhb_2004_2013.csv". The SwissTits list has all the same data in a more compact format. See Examples for ways to generate the objects used in the book from the list.
Source
Swiss Ornithological Institute
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.9.1 and 6.13.1.
Examples
data(SwissTits)
str(SwissTits)
# Section 6.9.1, foot of p.257 and top of p.258
# -------------
y0 <- SwissTits$counts[, , '2013', 'Great tit']
( NA.sites <- which(rowSums(is.na(y0)) == 3) ) # Unsurveyed sites
y <- y0[-NA.sites, ]                 # Drop them from the count data
tits <- SwissTits$sites[-NA.sites, ] # Also drop from the site covariates
str(y)
# Get date and duration data for 2013, without the NA.sites rows:
date <- SwissTits$date[-NA.sites, , '2013']
dur <- SwissTits$dur[-NA.sites, , '2013']
# Section 6.13.1, p.303
# --------------
# Get the count data for 2013 (all species)
y0 <- SwissTits$count[, , '2013', ]
str(y0)
# We keep the sites with count data, remove those with 3 NAs
# See which sites have counts in 2013 for (say) Great tits:
keep <- which(rowSums(is.na(y0[, , "Great tit"])) != 3)
length(keep)
y <- y0[keep, , ]
# Get the covariate data for the 'keep' sites
elev <- SwissTits$sites$ele[keep]
route <- SwissTits$sites$rlength[keep]
forest <- SwissTits$sites$forest[keep]
date <- SwissTits$date[keep, , '2013']  # Survey date
dur <- SwissTits$dur[keep, , '2013']    # Survey duration
# Degrade counts to detection/nondetection data
y3DRN <- y
y3DRN[y3DRN > 1] <- 1
str(y3DRN)
# Final detail...
( spec.names <- paste0(SwissTits$species$name, "s") )
Data for UK Marbled White butterflies
Description
Data from the UK Butterfly Monitoring Scheme (UKBMS) for the Marbled White butterfly (Melanargia galathea), a fairly widespread species that flies in a single generation per year in the UK.
Most survey sites are chosen opportunistically and represent fixed transects of variable and often unknown length, which are walked between two and 26 times per year starting in early April. Butterflies are counted within an imaginary 5m x 5m 'moving box' along the transects (i.e., according to a Pollard walk protocol).
We restrict the data to sites where the marbled white was ever recorded, the years 1991–2015 and those 80 sites where counts took place in at least 20 of these 25 years.
Usage
data("UKmarbledWhite")Format
A data frame with 9651 rows and the following columns:
- C
- count, the number of butterflies recorded 
- site
- identification number for site 
- year
- the year of the record 
- north
- approximate northing of the site in the British National Grid, in meters 
- date
- Julian date of the survey, 1 = 1st January 
Source
UK Butterfly Monitoring Scheme (UKBMS) data © copyright and database right Butterfly Conservation, the Centre for Ecology and Hydrology, British Trust for Ornithology, and the Joint Nature Conservation Committee, 2018. The UKBMS is indebted to all volunteers who contribute data to the scheme.
https://www.ceh.ac.uk/our-science/projects/uk-butterfly-monitoring-scheme
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 1.8.2.
See Also
Examples
data(UKmarbledWhite)
str(UKmarbledWhite)
Cross correlations for MCMC output
Description
A wrapper for coda::crosscorr, which calculates cross-correlations between posterior draws of parameters in Markov chain Monte Carlo output. When the output has hundreds of parameters, the matrix produced by crosscorr is unwieldy, and bigCrossCorr extracts only those greater than a given threshold.
Usage
bigCrossCorr(x, big = 0.6, digits = 3)
Arguments
| x | an  | 
| big | only values below -big or above +big will be returned | 
| digits | the number of decimal places to return | 
Value
A data frame with 2 columns for the names of parameters and a 3rd column with the cross-correlation.
Author(s)
Mike Meredith
See Also
crosscorr in package coda.
Converts capture-histories to an m-array
Description
Converts capture-histories to an m-array for use in a Cormack-Jolly-Seber (CJS) model.
Usage
ch2marray(CH)
Arguments
| CH | An individuals x time matrix of capture records, 1 if captured, 0 otherwise, no missing values. | 
Value
An m-array, a (years-1) x years matrix, where element [i, j] contains the number of individuals released in year i and recaptured in year j+1 (by definition no recaptures can occur in year 1). The last column contains the number of individuals released in year i and never recaptured.
Author(s)
Marc Kéry & Andy Royle, modified from code in Kéry and Schaub (2012).
References
Kéry and Schaub (2012) Bayesian population analysis using WinBUGS - a hierarchical perspective, Academic Press.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 3.4.1
Examples
data(willowWarbler)
ch <- as.matrix(willowWarbler$birds[ , 1:11]) # extract capture-histories.
dim(ch)
ch2marray(ch)
Count data from the Swiss Breeding Bird Survey MHB for crested tits from 1999 to 2015
Description
The Swiss breeding bird survey ("Monitoring Häufige Brutvögel" MHB) has monitored the populations of approximately 150 common species since 1999. The MHB sample consists of 267 1-km squares that are laid out as a grid across Switzerland. Fieldwork is conducted by about 200 skilled birdwatchers annually, most of them volunteers. Avian populations are monitored using a simplified territory mapping protocol (also called spot-mapping West of the Atlantic), where each square is surveyed up to three times during the breeding season (only twice above the tree line around 2000 m elevation). Surveys are conducted along a transect that does not change over the years and is 4-6 kms in length on average.
The data frame crestedTit has the data for Crested Tits Parus cristatus from 1999 to 2015. There are some missing values: see Details.
Usage
data("crestedTit")Format
crestedTit is a data frame with 267 rows and 131 columns:
- coordx, coordy
- the x and y coordinates of the center of the quadrat; the coordinate reference system intentionally not specified. 
- elev
- the mean elevation of the quadrat, in meters. 
- forest
- percentage forest cover 
- nsurveys
- the number of replicate surveys planned in the quadrat; above the tree-line 2, otherwise 3. 
- 1999 to 2016
- total number of territories observed for the year. 
- date991 to date163
- Julian date of the survey, 1 = 1st January; the first 2 digits in the column name indicate the year and the 3rd digit the survey. 
- dur991 to dur163
- duration of the survey, in minutes; the first 2 digits in the column name indicate the year and the 3rd digit the survey. 
Details
Missing values in the date array indicate that the corresponding survey was not carried out.
On 26 occasions when surveys were carried out, the duration was not recorded, resulting in additional NAs in the dur array.
A new method for recording breeding territories was introduced in 2004, but the old protocol was in use at some sites until 2013. Surveys with the old protocol have the counts shown as NA in the count array.
See also Chapter 6 in Kéry & Royle (2016) for further description of the survey and the data it produces.
Source
Swiss Ornithological Institute
References
Kéry & Royle (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.
Kéry & Royle (2021) Applied Hierarchical Modeling in Ecology AHM2 - 1.
Examples
data(crestedTit)
str(crestedTit)
# Prepare objects needed for section AHM2 - 1.3 and following
C <- as.matrix(crestedTit[, 6:23]) # matrix of counts
year <- 1999:2016
datetmp <- as.matrix(crestedTit[, 24:77])  # matrix of Julian dates
datefull <- array(datetmp, c(nrow(datetmp), 3, ncol(datetmp)/3))
    # site x rep x year array
durtmp <- as.matrix(crestedTit[, 78:131])  # matrix of survey durations
durfull <- array(durtmp, c(nrow(durtmp), 3, ncol(durtmp)/3))
    # site x rep x year array
# Get mean date and duration of survey for each site and year
mdate <- apply(datefull, c(1,3), mean, na.rm=TRUE)
mdate[is.nan(mdate)] <- NA  # Replace NaN with NA
mdur <- apply(durfull, c(1,3), mean, na.rm=TRUE)
mdur[is.nan(mdur)] <- NA
# For Sec 1.5, we need standardised covariates with missing values imputed
elev.sc <- as.numeric(scale(crestedTit$elev))     # site elevation
forest.sc <- as.numeric(scale(crestedTit$forest)) # site forest cover
mean.date <- mean(mdate, na.rm=TRUE)         # mean survey date per site
sd.date <- sd(mdate, na.rm=TRUE)
mdate.sc <- (mdate - mean.date) / sd.date
mdate.sc[is.na(mdate.sc)] <- 0               # mean imputation
mean.dur <- mean(mdur, na.rm=TRUE)           # mean survey duration per site
sd.dur <- sd(mdur, na.rm=TRUE)
mdur.sc <- (mdur - mean.dur) / sd.dur
mdur.sc[is.na(mdur.sc)] <- 0                 # mean imputation
# For sections 1.6 and 1.7, we remove sites with no crested tits recorded,
#   or recorded in only one year:
nzero <- apply(C == 0, 1, sum, na.rm=TRUE) # number of zero years per site
sel <- nzero <= 1  # TRUE if site has 2 or more years of non-zero data
Data from the Swiss Breeding Bird Survey MHB for European Crossbill from 2001 to 2012
Description
The Swiss breeding bird survey ("Monitoring Häufige Brutvögel" MHB) has monitored the populations of approximately 150 common species since 1999. The MHB sample consists of 267 1-km squares that are laid out as a grid across Switzerland. Fieldwork is conducted by about 200 skilled birdwatchers annually, most of them volunteers. Avian populations are monitored using a simplified territory mapping protocol (also called spot-mapping West of the Atlantic), where each square is surveyed up to three times during the breeding season (only twice above the tree line around 2000 m elevation). Surveys are conducted along a transect that does not change over the years and is 4-6 kms in length on average.
The data frame crossbillAHM has the data for European Crossbill (Loxia curvirostra)  from 2001 to 2012.
A variant of this data set for 1999 to 2007 only is included in package unmarked.
Usage
data("crossbillAHM")Format
crossbillAHM is a data frame with 267 rows and 77 columns:
- coordx, coordy
- the x and y coordinates of the center of the quadrat; the coordinate reference system intentionally not specified. 
- elev
- the mean elevation of the quadrat, m. 
- forest
- percentage forest cover 
- nsurveys
- the number of replicate surveys planned in the quadrat; above the tree-line 2, otherwise 3. 
- det011 to det123
- 1 if the species was detected in the quadrat, 0 otherwise; NA if the corresponding survey was not carried out; the first 2 digits indicate the year and the 3rd digit the survey. 
- date011 to date123
- Julian date of the survey, 1 = 1st January; the first 2 digits indicate the year and the 3rd digit the survey; NA if the corresponding survey was not carried out. 
Source
Swiss Ornithological Institute
References
Kéry, M & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 4.
Examples
data(crossbillAHM)
str(crossbillAHM)
# Extract data as site x survey x year arrays
ytmp <- as.matrix(crossbillAHM[, 6:41]) # matrix of detections
y <- array(ytmp, c(nrow(ytmp), 3, ncol(ytmp)/3))
datetmp <- as.matrix(crossbillAHM[, 42:77])  # matrix of Julian dates
date <- array(datetmp, c(nrow(datetmp), 3, ncol(datetmp)/3))
Point count and spot-mapping data for Chestnut-sided Warbler
Description
Chestnut-sided Warblers (Setophaga pensylvanica) in 38 “wildlife openings” in the White Mountain National Forest were surveyed using point counts and spot-mapping during June and July 2004 as part of Richard Chandler's MS thesis. Counts are for birds detected by song at each point. Two of the larger openings had more than one point, giving 43 points in all. Surveys lasted 10 mins and were divided into intervals of (0–2], (2–5], and (5–10] min. Each point was surveyed 3 times, except for one point surveyed only twice. The spot-mapping data come from intensive surveys of each patch recording individual sightings and nest locations, and using counter-singing to separate adjacent territories.
Usage
data("cswa")Format
cwsa is a list with 3 components:
- counts
- a points x intervals x occasions array of counts; NAs correspond to the missing survey. 
- covs
- a data frame with rows for 43 points and the following columns: - time1 : time of start of first survey, hr; NA for the missing survey. 
- time2 : same for second survey. 
- time3 : same for third survey. 
- date1 : date of first survey, NA for the missing survey. 
- date2 : date of second survey. 
- date3 : date of third survey. 
- obs1 : observer ID for first survey. 
- obs2 : observer ID for second survey. 
- obs3 : observer ID for third survey. 
- patchArea : patch area, ha. 
- plotArea : size of point count area, ha. Most were 50-m radius but some were smaller because they overlapped adjacent forests which weren’t the focus of the study. 
- woodHt : woody vegetation height, m. 
- woodCov : proportion covered by woody vegetation. 
 
- spotMaps
- a data frame with rows for 38 patches and the following columns: - CSWA : estimated population of chestnut-sided warblers in each patch obtained from spot-mapping. 
- parea : patch area, ha. 
 
Source
David King and colleagues.
References
Chandler, R.B., King, D.I., & Chandler, C.C. (2009) Effects of management regime on the abundance and nest survival of shrubland birds in wildlife openings in northern New England, USA. Forest Ecology and Management, 258, 1669-1676.
Chandler, R.B., Royle, J.A., & King, D.I. (2011) Inference about density and temporary emigration in unmarked populations. Ecology, 92, 1429-1435.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 2.
Examples
data(cswa)
str(cswa)
# Replace the NAs in the covariates with the mean from other surveys:
covs <- cswa$covs
covs$time1[is.na(covs$time1)] <- mean(covs$time1, na.rm=TRUE)
covs$date1[is.na(covs$date1)] <- mean(covs$date1, na.rm=TRUE)
# Now use covs instead of cswa$covs
Simulate count data under a binomial N-mixture model
Description
Function to simulate point counts replicated at M sites during J occasions. Population closure is assumed for each site. Expected abundance may be affected by elevation (elev), forest cover (forest) and their interaction. Expected detection probability may be affected by elevation, wind speed (wind) and their interaction. Used in AHM1 to illustrate how a data set under some specific model of interest can be simulated.
Usage
data.fn(M = 267, J = 3, mean.lambda = 2, beta1 = -2, beta2 = 2, beta3 = 1,
   mean.detection = 0.3, alpha1 = 1, alpha2 = -3, alpha3 = 0, show.plot = TRUE)
Arguments
| M | Number of spatial replicates (sites) | 
| J | Number of temporal replicates (occasions) | 
| mean.lambda | Mean abundance at value 0 of abundance covariates | 
| beta1 | Main effect of elevation on abundance | 
| beta2 | Main effect of forest cover on abundance | 
| beta3 | Interaction effect on abundance of elevation and forest cover | 
| mean.detection | Mean detection prob. at value 0 of detection covariates | 
| alpha1 | Main effect of elevation on detection probability | 
| alpha2 | Main effect of wind speed on detection probability | 
| alpha3 | Interaction effect on detection of elevation and wind speed | 
| show.plot | if TRUE, plots of the data will be displayed; set to FALSE if you are running simulations | 
Value
A list with the input arguments and the following additional elements:
| elev | Scaled elevation (a vector of length M) | 
| forest | Scaled forest cover (a vector of length M) | 
| wind | Scaled wind speed (an M x J matrix) | 
| lambda | Expected number of individuals at each site (a vector of length M) | 
| N | Realized number of individuals at each site (a vector of length M) | 
| p | Probability of detection for each survey (an M x J matrix) | 
| C | The number of detections for each survey (an M x J matrix) | 
| Ntotal | Total abundance,  | 
| psi.true | True occupancy in sample | 
| summaxC | Sum of max counts (all sites) | 
| psi.obs | Observed occupancy in sample | 
Note
The colors used for points in some of the plots indicate different temporal replicates.
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 4.3.
Examples
# Generate a simulated data set with default arguments and look at the structure:
tmp <- data.fn()
str(tmp)
str(data.fn(J = 2))              # Only 2 surveys
str(data.fn(J = 1))              # No temporal replicate
str(data.fn(M = 1, J = 100))     # No spatial replicates, but 100 counts
str(data.fn(beta3 = 1))          # With interaction elev-wind on p
str(data.fn(M = 267, J = 3, mean.lambda = 2, beta1 = -2, beta2 = 2, beta3 = 1,
  mean.detection = 1))           # No obs. process (i.e., p = 1, perfect detection)
str(data.fn(mean.lambda = 50))   # Really common species
str(data.fn(mean.lambda = 0.05)) # Really rare species
Fictional data for dragonflies
Description
A toy data set with fictional data for 9 individuals of the dragonfly Onychogomphus uncatus from 3 populations in the Spanish Pyrenees.
Usage
data(dragonflies)Format
The format is seven vectors of length 9:
- pop
- a factor indicating which population the individual was drawn from. 
- sex
- a factor indicating the sex of each individual. 
- wing
- wingspan. 
- body
- body length. 
- mites
- number of mites (ectoparasites) counted on each individual. 
- color
- proportion of body yellow as opposed to black. 
- damage
- the number of wings (out of 4) damaged. 
Source
Fictitious data.
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 3.1
Examples
data(dragonflies)
lm(wing ~ pop + body)
Age-classified counts of dusky salamanders from two mid-Atlantic states in the USA
Description
Northern dusky salamanders (Desmognathus fuscus) were sampled in 12 headwater streams in Maryland and the District of Columbia, USA, annually in June and July from 2005 to 2012. Most streams were sampled at two locations separated by at least 100 m along the stream reach for a total of 21 survey locations. At each location, an observer walked two 15-m transects along the stream-bed and turned cover objects, 6 cm in the water and on the bank within 2 m of the stream. Individuals < 35mm snout-to-vent length were classed as juveniles, including both first-year and second-year juveniles. The data are counts of adults and juveniles.
Usage
data("duskySalamanders")Format
duskySalamanders is four-dimensional (21 x 7 x 2 x 2) array of counts, with dimensions:
- site
- a numerical site identifier. 
- year
- the year of the survey, 2005 to 2012. 
- age
- Juveniles or Adults. 
- survey
- June or July. 
Source
Evan Grant and colleagues.
References
Zipkin, E.F. et al (2014) Modeling structured population dynamics using data from unmarked individuals, Ecology, 95(1) 22-29.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 2.9.2
Examples
data(duskySalamanders)
str(duskySalamanders)
Calculates the pair-wise distances between two sets of points
Description
Calculates the Euclidian distance between each of the points defined by the coordinates in 'x' and each of those in 'y'.
Usage
e2dist(x, y = NULL)
Arguments
| x | a 2-column matrix or data frame with the x and y coordinates of a set of points, or a length 2 vector with the coordinates of a single point. | 
| y | an optional 2-column matrix or data frame, or a length 2 vector, with the x and y coordinates of a second set of points; if NULL, y is taken to be the same as x. | 
Value
A nrows(x) x nrows(y) matrix with the pair-wise distances.
Author(s)
Andy Royle
Examples
pts1 <- expand.grid(x = 1:5, y = 6:8)
pts2 <- cbind(x=runif(5, 1, 5), y=runif(5, 6, 8))
require(graphics)
plot(pts1)
points(pts2, pch=19, col='red')
e2dist(x=pts1, y=pts2)
# with a vector argument:
vec <- colMeans(pts2)
e2dist(vec, pts2)
# with y = NULL:
e2dist(x=pts2)
Functions to return fit statistics
Description
fitstats produces sum of squared errors, Chi-squared statistic and Freeman-Tukey statistic used in parboot GOF tests throughout the book, starting with AHM1 - 7.5.4. fitstats2 produces the above, plus corresponding statistics based on total detection frequency across replicate surveys; see AHM1 - 7.9.3.
Usage
fitstats(fm)
fitstats2(fm)
Arguments
| fm | A fitted model object as returned by unmarked. | 
Value
For fitstats, a named vector of length 3 with sum of squared errors (SSE), Chi-squared statistic (Chisq) and Freeman-Tukey (freemanTukey).
For fitstats2, a named vector of length 6 with the same information plus corresponding statistics based on numbers(SSE.n), (Chisq.n) and (freemanTukey.n).
Author(s)
Marc Kéry & Andy Royle
References
Kéry & Royle (2016) Applied Hierarchical Modeling in Ecology AHM1 - 7.5.4 and 7.9.3.
Compute results of a Joint Species Distribution Model (JSDM)
Description
Compute pairwise residual correlation in occurrence or abundance in a Joint Species Distribution Model (JSDM) with latent variables.
getLVcorrMat computes the correlation matrix in residual occurrence or abundance in an latent-variable multi-species occupancy or N-mixture model which has been fit using MCMC (eg, in JAGS or WinBUGS). See Tobler et al. (2019) and Kéry & Royle (2021) chapter 8.
getEcorrMat computes the correlation matrix for the coefficients of the observed environmental covariates.
Usage
getLVcorrMat(lv.coef, type=c("occupancy", "Nmix"), stat=mean)
getEcorrMat(beta, stat=mean)
Arguments
| lv.coef | MCMC chains for the coefficients of the latent variables, typically from the  | 
| beta | MCMC chains for the coefficients of the environment variables (excluding the intercept), typically from the  | 
| type | Indication of whether the model fit was for occupancy data (with a probit link) or an N-mixture model based on count data. | 
| stat | The function used to summarize the MCMC chains for the correlations; if  | 
Value
The relevant correlation matrix, species x species; if stat = NULL, an array, iterations x species x species, with the MCMC chains.
Author(s)
Mathias Tobler.
References
Tobler, M. et al. (2019) Joint species distribution models with species correlations and imperfect detection. Ecology, 100(8), e02754.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 8.4.2 and 8.5.4.
Plot the results for a Gaussian state-space model
Description
Plot the observed time-series and the estimated population trajectories for a multivariate Gaussian state-space model (SSM).
Usage
graphSSM(ssm, C)
Arguments
| ssm | An object of class  | 
| C | The original count data: an individuals x year matrix of counts, no missing values. | 
Value
No output, the function is called for plotting.
Author(s)
Marc Kéry & Andy Royle, adapted for multivariate SSMs from the SSM graphing function graph.ssm in Chapter 5 in Kéry & Schaub (2012).
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology Section 1.6
Kéry, M. & Schaub, M. (2012) Bayesian population analysis using WinBUGS - a hierarchical perspective, Academic Press.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 1.6.
Count data from the Swiss Breeding Bird Survey for Green Woodpeckers from 2004 to 2017
Description
The Swiss breeding bird survey ("Monitoring Häufige Brutvögel" MHB) has monitored the populations of approximately 150 common species since 1999. The MHB sample consists of 267 1-km squares that are laid out as a grid across Switzerland. Fieldwork is conducted by about 200 skilled birdwatchers annually, most of them volunteers. Avian populations are monitored using a simplified territory mapping protocol (also called spot-mapping West of the Atlantic), where each square is surveyed up to three times during the breeding season (only twice above the tree line around 2000 m elevation). Surveys are conducted along a transect that does not change over the years and is 4-6 kms in length on average.
The data frame greenWoodpecker has count data for Green Woodpeckers Picus viridis from 2004 to 2017. There are some missing values: see Details.
Usage
data("greenWoodpecker")Format
greenWoodpecker is a data frame with 267 rows and 146 columns:
- x, y
- the x and y coordinates of the center of the quadrat; the coordinate reference system intentionally not specified. 
- nsurveys
- the number of replicate surveys planned in the quadrat; above the tree-line 2, otherwise 3. 
- elevation
- the mean elevation of the quadrat, m. 
- forest
- percentage forest cover 
- route.length
- the length of the route 
- X2004.1.count to X2017.3.count
- number of territories observed for each survey; 2004.1 indicates the first survey in 2004. 
- X2004.1.date to X2017.3.date
- Julian date of the survey, 1 = a weekend near 15 April. 
- X2004.1.time to X2017.3.time
- time spent in the quadrat during the survey, mins. 
- X2004.observer to X2017.observer
- numerical code identifying the observer. 
Details
Missing values in the count and date columns indicate that the corresponding survey was not carried out.
On 26 occasions when surveys were carried out, the time taken was not recorded, resulting in additional NAs in the time columns.
Source
Swiss Ornithological Institute
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 2.2.
Examples
data(greenWoodpecker)
str(greenWoodpecker)
Helper function to draw scale for image
Description
Plots a scale in the right margin of a plot, typically an image plot.
Usage
image_scale(z, col, x, y = NULL, size = NULL, digits = 2, labels = c("breaks", "ranges"),
      cex.legend=1)
Arguments
| z | a vector of at least 2 numbers which define the range of values for which colors should be plotted; this should be the same as the value of  | 
| col | a vector of colors, use the same as in the image | 
| x | the x coordinate of the left edge of the scale bar; or a list with components x and y, each of length 2, giving the x and y coordinates of the edges of the scale bar. | 
| y | the y coordinate of the bottom edge of the scale bar, ignored if  | 
| size | the size of the boxes making up the scale bar, a length 2 vector with the width and height, or a scalar, in which case width = height; ignored if  | 
| digits | the number of decimal places to display | 
| labels | if "breaks", the dividing lines in the scale bar are labeled, if "ranges", each box is labeled with its range; may be abbreviated | 
| cex.legend | the magnification to be used for axis annotation | 
Value
None, used for its plotting side effect.
Note
This function appears in the book text as image.scale; renamed here to avoid confusion with generic image functions.
Author(s)
Andy Royle.
References
Royle, J.A., Chandler, R.B., Sollmann, R., & Gardner, B. (2014) Spatial capture-recapture, Elsevier.
Examples
# uses the built-in volcano data set
require(grDevices) # for colours
require(graphics)
par(mar = c(3,3,3,6))  # make the right margin wide enough
image(t(volcano)[ncol(volcano):1,], col=terrain.colors(12))
image_scale(volcano, col=terrain.colors(12))
# Try placing the scale bar on the left
par(mar = c(3,8,3,1))  # make the left margin wide enough
image(t(volcano)[ncol(volcano):1,], col=terrain.colors(12))
image_scale(volcano, col=terrain.colors(12), x= -0.28, digits=0, cex.legend=1.2)
# Trial and error needed to get the x value right.
Simulate open distance sampling data for the Island Scrub Jays
Description
Function to simulate open distance sampling data for the Island Scrub Jays, based on Sollmann et al (2015).
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
issj.sim(B, db, lam, sigma, phi, gamma, npoints, nyrs, nbsize = -1.02)
Arguments
| B | Radius of the circle sampled; a site is a circle of radius B around a point. | 
| db | Distance categories; a vector of cut points from 0 to B inclusive. | 
| lam | Expected abundance per site, a vector of length  | 
| sigma | Scale parameter of the half-normal detection function at each site,  a vector of length  | 
| phi | Survival probability | 
| gamma | Recruitment rate | 
| npoints | Number of sites where point counts are conducted. | 
| nyrs | Number of years | 
| nbsize | Size parameter for the negative binomial distribution used to generate individual counts per site for year 1. | 
Value
A list with the following elements:
| NcList | A list with one element per year, with distances of all animals from the point. | 
| detList | A list with one element per year, a  | 
| N | The (true) number of animals at each point for each year, a  | 
| cell | The site IDs where point counts are conducted. | 
| y | 
 | 
| dclass | a vector with the distance class for each animal detected | 
| site | a corresponding vector with the site for each animal detected | 
| nsite | the number of sites in the study area | 
| lam,phi,gamma,sigma | the values of the corresponding arguments | 
Author(s)
Marc Kéry & Andy Royle, based on Sollmann et al (2015)
References
Sollmann, R., Gardner, B., Chandler, R.B., Royle, J.A., Sillett, T.S. (2015) An open population hierarchical distance sampling model. Ecology 96, 325-331.
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 9.7.1.
Examples
# A toy example with just 20 sites
set.seed(2015)
tmp <- issj.sim(B = 300,
    db = c(0,50, 100, 150, 200, 250, 300),
    lam = c(3.01, 7.42, 20.51, 1.60, 0.42, 3.42, 8.24, 0.66, 0.32, 0.39, 0.46, 0.52,
      0.63, 0.36, 4.93, 0.47, 2.07, 0.42, 0.48, 0.47),
    sigma = c(110, 91, 70, 114, 135, 101, 88, 130, 133, 134, 134, 135, 131, 135, 100,
      135, 110, 135, 134, 135),
    phi = 0.6, gamma = 0.35,
    npoints = 15, nyrs = 4)
str(tmp)
# Compare the number detected with the true numbers present
with(tmp, cbind(y, N[cell, ]))
Mapping of residuals
Description
Produces a map of the mean residuals from an N-mixture model fit by function pcount in unmarked. Used in AHM1 - 6.9.3 to produce maps of Switzerland with the residuals for each site.
Usage
map.Nmix.resi(fm, x, y)
Arguments
| fm | the fitted model object | 
| x | x coordinates of each site | 
| y | y coordinates of each site | 
Value
None. Used for its plotting side effects.
Note
In previous versions, the defaults were x = tits$coordx and y = tits$coordy, but those defaults only worked if the data object tits was in the workspace. To run the code on page 263 of AHM1, you now need to specify the coordinates, eg, map.Nmix.resi(fm5, x = tits$coordx, y = tits$coordy).
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.9.3.
Model selection results from a list of 'unmarkedFitOccuFP' models
Description
The unmarked functions fitList and modSel do now work for 'unmarkedFitOccuFP' models, so this stop-gap function is deprecated. Use modSel(fitList(fits=mod.list)) instead.
Usage
modSelFP(mod.list)
Arguments
| mod.list | A list of fitted model objects of class 'unmarkedFitOccuFP' as returned by unmarked. | 
Value
A data frame with a row for each model in the input list and columns for number of parameters, AIC, deltaAIC, AIC weight and cumulative AIC weight.
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 7.2.2.
Compute multinomial cell probabilities
Description
These functions are used internally in the multinomial-Poisson models multinomPois and gmultmix in package unmarked to calculate the multinomial cell probabilities from the "raw" detection probabilities. See piFuns. These functions are customized for the examples in chapter 7 of AHM1 and only allow for 3 capture occasions.
See makePiFuns for functions to construct piFuns with more occasions.
Usage
instRemPiFun(p)
crPiFun(p)
crPiFun.Mb(p)
MhPiFun(p)
Arguments
| p | a numeric matrix with rows for each site and 3 columns, see Details. | 
Details
instRemPiFun defines the relationship between the multinomial cell probabilities and the underlying detection probability parameters in a removal design with 3 sampling periods of unequal length, specifically 2, 3 and 5 minutes. The columns of p give the detection probabilities per unit time (ie, per minute) for each site and sampling period. See AHM1 Section 7.7. This is the same as the function defined in the Examples section of piFuns in unmarked.
crPiFun defines a pi function for capture-recapture design with 3 capture occasions. The columns of p give the detection probabilities for the three occasions. See AHM1 Section 7.8.5. NOTE that this is not the same as the custom crPiFun defined in Section 7.9.1.
crPiFun.Mb defines a pi function for capture-recapture design with 3 capture occasions with a behavioral response. Detection probabilities do not vary with time. Column #1 of p gives the probability of detection for animals not previously detected; column #3 gives detection probability after the first detection; column #2 is ignored. See AHM1 Section 7.8.2.
MhPiFun defines a pi function for a model with individual detection heterogeneity, modeled as a random effect with a logit-normal distribution. Column #1 of p gives the mean probability of detection; p[1, 2] is a value in [0, 1] which controls the scale parameter for the normal distribution; other entries are ignored. See Section 7.8.3.
Value
A matrix with a row for each site corresponding to the rows of p and...
...for instRemPiFun, a column for each detection occasion with the multinomial probability.
...for other functions, a column for each detection history; for 3 detection occasions these are 001, 010, 011, 100, 101, 110, 111.
Author(s)
Richard Chandler, Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 7.
Examples
# Capture probs for 5 sites, with 3 survey periods
( p <- matrix(0.4, nrow=5, ncol=3) )
# Removal model
instRemPiFun(p)
# The corresponding obsToY matrix:
matrix(1, 2, 3)
# Capture-recapture model
crPiFun(p)
# The corresponding obsToY matrix:
matrix(1, 3, 7)
# Capture-recapture model with behavioural response
( pMb <- cbind(rep(0.7, 5), NA, 0.3) )
crPiFun.Mb(pMb)
# The corresponding obsToY matrix:
matrix(1, 3, 7)
# Capture-recapture model with heterogeneity
pMh <- cbind(rep(0.4, 5), NA, NA)
pMh[1, 2] <- 0.3
pMh
MhPiFun(pMh)
# The corresponding obsToY matrix:
matrix(1, 3, 7)
Function to play Royle-Nichols (RN) model
Description
Function generates replicated count data under the binomial N-mixture model of Royle (2004), then 'degrades' the counts to detection/nondetection and fits the Royle-Nichols (RN) model (Royle & Nichols 2003) using unmarked and estimates site-specific abundance.
Usage
playRN(M = 267, J = 3, mean.abundance = 1, mean.detection = 0.3,
    show.plots = TRUE, verbose = TRUE)
Arguments
| M | The number of sites. | 
| J | The number of visits to each site. | 
| mean.abundance | Expected abundance at each site. | 
| mean.detection | Expected detection at each survey at each site. | 
| show.plots | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
| verbose | if FALSE, output to the console will be suppressed. | 
Value
A list with the following elements:
| nsites | The number of sites, equal to  | 
| nvisits | The number of visits, equal to  | 
| coef | A named vector of coefficients for the linear models for expected number and detection probability | 
| slope | Slope of the regression of the estimated number on the true number; 1 if the model is perfect | 
Author(s)
Marc Kéry & Andy Royle
References
Royle, J.A. & Nichols, J.D. (2003) Estimating abundance from repeated presence-absence data or point counts, Ecology, 84, 777-790.
Royle, J.A. (2004) N-mixture models for estimating population size from spatially replicated counts, Biometrics, 60, 108-115.
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.13.1.
Examples
# Run a simulation with the default arguments and look at the results:
playRN()
# Execute the function using various settings
playRN(M = 100, J = 3, mean.abundance = 0.1)  # Increasing abundance
playRN(M = 100, J = 3, mean.abundance = 1)
playRN(M = 100, J = 3, mean.abundance = 5)
playRN(M = 100, J = 3, mean.detection = 0.3)  # Increasing detection
playRN(M = 100, J = 3, mean.detection = 0.5)
playRN(M = 100, J = 3, mean.detection = 0.7)
playRN(M = 100, J = 20)                       # More visits
playRN(M = 1000, J = 3)                       # More sites
Produce some residual plots
Description
Function does diagnostic plots for one binomial N-mixture model fitted with all three mixture distributions currently available in package unmarked: Poisson, negative binomial and zero-inflated Poisson. For each, fitted values vs. observed data and residuals vs. fitted values are plotted.
Usage
plot_Nmix_resi(fmP, fmNB, fmZIP)
Arguments
| fmP | Fitted model object for Poisson distribution. | 
| fmNB | Fitted model object for negative binomial distribution. | 
| fmZIP | Fitted model object for zero-inflated Poisson distribution. | 
Value
None, used for its plotting effect.
Note
This function appears in the text as plot.Nmix.resi; renamed here to avoid confusion with generic plot functions.
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.9.3.
Plot results from posterior predictive check
Description
Function plots results from posterior predictive check in AHM1 section 6.8 for a fitted model object with JAGS.
Usage
ppc.plot(fm)
Arguments
| fm | The fitted model object | 
Value
None, used for its plotting side effect.
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.8.
Simulate a Poisson point process
Description
Simulates animal or plant locations in space according to a homogenous Poisson process. This process is characterized by the intensity, which is the average number of points per (very small)unit area. The resulting point pattern is then discretized to obtain abundance data and  presence/absence (or occurrence) data. The discretization of space is achieved by choosing the cell size. It is used in AHM1 Section 1.1 to help to understand the relationship between point patterns, abundance data and occurrence data (also called presence/absence or distribution data). For a similar and somewhat more sexy version of this function, see simPPe.
Usage
sim.fn(quad.size = 10, cell.size = 1, intensity = 1, show.plot = TRUE)
Arguments
| quad.size | The length of each side of the quadrat (in arbitrary units) | 
| cell.size | The length of each side of the cells into which the quadrat is divided. The ratio of quad.size to cell.size must be an integer. | 
| intensity | The average number of points (animals or plants) per unit area. | 
| show.plot | If TRUE, the results are plotted. Set to FALSE when running simulations. | 
Value
A list with the values of the arguments and the following additional elements:
| exp.N | Expected population size in quadrat | 
| breaks | boundaries of grid cells | 
| n.cell | Number of cells in the quadrat | 
| mid.pt | Cell mid-points | 
| M | Realized population size in quadrat | 
| u1 | x coordinate of each individual | 
| u2 | y coordinate of each individual | 
| N | The number of individuals in each cell (a vector of length n.cell) | 
| z | Presence/absence (1/0) in each cell (a vector of length n.cell) | 
| psi | Proportion of cells occupied, ie, the species is present. | 
Author(s)
Marc Kéry and Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 1.1.
Examples
# Generate a simulated data set with default arguments and look at the structure:
tmp <- sim.fn()
str(tmp)
# Effect of grain size of study on abundance and occupancy (intensity constant)
tmp <- sim.fn(quad.size = 10, cell.size = 1, intensity = 0.5)
tmp <- sim.fn(quad.size = 10, cell.size = 2, intensity = 0.5)
tmp <- sim.fn(quad.size = 10, cell.size = 5, intensity = 0.5)
tmp <- sim.fn(quad.size = 10, cell.size = 10, intensity = 0.5)
# Effect of intensity of point pattern (intensity) on abundance and occupancy
tmp <- sim.fn(intensity = 0.1) # choose default quad.size = 10, cell.size = 1
tmp <- sim.fn(intensity = 1)
tmp <- sim.fn(intensity = 5)
tmp <- sim.fn(intensity = 10)
Simulation of distance sampling data.
Description
Simulates non-hierarchical line transect data under conventional distance sampling (CDS). It subjects N individuals to sampling, and then retains the value of distance from transect only for individuals that are captured.
Usage
sim.ldata(N = 200, sigma = 30, show.plot = TRUE)
Arguments
| N | number of individuals along transect with distance Uniform(-100, 100) | 
| sigma | scale parameter of half-normal detection function | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the following elements:
| N | the number of individuals along the transect | 
| sigma | scale parameter of half-normal detection function | 
| xall | distance from the transect line for all N individuals | 
| x | absolute distance from the transect line for those individuals detected | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 8.2.3.
Examples
# Simulate a data set with the default arguments and look at the structure of the output:
tmp <- sim.ldata()
str(tmp)
Simulate non-hierarchical point transect (= point count) data
Description
Function simulates coordinates of individuals on a square with a count location at the center point.
Usage
sim.pdata(N = 1000, sigma = 1, B = 3, keep.all = FALSE, show.plot = TRUE)
Arguments
| N | total population size in the square | 
| sigma | scale of half-normal detection function | 
| B | circle radius; the data are simulated over a square of side 2 * B, but individuals outside the circle of radius B are not detected. | 
| keep.all | if TRUE, the data for all individuals are returned; if FALSE, only for individuals detected. | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the following elements:
| N | total population size in the square | 
| sigma | scale of half-normal detection function | 
| B | circle radius | 
| u1,u2 | the x and y coordinates of each of the individuals | 
| d | the distance of each individual from the center of the circle | 
| y | a 0/1 indicator showing whether each individual is detected or not, a vector of length N | 
| N.real | the realized number of individuals within the circle of radius B | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 8.2.5.1.
Examples
# Simulate a data set with the default arguments and look at the structure of the output:
tmp <- sim.pdata()
str(tmp)
Simulates data for a basic spatial distance sampling model
Description
Generates data with the following steps:
1. Simulate a spatially correlated habitat covariate (x) over a grid of pixels covering a square.
2. Distribute the population of N individuals over the square with probability of location in a pixel related to the covariate.
3. Decide which individuals are detected using a distance sampling model with an observer at the center of the square, with either a half normal or a logit detection function. (Note that all the individuals in the square can be detected.)
4. If keep.all = FALSE, return the locations of only the individuals detected.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
sim.spatialDS(N = 1000, beta = 1, sigma = 1, keep.all = FALSE, B = 3,
  model=c("logit", "halfnorm"), lambda = B/3, useHabitat, show.plot=TRUE)
Arguments
| N | total population size in the square | 
| beta | coefficient for the effect of spatial covariate x on the distribution of individuals | 
| sigma | scale parameter of detection function | 
| keep.all | if TRUE, the data for all individuals are returned; if FALSE, only for individuals detected. | 
| B | distance from the observer to the side of the square. This is usually set so that the probability of detection of individuals outside the square is negligable, eg,  | 
| model | The detection function used, can be "logit" or "halfnorm": see Details. | 
| lambda | The scale parameter for the spatially autocorrelated Habitat covariate. | 
| useHabitat | If the output from a previous simulation is provided, the same Habitat covariate will be used (and  | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Details
The "logit" detection function is 2*plogis(-d^2/(2*sigma^2)), which corresponds to the detection model implemented in unmarked::pcount.spHDS.
Value
A list with the values of the input arguments and the following additional elements:
| u1 | x coordinate of each animal | 
| u2 | y coordinate of each animal | 
| d | distance of each animal from the center of the circle | 
| pixel.id | the pixel in which each animal is located, the row number in  | 
| y | indicator of detection of each animal, a vector of length N | 
| N.real | the number of animals inside the circle of radius B | 
| Habitat | Value of the spatially correlated habitat covariate, a 900 x 1 matrix | 
| grid | Coordinates of the center of each pixel, a dataframe with 900 rows and 2 columns | 
If keep.all = FALSE (the default), only the animals detected are included in u1, u2, d, pixel.id.
Note
Kéry & Royle (2016, p.535 and discussion p.540) and earlier versions of AHMbook included a hazard rate detection function. This is problematic because the detection probability at distance zero is less than 1 (p(0) < 1) and should not be used. It is replaced here with the logit detection function, which does have p(0) = 1.
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM2 - 9.8.3.
Examples
# Generate data with the default arguments and look at the structure:
tmp <- sim.spatialDS()
str(tmp)
# Generate date with model = "logit" and analyse the data with unmarked::pcount.spatialHDS
# RNGkind(sample.kind = "Rounding") # run this for R >= 3.6.0
set.seed(1234)
tmp <- sim.spatialDS(model="logit")
# Plot shows a large area of good habitat west of the observer with many animals detected
str(tmp)  # 272 animals detected out of 850 inside the circle (N.real)
# Fit some models with unmarked
if(require(unmarked)) {
  # Get the count of animals detected in each pixel
  pixel.count <- tabulate(tmp$pixel.id, nbins=nrow(tmp$grid))
  # Centre the Habitat covariate
  Habitat <- tmp$Habitat - mean(tmp$Habitat)
  # Create a detection covariate: distance between observer and pixel center
  dist <- with(tmp, sqrt((grid[,1]-B)^2 + (grid[,2]-B)^2))
  # Construct an unmarkedFrame
  umf <- unmarkedFramePCount(y=cbind(pixel.count),
     siteCovs=data.frame(dist=dist, Habitat=Habitat))
  summary(umf)
  (fm0 <- pcount.spHDS(~ -1 + I(dist^2) ~ 1, umf, K=20))
  (fm1 <- pcount.spHDS(~ -1 + I(dist^2) ~ Habitat, umf, K=20))
  # The model with Habitat has much lower AIC
  # Get an estimate of the total population in the square (true is N = 1000)
  sum(predict(fm1, type='state')[, 1])
}
Simulates data for a hierarchical spatial distance sampling model
Description
Generates data for distance sampling from spatially-replicated point transects, with density dependent on a spatially correlated habitat covariate. For each point count, the procedure is:
1. Simulate the habitat covariate over a grid of pixels covering a square.
2. Distribute the population of individuals over the square with probability of location in a pixel related to the covariate.
3. Decide which individuals are detected using a distance sampling model with an observer at the center of the square, with a half normal detection function. (Note that individuals outside the circle of radius B can be detected.)
The locations and detection status of individuals at all sites are collated and returned, except for individuals at sites when none are detected.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
sim.spatialHDS(lam0 = 4, sigma = 1.5, B = 3, nsites = 100,
  beta1 = 1, npix = 20, show.plots = 3)
Arguments
| lam0 | the expected number of individuals in the square of side = 2*B if the habitat covariate had no effect (ie,  | 
| sigma | scale parameter of the half-normal detection function | 
| B | distance from the observer to the side of the square. This is usually set so that the probability of detection of individuals outside the square is negligable, eg,  | 
| nsites | number of sites | 
| beta1 | the size of the effect of the habitat covariate on the number of individuals in a pixel. | 
| npix | the number of pixels along each dimension of the square: the entire grid has  | 
| show.plots | the number of sites for which plots should be displayed. | 
Value
A list with the following components:
| data | a matrix with columns for siteID, the coordinates of each individual ( | 
| B | the radius of the circle; the data are simulated over a square of side 2*B | 
| Habitat | a matrix with the values of the Habitat covariate for each pixel at each site | 
| grid | a data frame with the coordinates of each pixel (same for all sites) | 
| N | the realized number of individuals in the square at each site | 
| nsites | the number sites | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 9.8.5.
Examples
# Generate data with the default arguments and look at the structure:
tmp <- sim.spatialHDS()
str(tmp)
Simulate detection/nondetection data for static 3-level occupancy models
Description
Function generates 3-level occupancy data with possibility of site-specific random variation at every level, "time effects" at the middle and the lower levels and effects of one distinct covariate at each level.
Usage
sim3Occ(nunits = 100, nsubunits = 5, nreps = 3,
  mean.psi = 0.8, beta.Xpsi = 1, sd.logit.psi = 0,
  mean.theta = 0.6, theta.time.range = c(-1, 1), beta.Xtheta = 1, sd.logit.theta = 0,
  mean.p = 0.4, p.time.range = c(-2, 2), beta.Xp = -1, sd.logit.p = 0,
  show.plot = TRUE, verbose = TRUE)
Arguments
| nunits | Number of main units (large quadrats) | 
| nsubunits | Number of subunits (nested subsamples within each main unit) | 
| nreps | Number of replicate surveys in every subunit | 
| mean.psi | Mean large-scale, unit-level occupancy probability (psi) | 
| beta.Xpsi | effect on psi of covariate A (at main unit level) | 
| sd.logit.psi | SD of logit(psi), unstructured site variation in psi | 
| mean.theta | Mean small-scale (subunit) occupancy probability (theta) | 
| theta.time.range | range of theta 'intercepts' for subunits | 
| beta.Xtheta | effect on theta of covariate B (at subunit level) | 
| sd.logit.theta | SD of logit(theta), unstructured site variation in theta | 
| mean.p | Mean per-survey detection probability | 
| p.time.range | range of p 'intercepts' for replicates | 
| beta.Xp | effect on p of covariate C (unit by subunit by replicate) | 
| sd.logit.p | SD of logit(p) | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the values of the input arguments and the following additional elements:
| theta.time.effect | Simulated time effect on theta, a vector of length  | 
| p.time.effect | Simulated time effect on p, a vector of length  | 
| p | Detection probabiliy, a  | 
| z | Occupancy indicator for main units, a  | 
| a | Occupancy indicator for subunits, a  | 
| y | Detection array, a  | 
| sum.z | True number of occupied main units | 
| obs.sum.z | Observed number of occupied main units | 
| sum.z.a | Number of units with >=1 occupied, surveyed subunit | 
| covA | Simulated covariate A, a vector of length  | 
| covB | Simulated covariate B, a  | 
| covC | Simulated covariate C, a  | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.10.
Examples
# Generate data with the default arguments and look at the structure:
tmp <- sim3Occ()
str(tmp)
# 'Null' model (model 1)
str(data <- sim3Occ(nunits = 100, nsubunits = 5, nreps = 3, mean.psi = 0.8,
  beta.Xpsi = 0, sd.logit.psi = 0, mean.theta = 0.6, theta.time.range = c(0, 0),
  beta.Xtheta = 0, sd.logit.theta = 0, mean.p = 0.4, p.time.range = c(0,0),
  beta.Xp = 0, sd.logit.p = 0))
# No covariate effects, no random variability (model 2)
str(data <- sim3Occ(nunits = 100, nsubunits = 5, nreps = 3, mean.psi = 0.8,
  beta.Xpsi = 0, sd.logit.psi = 0, mean.theta = 0.6, theta.time.range = c(-1, 1),
  beta.Xtheta = 0, sd.logit.theta = 0, mean.p = 0.4, p.time.range = c(-2,2),
  beta.Xp = 0, sd.logit.p = 0))
# All covariate effects, but no random variability (model 3)
str(data <- sim3Occ(nunits = 100, nsubunits = 5, nreps = 3, mean.psi = 0.8,
  beta.Xpsi = 1, sd.logit.psi = 0, mean.theta = 0.6, theta.time.range = c(-1, 1),
  beta.Xtheta = 1, sd.logit.theta = 0, mean.p = 0.4, p.time.range = c(-2,2),
  beta.Xp = -1, sd.logit.p = 0))
# Most complex model with all effects allowed for by sim function (model 4)
str(data <- sim3Occ(nunits = 100, nsubunits = 5, nreps = 3, mean.psi = 0.8,
  beta.Xpsi = 1, sd.logit.psi = 1, mean.theta = 0.6, theta.time.range = c(-1, 1),
  beta.Xtheta = 1, sd.logit.theta = 1, mean.p = 0.4, p.time.range = c(-2,2),
  beta.Xp = -1, sd.logit.p = 1))
Simulate individual capture-histories under a Cormack-Jolly-Seber (CJS) survival model
Description
Function generates individual capture-histories under a CJS model with possibly time-dependent parameters. The number of values for interval-specific survival (phi) and time-specific detection (p) must be ensured to be equal to the number of occasions (n.occ) minus 1.
Usage
simCJS(n.occ = 6, n.marked = 20, phi = 0.7, p = 0.4, show.plot = TRUE)
Arguments
| n.occ | The number of occasions (eg, years). | 
| n.marked | The number of individuals marked per year except for the last year; scalar, or a vector of length (n.occ-1). | 
| phi | The apparent survival probability between years; scalar, or a vector of length (n.occ-1). | 
| p | The recapture probability for all except the first year; scalar, or a vector of length (n.occ-1). | 
| show.plot | Choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| z | n.ind x n.occ matrix, 1 if the individual is alive and in the study area, 0 otherwise. | 
| ch | n.ind x n.occ matrix, 1 if the individual is captured, 0 otherwise. | 
| f | n.ind vector, occasion the individual was first captured, and marked. | 
| n.ind | scalar, total number of individuals marked. | 
| n.alive | n.occ vector, number of animals alive and in the study area each year. | 
Author(s)
Marc Kéry & Andy Royle, based on code written by Michael Schaub for Chapter 7 of Kéry & Schaub (2012).
References
Kéry, M. & Schaub, M. (2012) Bayesian population analysis using WinBUGS - a hierarchical perspective, Academic Press.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 3.2.2.
Examples
# Run with the default arguments and look at the structure of the output:
set.seed(123)
tmp <- simCJS()
str(tmp)
colSums(tmp$ch)
Simulate community occupancy or community abundance data
Description
Simulate detection/nondetection or count data, respectively, under a community occupancy or abundance model with random species effects for psi or lambda and p (both including effects of one covariate, 'habitat' for psi or lambda and 'wind speed' for p) (introduced in AHM1 - 11.2)
Usage
simComm(type = c("det/nondet", "counts"), nsites = 30, nreps = 3, nspecies = 100,
  mean.psi = 0.25, sig.lpsi = 1, mu.beta.lpsi = 0, sig.beta.lpsi = 0,
  mean.lambda = 2, sig.loglam = 1, mu.beta.loglam = 1, sig.beta.loglam = 1,
  mean.p = 0.25, sig.lp = 1, mu.beta.lp = 0, sig.beta.lp = 0, show.plot = TRUE)
Arguments
| type | choose whether you want to simulate detection/nondetection ("det/nondet") data or count data ("counts"). | 
| nsites | number of sites | 
| nreps | number of replicate samples (occasions or repeated measurements) | 
| nspecies | total number of species in the area that is sampled by these sites (regional species pool) | 
| mean.psi | community mean of occupancy probability over all species in community (probability scale); ignored if  | 
| sig.lpsi | community standard deviation of logit(occupancy probability intercept); ignored if  | 
| mu.beta.lpsi | community mean of the effects of 'habitat' covariate on occupancy probability; ignored if  | 
| sig.beta.lpsi | community standard deviation of the effects of 'habitat' covariate on occupancy probability; ignored if  | 
| mean.lambda | community mean of expected abundance over all species in superpopulation; ignored if  | 
| sig.loglam | community standard deviation of log(lambda intercept); ignored if  | 
| mu.beta.loglam | community mean of the effects of 'habitat' covariate on log(lambda); ignored if  | 
| sig.beta.loglam | community standard deviation of the effects of 'habitat' covariate on expected abundance; ignored if  | 
| mean.p | community mean of detection probability over all species in superpopulation (probability scale) | 
| sig.lp | community standard deviation of logit(detection probability intercept) | 
| mu.beta.lp | community mean of the effects of 'wind' covariate on detection probability | 
| sig.beta.lp | community standard deviation of the effects of 'wind' covariate on detection probability | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Details
Function simulates data from repeated sampling of a metacommunity (or spatially structured community) according to the model of Dorazio & Royle (JASA, 2005) for type = "det/nondet" (this is the default) or under the model of Yamaura et al. (2012) for type = "counts".
Occupancy probability (psi) or expected abundance (lambda) can be made dependent on a continuous site covariate 'habitat', while detection probability can be made dependent an observational covariate 'wind'. Both intercept and slope of the two log-linear or logistic regressions (for occupancy or expected abundance, respectively, and for detection) are simulated as draws from a normal distribution with mean and standard deviation that can be selected using function arguments.
Specifically, the data are simulated under the following linear models:
(1) for type = "det/nondet" (i.e., community occupancy)
| (occupancy (psi) and detection (p) for site i, replicate j and species k) | |
| psi[i,k] <- plogis(beta0[k] + beta1[k] * habitat[i] | Occupancy | 
| p[i,j,k] <- plogis(alpha0[k] + alpha1[k] * wind[i,j] | Detection | 
(2) for type = "counts" (i.e., community count)
| (exp. abundance (lambda) and detection (p) for site i, rep. j and species k) | |
| lambda[i,k] <- exp(beta0[k] + beta1[k] * habitat[i] | E(N) | 
| p[i,j,k] <- plogis(alpha0[k] + alpha1[k] * wind[i,j] | Detection | 
Species-specific heterogeneity in intercepts and slopes is modeled by up to four independent normal distributions (note: no correlation between the intercepts as in Dorazio et al. (2006) or Kéry & Royle (2008))
(1) for type = "det/nondet" (i.e., community occupancy)
| beta0 ~ dnorm(logit(mean.psi), sig.lpsi) | Mean and SD of normal distribution | 
| beta1 ~ dnorm(mu.beta.lpsi, sig.beta.lpsi) | |
| alpha0 ~ dnorm(logit(mean.p), sig.lp) | |
| alpha1 ~ dnorm(mu.beta.lp, sig.beta.lp) | 
(2) for type = "counts" (i.e., community count)
| beta0 ~ dnorm(log(mean.lambda), sig.loglam) | Mean and SD of normal distribution | 
| beta1 ~ dnorm(mu.beta.loglam, sig.beta.loglam) | |
| alpha0 ~ dnorm(logit(mean.p), sig.lp) | |
| alpha1 ~ dnorm(mu.beta.lp, sig.beta.lp) | 
Value
A list with the arguments supplied and the following additional elements:
(1) for type = "det/nondet" (i.e., community occupancy)
| habitat | length  | 
| wind | 
 | 
| psi | 
 | 
| p | 
 | 
| z | 
 | 
| z.obs | 
 | 
| y.all | 
 | 
| y.obs | 
 | 
| y.sum.all | detection frequency for all species | 
| y.sum.obs | detection frequency for observed species | 
| Ntotal.fs | finite sample (or conditional) species richness | 
| Ntotal.obs | observed species richness | 
| S.true | true number of species occurring at each site | 
| S.obs | observed number of species occurring at each site | 
(2) for type = "counts" (i.e., community count)
| habitat | length  | 
| wind | 
 | 
| lambda | 
 | 
| p | 
 | 
| N | 
 | 
| y.all | 
 | 
| y.obs | 
 | 
| ymax.obs | 
 | 
| Ntotal.fs | finite sample (or conditional) species richness | 
| Ntotal.obs | observed species richness | 
Author(s)
Marc Kéry & Andy Royle, with community occupancy model code partly based on code by Richard Chandler.
References
Dorazio, R.M. & Royle, J.A. (2005) Estimating size and composition of biological communities by modeling the occurrence of species. J American Statistical Association, 100, 389-398.
Dorazio, R.M., et al (2006) Estimating species richness and accumulation by modeling species occurrence and detectability. Ecology 87, 842-854.
Kéry, M. & Royle, J.A. (2008) Hierarchical Bayes estimation of species richness and occupancy in spatially replicated surveys. Journal of Applied Ecology 45, 589-598.
Yamaura, Y., et al. (2012) Biodiversity of man-made open habitats in an underused country: a class of multispecies abundance models for count data. Biodiversity and Conservation 21, 1365-1380.
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 11.2.
Examples
# Default arguments:
str(simComm())
# Some possibly interesting settings of the function
data <- simComm(nsites = 267, nspecies = 190, mean.psi = 0.25, sig.lpsi = 2,
  mean.p = 0.12, sig.lp = 2) # similar to Swiss MHB
data <- simComm(mean.psi = 1)         # all species occur at every site
data <- simComm(mean.p = 1)           # no measurement error (perfect detection)
# Effect of spatial sample size (nsites) on species richness in sample (Ntotal.fs)
data <- simComm(nsites=50, nspecies = 200) # 1-3 are usually missed in sample
data <- simComm(nsites=30, nspecies = 200) # 4-6 usually missed
data <- simComm(nsites=10, nspecies = 200) # around 30 typically missed
Simulate detection/nondetection data under a general dynamic community (site-occupancy) model
Description
Function to simulate detection/nondetection data under a general dynamic community (= dynamic, multi-species site-occupancy) model, including:
* annual variation in the probabilities of patch persistence, colonization and detection is specified by the bounds of a uniform distribution.
* species heterogeneity around the community means is specified by the SD of a normal distribution and expressed on the logit scale
* one covariate is allowed per parameter (site covariate for psi1, site-year covariate for phi and gamma and site-year-rep covariate for p). Each covariate is allowed to differ among species again according to a logit-normal model of heterogeneity.
* additional detection heterogeneity at the site- or the occasion level, with the possibility of a temporal trend in this heterogeneity over years. E.g., an annual trend in detection heterogeneity at the site or the occasion level is specified by the value in the first and the last year. Hence, trend.sd.site = c(0, 1) will result in a linear trend in the magnitude of site-level heterogeneity in detection from 0 in the first year to 1 in the last year, with interpolation for the years in between.
* additional detection heterogeneity that among occasions according to a quadratic effect of occasion number (to model the typical 'phenological curve' of an insect species for instance).
These last two types of detection heterogeneity are not (yet) allowed to be species-specific.
Usage
simDCM(nspecies = 50, nsites = 100, nsurveys = 3, nyears = 10,
  mean.psi1 = 0.4, sig.lpsi1 = 1, mu.beta.lpsi1 = 0, sig.beta.lpsi1 = 0,
  range.mean.phi = c(0.8, 0.8), sig.lphi = 1, mu.beta.lphi = 0,
  sig.beta.lphi = 0, range.mean.gamma = c(0.2, 0.2), sig.lgamma = 1,
  mu.beta.lgamma = 0, sig.beta.lgamma = 0, range.mean.p = c(0.5, 0.5),
  sig.lp = 1, mu.beta.lp = 0, sig.beta.lp = 0, range.beta1.survey = c(0, 0),
  range.beta2.survey = c(0, 0), trend.sd.site = c(0, 0),
  trend.sd.survey = c(0, 0), show.plot = TRUE, verbose = TRUE)
Arguments
| nspecies | number of species (typically called N in AHM book) | 
| nsites | number of sites (M). | 
| nsurveys | number of replicate surveys within a year (J). | 
| nyears | number of years (T). | 
| mean.psi1 | average (across all species in the community) of the intercept of occupancy probability in first year. | 
| sig.lpsi1 | sd of the normal distribution from which species-specific occupancy intercepts are drawn (centered on logit(mean.psi1)), on logit scale. | 
| mu.beta.lpsi1 | community mean of the coefficients of the covariate in probabilities of initial occupancy: the probability-scale mean of the normal distribution from which the species-specific coefficients are drawn. | 
| sig.beta.lpsi1 | sd of the normal distribution from which species-specific slopes are drawn (centered on mu.beta.lpsi1). | 
| range.mean.phi | bounds of uniform distribution from which the average (across species) annual intercept of persistence is drawn. | 
| sig.lphi | sd of the normal distribution from which species-specific persistence intercepts are drawn (centered on logit(mean.phi), which are year-specific), on logit scale. | 
| mu.beta.lphi | community mean of the coefficients of the covariate in probabilities of persistence: the probability-scale mean of the normal distribution from which the species-specific coefficients are drawn. | 
| sig.beta.lphi | sd of the normal distribution from which species-specific persistence slopes are drawn (centered on mu.beta.lphi). | 
| range.mean.gamma | bounds of uniform distribution from which the average (across species) annual intercept of colonization is drawn. | 
| sig.lgamma | sd of the normal distribution from which species-specific colonization intercepts are drawn (centered on logit(mean.gamma), which are year-specific), on logit scale. | 
| mu.beta.lgamma | community mean of the coefficients of the covariate in probabilities of colonization: the probability-scale mean of the normal distribution from which the species-specific coefficients are drawn. | 
| sig.beta.lgamma | sd of the normal distribution from which species-specific colonization slopes are drawn (centered on mu.beta.lgamma). | 
| range.mean.p | bounds of uniform distribution from which the average (across species) annual intercept of p is drawn. | 
| sig.lp | sd of the normal distribution from which species-specific detection intercepts are drawn (centered on logit(mean.p), which are year-specific), on logit scale. | 
| mu.beta.lp | community mean of the coefficients of the covariate in probabilities of detection: the probability-scale mean of the normal distribution from which the species-specific coefficients are drawn. | 
| sig.beta.lp | sd of the normal distribution from which species-specific detection slopes are drawn (centered on mu.beta.lp). | 
| range.beta1.survey | the range of the annual variation in the linear effect of survey (i.e., of surveys 1-12 if nsurveys = 12) on the product of availability and detection. | 
| range.beta2.survey | the range of the annual variation in the quadratic effect of survey (i.e., of surveys 1-12 if nsurveys = 12) on the product of availability and detection. | 
| trend.sd.site | sd of normal distribution to model logit-normal noise in p at the site level in the first and the last year of the simulation, with values for intermediate years interpolated linearly. | 
| trend.sd.survey | sd of normal distribution to model logit-normal noise in p at the occasion level, in the first and the last year, with values for intermediate years interpolated linearly. | 
| show.plot | if TRUE, plots are produced. Set this to FALSE when running simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the values of the input arguments and the following additional elements:
| Xpsi1 | Site covariate for psi1, a nsites x 1 matrix | 
| Xphi | Yearly-site covariate for phi, a nsites x nyears matrix | 
| Xgamma | Yearly-site covariate for gamma, a nsites x nyears matrix | 
| Xp | Observation covariate for p, a nsites x nsurveys x nyears array | 
| beta0.lpsi | initial (logit-scale) occupancy intercept for each species in the community, a vector of length nspecies | 
| beta1.lpsi | initial (log-scale) occupancy slope on Xpsi1 for each species in the community, a vector of length nspecies | 
| psi | occupancy probability per site, year and species, a nsites x nyears x nspecies array | 
| mean.phi | mean persistence (across species) intercept for each interval, a vector of length (nyears - 1) | 
| mean.gamma | mean colonization (across species) intercept for each interval, a vector of length (nyears - 1) | 
| eps.lphi | additive species effects in logit(phi) intercept, a vector of length nspecies | 
| eps.lgamma | additive species effects in logit(gamma) intercept, a vector of length nspecies | 
| beta0.lphi | logit-scale persistence intercepts for each species in community, a nspecies x (nyears - 1) matrix | 
| beta0.lgamma | logit scale colonization intercepts for each species in the community, a nspecies x (nyears - 1) matrix | 
| beta1.lphi | slope of logit(phi) on Xphi for each species in the community, a vector of length nspecies | 
| beta1.lgamma | slope of logit(gamma) on Xgamma for each species in the community, a vector of length nspecies | 
| phi | probability of persistence for each site, yearly interval and species, a nsites x (nyears-1) x nspecies array | 
| gamma | probability of colonization for each site, yearly interval and species, a nsites x (nyears-1) x nspecies array | 
| mean.p | mean detection (across species) intercept for each year, a vector of length nyears | 
| eps.lp | additive species effects in logit(p) intercept, a vector of length nspecies | 
| beta0.lp | species- and site-specific intercepts in the linear predictor for p, a nspecies x nyears matrix | 
| beta1.lp | species specific slopes of logit(p) on Xp, a vector of length nspecies | 
| beta1 | linear effect of the occasion number on detection probability, a vector of length nyears | 
| beta2 | quadratic effect of the occasion number on detection probability, a vector of length nyears | 
| sd.site | standard deviation of the zero-mean normal distribution from which additional, site-level detection heterogeneity is simulated, a vector of length nyears | 
| sd.survey | standard deviation of the zero-mean normal distribution from which additional, occasion-level detection heterogeneity is simulated,, a vector of length nyears | 
| eps1 | additive site random effects tht generate unstructured site-level detection heterogeneity, a vector of length nsites | 
| eps2 | additive occasion random effects tht generate unstructured site-level detection heterogeneity, a vector of length nsurveys | 
| n.occ | Number of occupied sites, a nyears x nspecies matrix | 
| psi.fs | Finite-sample occupancy proportion, a nyears x nspecies matrix | 
| mean.psi | Average psi over sites, a nyears x nspecies matrix | 
| z.obs | Observed value of z matrix, a nsites x nyears x nspecies array | 
| n.occ.obs | Observed number of occupied sites, a nyears x nspecies matrix | 
| psi.obs | Observed occupancy (finite sample), a nyears x nspecies matrix | 
| nyears.pres | Number of years when species present, a vector of length nspecies | 
| nspecies.pres | Number of species ever present, scalar | 
| nyears.det | Number of years when species detected, a vector of length nspecies | 
| nspecies.det | Number of species ever detected, scalar | 
| z | True value of z matrix (ie, presence/absence), a nsites x nyears x nspecies array | 
| p | Probability of detection, a nsites x nsurveys x nyears x nspecies array | 
| y | Observed detection history, a nsites x nsurveys x nyears x nspecies array of 0/1 | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 5.2.
Examples
# Simulate a data set with the default arguments and look at the structure of the output:
tmp <- simDCM()
str(tmp)
# Default arguments, without plots
str(data <- simDCM(show.plot = FALSE))
# More examples:
str(data <- simDCM(nspecies = 200)) # More species (looks great)
str(data <- simDCM(nspecies = 1))   # A single species (ha, works !)
str(data <- simDCM(nsites = 267))   # More sites
str(data <- simDCM(nsites = 1))     # A single site
str(data <- simDCM(nsurveys = 10))  # More visits
str(data <- simDCM(nyears = 25))    # More years
str(data <- simDCM(nyears = 2))     # Just two years
try(data <- simDCM(nyears = 1))     # A single year ... error
# No species heterogeneity in parameters of initial occupancy
str(data <- simDCM(sig.lpsi1 = 0, sig.beta.lpsi1 = 0))
# No species heterogeneity in parameters of persistence
str(data <- simDCM(sig.lphi = 0, sig.beta.lphi = 0))
# No species heterogeneity in parameters of colonisation
str(data <- simDCM(sig.lgamma = 0, sig.beta.lgamma = 0))
# No species heterogeneity in parameters of detection
str(data <- simDCM(sig.lp = 0, sig.beta.lp = 0))
# No annual variation in rates
str(data <- simDCM(range.mean.phi = c(0.8, 0.8), range.mean.gamma = c(0.3, 0.3),
  range.mean.p = c(0.6, 0.6)))
# Function arguments that lead to much structure (no zero arguments)
str(data <- simDCM(nspecies = 200, nsites = 267, nsurveys = 3, nyears = 25,
  mean.psi1 = 0.4, sig.lpsi1 = 3, mu.beta.lpsi1 = 1, sig.beta.lpsi1 = 3,
  range.mean.phi = c(0.5, 1), sig.lphi = 3, mu.beta.lphi = 1,
  sig.beta.lphi = 3, range.mean.gamma = c(0, 0.5),
  sig.lgamma = 3, mu.beta.lgamma = -1, sig.beta.lgamma = 3,
  range.mean.p = c(0.1, 0.9), sig.lp = 3, mu.beta.lp = 1, sig.beta.lp = 3,
  range.beta1.survey = c(-2, -0.5), range.beta2.survey = c(0, 2),
  trend.sd.site = c(0, 0), trend.sd.survey = c(0, 0), show.plot = TRUE))
# Not every occurring species will be detected
set.seed(1)
str(data <- simDCM(nspecies = 200, nsites = 20, nsurveys = 2, nyears = 10,
  mean.psi1 = 0.1, sig.lpsi1 = 5,
  range.mean.phi = c(0.3, 0.3), sig.lphi = 5,
  range.mean.gamma = c(0.1, 0.1), sig.lgamma = 5,
  range.mean.p = c(0.1, 0.1), sig.lp = 5) )
# Pull out data from species 5
ysp5 <- data$y[,,,5]
# Pull out data from year 1
yyr1 <- data$y[,,1,]
Simulate count data under the dynamic N-mixture model of Dail-Madsen
Description
Simulation for multiple-visit data (from pcountOpen help file in package unmarked). simDM0 has no covariates while simDM allows for covariates. Both functions assume constant time intervals between primary periods.
Usage
simDM0(nsites = 50, nsurveys = 3, nyears = 5,
  lambda = 4, gamma = 1.5, phi = 0.8, p = 0.7, show.plots=TRUE)
simDM(nsites = 50, nsurveys = 3, nyears = 5,
  mean.lambda = 4, mean.gamma.rel = 0.5,
  mean.phi = 0.8, mean.p = 0.7,
  beta.lam = 1, beta.gamma = 1, beta.phi = -1, beta.p = -1,
  show.plots=TRUE)
Arguments
| nsites | number of sites. | 
| nsurveys | number of replicate (secondary) samples within period of closure. | 
| nyears | number of primary samples: years, seasons etc. | 
| lambda | Initial expected abundance. | 
| gamma | recruitment rate. | 
| phi | apparent survival rate. | 
| p | detection probability. | 
| mean.lambda | Initial expected abundance at cov.lam = 0. | 
| mean.gamma.rel | recruitment rate at cov.gamma = 0. | 
| mean.phi | apparent survival rate at cov.phi = 0. | 
| mean.p | detection probability at cov.p = 0. | 
| beta.lam | the slope of parameter lambda (link transformed) on the cov.lam covariate | 
| beta.gamma | the slope of parameter gamma (link transformed) on the cov.gamma covariate | 
| beta.phi | the slope of parameter phi (link transformed) on the cov.phi covariate | 
| beta.p | the slope of parameter p (link transformed) on the cov.p covariate | 
| show.plots | if TRUE, plots are produced. Set this to FALSE when running simulations. | 
Value
For simDM0, a list with the values of the input arguments and the following additional elements:
| N | true number of individuals, nsites x nyears | 
| S | number of survivors, nsites x (nyears-1) | 
| R | number of recruits, nsites x (nyears-1) | 
| y | number detected, nsites x nyears x nsurveys | 
| yy | number detected as a matrix, nsites x (nyears*nsurveys) | 
simDM has the following additional elements:
| cov.lam | covariate for lambda generated from Uniform(-1, 1), nsites vector | 
| cov.gamma | covariate for gamma generated from Uniform(-1, 1), nsites vector | 
| cov.phi | covariate for phi generated from Uniform(-1, 1), nsites vector | 
| cov.p | covariate for p generated from Uniform(-1, 1), nsites x nyears x nsurveys | 
| ccov.p | cov.p formatted as a matrix, nsites x (nyears*nsurveys) | 
Author(s)
Marc Kéry & Andy Royle, based in part on code written by Richard Chandler.
References
Dail, D. & Madsen, L. (2011) Models for estimating abundance from repeated counts of an open metapopulation. Biometrics, 67, 577-587.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 2.5.
Examples
# Simulate a data set with the default arguments and look at the structure of the output:
tmp0 <- simDM0()
str(tmp0)
tmp <- simDM()
str(tmp)
Simulate line transect data for density surface modeling
Description
The function generates a population represented as a binomial point pattern in a heterogeneous landscape with density a function of the covariate Habitat. Data for multiple line transect surveys using a wiggly transect are then simulated, and the pixel IDs for the activity centers of detected individuals returned.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
simDSM(X, Ntotal = 400, sigma = 0.65, beta1 = 1.0,
    nsurveys = 2, xlim = c(-0.5, 3.5), ylim = c(-0.5, 4.5), show.plots = TRUE)
Arguments
| X | a 2-column matrix with coordinates of regularly spaced points along the transect line; see Examples. | 
| Ntotal | the true total number of individuals in the study area. | 
| sigma | scale parameter for the half-normal detection function. | 
| beta1 | coefficient for the relationship between the Habitat covariate and population density. | 
| nsurveys | the number of replicate surveys along the transect. | 
| xlim,ylim | the extent of the (rectangular) study area | 
| show.plots | if TRUE, summary plots are displayed. | 
Value
A list with the values of the input arguments and the following additional elements:
| Habitat | a vector for the habitat covariate for each pixel | 
| Habgrid | a 2-column matrix with the coordinates of center of each pixel | 
| nPix | the number of pixels in the study area | 
| N | true number of activity centers in each pixel | 
| U | a 2-column matrix with the locations of ACs for all individuals in the population | 
| Ucap | a 2-column matrix with the locations of ACs for individuals detected at least once | 
| nind | the number of individuals detected at least once | 
| pixel | a nind x nsurvey matrix with the pixel ID for the activity center or NA if the individual was not detected on the survey | 
Author(s)
Marc Kéry, Andy Royle & Mike Meredith
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 11.10.
Examples
# Run the function with default values and look at the output
library(AHMbook)
data(wigglyLine)
points <- sp::SpatialPoints( wigglyLine )
sLine <- sp::Line(points)
regpoints <- sp::spsample(sLine, 100, type = "regular")
str(simDSM(X = regpoints@coords))
# Generate the data set used in AHM2 11.10
RNGversion("3.5.3")
set.seed(2027, kind = "Mersenne-Twister")
tmp <- simDSM(X = regpoints@coords) # Produces Fig 11.15 in the book
Simulate data for an integrated species distribution model (SDM) of Dorazio-Koshkina
Description
The function generates a population represented as a point pattern in a heterogeneous landscape and simulates data from two different sources: (1) opportunistic presence-only data, and (2) replicate counts or detection/nondetection data in randomly-placed quadrats. This is the scenario for the integrated models described by Dorazio (2014) and Koshkina et al. (2017). The former assumes counts as data source (2) while the latter assume detection/nondetection data.
A Poisson point pattern (PPP) with intensity a function of a covariate X and intercept and coefficient beta is simulated on a discrete (pixel-based) approximation of a continuous landscape.
This PPP is first thinned with a pixel-wise thinning probability controlled by a covariate W and coefficients alpha, and second, with a landscape-wise random drop-out process to produce a first data set of presence-only kind.
A second data set is simulated by imagining replicated counts conducted in randomly-selected quadrats within the landscape. Detection of individuals is imperfect, with probability of detection controlled by the covariate W and coefficients gamma. These counts can be quantized to detection/nondetection data for use in a model as in Koshkina et al. (2017).
For simDataDK1 animals are limited to one individual per pixel; this is not the case for simDataDK.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
simDataDK(sqrt.npix = 100, alpha = c(-1,-1), beta = c(6,0.5),
  drop.out.prop.pb = 0.7, quadrat.size = 4, gamma = c(0,-1.5),
  nquadrats = 250, nsurveys = 3, show.plot = TRUE)
simDataDK1(sqrt.npix = 100, alpha = c(-1,-1), beta = c(6,0.5),
  drop.out.prop.pb = 0.7, quadrat.size = 4, gamma = c(0,-1.5),
  nquadrats = 250, nsurveys = 3, show.plot = TRUE)
Arguments
| sqrt.npix | number of pixels along each side of square state space (the 'landscape'); the total number of pixels is then  | 
| alpha | coefficients for the relationship: logit(b) = alpha[1] + alpha[2] * W, where b is the sampling detection bias in the presence-only observations. | 
| beta | coefficients for the relationship: log(lambda) = beta[1] + beta[2] * X, where lambda is the intensity of the Poisson point process. If the values of beta result in very large numbers of animals, an error will occur. | 
| drop.out.prop.pb | proportion of presence-only points at the end that are discarded. | 
| quadrat.size | length of the side of quadrats for conducting replicate counts in pixel units; note that sqrt.npix / quadrat.size must yield an integer. | 
| gamma | coefficients for the relationship: logit(p) = gamma[1] + gamma[2] * W, where p is the probability of detecting an individual during the count surveys in the quadrats. | 
| nquadrats | the number of quadrats selected for the count survey. | 
| nsurveys | the number of replicate counts in each quadrat. | 
| show.plot | if TRUE, summary plots are displayed. | 
Value
A list with the values of the input arguments and the following additional elements:
| npix | the number of pixels in the landscape | 
| s.area | the area of the whole landscape = 4 | 
| s.loc | 2-column vector with the location of each pixel | 
| xcov | values of the 'X' (intensity) covariate | 
| wcov | values of the 'W' (detection) covariate | 
| N.ipp | true number of individuals in the landscape | 
| pixel.id.ipp | pixel ID for each individual in the population | 
| loc.ipp | coordinates for each individual in the population | 
| pTrue.ipp | probability of detection for each individual for presence-only data | 
| pixel.id.det | pixel ID for each individual detected opportunistically | 
| N.det | number of detections | 
| loc.det | coordinates of each individual detected opportunistically | 
| pcount | probability of detection during count surveys, varies by quadrat | 
| fullCountData | matrix with rows for each quadrat, columns for ID, x and w coords, true N, and 3 replicate counts | 
| countData | as above, but rows for quadrats sampled only | 
| s | a Raster Stack with layers for 'X', 'W', and number in each pixel, 'n' | 
| squad | a Raster Stack corresponding to the quadrats, with mean 'X' and 'W' and true abundance, 'N' | 
Author(s)
Marc Kéry, Andy Royle & Mike Meredith, based on the code written by Dorazio (2014) and adapted by Koshkina et al. (2017).
References
Dorazio, R.M. (2014) Accounting for imperfect detection and survey bias in statistical analysis of presence-only data. Global Ecology and Biogeography, 23, 1472-1484.
Koshkina, V., Wang, Y., Gordon, A., Dorazio, R.M., White, M., & Stone, L. (2017) Integrated species distribution models: combining presence-background data and site-occupany data with imperfect detection. Methods in Ecology and Evolution, 8, 420-430.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 10.
Examples
# Run the function with default values and look at the output
str(tmp <- simDataDK(), 1)  # use str(., max.level=1) to limit the amount of output.
str(tmp <- simDataDK(show.plot=FALSE), 1)  # no plots
str(tmp <- simDataDK(sqrt.npix = 500), 1)  # much larger landscape
str(tmp <- simDataDK(alpha = c(-1,1)), 1)  # positive effect of W on bias rate parameter b
str(tmp <- simDataDK(beta = c(6, 0.5)), 1) # lower density
str(tmp <- simDataDK(drop.out.prop = 0), 1)# No final uniform thinning ("drop out")
str(tmp <- simDataDK(beta = c(6, 1)), 1)   # steeper gradient of habitat suitability
Simulate data under a demographic dynamic occupancy model
Description
Function to simulate detection/nondetection data under a variant of the demographic occupancy (or 'local survival') model of Roth & Amrhein (2010). Data are simulated in an 'unconditional' manner, i.e., for each site from first to last year. All parameters can be made year-dependent by specification of a range within which annual values will be drawn from uniform distributions.
Usage
simDemoDynocc(nsites = 100, nyears = 10, nvisits = 5, psi1 = 0.6,
    range.phi = c(0.2, 0.9), range.r = c(0, 0.4), range.p = c(0.1, 0.9),
    show.plot=TRUE)
Arguments
| nsites | Number of sites. | 
| nyears | Number of years (or 'seasons', as they are somewhat confusingly often called in the occupancy literature). | 
| nvisits | Number of replicate surveys (= occasions) within a year. | 
| psi1 | occupancy probability in first year. | 
| range.phi | bounds of uniform distribution from which annual local probability of persistence is randomly drawn. | 
| range.r | bounds of uniform distribution from which annual local probability of colonization is randomly drawn. | 
| range.p | bounds of uniform distribution from which annual probability of detection is randomly drawn. | 
| show.plot | If TRUE, plots of results are displayed; set to FALSE if running simulations. | 
Value
A list with the values of the arguments input and the following additional elements:
| phi | persistence for each interval, a vector of length nyears - 1 | 
| r | colonization for each interval, a vector of length nyears - 1 | 
| p | detection probability for each year, a vector of length nyears | 
| z | true occurrence state, a nsites x nyears matrix of 0/1 | 
| y | the observed detection history, a nsites x nvisits x nyears array | 
| f | year of first detection, a vector of length nsites | 
| nocc.true | the true number of occupied sites, a vector of length nyears | 
| nocc.true | the observed number of occupied sites, a vector of length nyears | 
Author(s)
Marc Kéry & Andy Royle
References
Roth, T. & Amrhein, V. (2010), Estimating individual survival using territory occupancy data on unmarked animals. Journal of Applied Ecology, 47, 386-392.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 4.
Examples
# Generate data with the default arguments and look at the structure:
str(data <- simDemoDynocc() )                # Implicit defaults
str(data <- simDemoDynocc(psi1 = 1))         # All sites initially occupied
str(data <- simDemoDynocc(nsites = 1000))    # Plenty more sites
str(data <- simDemoDynocc(nyears = 100))     # Plenty more years
str(data <- simDemoDynocc(nvisits = 20))     # Plenty more visits
str(data <- simDemoDynocc(range.phi = c(0.8, 0.8))) # Constant survival
str(data <- simDemoDynocc(range.phi = c(0.2,0.3), range.r = c(0,0.2))) # Decline
str(data <- simDemoDynocc(range.phi = c(0.8,1), range.r = c(0.5,0.7))) # Increase
str(data <- simDemoDynocc(nvisits = 1))      # Single visit
str(data <- simDemoDynocc(range.p = c(1,1))) # Perfect detection
Simulate detection/nondetection data under a wide variety of non-spatial dynamic occupancy models
Description
Function to simulate detection/nondetection data under a general dynamic site-occupancy model, including:
* Annual variation in the probabilities of patch persistence, colonization and detection is specified by the bounds of a uniform distribution.
* One covariate is allowed to affect each parameter: a site covariate for psi1, site-by-year covariates for phi and gamma, and an observational covariate for p. Covariates are generated internally from uniform(-2, 2) distributions.
* Additional heterogeneity among sites in persistence and colonization or both.
* Additional detection heterogeneity at the site-, the survey, or the site-by-survey level, with the possibility of a temporal trend in this heterogeneity over the years. E.g., an annual trend in detection heterogeneity at the site or the survey level is specified by the first and second value, which correspond to the heterogeneity in the first and the last year. Hence, trend.sd.site = c(0, 1) will result in a linear trend in the magnitude of site heterogeneity in detection from 0 in the first year to 1 in the last year.
* Additional detection heterogeneity that varies over the survey (= occasion) according to a quadratic effect of occasion number (to model the typical phenology of an insect species for instance).
* Simulation of data under a BACI (before-after-control-impact) design, where some event happens in a given year and reduces phi or gamma by a stated percentage (only reductions, no increases allowed!)
Usage
simDynocc(nsites = 250, nyears = 10, nsurveys = 3, year.of.impact = NA,
  mean.psi1 = 0.4, beta.Xpsi1 = 0,
  range.phi = c(0.5, 1), sd.lphi.site = 0, impact.phi = 0, beta.Xphi = 0,
  range.gamma = c(0, 0.5), sd.lgamma.site = 0, impact.gamma = 0, beta.Xgamma = 0,
  sd.lphi.lgamma.site = 0,
  range.p = c(0.1, 0.9), beta.Xp = 0,
  range.beta1.survey = c(0, 0), range.beta2.survey = c(0, 0),
  trend.sd.site = c(0, 0), trend.sd.survey = c(0, 0),
  trend.sd.site.survey = c(0, 0), show.plots = TRUE)
Arguments
| nsites | Number of sites. | 
| nyears | Number of years (or 'seasons'). | 
| nsurveys | Number of replicate surveys (= occasions) within a year. | 
| year.of.impact | Year in which an impact happens that can affect phi and gamma (for BACI design), NA if no impact occurs; for the BACI design,  | 
| mean.psi1 | average occupancy probability in first year. | 
| beta.Xpsi1 | coefficient of environmental covariate in probability of initial occupancy. | 
| range.phi | bounds of uniform distribution from which annual probability of persistence is randomly drawn. | 
| sd.lphi.site | SD of random site effect on persistence on the logit scale drawn from a normal distribution with mean zero. | 
| impact.phi | negative effect in percent on annual phi (e.g., impact.phi = 20 means a 20% reduction in phi); ignored if  | 
| beta.Xphi | coefficients of environmental covariate in probability of persistence. | 
| range.gamma | bounds of uniform distribution from which annual probability of colonization is randomly drawn. | 
| sd.lgamma.site | SD of random site effect on colonization on the logit scale drawn from a normal distribution with mean zero. | 
| impact.gamma | negative effect in percent on annual gamma (e.g., impact.gamma = 20 means a 20% reduction in gamma); ignored if  | 
| beta.Xgamma | coefficient of environmental covariate in probability of colonization. | 
| sd.lphi.lgamma.site | SD of random site effect on persistence AND colonization on the logit scale drawn from a normal distribution with mean zero. | 
| range.p | bounds of uniform distribution from which annual probability of detection is randomly drawn. | 
| beta.Xp | coefficients of environmental covariate in probability of detection. | 
| range.beta1.survey | bounds of the uniform distribution from which the annual variation in the linear effect of the survey occasion (i.e., of survey 1-12 with  | 
| range.beta2.survey | the same for the quadratic effect of survey occasion. | 
| trend.sd.site | initial and final values of sd of normal distribution to model logit-normal noise in p at the site level; a linear trend is assumed over time; if the two values are the same, a constant value is assumed. | 
| trend.sd.survey | initial and final values of sd of normal distribution to model logit-normal noise in p only at the 'survey' level; if they are different, a linear trend is assumed over time. | 
| trend.sd.site.survey | initial and final values of sd of normal distribution to model logit-normal noise in p at the site/year/survey = ‘survey’ level; if they are different, a linear trend is assumed over time. | 
| show.plots | If TRUE, plots of results are displayed; set to FALSE if running simulations. | 
Value
A list with the values of the arguments input and the following additional elements:
| impact | a 0/1 vector of length (nyears-1) indicating if an impact applies to the interval | 
| BACI.effect.phi | reduction in persistence due to impact, a vector of length nyears - 1 | 
| BACI.effect.gamma | reduction in colonization due to impact, a vector of length nyears - 1 | 
| beta1 | linear effect of occasion on the product of availability and detection, a vector of length nyears | 
| beta2 | quadratic effect of occasion on the product of availability and detection, a vector of length nyears | 
| mean.phi | mean persistence for each interval before application of any BACI effect, a vector of length nyears - 1 | 
| mean.gamma | mean colonization for each interval before application of any BACI effect, a vector of length nyears - 1 | 
| mean.p | mean detection probability for each year, a vector of length nyears | 
| psi | annual occupancy for each site, a nsites x nyears matrix | 
| mean.psi | average occupancy over sites, a vector of length nyears | 
| n.occ | number of occupied sites, a vector of length nyears | 
| psi.fs | finite-sample occupancy proportion, a vector of length nyears | 
| psi.app | apparent occupancy over sites, a vector of length nyears | 
| z | true occurrence state, a nsites x nyears matrix of 0/1 | 
| phi | persistence, a nsites x nyears-1 matrix | 
| gamma | colonization, a nsites x nyears-1 matrix | 
| p | detection probability, a nsites x nsurveys x nyears array | 
| y | the observed detection history, a nsites x nsurveys x nyears array | 
| Xpsi1 | covariate affecting initial occupancy, a vector of length nsites | 
| Xphi | covariate affecting persistence, a nsites x nyears matrix | 
| Xgamma | covariate affecting colonization, a nsites x nyears matrix | 
| Xp | covariate affecting probability of detection, a nsites x nsurveys x nyears array | 
| eps.lphi.site | site random effects on persistence, a vector of length nsites | 
| eps.lgamma.site | site random effects on colonization, a vector of length nsites | 
| eps.lphi.lgamma.site | site random effects on persistence and colonization, a vector of length nsites | 
| eps1 | site random effects on detection, a vector of length nsites | 
| eps2 | the survey random effects on detection, a vector of length nsurveys | 
| eps3 | the site/survey/year random effects on detection, a nsites x nsurveys x nyears array | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 4.
Examples
# Generate data with the default arguments and look at the structure
tmp <- simDynocc()
str(tmp)
# no annual variation in the parameters
str(data <- simDynocc(nsites = 250, nsurveys = 3, nyears = 10, mean.psi1 = 0.6,
    range.phi = c(0.7, 0.7), range.gamma = c(0.3, 0.3), range.p = c(0.5, 0.5)))
# a fully time-dependent model (with p constant within each primary period)
str(data <- simDynocc(mean.psi1 = 0.6, range.phi = c(0.5, 0.8),
   range.gamma = c(0.1, 0.5), range.p = c(0.1, 0.9)) )
# a time-constant model with four different covariates affecting the four parameters
str(data <- simDynocc(mean.psi1 = 0.6, beta.Xpsi1 = 1,
  range.phi = c(0.6, 0.6), beta.Xphi = 2, range.gamma = c(0.3, 0.3),
  beta.Xgamma = 2, range.p = c(0.2, 0.2), beta.Xp = -2) )
# seasonal variation in detection probability
str(data <- simDynocc(nsurveys = 12, mean.psi1 = 0.6,
  range.phi = c(0.6, 0.6), range.gamma = c(0.3, 0.3),
  range.p = c(0.5, 0.5), range.beta1.survey = c(-0.3, 0.4),
  range.beta2.survey = c(0, -0.7)) )
# now both yearly variation and effects of all covariates (including survey)
str( data <- simDynocc(mean.psi1 = 0.6, beta.Xpsi1 = 1,
   range.phi = c(0.6, 1), beta.Xphi = 2, range.gamma = c(0, 0.2),
   beta.Xgamma = 2, range.p = c(0.1, 0.9), beta.Xp = -2,
   range.beta1.survey = c(-0.4, 0.5), range.beta2.survey = c(0, -0.8)) )
# To add detection heterogeneity at the site level, you can do this:
str(data <- simDynocc(trend.sd.site = c(3, 3)) ) # No time trend
str(data <- simDynocc(trend.sd.site = c(1, 3)) ) # With time trend
# To add detection heterogeneity at the level of the survey, you can do this:
str(data <- simDynocc(trend.sd.survey = c(3, 3)) ) # No time trend
str(data <- simDynocc(trend.sd.survey = c(1, 3)) ) # With time trend
# To add detection heterogeneity at the level of the individual visit, you can do this:
str(data <- simDynocc(trend.sd.site.survey = c(3, 3)) ) # No trend
str(data <- simDynocc(trend.sd.site.survey = c(1, 3)) ) # With trend
# To simulate data under a BACI design, where an impact happens in year 10
str(data <- simDynocc(nsites = 250, nsurveys = 3, nyears = 20, year.of.impact = 10,
   impact.phi = 80, impact.gamma = 50) )
# And data where there is no detection error (i.e., with p = 1):
str( data <- simDynocc(range.p = c(1, 1)) )
Simulate data under a dynamic occupancy model with spatial autocorrelation
Description
Function to simulate detection/nondetection data under a general dynamic site-occupancy model with autocorrelation, including:
* Annual variation in the probabilities of patch persistence, colonization and detection is specified by the bounds of a uniform distribution.
* One covariate is allowed to affect a parameter: a site covariate for psi1, site-by-year covariates for phi and gamma, and an observational covariate for p. Covariates are generated internally from uniform(-2, 2) distributions.
* Additional heterogeneity among sites in persistence and colonization or both.
* Additional detection heterogeneity at the site or survey level, with the possibility of a temporal trend in this heterogeneity over the years. E.g., an annual trend in detection heterogeneity at the site or the survey level is specified by the first and second value, which correspond to the heterogeneity in the first and the last year. Hence, trend.sd.site = c(0, 1) will result in a linear trend in the magnitude of site heterogeneity in detection from 0 in the first year to 1 in the last year.
* A single, spatially structured covariate for habitat suitability may affect all parameters via coefficient beta.XAC; for a biologically reasonable way, choose coefficients with the same sign for all 4 (mediated by underlying density). That spatial covariate is simulated as a Gaussian random field with negative exponential correlation function with 'range parameter' theta.XAC.
* Autologistic effects (beta.Xautolog) in persistence and colonization probability can be chosen, which fits a logistic regression of these parameters on the proportion of occupied neighboring cells (in a queen's or 2nd order neighborhood) during the previous time step.
* Additional detection heterogeneity can be introduced at the site- or the individual survey level, with the possibility of a temporal trend in this heterogeneity. For instance, an annual trend in detection heterogeneity at the site or the survey level is specified by the value in the first and the last year. Hence, trend.sd.site = c(0, 1) will result in a linear trend in the magnitude of site heterogeneity in detection from 0 in the first year to 1 in the last year.
Previous versions used RandomFields, but that is not currently available on CRAN. fields is now used instead, but it cannot deal with large values of side and theta.XAC. If you have RandomFields installed (perhaps by getting it from the CRAN archive), you can load a version of simDynoccSpatial that supports it with source(system.file("RandomFieldsSupport", "simDynoccSpatial.R", package="AHMbook")).
See also simDynoccSpatialData.
Usage
simDynoccSpatial(side = 50, nyears = 10, nsurveys = 3,
      mean.psi1 = 0.4, beta.Xpsi1 = 0,
      range.phi = c(0.8, 0.8), beta.Xphi = 0,
      range.gamma = c(0.1, 0.1), beta.Xgamma = 0,
      range.p = c(0.4, 0.4), beta.Xp = 0,
      theta.XAC = 5000, beta.XAC = c(0, 0, 0, 0), beta.Xautolog = c(0, 0),
      trend.sd.site = c(0, 0), trend.sd.survey = c(0, 0),
      seed.XAC = NA, seed = NULL, show.plots = TRUE, ask.plot = TRUE, verbose = TRUE)
Arguments
| side | side length of square simulation area; the number of sites, or cells, M = side^2. | 
| nyears | Number of years (or 'seasons'). | 
| nsurveys | Number of replicate surveys (= occasions) within a year. | 
| mean.psi1 | average occupancy probability in first year. | 
| beta.Xpsi1 | coefficient of environmental covariate in probability of initial occupancy. | 
| range.phi | bounds of uniform distribution from which annual probability of persistence is randomly drawn. | 
| beta.Xphi | coefficient of environmental covariate in probability of persistence. | 
| range.gamma | bounds of uniform distribution from which annual probability of colonization is randomly drawn. | 
| beta.Xgamma | coefficient of environmental covariate in probability of colonization. | 
| range.p | bounds of uniform distribution from which probability of detection is randomly drawn. | 
| beta.Xp | coefficient of environmental covariate in probability of detection. | 
| theta.XAC | 'range parameter' of a covariate with exponential spatial correlation (i.e., a Gaussian random field is used as an environmental covariate); must be > 0. | 
| beta.XAC | vector of coefficients of that field for the 4 model parameters: psi1, phi, gamma, and p (in that order). | 
| beta.Xautolog | vector of coefficients of autologistic covariate in the following order: persistence (phi), colonization (gamma); the autocovariate is computed at every season as the proportion of occupied cells in a queen's neighborhood around each cell. | 
| trend.sd.site | initial and final values of sd of normal distribution to model logit-normal noise in p at the site level; a linear trend is assumed over time; if the two values are the same, a constant value is assumed. | 
| trend.sd.survey | initial and final values of sd of normal distribution to model logit-normal noise in p at the 'survey' level; if they are different, a linear trend is assumed over time. | 
| seed.XAC | the seed to be used for simulation of the spatially structured covariate for habitat suitability. | 
| seed | the seed to be used for simulation of values apart from the spatially structured covariate for habitat suitability; using the same value for seed.XAC with different values for seed allows generation of different data sets with the same habitat suitability covariate. | 
| show.plots | if TRUE, summary plots are displayed. | 
| ask.plot | If TRUE, pause between plots of results; set to FALSE if running simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the values of the arguments input and the following additional elements:
| M | scalar, total number of pixels in the study area, side^2 | 
| grid | 2-column matrix, x and y coordinates of each pixel | 
| amatrix | M x M matrix, [i,j] = 1 if cells i and j are neighbors, 0 otherwise | 
| Xpsi1 | side x side matrix, value of covariate affecting initial occupancy | 
| Xphi | side x side x nyears array, value of covariate affecting persistence | 
| Xgamma | side x side x nyears array, value of covariate affecting colonization | 
| Xp | side x side x nsurveys x nyears array, value of covariate affecting detection | 
| XAC | side x side matrix, the spatially correlated covariate | 
| Xauto | side x side x nyears array, the autocovariate, the proportion of neighboring cells occupied | 
| Xautoobs | side x side x nyears array, the observed autocovariate, the proportion of neighboring cells where the species was detected | 
| sd.site | vector nyears, year-specific values of SD of Gaussian random site effects in p | 
| sd.survey | vector nyears, year-specific values of SD of Gaussian random survey effects in p | 
| mean.phi | vector nyears-1, mean persistence for each interval | 
| mean.gamma | vector nyears-1, mean colonization for each interval | 
| mean.p | vector nyears, mean detection probability for each year | 
| psi | side x side x nyears array, probability of occupancy for each site and year | 
| mean.psi | vector nyears, mean occupancy over cells for each year | 
| psi.app | vector nyears, apparent occupancy, proportion of cells where species detected | 
| z | side x side x years array, true occupancy state | 
| zobs | side x side x years array, observed occupancy state | 
| nocc | vector nyears, true number of occupied cells | 
| nocc.obs | vector nyears, number of cells where the species was detected | 
| phi | side x side x nyears-1 array, probability of persistence in each interval between years | 
| gamma | side x side x nyears-1 array, probability of colonization in each interval between years | 
| p | side x side x nsurveys x nyears array, probability of detection | 
| y | side x side x nsurveys x nyears array, the observed detection history | 
| umf | an unmarked data frame object with the simulated data | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 9.
Examples
# Generate data with the default arguments and look at the structure
try(str(simDynoccSpatial()))  # Fails if RandomFields is not available
str(simDynoccSpatial(side=50, theta.XAC=1))
Simulated data from simDynoccSpatial for AHM2 section 9.6.1.2 (page 564)
Description
The data analyzed in Kéry & Royle (2021) section 9.6.1.2 was generated using simDynoccSpatial with the RandomFields package. That data set cannot be generated if  RandomFields is not available. This provides a data set to use with the code in that section.
Usage
data(simDynoccSpatialData)Format
A list with 45 elements. See the Value section in simDynoccSpatial for details.
Details
The data set was generated on a machine with RandomFields installed, using the following code:
  source(system.file("RandomFieldsSupport", "simDynoccSpatial.R", package="AHMbook"))
  simDynoccSpatialData <- simDynoccSpatial(side = 30, nyears = 10, nsurveys = 3, mean.psi1 = 0.1,
  range.phi = c(0.5, 0.5), range.gamma = c(0.2, 0.2), range.p = c(0.4, 0.4),
  beta.Xautolog = c(1, 1), seed.XAC = 1, seed = 24, ask.plot = TRUE)
Source
Simulated data.
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 section 9.6.1.2.
Examples
data(simDynoccSpatialData)
dat <- simDynoccSpatialData
str(dat)
Create a Gaussian random field with negative exponential correlation
Description
Function creates Gaussian random field with negative exponential correlation and visualizes correlation and random field.
Previous versions used RandomFields, but that is not currently available on CRAN. fields is now used instead, but it cannot deal with large values of size and theta. If you have RandomFields installed (perhaps by getting it from the CRAN archive), you can load a version of simExpCorrRF that supports it with source(system.file("RandomFieldsSupport", "simExpCorrRF.R", package="AHMbook")).
Usage
simExpCorrRF(variance = 1, theta = 1, size = 50, seed = NA, show.plots = TRUE)
Arguments
| variance | variance of field. | 
| theta | parameter governing spatial correlation (=1/phi) ("large theta means high correlation") Note that RMexp (which is used internally) is specified in terms of phi = 1/theta. | 
| size | number of pixels along each side of the square site. | 
| seed | used to generate reproducible output. | 
| show.plots | if TRUE, the result will be displayed. | 
Value
A list with the values of the input arguments and the following additional elements:
| field | the random field variable, a vector of length size^2 | 
| grid | the grid corresponding to field | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 9.
Examples
# Run the function with default values and look at the output
str(tmp <- simExpCorrRF())
Simulate detection data for diseased frogs
Description
Simulates detection data for an amphibian species affected by the fungal pathogen Batrachochytrium dendrobatidis (Lips et al., 2006). Surveys are carried out on multiple occasions each year. Observers count frogs and record whether they were infected or not. Over time, individuals are subject to mortality and there may also be recruitment of new individuals into the population. Individuals transition between the two states: infected frogs may become uninfected (i.e., can shed the fungus) and vice versa. It also seems reasonable to expect that survival probability differs between the two states.
Usage
simFrogDisease(nsites = 100, nyears = 3, nsurveys = 3, alpha.lam = 3,
    omega = c(0.9, 0.7), gamma = c(2,1), p = c(0.8, 0.8, 0.8),
    recovery = 0.1, infection = 0.1)
Arguments
| nsites | the number of sites surveyed. | 
| nyears | the number of years of the study. | 
| nsurveys | the number of surveys each year. | 
| alpha.lam | mean abundance per site in the first year. | 
| omega | vector length 2, state-specific survival, noninfected and infected. | 
| gamma | vector length 2, state-specific recruitment, noninfected and infected. | 
| p | vector length  | 
| recovery | probability of recovery for an infected frog. | 
| infection | probability of infection for a noninfected frog. | 
Value
A list with the values of the arguments and the following additional elements:
| SN | sites x intervals, number of noninfected frogs surviving | 
| SI | sites x intervals, number of infected frogs surviving | 
| GN | sites x intervals, number of noninfected frogs recruited | 
| GI | sites x intervals, number of infected frogs recruited | 
| TrI | sites x intervals, number of infected frogs recovering | 
| TrN | sites x intervals, number of noninfected frogs becoming infected | 
| NN | sites x years, number of noninfected frogs at each site | 
| NI | sites x years, number of infected frogs at each site | 
| yN | sites x years x surveys, number of noninfected frogs detected | 
| yI | sites x years x surveys, number of infected frogs detected | 
Author(s)
Marc Kéry and Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 2.9.1.
Lips, K.R., Brem, F., Brenes, R., Reeve, J.D., Alford, R.A., Voyles, J., et al., 2006. Emerging infectious disease and the loss of biodiversity in a Neotropical amphibian community. Proc. Natl. Acad. Sci. USA, 103, 3165-3170.
Examples
# Generate a simulated data set with default arguments and look at the structure:
tmp <- simFrogDisease()
str(tmp)
Simulate data under hierarchical distance sampling protocol (line or point)
Description
The function simulates hierarchical distance sampling (HDS) data under either a line or a point transect protocol. At each site, it works with a strip of width B*2 (for line transects) or a circle of radius B inscribed in a square of side B*2 (for point transects).
The state process is simulated by first drawing a covariate value, "habitat", for each site from a Normal(0, 1) distribution. This is used in a log-linear regression with arguments mean.lambda and beta.lam to calculate the expected number of animals, lambda, in each strip or square.
For line transects, the distance from the line is drawn from a Uniform(0, B) distribution. For point transects, the animals are distributed randomly over the square before calculating the distance of each from the point. Observations of animals further than B from the point are discarded.
A detection covariate, "wind", for each site is drawn from a Uniform(-2, 2) distribution. This is used in a log-linear regression with arguments mean.sigma and beta.sig to calculate the scale parameter, sigma, of the half-normal detection function. Detections are simulated as Bernoulli trials with probability of success decreasing with distance from the line or point.
Usage
simHDS(type=c("line", "point"), nsites = 100, mean.lambda = 2, beta.lam = 1,
  mean.sigma = 1, beta.sig = -0.5, B = 3, discard0 = TRUE, show.plot = TRUE)
Arguments
| type | type of transect, "line" or "point". | 
| nsites | Number of sites (spatial replication) | 
| mean.lambda | the expected value of lambda when the habitat covariate = 0; the intercept of the log-linear regression for lambda is log(mean.lambda). | 
| beta.lam | the slope of the log-linear regression for lambda on a habitat covariate. | 
| mean.sigma | the expected value of the scale parameter of the half-normal detection function when the wind speed = 0; the intercept of the log-linear regression for sigma is log(mean.sigma). | 
| beta.sig | the slope of log-linear regression of scale parameter of the half-normal detection function on wind speed | 
| B | the strip half-width or circle radius | 
| discard0 | If TRUE, subset to sites at which individuals were captured. You may or may not want to do this depending on how the model is formulated so be careful. | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| data | simulated distance sampling data: a matrix with a row for each individual detected and 5 columns: site ID, status (1 if captured), x and y coordinates (NA for line transects), distance from the line or point; if  | 
| habitat | simulated habitat covariate | 
| wind | simulated detection covariate | 
| N | simulated number of individuals at each site | 
| N.true | for point counts, the simulated number of individuals within the circle sampled | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 8.5.1.
Examples
# Simulate a data set with the default arguments and look at the structure of the output
set.seed(123)
tmp <- simHDS()
str(tmp)
head(tmp$data)
tmp <- simHDS("point", discard0=FALSE)
str(tmp)
head(tmp$data, 10)
Simulate data under HDS protocol with groups
Description
Simulates hierarchical distance sampling (HDS) data for groups under either a line or a point transect protocol and using a half-normal detection function (Buckland et al. 2001).
At each site, it works with a strip of width B*2 (for line transects) or a circle of radius B (for point transects).
The state process is simulated by first drawing a covariate value, "habitat", for each site from a Normal(0, 1) distribution. This is used in a log-linear regression with arguments beta0 and beta1 to calculate the expected number of groups in each strip or circle. Group size is simulated by first drawing from a Poisson distribution with parameter lambda.group then adding 1.
For line transects, the distance from the line is drawn from a Uniform(0, B) distribution. For point transects, the distance from the point is simulated from B*sqrt(Uniform(0,1)), which ensures a uniform distribution over the circle.
The group size is used in a log-linear regression with arguments alpha0 and alpha1 to calculate the scale parameter, sigma, of the half-normal detection function. Detections are simulated as Bernoulli trials with probability of success decreasing with distance from the line or point.
Usage
simHDSg(type = c("line", "point"), nsites = 100, lambda.group = 0.75,
  alpha0 = 0, alpha1 = 0.5,
  beta0 = 1, beta1 = 0.5, B = 4, discard0 = TRUE, show.plot = TRUE)
Arguments
| type | The type of distance transect, either "line" or "point". | 
| nsites | Number of sites (spatial replication) | 
| lambda.group | Poisson mean of group size | 
| alpha0 | intercept of log-linear model relating sigma of the half-normal detection function to group size | 
| alpha1 | slope of log-linear model relating sigma of the half-normal detection function to group size | 
| beta0 | intercept of log-linear model relating the Poisson mean of the number of groups per unit area to habitat | 
| beta1 | slope of log-linear model relating the Poisson mean of the number of groups per unit area to habitat | 
| B | strip half width or the radius of the circle | 
| discard0 | whether to discard or keep the data from sites with nobody detected | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| data | simulated distance sampling data: a matrix with a row for each group detected and 6 columns: site ID, status (1 if captured), x and y coordinates (NA for line transects), distance from the line or point, group size; if  | 
| habitat | simulated habitat covariate | 
| N | simulated number of groups at each site | 
| N.true | for point counts, the simulated number of groups within the circle sampled | 
| groupsize | group size for each of the groups observed | 
Author(s)
Marc Kéry & Andy Royle
References
Buckland, S.T., et al (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 9.2.1.
Examples
# Run with the default arguments and look at the structure of the output:
set.seed(123)
tmp <- simHDSg()
str(tmp)
head(tmp$data)
str(simHDSg(type = "line"))     # Defaults for line transect data
str(simHDSg(type = "point"))    # Defaults for point transect data
str(simHDSg(lambda.group = 5))  # Much larger groups
str(simHDSg(lambda.group = 5, alpha1 = 0)) # No effect of groups size on p
Simulate open hierarchical distance sampling data
Description
Simulates distance sampling data for multiple replicate surveys in a multi-season (or multi-year) model, incorporating habitat and detection covariates, temporary emigration, and a trend in abundance or density.
At each site, it works with a strip of width B*2 (for line transects) or a circle of radius B (for point transects).
The state process is simulated by first drawing a covariate value, "habitat", for each site from a Normal(0, 1) distribution. This is used in a log-linear regression with arguments mean.lam, beta.lam and beta.trend to calculate the expected number of animals, lambda, in each strip or circle for each year. Site- and year-specific abundances are drawn from a Poisson distribution with mean lambda. The number available for capture at each replicate survey is simulated as a binomial distribution with probability phi.
For line transects, the distance from the line is drawn from a Uniform(0, B) distribution. For point transects, the distance from the point is simulated from B*sqrt(Uniform(0,1)), which ensures a uniform distribution over the circle.
A detection covariate, "wind", for each survey is drawn from a Uniform(-2, 2) distribution. This is used in a log-linear regression with arguments mean.sig and beta.sig to calculate the scale parameter, sigma, of the half-normal detection function. Detections are simulated as Bernoulli trials with probability of success decreasing with distance from the line or point.
Usage
simHDSopen(type=c("line", "point"), nsites = 100,
  mean.lam = 2, beta.lam = 0, mean.sig = 1, beta.sig = 0,
  B = 3, discard0 = TRUE, nreps = 2, phi = 0.7, nyears = 5, beta.trend = 0)
Arguments
| type | the transect protocol, either "line" or "point" . | 
| nsites | Number of sites (spatial replication) | 
| mean.lam | intercept of log-linear regression of expected lambda on a habitat covariate | 
| beta.lam | slope of log-linear regression of expected lambda on a habitat covariate | 
| mean.sig | intercept of log-linear regression of scale parameter of half-normal detection function on wind speed | 
| beta.sig | slope of log-linear regression of scale parameter of half-normal detection function on wind speed | 
| B | strip half width, or maximum distance from the observer for point counts | 
| discard0 | Discard sites at which no individuals were captured. You may or may not want to do this depending on how the model is formulated so be careful. | 
| nreps | the number of distance sampling surveys within a period of closure in a season (or year) | 
| phi | the availability parameter | 
| nyears | the number of seasons (typically years) | 
| beta.trend | loglinear trend of annual population size or density | 
Value
A list with the values of the arguments entered and the following additional elements:
| data | simulated distance sampling data: a list with a component for each year, each itself a list with a component for each replicate; this is a matrix with a row for each individual detected and 5 columns: site ID, status (1 if captured), x and y coordinates (NA for line transects), distance from the line or point; if  | 
| habitat | simulated habitat covariate, a vector of length  | 
| wind | simulated detection covariate, a  | 
| M.true | simulated number of individuals, a  | 
| K | 
 | 
| Na | the number of individuals available for detection, a  | 
| Na.real | for point counts, the number of individuals available for detection within the circle sampled, a  | 
Note
For "point" the realized density is [(area of circle) /(area of square)]*lambda
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 9.5.4.1.
Examples
set.seed(123)
tmp <- simHDSopen() # Generate data with default parameters
str(tmp)
head(tmp$data[[1]][[1]])
tmp <- simHDSopen("point")
str(tmp)
head(tmp$data[[1]][[1]])
tmp <- simHDSopen(discard0=FALSE)
str(tmp)
head(tmp$data[[1]][[1]])
Simulate data under hierarchical distance sampling point transect protocol
Description
The function simulates hierarchical distance sampling (HDS) data under a point transect protocol. At each site, it works with a circle of radius B.
The state process is simulated by first drawing a covariate value, "habitat", for each site from a Normal(0, 1) distribution. This is used in a log-linear regression with arguments mean.density and beta.density to calculate the expected density of animals per hectare. The expected number of animals in the circle is calculated from the area of the circle and the density, and numbers are drawn from a Poisson distribution for each site.
Animals are assumed to be distributed randomly over the circle, and distances from the point are generated.
A detection covariate, "wind", for each site is drawn from a Uniform(-2, 2) distribution. This is used in a log-linear regression with arguments mean.sigma and beta.sigma to calculate the scale parameter, sigma, of the half-normal detection function. Detections are simulated as Bernoulli trials with probability of success decreasing with distance from the point.
This is a simplified (and faster) version of the function simHDS with type="point", but works with densities and with measurements in hectares and meters.
Usage
simHDSpoint(nsites = 1000, mean.density = 1, beta.density  = 1,
    mean.sigma = 20, beta.sigma = -5, B = 60, discard0=FALSE, show.plots=TRUE)
Arguments
| nsites | Number of sites (spatial replication) | 
| mean.density | the expected value of density (animals per hectare) when the habitat covariate = 0; the intercept of the log-linear regression for density is log(mean.density). | 
| beta.density | the slope of the log-linear regression for density on a habitat covariate. | 
| mean.sigma | the expected value of the scale parameter of the half-normal detection function when the wind speed = 0; the intercept of the log-linear regression for sigma is log(mean.sigma). | 
| beta.sigma | the slope of log-linear regression of scale parameter of the half-normal detection function on wind speed. | 
| B | the circle radius in meters. | 
| discard0 | If TRUE, subset to sites at which individuals were captured. You may or may not want to do this depending on how the model is formulated so be careful. | 
| show.plots | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| data | simulated distance sampling data: a matrix with a row for each individual detected and 2 columns: site ID and distance from the point in meters. If  | 
| counts | the number of individuals detected at each site | 
| habitat | simulated habitat covariate | 
| wind | simulated detection covariate | 
| N | simulated number of individuals within the circle sampled | 
Author(s)
Marc Kéry, Andy Royle, Mike Meredith
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 8.5.1.
Examples
# Simulate a data set with the default arguments and look at the structure of the output
set.seed(123)
tmp <- simHDSpoint()
str(tmp)
head(tmp$data, 10)
head(tmp$counts, 10)
# Without 'wind', plots clearly show effect of distance from point
tmp <- simHDSpoint(beta.sigma=0)
tmp <- simHDSpoint(discard0=TRUE)
str(tmp)
head(tmp$data, 10) # some sites missing, no NAs.
head(tmp$counts)
Simulate HDS time-removal or double-observer data
Description
A general function for simulating hierarchical distance sampling (HDS) data combined with a time-removal (with 3 removal periods) or double-observer protocol, either for a line or a point transect protocol and with method = "removal" or method = "double". Also produces plots of the output.
At each site, it works with a strip of width B*2 (for line transects) or a circle of radius B (for point transects).
The state process is simulated by first drawing a covariate value, "habitat", for each site from a Normal(0, 1) distribution. This is used in a log-linear regression with arguments beta0 and beta1 to calculate the expected number of groups in each strip or circle. Group size is simulated by first drawing from a Poisson distribution with parameter lambda.group then adding 1.
For line transects, the distance from the line is drawn from a Uniform(0, B) distribution. For point transects, the distance from the point is simulated from B*sqrt(Uniform(0,1)), which ensures a uniform distribution over the circle.
The group size is used in a log-linear regression with arguments alpha0 and alpha1 to calculate the scale parameter, sigma, of the half-normal detection function. Detections are simulated according to the selected protocol.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
simHDStr(type = c("line", "point"), method=c("removal", "double"), nsites = 200,
  lambda.group = 1, alpha0 = 0, alpha1 = 0, beta0 = 1, beta1 = 0.5,
  p.avail = 0.75, K = 3, p.double = c(0.4, 0.6),
  B = 3, discard0 = FALSE, show.plot = TRUE)
Arguments
| type | The type of distance transect, either "line" or "point". | 
| method | Is the method time-removal ("removal") or double-observer ("double")? | 
| nsites | Number of sites (spatial replication) | 
| lambda.group | Poisson mean of group size | 
| alpha0 | intercept of log-linear model relating sigma of half-normal detection function to group size | 
| alpha1 | slope of log-linear model relating sigma of half-normal detection function to group size | 
| beta0 | intercept of log-linear model relating the Poisson mean of the number of groups per unit area to habitat | 
| beta1 | slope of log-linear model relating the Poisson mean of the number of groups per unit area to habitat | 
| p.avail | overall availability probability (phi in text) | 
| K | number of removal periods (of equal length) | 
| p.double | detection probability for first and second observer | 
| B | strip half width or radius of the circle | 
| discard0 | whether to discard or keep the data from sites with nobody detected | 
| show.plot | choose whether to show plots or not. Set to FALSE when running simulations. | 
Value
A list with the values of the arguments and the following additional elements:
| data | simulated distance sampling data: a matrix with a row for each group detected and 7 columns: site ID, status (1 if captured), x and y coordinates (NA for line transects), distance from the line or point, group size, the time interval of removal or capture history; if  | 
| habitat | simulated habitat covariate | 
| M | simulated number of groups at each site | 
| M.true | for point counts, the simulated number of groups within the circle sampled | 
| params | a vector with the input arguments | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 9.3.2 (time removal) and 9.4.1 (double observer).
Examples
# Run with the default arguments and look at the structure of the output:
set.seed(123)
tmp <- simHDStr() # default: line, removal
str(tmp)
head(tmp$data)
tmp <- simHDStr("point", method="double")
str(tmp)
head(tmp$data)
Simulate data for an integrated distance sampling, point count and occupancy study
Description
Generates hierarchical distance sampling data, point count data, and occupancy (detection/nondetection) data for an integrated distance sampling (IDS) model with shared density and availability processes, but possibly different detection/perceptability processes (i.e., different detection functions).
The function calls simHDSpoint to generate hierarchical distance sampling (HDS) data under a point transect protocol with a half-normal detection function. Density is modeled as a log-linear regression on a "habitat" covariate, with coefficients mean.density and beta.density.
Point count and occupancy data are also generated by calls to simHDSpoint, assuming that the underlying detection process involves reduced probability of detection with distance from the observer. The distances from the observer are discarded, and only the counts (for PC data) or species detection information (for occupancy) retained.
Availability is modeled according to Sólymos et al. (2013), where the probability of availability depends on the duration of the observation.
Usage
simIDS(mean.density = 1, beta.density = 1, mean.phi = 0.14, beta.phi = 0,
    nsites_HDS = 1000, sigHDS = 100, maxDist_HDS = 200, nbins = 4,
    range.dur.HDS = c(5, 5),
    nsites_PC = 10000, sigPC = 70, maxDist_PC = 500, range.dur.PC = c(3, 30),
    nsites_OC = 5000, sigOC = sigPC, maxDist_OC = maxDist_PC,
    range.dur.OC = range.dur.PC,
    show.plots = TRUE)
Arguments
| mean.density | the expected value of density (animals per hectare) when the habitat covariate = 0; the intercept of the log-linear regression for density is log(mean.density). | 
| beta.density | the slope of the log-linear regression for density on a habitat covariate. | 
| mean.phi | the expected value of the availability parameter. | 
| beta.phi | the slope of the log-linear regression for availability on a covariate - not yet implemented. | 
| nsites_HDS | the number of sites (point transects) for distance sampling. | 
| sigHDS | the value of the scale parameter of the half-normal detection function for distance sampling. | 
| maxDist_HDS | the truncation distance for distance sampling (in meters); any observations beyond this distance are discarded. | 
| nbins | the number of distance bins for grouping distance sampling data. | 
| range.dur.HDS | the range of durations for distance sampling; durations for each site are simulated from a uniform distribution with this range. | 
| nsites_PC | the number of sites for point counts; set to 0 to suppress generation of point counts. | 
| sigPC | the value of the scale parameter of the half-normal detection function for point counts. | 
| maxDist_PC | the maximum distance from the observer for detection of animals when conducting point counts (m). | 
| range.dur.PC | the range of durations for point counts; durations for each site are simulated from a uniform distribution with this range. | 
| nsites_OC | the number of sites for occupancy surveys; set to 0 to suppress generation of occupancy data. | 
| sigOC | the value of the scale parameter of the half-normal detection function for occupancy surveys. | 
| maxDist_OC | the maximum distance from the observer for detection of animals when conducting occupancy surveys (m). | 
| range.dur.OC | the range of durations for point counts; durations for each site are simulated from a uniform distribution with this range. | 
| show.plots | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| dds | simulated durations for distance sampling: a vector length  | 
| umf_hds0 | an  | 
| umf_hds1 | an  | 
| dpc | simulated durations for point counts: a vector length  | 
| umf_pc0 | an  | 
| umf_pc1 | an  | 
| doc | simulated durations for occupancy surveys: a vector length  | 
| umf_oc0 | an  | 
| umf_oc1 | an  | 
Note
A function to analyze these data, IDS, will appear in a future version of unmarked. In the meantime, a devel version can be installed with remotes::install_github("kenkellner/unmarked", ref="IDS").
Author(s)
Ken Kellner, Marc Kéry, Mike Meredith
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 8.5.1.
Sólymos, P. et al. (2013) Calibrating indices of avian density from non-standardized survey data: making the most of a messy situation. Methods in Ecology and Evolution 4, 1047-1058.
Kéry, M. et al. (2022) Integrated distance sampling models for simple point counts. (Submitted to Ecology)
Examples
# Simulate a data set with the default arguments and look at the structure of the output
tmp <- simIDS()
str(tmp)
Simulate the multinomial-mixture model
Description
Simulation of "removal" data from a multinomial-mixture model.
Usage
simMultMix(nsites = 100, nsurveys = 3, nyears = 4,
  lambda = 3, theta = 0.5, p = 0.3)
Arguments
| nsites | number of sites. | 
| nsurveys | number of replicate (secondary) samples | 
| nyears | number of primary samples: years, seasons etc. | 
| lambda | expected local population size. | 
| theta | availability, the proportion of the population available for detection. | 
| p | detection probability. | 
Value
A list with the values of the input arguments and the following additional elements:
| M | true local population size, vector length nsites | 
| N | true number available for detection, vector length nsites | 
| y | number detected, nsites x nyears x nsurveys | 
| y2d | number detected as a matrix, nsites x (nyears*nsurveys) | 
Author(s)
Marc Kéry & Andy Royle, adapting code in documentation for gmultmix by Richard Chandler.
References
Chandler, R.B., Royle, J.A. & King. D.I. (2011) Inference about density and temporary emigration in unmarked populations. Ecology, 92, 1429-1435.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 2.7.1.
Examples
# Simulate a data set with the default arguments and look at the structure of the output:
tmp <- simMultMix()
str(tmp)
Simulate data for binomial and multinomial mixture models
Description
This very general function generates single-season count data under variants of the binomial N-mixture model of Royle (2004) and of the multinomial N-mixture model of Royle et al (2007).
Usage
simNmix(nsites = 267, nvisits = 3, mean.theta = 1, mean.lam = 2, mean.p = 0.6,
  area = FALSE, beta1.theta = 0, beta2.theta = 0, beta3.theta = 0,
  beta2.lam = 0, beta3.lam = 0, beta4.lam = 0, beta3.p = 0, beta5.p = 0,
  beta6.p = 0, beta.p.survey = 0, beta.p.N = 0, sigma.lam = 0, dispersion = 10,
  sigma.p.site = 0, sigma.p.visit = 0, sigma.p.survey = 0, sigma.p.ind = 0,
  Neg.Bin = FALSE, open.N = FALSE, show.plots = TRUE, verbose = TRUE)
Arguments
| nsites | number of sites | 
| nvisits | number of visits per site | 
| mean.theta | proportion of sites that can have non-zero abundance in principle: suitability model for zero-inflation | 
| mean.lam | Expected abundance at the average value of all abundance covariates (and ignoring random site effects): abundance model | 
| mean.p | Expected detection at the average value of all detection covariates (and ignoring all random effects): detection model | 
| area | determines area of sites (A), defaults to A=1 (i.e., all identical), but you can supply a vector of site areas of length nsites instead. | 
| beta1.theta | coefficient of site covariate 1 in suitability model | 
| beta2.theta | coefficient of site covariate 2 in suitability model | 
| beta3.theta | coefficient of site covariate 3 in suitability model | 
| beta2.lam | coefficient of site covariate 2 in abundance model | 
| beta3.lam | coefficient of site covariate 3 in abundance model | 
| beta4.lam | coefficient of site covariate 4 in abundance model | 
| beta3.p | coefficient of site covariate 3 in detection model | 
| beta5.p | coefficient of site covariate 5 in detection model | 
| beta6.p | coefficient of site covariate 6 in detection model | 
| beta.p.survey | coefficient of survey ('observational') covariate on p | 
| beta.p.N | coefficient of centered local population size (log(N+1)) in detection model (i.e., coef. for density-dependent detection prob.) | 
| sigma.lam | "Overdispersion SD" in linear predictor of abundance | 
| dispersion | 'size' or extra-Poisson dispersion of Negative binomial | 
| sigma.p.site | "Overdispersion SD" in linear predictor of detection coming from random site effects | 
| sigma.p.visit | "Overdispersion SD" in linear predictor of detection coming from random visit effects | 
| sigma.p.survey | "Overdispersion SD" in linear predictor of detection coming from random site-by-survey effects | 
| sigma.p.ind | "Overdispersion SD" in linear predictor of detection coming from random site-by-individual effects | 
| Neg.Bin | if FALSE, any overdispersion in abundance is modeled by a Poisson log-normal; if TRUE, abundance overdispersion is modeled by adoption of a Negative binomial distribution for latent N | 
| open.N | if TRUE, data are simulated under one specific form of an open population, where N in the first occasion is drawn from the specified mixture distribution and for all further occasions j, we have N_ij ~ Poisson(N_i(j-1)). With open.N = TRUE, we must have sigma.p.ind = 0, show.plot = FALSE and nvisits >1. | 
| show.plots | if TRUE, plots of the data will be displayed; set to FALSE if you are running simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Details
Data are simulated at the level of each individual and individual-specific detection heterogeneity can be included. As a side-effect, individual-specific detection histories are generated and hence, data are also be simulated under the corresponding multinomial N-mixture model.
Broadly, the function can generate data under this most general model:
'Suitability' (zero-inflation) ~ cov1 + cov2 + cov3
Abundance ~ offset + cov2 + cov3 + cov4 + overdispersion
Detection ~ cov3 + cov5 + cov6 + survey.covariate + log(N+1) + eps.site + eps.visit + eps.survey + eps.individual
Overdispersion in abundance is modeled either as a Poisson-log-normal with a normal random site effect in lambda or with a Negative binomial with mean lambda and a 'size', or dispersion, parameter. Variable site areas can be specified to affect abundance as in an offset.
Abundance can be zero-inflated (this is the 'suitability' model). Note that the zero-inflation parameter is called theta here (in unmarked it is called psi). mean.phi is the probability that a site is suitable (i.e., 1 minus the expected proportion of sites with structural zero abundance.
Site covariate 2 can affect both suitability and abundance, while covariate 3 may affect all three levels. Hence, the function permits to simulate the case where a single site covariate affects different levels in the process (e.g., abundance and detection) in opposing directions (as for instance in Kéry, 2008)
Density-dependent detection can be modeled as a logistic-linear effect of local abundance (centered and log(x+1) transformed). Overdispersion in detection is modeled via normal random effects (the eps terms above) specific to sites, visits, surveys or individuals.
Effects of covariates and random-effects factors are modeled as additive on the link scale (log for abundance and logit for suitability and detection).
Data may be generated under one specific open-population model when argument 'open.N' is set to TRUE.
Value
A list with the arguments input and the following additional elements:
| nobs | The total number of visits | 
| site.cov | An nsites x 6 matrix of values for 6 site covariates | 
| survey.cov | An nsites x nvisits matrix of values for a survey covariate | 
| log.lam | Linear predictor of PLN abundance model including random effects, a vector of length nsites | 
| s | Site suitability indicator, a vector of length nsites | 
| N | Number of individuals at each site, a vector of length nsites | 
| p | Probability of detection, an array with dimensions sites occupied x visits x max(N) | 
| DH | Detection history (1/0), an array with dimensions sites occupied x visits x max(N) | 
| N.open | Number of individuals at each site for open model, a sites x visits matrix | 
| C | Summary of DH: number of individuals detected for each site and visit | 
| eta.lam | Random site effects in lambda, a vector of length nsites, zero if Neg.Bin == TRUE | 
| eta.p.site | Random site effects in p, a vector of length nsites | 
| eta.p.visit | Random visit effects in p, a vector of length nvisits | 
| eta.p.survey | Random survey (= site-by-survey) effects in p, a nsites x nvisits matrix | 
| eta.p.ind | Random individual (= site-by-ind) effects in p (NOT site-ind-visit !), a nsites x max(N) matrix | 
| odcN | Naive overdispersion measure (var/mean) for true abundance (N) | 
| odcC | Naive overdispersion measure (var/mean) for observed counts (C) | 
| Ntotal | Total abundance summed over all sites | 
| summax | The sum of maximum counts over all sites | 
Author(s)
Marc Kéry & Andy Royle
References
Royle, J.A. (2004) N-mixture models for estimating population size from spatially replicated counts. Biometrics, 60, 108-115.
Royle, J.A., et al (2007) Hierarchical spatial models of abundance and occurrence from imperfect survey data. Ecological Monographs, 77, 465-481.
Kéry, M. (2008) Estimating abundance from bird counts: binomial mixture models uncover complex covariate relationships, Auk, 125, 336-345
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.5.
Examples
# Generate data with the default arguments and look at the structure:
tmp <- simNmix()
str(tmp)
str(data <- simNmix())                  # Null data-generating model
str(data <- simNmix(mean.theta = 0.60)) # ZIP with 40% structural zeroes
str(data <- simNmix(sigma.lam = 1))     # Poisson-lognormal (PLN) mixture
str(data <- simNmix(Neg.Bin = TRUE))    # Negative-binomial mixture
str(data <- simNmix(mean.theta = 0.6, sigma.lam = 1))  # Zero-inflated PLN
str(data <- simNmix(mean.theta = 0.6, Neg.Bin = TRUE)) # Zero-infl. NegBin
str(data <- simNmix(mean.p = 1))        # Perfect detection (p = 1)
str(data <- simNmix(mean.theta = 0.6, mean.p = 1))     # ZIP with p = 1
str(data <- simNmix(sigma.lam = 1, mean.p = 1))        # PLN with p = 1
Simulate replicated counts under a spatial, static binomial N-mixture model
Description
Simulates replicated counts under a spatial, static binomial N-mixture model for a semi-realistic landscape in a square of 50x50 km in the Bernese Oberland around Interlaken, Switzerland. Unit of the data simulation is a 1km2 quadrat, hence, there are 2500 units. It uses the data set BerneseOberland, which has covariates for elevation and forest cover.
For abundance, the function allows you to specify a quadratic effect of elevation. Then, a Gaussian spatial random field (s) with negative exponential correlation function is simulated using simExpCorrRF; you can set the variance and the range (scale) parameter theta. Basically, the larger the value of theta.RF, the bigger are the 'islands' simulated in the random field. The abundance in each quadrat (i) is built up via the following linear predictor:
lambda[i] <- exp(beta0 + beta[1] * elev[i] + beta[2] * elev[i]^2 + s[i]) 
N[i] ~ Poisson(lambda[i])
Replicated counts are simulated as usual under a binomial observation model, and detection probability is allowed to vary by one site and one observational covariate: respectively quadrat forest cover (in the BerneseOberland data set), and wind-speed, which is invented data. Counts at each site (i) and for each occasion (j) are then produced according to the following model:
  p[i,j] <- plogis(alpha0 + alpha[1] * forest[i] + alpha[2] * wind[i,j]) 
C[i,j] ~ Binomial(N[i], p[i,j]) 
Finally, we assume that only a subset of the 2500 quadrats is surveyed. Hence, we allow you to choose the number of quadrats that are surveyed and these will then be randomly placed into the landscape. We then assume that counts will only be available for these surveyed quadrats, i.e., counts from all non-surveyed quadrats will be NA.
To recreate the data sets used in the book with R 3.6.0 or later, call RNGversion="3.5.3" before the call to simNmixSpatial. This should only be used for reproduction of old results.
Usage
simNmixSpatial(nsurveys = 3, mean.lambda = exp(2), beta = c(2, -2),
  mean.p = 0.5, alpha = c(-1, -1), sample.size = 500, variance.RF = 1, theta.RF = 10,
  seeds = c(10, 100), truncN = 6, show.plots=TRUE, verbose = TRUE)
Arguments
| nsurveys | number of surveys per quadrat. | 
| mean.lambda | expected number of individuals at a quadrat with mean values of all covariates;  | 
| beta | vector of length 2, the linear and quadratic coefficients for the effect of elevation on abundance. | 
| mean.p | Expected detection at the average value of all detection covariates (and ignoring all random effects): detection model | 
| alpha | vector of length 2, the coefficients for the effect of forest and wind on detection. | 
| sample.size | the number of quadrats surveyed. | 
| variance.RF | variance of the random field. | 
| theta.RF | parameter governing spatial correlation (=1/phi); the larger the value of theta.RF, the bigger are the 'islands' simulated in the random field. | 
| seeds | vector of length 2; random seeds used for the random field and the selection of quadrats surveyed respectively. | 
| truncN | used to limit the range of values of N displayed in some plots. | 
| show.plots | if TRUE, plots of the data will be displayed; set to FALSE if you are running simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the arguments input and the following additional elements:
| xcoord,ycoord | The x and y coordinates from the  | 
| elevation | The elevation covariate from the  | 
| forest | The forest cover covariate from the  | 
| elevationS | The elevation covariate standardized to mean 0, SD 1. | 
| forestS | The forest cover covariate standardized to mean 0, SD 1. | 
| wind | The wind covariate, generated internally. | 
| field | The spatially-autocorrelated covariate, generated internally. | 
| beta0 | Intercept for the abundance model, equal to  | 
| lam | Expected abundance in each quadrat. | 
| N | Number of individuals at each site. | 
| Ntotal | Total number of individuals in the study area. | 
| p | probability of detection for each survey. | 
| y | number of individuals detected in each quadrat. | 
| surveyed.sites | the IDs of the quadrats surveyed. | 
| yobs | number of individuals detected in each quadrat surveyed, NA for quadrats not surveyed. | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 11.
Examples
# Generate data with the default arguments and look at the structure:
str(dat <- simNmixSpatial())
str(dat <- simNmixSpatial(show.plots=FALSE))
# More surveys
str(dat<- simNmixSpatial(nsurveys = 10))
# Minimal number of surveys is 1
str(dat<- simNmixSpatial(nsurveys = 1))
# Much more common species
str(dat<- simNmixSpatial(mean.lambda = exp(4)))
# Only negative linear effect of elevation
str(dat<- simNmixSpatial(beta = c(-2, 0)))
# No effect of elevation at all
str(dat<- simNmixSpatial(beta = c(0, 0)))
# Perfect detection (p = 1)
str(dat<- simNmixSpatial(mean.p = 1))
# No effect of forest cover on detection
str(dat<- simNmixSpatial(alpha = c(0, -1)))
# No effect of wind speed on detection
str(dat<- simNmixSpatial(alpha = c(-1, 0)))
# Sample only 100 quadrats
str(dat<- simNmixSpatial(sample.size = 100))
# Sample all 2500 quadrats
str(dat<- simNmixSpatial(sample.size = 2500))
# Larger variance of the multivariate Gaussian Random variable in the random field
#  (this will increase the effect of the field on abundance and counts)
str(dat<- simNmixSpatial(variance.RF = 10))
# No spatial autocorrelation (Variant 1: set variance to 0)
str(dat<- simNmixSpatial(variance.RF = 0))
# No spatial autocorrelation (Variant 2: set theta very close to 0,
#  but not quite 0, otherwise function breaks)
str(dat<- simNmixSpatial(theta.RF = 0.0001))
# Larger value of theta.RF gives larger 'islands'
#try(str(dat<- simNmixSpatial(theta.RF = 100))) # works with RandomFields
# Truncate abundance in final plots to presence/absence
str(dat<- simNmixSpatial(truncN = 0.5))
# Essentially do not truncate abundance in final plots
str(dat<- simNmixSpatial(truncN = 70))
Generate counts from a single population observed over T years under a binomial observation process
Description
Generates counts from a single population observed over T years and which can be observed with or without imperfect detection. The goal is to focus on what happens with relative-abundance inference when temporal patterns in abundance are confounded with temporal patterns in detection probability. Hence, we can simulate a stable population or one with linear increase or decrease with specified start and end points, and around which there is Poisson noise. The observed counts are Binomial outcomes with a detection probability which can similarly be chosen to be constant or change linearly over time.
Usage
simNpC(T = 20, expN = c(100, 75), dp = c(0.5, 0.5), show.plot = TRUE)
Arguments
| T | The length of the time series. | 
| expN | The expected abundance at start and end of period, linear trend. | 
| dp | The detection probability at start and end of period, linear trend. | 
| show.plot | Choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| lambda | T vector, expected abundance for each year. | 
| p | T vector, detection probability (dp) for each year. | 
| N | T vector, realized abundance. | 
| C | T vector, observed counts. | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 1.2.
Examples
# Run with the default arguments and look at the structure of the output:
set.seed(123)
tmp <- simNpC()
str(tmp)
tmp$C
Simulate data for static occupancy models under wide range of conditions
Description
Functions to simulate static occupancy measurements replicated at M sites during J occasions. Population closure is assumed for each site. Expected occurrence may be affected by elevation (elev), forest cover (forest) and their interaction. Expected detection probability may be affected by elevation, wind speed (wind) and their interaction.
simOccCat allows for a categorical habitat-type covariate (HAB) to affect occurrence and another, the observer (OBS), to affect detection.
Usage
simOcc(M = 267, J = 3, mean.occupancy = 0.6, beta1 = -2, beta2 = 2, beta3 = 1,
  mean.detection = 0.3, time.effects = c(-1, 1),
  alpha1 = -1, alpha2 = -3, alpha3 = 0, sd.lp = 0.5,
  b = 2, show.plots = TRUE)
simOccCat(M = 267, J = 3, mean.occupancy = 0.6, beta1 = 0, beta2 = 0, beta3 = 0,
  mean.detection = 0.3, time.effects = c(0, 0),
  alpha1 = 0, alpha2 = 0, alpha3 = 0, sd.lp = 0, b = 0,
  nHab = 5, range.HAB = 2, nObs = 10, range.OBS = 4,
  show.plots = TRUE)
Arguments
| M | Number of spatial replicates (sites) | 
| J | Number of temporal replicates (occasions) | 
| mean.occupancy | Mean occurrence at value 0 of occurrence covariates | 
| beta1 | Main effect of elevation on occurrence | 
| beta2 | Main effect of forest cover on occurrence | 
| beta3 | Interaction effect on occurrence of elevation and forest cover | 
| mean.detection | Mean detection prob. at value 0 of detection covariates | 
| time.effects | bounds (on logit scale) for uniform distribution from which time effects gamma will be drawn | 
| alpha1 | Main effect of elevation on detection probability | 
| alpha2 | Main effect of wind speed on detection probability | 
| alpha3 | Interaction effect on detection of elevation and wind speed | 
| sd.lp | standard deviation of random site effects (on logit scale) | 
| b | constant value of 'behavioral response' leading to 'trap-happiness' (if b > 0) or 'trap shyness' (if b < 0) | 
| nHab | number of categories to simulate for the site-specific categorical covariate, HAB. | 
| range.HAB | a non-negative number that controls the size of the HAB effect. | 
| nObs | number of observers, the categories to simulate for the survey-specific categorical covariate, OBS. | 
| range.OBS | a non-negative number that controls the size of the OBS effect. | 
| show.plots | if TRUE, plots of the data will be displayed; set to FALSE if you are running simulations | 
Value
A list with the values of the input arguments and the following additional elements:
| gamma | The time effects, a vector of length J | 
| eps | Individual random effects, a vector of length M | 
| elev | Elevation, a vector of length M | 
| forest | Forest cover, a vector of length M | 
| wind | wind speed, a M x J matrix | 
| psi | Probability of occurrence, a vector of length M | 
| z | Realized occurrence (0/1), a vector of length M | 
| p | probability of capture, possibly with a behavioral effect, a M x J matrix | 
| p0 | probability of capture when not captured on previous occasion, a M x J matrix | 
| p1 | probability of capture when captured on previous occasion, a M x J matrix | 
| y | simulated capture history, a M x J matrix | 
| sumZ | True number of occupied sites | 
| sumZ.obs | Number of sites observed to be occupied | 
| psi.fs.true | True proportion of occupied sites in sample (sumZ/N) | 
| psi.fs.obs | Proportion of sites observed to be occupied (sumZ.obs/N) | 
In addition, simOccCat has
| HAB | Categorical site covariate, a vector of length M | 
| OBS | Categorical detection covariate, a M x J matrix | 
| coefHAB | The effect of each category of HAB on occurrence | 
| coefOBS | The effect of each category of OBS on detection | 
Author(s)
Marc Kéry, Andy Royle, Gesa von Hirschheydt, Mike Meredith,
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.8.
Examples
# Generate data with the default arguments and look at the structure
tmp <- simOcc()
str(tmp)
# Simplest possible occupancy model, with constant occupancy and detection
str(simOcc(mean.occ=0.6, beta1=0, beta2=0, beta3=0, mean.det=0.3, time.effects=c(0, 0),
  alpha1=0, alpha2=0, alpha3=0, sd.lp=0, b=0))
# psi = 1 (i.e., species occurs at every site)
str(simOcc(mean.occ=1))
# p = 1 (i.e., species is always detected when it occurs)
str(simOcc(mean.det=1))
# Other potentially interesting settings include these:
str(simOcc(J = 2))                 # Only 2 surveys
str(simOcc(M = 1, J = 100))        # No spatial replicates, but 100 measurements
str(simOcc(beta3 = 1))             # Including interaction elev-wind on p
str(simOcc(mean.occ = 0.96))       # A really common species
str(simOcc(mean.occ = 0.05))       # A really rare species
str(simOcc(mean.det = 0.96))       # A really easy species
str(simOcc(mean.det = 0.05))       # A really hard species
str(simOcc(mean.det = 0))          # The dreaded invisible species
str(simOcc(alpha1=-2, beta1=2))    # Opposing effects of elev on psi and p
str(simOcc(J = 10, time.effects = c(-5, 5))) # Huge time effects on p
str(simOcc(sd.lp = 10))            # Huge (random) site effects on p
str(simOcc(J = 10, b = 0))         # No behavioral response in p
str(simOcc(J = 10, b = 2))         # Trap happiness
str(simOcc(J = 10, b = -2))        # Trap shyness
# Using categorical covariates only
str(simOccCat())
# Categorical and continuous covariates
str(tmp <- simOccCat(beta1 = -2, beta2 = 2, beta3 = 1,
    mean.detection = 0.3, time.effects = c(-1, 1),
    alpha1 = -1, alpha2 = -3, alpha3 = 0,
    sd.lp = 0.5, b = 2))
# Check how balanced the levels are for HAB
barplot(sort(table(tmp$HAB), decreasing=TRUE), xlab="Habitat category",
    ylab="Frequency", main="Frequency distribution of habitat categories")
Simulate replicated detection/nondetection data under a spatial, static occupancy model
Description
Simulates replicated detection/nondetection data under a spatial, static occupancy model for a semi-realistic landscape in a square of 50x50 km in the Bernese Oberland around Interlaken, Switzerland. Unit of the data simulation is a 1km2 quadrat, hence, there are 2500 units. It uses the data set BerneseOberland, which has covariates for elevation and forest cover.
For occupancy, the function allows you to specify a quadratic effect of elevation. Then, a Gaussian spatial random field (s) with negative exponential correlation function is simulated using simExpCorrRF; you can set the variance and the range (scale) parameter theta. Basically, the larger the value of theta.RF, the bigger are the 'islands' simulated in the random field. The abundance in each quadrat (i) is built up via the following linear predictor:
psi[i] <- qlogis(beta0 + beta[1] * elev[i] + beta[2] * elev[i]^2 + s[i]) 
z[i] ~ Bernoulli(psi[i])
Replicated detection/nondetection data are simulated as usual under a Bernoulli observation model, and detection probability is allowed to vary by one site and one observational covariate: respectively quadrat forest cover (in the BerneseOberland data set), and wind-speed, which is invented data. Counts at each site (i) and for each occasion (j) are then produced according to the following model:
p[i,j] <- plogis(alpha0 + alpha[1] * forest[i] + alpha[2] * wind[i,j]) 
y[i,j] ~ Bernoulli(z[i] * p[i,j]) 
Finally, we assume that only a subset of the 2500 quadrats is surveyed. Hence, we allow you to choose the number of quadrats that are surveyed and these will then be randomly placed into the landscape. We then assume that counts will only be available for these surveyed quadrats, i.e., detection/non-detection data from all non-surveyed quadrats will be replaced with NA.
To recreate the data sets used in the book with R 3.6.0 or later, call RNGversion="3.5.3" before the call to simOccSpatial. This should only be used for reproduction of old results.
Usage
simOccSpatial(nsurveys = 3, mean.psi = 0.6, beta = c(2, -2),
  mean.p = 0.4, alpha = c(-1, -1), sample.size = 500, variance.RF = 1, theta.RF = 10,
  seeds = c(10, 100), show.plots = TRUE, verbose = TRUE)
Arguments
| nsurveys | number of surveys per quadrat. | 
| mean.psi | probability of occupancy at a quadrat with mean values of all covariates;  | 
| beta | vector of length 2, the linear and quadratic coefficients for the effect of elevation on occupancy. | 
| mean.p | Expected detection at the average value of all detection covariates (and ignoring all random effects): detection model | 
| alpha | vector of length 2, the coefficients for the effect of forest and wind on detection. | 
| sample.size | the number of quadrats surveyed. | 
| variance.RF | variance of the random field. | 
| theta.RF | parameter governing spatial correlation (=1/phi); the larger the value of theta.RF, the bigger are the 'islands' simulated in the random field. | 
| seeds | vector of length 2; random seeds used for the random field and the selection of quadrats surveyed respectively. | 
| show.plots | if TRUE, plots of the data will be displayed; set to FALSE if you are running simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the arguments input and the following additional elements:
| xcoord,ycoord | The x and y coordinates from the  | 
| elevation | The elevation covariate from the  | 
| forest | The forest cover covariate from the  | 
| elevationS | The elevation covariate standardized to mean 0, SD 1. | 
| forestS | The forest cover covariate standardized to mean 0, SD 1. | 
| wind | The wind covariate, generated internally. | 
| field | The spatially-autocorrelated covariate, generated internally. | 
| alpha0 | Intercept for the detection model, equal to  | 
| beta0 | Intercept for the abundance model, equal to  | 
| z | True occupancy in each quadrat. | 
| trueNocc | True number of occupied sites. | 
| obsNocc | Number of sites where the species was detected. | 
| true_psi_fs | |
| obs_psi_fs | |
| p | probability of detection for each survey. | 
| y | detected/non-detected data for each quadrat. | 
| surveyed.sites | the IDs of the quadrats surveyed. | 
| yobs | detected/non-detected data for each quadrat surveyed, NA for quadrats not surveyed. | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 9.2.
Examples
# Generate data with the default arguments and look at the structure
str(dat <- simOccSpatial())
str(dat <- simOccSpatial(show.plots=FALSE))
# More surveys
str(dat<- simOccSpatial(nsurveys = 10))
# Minimal number of surveys is 1
str(dat<- simOccSpatial(nsurveys = 1))
# A truly ubiquitous species
str(dat <- simOccSpatial(mean.psi = 1))
# Only negative linear effect of elevation
str(dat <- simOccSpatial(beta = c(2, 0)))
# No effect of elevation at all (see effects of spatial field now clearly)
str(dat <- simOccSpatial(beta = c(0, 0)))
# Perfect detection (p = 1)
str(dat <- simOccSpatial(mean.p = 1))
# No effect in detection of forest cover
str(dat <- simOccSpatial(alpha = c(0, -1)))
# No effect in detection of wind speed (see neatly forest effect now)
str(dat <- simOccSpatial(alpha = c(-1, 0)))
# Sample only 100 quadrats
str(dat <- simOccSpatial(sample.size = 100))
# Sample all 2500 quadrats
str(dat <- simOccSpatial(sample.size = 2500))
# Larger variance of the multivariate Gaussian Random variable in the random field
#  (this will increase the effect of the field on occupancy and detection)
str(dat <- simOccSpatial(variance.RF = 10))
# No spatial autocorrelation (Variant 1: set variance to 0)
str(dat <- simOccSpatial(variance.RF = 0))
# No spatial autocorrelation (Variant 2: set theta very close to 0,
#  but not quite 0, otherwise function breaks)
str(dat <- simOccSpatial(theta.RF = 0.0001))
# Larger value of theta.RF gives larger 'islands'
#try(str(dat <- simOccSpatial(theta.RF = 100)))  # Works with RandomFields
Simulate time-to-detection occupancy data (single visits)
Description
Function simulates time-to-detection occupancy design data under the model of Garrard et al. (2008), also see Bornand et al. (2014)
Usage
simOccttd(M = 250, mean.psi = 0.4, mean.lambda = 0.3,
  beta1 = 1, alpha1 = -1, Tmax = 10, show.plot = TRUE, verbose = TRUE)
Arguments
| M | Number of sites | 
| mean.psi | intercept of occupancy probability | 
| mean.lambda | intercept of Poisson rate parameter | 
| beta1 | slope of continuous covariate B on logit(psi) | 
| alpha1 | slope of continuous covariate A on log(lambda) | 
| Tmax | maximum search time (in arbitrary units, which are same as response), response will be censored at  | 
| show.plot | choose whether to show plots or not. Set to FALSE when running simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the values of the arguments input and the following additional elements:
| covA | Simulated values of covariate A, a vector of length M | 
| covB | Simulated values of covariate B, a vector of length M | 
| psi | Probability of occurrence at each site, a vector of length M | 
| z | Realized occurrence at each site, a 0/1 vector of length M | 
| ttd.temp | Uncensored simulated time-to-detection at each site, a vector of length M | 
| ttd | Censored simulated time-to-detection at each site, a vector of length M | 
| d | Censoring indicator, a 0/1 vector of length M | 
| sum.z | Total number of sites occupied | 
| n.obs | Total number of sites where the species was observed | 
Author(s)
Marc Kéry & Andy Royle
References
Garrard, G.E., Bekessy, S.A., McCarthy, M.A., & Wintle, B.A. (2008) When have we looked hard enough? A novel method for setting minimum survey effort protocols for flora surveys. Austral Ecology, 33, 986-998.
Bornand, C.N., Kéry, M., Bueche, L., & Fischer, M. (2014) Hide and seek in vegetation: time-to-detection is an efficient design for estimating detectability and occurrence. Methods in Ecology and Evolution, 5, 433-442.
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.12.1.
Examples
# Generate data with the default arguments and look at the structure
tmp <- simOccttd()
str(tmp)
Generate counts under a variant of a 'phenomenological model'
Description
Function generates (insect) counts under a variant of a 'phenomenological model' of Dennis et al. (2016). Interannual population model is exponential population growth, with Poisson initial abundance governed by initial.lambda and annually varying growth rate (or productivity parameter) gamma. Within-year dynamics is described by a Gaussian curve with date of mean flight period mu (site- and year-specific) and length of flight period sigma (only year-specific). Counts are made subject to a detection probability (p), which varies randomly according to a uniform distribution for every single count. Counts are plotted for up to 16 populations only (but can be simulated for any number).
Usage
simPH(npop = 18, nyears = 17, nreps = 10, date.range = 1:150, initial.lambda = 300,
  gamma.parms = c(0, 0.3), mu.range = c(50, 80),  sigma.range = c(10, 20),
  p.range = c(0.4, 0.6), show.plot = TRUE)
Arguments
| npop | The number of populations. | 
| nyears | The number of years (seasons). | 
| nreps | The number of surveys per year. | 
| date.range | The dates over which surveys may be conducted. | 
| initial.lambda | The Poisson mean of initial relative population size. | 
| gamma.parms | The mean and SD of lognormal interannual productivity. | 
| mu.range | The range of date of peak flight period (varies by site and year). | 
| sigma.range | The range of sigma of Gaussian phenology curve (varies by year only). | 
| p.range | The range of detection probabilities (varies by site, year and visit). | 
| show.plot | Choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| gamma | (nyears-1) vector, change in abundance for each interval between years. | 
| n | site x year matrix, true relative abundance. | 
| mu | site x year matrix, mean date of the flight period. | 
| sigma | nyears vector, half-length of flight period. | 
| date | site x year x reps array, dates of the surveys. | 
| a | site x year x reps array, phenology term. | 
| lambda | site x year x reps array, expected counts. | 
| p | site x year x reps array, probability of detection. | 
| C | site x year x reps array, simulated counts. | 
Author(s)
Marc Kéry & Andy Royle
References
Dennis, E.B., et al (2016) Dynamic models for longitudinal butterfly data, Journal of Agricultural, Biological and Environmental Statistics, 21, 1-21.
Kéry & Royle (2021) Applied Hierarchical Modeling in Ecology AHM2 - 1.8.1.
Examples
# Run with the default arguments and look at the structure of the output:
set.seed(123)
tmp <- simPH()
str(tmp)
summary(tmp$C)
Simulate data for a demographic state-space model
Description
Simulate multiple time-series of counts under a pure Markov model (with exponential population model) or under an extended Markov model with exponential-plus-random-immigration population model; see Sollmann et al.(2015). Default is Markov model, setting sd.rho to a value greater than 0 changes to extended Markov and sets the amount of random immigration.
Usage
simPOP(M = 100, T = 10, mean.lam = 3, beta.lam = 0, sd.log.lam = 0,
  mean.gamma = 1.0, beta.gamma = 0, sd.log.gamma.site = 0,
  sd.log.gamma.time = 0, sd.log.gamma.survey = 0, sd.rho = 0,
  mean.p = 0.6, beta.p = 0, sd.logit.p.site = 0, sd.logit.p.time = 0,
  sd.logit.p.survey = 0, show.plot = TRUE)
Arguments
| M | The number of sites. | 
| T | The number of years. | 
| mean.lam | The mean abundance for year 1. | 
| beta.lam | The covariate coefficient for lambda. | 
| sd.log.lam | The over-dispersion in lambda. | 
| mean.gamma | The mean population growth rate. | 
| beta.gamma | The covariate coefficient for gamma. | 
| sd.log.gamma.site | SD of random site effects for gamma. | 
| sd.log.gamma.time | SD of random time effects for gamma. | 
| sd.log.gamma.survey | SD of random survey (site+time) effects for gamma. | 
| sd.rho | The random immigration term. | 
| mean.p | The mean detection probability. | 
| beta.p | The covariate coefficient for p. | 
| sd.logit.p.site | SD of random site effects for p on the logit scale. | 
| sd.logit.p.time | SD of random time effects for p on the logit scale. | 
| sd.logit.p.survey | SD of random survey (site+time) effects for p on the logit scale. | 
| show.plot | Choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments entered and the following additional elements:
| Xsite1 | M vector, site covariate affecting initial abundance (lambda). | 
| Xsiteyear1 | M x T matrix, yearly site covariate affecting recruitment (gamma). | 
| Xsiteyear2 | M x T matrix, yearly site covariate affecting detection (p). | 
| eps.N | M vector, site over-dispersion at t = 1. | 
| lambda | M vector, abundance in year 1. | 
| eps.gamma.site | M vector, random site effect for gamma. | 
| eps.gamma.time | T vector, random time effect for gamma. | 
| eps.gamma.survey | M x T matrix, random survey effect for gamma. | 
| gamma | M x T matrix, population growth rate. | 
| rho | (T-1) vector, immigration rate. | 
| eps.p.site | M vector, random site effect for detection. | 
| eps.p.time | T vector, random time effect for detection. | 
| eps.p.survey | M x T matrix, random survey effect for detection. | 
| p | M x T matrix, detection probability. | 
| N | M x T matrix, true population. | 
| C | M x T matrix, simulated counts. | 
| zeroNyears | scalar, sum(N == 0). | 
| Nextinct | scalar, number of sites where N == 0 at time T. | 
| extrate | scalar, proportion of sites where N == 0 at time T. | 
| sumN | T vector, total population in each year. | 
| gammaX | (T-1) vector, realized population growth rate. | 
Author(s)
Marc Kéry & Andy Royle
References
Sollmann, R. et al. (2015) An open-population hierarchical distance sampling model. Ecology, 96, 325-331.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 1.7.1.
Examples
# Run with the default arguments and look at the structure of the output
set.seed(123)
tmp <- simPOP()
str(tmp)
head(tmp$C)
Simulate a spatial point pattern in a heterogeneous landscape
Description
The function simulates a spatial point pattern in a heterogeneous landscape simulated on a square landscape. The study area ('core') is simulated inside the larger landscape that includes a buffer. The size of the core is defined by the lscape.size minus twice the buffer.
There is one habitat covariate X that affects the intensity of the points. X is spatially structured with negative-exponential spatial autocorrelation; the parameters of the field can be chosen to create large 'islands' of similar values or no 'islands' at all, in which case the field is spatially unstructured.
The intensity of STATIC points (e.g. animal activity centers) may be inhomogeneous and affected by the coefficient beta, which is the log-linear effect of X.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Previous versions used RandomFields, but that is not currently available on CRAN. fields is now used, but it cannot deal with large values of lscape.size and theta.X. If you have RandomFields installed (perhaps by getting it from the CRAN archive), you can load a version of simPPe that supports it with source(system.file("RandomFieldsSupport", "simPPe.R", package="AHMbook")).
Usage
simPPe(lscape.size = 150, buffer.width = 25, variance.X = 1, theta.X = 10,
  M = 250, beta = 1, quads.along.side = 6, show.plots = TRUE)
Arguments
| lscape.size | size (width = height) of the square landscape, including core and buffer. | 
| buffer.width | width of buffer around core study area. | 
| variance.X | variance of Gaussian random field (covariate X). | 
| theta.X | scale parameter of correlation in field (must be >0). | 
| M | expected number of activity centers in core area. | 
| beta | coefficient of the habitat covariate. | 
| quads.along.side | number of quadrats along the side of the core area; the total number of quadrats will be quads.along.side^2, thus indirectly defining the quadrat area. | 
| show.plots | if TRUE, summary plots are displayed. | 
Value
A list with the values of the input arguments and the following additional elements:
| core | range of x and y coordinates in the 'core' | 
| M2 | number of ACs in the total landscape, including the buffer | 
| grid | coordinates of the center of each pixel | 
| pixel.size | length of side of each pixel | 
| size.core | the width=height of the core area | 
| prop.core | the proportion of the landscape inside the core | 
| X | matrix of covariate values for each pixel | 
| probs | matrix of probabilities of an AC being inside each pixel (sums to 1) | 
| pixel.id | the ID of the pixel for each AC | 
| u | 2-column matrix, coordinate of each AC | 
| nsite | number of quadrats | 
| quad.size | width = height of each quadrat | 
| breaks | boundaries of the quadrats | 
| mid.pt | mid-points of the quadrats | 
| lambda_pp | intensity of point pattern (ACs per unit area) | 
| Nac | site-specific abundance of ACs | 
| zac | site-specific occurrence (0/1) of ACs | 
| E_N | average realized abundance per quadrat | 
| E_z | average realized occupancy per quadrat | 
Author(s)
Marc Kéry & Andy Royle.
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 10.
Examples
# Nice plot (produces the really nice Fig. 10-2 in the book)
# RNGkind(sample.kind = "Rounding") # run this for R >= 3.6.0
set.seed(117, kind="Mersenne-Twister")
# Fails if RandomFields is not available
#try(str(dat <- simPPe(lscape.size = 200, buffer.width = 25, variance.X = 1,
#  theta.X = 70, M = 200, beta = 1, quads.along.side = 6)))
str(dat <- simPPe(lscape.size = 200, buffer.width = 20, variance.X = 1,
  theta.X = 5, M = 250, beta = 1, quads.along.side = 6))
Simulate data for a standardized line transect
Description
This simulates line transect distance sampling data with a spatial distribution of objects in a heterogeneous landscape where the parameter beta controls the effect of habitat. Habitat is simulated according to a Gaussian random field model defined within the function. Uses a half normal detection model (if perp = TRUE) or a Gaussian hazard model (perp = FALSE).
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
simSpatialDSline(N=1000, beta = 1, sigma=0.25, alpha0 = -2, W=1/2, L = 4,
    perp=FALSE, show.plots=TRUE)
Arguments
| N | total population size in the rectangle | 
| beta | coefficient of spatial covariate x for the density model. | 
| sigma | scale of half-normal detection function | 
| alpha0 | intercept of the hazard function. | 
| W | half-width of the rectangle, which extends W each side of the transect line. | 
| L | length of the transect. | 
| perp | if TRUE, data are simulated for a traditional distance sampling model with perpendicular distances; if FALSE (the default) a model with 'forward distance' data, ie, the distance from the observer to the animal on first detection. | 
| show.plots | if TRUE, summary plots are displayed. | 
Value
A list with the values of the input arguments and the following additional elements:
| delta | the distance between pixel centers (spatial resolution of the raster | 
| grid | 2-column matrix with x/y coordinates of all pixels | 
| Habitat | value of habitat covariate for each pixel | 
| Habraster | a Raster object with the habitat covariate | 
| u1,u2 | x and y coordinates for all the animals in the population | 
| traps | 2-column matrix of trap locations | 
If perp = TRUE we have
| data | a 2-column matrix with x and y coordinates of each animal captured. | 
| pixel | pixel ID for each animal captured. | 
and if perp = FALSE we have
| data | a matrix with rows for each animal captured and columns for trap of first capture, distance from trap to animal, and x and y coordinates of the animal. | 
| pbar | probability that each animal is the population is captured at least once | 
| pixel | pixel ID for each animal captured. | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 11.
Examples
# Run the function with default values and look at the output
str(tmp <- simSpatialDSline(), 1)  # use str(., max.level=1) to limit the amount of output.
Simulate data for replicate line transect surveys with temporary emigration
Description
This simulates line transect distance sampling data with a spatial distribution of objects in a heterogeneous landscape where the parameter beta controls the effect of habitat. Multiple sample occasions are simulated and temporary emigration is allowed (parameter phi). Habitat is simulated according to a Gaussian random field model defined within the function. Uses a half normal detection model (if perp = TRUE) or a Gaussian hazard model (perp = FALSE).
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
simSpatialDSte(nsites=28, dim=10, beta=1, lam0=2.5, nsurveys=4, sigma=3,
  phi=0.6, theta=2, show.plots=3)
Arguments
| nsites | number of sites | 
| dim | number of pixels along each side of the square site | 
| beta | the effect of habitat on the number of individuals in a pixel. | 
| lam0 | expected population size at each site | 
| nsurveys | the number of replicate surveys | 
| sigma | scale of half-normal detection function | 
| phi | probability an individual is available for detection, ie, not temporarily emigrated. | 
| theta | exponential correlation in the spatial covariate. | 
| show.plots | the number of sites for which plots should be displayed; set to 0 to suppress plotting. | 
Value
A list with the values of the input arguments and the following additional elements:
| npixels | the number of pixels in each site (= dim^2) | 
| B | distance from line to edge of square (= dim/2) | 
| M | true number of individuals at each site | 
| d | perpendicular distance of each pixel from the line | 
| Habitat | pixels x sites matrix, value of habitat covariate for each pixel | 
| y | sites x pixels x surveys array, number of animals detected | 
| Counts | sites x surveys matrix, number of animals detected (summed over pixels) | 
Author(s)
Mizel et al (2018) Appendix S1, based in turn on Kéry & Royle (2016).
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology.
Mizel, J.D., Schmidt, J.H., & Lindberg, M.S. (2018) Accommodating temporary emigration in spatial distance sampling models. Journal of Applied Ecology, 55, 1456-1464.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 11.
Examples
# Run the function with default values and look at the output
str(tmp <- simSpatialDSte())
Simulate N-mixture data under a time-for-space substitution design
Description
A simple function to simulate data under binomial N-mixture model where you have a single site that is surveyed over 'nyears' primary sampling periods ('seasons', 'years'), within each of which there are 'nreps' secondary samples.
Usage
simpleNmix(nyears = 12, nreps = 4, beta0 = 2, beta1 = 0.1,
  alpha0 = 0.5, alpha1 = -0.1, alpha2 = 1, show.plot = TRUE)
Arguments
| nyears | Number of primary sampling periods. | 
| nreps | Number of secondary samples within each primary period. | 
| beta0 | the intercept of a log-linear model of expected abundance (lambda). | 
| beta1 | the Time coefficient of a log-linear model for lambda. | 
| alpha0 | the intercept of a logit-linear model for detection (p). | 
| alpha1 | the Time coefficient of a logit-linear model for detection (p). | 
| alpha2 | the coefficient of a survey-specific covariate such as temperature (temp). | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
Value
A list with the values of the arguments input and the following additional elements:
| N | The realized number of individuals at each primary season, a vector of length  | 
| C | The number of individuals counted at each survey, a  | 
| Time | The Time covariate, a vector of length  | 
| temp | The temperature covariate, a  | 
| p | The probability of detection, a  | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM2 - 6.12.
Examples
# Simulate a data set with the default arguments and look at the structure of the output
tmp <- simpleNmix()
str(tmp)
Prepare input for BUGS model when fitting a spline for a covariate
Description
Function chooses knots and creates design matrices for fixed and random-effects parts of a spline model for a chosen covariate. Based on code by Crainiceanu et al. (2005) and Zuur et al. (2012). Allows you to choose number of knots or else uses it by the rule given in Crainiceanu et al. (2005). Prepares fixed part of covariate as a quadratic polynomial.
Usage
spline.prep(cov, nknot = NA)
Arguments
| cov | the covariate, a numeric vector | 
| nknot | optional, number of knots | 
Value
A list with the following elements:
| cov | The input covariate | 
| knots | The values of the knots | 
| X | The fixed-effects design matrix | 
| Z | The random-effects design matrix | 
Author(s)
Marc Kéry & Andy Royle
References
Crainiceanu, C.M., Ruppert, D., & Wand, M.P. (2005) Bayesian analysis for penalized spline regression using WinBUGS. Journal of Statistical Software, 14.
Zuur, A.F., Saveliev, A.A., Ieno, E.N. (2012) Zero-inflated Models and Generalized Linear Mixed Models with R. Highlands Statistics
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.14.
Data for observations of Middle Spotted Woodpecker in Switzerland
Description
The data set is based on checklists which volunteers submit from their bird-watching trips and which were then summarized by day and site (1 km2 quadrat) such that we have the number of surveys (= checklists) per quadrat and day, and the number among these on which a Middle-Spotted Woodpecker was recorded, during 162 days (Julian days 51–212, corresponding to 20 February–31 July). There are data from a total of 144,517 recorded surveys on 116,204 day/quadrat combinations during 26 years from 1545 1 km2 quadrats in which the species was ever recorded in Switzerland since 1985.
Usage
data("spottedWoodpecker")Format
spottedWoodpecker is a data frame with 116,204 rows and 7 columns:
- site
- quadrat identifier. 
- coordx, coordy
- the x and y coordinates of the center of the quadrat; the coordinate reference system intentionally not specified. 
- year, jdate
- the year and Julian date of the observations. 
- y
- the number of surveys during which the species was detected. 
- nsurveys
- the number of recorded surveys in the quadrat on that day. 
Source
Database of Swiss Ornithological Institute (courtesy of Nicolas Strebel).
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 4.11.2.
Examples
data(spottedWoodpecker)
str(spottedWoodpecker)
Scaling and centering of vectors, matrices and arrays
Description
Maps a numeric variable to a new object with the same dimensions. standardize is typically used to standardize a covariate to mean 0 and SD 1. standardize2match is used to standardize one object using the mean and SD of another; it is a wrapper for standardize(x, center=mean(y), scale=sd(y)).
Usage
standardize(x, center = TRUE, scale = TRUE)
standardize2match(x, y)
Arguments
| x,y | a numeric vector, matrix or multidimensional array;  | 
| center | either a logical or a numeric value of length 1. | 
| scale | either a logical or a numeric value of length 1. | 
Details
standardize differs from scale by (1) accepting multidimensional arrays but not data frames; (2) not standardizing column-wise but using a single value to center or to scale; (3) if x is a vector, the output will be a vector (not a 1-column matrix). If each column in the matrix represents a different variable, use scale not standardize.
Centering is performed before scaling.
If center is numeric, that value will be subtracted from the whole object. If logical and TRUE, the mean of the object (after removing NAs) will be subtracted.
If scale is numeric, the whole object will be divided by that value. If logical and TRUE, the standard deviation of the object (after removing NAs) will be used; this may not make sense if center = FALSE.
Value
A numeric object of the same dimensions as x with the standardized values. NAs in the input will be preserved in the output.
For the default arguments, the object returned will have mean approximately zero and SD 1. (The mean is not exactly zero as scaling is performed after centering.)
Author(s)
Mike Meredith, after looking at the code of base::scale.
Examples
# Generate some fake elevation data
elev <- runif(100, min=100, max=500)
mean(elev) ; sd(elev)
str( e <- standardize(elev) )
mean(e) ; sd(e)
# Standardize so that e=0 corresponds to exactly 300m and +/- 1 to
#   a change of 100m:
e <- standardize(elev, center=300, scale=100)
mean(e)
mean(elev) - 300
range(e)
range(elev) - 300
# Generate data matrix for survey duration for 3 surveys at 10 sites
dur <- matrix(round(runif(30, 20, 60)), nrow=10, ncol=3)
d <- standardize(dur)
mean(d) ; sd(d)
# Standardize new data to match the mean and SD of 'dur'
(new <- seq(20, 60, length.out=11))
standardize2match(new, dur)
# compare with base::scale
dx <- base::scale(dur)
colMeans(dx) ; apply(dx, 2, sd)
colMeans(d) ; apply(d, 2, sd)
# Don't use 'standardize' if the columns in the matrix are different variables!
American Tree Sparrow data from Alaska
Description
Mizel et al. (2018) describe the data on American Tree Sparrows (Spizella arborea) that were collected in May and June 2016 at Noatak National Preserve, Alaska. The study area was a 10.8 x 9.6 km rectangle divided up into 288 plots of 600 m x 600 m, of which 150 had some coverage by a transect running through them. Routes were usually chosen to intersect the centroid of the cell and deflected by 0 or 90 degrees when required to move into an adjacent cell. In some cases, a transect was shifted to avoid terrain features. Distance measurements were truncated at 250m from the observer. Each visited plot was surveyed an average of three times (max = 5). Each plot was gridded to a 30-m resolution and spatial attributes were associated with each of these pixels. This resulted in an average of 328 pixels per plot within the buffered transect. Bird locations were assigned to the nearest pixel centroid.
Usage
data("treeSparrow")Format
treeSparrow is a list with 3 elements:
- surveyData
- a data frame with rows for 466 surveys and the following columns: - Site : a numeric site identifier, 1 to 150 
- Visit : a numeric visit identifier, 1 to 5 
- Survey : a numeric survey identifier, 1 to 466 
- count : the number of tree sparrows detected. 
- juldate : the Julian date of the survey, day 1 = 1 Jan. 
- effort : time spent in each plot during each visit (in hours) 
- reltime : number of hours since 0200 (there is no sunrise during the survey period in the Arctic) 
- Noise : background noise (combination of wind, creek, and mosquito-associated noise): 0 = no noise, 1 = slight, 2 = noticeable reduction in hearing, 3 = prohibitive noise 
- Observer2 : Observed ID 
 
- obs
- a data frame with rows for 325 birds observed and the following columns: - Pixel : a numeric pixel identifier, 1 to 49250 
- Site : a numeric site identifier, 1 to 150 
- Visit : a numeric visit identifier, 1 to 5 
- NEAR_DIST : distance between the bird and the transect line 
- SurveyID : a numeric survey identifier, 1 to 466 
- ind : an index identifying the individual within the survey. 
 
- pixels
- a data frame with rows for 49250 pixels and the following columns: - pixID : a numeric pixel identifier, 1 to 49250 
- Site : a numeric site identifier, 1 to 150 
- X : x-coordinate of the pixel 
- Y : y-coordinate of the pixel 
- NEAR_DIST : distance between the center of the pixel and the transect line 
- NDVI : Normalized Difference Vegetation Index 
- elev : elevation of the center of the pixel, m. 
 
Source
Mizel et al. (2018)
References
Mizel, J.D., Schmidt, J.H. and Lindberg, M.S. (2018) Accommodating temporary emigration in spatial distance sampling models. Journal of Applied Ecology, 55, 1456-1464.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 11.8.4.
Examples
data(treeSparrow)
str(treeSparrow)
Time-to-detection data for peregrines from the French Jura mountains
Description
Between 7 and 9 March 2015, 38 breeding cliffs were visited up to 3 times to survey for Peregrine Falcons (Falco peregrinus). Observation duration varied from 3 to 95 minutes. The time to detection was recorded for each bird seen. If no birds were seen, the time was entered as NA.
Usage
data("ttdPeregrine")Format
A data frame with 70 rows and the following columns:
- Date
- a factor with 3 levels giving the date 
- DayNumber
- the number of the day: 1, 2, or 3 
- SiteNumber
- identification number of the site 
- Start.hour, Start.minute
- the time of starting the search 
- End.hour, End.minute
- the time of ending the search 
- MinOfDay
- the time of the start of the search, minutes after 06:00 
- Tmax
- the duration of the search, in minutes 
- ttd
- the time to detection, minutes; NA if no birds were seen during the search 
- sex
- the sex of the birds seen; NA if no birds were seen 
Source
Based on the field work of one of the authors.
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.12.
Examples
# Using the built-in data set instead of the TXT file for the example in Section
#  10.12 of the book is easy, as the format is the same as the file.
?ttdPeregrine  # check the description of the data
data(ttdPeregrine)
# Instead of data <- read.table("ttdPeregrine.txt", header = TRUE, sep = "\t") do:
data <- ttdPeregrine
# Then continue with the rest of the analysis on p.618
Validates simulated observational data against data-generating values.
Description
Where species identifications (from photos, recorded calls, etc) is doubtful, it is sometimes feasible for a subset of the data to be validated by experts. Chambert et al (2017) used simulations to investigate the effectiveness of such partial validation. Where a data set with false positives has been simulated and the true status is known, valid_data compares a subset with truth and returns the number of detections checked per site and the number checked and found to be valid. The function attempts to validate equal numbers of detections at each site (not equal proportions) subject to the actual number of detections and the total number to be validated; if the number of detections is small, all will be validated, otherwise a random sample is used; see Examples.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
valid_data(N, tp, n.valid, prop.valid=FALSE)
Arguments
| N | a vector with the number of detections for each site, including true positives and false positives. | 
| tp | a vector with the number of true positive detections for each site. | 
| n.valid | the number of detections to be validated (if  | 
| prop.valid | logical; determines how  | 
Value
A list with components:
| n | a vector with the number of detections checked for each site. | 
| k | a vector with the number of detections checked and found to be valid (true positives) for each site. | 
Author(s)
Mike Meredith.
References
Chambert, T., Waddle, J.H., Miller, D.A.W., Walls, S.C., & Nichols, J.D. (2017) A new framework for analysing automated acoustic species-detection data: occupancy estimation and optimization of recordings post-processing. Methods in Ecology and Evolution, 9, 560-570.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 7.6.2.
Examples
# Generate some data for 100 sites
set.seed(42, kind="Mersenne-Twister")
z <- rbinom(100, 1, 0.7)  # z[i] == 1 if species present at site 1, 0 otherwise
tp <- rpois(100, 3*z)     # Number of true detections, 0 if species not present
N <- tp + rpois(100, 0.5) # Add false positives
# Validate a subset of 150 detections
out <- valid_data(N, tp, 150)
head(tmp <- cbind(z=z, tp=tp, N=N, n=out$n, k=out$k), 10)
colSums(tmp)
# Plot the number validated vs all detections:
graphics::sunflowerplot(N, out$n, xlab="Number of detections at each site",
  ylab="Number validated")
# For sites with <= 2 detections, all are validated; otherwise at least 2 are validated,
#   with a 3rd draw from randomly selected sites to bring the total to 150.
Data for Dutch wagtails
Description
The Dutch Centre for Field Ornithology (Sovon) monitored grassland birds in Flevoland between April and mid-July 2011. 235 points were surveyed on up to 4 occasions; observations were divided into distance classes 50m wide, and the number of observations in each class recorded. These data are for the Yellow Wagtail (Motacilla flava).
Usage
data("wagtail")Format
wagtail is a list with 8 elements:
- potato
- for each point, the percentage of the area which are potato fields. 
- grass
- for each point, the percentage of the area under permanent grassland. 
- lscale
- for each point, an index of whether the landscape is open (0) or closed (100). 
- date
- a points x occasions matrix, the Julian date of the survey. 
- hour
- a points x occasions matrix, the hour of the survey. 
- breaks
- the boundaries between the distance classes; birds more than 300m from the point were not included. 
- Y
- a matrix of counts, with a row for each site; columns 1 to 6 give the counts in the distance classes for the 1st survey occasion, columns 7 to 12 for the 2nd occasion, and so on (this is the format required for - unmarkedFrameGDS).
- rep
- a points x occasions character matrix with the occasion number (this is used as a categorical variable in the analysis). 
Source
Dutch Centre for Field Ornithology (SOVON)
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 9.5.3.
Examples
data(wagtail)
str(wagtail)
Data for the Mighty Water Voles of Scotland, UK.
Description
Habitat patches suitable for Water Voles (Arvicola amphibius) were surveyed for vole latrines (fecal deposits used for territory marking), used to indicate presence. It is suspected that some of the detections recorded may be false positives. The data set covers 114 patches surveyed up to 3 times per year from 2009 to 2011.
Usage
data("waterVoles")Format
waterVoles is a data frame with 332 rows corresponding to surveys, and 5 columns:
- Patch
- an alphanumeric site identifier. 
- y1, y2, y3
- detection (1)/non-detection (0). 
- Year
- the year of the survey. 
Source
Chris Sutherland, Xavier Lambin and colleagues.
References
Sutherland, C., Elston, D.A., & Lambin, X. (2012) Multi-scale processes in metapopulations: effects of stage structure, rescue effect and correlated extinctions. Ecology, 93, 2465-2473.
Sutherland, C.S., Elston, D.A., & Lambin, X. (2014) A demographic, spatially explicit patch occupancy model of metapopulation dynamics and persistence. Ecology, 95, 3149-3160.
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 7.2.2.
Examples
data(waterVoles)
str(waterVoles)
Coordinates for a wiggly line transect
Description
Coordinates for a wiggly line transect used as an example in Kéry & Royle (2021) chapter 11.9.
Usage
data("wigglyLine")Format
wigglyLine is a 2-column matrix of x and y coordinates.
Source
Made-up data.
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 11.
Examples
data(wigglyLine)
str(wigglyLine)
plot(wigglyLine, type='l')
Simulate static occupancy data
Description
Function to generate a static occupancy data set with really wiggly covariate relationships in occupancy and detection probability.
To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.
Usage
wigglyOcc(seed = 1, show.plot = TRUE, verbose = TRUE)
Arguments
| seed | Seed for random number generator | 
| show.plot | choose whether to show plots or not. Set to FALSE when using function in simulations. | 
| verbose | if TRUE, output will be written to the console. | 
Value
A list with the following elements:
| M | Number of sites | 
| J | Number of replicate surveys | 
| Xsite | Simulated site covariate, a vector of length M | 
| Xsurvey | Simulated survey covariate, a M x J matrix | 
| psi | Occupancy probability, a vector of length M | 
| z | Realized occupancy, a 0/1 vector of length M | 
| p | Detection probability, a M x J matrix | 
| y | detection history, a M x J matrix of 0/1 | 
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 10.14.
Examples
# Generate data with the default arguments and look at the structure:
tmp <- wigglyOcc()
str(tmp)
Constant effort site (CES) data for British Willow Warblers
Description
British Willow Warblers (Phylloscopus trochilus) caught at a total of 193 ringing (or banding) stations in England, Wales and Scotland. The willow warbler is a small (8–10 g) migratory species that used to be extremely widespread and common throughout Britain, but whose populations have suffered severe declines since the mid-90s. We analyze data from 11 years (1986–1996) before the major population decline.
Usage
data("willowWarbler")Format
willowWarbler is a list with 4 elements:
- birds
- a data frame with rows for 10551 birds and the following columns: - 1986-1996 : capture history, 1 if the bird was captured that year, 0 otherwise. 
- cesID : numerical code for the Constant Effort Site where the bird was caught. 
 
- cells
- a data frame with rows for 9667 5x5 km cells covering the whole of Britain, and the following columns: - lon, lat : easting and northing of the center of the cell. 
- gdd : mean growing degree days (GDD: the sum of daily mean temperatures above 5.5C). 
- blockID : the ID of the 25x25 km block into which the cell falls. 
 
- CES
- a data frame with rows for each of the Constant Effort Sites, and the following columns: - cesx, cesy : approximate easting and northing of the site. 
- blockID : the ID of the 25x25 km block into which the site falls. 
- cellID : the ID of the 5x5 km cell into which the site falls. 
 
- blocks
- a data frame with rows for 495 25x25 km blocks covering the whole of Britain, and the following columns: - blockX, blockY : easting and northing of the center of the block. 
 
Source
British Ornithological Trust (BTO)
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 3.4.
Examples
data(willowWarbler)
str(willowWarbler)
attach(willowWarbler)
ch <- as.matrix(birds[ , 1:11]) # extract capture histories.
Starting values for survival analysis in JAGS/BUGS
Description
Generates a matrix for use as starting values for the survival indicator in a JAGS or BUGS model for survival analysis, traditionally designated z.
Usage
zinit(CH)
Arguments
| CH | An individuals x time matrix of capture records, 1 if captured, 0 otherwise, no missing values. | 
Value
An individuals x time matrix with 1 in each row after the first capture; all other elements NA, including the first capture occasion.
Author(s)
Marc Kéry & Andy Royle
References
Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 3.2.3.
Examples
# Generate a fake capture history
( ch <- matrix(rbinom(30, 1, 0.5), 6, 5) )
zinit(ch)