Tracks genets through time for multiple species and sites

This function tracks individual organisms in mapped quadrats through time to generate a demographic dataset that includes survival and growth for each individual.

Usage

trackSpp(
  dat,
  inv,
  dorm,
  buff,
  buffGenet,
  clonal,
  species = "Species",
  site = "Site",
  quad = "Quad",
  year = "Year",
  geometry = "geometry",
  aggByGenet = TRUE,
  printMessages = TRUE,
  flagSuspects = FALSE,
  shrink = 0.1,
  dormSize = 0.05,
  ...
)

Arguments

dat

An sf data.frame of the same format as grasslandData. It must have columns that contain:

a unique identification for each research site in character format with no NAs (the default column name is "Site")
species name in character format with no NAs (the default column name is "Species")
unique quadrat identifier in character format with no NAs (the default column name is "Quad")
year of data collection in integer format with no NAs (the default column name is "Year")
an sf 'geometry' column that contains a polygon or multipolygon data type for each individual observation (the default column name is "geometry")

This function will add columns called "basalArea_ramet", "trackID", "age", "size_tplus1", "recruit", "nearEdge", and "survives_tplus1", so 'dat' should not contain columns with these names.

inv

A named list of the same format as grasslandInventory. The name of each element of the list is a quadrat name in 'dat', and the contents of that list element is a numeric vector of all of the years in which that quadrat (or other unique spatial area) was sampled. Make sure these are the years the quadrat was actually sampled, not just the years that have data in the 'dat' argument! This argument allows the function to differentiate between years when the quadrat wasn't sampled and years when there just weren't any individuals of a species present in that quadrat.
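A minimal sketch of why 'inv' matters, using made-up quadrat names ("Q1", "Q2") and years that are not part of grasslandInventory. It shows how a year with no mapped individuals can be told apart from a year with no sampling at all:

```r
# Hypothetical quadrats "Q1" and "Q2"; neither is part of grasslandInventory.
inv <- list(
  Q1 = c(2000, 2001, 2002, 2004),  # Q1 was not sampled in 2003
  Q2 = 2000:2003
)

# Hypothetical years in which a species was actually mapped in Q1.
yearsWithData <- c(2000, 2002, 2004)

# Sampled, but the species was absent from the map.
sampledButAbsent <- setdiff(inv[["Q1"]], yearsWithData)

# Not sampled at all, so nothing can be said about presence or absence.
allYears <- seq(min(inv[["Q1"]]), max(inv[["Q1"]]))
notSampled <- setdiff(allYears, inv[["Q1"]])
```

Here 'sampledButAbsent' holds the years that count as true absences, while 'notSampled' holds the gap in sampling.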

dorm

A numeric vector of length 1, indicating the number of years an individual of these species is allowed to go dormant, i.e. be absent from the map but be considered the same individual when it reappears. This must be an integer greater than or equal to 0. OR dorm can be a data.frame with the columns "Species" and "dorm". This data.frame must have a row for each unique species present in 'dat', with species name as a character string in the "Species" column, and a numeric value greater than or equal to 0 in the 'dorm' column that indicates the number of years each species is allowed to go dormant.

buff

A numeric vector of length 1 that is greater than or equal to zero, indicating how far (in the same units as spatial values in 'dat') a polygon can move from year i to year i+1 and still be considered the same individual. OR buff can be a data.frame with the columns "Species" and "buff". This data.frame must have a row for each unique species present in 'dat', with species name as a character string in the "Species" column, and a numeric value in the 'buff' column specifying the 'buff' argument for each species.

buffGenet

A numeric vector of length 1 that is greater than or equal to zero, indicating how close (in the same units as spatial values in 'dat') polygons must be to one another in the same year to be grouped as a genet (if 'clonal' argument = TRUE). OR buffGenet can be a data.frame with the columns "Species" and "buffGenet". This data.frame must have a row for each unique species present in 'dat', with species name as a character string in the "Species" column, and a numeric value greater than or equal to 0 in the 'buffGenet' column specifying the 'buffGenet' argument for each species. This argument is passed to the groupByGenet function, which is used inside the assign function.

clonal

A logical vector of length 1, indicating whether a species is allowed to be clonal or not (i.e. if multiple polygons (ramets) can be grouped as one individual (genet)). If clonal = TRUE, the species is allowed to be clonal, and if clonal = FALSE, the species is not allowed to be clonal. OR clonal can be a data.frame with the columns "Species" and "clonal". This data.frame must have a row for each unique species present in 'dat', with species name as a character string in the "Species" column, and a logical value in the 'clonal' column specifying the 'clonal' argument for each species.
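The data.frame form of 'dorm', 'buff', 'buffGenet', and 'clonal' described above could be built as in the following sketch. The species names are the two used in the Examples section; the parameter values and the object names (dormDF, buffDF, and so on) are hypothetical and chosen only for illustration:

```r
# Species names from the Examples section; the values below are made up.
spp <- c("Bouteloua rothrockii", "Calliandra eriophylla")

dormDF      <- data.frame(Species = spp, dorm = c(1, 0))
buffDF      <- data.frame(Species = spp, buff = c(0.05, 0.01))
buffGenetDF <- data.frame(Species = spp, buffGenet = c(0.005, 0))
clonalDF    <- data.frame(Species = spp, clonal = c(TRUE, FALSE))

# Each data.frame needs exactly one row per species present in 'dat'.
stopifnot(!anyDuplicated(dormDF$Species))
```

Each of these objects would then be passed to the argument of the same name in place of a single value.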

species

An optional character string argument. Indicates the name of the column in 'dat' that contains species name data. It is unnecessary to provide a value for this argument if the column name is "Species" (default value is 'Species').

site

An optional character string argument. Indicates the name of the column in 'dat' that contains site name data. It is unnecessary to provide a value for this argument if the column name is "Site" (default value is 'Site').

quad

An optional character string argument. Indicates the name of the column in 'dat' that contains quadrat name data. It is unnecessary to provide a value for this argument if the column name is "Quad" (default is 'Quad').

year

An optional character string argument. Indicates the name of the column in 'dat' that contains data for year of sampling. It is unnecessary to provide a value for this argument if the column name is "Year" (default is 'Year').

geometry

An optional character string argument. Indicates the name of the column in 'dat' that contains sf geometry data. It is unnecessary to provide a value for this argument if the column name is "geometry" (default is 'geometry').

aggByGenet

A logical argument that determines whether the output of trackSpp() will be aggregated by genet. If the value is TRUE (the default), then each unique trackID (or genet) in each year will be represented by only one row in the output data.frame. This prepares the dataset for most demographic analyses. If the value is FALSE, then each unique trackID in each year may be represented by multiple rows in the data (each ramet gets a row). Note that if the value is TRUE, then some columns present in the input data.frame 'dat' will be dropped. If you do not wish this to happen, then you can aggregate the data.frame to genet by hand.

printMessages

A logical argument that determines whether this function returns messages about genet aggregation, as well as messages indicating which year is the last year of sampling in each quadrat and which year(s) come before a gap in sampling that exceeds the 'dorm' argument (and thus which years of data have an 'NA' for "survives_tplus1" and "size_tplus1"). If printMessages = TRUE (the default), then messages are printed. If printMessages = FALSE, messages are not printed.

flagSuspects

A logical argument of length 1, indicating whether observations that are 'suspect' will be flagged. The default is flagSuspects = FALSE. If flagSuspects = TRUE, then a column called 'Suspect' is added to the output data.frame. Any suspect observations get a 'TRUE' in the 'Suspect' column, while non-suspect observations receive a 'FALSE'. There are two ways that an observation can be classified as 'suspect'. First, if two consecutive observations have the same trackID, but the basal area of the observation in year t+1 is less than a certain percentage (defined by the 'shrink' argument) of the basal area of the observation in year t, it is possible that the observation in year t+1 is a new recruit and not the same individual. The second way an observation can be classified as 'suspect' is if it is very small before going dormant. It is unlikely that a very small individual will survive dormancy, so it is possible that the function has mistakenly given a survival value of '1' to this individual. A 'very small individual' is any observation with an area below a certain percentile (specified by 'dormSize') of the size distribution for this species, which is generated using all of the size data for this species in 'dat'.

shrink

A single numeric value. This value is only used when flagSuspects = TRUE. When two consecutive observations have the same trackID, and the ratio of size t+1 to size t is smaller than the value of shrink, the observation in year t gets a 'TRUE' in the 'Suspect' column. For example, suppose shrink = 0.2, and an individual that the tracking function has identified as 'BOUGRA_1992_5' has an area of 9 cm^2 in year t and an area of 1.35 cm^2 in year t+1. The ratio of size t+1 to size t is 1.35/9 = 0.15, which is smaller than the cutoff specified by shrink, so the observation of 'BOUGRA_1992_5' in year t gets a 'TRUE' in the 'Suspect' column. The default value is shrink = 0.10.
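The arithmetic in this description can be checked directly. The sketch below only reproduces the ratio comparison with the numbers given above; it is not the package's internal code, and the variable names are chosen for illustration:

```r
shrink      <- 0.2    # cutoff from the example in the description
size_t      <- 9      # basal area in year t (cm^2)
size_tplus1 <- 1.35   # basal area in year t+1 (cm^2)

# Ratio of size in year t+1 to size in year t.
ratio <- size_tplus1 / size_t

# The year-t observation is flagged when the ratio falls below 'shrink'.
suspect <- ratio < shrink
```

With the default shrink = 0.1, the same pair of observations would not be flagged, because 0.15 is larger than 0.1.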

dormSize

A single numeric value. This value is only used when flagSuspects = TRUE and dorm is greater than or equal to 1. An individual is flagged as 'suspect' if it 'goes dormant' and has a size that is less than or equal to the percentile of the size distribution for this species that is designated by dormSize. For example, suppose dormSize = 0.05 and an individual has a basal area of 0.5 cm^2. The 5th percentile of the distribution of size for this species, which is made using the mean and standard deviation of all observations in 'dat' for the species in question, is 0.6 cm^2. This individual does not have any overlaps in the next year (year t+1), but does have an overlap in year t+2. However, because the basal area of this observation is smaller than the 5th percentile of size for this species, the observation in year t will get a 'TRUE' in the 'Suspect' column. It is possible that the tracking function has mistakenly assigned a '1' for survival in year t, because it is unlikely that this individual is large enough to survive dormancy. The default value is dormSize = 0.05.
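A rough sketch of the size cutoff described above. It assumes that the size distribution is a normal distribution parameterized by the sample mean and standard deviation, which is one reading of the description and may differ from what the package does internally (for example, it may transform sizes first). The 'sizes' vector and all variable names are hypothetical:

```r
dormSize <- 0.05

# Hypothetical basal areas (cm^2) for all observations of one species.
sizes <- c(0.5, 2.0, 2.4, 2.8, 3.0, 3.3, 3.6, 4.0)

# Assumption: normal distribution with the sample mean and sd.
cutoff <- qnorm(dormSize, mean = mean(sizes), sd = sd(sizes))

focalSize   <- 0.5    # size of the individual in year t
wentDormant <- TRUE   # absent in year t+1, found again in year t+2

# Flag the year-t observation if it went dormant while at or below the cutoff.
suspect <- wentDormant && focalSize <= cutoff
```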

...

Other arguments passed on to methods. Not currently used.

Value

An sf data.frame with the same columns as 'dat', but with the following additional columns:

trackID: A unique value for each individual genet, consisting of the 6-letter species code, the year in which this individual was recruited, and a unique index number, all separated by a "_".
age: An integer indicating the age of this individual in year t. Values of NA indicate that an accurate age cannot be calculated because this individual was observed either in the first year of sampling or in a year following a gap in sampling, so the exact year of recruitment is not known.
size_tplus1: The size of this genet in year t+1, in the same units as the 'area' column in 'dat'.
recruit: A Boolean integer indicating whether this individual is a new recruit in year t (1), or existed in a previous year (0). Values of NA indicate that this individual was observed either in the first year of sampling or in a year following a gap in sampling, so it is not possible to accurately determine whether or not it is a new recruit in year t.
survives_tplus1: A Boolean integer indicating whether this individual survived (1), or died (0) in year t+1.
basalArea_genet: The size of this entire genet in year t, in the same units as the 'area' column in 'dat'. If the 'clonal' argument = FALSE, then this number will be identical to the 'basalArea_ramet' column.
basalArea_ramet: This is only included if 'aggByGenet' = FALSE. This is the size of this ramet in year t, in the same units as the 'area' column in 'dat'. If the 'clonal' argument = FALSE, then this number will be identical to the 'basalArea_genet' column.
nearEdge: A logical value indicating whether this individual is within a buffer (specified by the 'buff' argument) from the edge of the quadrat.
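Because a trackID packs three pieces of information into one string, it can be split back apart when needed. This sketch assumes the three-part, underscore-separated format described for the trackID column and uses the ID that appears in the 'shrink' description; the variable names are illustrative:

```r
trackID <- "BOUGRA_1992_5"   # ID taken from the 'shrink' description

# Split on the "_" separator.
parts <- strsplit(trackID, "_", fixed = TRUE)[[1]]

sppCode     <- parts[1]              # 6-letter species code
recruitYear <- as.integer(parts[2])  # year of recruitment
index       <- as.integer(parts[3])  # unique index number
```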

Details

This is a wrapper function that applies assign across multiple species, quadrats, and sites. For each species and quadrat, trackSpp() loads a spatially referenced data.frame ('dat'), and then uses the groupByGenet function to assign genetIDs to polygons (if 'clonal' = TRUE) such that polygons that form the same genet have the same genetID. A buffer of a distance defined by 'buff' is applied around each genet polygon. The spatial data for each genet from the current year (year t) is then compared to individuals in the next year (year t+1): trackSpp() calculates the amount of overlapping area between polygons of each year t genet and polygons of each year t+1 genet (using st_intersection).

If there is unambiguous overlap between a 'parent' genet from year t and a 'child' genet from year t+1, then that 'child' gets the same identifying trackID as the parent. If there is a 'tie', where more than one parent overlaps the same child or more than one child overlaps the same parent, the parent-child pair with the greatest amount of overlap receives the same trackID. Polygons in year t+1 that do not have a parent are given new trackIDs and are identified as new recruits. If dormancy is not allowed, then polygons in year t that do not have child polygons get a '0' in the 'survives_tplus1' column. If dormancy is allowed, parent polygons without child polygons are stored as 'ghosts' and are then compared to data from year t+1+i to find potential child polygons, where i is the value of the 'dorm' argument.

For a more detailed description of the trackSpp() function, see the vignette: vignette("Using_the_plantTracker_trackSpp_function", package = "plantTracker")
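The buffer-and-overlap comparison described in these Details can be sketched with sf on toy polygons. This is an illustration of the idea only, not the code trackSpp() runs: the helper sq(), the IDs, and the coordinates are all hypothetical, and tie-breaking and dormancy are left out.

```r
library(sf)

# Hypothetical helper: a square with lower-left corner (x, y) and width w.
sq <- function(x, y, w = 1) {
  st_polygon(list(rbind(
    c(x, y), c(x + w, y), c(x + w, y + w), c(x, y + w), c(x, y)
  )))
}

# Toy genets in year t ("parents") and in year t+1 ("children").
parents  <- st_sf(trackID = c("A", "B"),
                  geometry = st_sfc(sq(0, 0), sq(5, 5)))
children <- st_sf(childID = c("c1", "c2"),
                  geometry = st_sfc(sq(0.5, 0.5), sq(9, 9)))

# Buffer each parent by 'buff', then measure its overlap with each child.
buff <- 0.05
buffered <- st_buffer(parents, dist = buff)
overlaps <- suppressWarnings(st_intersection(buffered, children))
overlaps$overlapArea <- as.numeric(st_area(overlaps))

# Children without an overlapping parent would be treated as new recruits;
# parents without an overlapping child either died or became 'ghosts'.
recruits <- setdiff(children$childID, overlaps$childID)
noChild  <- setdiff(parents$trackID, overlaps$trackID)
```

In this toy case only parent "A" and child "c1" overlap, so "c1" would inherit the trackID of "A", "c2" would be a recruit, and "B" would have no child in year t+1.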

Examples

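# Subset the example data to one site and two species
dat <- grasslandData[grasslandData$Site == "AZ" &
                       grasslandData$Species %in% c("Bouteloua rothrockii",
                                                    "Calliandra eriophylla"), ]
# Rename the species column (the first column) to show the 'species' argument
names(dat)[1] <- "speciesName"
# Keep only the inventory entries for quadrats present in 'dat'
inv <- grasslandInventory[unique(dat$Quad)]
outDat <- trackSpp(dat = dat,
                   inv = inv,
                   dorm = 1,
                   buff = 0.05,
                   buffGenet = 0.005,
                   # per-species 'clonal' values, one row per species
                   clonal = data.frame("Species" = unique(dat$speciesName),
                                       "clonal" = c(TRUE, FALSE)),
                   species = "speciesName",
                   aggByGenet = TRUE
)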
#> Site: AZ
#> -- Quadrat: SG2
#> ---- Species: Bouteloua rothrockii
#> ; Calliandra eriophylla
#> Note: Individuals in year 1927 have a value of 'NA' in the 'survives_tplus1' and 'size_tplus1' columns because 1927 is the last year of sampling in this quadrat.
#> -- Quadrat: SG4
#> ---- Species: Bouteloua rothrockii
#> ; Calliandra eriophylla
#> Note: Individuals in year 1927 have a value of 'NA' in the 'survives_tplus1' and 'size_tplus1' columns because 1927 is the last year of sampling in this quadrat.
#> Note: The output data.frame from this function is shorter than your input data.frame because demographic data has been aggregated by genet. Because of this, some columns that were present in your input data.frame may no longer be present. If you don't want the output to be aggregated by genet, include the argument 'aggByGenet == FALSE' in your call to trackSpp().