Title: | Functions for Spatial Thinning of Species Occurrence Records for Use in Ecological Models |
---|---|
Description: | A set of functions that can be used to spatially thin species occurrence data. The resulting thinned data can be used in ecological modeling, such as ecological niche modeling. |
Authors: | Matthew E. Aiello-Lammens [aut, cre], Robert A. Boria [aut], Aleksandar Radosavljevic [aut], Bruno Vilela [aut], Robert P. Anderson [aut], Robert Bjornson [ctb], Steve Weston [ctb] |
Maintainer: | Matthew E. Aiello-Lammens <[email protected]> |
License: | GPL-3 |
Version: | 0.2.0 |
Built: | 2025-01-29 04:08:01 UTC |
Source: | https://github.com/mlammens/spthin |
A dataset containing compiled occurrence record locations for Heteromys anomalus in northern coastal South America. These records have been examined to check for accurate species identification.
A data frame with 201 rows and 4 variables
SPEC. species name assigned to occurrence record
LAT. decimal degree latitude value
LONG. decimal degree longitude value
REGION. region, or island, of occurrence
Three plots (selected by which) are currently available: a plot of the number of repetitions versus the maximum number of records retained at each repetition ([1] observed values; [2] log transformed) and a histogram of the maximum number of records retained [3].
plotThin( thinned, which = c(1:3), ask = prod(par("mfcol")) < length(which) && dev.interactive(), ... )
thinned |
A list of data.frames returned by |
which |
if a subset of the plots is required, specify a subset of the numbers 1:3. |
ask |
logical; if |
... |
other parameters to be passed through to plotting functions. |
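The quantities behind the three plots can be sketched in base R. This is an illustration only, not the package's code: it assumes that "maximum number of records retained at each repetition" means the running maximum of the per-repetition record counts, and the mock list `thinned`, the helper `mock_df`, and the variable names are hypothetical.

```r
# Hypothetical stand-in for the list of data.frames passed as 'thinned':
# each element holds the records retained in one repetition.
mock_df <- function(n) {
  data.frame(LONG = seq(-66, by = 0.5, length.out = n), LAT = rep(10, n))
}
thinned <- lapply(c(3L, 5L, 4L, 5L), mock_df)

# Number of records retained in each repetition.
n_retained <- vapply(thinned, nrow, integer(1))

# Assumption: plots [1] and [2] track the running maximum across repetitions.
running_max <- cummax(n_retained)

if (interactive()) {
  reps <- seq_along(running_max)
  plot(reps, running_max, type = "l",
       xlab = "Repetitions", ylab = "Maximum records retained")      # [1]
  plot(log(reps), running_max, type = "l",
       xlab = "log(Repetitions)", ylab = "Maximum records retained") # [2]
  hist(n_retained, xlab = "Records retained", main = "")             # [3]
}
```

A curve that flattens early suggests that additional repetitions are unlikely to retain more records.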
Summarize the results of the thin function.
summaryThin(thinned, show = TRUE)
thinned |
A list of data.frames returned by |
show |
logical; if |
Returns a list with (1) the maximum number of records, (2) the number of data frames with the maximum number of records and (3) a table with the number of data frames per number of records.
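The three returned quantities can be reproduced from any list of data.frames with a few lines of base R. The sketch below is not the package's implementation; the function name `summarise_thinned`, the element names of its result, and the mock input are all hypothetical.

```r
# Sketch: summarise a list of thinned data.frames by their row counts.
summarise_thinned <- function(thinned) {
  n_records <- vapply(thinned, nrow, integer(1))
  list(
    max_records   = max(n_records),                   # (1) maximum number of records
    n_at_max      = sum(n_records == max(n_records)), # (2) data frames reaching that maximum
    records_table = table(n_records)                  # (3) data frames per number of records
  )
}

# Mock input: three repetitions retaining 3, 4 and 4 records.
thinned <- list(
  data.frame(LONG = c(-66.0, -65.5, -65.0), LAT = c(10, 10, 10)),
  data.frame(LONG = c(-66.0, -65.5, -65.0, -64.5), LAT = c(10, 10, 10, 10)),
  data.frame(LONG = c(-66.0, -65.4, -65.0, -64.5), LAT = c(10, 10, 10, 10))
)
res <- summarise_thinned(thinned)
```

Data frames that reach the maximum are usually the ones of interest, since they keep the most information while still satisfying the distance constraint.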
thin returns spatially thinned species occurrence data sets. A randomization algorithm (thin.algorithm) is used to create data sets in which all occurrence locations are at least thin.par kilometers apart. Spatial thinning helps to reduce the effect of uneven, or biased, species occurrence collections on spatial model outcomes.
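The distance constraint can be checked independently of the package. The following base R sketch computes great-circle distances with the haversine formula and tests whether the closest pair of records is at least a given distance apart. The helper names, the example coordinates, and the Earth radius of 6371 km are assumptions made for illustration; the package may compute distances with a different routine and radius.

```r
# Great-circle distance in km between points given in decimal degrees.
# Assumption: spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Smallest pairwise distance (km) among the rows of a LAT/LONG data frame.
min_pairwise_km <- function(df, lat.col = "LAT", long.col = "LONG") {
  n <- nrow(df)
  if (n < 2) return(Inf)
  d <- outer(seq_len(n), seq_len(n), function(i, j) {
    haversine_km(df[[lat.col]][i], df[[long.col]][i],
                 df[[lat.col]][j], df[[long.col]][j])
  })
  min(d[upper.tri(d)])
}

# Hypothetical records; check them against a 10 km thinning distance.
pts <- data.frame(LAT = c(10.0, 10.0, 10.5), LONG = c(-66.0, -65.9, -66.0))
thin_par_km <- 10
is_thinned <- min_pairwise_km(pts) >= thin_par_km
```

Running such a check on a thinned data set is a quick way to confirm that no two retained records are closer than the chosen distance, bearing in mind that small discrepancies can arise from the choice of Earth radius.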
thin( loc.data, lat.col = "LAT", long.col = "LONG", spec.col = "SPEC", thin.par, reps, locs.thinned.list.return = FALSE, write.files = TRUE, max.files = 5, out.dir, out.base = "thinned_data", write.log.file = TRUE, log.file = "spatial_thin_log.txt", verbose = TRUE )
loc.data |
A data.frame of occurrence locations. It can include several columns, but must include at minimum a column of latitude values, a column of longitude values, and a column of species names. |
lat.col |
Name of column of latitude values. Caps sensitive. |
long.col |
Name of column of longitude values. Caps sensitive. |
spec.col |
Name of column of species name. Caps sensitive. |
thin.par |
Thinning parameter - the distance (in kilometers) that you want records to be separated by. |
reps |
The number of times to repeat the thinning process. Given the random process of removing nearest neighbors, there should be 'reps' different sets of coordinates. |
locs.thinned.list.return |
TRUE/FALSE - If true, the list of data.frames of thinned locations, one per replication, is returned (see Returns below). |
write.files |
TRUE/FALSE - If true, new *.csv files will be written with the thinned locs data |
max.files |
The maximum number of *.csv files to be written based on the thinned data |
out.dir |
Directory to write new *.csv files to |
out.base |
A file basename to give to the thinned datasets created |
write.log.file |
TRUE/FALSE - If true, a log file of the thinning run is created or appended to. |
log.file |
Text log file |
verbose |
TRUE/FALSE - If true, running details of the function are printed to the console. |
locs.thinned.dfs: A list of data.frames, each containing the spatially thinned locations produced by the algorithm for a single replication. This list will have 'reps' elements.
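A minimal call might look like the sketch below, which uses only the arguments listed above. The occurrence table `occ` and its coordinates are made up for illustration, and the call is guarded so the snippet still runs when the package is not installed. It assumes that out.dir and log.file can be omitted when write.files and write.log.file are both FALSE.

```r
# Hypothetical occurrence records for a single species.
occ <- data.frame(
  SPEC = "Species a",
  LAT  = c(10.00, 10.00, 10.00, 5.00),
  LONG = c(-66.00, -65.99, -65.98, -60.00)
)

if (requireNamespace("spThin", quietly = TRUE)) {
  thinned <- spThin::thin(
    loc.data = occ,
    lat.col = "LAT", long.col = "LONG", spec.col = "SPEC",
    thin.par = 10,                    # minimum separation in kilometers
    reps = 5,                         # number of thinning repetitions
    locs.thinned.list.return = TRUE,  # return the list of data.frames
    write.files = FALSE,              # do not write *.csv files
    write.log.file = FALSE,           # do not write a log file
    verbose = FALSE
  )
  length(thinned)  # one element per repetition
}
```

Setting write.files and write.log.file to FALSE keeps the run free of side effects on disk, which is convenient when the result is passed straight to further analysis.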
thin.algorithm implements a randomization approach to spatially thinning species occurrence data. This function is the algorithm underlying the thin function.
thin.algorithm(rec.df.orig, thin.par, reps)
rec.df.orig |
A data frame of long/lat points for each presence record. The data.frame should be a two-column data frame, one column of long and one of lat |
thin.par |
Thinning parameter - the distance (in kilometers) that you want records to be separated by. |
reps |
The number of times to repeat the thinning process. Given the random process of removing nearest neighbors, there should be 'reps' different sets of coordinates. |
reduced.rec.dfs: A list object of length 'reps'. Each list element is a different data.frame of spatially thinned presence records.
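To make the idea of randomized thinning concrete, here is a simplified base R sketch of one such scheme. It is not the package's code. It assumes that, in each repetition, the record with the most neighbors closer than thin.par is removed repeatedly, with ties broken at random, until no two remaining records are too close. The function name `thin_sketch`, the column names, and the Earth radius of 6371 km are hypothetical choices.

```r
# Simplified sketch of randomized spatial thinning (illustration only).
thin_sketch <- function(rec.df, thin.par, reps,
                        lat.col = "LAT", long.col = "LONG") {
  n   <- nrow(rec.df)
  lat <- rec.df[[lat.col]]  * pi / 180
  lon <- rec.df[[long.col]] * pi / 180

  # Pairwise haversine distances in km (assumed radius: 6371 km).
  a <- sin(outer(lat, lat, "-") / 2)^2 +
    outer(cos(lat), cos(lat)) * sin(outer(lon, lon, "-") / 2)^2
  dist_km <- 2 * 6371 * asin(pmin(1, sqrt(a)))

  too_close <- dist_km < thin.par
  diag(too_close) <- FALSE

  lapply(seq_len(reps), function(r) {
    keep <- rep(TRUE, n)
    repeat {
      # For each kept record, count the kept neighbors closer than thin.par.
      n_close <- rowSums(too_close[, keep, drop = FALSE]) * keep
      if (all(n_close == 0)) break
      # Remove one of the most crowded records, chosen at random.
      worst <- which(n_close == max(n_close))
      keep[worst[sample.int(length(worst), 1)]] <- FALSE
    }
    rec.df[keep, , drop = FALSE]
  })
}

# Three records within about 2 km of each other plus one distant record.
set.seed(1)
pts <- data.frame(LONG = c(-66.00, -65.99, -65.98, -60.00),
                  LAT  = c(10, 10, 10, 5))
out <- thin_sketch(pts, thin.par = 10, reps = 5)
```

Because the removed record is drawn at random among ties, different repetitions can keep different records from the crowded cluster, which is why several repetitions are run and the largest resulting data sets are preferred.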