Estimate the relationship between the kernel density of two layers of points
Source:R/hotspot_dual_kde.R
hotspot_dual_kde.Rd
Estimate the relationship between the kernel density of two layers of points
Usage
hotspot_dual_kde(
x,
y,
cell_size = NULL,
grid_type = "rect",
bandwidth = NULL,
bandwidth_adjust = 1,
method = "ratio",
grid = NULL,
weights = NULL,
quiet = FALSE,
...
)
Arguments
- x, y
sf
data frames containing points.- cell_size
numeric
value specifying the size of each equally spaced grid cell, using the same units (metres, degrees, etc.) as used in thesf
data frame given in thex
argument. Ignored ifgrid
is notNULL
. If this argument andgrid
areNULL
(the default), the cell size will be calculated automatically (see Details).- grid_type
character
specifying whether the grid should be made up of squares ("rect"
, the default) or hexagons ("hex"
). Ignored ifgrid
is notNULL
.- bandwidth
either a single
numeric
value specifying the bandwidth to be used in calculating the kernel density estimates, or a list of exactly 2 such values. If this argument isNULL
(the default), the bandwidth for bothx
andy
will be determined automatically using the result ofbandwidth.nrd
called on the co-ordinates of the points inx
. If this argument islist(NULL, NULL)
, separate bandwidths will be determined automatically forx
andy
based on each layer.- bandwidth_adjust
single positive
numeric
value by which the value ofbandwidth
for bothx
andy
will be multiplied, or a list of two such values. Useful for setting the bandwidth relative to the default.- method
character
specifying the method by which the densities,d()
, ofx
andy
will be related:ratio
(the default) calculates the density of
x
divided by the density ofy
, i.e.d(x) / d(y)
.log
calculates the natural logarithm of the density of
x
divided by the density ofy
, i.e.log(d(x) / d(y))
.diff
calculates the difference between the density of
x
and the density ofy
, i.e.d(x) - d(y)
.sum
calculates the sum of the density of
x
and the density ofy
, i.e.d(x) + d(y)
.
The result of this calculation will be returned in the
kde
column of the return value.- grid
sf
data frame containing polygons, which will be used as the grid for which densities are estimated.- weights
NULL
(the default) or a vector of length two giving eitherNULL
or the name of a column in each ofx
andy
to be used as weights for weighted counts and KDE values.- quiet
if set to
TRUE
, messages reporting the values of any parameters set automatically will be suppressed. The default isFALSE
.- ...
Further arguments passed to
kde
.
Value
An sf
tibble of grid cells with corresponding point
counts and dual kernel density estimates for each cell. This can be plotted
using autoplot
.
This function creates a regular two-dimensional grid of cells (unless a
custom grid is specified with grid
), calculates the density of points
in each cell for each of x
and y
using functions from the
SpatialKDE
package, then produces a value representing a relation
between the two densities. The count of points in each cell is also returned.
Dual kernel density values can be useful for understanding the relationship between the distributions of two sets of point locations. For example:
The ratio between two densities representing the locations of burglaries and the locations of houses can show the distribution of the risk (incidence rate) of burglaries. The logged ratio may be useful to show relationships where one set of points has an extremely skewed distribution.
The difference between two densities can show the change in distributions between two points in time.
The sum of two densities can be used to estimate the total density of two types of point, e.g. the locations of occurrences of two diseases.
Coverage of the output data
The grid produced by this function covers the convex hull of the points in
x
. This means the result may include KDE values for cells that are
outside the area for which data were provided, which could be misleading. To
handle this, consider cropping the output layer to the area for which data
are available. For example, if you only have crime data for a particular
district, crop the output dataset to the district boundary using
st_intersection
.
References
Yin, P. (2020). Kernels and Density Estimation. The Geographic Information Science & Technology Body of Knowledge (1st Quarter 2020 Edition), John P. Wilson (ed.). doi:doi:10.22224/gistbok/2020.1.12
Examples
# See also the examples for `hotspot_kde()` for examples of how to specify
# `cell_size`, `bandwidth`, etc.
library(sf)
# Transform data to UTM zone 15N so that cell_size and bandwidth can be set
# in metres
memphis_robberies_utm <- st_transform(memphis_robberies, 32615)
memphis_population_utm <- st_transform(memphis_population, 32615)
# Calculate burglary risk based on residential population. `weights` is set
# to `c(NULL, population)` so that the robberies layer is not weighted and
# the population layer is weighted according to the number of residents in
# each census block.
# \donttest{
hotspot_dual_kde(
memphis_robberies_utm,
memphis_population_utm,
bandwidth = list(NULL, NULL),
weights = c(NULL, population)
)
#> Cell size set to 500 metres automatically
#> Bandwidth for `x` set to 5,592 metres automatically based on rule of thumb
#> Bandwidth for `y` set to 5,175 metres automatically based on rule of thumb
#> The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
#> which was just loaded, will retire in October 2023.
#> Please refer to R-spatial evolution reports for details, especially
#> https://r-spatial.org/r/2023/05/15/evolution4.html.
#> It may be desirable to make the sf package available;
#> package maintainers should consider adding sf to Suggests:.
#> The sp package is now running under evolution status 2
#> (status 2 uses the sf package in place of rgdal)
#> Simple feature collection with 3302 features and 2 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 761931.5 ymin: 3876436 xmax: 797931.5 ymax: 3906436
#> Projected CRS: WGS 84 / UTM zone 15N
#> # A tibble: 3,302 × 3
#> n kde geometry
#> * <dbl> <dbl> <POLYGON [m]>
#> 1 0 0.00228 ((762931.5 3876436, 762931.5 3876936, 763431.5 3876936, 763431…
#> 2 0 0.00214 ((763431.5 3876436, 763431.5 3876936, 763931.5 3876936, 763931…
#> 3 0 0.00203 ((763931.5 3876436, 763931.5 3876936, 764431.5 3876936, 764431…
#> 4 0 0.00194 ((764431.5 3876436, 764431.5 3876936, 764931.5 3876936, 764931…
#> 5 0 0.00188 ((764931.5 3876436, 764931.5 3876936, 765431.5 3876936, 765431…
#> 6 0 0.00185 ((765431.5 3876436, 765431.5 3876936, 765931.5 3876936, 765931…
#> 7 0 0.00184 ((765931.5 3876436, 765931.5 3876936, 766431.5 3876936, 766431…
#> 8 0 0.00185 ((766431.5 3876436, 766431.5 3876936, 766931.5 3876936, 766931…
#> 9 0 0.00188 ((766931.5 3876436, 766931.5 3876936, 767431.5 3876936, 767431…
#> 10 0 0.00195 ((767431.5 3876436, 767431.5 3876936, 767931.5 3876936, 767931…
#> # ℹ 3,292 more rows
# }