Spatial Autocorrelation

Short Description	Spatial autocorrelation analysis can be used in urban health research to examine the presence and characteristics of spatial clustering or spatial dependence in health data.
Data
Suggested tools
Category	Spatial Analysis
Variable

Overview

Spatial autocorrelation refers to the tendency of spatial data points that are close to one another to have more similar values than points that are farther apart. This principle is fundamental in geographic information systems (GIS), spatial analysis, and spatial statistics. It challenges the assumption of independence made by many traditional statistical methods, because the spatial arrangement of observations can substantially influence the results of an analysis. In urban health, measures of spatial autocorrelation are typically used to quantify how similar or dissimilar neighboring locations are in terms of their health outcomes or attributes. In other words, they measure how strongly a health variable is related to its spatial context. Positive spatial autocorrelation indicates that similar health values tend to cluster together, while negative spatial autocorrelation indicates that neighboring locations tend to have dissimilar values.

Description

Spatial autocorrelation can be positive, negative, or neutral. Positive spatial autocorrelation occurs when similar values cluster together in space, whereas negative spatial autocorrelation indicates that neighboring values are dissimilar. Neutral or no spatial autocorrelation suggests that the spatial distribution is random.

Key Measures for Spatial Autocorrelation

Moran's I (Global measure)
Moran's I is a widely used metric for measuring spatial autocorrelation, defined as:
$I = \frac{N}{W} \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{N}(x_i - \bar{x})^2}$
Where: $N$ is the number of spatial units indexed by $i$ and $j$; $x_i$ and $x_j$ are the observations at locations $i$ and $j$; $\bar{x}$ is the mean of all observations; $w_{ij}$ is the spatial weight between locations $i$ and $j$; and $W$ is the sum of all spatial weights.
Its values typically fall between -1 and 1 (the exact bounds depend on the spatial weights), where: Positive values indicate positive spatial autocorrelation (similar values cluster together); Negative values indicate negative spatial autocorrelation (neighboring values are dissimilar); Values close to the expected value under spatial randomness, $-1/(N-1)$, which approaches 0 as $N$ grows, indicate no spatial autocorrelation.
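As a minimal sketch of the Moran's I formula, assuming NumPy is available and the weights fit in a small dense matrix, the statistic can be computed directly. The function name `morans_i` and the four-area example are illustrative assumptions, not part of any particular package.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x (length N) and weights matrix w (N x N)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    z = x - x.mean()                      # deviations from the mean
    total_w = w.sum()                     # W: sum of all spatial weights
    cross = (w * np.outer(z, z)).sum()    # sum_i sum_j w_ij * z_i * z_j
    return (n / total_w) * cross / (z ** 2).sum()

# Hypothetical example: four areas along a line; adjacent areas are neighbors.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
clustered = [1.0, 1.0, 5.0, 5.0]      # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]    # every neighbor pair differs

i_clustered = morans_i(clustered, w)      # positive
i_alternating = morans_i(alternating, w)  # negative
```

The clustered pattern yields a positive value and the alternating pattern a negative one, matching the interpretation given above.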

Geary's C (Global measure)
Geary's C also measures global spatial autocorrelation, but it is based on the squared differences between neighboring observations rather than on their cross-products. This makes it more sensitive to local variation than Moran's I.
$C = \frac{(N-1)}{2W} \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}(x_i - x_j)^2}{\sum_{i=1}^{N}(x_i - \bar{x})^2}$
Values of Geary's C typically range from 0 to about 2, where: A value of 1 indicates no spatial autocorrelation; Values below 1 indicate positive spatial autocorrelation; Values above 1 indicate negative spatial autocorrelation.
⚠️ Moran's I and Geary's C both measure overall spatial autocorrelation, with Moran's I being more sensitive to global patterns and Geary's C being more sensitive to local patterns.
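The Geary's C formula can be sketched in the same way. This is an illustrative NumPy implementation under the assumption of a small dense weights matrix; `gearys_c` and the example data are hypothetical names and values chosen for the illustration.

```python
import numpy as np

def gearys_c(x, w):
    """Global Geary's C for values x (length N) and weights matrix w (N x N)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    diff_sq = (x[:, None] - x[None, :]) ** 2          # (x_i - x_j)^2 for all pairs
    numerator = (n - 1) * (w * diff_sq).sum()
    denominator = 2.0 * w.sum() * ((x - x.mean()) ** 2).sum()
    return numerator / denominator

# Hypothetical example: four areas along a line; adjacent areas are neighbors.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
c_clustered = gearys_c([1.0, 1.0, 5.0, 5.0], w)     # below 1: positive autocorrelation
c_alternating = gearys_c([1.0, 5.0, 1.0, 5.0], w)   # above 1: negative autocorrelation
```

Note that the direction of the scale is reversed relative to Moran's I: small values of C correspond to clustering of similar values.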

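Getis-Ord Gi* (Local measure)
The Getis-Ord $G_i^*$ statistic identifies hot spots and cold spots in spatial data. It is computed for each location $i$ as a z-score:
$G_i^* = \frac{\sum_{j=1}^{N} w_{ij} x_j - \bar{x} \sum_{j=1}^{N} w_{ij}}{S \sqrt{\frac{N\sum_{j=1}^{N} w_{ij}^2 - \left(\sum_{j=1}^{N} w_{ij}\right)^2}{N-1}}}$, with $S = \sqrt{\frac{\sum_{j=1}^{N} x_j^2}{N} - \bar{x}^2}$
where the weights include the location itself ($w_{ii} \neq 0$). Hot spots are locations with large positive $G_i^*$ values (high values surrounded by high values), and cold spots are locations with large negative $G_i^*$ values (low values surrounded by low values).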
⚠️ Getis-Ord $G_i^*$ identifies local clusters of high or low values in the spatial data.
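A minimal NumPy sketch of the Getis-Ord $G_i^*$ z-score follows, assuming binary weights in which each location counts as its own neighbor. The function name `getis_ord_gi_star` and the five-area example are assumptions made for illustration.

```python
import numpy as np

def getis_ord_gi_star(x, w):
    """Gi* z-scores; w must include self-weights (non-zero diagonal)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    x_bar = x.mean()
    s = np.sqrt((x ** 2).sum() / n - x_bar ** 2)      # S in the formula
    w_sum = w.sum(axis=1)                             # sum_j w_ij
    w_sq_sum = (w ** 2).sum(axis=1)                   # sum_j w_ij^2
    numerator = w @ x - x_bar * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical example: five areas along a line, each neighboring itself
# and the adjacent areas.
w = np.eye(5) + np.eye(5, k=1) + np.eye(5, k=-1)
x = [10.0, 9.0, 2.0, 1.0, 1.0]      # high values on the left, low on the right
gi = getis_ord_gi_star(x, w)        # positive on the left, negative on the right
```

In practice the resulting z-scores are compared against a significance threshold before a location is labeled a hot spot or cold spot.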

Steps to Calculate Global Spatial Autocorrelation

Create a Spatial Weights Matrix

Spatial autocorrelation analysis relies on a spatial weights matrix that defines the spatial relationships between locations within the study area. The spatial weights matrix determines the neighbors of each location and quantifies the strength of their spatial relationship. Commonly used spatial weights matrices are either contiguity-based (e.g., Queen's or Rook's neighbors) or distance-based (e.g., inverse distance, kernel weights).
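To make the two families of weights concrete, the sketch below builds contiguity weights for a regular grid of cells and inverse-distance weights for point coordinates, then row-standardizes them. It assumes NumPy and a simple rectangular grid; real study areas with irregular polygons would normally rely on GIS software to derive neighbors. All function names here are hypothetical.

```python
import numpy as np

def grid_contiguity(n_rows, n_cols, rule="queen"):
    """Binary contiguity weights for a regular grid (row-major cell order)."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue                      # a cell is not its own neighbor
                    if rule == "rook" and dr != 0 and dc != 0:
                        continue                      # rook: shared edges only
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i, rr * n_cols + cc] = 1.0
    return w

def inverse_distance_weights(coords, power=1.0):
    """Weights proportional to 1 / distance^power; zero on the diagonal."""
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    w = np.zeros_like(d)
    mask = d > 0
    w[mask] = 1.0 / d[mask] ** power
    return w

def row_standardize(w):
    """Scale each row to sum to 1 (rows without neighbors stay zero)."""
    row_sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

rook = grid_contiguity(3, 3, rule="rook")
queen = grid_contiguity(3, 3, rule="queen")
queen_std = row_standardize(queen)
idw = inverse_distance_weights([(0.0, 0.0), (0.0, 2.0)])
```

On the 3 x 3 grid, the center cell has four rook neighbors but eight queen neighbors, which illustrates how the choice of contiguity rule changes the neighborhood definition.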

Select a Measure of Spatial Autocorrelation

Global spatial autocorrelation measures the overall degree of spatial clustering or dispersion in the health data across the entire urban area. It is commonly assessed using statistical measures such as Moran's I or Geary's C. These measures provide a summary statistic that indicates the presence and strength of spatial autocorrelation. Positive values of Moran's I indicate positive spatial autocorrelation (clustering), while negative values indicate negative spatial autocorrelation (dispersion).

Calculate the Measure of Spatial Autocorrelation

Use the selected measure to calculate spatial autocorrelation, applying the spatial weights matrix to your data. This step typically involves statistical software or GIS tools that can handle spatial data, such as QGIS and GeoDa.
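Software packages usually report a significance level together with the statistic. One common approach is a permutation test: the observed values are shuffled across locations many times and the observed Moran's I is compared with the resulting reference distribution. The sketch below illustrates the idea with NumPy; the function names, the number of permutations, and the 20-area example are assumptions made for illustration and do not reproduce the exact procedure of any specific tool.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x and weights matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    return (x.size / w.sum()) * (w * np.outer(z, z)).sum() / (z ** 2).sum()

def moran_permutation_test(x, w, n_perm=999, seed=0):
    """Observed Moran's I and a one-sided pseudo p-value for clustering."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = morans_i(x, w)
    # Reference distribution: Moran's I after randomly reassigning values.
    sims = np.array([morans_i(rng.permutation(x), w) for _ in range(n_perm)])
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_value

# Hypothetical example: 20 areas along a line with a steadily rising rate.
w = np.eye(20, k=1) + np.eye(20, k=-1)
rates = np.arange(20, dtype=float)
observed, p_value = moran_permutation_test(rates, w)
```

A small pseudo p-value suggests that the observed clustering is unlikely to arise from a random arrangement of the same values.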

Steps to Calculate Local Spatial Autocorrelation

Local spatial autocorrelation analysis explores the spatial clusters at a local level, identifying specific areas of high or low values of a health indicator. This is typically done using Local Indicators of Spatial Association (LISA), such as Local Moran's I or Getis-Ord Gi*. LISA generates spatially explicit maps that highlight statistically significant clusters (hotspots or coldspots) and identifies locations where the observed values deviate significantly from what would be expected under spatial randomness.
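As an illustration of a local indicator, the sketch below computes Local Moran's I for each location and assigns the usual Moran scatter plot quadrant (high-high, low-low, high-low, low-high). It assumes NumPy and a dense weights matrix; the function name and example data are hypothetical, and significance testing (normally done by permutation) is omitted for brevity.

```python
import numpy as np

def local_morans_i(x, w):
    """Local Moran's I values and scatter plot quadrants for each location."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    m2 = (z ** 2).sum() / x.size          # variance of x (divisor N)
    lag = w @ z                           # weighted sum of neighbors' deviations
    local_i = z * lag / m2
    quadrant = np.where(
        z >= 0,
        np.where(lag >= 0, "HH", "HL"),   # high value, high / low neighbors
        np.where(lag >= 0, "LH", "LL"),   # low value, high / low neighbors
    )
    return local_i, quadrant

# Hypothetical example: six areas along a line, high rates on the left.
w = np.eye(6, k=1) + np.eye(6, k=-1)
x = [9.0, 8.0, 9.0, 1.0, 2.0, 1.0]
local_i, quadrant = local_morans_i(x, w)
```

With this definition, the local values sum to the global Moran's I multiplied by the sum of all weights, which is the property that links local indicators to the global statistic.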

For more specific uses of local spatial autocorrelation in the context of urban health, please refer to Hot Spots Analysis and Co-location Analysis.

Implications for Urban Health

Understanding Spatial Patterns of Disease (univariate): By examining the spatial autocorrelation of health data, researchers can gain insights into the spatial patterns of diseases or health conditions within urban areas. This can help in understanding the underlying factors contributing to disease spread and identifying areas at higher risk.

Exploring Environmental Health Factors (bivariate): Spatial autocorrelation analysis can be used to investigate the relationship between health outcomes and health determinants such as air pollution, access to green spaces, or availability of healthy food options. By analyzing the spatial autocorrelation of both health and environmental data, researchers can identify areas where environmental factors may be influencing health outcomes.
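One way to express the bivariate case is a bivariate Moran's I, which relates one variable at each location to the spatially lagged values of a second variable in neighboring locations. The sketch below is a hedged illustration using NumPy, standardized variables, and row-standardized weights. The function name, the variable names (`pollution`, `asthma`), and the data are invented for the example, and the formula used (the slope of the spatial lag of y on x) is an assumption about the definition rather than a reference implementation.

```python
import numpy as np

def bivariate_morans_i(x, y, w):
    """Slope of the spatial lag of standardized y on standardized x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum(axis=1, keepdims=True)   # row-standardize (every row needs a neighbor)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return (zx * (w @ zy)).sum() / (zx ** 2).sum()

# Hypothetical example: six areas along a line.
w = np.eye(6, k=1) + np.eye(6, k=-1)
pollution = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
asthma = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]      # rises with pollution
i_same = bivariate_morans_i(pollution, asthma, w)          # positive
i_opposite = bivariate_morans_i(pollution, asthma[::-1], w)  # negative
```

A positive value means that areas with high pollution tend to be surrounded by areas with high asthma rates. The statistic does not by itself separate this spatial association from the ordinary correlation between the two variables at the same location.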

Tutorial (External)

Recommended software: GeoDa

Global Spatial Autocorrelation (1)

In this chapter, we will explore the analysis of global spatial autocorrelation measures, focusing on visualization. We will review the Moran scatter plot as a means to graphically express Moran’s I, as well as the non-parametric spatial correlogram and smoothed distance scatter plot to assess the magnitude and the range of spatial autocorrelation. We will continue with the Cleveland house sales data set that we used in the analysis of distance-based spatial weights.

https://geodacenter.github.io/workbook/5a_global_auto/lab5a.html