Abstract
Physical or geographic location proves to be an important feature in many data science models, because many diverse natural and social phenomena have a spatial component. Spatial autocorrelation measures the extent to which locally adjacent observations of the same phenomenon are correlated. Although statistics like Moran’s I and Geary’s C are widely used to measure spatial autocorrelation, they are slow: all popular methods run in Ω(n²) time, rendering them unusable for large datasets, or for long time-courses with moderate numbers of points. We propose a new statistic, SA, based on the notion that the variance observed when merging pairs of nearby clusters should increase slowly for spatially autocorrelated variables. We give a linear-time algorithm to calculate SA for a variable with an input agglomeration order (available at https://github.com/aamgalan/spatial_autocorrelation). For a typical dataset of n ≈ 63,000 points, our SA measure can be computed in 1 second, versus 2 hours or more for Moran’s I and Geary’s C. Through simulation studies, we demonstrate that SA identifies spatial correlations in variables generated with a spatially dependent model half an order of magnitude earlier than either Moran’s I or Geary’s C. Finally, we prove several theoretical properties of SA: namely, that it behaves as a true correlation statistic and is invariant under addition of, or multiplication by, a constant.
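To make the quadratic cost of the baseline statistics concrete, the following is a minimal sketch of Moran’s I in plain Python. It is not the paper’s SA algorithm and is not taken from the linked repository. The function name `morans_i` and the inverse-distance weighting are assumptions chosen for illustration; other weighting schemes are common, and the points are assumed to be distinct. The nested loop over all ordered pairs is what produces the n² term.

```python
import math


def morans_i(points, values):
    """Moran's I with inverse-distance weights (illustrative assumption).

    points: list of (x, y) tuples, assumed pairwise distinct
    values: list of observations, one per point
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    weighted_cross = 0.0  # sum over i != j of w_ij * dev_i * dev_j
    weight_total = 0.0    # sum over i != j of w_ij

    # Every ordered pair of points is visited once: quadratic work.
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(points[i], points[j])
            weighted_cross += w * dev[i] * dev[j]
            weight_total += w

    variance_sum = sum(d * d for d in dev)
    return (n / weight_total) * (weighted_cross / variance_sum)


# Hypothetical toy data: four points on a line.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
smooth = [0.0, 1.0, 2.0, 3.0]        # values vary gradually along the line
alternating = [0.0, 1.0, 0.0, 1.0]   # neighbouring values disagree

i_smooth = morans_i(pts, smooth)
i_alternating = morans_i(pts, alternating)
i_shifted = morans_i(pts, [3.0 * v + 7.0 for v in smooth])
```

In this sketch the gradually varying variable scores higher than the alternating one, and rescaling the variable by a constant factor and offset leaves the statistic unchanged up to floating-point error, because the numerator and denominator scale by the same factor. This is the kind of invariance the abstract states for SA.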
| Original language | English |
|---|---|
| Pages (from-to) | 919-941 |
| Number of pages | 23 |
| Journal | Knowledge and Information Systems |
| Volume | 64 |
| Issue number | 4 |
| DOIs | |
| State | Published - Apr 2022 |
Keywords
- Algorithm design and analysis
- Autocorrelation
- Biomedical informatics
- Clustering algorithms
- Computational efficiency
- Magnetic resonance
Fingerprint
Dive into the research topics of 'Fast spatial autocorrelation'. Together they form a unique fingerprint.