unsprawl.spatial¶
Spatial analysis module for MRT accessibility scoring.
This module provides geospatial analysis capabilities for computing MRT accessibility scores using KDTree-based nearest-neighbor queries.
Attributes¶
Classes¶
TransportScorer | Compute MRT accessibility scores using spatial nearest-neighbor queries.
Module Contents¶
- KDTree = None¶
- class TransportScorer(stations_df=None, cache_dir=None)[source]¶
Compute MRT accessibility scores using spatial nearest-neighbor queries.
This scorer loads a catalog of MRT station coordinates, strictly excluding all LRT stations using a regex filter ‘^(BP|S[WE]|P[WE])’. The pattern matches the line codes for Bukit Panjang (BP), Sengkang (SW/SE), and Punggol (PW/PE) LRT loops, ensuring that only heavy rail stations are retained.
A KDTree (from scikit-learn) provides vectorized nearest-neighbor lookups, so thousands of records can be queried in a single call without Python-level loops.
Accessibility score definition¶
score = max(0, 10 - (dist_km * 2)), where dist_km is the Euclidean distance in kilometers from the HDB listing coordinate to the nearest MRT station in the filtered catalog. The score therefore falls linearly from 10 at the station to 0 at 5 km and beyond.
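As a quick illustration of the formula above, the following sketch evaluates the score for a few distances. The function name `accessibility_score` is hypothetical and is not part of the module's API; it only restates the documented formula.

```python
def accessibility_score(dist_km: float) -> float:
    """Documented formula: max(0, 10 - (dist_km * 2))."""
    return max(0.0, 10.0 - dist_km * 2.0)


# A listing 500 m from a station loses one point; anything at 5 km
# or farther is clamped to zero.
for dist_km in (0.0, 0.5, 2.5, 5.0, 8.0):
    print(f"{dist_km:>4} km -> {accessibility_score(dist_km)}")
```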
- logger¶
- _cache_dir¶
- static _exclude_lrt(df)[source]¶
Exclude LRT stations using strict regex on line codes.
Excludes station rows whose line_code matches ‘^(BP|S[WE]|P[WE])’.
Column expectations:
- name: station name (str)
- line_code: string line code such as ‘NS’, ‘EW’, ‘DT’, ‘CC’, ‘BP’, ‘SW’
- lat, lon: numeric coordinates in degrees
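The effect of the pattern can be checked with the standard library alone. This is a minimal sketch, not the module's implementation: it uses plain dictionaries instead of a DataFrame, the helper `exclude_lrt` is hypothetical, and the sample rows are illustrative.

```python
import re

# Pattern quoted in the documentation above. It is anchored at the start,
# so 'EW' (East-West MRT line) is kept while 'SE' and 'SW' are excluded.
LRT_PATTERN = re.compile(r"^(BP|S[WE]|P[WE])")

# Illustrative rows only; coordinates omitted for brevity.
stations = [
    {"name": "Bishan", "line_code": "NS"},
    {"name": "Bedok", "line_code": "EW"},
    {"name": "Senja", "line_code": "BP"},
    {"name": "Compassvale", "line_code": "SE"},
    {"name": "Cove", "line_code": "PE"},
]


def exclude_lrt(rows):
    """Keep only rows whose line_code does not match the LRT pattern."""
    return [row for row in rows if not LRT_PATTERN.match(row["line_code"])]


heavy_rail = exclude_lrt(stations)
print([row["name"] for row in heavy_rail])
```

With a pandas DataFrame, the same filter would typically be expressed as a boolean mask built from `str.match` on the `line_code` column.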
- load_stations(stations_df)[source]¶
Load station catalog, exclude LRT, and build KDTree index.
- Parameters:
stations_df (pd.DataFrame) – DataFrame with columns: [‘name’, ‘line_code’, ‘lat’, ‘lon’].
- load_stations_geojson(path)[source]¶
Load MRT stations from an LTA Exit GeoJSON file and build KDTree.
The GeoJSON is expected to be a FeatureCollection where each feature is a station exit with properties containing station information. This loader will:
- Extract station name and line code from common property keys.
- Preserve robust fallback logic for station name parsing across GeoJSON variants (STATION_NA / STN_NAME / STN_NAM / NAME / etc.).
- Strictly exclude LRT using the regex ‘^(BP|S[WE]|P[WE])’ on line codes when available, and additionally filter out any stations with ‘LRT’ in the name as a safety fallback.
- Build a KDTree over exit coordinates (lon, lat). Using exits provides accurate pedestrian access points for distance calculations.
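The name fallback and the ‘LRT’ name filter described above might look roughly like the sketch below. It is an assumption-laden illustration, not the loader itself: the helpers `station_name` and `parse_exits` are hypothetical, the sample features and coordinates are made up, and the line-code regex and KDTree steps are left out.

```python
import json

# Property keys named in the documentation, tried in order.
NAME_KEYS = ("STATION_NA", "STN_NAME", "STN_NAM", "NAME")


def station_name(props):
    """Return the first non-empty station name found, or None."""
    for key in NAME_KEYS:
        value = props.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None


def parse_exits(geojson_text):
    """Collect (name, lon, lat) rows for non-LRT Point features."""
    collection = json.loads(geojson_text)
    rows = []
    for feature in collection.get("features", []):
        name = station_name(feature.get("properties") or {})
        geometry = feature.get("geometry") or {}
        if name is None or geometry.get("type") != "Point":
            continue
        if "LRT" in name.upper():  # name-based safety fallback
            continue
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is lon, lat
        rows.append({"name": name, "lon": lon, "lat": lat})
    return rows


# Illustrative input: two features using different name keys.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"STATION_NA": "BISHAN MRT STATION"},
         "geometry": {"type": "Point", "coordinates": [103.848, 1.351]}},
        {"type": "Feature",
         "properties": {"STN_NAME": "SENJA LRT STATION"},
         "geometry": {"type": "Point", "coordinates": [103.762, 1.383]}},
    ],
})
exits = parse_exits(sample)
print(exits)
```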
- Parameters:
path (str) – Path to the GeoJSON file.
- static _haversine_meters(latlon1, latlon2)[source]¶
Compute haversine distance in meters between arrays of points.
- Parameters:
latlon1 (np.ndarray) – Array of shape (n, 2) with columns [lat_rad, lon_rad] in radians.
latlon2 (np.ndarray) – Array of shape (n, 2) with columns [lat_rad, lon_rad] in radians.
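For reference, the haversine formula over paired points can be sketched with the standard library as follows. This is not the module's code: it uses lists of (lat_rad, lon_rad) tuples rather than NumPy arrays, the function name is hypothetical, and the Earth radius of 6,371,000 m is an assumed mean value that the module may not share.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in meters


def haversine_meters(latlon1, latlon2):
    """Pairwise great-circle distances in meters.

    Both arguments are equal-length sequences of (lat_rad, lon_rad)
    pairs in radians, mirroring the documented (n, 2) array layout.
    """
    distances = []
    for (lat1, lon1), (lat2, lon2) in zip(latlon1, latlon2):
        dlat = lat2 - lat1
        dlon = lon2 - lon1
        a = (math.sin(dlat / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
        # Clamp to 1.0 to guard against floating-point overshoot.
        distances.append(2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a))))
    return distances


# One degree of latitude along a meridian, and a zero-length pair.
origin = (math.radians(1.0), math.radians(103.0))
north = (math.radians(2.0), math.radians(103.0))
result = haversine_meters([origin, origin], [north, origin])
print(result)
```

Inputs must already be in radians; passing degrees directly would give silently wrong distances.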
- calculate_accessibility_score(df)[source]¶
Annotate DataFrame with nearest MRT and accessibility score.
Adds columns:
- Nearest_MRT: name of nearest heavy-rail MRT station
- Dist_m: distance to nearest station in meters
- Accessibility_Score: score = max(0, 10 - (dist_km * 2))
Expectations: Input df must have ‘lat’ and ‘lon’ columns (degrees).