unsprawl.providers.data.sg.govsg


GovSGProvider is a dumb fetcher for Singapore open datasets.

Responsibilities

  • Call the data.gov.sg initiate-download endpoint to obtain a temporary download URL

  • Download the dataset artifact to the local cache (~/.unsprawl/data)

  • Fall back to synthetic datasets if the network/API fails

Non-responsibilities

  • No knowledge of Asset or the core simulation schema.

  • No Singapore-specific valuation logic.

NOTE(judge)

This provider is a canonical example of our global platform architecture: API I/O is isolated from normalization (adapter layer) and from physics (core).

Attributes

Classes

GovSGDatasetIds

Dataset identifiers for data.gov.sg initiate-download API.

GovSGProvider

Provider for Singapore open data via data.gov.sg.

Functions

_default_data_root()

Return the default dataset cache root (~/.unsprawl/data).

_api_get_download_url(dataset_id, *[, timeout_s])

Call the initiate-download endpoint and return the temporary download URL.

_download_file(url, dest_path, *[, timeout_s])

Stream a remote file to disk.

_generate_synthetic_resale_csv(path, *[, limit])

Generate a synthetic resale dataset for deterministic offline usage.

_generate_synthetic_mrt_geojson(path)

Generate a synthetic MRT exits GeoJSON.

Module Contents

log
class GovSGDatasetIds

Dataset identifiers for data.gov.sg initiate-download API.

hdb_resale: str = 'd_8b84c4ee58e3cfc0ece0d773c8ca6abc'
mrt_exits: str = 'd_b39d3a0871985372d7e1637193335da5'
EXPECTED_RESALE_COLUMNS
_default_data_root()

Return the default dataset cache root (~/.unsprawl/data).

Uses ~ expansion to remain portable across OSes.
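A minimal sketch of the ~ expansion described above, assuming the function simply expands a fixed path with pathlib. The name default_data_root is a stand-in for the private helper, and whether the real function also creates the directory is not stated here.

```python
from pathlib import Path


def default_data_root() -> Path:
    """Return ~/.unsprawl/data with the leading ~ expanded to the user's home."""
    # expanduser() resolves "~" from the environment (HOME on POSIX,
    # USERPROFILE on Windows), so no platform-specific branching is needed.
    return Path("~/.unsprawl/data").expanduser()


root = default_data_root()
print(root)
```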

_api_get_download_url(dataset_id, *, timeout_s=30.0)

Call the initiate-download endpoint and return the temporary download URL.
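The request/response step could look roughly like the sketch below. The endpoint template and the data.url field in the JSON payload are assumptions about the public data.gov.sg API and should be checked against its current documentation. The function name and the injectable get_json parameter are hypothetical and exist only so the parsing logic can be exercised without network access.

```python
import json
import urllib.request
from typing import Any, Callable

# Assumed endpoint template; not confirmed by this module's documentation.
INITIATE_DOWNLOAD_URL = (
    "https://api-open.data.gov.sg/v1/public/api/datasets/"
    "{dataset_id}/initiate-download"
)


def _http_get_json(url: str, timeout_s: float) -> Any:
    """Perform a GET request and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.load(resp)


def api_get_download_url(
    dataset_id: str,
    *,
    timeout_s: float = 30.0,
    get_json: Callable[[str, float], Any] = _http_get_json,  # hypothetical hook
) -> str:
    """Ask the API for a temporary download URL for one dataset."""
    payload = get_json(INITIATE_DOWNLOAD_URL.format(dataset_id=dataset_id), timeout_s)
    # Assumed payload shape: {"data": {"url": "..."}}
    url = (payload.get("data") or {}).get("url")
    if not url:
        raise RuntimeError(f"No download URL returned for dataset {dataset_id!r}")
    return url
```

Raising on a missing URL lets the caller treat an API-level failure the same way as a network error when deciding whether to fall back.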

_download_file(url, dest_path, *, timeout_s=30.0)

Stream a remote file to disk.
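One way to stream a file to disk with only the standard library is sketched below. It is an illustration, not the module's actual implementation: the name download_file, the 1 MiB chunk size, and the write-to-".part"-then-rename step are all assumptions.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest_path: Path, *, timeout_s: float = 30.0) -> Path:
    """Copy the response body to dest_path in chunks, without loading it all in memory."""
    dest_path = Path(dest_path)
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary sibling first so an interrupted download
    # never leaves a truncated file at the final cache location.
    tmp_path = dest_path.with_name(dest_path.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout_s) as resp, open(tmp_path, "wb") as out:
        shutil.copyfileobj(resp, out, length=1024 * 1024)
    tmp_path.replace(dest_path)
    return dest_path
```

Because urllib also handles file:// URLs, the same function can be tried against a local file before pointing it at a real download URL.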

_generate_synthetic_resale_csv(path, *, limit=5000)

Generate a synthetic resale dataset for deterministic offline usage.
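Deterministic generation usually comes down to a seeded random number generator. The sketch below shows that idea under stated assumptions: the column names are an illustrative subset and are not taken from EXPECTED_RESALE_COLUMNS, the value ranges are made up, and the seed parameter is hypothetical.

```python
import csv
import random
from pathlib import Path

# Illustrative columns only; the real schema is defined by EXPECTED_RESALE_COLUMNS.
COLUMNS = ["month", "town", "flat_type", "floor_area_sqm", "resale_price"]


def generate_synthetic_resale_csv(path: Path, *, limit: int = 5000, seed: int = 0) -> Path:
    """Write `limit` synthetic rows; the same seed always yields the same file."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    towns = ["ANG MO KIO", "BEDOK", "TAMPINES"]
    base_area = {"3 ROOM": 67, "4 ROOM": 93, "5 ROOM": 110}
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(COLUMNS)
        for _ in range(limit):
            flat_type = rng.choice(sorted(base_area))
            area = base_area[flat_type] + rng.randint(-5, 5)
            # Made-up price model: area times a per-sqm rate, rounded to the nearest 1000.
            price = int(round(area * rng.uniform(4500, 7500), -3))
            month = f"{rng.randint(2017, 2023)}-{rng.randint(1, 12):02d}"
            writer.writerow([month, rng.choice(towns), flat_type, area, price])
    return path
```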

_generate_synthetic_mrt_geojson(path)

Generate a synthetic MRT exits GeoJSON.
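A synthetic stand-in only needs to be a valid GeoJSON FeatureCollection of Point features. The sketch below assumes that shape. The station names, coordinates, and the "station" and "exit" property keys are invented for illustration and do not mirror the real dataset's fields.

```python
import json
from pathlib import Path


def generate_synthetic_mrt_geojson(path: Path) -> Path:
    """Write a tiny FeatureCollection of Point features (GeoJSON order: lon, lat)."""
    # Invented sample points, roughly within Singapore's bounding box.
    exits = [
        ("SYNTHETIC NORTH", "A", 103.82, 1.42),
        ("SYNTHETIC CENTRAL", "A", 103.85, 1.30),
        ("SYNTHETIC EAST", "B", 103.94, 1.35),
    ]
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"station": name, "exit": exit_code},  # hypothetical keys
        }
        for name, exit_code, lon, lat in exits
    ]
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps({"type": "FeatureCollection", "features": features}, indent=2),
        encoding="utf-8",
    )
    return path
```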

class GovSGProvider(*, data_root=None, dataset_ids=None)

Provider for Singapore open data via data.gov.sg.

data_root
dataset_ids
property resale_prices_path: Path

Default cache location for resale prices CSV.

property mrt_exits_path: Path

Default cache location for MRT exits GeoJSON.

fetch_resale_prices(*, limit=5000, force=False)

Fetch resale prices dataset as a DataFrame.

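Uses the cached file when one exists (unless force is set). Otherwise tries the network first and falls back to a synthetic dataset if the download fails.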

Parameters:
  • limit – Row cap for synthetic fallback.

  • force – If True, re-download/re-generate even if cache exists.
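The cache, network, and fallback order described above can be sketched as a small helper. This is an assumption about the control flow, not the method's actual code: fetch_with_fallback and its download and synthesize callables are hypothetical, and the final step of loading the file into a DataFrame is left out to keep the example to the standard library.

```python
import logging
from pathlib import Path
from typing import Callable

log = logging.getLogger(__name__)


def fetch_with_fallback(
    cache_path: Path,
    download: Callable[[Path], None],
    synthesize: Callable[[Path], None],
    *,
    force: bool = False,
) -> Path:
    """Return a usable file at cache_path: cached, downloaded, or synthetic."""
    cache_path = Path(cache_path)
    # 1. Reuse the cache unless the caller explicitly forces a refresh.
    if cache_path.exists() and not force:
        return cache_path
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # 2. Network first.
        download(cache_path)
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        # 3. Synthetic fallback keeps offline runs working.
        log.warning("Download failed (%s); generating synthetic data", exc)
        synthesize(cache_path)
    return cache_path
```

Passing the two steps in as callables keeps the ordering logic independent of any particular dataset, so the same helper could serve both the CSV and the GeoJSON path.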

fetch_mrt_exits(*, force=False)

Fetch MRT exits GeoJSON as a dict.

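Uses the cached file when one exists (unless force is set). Otherwise tries the network first and falls back to a synthetic dataset if the download fails.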
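Since the result is a plain dict, downstream code can walk it with ordinary dictionary access. The helper below is a hedged example of that, assuming the dict is a standard GeoJSON FeatureCollection of Point features. The name exit_coordinates is hypothetical and not part of this module.

```python
from __future__ import annotations


def exit_coordinates(geojson: dict) -> list[tuple[float, float]]:
    """Collect (lon, lat) pairs from every Point feature in a FeatureCollection."""
    coords: list[tuple[float, float]] = []
    for feature in geojson.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") == "Point":
            # GeoJSON positions are [lon, lat] with an optional third altitude value.
            lon, lat = geometry["coordinates"][:2]
            coords.append((lon, lat))
    return coords


sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [103.85, 1.30]}, "properties": {}},
        {"type": "Feature", "geometry": None, "properties": {}},
    ],
}
print(exit_coordinates(sample))
```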