unsprawl.fetch
Data fetching utilities for downloading HDB and MRT datasets.
This module handles downloading datasets from Data.gov.sg APIs and provides fallback synthetic data generation when official sources are unavailable.
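The synthetic fallback mentioned above can be pictured with a minimal sketch. It is not the module's own generator: the function name `write_synthetic_hdb`, the column list, and the value ranges are all assumptions made for illustration, and the real `EXPECTED_HDB_COLUMNS` may differ.

```python
import csv
import random
from pathlib import Path

# Hypothetical column set; the module's real EXPECTED_HDB_COLUMNS may differ.
SYNTHETIC_HDB_COLUMNS = ["month", "town", "flat_type", "floor_area_sqm", "resale_price"]


def write_synthetic_hdb(path, n_rows=100, seed=0):
    """Write n_rows of made-up resale records to a CSV file and return its path."""
    rng = random.Random(seed)  # seeded so test fixtures are reproducible
    towns = ["ANG MO KIO", "BEDOK", "TAMPINES"]
    flat_types = ["3 ROOM", "4 ROOM", "5 ROOM"]
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=SYNTHETIC_HDB_COLUMNS)
        writer.writeheader()
        for _ in range(n_rows):
            writer.writerow({
                "month": f"{rng.randint(2017, 2023)}-{rng.randint(1, 12):02d}",
                "town": rng.choice(towns),
                "flat_type": rng.choice(flat_types),
                "floor_area_sqm": rng.randint(60, 130),
                "resale_price": rng.randint(250_000, 900_000),
            })
    return out
```

Seeding the generator keeps the output stable between runs, which is usually what a test fixture needs.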
Attributes
Functions
- validate_hdb_schema(path): Validate that a CSV file has the expected HDB resale schema.
- validate_mrt_schema(path): Validate that a GeoJSON or CSV file has the expected MRT schema.
- prompt_user_download(dataset_name): Prompt user for permission to download a dataset.
- api_get_download_url(dataset_id, verbose=0): Hit the initiate-download endpoint to get the temporary S3 URL.
- Generate synthetic HDB resale dataset for testing.
- Generate synthetic MRT GeoJSON for testing.
- Stream the file from the S3 URL to disk.
- fetch_hdb_data(limit, out_dir, filename, verbose=0): Fetch HDB resale dataset using the official Data.gov.sg 'initiate-download' API.
- Fetch MRT GeoJSON dataset.
- ensure_hdb_dataset(path, verbose=0): Ensure HDB dataset exists and has valid schema, downloading if necessary.
- ensure_mrt_dataset(path, verbose=0): Ensure MRT dataset exists and has valid schema, downloading if necessary.
Module Contents
- DATASET_IDS
- DEFAULT_HDB_PATH
- DEFAULT_MRT_PATH
- EXPECTED_HDB_COLUMNS
- EXPECTED_MRT_COLUMNS
- validate_hdb_schema(path)
Validate that a CSV file has the expected HDB resale schema.
- Parameters:
path (str) – Path to the CSV file to validate.
- Returns:
True if the file exists and has the expected columns, False otherwise.
- Return type:
bool
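A header check of this kind can be sketched with the standard `csv` module. This is an illustration, not the module's implementation: `has_expected_columns` and `REQUIRED_COLUMNS` are hypothetical names, and the columns listed are an assumption standing in for `EXPECTED_HDB_COLUMNS`.

```python
import csv
from pathlib import Path

# Hypothetical required columns; a stand-in for EXPECTED_HDB_COLUMNS.
REQUIRED_COLUMNS = {"month", "town", "flat_type", "resale_price"}


def has_expected_columns(path, required=REQUIRED_COLUMNS):
    """Return True if path is a readable CSV whose header contains every required column."""
    p = Path(path)
    if not p.is_file():
        return False
    try:
        with p.open(newline="", encoding="utf-8") as fh:
            header = next(csv.reader(fh), None)  # only the first row is read
    except (OSError, UnicodeDecodeError, csv.Error):
        return False
    if header is None:  # empty file
        return False
    return set(required).issubset(col.strip() for col in header)
```

Reading only the header row keeps the check cheap even for a large resale file; extra columns are tolerated, missing ones are not.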
- validate_mrt_schema(path)
Validate that a GeoJSON or CSV file has the expected MRT schema.
- Parameters:
path (str) – Path to the GeoJSON or CSV file to validate.
- Returns:
True if the file exists and has the expected structure, False otherwise.
- Return type:
bool
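For the GeoJSON case, a structural check might look like the sketch below. The function name `looks_like_station_geojson` and the required `name` property are assumptions; the actual contents of `EXPECTED_MRT_COLUMNS` are not shown here, and the CSV branch is omitted.

```python
import json
from pathlib import Path


def looks_like_station_geojson(path, required_props=("name",)):
    """Return True if path holds a FeatureCollection whose features carry the required properties."""
    p = Path(path)
    if not p.is_file():
        return False
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # ValueError covers JSON and decoding errors
        return False
    if not isinstance(data, dict) or data.get("type") != "FeatureCollection":
        return False
    features = data.get("features")
    if not isinstance(features, list) or not features:
        return False
    for feature in features:
        if not isinstance(feature, dict):
            return False
        props = feature.get("properties") or {}
        # Every feature needs a geometry and all required property keys.
        if feature.get("geometry") is None or not all(k in props for k in required_props):
            return False
    return True
```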
- prompt_user_download(dataset_name)
Prompt user for permission to download a dataset.
- Parameters:
dataset_name (str) – Name of the dataset to download (“HDB resale data” or “MRT stations data”).
- Returns:
True if user approves download, False otherwise.
- Return type:
bool
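A consent prompt like this is commonly written so that anything other than an explicit yes counts as a refusal. The sketch below shows that pattern; `ask_yes_no` and its `reader` parameter are hypothetical, and the module's actual prompt wording is not documented here.

```python
def ask_yes_no(question, reader=input):
    """Ask a yes/no question; only an explicit 'y' or 'yes' counts as consent."""
    try:
        answer = reader(f"{question} [y/N]: ")
    except EOFError:
        return False  # non-interactive session: treat as declined
    return answer.strip().lower() in {"y", "yes"}
```

Passing the reader as a parameter makes the function testable without a terminal, for example `ask_yes_no("Download HDB resale data?", reader=lambda _: "y")`.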
- api_get_download_url(dataset_id, verbose=0)
Hit the initiate-download endpoint to get the temporary S3 URL.
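The request and response handling could be sketched as follows. Everything specific here is an assumption rather than something this page confirms: the endpoint template, the `{"data": {"url": ...}}` response shape, and the helper names `extract_download_url` and `request_download_url`. Check the Data.gov.sg API documentation before relying on any of it.

```python
import json
from urllib.request import urlopen

# Assumed endpoint template; not confirmed by this page.
INITIATE_URL = "https://api-open.data.gov.sg/v1/public/api/datasets/{dataset_id}/initiate-download"


def extract_download_url(payload):
    """Pull the temporary URL out of a decoded response, or return None.

    Assumes a body shaped like {"data": {"url": "..."}}.
    """
    data = payload.get("data") if isinstance(payload, dict) else None
    url = data.get("url") if isinstance(data, dict) else None
    return url if isinstance(url, str) and url else None


def request_download_url(dataset_id, opener=urlopen, timeout=30):
    """Call the initiate-download endpoint and return the temporary URL, or None."""
    with opener(INITIATE_URL.format(dataset_id=dataset_id), timeout=timeout) as resp:
        return extract_download_url(json.load(resp))
```

Keeping the parsing in its own function lets the response handling be tested without network access.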
- fetch_hdb_data(limit, out_dir, filename, verbose=0)
Fetch HDB resale dataset using the official Data.gov.sg ‘initiate-download’ API.
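Once a temporary URL is known, the body is typically streamed to disk in chunks rather than loaded into memory. The sketch below illustrates that step under assumptions: `stream_to_file` is a hypothetical helper, and writing to a `.part` file before renaming is a design choice of this example, not documented behavior of the module.

```python
import shutil
from pathlib import Path


def stream_to_file(source, dest, chunk_size=1024 * 1024):
    """Copy a binary stream to dest in fixed-size chunks and return the byte count."""
    out = Path(dest)
    out.parent.mkdir(parents=True, exist_ok=True)
    tmp = out.with_name(out.name + ".part")  # write to a temporary name first
    with tmp.open("wb") as fh:
        shutil.copyfileobj(source, fh, chunk_size)
    tmp.replace(out)  # rename only after the full body has been written
    return out.stat().st_size
```

Any binary file-like object works as `source`, including the response returned by `urllib.request.urlopen(url)`. The rename at the end means an interrupted download never leaves a truncated file at the final path.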
- ensure_hdb_dataset(path, verbose=0)
Ensure HDB dataset exists and has valid schema, downloading if necessary.
- Parameters:
path (str) – Path where the HDB dataset should exist.
verbose (int) – Verbosity level for logging.
- Returns:
True if dataset is available and valid, False if user declined download or download failed.
- Return type:
bool
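The documented contract (valid file present, otherwise ask, download, and re-validate) can be sketched as a small control flow. `ensure_dataset` and its three callable parameters are hypothetical stand-ins for the module's validator, prompt, and fetch functions; the real function may order or handle these steps differently.

```python
from pathlib import Path


def ensure_dataset(path, validate, confirm, download):
    """Return True once a valid file exists at path, fetching it if the user agrees.

    validate, confirm and download are caller-supplied callables (hypothetical
    stand-ins for the module's validator, prompt and fetch functions).
    """
    if Path(path).exists() and validate(path):
        return True  # already present and valid: nothing to do
    if not confirm():
        return False  # user declined the download
    try:
        download(path)
    except OSError:  # also covers urllib.error.URLError
        return False  # network or disk failure
    return bool(validate(path))  # re-check what was downloaded
```

Re-validating after the download matters because a successful transfer does not guarantee the file has the expected schema.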
- ensure_mrt_dataset(path, verbose=0)
Ensure MRT dataset exists and has valid schema, downloading if necessary.
- Parameters:
path (str) – Path where the MRT dataset should exist.
verbose (int) – Verbosity level for logging.
- Returns:
True if dataset is available and valid, False if user declined download or download failed.
- Return type:
bool