mastr

Download Marktstammdatenregister (MaStR) from Zenodo.

download_mastr_data()[source]: Download MaStR data from Zenodo.

download_mastr_geocoding()[source]: Download MaStR_geocoding data from Zenodo.

extract_and_preprocess_mastr()[source]

Extract the downloaded MaStR dump and create cleaned, schema-aligned CSVs.

This routine expects a MaStR ZIP archive (downloaded by download_mastr_data()) to be present in WORKING_DIR_MASTR_NEW. It unpacks the archive, reads the raw CSV files shipped in the dump, applies a set of harmonization steps (column renaming, categorical normalization, data enrichment), and writes the cleaned CSVs back to the same directory. The function performs the following steps:

Locate and extract the MaStR ZIP
Read raw CSVs from the extracted dump folder: bnetza_mastr_wind_raw.csv, bnetza_mastr_solar_raw.csv, bnetza_mastr_biomass_raw.csv, bnetza_mastr_hydro_raw.csv, bnetza_mastr_gsgk_raw.csv, bnetza_mastr_storage_raw.csv, bnetza_mastr_combustion_raw.csv, bnetza_mastr_nuclear_raw.csv, bnetza_mastr_locations_extended_raw.csv, bnetza_mastr_grid_connections_raw.csv

Voltage-level enrichment for locations
Solar-specific fixes
Common harmonization across technologies
Write cleaned outputs (UTF-8, no index) to WORKING_DIR_MASTR_NEW: bnetza_mastr_wind_cleaned.csv, bnetza_mastr_solar_cleaned.csv, bnetza_mastr_biomass_cleaned.csv, bnetza_mastr_hydro_cleaned.csv, bnetza_mastr_gsgk_cleaned.csv, bnetza_mastr_storage_cleaned.csv, bnetza_mastr_combustion_cleaned.csv, bnetza_mastr_nuclear_cleaned.csv

Returns: None. Results are written to disk as CSV files (see list above).
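
The extract, read, harmonize, and write sequence described for extract_and_preprocess_mastr() can be illustrated with a minimal sketch. This is not the actual implementation: the helper name extract_and_clean, the "extracted" subfolder, and the RENAME column mapping are hypothetical, and only the raw and cleaned file-name patterns are taken from the lists above. It uses the standard library csv module to stay self-contained; the real routine may well use a DataFrame library instead, and its voltage-level and solar-specific steps are not reproduced here.

```python
import csv
import zipfile
from pathlib import Path

# Hypothetical column mapping; the real harmonization rules are not shown here.
RENAME = {"EinheitMastrNummer": "unit_id", "Nettonennleistung": "capacity_kw"}


def extract_and_clean(zip_path, work_dir, technologies, rename=RENAME):
    """Unpack the archive and write one cleaned CSV per technology found."""
    work_dir = Path(work_dir)
    extract_dir = work_dir / "extracted"  # assumed location for unpacked files
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)

    written = []
    for tech in technologies:
        # Search recursively, since the dump may nest files in a subfolder.
        matches = sorted(extract_dir.rglob(f"bnetza_mastr_{tech}_raw.csv"))
        if not matches:
            continue  # technology not present in this dump

        with open(matches[0], newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src)
            # Rename known columns, keep all others unchanged.
            rows = [{rename.get(k, k): v for k, v in row.items()} for row in reader]
            fields = [rename.get(name, name) for name in (reader.fieldnames or [])]

        target = work_dir / f"bnetza_mastr_{tech}_cleaned.csv"
        with open(target, "w", newline="", encoding="utf-8") as dst:
            writer = csv.DictWriter(dst, fieldnames=fields)
            writer.writeheader()
            writer.writerows(rows)
        written.append(target)
    return written
```

Unlike the documented function, which returns None, this sketch returns the list of written paths so that a caller can check which technologies were actually present in the dump.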

class mastr_data_setup(dependencies)[source]

Bases: Dataset

Download Marktstammdatenregister (MaStR) from Zenodo.

Dependencies

Setup

The downloaded data incorporates two different datasets:

Dump 2021-04-30

Source: https://zenodo.org/records/10480930
Used technologies: PV plants, wind turbines, biomass, hydro plants, combustion, nuclear, gsgk, storage
Data is further processed in the PowerPlants dataset

Dump 2022-11-17

Source: https://zenodo.org/records/10480958
Used technologies: PV plants, wind turbines, biomass, hydro plants
Data is further processed in module mastr and PowerPlants

See documentation section Marktstammdatenregister for more information.

name: str = 'MastrData'

sources: DatasetSources = DatasetSources(tables={}, files={}, urls={'mastr': {'zenodo': {'deposit_id': '14783581', 'file_basename': 'bnetza_mastr', 'dump_name': 'bnetza_open_mastr_2025-02-09', 'technologies': ['biomass', 'combustion', 'gsgk', 'hydro', 'nuclear', 'solar', 'storage', 'wind']}}, 'geocoding': {'dump_name': 'mastr_geocoding_dump_2025-02-09_14783581.gpkg', 'deposit_id': 17279317}}): The sources used by the dataset. These can be tables, files, and URLs.

targets: DatasetTargets = DatasetTargets(tables={}, files={'mastr': {'download_dir': {'path': './bnetza_mastr/dump_2025-02-09'}}, 'geocoding': 'mastr_geocoding'}): The targets created by the dataset. These can be tables and files.
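
As a rough illustration of how the sources and targets entries could be combined for a download, the following sketch builds a Zenodo file URL and a local target path. The URL layout, the ".zip" suffix, and the helper names zenodo_file_url and plan_download are assumptions made for this example; the actual logic in download_mastr_data() may differ.

```python
from pathlib import Path

# Values copied from the `sources` and `targets` attributes above.
SOURCES = {"deposit_id": "14783581", "dump_name": "bnetza_open_mastr_2025-02-09"}
DOWNLOAD_DIR = Path("./bnetza_mastr/dump_2025-02-09")


def zenodo_file_url(deposit_id, filename):
    """Assumed URL layout for a file attached to a Zenodo record."""
    return f"https://zenodo.org/records/{deposit_id}/files/{filename}"


def plan_download(sources=SOURCES, download_dir=DOWNLOAD_DIR, suffix=".zip"):
    """Return (url, local_path); the archive suffix is an assumption."""
    filename = f"{sources['dump_name']}{suffix}"
    url = zenodo_file_url(sources["deposit_id"], filename)
    return url, Path(download_dir) / filename
```

With the pair returned by plan_download(), the file could then be fetched with, for example, urllib.request.urlretrieve(url, local_path) after creating the target directory. Keeping URL construction separate from the network call makes the path logic easy to check offline.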

tasks: Tasks = (<function download_mastr_data>, <function extract_and_preprocess_mastr>, <function download_mastr_geocoding>)

version: str = '0.0.4'