pv_rooftop_buildings¶

Distribute MaStR PV rooftop capacities to OSM and synthetic buildings. Generate new PV rooftop generators for scenarios eGon2035 and eGon100RE.

Data cleaning and inference:

* Drop duplicates and entries with missing critical data.
* Determine the most plausible capacity from the multiple values given in MaStR data.
* Drop generators which don’t have any plausible capacity data (23.5MW > P > 0.1).
* Add a start-up date by weighted random sampling if it is missing.
* Extract zip code and municipality from ‘Standort’ given in MaStR data.
* Geocode unique zip code and municipality combinations with Nominatim (1 sec delay). Drop generators for which geocoding failed or which are located outside the municipalities of Germany.
* Add some visual sanity checks for cleaned data.

Allocation of MaStR data:

* Allocate each generator to an existing building from OSM.
* Determine the quantile each generator and building is in, depending on the capacity of the generator and the area of the polygon of the building.
* Randomly distribute generators within each municipality, preferably to buildings in the same area quantile as the capacity quantile of the generator.
* If not enough buildings exist within a municipality and quantile, additional buildings from other quantiles are chosen randomly.
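The allocation steps above can be sketched for a single municipality as follows. This is a minimal illustration under stated assumptions, not the module's implementation: the function name `allocate_by_quantile` and the plain dictionaries standing in for the (Geo)DataFrames are hypothetical.

```python
import random


def allocate_by_quantile(gens, buildings, seed):
    """Assign each generator to one unused building (hypothetical helper).

    gens: dict mapping generator ID -> capacity quantile
    buildings: dict mapping building ID -> area quantile
    Buildings in the same quantile are preferred; generators that find no
    free building there fall back to random buildings from other quantiles.
    """
    rng = random.Random(seed)

    # Group free buildings by quantile and shuffle each group.
    free = {}
    for b_id, quant in sorted(buildings.items()):
        free.setdefault(quant, []).append(b_id)
    for ids in free.values():
        rng.shuffle(ids)

    allocation = {}
    unmatched = []

    # First pass: same-quantile matches only.
    for g_id, quant in sorted(gens.items()):
        if free.get(quant):
            allocation[g_id] = free[quant].pop()
        else:
            unmatched.append(g_id)

    # Second pass: random fallback to whatever buildings are left.
    leftovers = [b_id for ids in free.values() for b_id in ids]
    rng.shuffle(leftovers)
    for g_id in unmatched:
        if not leftovers:
            break  # more generators than buildings: the rest stay unallocated
        allocation[g_id] = leftovers.pop()

    return allocation
```

Because every building is popped from the pool when it is used, no building receives more than one generator, which mirrors the "multiple assignment is excluded" rule.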

Disaggregation of pv rooftop scenarios:

* The scenario data per federal state is linearly distributed to the mv grid districts according to the pv rooftop potential per mv grid district.
* The rooftop potential is estimated from the building area given by the OSM buildings.
* Grid districts which are located in several federal states are allocated PV capacity according to their respective roof potential in the individual federal states.
* The disaggregation of PV plants within a grid district respects existing plants from MaStR which have not reached their end of life.
* New PV plants are generated by weighted random sampling, using a breakdown of MaStR data as generator basis.
* Plant metadata (e.g. plant orientation) is also added by weighted random sampling, with MaStR data as basis.
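The linear distribution of state-level capacity to grid districts could look roughly like the sketch below. The function name `distribute_state_capacity` and the dictionary layout are assumptions made for illustration; the sketch also assumes every federal state has a positive total rooftop potential.

```python
def distribute_state_capacity(state_caps, overlay_potential):
    """Distribute capacity per federal state to grid districts (sketch).

    state_caps: dict mapping federal state -> scenario capacity
    overlay_potential: dict mapping (federal state, grid district) ->
        rooftop potential of the part of the district inside that state
    Returns a dict mapping grid district -> allocated capacity.
    """
    # Total rooftop potential per federal state.
    state_totals = {}
    for (state, _district), potential in overlay_potential.items():
        state_totals[state] = state_totals.get(state, 0.0) + potential

    # Each overlay area receives the state's capacity in proportion to its
    # share of the state's potential; districts spanning several states sum
    # up the contributions from each state.
    district_caps = {}
    for (state, district), potential in overlay_potential.items():
        share = potential / state_totals[state]
        district_caps[district] = (
            district_caps.get(district, 0.0) + state_caps[state] * share
        )
    return district_caps
```

In this sketch a district that lies in two federal states simply accumulates one contribution per state, so the sum over all districts equals the sum of the state capacities.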

class EgonMastrPvRoofGeocoded(**kwargs)[source]¶

Bases: sqlalchemy.ext.declarative.api.Base

altitude¶

geometry¶

latitude¶

location¶

longitude¶

point¶

zip_and_municipality¶

class EgonPowerPlantPvRoofBuildingScenario(**kwargs)[source]¶

Bases: sqlalchemy.ext.declarative.api.Base

building_id¶

bus_id¶

capacity¶

einheitliche_ausrichtung_und_neigungswinkel¶

gens_id¶

hauptausrichtung¶

hauptausrichtung_neigungswinkel¶

index¶

scenario¶

voltage_level¶

weather_cell_id¶

class OsmBuildingsFiltered(**kwargs)[source]¶

Bases: sqlalchemy.ext.declarative.api.Base

amenity¶

area¶

building¶

geom¶

geom_point¶

id¶

name¶

osm_id¶

tags¶

class Vg250Lan(**kwargs)[source]¶

Bases: sqlalchemy.ext.declarative.api.Base

ade¶

ags¶

ags_0¶

ars¶

ars_0¶

bem¶

bez¶

bsg¶

debkg_id¶

fk_s3¶

gen¶

geometry¶

gf¶

ibz¶

id¶

nbd¶

nuts¶

rs¶

rs_0¶

sdv_ars¶

sdv_rs¶

sn_g¶

sn_k¶

sn_l¶

sn_r¶

sn_v1¶

sn_v2¶

wsk¶

add_ags_to_buildings(buildings_gdf: geopandas.geodataframe.GeoDataFrame, municipalities_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Add information about AGS ID to buildings.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
municipalities_gdf (geopandas.GeoDataFrame) – GeoDataFrame with municipality data.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with AGS ID added.

add_ags_to_gens(valid_mastr_gdf: geopandas.geodataframe.GeoDataFrame, municipalities_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Add information about AGS ID to generators.

Parameters:

valid_mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame with valid and cleaned MaStR data.
municipalities_gdf (geopandas.GeoDataFrame) – GeoDataFrame with municipality data.

Returns:

geopandas.GeoDataFrame – GeoDataFrame with valid and cleaned MaStR data with AGS ID added.

add_buildings_meta_data(buildings_gdf: geopandas.geodataframe.GeoDataFrame, prob_dict: dict, seed: int) → geopandas.geodataframe.GeoDataFrame[source]¶

Randomly add additional metadata to disaggregated PV plants.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data with disaggregated PV plants.
prob_dict (dict) – Dictionary with values and probabilities per capacity range.
seed (int) – Seed to use for random operations with NumPy and pandas.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM building data with disaggregated PV plants.

add_bus_ids_sq(buildings_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Add bus ids for status_quo units

Parameters:	buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data with disaggregated PV plants.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM building data with bus_id per generator.

add_overlay_id_to_buildings(buildings_gdf: geopandas.geodataframe.GeoDataFrame, grid_federal_state_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Add information about overlay ID to buildings.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
grid_federal_state_gdf (geopandas.GeoDataFrame) – GeoDataFrame with intersection shapes between counties and grid districts.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with overlay ID added.

add_start_up_date(buildings_gdf: geopandas.geodataframe.GeoDataFrame, start: pandas._libs.tslibs.timestamps.Timestamp, end: pandas._libs.tslibs.timestamps.Timestamp, seed: int)[source]¶

Randomly and linearly add a start-up date to new pv generators.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data with disaggregated PV plants.
start (pandas.Timestamp) – Minimum Timestamp to use.
end (pandas.Timestamp) – Maximum Timestamp to use.
seed (int) – Seed to use for random operations with NumPy and pandas.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with start-up date added.
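A seeded, uniform draw of start-up dates between start and end might look like the sketch below. The helper name `random_start_up_dates` and the day-level resolution are assumptions; the module itself may sample differently.

```python
import numpy as np
import pandas as pd


def random_start_up_dates(n, start, end, seed):
    """Draw n start-up dates uniformly between start and end (sketch)."""
    rng = np.random.default_rng(seed)
    n_days = (end - start).days
    # Integer day offsets in [0, n_days], so both bounds can be drawn.
    offsets = rng.integers(0, n_days + 1, size=n)
    return pd.Series(start + pd.to_timedelta(offsets, unit="D"))
```

Passing the same seed reproduces the same dates, which keeps repeated pipeline runs comparable.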

add_voltage_level(buildings_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Get voltage level data from the MaStR table and assign it to the units. Missing values are inferred from the generator capacity.

Parameters:	buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data with disaggregated PV plants.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM building data with voltage level per generator.

add_weather_cell_id(buildings_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

allocate_pv(q_mastr_gdf: gpd.GeoDataFrame, q_buildings_gdf: gpd.GeoDataFrame, seed: int) → tuple[gpd.GeoDataFrame, gpd.GeoDataFrame][source]¶

Allocate the MaStR pv generators to the OSM buildings. This will determine a building for each pv generator if there are more buildings than generators within a given AGS. Generators are primarily distributed to buildings within the same quantile. Multiple assignment is excluded.

Parameters:

q_mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded and qcut MaStR data.
q_buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing qcut OSM buildings data.
seed (int) – Seed to use for random operations with NumPy and pandas.

Returns:

tuple with two geopandas.GeoDataFrame s – GeoDataFrame containing MaStR data allocated to building IDs. GeoDataFrame containing building data allocated to MaStR IDs.

allocate_scenarios(mastr_gdf: geopandas.geodataframe.GeoDataFrame, valid_buildings_gdf: geopandas.geodataframe.GeoDataFrame, last_scenario_gdf: geopandas.geodataframe.GeoDataFrame, scenario: str)[source]¶

Disaggregate and allocate scenario pv rooftop ramp-ups onto buildings.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
valid_buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
last_scenario_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings matched with pv generators from the temporally preceding scenario.
scenario (str) – Scenario to disaggregate and allocate.

Returns:

tuple – geopandas.GeoDataFrame: GeoDataFrame containing OSM buildings matched with pv generators. pandas.DataFrame: DataFrame containing pv rooftop capacity per grid id.

allocate_to_buildings(mastr_gdf: gpd.GeoDataFrame, buildings_gdf: gpd.GeoDataFrame) → tuple[gpd.GeoDataFrame, gpd.GeoDataFrame][source]¶

Allocate status quo pv rooftop generators to buildings.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing MaStR data with geocoded locations.
buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data with buildings without an AGS ID dropped.

Returns:

tuple with two geopandas.GeoDataFrame s – GeoDataFrame containing MaStR data allocated to building IDs. GeoDataFrame containing building data allocated to MaStR IDs.

Estimate normal building area range per capacity range. Calculate the mean roof load factor per capacity range from existing PV plants.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
cap_ranges (list(tuple(int, int))) – List of capacity ranges to distinguish between. The first tuple should start with zero and the last one should end with infinity.
min_building_size (int, float) – Minimal building size to consider for PV plants.
upper_quantile (float) – Upper quantile to estimate maximum building size per capacity range.
lower_quantile (float) – Lower quantile to estimate minimum building size per capacity range.

Returns:

dict – Dictionary with estimated normal building area range per capacity range.

calculate_building_load_factor(mastr_gdf: geopandas.geodataframe.GeoDataFrame, buildings_gdf: geopandas.geodataframe.GeoDataFrame, rounding: int = 4) → geopandas.geodataframe.GeoDataFrame[source]¶

Calculate the roof load factor from existing PV systems.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
rounding (int) – Rounding to use for load factor.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing geocoded MaStR data with calculated load factor.

calculate_max_pv_cap_per_building(buildings_gdf: gpd.GeoDataFrame, mastr_gdf: gpd.GeoDataFrame, pv_cap_per_sq_m: float | int, roof_factor: float | int) → gpd.GeoDataFrame[source]¶

Calculate the estimated maximum possible PV capacity per building.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
pv_cap_per_sq_m (float, int) – Average expected installable PV capacity per square meter.
roof_factor (float, int) – Average share of the roof area that is usable for PV.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with estimated maximum PV capacity.
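The parameter descriptions suggest a simple product for the maximum capacity, and a ratio of installed to maximum capacity for the roof load factor. The sketch below spells out that reading; both formulas and both function names are assumptions derived from the parameter names, not confirmed by the documentation.

```python
def max_pv_capacity(area_sq_m, pv_cap_per_sq_m, roof_factor):
    """Estimated maximum PV capacity of one building (assumed formula).

    The usable roof area is taken as building area times roof_factor,
    and each usable square meter carries pv_cap_per_sq_m of capacity.
    """
    return area_sq_m * roof_factor * pv_cap_per_sq_m


def roof_load_factor(installed_cap, max_cap, rounding=4):
    """Share of the estimated maximum capacity that is actually installed."""
    return round(installed_cap / max_cap, rounding)
```

For a 100 square meter building with a roof factor of 0.5 and 0.2 capacity units per square meter, this reading gives a maximum of 10 units, so a 5 unit plant has a load factor of 0.5.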

cap_per_bus_id(scenario: str) → pandas.core.frame.DataFrame[source]¶

Get table with total pv rooftop capacity per grid district.

Parameters:	scenario (str) – Scenario name.
Returns:	pandas.DataFrame – DataFrame with total rooftop capacity per mv grid.

Calculate the share of PV capacity from the total PV capacity within capacity ranges.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
cap_ranges (list(tuple(int, int))) – List of capacity ranges to distinguish between. The first tuple should start with zero and the last one should end with infinity.

Returns:

dict – Dictionary with share of PV capacity from the total PV capacity within capacity ranges.
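As a rough illustration, the capacity share per range can be computed as below. The function name `capacity_share_per_range` and the boundary convention (lower bound exclusive, upper bound inclusive) are assumptions.

```python
import math


def capacity_share_per_range(capacities, cap_ranges):
    """Share of total capacity that falls into each capacity range (sketch)."""
    total = sum(capacities)
    shares = {}
    for low, high in cap_ranges:
        # Assumed convention: low < capacity <= high.
        in_range = sum(c for c in capacities if low < c <= high)
        shares[(low, high)] = in_range / total
    return shares


# Example ranges following the documented rule: start at 0, end at infinity.
EXAMPLE_RANGES = [(0, 30), (30, math.inf)]
```

When the ranges cover everything from zero to infinity without gaps or overlaps, the shares sum to one.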

clean_mastr_data(mastr_df: pd.DataFrame, max_realistic_pv_cap: int | float, min_realistic_pv_cap: int | float, rounding: int, seed: int) → pd.DataFrame[source]¶

Clean the MaStR data from implausible data.

Drop MaStR ID duplicates.
Drop generators with implausible capacities.
Drop generators without any kind of start-up date.
Clean up Standort column and capacity.

Parameters:

mastr_df (pandas.DataFrame) – DataFrame containing MaStR data.
max_realistic_pv_cap (int or float) – Maximum capacity, which is considered to be realistic.
min_realistic_pv_cap (int or float) – Minimum capacity, which is considered to be realistic.
rounding (int) – Rounding to use when cleaning up capacity. E.g. when rounding is 1 a capacity of 9.93 will be rounded to 9.9.
seed (int) – Seed to use for random operations with NumPy and pandas.

Returns:

pandas.DataFrame – DataFrame containing cleaned MaStR data.
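The duplicate, plausibility, and rounding steps can be sketched with pandas as follows. The column names `mastr_id` and `capacity` and the helper name `clean_capacities` are hypothetical placeholders; the start-up date and Standort handling are left out.

```python
import pandas as pd


def clean_capacities(df, min_realistic_pv_cap, max_realistic_pv_cap, rounding):
    """Drop duplicate IDs and implausible capacities, then round (sketch).

    Assumes hypothetical columns "mastr_id" and "capacity".
    """
    out = df.drop_duplicates(subset="mastr_id")
    plausible = (out["capacity"] > min_realistic_pv_cap) & (
        out["capacity"] < max_realistic_pv_cap
    )
    out = out[plausible].copy()
    out["capacity"] = out["capacity"].round(rounding)
    return out.reset_index(drop=True)
```

With `rounding=1`, a capacity of 9.93 becomes 9.9, matching the example given for the rounding parameter.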

create_geocoded_table(geocode_gdf)[source]¶

Create the geocoded table for MaStR pv rooftop data.

Parameters:	geocode_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoding information on pv rooftop locations.

create_scenario_table(buildings_gdf)[source]¶

Create mapping table pv_unit <-> building for scenario.

desaggregate_pv(buildings_gdf: geopandas.geodataframe.GeoDataFrame, cap_df: pandas.core.frame.DataFrame, **kwargs) → geopandas.geodataframe.GeoDataFrame[source]¶

Disaggregate PV capacity on buildings within a given grid district.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
cap_df (pandas.DataFrame) – DataFrame with total rooftop capacity per mv grid.

Other Parameters:

prob_dict (dict) – Dictionary with values and probabilities per capacity range.
cap_share_dict (dict) – Dictionary with share of PV capacity from the total PV capacity within capacity ranges.
building_area_range_dict (dict) – Dictionary with estimated normal building area range per capacity range.
load_factor_dict (dict) – Dictionary with mean roof load factor per capacity range.
seed (int) – Seed to use for random operations with NumPy and pandas.
pv_cap_per_sq_m (float, int) – Average expected, installable PV capacity per square meter.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM building data with disaggregated PV plants.

desaggregate_pv_in_mv_grid(buildings_gdf: gpd.GeoDataFrame, pv_cap: float | int, **kwargs) → gpd.GeoDataFrame[source]¶

Disaggregate PV capacity on buildings within a given grid district.

Parameters:

buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing buildings within the grid district.
pv_cap (float, int) – PV capacity to disaggregate.

Other Parameters:

prob_dict (dict) – Dictionary with values and probabilities per capacity range.
cap_share_dict (dict) – Dictionary with share of PV capacity from the total PV capacity within capacity ranges.
building_area_range_dict (dict) – Dictionary with estimated normal building area range per capacity range.
load_factor_dict (dict) – Dictionary with mean roof load factor per capacity range.
seed (int) – Seed to use for random operations with NumPy and pandas.
pv_cap_per_sq_m (float, int) – Average expected, installable PV capacity per square meter.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing OSM building data with disaggregated PV plants.

determine_end_of_life_gens(mastr_gdf: geopandas.geodataframe.GeoDataFrame, scenario_timestamp: pandas._libs.tslibs.timestamps.Timestamp, pv_rooftop_lifetime: pandas._libs.tslibs.timedeltas.Timedelta) → geopandas.geodataframe.GeoDataFrame[source]¶

Determine if an old PV system has reached its end of life.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
scenario_timestamp (pandas.Timestamp) – Timestamp at which the scenario takes place.
pv_rooftop_lifetime (pandas.Timedelta) – Average expected lifetime of PV rooftop systems.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing geocoded MaStR data and info if the system has reached its end of life.
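One plausible reading of the end-of-life check is that a plant is retired once its start-up date plus the expected lifetime lies before the scenario timestamp. The sketch below implements that reading; the helper name `flag_end_of_life` and the strict comparison are assumptions.

```python
import pandas as pd


def flag_end_of_life(start_up_dates, scenario_timestamp, pv_rooftop_lifetime):
    """Return a boolean Series: True where a plant has reached end of life.

    start_up_dates: pandas Series of datetimes
    scenario_timestamp: pandas.Timestamp of the scenario
    pv_rooftop_lifetime: pandas.Timedelta with the expected lifetime
    """
    return (start_up_dates + pv_rooftop_lifetime) < scenario_timestamp
```

Plants flagged `False` would still count as existing capacity when new scenario capacity is distributed within a grid district.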

drop_buildings_outside_grids(buildings_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Drop all buildings outside of grid areas.

Parameters:	buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with buildings without a bus ID dropped.

drop_buildings_outside_muns(buildings_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Drop all buildings outside of municipalities.

Parameters:	buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing OSM buildings data.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with buildings without an AGS ID dropped.

drop_gens_outside_muns(valid_mastr_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Drop all generators outside of municipalities.

Parameters:	valid_mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame with valid and cleaned MaStR data.
Returns:	geopandas.GeoDataFrame – GeoDataFrame with valid and cleaned MaStR data with generators without an AGS ID dropped.

drop_invalid_entries_from_gdf(gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Drop invalid entries from geopandas GeoDataFrame.

TODO: how to omit the logging from geos here?

Parameters:	gdf (geopandas.GeoDataFrame) – GeoDataFrame to be checked for validity.
Returns:	geopandas.GeoDataFrame – GeoDataFrame with rows with invalid geometries dropped.

drop_unallocated_gens(gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Drop generators which did not get allocated.

Parameters:	gdf (geopandas.GeoDataFrame) – GeoDataFrame containing MaStR data allocated to building IDs.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing MaStR data with generators dropped which did not get allocated.

egon_building_peak_loads()[source]¶

federal_state_data(to_crs: pyproj.crs.crs.CRS) → geopandas.geodataframe.GeoDataFrame[source]¶

Get federal state data from eGo^n Database.

Parameters:	to_crs (pyproj.crs.crs.CRS) – CRS to transform geometries to.
Returns:	geopandas.GeoDataFrame – GeoDataFrame with federal state data.

frame_to_numeric(df: pd.DataFrame | gpd.GeoDataFrame) → pd.DataFrame | gpd.GeoDataFrame[source]¶

Try to convert all columns of a DataFrame to numeric, ignoring errors.

Parameters:	df (pandas.DataFrame or geopandas.GeoDataFrame)
Returns:	pandas.DataFrame or geopandas.GeoDataFrame
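A column-wise conversion that leaves non-convertible columns untouched can be sketched as below. The helper name `to_numeric_where_possible` is hypothetical, and the explicit try/except is a stand-in chosen for this sketch; it is not necessarily how the module ignores errors.

```python
import pandas as pd


def to_numeric_where_possible(df):
    """Convert each column to a numeric dtype if possible (sketch)."""
    out = df.copy()
    for col in out.columns:
        try:
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # Column holds values that cannot be parsed; keep it unchanged.
            pass
    return out
```

A column is converted only if all of its values parse, so a column mixing numbers and free text stays as it was.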

geocode_data(geocoding_df: pandas.core.frame.DataFrame, ratelimiter: geopy.extra.rate_limiter.RateLimiter, epsg: int) → geopandas.geodataframe.GeoDataFrame[source]¶

Geocode zip code and municipality. Extract latitude, longitude and altitude. Transform latitude and longitude to a shapely Point and return a geopandas GeoDataFrame.

Parameters:

geocoding_df (pandas.DataFrame) – DataFrame containing all unique combinations of zip codes with municipalities for geocoding.
ratelimiter (geopy.extra.rate_limiter.RateLimiter) – Nominatim RateLimiter geocoding class to use for geocoding.
epsg (int) – EPSG ID to use as CRS.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing all unique combinations of zip codes with municipalities with matching geolocation.

geocode_mastr_data()[source]¶

Read PV rooftop data from MaStR CSV.

TODO: the source will be replaced as soon as the MaStR data is available in DB.

geocoded_data_from_db(epsg: str | int) → gpd.GeoDataFrame[source]¶

Read geocoded MaStR data from eGo^n Database.

Parameters:	epsg (str or int) – EPSG ID to use as CRS.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing geocoded MaStR data.

geocoder(user_agent: str, min_delay_seconds: int) → geopy.extra.rate_limiter.RateLimiter[source]¶

Set up the Nominatim geocoding class.

Parameters:

user_agent (str) – The app name.
min_delay_seconds (int) – Delay in seconds to use between requests to Nominatim. A minimum of 1 is advised.

Returns:

geopy.extra.rate_limiter.RateLimiter – Nominatim RateLimiter geocoding class to use for geocoding.

geocoding_data(clean_mastr_df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame[source]¶

Set up the DataFrame to geocode.

Parameters:	clean_mastr_df (pandas.DataFrame) – DataFrame containing cleaned MaStR data.
Returns:	pandas.DataFrame – DataFrame containing all unique combinations of zip codes with municipalities for geocoding.

get_probability_for_property(mastr_gdf: gpd.GeoDataFrame, cap_range: tuple[int | float, int | float], prop: str) → tuple[np.array, np.array][source]¶

Calculate the probability of the different options of a property of the existing PV plants.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
cap_range (tuple(int, int)) – Capacity range of PV plants to look at.
prop (str) – Property to calculate probabilities for. String needs to be in columns of mastr_gdf.

Returns:

tuple – numpy.array: Unique values of property. numpy.array: Probabilities per unique value.
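The probability of each option within one capacity range is essentially a normalized value count. A minimal sketch with NumPy follows; the function name `property_probabilities`, the plain sequences in place of a GeoDataFrame, and the boundary convention (lower bound exclusive, upper bound inclusive) are assumptions.

```python
import numpy as np


def property_probabilities(capacities, values, cap_range):
    """Unique property values and their relative frequencies (sketch).

    Only plants whose capacity lies in cap_range are counted.
    """
    low, high = cap_range
    capacities = np.asarray(capacities, dtype=float)
    values = np.asarray(values)
    mask = (capacities > low) & (capacities <= high)
    uniques, counts = np.unique(values[mask], return_counts=True)
    return uniques, counts / counts.sum()
```

The returned pair can be passed to a seeded generator, for example `np.random.default_rng(seed).choice(uniques, p=probs)`, to draw metadata for new plants with the same distribution as the existing ones.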

grid_districts(epsg: int) → geopandas.geodataframe.GeoDataFrame[source]¶

Load mv grid district geo data from eGo^n Database as geopandas.GeoDataFrame.

Parameters:	epsg (int) – EPSG ID to use as CRS.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing mv grid district ID and geo shapes data.

load_building_data()[source]¶

Read buildings from the following DB tables:

* openstreetmap.osm_buildings_filtered (from OSM)
* openstreetmap.osm_buildings_synthetic (synthetic, created by us)

The column id is used for both tables. As it is unique across them, both datasets can be concatenated. If INCLUDE_SYNTHETIC_BUILDINGS is False, synthetic buildings will not be loaded.

Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data with buildings without an AGS ID dropped.

load_mastr_data()[source]¶

Read PV rooftop data from MaStR CSV. Note: the source will be replaced as soon as the MaStR data is available in DB.

Returns:	geopandas.GeoDataFrame – GeoDataFrame containing MaStR data with geocoded locations.

Read MaStR data from csv.

Parameters:

index_col (str, int or list of str or int) – Column(s) to use as the row labels of the DataFrame.
usecols (list of str) – Return a subset of the columns.
dtype (dict of column (str) -> type (any), optional) – Data type for data or columns.
parse_dates (list of names (str), optional) – Try to parse given columns to datetime.

Returns:

pandas.DataFrame – DataFrame containing MaStR data.

Calculate the mean roof load factor per capacity range from existing PV plants.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
cap_ranges (list(tuple(int, int))) – List of capacity ranges to distinguish between. The first tuple should start with zero and the last one should end with infinity.

Returns:

dict – Dictionary with mean roof load factor per capacity range.

merge_geocode_with_mastr(clean_mastr_df: pandas.core.frame.DataFrame, geocode_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Merge geometry to original MaStR data.

Parameters:

clean_mastr_df (pandas.DataFrame) – DataFrame containing cleaned MaStR data.
geocode_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing all unique combinations of zip codes with municipalities with matching geolocation.

Returns:

geopandas.GeoDataFrame – GeoDataFrame containing cleaned MaStR data with matching geolocation from geocoding.

most_plausible(p_tub: tuple, min_realistic_pv_cap: int | float) → float[source]¶

Try to determine the most plausible capacity of a given generator from MaStR data.

Parameters:

p_tub (tuple) – Tuple containing the different capacities given in the MaStR data.
min_realistic_pv_cap (int or float) – Minimum capacity, which is considered to be realistic.

Returns:

float – Capacity of the generator estimated as the most realistic.

municipality_data() → geopandas.geodataframe.GeoDataFrame[source]¶

Get municipality data from eGo^n Database.

Returns:	geopandas.GeoDataFrame – GeoDataFrame with municipality data.

osm_buildings(to_crs: pyproj.crs.crs.CRS) → geopandas.geodataframe.GeoDataFrame[source]¶

Read OSM buildings data from eGo^n Database.

Parameters:	to_crs (pyproj.crs.crs.CRS) – CRS to transform geometries to.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data.

overlay_grid_districts_with_counties(mv_grid_district_gdf: geopandas.geodataframe.GeoDataFrame, federal_state_gdf: geopandas.geodataframe.GeoDataFrame) → geopandas.geodataframe.GeoDataFrame[source]¶

Calculate the intersections of mv grid districts and counties.

Parameters:

mv_grid_district_gdf (gpd.GeoDataFrame) – GeoDataFrame containing mv grid district ID and geo shapes data.
federal_state_gdf (gpd.GeoDataFrame) – GeoDataFrame with federal state data.

Returns:

geopandas.GeoDataFrame – GeoDataFrame with intersection shapes between counties and grid districts.

probabilities(mastr_gdf: gpd.GeoDataFrame, cap_ranges: list[tuple[int | float, int | float]] | None = None, properties: list[str] | None = None) → dict[source]¶

Calculate the probability of the different options of properties of the existing PV plants.

Parameters:

mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing geocoded MaStR data.
cap_ranges (list(tuple(int, int))) – List of capacity ranges to distinguish between. The first tuple should start with zero and the last one should end with infinity.
properties (list(str)) – List of properties to calculate probabilities for. Strings need to be in columns of mastr_gdf.

Returns:

dict – Dictionary with values and probabilities per capacity range.

pv_rooftop_to_buildings()[source]¶

Main script, executed as task.

scenario_data(carrier: str = 'solar_rooftop', scenario: str = 'eGon2035') → pandas.core.frame.DataFrame[source]¶

Get scenario capacity data from eGo^n Database.

Parameters:

carrier (str) – Carrier type to filter table by.
scenario (str) – Scenario to filter table by.

Returns:

pandas.DataFrame – DataFrame with scenario capacity data in GW.

sort_and_qcut_df(df: pd.DataFrame | gpd.GeoDataFrame, col: str, q: int) → pd.DataFrame | gpd.GeoDataFrame[source]¶

Determine the quantile of a given attribute in a (Geo)DataFrame. Sort the (Geo)DataFrame in ascending order for the given attribute.

Parameters:

df (pandas.DataFrame or geopandas.GeoDataFrame) – (Geo)DataFrame to sort and qcut.
col (str) – Name of the attribute to sort and qcut the (Geo)DataFrame on.
q (int) – Number of quantiles.

Returns:

pandas.DataFrame or geopandas.GeoDataFrame – Sorted and qcut (Geo)DataFrame.
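Sorting and quantile binning maps directly onto `DataFrame.sort_values` and `pandas.qcut`. The sketch below assumes the quantile index is stored in a column named `quant`, which is a hypothetical name, and uses a different function name to make clear it is not the module's code.

```python
import pandas as pd


def sort_and_qcut(df, col, q):
    """Sort by col and add the quantile index of each row (sketch)."""
    out = df.sort_values(col, ascending=True)
    # labels=False yields integer bin indices 0 .. q-1 instead of intervals.
    out["quant"] = pd.qcut(out[col], q=q, labels=False)
    return out
```

Note that `pandas.qcut` raises a `ValueError` when heavily repeated values produce duplicate bin edges, unless `duplicates="drop"` is passed.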

synthetic_buildings(to_crs: pyproj.crs.crs.CRS) → geopandas.geodataframe.GeoDataFrame[source]¶

Read synthetic buildings data from eGo^n Database.

Parameters:	to_crs (pyproj.crs.crs.CRS) – CRS to transform geometries to.
Returns:	geopandas.GeoDataFrame – GeoDataFrame containing OSM buildings data.

timer_func(func)[source]¶

validate_output(desagg_mastr_gdf: pd.DataFrame | gpd.GeoDataFrame, desagg_buildings_gdf: pd.DataFrame | gpd.GeoDataFrame) → None[source]¶

Validate output.

Validate that there are exactly as many buildings with a pv system as there are pv systems with a building
Validate that the building IDs with a pv system are the same building IDs as assigned to the pv systems
Validate that the pv system IDs with a building are the same pv system IDs as assigned to the buildings

Parameters:

desagg_mastr_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing MaStR data allocated to building IDs.
desagg_buildings_gdf (geopandas.GeoDataFrame) – GeoDataFrame containing building data allocated to MaStR IDs.
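The three validation rules amount to checking that the two allocation results are mirror images of each other. A minimal sketch with plain dictionaries in place of the (Geo)DataFrames follows; the function name `check_allocation` is hypothetical.

```python
def check_allocation(gen_to_building, building_to_gen):
    """Check that both allocation mappings describe the same pairs (sketch).

    gen_to_building: dict mapping pv system ID -> building ID
    building_to_gen: dict mapping building ID -> pv system ID
    Raises AssertionError if one of the three rules is violated.
    """
    # As many buildings with a pv system as pv systems with a building.
    assert len(gen_to_building) == len(building_to_gen)
    # Same building IDs on both sides.
    assert set(gen_to_building.values()) == set(building_to_gen.keys())
    # Same pv system IDs on both sides.
    assert set(building_to_gen.values()) == set(gen_to_building.keys())
```

The function returns `None` on success, in line with the `→ None` return annotation of `validate_output`.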

zip_and_municipality_from_standort(standort: str, verbose: bool = False) → tuple[str, bool][source]¶

Get zip code and municipality from Standort string split into a list.

Parameters:

standort (str) – Standort as given from MaStR data.
verbose (bool) – Logs additional info if True.

Returns:

str – Standort with only the zip code and municipality as well as ‘, Germany’ added.
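Extracting the zip code and municipality could be done with a regular expression, as sketched below. The assumed Standort layout (a five-digit zip code followed by the municipality name), the function name `zip_and_municipality`, and the meaning of the returned flag (whether a match was found) are all assumptions, and the documented function splits the string into a list instead.

```python
import re

# Assumption: a German zip code is five digits, followed by the municipality.
ZIP_MUNICIPALITY = re.compile(r"\b(\d{5})\s+([^\d,]+)")


def zip_and_municipality(standort):
    """Return ("<zip> <municipality>, Germany", True) or (standort, False)."""
    match = ZIP_MUNICIPALITY.search(standort)
    if match is None:
        # Nothing that looks like "zip municipality" was found.
        return standort, False
    zip_code = match.group(1)
    municipality = match.group(2).strip()
    return f"{zip_code} {municipality}, Germany", True
```

The resulting string is short and unambiguous enough to be used as a geocoding query, and deduplicating these strings keeps the number of rate-limited Nominatim requests small.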