hh_profiles

Household electricity demand time series for scenarios eGon2035, reGon2037, reGon2045 and status2024 at census cell level are set up.

Electricity demand data for households in Germany in 1-hourly resolution for an entire year. Spatially, the data is resolved to 100 x 100 m cells and provides individual and distinct time series for each household in a cell. The cells are defined by the dataset Zensus 2011.

class EgonDestatisZensusHouseholdPerHaRefined(**kwargs)[source]

Bases: Base

Class definition of table society.egon_destatis_zensus_household_per_ha_refined.

cell_id

characteristics_code

grid_id

hh_10types

hh_5types

hh_type

id

nuts1

nuts3

class EgonEtragoElectricityHouseholds(**kwargs)[source]

Bases: Base

Class definition of table demand.egon_etrago_electricity_households.

The table contains household electricity demand profiles aggregated at MV grid district level in MWh.

bus_id

p_set

q_set

scn_name

class HouseholdDemands(dependencies)[source]

Bases: Dataset

Household electricity demand time series for scenarios eGon2035, reGon2037, reGon2045 and status2024 at census cell level are set up.

Electricity demand data for households in Germany in 1-hourly resolution for an entire year. Spatially, the data is resolved to 100 x 100 m cells and provides individual and distinct time series for each household in a cell. The cells are defined by the dataset Zensus 2011.

Dependencies

Resulting tables

demand.iee_household_load_profiles is created and filled
demand.egon_household_electricity_profile_in_census_cell is created and filled
society.egon_destatis_zensus_household_per_ha_refined is created and filled
demand.egon_etrago_electricity_households is created and filled

The following datasets are used for creating the data:

Electricity demand time series for household categories produced by demand profile generator (DPG) from Fraunhofer IEE (see get_iee_hh_demand_profiles_raw())
Spatial information about people living in households by Zensus 2011 at federal state level
- Type of household (family status)
- Age
- Number of people
Spatial information about number of households per ha, categorized by type of household (family status) with 5 categories (also from Zensus 2011)
Demand-Regio annual household demand at NUTS3 level

What is the goal?

To use the electricity demand time series from the demand profile generator to create spatially referenced household demand time series for Germany at a resolution of 100 x 100 m cells.

What is the challenge?

The electricity demand time series produced by the demand profile generator offer 12 different household profile categories. To use most of them, the spatial information about the number of households per cell (5 categories) needs to be enriched with supplementary data to match the specifications of the household demand profile categories. By increasing the number of categories in the cell-level household data, 10 out of the 12 household profile categories can be distinguished.

How are these datasets combined?

Spatial information about people living in households by Zensus 2011 at federal state (NUTS1) level (df_zensus) is aggregated to be compatible with the IEE household profile specifications.
- exclude kids and reduce to adults and seniors
- group as defined in HH_TYPES
- convert data from people living in households to number of households by mapping_people_in_households
- calculate fraction of fine household types (10) within subgroup of rough household types (5) df_dist_households
Spatial information about the number of households per ha (df_census_households_nuts3) is mapped to NUTS1 and NUTS3 level. The data is refined with household subgroups via df_dist_households to df_census_households_grid_refined.
The enriched 100 x 100 m household dataset is used to sample and aggregate household profiles. A table including individual profile IDs for each cell and a scaling factor to match the Demand-Regio annual sum projections at NUTS3 level for each scenario year is created in the database as demand.egon_household_electricity_profile_in_census_cell.
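
The calculation of the share of fine household types within a rough household type can be sketched as follows. This is a minimal illustration, not the module's implementation: the household counts are made-up values, and the fine-to-rough mapping is taken from the mapping table of refine_census_data_at_cell_level().

```python
import pandas as pd

# Hypothetical number of households per fine type in one federal state
households = pd.Series(
    {"SR": 300.0, "SO": 100.0, "PR": 150.0, "PO": 50.0},
    name="households",
)

# Fine (10-type) to rough (5-type) categories, following the mapping table
# of refine_census_data_at_cell_level() (1: SR; SO and 2: PR; PO)
fine_to_rough = {"SR": 1, "SO": 1, "PR": 2, "PO": 2}

df = households.rename_axis("hh_10types").reset_index()
df["hh_5types"] = df["hh_10types"].map(fine_to_rough)

# Share of each fine type within its rough group
group_total = df.groupby("hh_5types")["households"].transform("sum")
df["share"] = df["households"] / group_total
print(df)
```

Within each rough group the shares sum to one, so they can be applied to the household count of that rough type in any cell.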

What are central assumptions during the data processing?

Mapping zensus data to IEE household categories is not trivial. In the conversion from persons in households to number of households, the number of inhabitants of multi-person households is estimated as a weighted average in OO_factor
The distribution used to refine household types at cell level is the same for each federal state
Refining household types leads to float numbers of profiles drawn at cell level, which need to be rounded to the nearest int by np.rint().
100 x 100 m cells are matched to NUTS via the cells' centroid location
Cells with households in unpopulated areas are removed
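
The rounding assumption can be illustrated with a short sketch. The fractional counts are made-up values, not census data.

```python
import numpy as np

# Hypothetical fractional household counts for one cell after refinement
fractional_counts = np.array([1.4, 0.6, 2.5, 3.5])

# np.rint rounds to the nearest integer; exact halves go to the nearest even value
rounded_counts = np.rint(fractional_counts).astype(int)
print(rounded_counts)
```

Because exact halves are rounded to the nearest even integer, the rounded total of a cell can deviate slightly from the unrounded sum.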

Drawbacks and limitations of the data

The distribution used to refine household types at cell level is the same for each federal state
The aggregated annual demand of the household profiles matches the Demand-Regio demand at NUTS-3 level, but it does not match the Demand-Regio time series profile
Due to secrecy, some census data are highly modified under certain attributes (quantity_q = 2). This cell data is not corrected, but excluded.
There is deviation in the Census data from table to table. The statistical methods are not stringent. Hence, there are cases in which data contradicts.
Census data with attribute ‘HHTYP_FAM’ is missing for some cells with a small number of households. This data is generated using the average share of household types for cells with a similar number of households. For some cells the summed number of households per type deviates from the total number with attribute ‘INSGESAMT’. As the profiles are scaled with Demand-Regio data at NUTS-3 level, the impact at a higher aggregation level is negligible. For the sake of simplicity, the data is not corrected.
There are cells without household data but a population. A randomly chosen household distribution is taken from a subgroup of cells with same population value and applied to all cells with missing household distribution and the specific population value.

Helper functions

To access the DB and select specific profiles at various aggregation levels, use get_hh_profiles_from_db()
To access the DB, select specific profiles at various aggregation levels and scale the profiles, use get_scaled_profiles_from_db()

name: str = 'Household Demands'

sources: DatasetSources = DatasetSources(tables={'demandregio_hh': 'demand.egon_demandregio_hh', 'destatis_zensus_population_per_ha_inside_germany': 'society.destatis_zensus_population_per_ha_inside_germany', 'destatis_zensus_population_per_ha': 'society.destatis_zensus_population_per_ha', 'egon_destatis_zensus_household_per_ha': 'society.egon_destatis_zensus_household_per_ha', 'egon_map_zensus_vg250': 'boundaries.egon_map_zensus_vg250', 'vg250_lan': 'boundaries.vg250_lan', 'demandregio_household_load_profiles': 'demand.demandregio_household_load_profiles'}, files={'household_electricity_demand_profiles': {'path_testmode': 'hh_el_load_profiles_2511.hdf', 'path': 'hh_el_load_profiles_100k.hdf'}, 'zensus_household_types': {'path': 'Zensus2011_Personen.csv'}}, urls={}): The sources used by the datasets. Could be tables, files and urls

targets: DatasetTargets = DatasetTargets(tables={'iee_household_load_profiles': 'demand.iee_household_load_profiles', 'hh_profiles_in_census_cells': 'demand.egon_household_electricity_profile_in_census_cell', 'zensus_household_per_ha_refined': 'society.egon_destatis_zensus_household_per_ha_refined', 'etrago_electricity_households': 'demand.egon_etrago_electricity_households'}, files={}): The targets created by the datasets. Could be tables and files

version: str = '0.0.16'

class HouseholdElectricityProfilesInCensusCells(**kwargs)[source]

Bases: Base

Class definition of table demand.egon_household_electricity_profile_in_census_cell.

Lists references and scaling parameters of time series data for each household in a cell by identifiers. This table is fundamental for creating subsequent data like demand profiles on MV grid level or for determining the peak load at load area level.

cell_id

cell_profile_ids

factor_2024

factor_2035

factor_2037

factor_2045

grid_id

nuts1

nuts3

class IeeHouseholdLoadProfiles(**kwargs)[source]

Bases: Base

Class definition of table demand.iee_household_load_profiles.

id

load_in_wh

type

adjust_to_demand_regio_nuts3_annual(df_hh_profiles_in_census_cells, df_iee_profiles, df_demand_regio)[source]

Computes the profile scaling factor for alignment to demand regio data

The scaling factor can be used to re-scale each load profile such that the sum of all load profiles within one NUTS-3 area equals the annual demand of demand regio data.

Parameters:

df_hh_profiles_in_census_cells (pd.DataFrame) – Result of assign_hh_demand_profiles_to_cells().
df_iee_profiles (pd.DataFrame) – Household load profile data
- Index: Time steps as serial integers
- Columns: pd.MultiIndex with (HH_TYPE, id)
df_demand_regio (pd.DataFrame) – Annual demand by demand regio for each NUTS-3 region and scenario year. Index is pd.MultiIndex with tuple(scenario_year, nuts3_code).

Returns:

pd.DataFrame – Returns the same data as assign_hh_demand_profiles_to_cells(), but with filled factor_<year> columns for each configured scenario.
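
The idea behind the scaling factor can be sketched as follows. All values, the column name annual_profile_sum and the NUTS-3 codes are hypothetical, and both inputs are assumed to use the same energy unit. The actual function works on the profile references stored per cell.

```python
import pandas as pd

# Hypothetical annual energy of the profiles assigned to each cell
cells = pd.DataFrame(
    {
        "cell_id": [1, 2, 3],
        "nuts3": ["DEF01", "DEF01", "DEF02"],
        "annual_profile_sum": [3000.0, 1000.0, 2000.0],
    }
)

# Hypothetical annual target demand per NUTS-3 region for one scenario year
target = pd.Series({"DEF01": 5000.0, "DEF02": 1500.0})

# One factor per NUTS-3 region: target demand / sum of assigned profile energy
nuts3_sum = cells.groupby("nuts3")["annual_profile_sum"].sum()
factor = target / nuts3_sum

# Attach the regional factor to every cell of that region
cells["factor_2035"] = cells["nuts3"].map(factor)

# After scaling, the regional sums equal the target demand
scaled = cells["annual_profile_sum"] * cells["factor_2035"]
scaled_sum = scaled.groupby(cells["nuts3"]).sum()
print(scaled_sum)
```

Every cell of a region receives the same factor, so the shape of the individual profiles is preserved and only their level changes.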

assign_hh_demand_profiles_to_cells(df_zensus_cells, df_iee_profiles)[source]

Assign household demand profiles to each census cell.

A table including the demand profile ids for each cell is created by using get_cell_demand_profile_ids(). Household profiles are randomly sampled for each cell. Within a cell, profiles are drawn without replacement; the pool is restored before the next cell is sampled.

Parameters:

df_zensus_cells (pd.DataFrame) – Household type parameters. Each row represents one household. Hence, there are multiple rows per zensus cell.
df_iee_profiles (pd.DataFrame) – Household load profile data
- Index: Time steps as serial integers
- Columns: pd.MultiIndex with (HH_TYPE, id)

Returns:

pd.DataFrame – Tabular data in which each row represents one zensus cell. The column cell_profile_ids contains a list of tuples (see get_cell_demand_profile_ids()) providing a reference to the actual load profiles that are associated with this cell.
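
The sampling scheme can be sketched with the standard library. The helper sample_cell_profiles and the profile ids are hypothetical; the sketch only shows that ids are unique within a cell while the full pool is available again for the next cell.

```python
import random


def sample_cell_profiles(pool_ids, n_households, rng):
    """Draw n distinct profile ids for one cell (hypothetical helper)."""
    # random.Random.sample draws without replacement
    return rng.sample(pool_ids, n_households)


rng = random.Random(42)
pool = list(range(10))  # hypothetical profile ids of one household type

# Each cell draws from the full pool again, so ids may repeat across cells
cell_a = sample_cell_profiles(pool, 4, rng)
cell_b = sample_cell_profiles(pool, 4, rng)
print(cell_a, cell_b)
```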

clean(x)[source]

Clean zensus household data row-wise

Clean dataset by

converting ‘.’ and ‘-’ to str(0)
removing brackets

Table can be converted to int/floats afterwards

Parameters: x (pd.Series) – It is meant to be used with df.applymap()
Returns: pd.Series – Re-formatted data row
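
A minimal sketch of the cleaning rules listed above, assuming the raw values are strings. The function clean_value and the sample data are hypothetical and do not reproduce the signature of clean().

```python
import pandas as pd


def clean_value(value):
    """Hypothetical element-wise cleaner following the rules listed above."""
    text = str(value).strip()
    if text in (".", "-"):
        return "0"
    # Remove brackets, e.g. "(12)" becomes "12"
    return text.replace("(", "").replace(")", "")


raw = pd.DataFrame({"a": ["(12)", "."], "b": ["-", "7"]})

# Applying Series.map per column works element-wise across pandas versions
cleaned = raw.apply(lambda col: col.map(clean_value)).astype(int)
print(cleaned)
```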

create_missing_zensus_data(df_households_typ, df_missing_data, missing_cells)[source]

Generate missing data as the average share of the household types for cell groups with the same number of households.

Data for specific attributes is missing in the zensus dataset for secrecy reasons. For some cells with only a small number of households, the data with attribute HHTYP_FAM is missing. However, the total number of households is known from attribute INSGESAMT. The missing data is generated as the average share of the household types for cell groups with the same number of households.

Parameters:

df_households_typ (pd.DataFrame) – Zensus households data
df_missing_data (pd.DataFrame) – Number of missing cells per group of cells with the same number of households
missing_cells (dict) – dictionary with list of grids of the missing cells grouped by amount of households in cell

Returns:

df_average_split (pd.DataFrame) – generated dataset of missing cells

create_table()[source]

get_cell_demand_metadata_from_db(attribute, list_of_identifiers)[source]

Retrieve selection of household electricity demand profile mapping

Parameters:

attribute (str) – attribute to filter the table
- nuts3
- nuts1
- cell_id
list_of_identifiers (list of str/int) – nuts3/nuts1 need to be str; cell_id needs to be int

See also

houseprofiles_in_census_cells()

Returns: pd.DataFrame – Mapping of household demand profiles to zensus cells

get_iee_hh_demand_profiles_raw()[source]

Gets and returns household electricity demand profiles from the egon-data-bundle.

Household electricity demand profiles generated by Fraunhofer IEE. Methodology is described in Erzeugung zeitlich hochaufgelöster Stromlastprofile für verschiedene Haushaltstypen. It is used and further described in the following theses by:

Jonas Haack: “Auswirkungen verschiedener Haushaltslastprofile auf PV-Batterie-Systeme” (confidential)
Simon Ruben Drauz “Synthesis of a heat and electrical load profile for single and multi-family houses used for subsequent performance tests of a multi-component energy system”, http://dx.doi.org/10.13140/RG.2.2.13959.14248

Notes

The household electricity demand profiles have been generated for 2016, which is a leap year (8784 hours) starting on a Friday. The weather year is 2011, and the heat time series are generated for 2011 too (cf. dataset egon.data.datasets.heat_demand_timeseries.HTS), having 8760 h and starting on a Saturday. To align the profiles, the first day of the IEE profiles is deleted, resulting in 8760 h starting on a Saturday.

Returns: pd.DataFrame – Table with profiles in columns and time as index. A pd.MultiIndex is used to distinguish load profiles from different EUROSTAT household types.
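
The alignment described in the notes can be checked with a short sketch. The profile values are placeholders; only the calendar arithmetic is of interest here.

```python
import pandas as pd

# Hypothetical hourly profile for the leap year 2016 (8784 hours)
index_2016 = pd.date_range("2016-01-01", periods=8784, freq="h")
profile = pd.Series(range(8784), index=index_2016)

# Drop the first day (24 hours) so that 8760 hours remain
aligned = profile.iloc[24:].reset_index(drop=True)

print(index_2016[0].day_name(), index_2016[24].day_name(), len(aligned))
```

After the first 24 values are removed, the series starts on the Saturday of the original calendar and has the length of a regular year.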

get_load_timeseries(df_iee_profiles, df_hh_profiles_in_census_cells, cell_ids, year, aggregate=True, peak_load_only=False)[source]

Get the load time series or the peak load for one load area in MWh

The load is calculated in an aggregated manner for a group of zensus cells that belong to one load area (defined by cell_ids).

Parameters:

df_iee_profiles (pd.DataFrame) – Household load profile data in Wh
- Index: Time steps as serial integers
- Columns: pd.MultiIndex with (HH_TYPE, id)
Used to calculate the peak load from.
df_hh_profiles_in_census_cells (pd.DataFrame) – Return value of adjust_to_demand_regio_nuts3_annual().
cell_ids (list) – Zensus cell ids that define one group of zensus cells that belong to the same load area.
year (int) – Scenario year. Is used to consider the scaling factor for aligning annual demand to NUTS-3 data.
aggregate (bool) – If true, all profiles are aggregated
peak_load_only (bool) – If true, only the peak load value is returned (the type of the return value is float). Defaults to False which returns the entire time series as pd.Series.

Returns:

pd.Series or float – Aggregated time series for given cell_ids or peak load of this time series in MWh.
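
A minimal sketch of the aggregation, assuming that each cell's profiles are summed, multiplied by the cell's scaling factor and converted from Wh to MWh. The data, the dictionary layout of cells and the helper load_timeseries are hypothetical.

```python
import pandas as pd

# Hypothetical profiles in Wh: 4 time steps, columns are (HH_TYPE, id)
columns = pd.MultiIndex.from_tuples([("SR", 0), ("SR", 1), ("PO", 0)])
df_profiles = pd.DataFrame(
    [
        [100.0, 200.0, 300.0],
        [150.0, 250.0, 350.0],
        [400.0, 100.0, 500.0],
        [50.0, 50.0, 100.0],
    ],
    columns=columns,
)

# Hypothetical cell table: profile references plus a scaling factor per cell
cells = {
    1: {"cell_profile_ids": [("SR", 0), ("PO", 0)], "factor": 2.0},
    2: {"cell_profile_ids": [("SR", 1)], "factor": 1.0},
}


def load_timeseries(cell_ids):
    """Sum the scaled profiles of the given cells and convert Wh to MWh."""
    total = pd.Series(0.0, index=df_profiles.index)
    for cell_id in cell_ids:
        cell = cells[cell_id]
        cell_sum = df_profiles[cell["cell_profile_ids"]].sum(axis=1)
        total += cell_sum * cell["factor"]
    return total / 1e6


series_mwh = load_timeseries([1, 2])
peak_mwh = series_mwh.max()
print(series_mwh.tolist(), peak_mwh)
```

Taking the maximum after the summation yields the coincident peak of the load area, which is generally lower than the sum of the individual household peaks.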

get_scaled_profiles_from_db(attribute, list_of_identifiers, year, aggregate=True, peak_load_only=False)[source]

Retrieve selection of scaled household electricity demand profiles

Parameters:

attribute (str) – attribute to filter the table
- nuts3
- nuts1
- cell_id
list_of_identifiers (list of str/int) – nuts3/nuts1 need to be str; cell_id needs to be int
year (int) –
- 2035
- 2050
aggregate (bool) – If True, all profiles are summed. This uses a lot of RAM if a high attribute level is chosen
peak_load_only (bool) – If True, only peak load value is returned

Notes

The aggregate=False option can use a lot of RAM if many profiles are selected.

Returns: pd.Series or float – Aggregated time series for given cell_ids or peak load of this time series in MWh.

houseprofiles_in_census_cells()[source]

Allocate household electricity demand profiles for each census cell.

Creates table demand.egon_household_electricity_profile_in_census_cell that maps household electricity demand profiles to census cells. Each row represents one cell and contains a list of profile IDs. This table is fundamental for creating subsequent data like demand profiles on MV grid level or for determining the peak load at load area level.

Use get_houseprofiles_in_census_cells() to retrieve the data from the database as pandas.

impute_missing_hh_in_populated_cells(df_census_households_grid)[source]

Fills in missing household data in populated cells based on a random selection from a subgroup of cells with the same population value.

There are cells without household data but with a population. A randomly chosen household distribution is taken from a subgroup of cells with the same population value and applied to all cells with a missing household distribution and that specific population value. If there is no subgroup with household data for the respective population value, the fallback is the subgroup with the next smaller population value.

Parameters: df_census_households_grid (pd.DataFrame) – census household data at 100x100m grid level
Returns: pd.DataFrame – substituted census household data at 100x100m grid level
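
The fallback rule can be sketched as follows. The donor pool and the helper pick_distribution are hypothetical. In contrast to the description above, the sketch picks a distribution per call rather than once per population value.

```python
import random

# Hypothetical donor pool: population value -> known household distributions
donors = {
    3: [{"SR": 1, "PO": 1}, {"SO": 3}],
    5: [{"P1": 1, "SR": 1}],
}


def pick_distribution(population, rng):
    """Pick a random donor distribution, falling back to smaller populations."""
    candidates = [p for p in donors if p <= population]
    if not candidates:
        return None  # no donor group at or below this population value
    return rng.choice(donors[max(candidates)])


rng = random.Random(0)
exact = pick_distribution(5, rng)     # a donor group with population 5 exists
fallback = pick_distribution(4, rng)  # no group for 4, falls back to 3
print(exact, fallback)
```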

inhabitants_to_households(df_hh_people_distribution_abs)[source]

Convert number of inhabitants to number of household types

Takes the distribution of people living in types of households to calculate a distribution of household types by using a people-in-household mapping. Results are not rounded to int as they will be used to calculate a relative distribution anyway. The data of category ‘HHGROESS_KLASS’ in census households at grid level is used to determine an average wherever the number of people is not trivial (OR, OO). Kids are not counted.

Parameters: df_hh_people_distribution_abs (pd.DataFrame) – Grouped census household data on NUTS-1 level in absolute values
Returns: df_dist_households (pd.DataFrame) – Distribution of household types
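
The conversion can be illustrated with a small sketch. All numbers are made up. In particular, the value 2.3 only stands in for the weighted average that the module derives for multi-person households.

```python
# Hypothetical number of adults and seniors living in each household type
people = {"SR": 400.0, "PR": 600.0, "OO": 690.0}

# Assumed persons per household; the OO value is a made-up stand-in
# for the weighted average described above
people_per_household = {"SR": 1, "PR": 2, "OO": 2.3}

# Number of households per type, intentionally left as floats
households = {
    hh_type: people[hh_type] / people_per_household[hh_type]
    for hh_type in people
}

# Relative distribution of household types
total = sum(households.values())
distribution = {hh_type: n / total for hh_type, n in households.items()}
print(households, distribution)
```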

mv_grid_district_HH_electricity_load(scenario_name, scenario_year)[source]

Aggregated household demand time series at HV/MV substation level

Calculate the aggregated demand time series based on the demand profiles of each zensus cell inside each MV grid district. Profiles are read from a local hdf5 file or from the demand time series per NUTS-3 region in the database. Creates table demand.egon_etrago_electricity_households with household electricity demand profiles aggregated at MV grid district level in MWh. Primarily used to create the eTraGo data model.

Parameters:

scenario_name (str) – Scenario name identifier, e.g. “eGon2035”
scenario_year (int) – Scenario year according to scenario_name

Returns:

pd.DataFrame – Multiindexed dataframe with timestep and bus_id as indexers. Demand is given in kWh.

mv_hh_electricity_load()[source]

Calculate MV grid district household electricity load per scenario.

Loops over the scenarios configured via --scenarios and triggers mv_grid_district_HH_electricity_load() for each of them.

process_nuts1_census_data(df_census_households_raw)[source]

Make data compatible with household demand profile categories

Removes and reorders categories which are not needed to fit data to household types of IEE electricity demand time series generated by demand-profile-generator (DPG).

Kids (<15) are excluded as they are also excluded in the DPG origin dataset
Adults (15 to <65)
Seniors (>=65)

Parameters: df_census_households_raw (pd.DataFrame) – cleaned zensus household type x age category data
Returns: pd.DataFrame – Aggregated zensus household data on NUTS-1 level

proportionate_allocation(df_group, dist_households_nuts1, hh_10types_cluster)[source]

Household distribution at nuts1 are applied at census cell within group

To refine the hh_5types and keep the distribution at nuts1 level, the household types are clustered and drawn with proportionate weighting. The resulting pool is split into subgroups with sizes according to the number of households of clusters in cells.

Parameters:

df_group (pd.DataFrame) – Census household data at grid level for specific hh_5type cluster in a federal state
dist_households_nuts1 (pd.Series) – Household distribution of hh_10types in a federal state
hh_10types_cluster (list of str) – Cluster of household types to be refined to

Returns:

pd.DataFrame – Refined household data with hh_10types of cluster at nuts1 level
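
One way to read the allocation is sketched below, assuming a weighted random draw for the whole group followed by a split into per-cell chunks. The shares and household counts are made-up values, and the actual function may draw the pool differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical share of fine types within the cluster at NUTS-1 level
types = ["SR", "SO"]
shares = [0.75, 0.25]

# Hypothetical number of households of this cluster in three cells
households_per_cell = [2, 3, 1]

# Draw one pool for the whole group, weighted by the NUTS-1 shares
pool = rng.choice(types, size=sum(households_per_cell), p=shares)

# Split the pool into consecutive chunks, one per cell
split_points = np.cumsum(households_per_cell)[:-1]
cell_types = np.split(pool, split_points)
print([chunk.tolist() for chunk in cell_types])
```

Drawing the pool once per group keeps the NUTS-1 shares at group level, while individual cells may deviate from them.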

refine_census_data_at_cell_level(df_census_households_grid, df_census_households_nuts1)[source]

Processes and merges census data to specify household numbers and types per census cell according to IEE profiles.

The census data is processed to define the number and type of households per zensus cell. Two subsets of the census data are merged to fit the IEE profiles specifications. To do this, proportionate allocation is applied at nuts1 level and within household type clusters.

Mapping table
characteristics_code	characteristics_text	mapping
1	Einpersonenhaushalte (Singlehaushalte)	SR; SO
2	Paare ohne Kind(er)	PR; PO
3	Paare mit Kind(ern)	P1; P2; P3
4	Alleinerziehende Elternteile	SK
5	Mehrpersonenhaushalte ohne Kernfamilie	OR; OO

Parameters:

df_census_households_grid (pd.DataFrame) – Aggregated zensus household data on 100x100m grid level
df_census_households_nuts1 (pd.DataFrame) – Aggregated zensus household data on NUTS-1 level

Returns:

pd.DataFrame – Number of hh types per census cell

regroup_nuts1_census_data(df_census_households_nuts1)[source]

Regroup census data and map according to demand-profile types.

For more information look at the respective publication: https://www.researchgate.net/publication/273775902_Erzeugung_zeitlich_hochaufgeloster_Stromlastprofile_fur_verschiedene_Haushaltstypen

Parameters: df_census_households_nuts1 (pd.DataFrame) – census household data on NUTS-1 level in absolute values
Returns: df_dist_households (pd.DataFrame) – Distribution of household types

set_multiindex_to_profiles(hh_profiles)[source]

The profile id is split into type and number and set as multiindex.

Parameters: hh_profiles (pd.DataFrame) – Profiles
Returns: hh_profiles (pd.DataFrame) – Profiles with Multiindex
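
The split of a profile id into type and number can be sketched as follows. The column naming scheme (a two-character type followed by digits) is an assumption for illustration; the ids in the raw IEE file may be formatted differently.

```python
import pandas as pd

# Hypothetical flat column names: two-character type followed by a number
hh_profiles = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    columns=["SR0001", "SR0002", "P10001"],
)

# Split each name into (type, number) and use the pairs as column MultiIndex
hh_profiles.columns = pd.MultiIndex.from_arrays(
    [hh_profiles.columns.str[:2], hh_profiles.columns.str[2:].astype(int)],
    names=["type", "id"],
)

# All profiles of one household type can now be selected by the first level
print(hh_profiles["SR"])
```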

write_hh_profiles_to_db(hh_profiles)[source]

Write HH demand profiles of IEE into db. One row per profile type. The annual load profile timeseries is an array.

schema: demand
tablename: iee_household_load_profiles

Parameters: hh_profiles (pd.DataFrame) – Household demand profiles of IEE to be written to the database

write_refinded_households_to_db(df_census_households_grid_refined)[source]