
Household electricity demand time series for scenarios in 2035 and 2050 assigned to OSM-buildings.

Assignment of household electricity demand timeseries to OSM buildings and generation of randomly placed synthetic 5x5m buildings if insufficient OSM data is available in the respective census cell.

The resulting data is stored in separate tables:

  • openstreetmap.osm_buildings_synthetic:
    Lists generated synthetic buildings with id and cell_id
  • demand.egon_household_electricity_profile_of_buildings:
    Mapping of demand timeseries and buildings including cell_id, building area and peak load

Both tables are created within map_houseprofiles_to_buildings().

The following datasets from the database are used for creation:

  • demand.household_electricity_profiles_in_census_cells:
    Lists references and scaling parameters to time series data for each household in a cell by identifiers. This table is fundamental for creating subsequent data like demand profiles on MV grid level or for determining the peak load at load. Only the profile reference and the cell identifiers are used.
  • society.egon_destatis_zensus_apartment_building_population_per_ha:
    Lists number of apartments, buildings and population for each census cell.
  • boundaries.egon_map_zensus_buildings_residential:
    List of OSM tagged buildings which are considered to be residential.

What is the goal?

To assign every household demand timeseries, which already exist at cell level, to a specific OSM building.

What is the challenge?

The census and the OSM dataset differ from each other. The census uses statistical methods and therefore lacks accuracy at high spatial resolution. The OSM dataset is a community-based dataset which is continuously extended and does not claim to be complete. When merging these datasets, inconsistencies need to be addressed, for example buildings not yet tagged in OSM or new building areas not considered in the 2011 census.

How are these datasets combined?

The assignment of household demand timeseries to buildings takes place at cell level. Within each cell a pool of profiles exists, produced by the ‘HH Demand’ module. These profiles are randomly assigned to a filtered list of OSM buildings within this cell. Every profile is assigned to a building, and every building gets a profile assigned if the census data provides enough households. If there are more profiles than buildings, all additional profiles are randomly assigned. Therefore multiple profiles can be assigned to one building, making it a multi-household building.

What are central assumptions during the data processing?

  • Mapping zensus data to OSM data is not trivial. Discrepancies are substituted.
  • Missing OSM buildings are generated according to the census building count.
  • If no census building count data is available, the number of buildings is derived by an average rate of households/buildings applied to the number of households.

Drawbacks and limitations of the data

  • Missing OSM buildings in cells without census building count are derived by an average rate of households/buildings applied to the number of households. As only whole houses can exist, the substitute is rounded up to the next integer. Rounding up is applied so that the result can never be 0 buildings.
  • As this dataset is a cascade after profile assignment at census cells, also check drawbacks and limitations in hh_profiles.py.

Example Query

  • Get a list with number of houses, households and household types per census cell
    SELECT t1.cell_id, building_count, hh_count, hh_types
    FROM (
        SELECT cell_id, count(DISTINCT building_id) AS building_count,
        count(profile_id) AS hh_count
        FROM demand.egon_household_electricity_profile_of_buildings
        GROUP BY cell_id
    ) AS t1
    FULL OUTER JOIN (
        SELECT cell_id, array_agg(array[cast(hh_10types AS char),
         hh_type]) AS hh_types
        FROM society.egon_destatis_zensus_household_per_ha_refined
        GROUP BY cell_id
    ) AS t2
    ON t1.cell_id = t2.cell_id


This module docstring is rather a dataset documentation. Once a decision is made in …, the content of this module docstring needs to be moved to the docs attribute of the respective dataset class.

class BuildingElectricityPeakLoads(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Base

class HouseholdElectricityProfilesOfBuildings(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Base

class OsmBuildingsSynthetic(**kwargs)

Bases: sqlalchemy.ext.declarative.api.Base

generate_mapping_table(egon_map_zensus_buildings_residential_synth, egon_hh_profile_in_zensus_cell)

Generate a mapping table for hh profiles to buildings.

All hh demand profiles are randomly assigned to buildings within the same census cell.

  • profiles > buildings: buildings can have multiple profiles but every
    building gets at least one profile
  • profiles < buildings: not every building gets a profile

Parameters:

  • egon_map_zensus_buildings_residential_synth (pd.DataFrame) – Table with OSM and synthetic buildings ids per census cell
  • egon_hh_profile_in_zensus_cell (pd.DataFrame) – Table mapping hh demand profiles to census cells

Returns:

pd.DataFrame – Table with mapping of profile ids to buildings with OSM ids
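The two cases above can be illustrated with a minimal sketch for a single census cell. It uses plain Python lists instead of DataFrames, and the function name `assign_profiles_to_buildings` is hypothetical, not part of the module. It also assumes that the cell contains at least one building, which the synthetic buildings are meant to guarantee.

```python
import random


def assign_profiles_to_buildings(profile_ids, building_ids, rng):
    """Randomly map every profile of one census cell to a building.

    Hypothetical helper for illustration; assumes unique profile ids
    and at least one building in the cell.
    """
    buildings = list(building_ids)
    rng.shuffle(buildings)

    # zip() stops at the shorter sequence: the first profiles each get
    # a distinct building, so no building is used twice at this stage.
    mapping = dict(zip(profile_ids, buildings))

    # If there are more profiles than buildings, the remaining profiles
    # are drawn with replacement, creating multi-household buildings.
    for profile_id in profile_ids[len(buildings):]:
        mapping[profile_id] = rng.choice(buildings)

    return mapping


rng = random.Random(42)
more_profiles = assign_profiles_to_buildings(
    ["p1", "p2", "p3", "p4", "p5"], [10, 11], rng
)
more_buildings = assign_profiles_to_buildings(["p1"], [10, 11, 12], rng)
```

In the first call every building receives at least one profile; in the second call two of the three buildings stay without a profile.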

generate_synthetic_buildings(missing_buildings, edge_length)

Generate synthetic square buildings in census cells for every entry in missing_buildings.

Generate randomly placed synthetic buildings including geometry data within the bounds of the census cell. Each building has a square footprint with an area of edge_length^2.

Parameters:

  • missing_buildings (pd.Series or pd.DataFrame) – Table with cell_ids and building number
  • edge_length (int) – Edge length of square synthetic building in meters

Returns:

pd.DataFrame – Table with generated synthetic buildings, area, cell_id and geom data
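A rough sketch of the random placement, using only the standard library. The helper names, the dictionary inputs and the cell bounds are hypothetical. The sketch keeps the whole footprint inside the cell, which is an assumption: the description only states that buildings are placed within the bounds of the census cell.

```python
import random


def synthetic_square(cell_bounds, edge_length, rng):
    """Return one square footprint placed randomly inside a cell.

    cell_bounds is (xmin, ymin, xmax, ymax) in meters (assumed layout).
    """
    xmin, ymin, xmax, ymax = cell_bounds
    half = edge_length / 2
    # Draw the centre so that the full square stays inside the cell.
    cx = rng.uniform(xmin + half, xmax - half)
    cy = rng.uniform(ymin + half, ymax - half)
    corners = [
        (cx - half, cy - half),
        (cx + half, cy - half),
        (cx + half, cy + half),
        (cx - half, cy + half),
    ]
    return {"geom": corners, "area": edge_length**2}


def generate_for_cells(missing_buildings, cell_bounds, edge_length, seed=0):
    """Create one row per missing building, keyed by cell_id."""
    rng = random.Random(seed)
    rows = []
    for cell_id, count in missing_buildings.items():
        for _ in range(count):
            square = synthetic_square(cell_bounds[cell_id], edge_length, rng)
            rows.append({"cell_id": cell_id, **square})
    return rows


rows = generate_for_cells(
    missing_buildings={1: 2, 2: 1},
    cell_bounds={1: (0, 0, 100, 100), 2: (100, 0, 200, 100)},
    edge_length=5,
)
```

With an edge length of 5 m every generated row carries an area of 25 m², matching the 5x5m buildings mentioned in the module description.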


Peak loads of buildings are determined.

The timeseries of every building are summed up, the maximum value is determined, and the result is scaled with the respective NUTS3 factor for the 2035 and 2050 scenarios.


In test mode ‘SH’ the iteration takes place by ‘cell_id’ to avoid intensive RAM usage. For the whole of Germany the iteration takes place by ‘nuts3’ and more than 32 GB of RAM is necessary.
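The peak load calculation described above can be sketched as follows. The function name `building_peak_load`, the toy profiles and the factor values are made up for illustration; the real NUTS3 factors come from the database.

```python
def building_peak_load(profiles, scaling_factor):
    """Sum all profiles of one building per time step and scale the maximum."""
    # zip(*profiles) groups the values of all profiles by time step.
    total = [sum(values) for values in zip(*profiles)]
    return max(total) * scaling_factor


# Two household profiles assigned to the same building (three time steps).
profiles = [
    [1.0, 3.0, 2.0],
    [2.0, 1.0, 4.0],
]

# Hypothetical scaling factors per scenario.
factors = {"2035": 0.5, "2050": 2.0}

peak_loads = {
    scenario: building_peak_load(profiles, factor)
    for scenario, factor in factors.items()
}
# peak_loads == {"2035": 3.0, "2050": 12.0}
```

The summed series is [3.0, 4.0, 6.0], so the unscaled peak of 6.0 is taken from the last time step even though neither single profile peaks at that value alone.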


Census hh demand profiles are assigned to buildings via OSM ids. If no OSM ids are available, synthetic buildings are generated. A list of the generated buildings and supplementary data as well as the mapping table is stored in the db.

schema: openstreetmap tablename: osm_buildings_synthetic
schema: demand tablename: egon_household_electricity_profile_of_buildings


match_osm_and_zensus_data(egon_hh_profile_in_zensus_cell, egon_map_zensus_buildings_residential)

Compares OSM buildings and census hh demand profiles.

OSM building data and hh demand profiles based on census data are compared. Census cells with only profiles but no OSM ids are identified to generate synthetic buildings. The census building count is used, if available, to define the number of missing buildings. Otherwise, the overall mean profile/building rate is used to derive the number of buildings from the number of already generated demand profiles.

Parameters:

  • egon_hh_profile_in_zensus_cell (pd.DataFrame) – Table mapping hh demand profiles to census cells
  • egon_map_zensus_buildings_residential (pd.DataFrame) – Table with buildings osm-id and cell_id

Returns:

pd.DataFrame – Table with cell_ids and number of missing buildings
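The rule for the number of missing buildings in a cell without OSM ids can be written as a small sketch. The function and parameter names are hypothetical, and a missing census building count is represented by `None` here, which is an assumption about the data layout.

```python
import math


def missing_building_count(
    n_profiles, census_building_count, mean_profiles_per_building
):
    """Number of synthetic buildings to generate for one census cell."""
    # Prefer the census building count whenever it is available.
    if census_building_count is not None:
        return census_building_count
    # Otherwise apply the overall mean rate and round up, so that a cell
    # with at least one profile never ends up with 0 buildings.
    return math.ceil(n_profiles / mean_profiles_per_building)


with_census = missing_building_count(3, 2, 1.5)        # 2
single_profile = missing_building_count(1, None, 2.3)  # ceil(0.43) = 1
several = missing_building_count(7, None, 2.0)         # ceil(3.5) = 4
```

The second call shows why rounding up is used: plain rounding of 0.43 would yield 0 buildings and leave the profile without a building.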

reduce_synthetic_buildings(mapping_profiles_to_buildings, synthetic_buildings)

Reduce the list of synthetic buildings to the number actually used.

Not all synthetic buildings are used, because the randomised assignment draws buildings with replacement. Ids are adapted to a continuous number sequence following openstreetmap.osm_buildings.
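One possible reading of this step is sketched below. All names are hypothetical, and the sketch assumes that "following openstreetmap.osm_buildings" means the new ids start directly after the largest existing OSM building id; the actual numbering scheme may differ.

```python
def reduce_and_renumber(mapping, synthetic_ids, max_osm_id):
    """Keep only used synthetic buildings and give them continuous ids.

    mapping: profile id -> building id (OSM or synthetic).
    synthetic_ids: ids of all generated synthetic buildings.
    max_osm_id: largest id in the OSM building table (assumed start point).
    """
    # Synthetic buildings that received at least one profile.
    used = sorted(set(mapping.values()) & set(synthetic_ids))

    # Continuous sequence that continues after the last OSM id.
    new_ids = {old: max_osm_id + i for i, old in enumerate(used, start=1)}

    # OSM buildings keep their id, synthetic ones get the new id.
    new_mapping = {
        profile: new_ids.get(building, building)
        for profile, building in mapping.items()
    }
    return new_ids, new_mapping


new_ids, new_mapping = reduce_and_renumber(
    mapping={"p1": 10, "p2": "s1", "p3": "s3"},
    synthetic_ids=["s1", "s2", "s3"],
    max_osm_id=10,
)
# new_ids == {"s1": 11, "s3": 12}; the unused building "s2" is dropped.
```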