Revolutionising Energy Planning: How Synthetic Time Series Generative AI Powers Forward-Thinking Development
- Dr. Daniel Dimanov

- May 1, 2024
- 5 min read
In the rapidly evolving landscape of urban development and energy management, the need for robust, forward-thinking planning tools has never been more critical. Traditional energy forecasting methods, heavily reliant on historical consumption data, are increasingly challenged by the dynamic nature of urban growth and technological innovation. These methods often struggle to adapt to new scenarios or account for unprecedented changes, leaving planners and policymakers grappling with incomplete or outdated information.
As new areas develop and existing networks undergo significant changes, the ability to anticipate and effectively manage energy demand becomes a cornerstone of sustainable development. Current state-of-the-art forecasting techniques, while helpful, typically extend past data trends into the future, assuming that what has been will continue to be.
This approach can miss the mark in today’s fast-changing world, where demographic shifts, economic fluctuations, and new technologies frequently disrupt old patterns.
Enter the innovative realm of generative artificial intelligence (Gen AI), particularly time series generative adversarial networks (GANs). These advanced models offer more than just predictions; they serve as tools for generating new, synthetic time series data from a variety of inputs. Whilst the example provided utilises random noise and demographic insights from census data to generate half-hourly load measurements of low-voltage feeders, it is important to note that this is just one potential application. In reality, the generation of synthetic data would typically incorporate a broader range of data types, not just demographic information. Although demographic data are correlated with energy usage patterns, they represent only part of the complex factors influencing energy consumption. This enhanced capability not only deepens our understanding of potential future scenarios but also significantly improves data-driven decision-making across multiple sectors:
Proactive Energy Planning
By generating realistic energy consumption profiles for new developments, planners can design more effective infrastructure and adjust to planned network changes with greater confidence.
Scenario Testing
In the spirit of digital twins—as highlighted at Nvidia's 2024 GTC—synthetic data generation allows for extensive scenario testing. Planners can simulate various 'what-if' conditions to evaluate potential impacts on energy systems before they occur, reducing risks and improving system resilience.
Data Imputation
Synthetic time series data can fill gaps in incomplete datasets, ensuring that analyses and forecasts are based on comprehensive information sets.
Research and Development
Because synthetic datasets created by time series GANs retain the statistical integrity of real data without exposing it, they boost accessibility for researchers, paving the way for innovations in energy management.
Extension of Limited Data Samples
When actual data is scarce, synthetic time series generated by GANs can be used as a foundation, subsequently extended using state-of-the-art forecasting methods.
Expanding Training Datasets through Synthetic Data
By generating synthetic data, GANs enable us to vastly increase the amount of data available for training machine learning algorithms. This expansion is crucial for improving the accuracy and robustness of predictive models, especially in scenarios where real data is scarce or costly to obtain.
Enhancing Data Fairness by Addressing Bias
GANs can also play a significant role in mitigating data bias by effectively upsampling under-represented groups in datasets. This approach helps create a more balanced data environment, which is essential for reducing biases in machine learning models and ensuring fairer outcomes in algorithmic decision-making.
As we delve deeper into the capabilities and applications of time series GANs, it becomes clear that these tools are not just another step in the evolution of forecasting—they are redefining the very approach to energy management and planning in an increasingly uncertain world.
Unpacking Time Series Generative Adversarial Networks
The Core of GANs
Generative Adversarial Networks (GANs) represent a powerful class of AI models designed to generate new data that mimics real data. At its core, a GAN consists of two main components engaged in a continuous game: the generator, which creates data, and the discriminator, which evaluates it. This setup enables the model to improve iteratively, with the generator aiming to produce increasingly realistic data, and the discriminator becoming better at distinguishing the fake from the real.
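To make this adversarial game concrete, below is a minimal, self-contained sketch in PyTorch. The toy data and network sizes are purely illustrative assumptions and bear no relation to the energy models discussed later; the point is simply the alternating generator/discriminator updates.

import torch
import torch.nn as nn

# Toy "real" data: 16-dimensional samples drawn around 0.5.
real_batch = lambda n: 0.5 + 0.1 * torch.randn(n, 16)

noise_dim, data_dim = 8, 16
G = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    # Discriminator update: real samples labelled 1, generated samples labelled 0.
    real = real_batch(64)
    fake = G(torch.randn(64, noise_dim)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: try to make the discriminator label the fakes as real.
    fake = G(torch.randn(64, noise_dim))
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()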
Adapting GANs for Time Series Data
GANs are most commonly deployed in computer vision, where they are designed to generate and discriminate among images. Adapting GANs to generate time series data introduces specific challenges and considerations. Unlike static images or isolated data points, time series data involves sequences that are temporally correlated. Handling these dynamics requires modifications to the traditional GAN architecture:
Temporal Consistency
The generator must not only produce statistically realistic data points but also ensure these points follow logical temporal progressions. This often involves integrating recurrent neural networks (RNNs) or Long Short-Term Memory (LSTM) networks into the generator.
Feature Integration
Time series GANs must effectively incorporate both time-dependent features (such as past energy consumption levels) and static attributes (like demographic data). This dual-input model ensures that the generated sequences respect both the temporal dynamics and the context provided by static attributes.
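As an illustration of both points, the hedged sketch below shows one way a generator could combine static attributes with per-step noise inside an LSTM. The architecture, dimensions, and output activation are assumptions for illustration only, not the model used in our experiments.

import torch
import torch.nn as nn

class ConditionalSeriesGenerator(nn.Module):
    """Sketch: generate a sequence conditioned on a vector of static attributes.

    At every time step the LSTM sees the (fixed) attribute vector concatenated
    with a fresh noise vector, so the output respects both the temporal
    dynamics learnt by the LSTM and the static context.
    """
    def __init__(self, attr_dim, noise_dim, hidden_dim=64):
        super().__init__()
        self.noise_dim = noise_dim
        self.lstm = nn.LSTM(attr_dim + noise_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)   # one consumption value per step

    def forward(self, attrs, seq_len):
        batch = attrs.size(0)
        # Repeat the static attributes across all time steps and add per-step noise.
        attrs_rep = attrs.unsqueeze(1).expand(batch, seq_len, attrs.size(1))
        noise = torch.randn(batch, seq_len, self.noise_dim)
        out, _ = self.lstm(torch.cat([attrs_rep, noise], dim=-1))
        return torch.sigmoid(self.head(out)).squeeze(-1)   # values in [0, 1]

# Example: 128 static attributes -> 48 half-hourly values for one day.
gen = ConditionalSeriesGenerator(attr_dim=128, noise_dim=8)
series = gen(torch.rand(4, 128), seq_len=48)   # shape (4, 48)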
Evaluation Metrics
To train a time-series GAN effectively, the right evaluation metrics need to be chosen. These metrics should address fidelity and diversity, use an appropriate discriminator loss, and include an adapted Inception Score to capture the clarity of the generated samples. Here, we follow the methodology described in https://arxiv.org/pdf/1909.13403 and use the following metrics:
Mean Absolute Error (MAE)
This metric measures the average magnitude of the errors in a set of predictions, without considering their direction. It's particularly useful for quantifying the error in predictions of continuous variables.
Two-Sample Tests
These are applied to assess whether the distributions of the real and generated data are statistically different. Common tests include the Kolmogorov-Smirnov test, which compares the cumulative distributions, and the MMD (Maximum Mean Discrepancy), which effectively measures the distance between the sample means in a high-dimensional space.
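For concreteness, here is one way the Kolmogorov-Smirnov test and an RBF-kernel MMD could be computed with SciPy and NumPy. Flattening the profiles for the KS test and the kernel bandwidth are illustrative choices, not necessarily those of the cited methodology.

import numpy as np
from scipy.stats import ks_2samp

def rbf_mmd(x, y, gamma=1.0):
    """Biased estimate of squared MMD with an RBF kernel (adequate for comparisons)."""
    def k(a, b):
        d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
        return np.exp(-gamma * d2)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

real = np.random.rand(200, 48)    # stand-in: 200 real daily profiles
synth = np.random.rand(200, 48)   # stand-in: 200 synthetic daily profiles

ks_stat, p_value = ks_2samp(real.ravel(), synth.ravel())  # KS on pooled values
mmd2 = rbf_mmd(real, synth)                               # MMD on whole daily profiles
print(ks_stat, p_value, mmd2)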
Precision and Recall
In the context of GANs, precision measures how many of the generated samples are realistic, while recall assesses how well the generated samples capture the characteristics of the real data. These metrics can be adapted to time-series data by considering each time-series sequence as a sample.
Diversity Score
This assesses the variety within the generated samples to ensure they are not merely replicating a subset of the training data. This can be quantified using entropy measures or more sophisticated clustering and variability assessments.
Inception Score (IS)
Originally designed for images, the Inception Score can be adapted for time-series data to measure both the diversity of the data generated by the model and how well the generated data are classified compared to real data when using a pre-trained classifier.
Fréchet Inception Distance (FID)
Measures the distance between feature vectors calculated for real and generated samples. For time-series, features can be extracted using a model pre-trained on a related task or the same model architecture trained on real data.
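Below is a hedged sketch of the Fréchet distance calculation, assuming feature vectors have already been extracted by some encoder; the random arrays merely stand in for those features.

import numpy as np
from scipy import linalg

def frechet_distance(feat_real, feat_synth):
    """Fréchet distance between Gaussian fits of two sets of feature vectors (as in FID)."""
    mu_r, mu_s = feat_real.mean(0), feat_synth.mean(0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_s = np.cov(feat_synth, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_s, disp=False)
    covmean = covmean.real   # discard tiny imaginary parts from numerical error
    return np.sum((mu_r - mu_s) ** 2) + np.trace(cov_r + cov_s - 2 * covmean)

# Stand-in features, e.g. hidden states of an encoder trained on real load data.
fid = frechet_distance(np.random.rand(500, 32), np.random.rand(500, 32))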
Enhancing Realism and Reliability
Time series GANs incorporate several techniques to enhance the realism and reliability of the generated data. The one we are most interested in is conditional generation. By conditioning the generation process on given attributes or past data, time series GANs can tailor the synthetic sequences to specific scenarios or conditions. This, however, complicates training, so advanced loss functions are needed, including ones that measure temporal dependencies or multi-dimensional accuracy; these help fine-tune the generator and discriminator and enhance the overall quality of the synthetic data.
Applications in Energy Management
In the context of energy management, time series GANs can transform how utilities and planners forecast demand, plan infrastructure, and test scenarios. These tools enable a proactive approach by allowing stakeholders to visualise the impacts of various hypothetical conditions and make informed decisions based on comprehensive, realistic data simulations.
As we explore the capabilities of time series GANs, it becomes evident why their application is particularly revolutionary in fields like energy management. However, one model, DoppelGANger, stands out by pushing the boundaries further, optimising the generation process to handle an even broader array of inputs and complexities. In the next section, we will delve into how DoppelGANger specifically adapts and advances these principles to provide unprecedented tools for scenario planning and data synthesis.
DoppelGANger: A Closer Look at Its Innovative Architecture
Architectural Overview
DoppelGANger is structured around two main components: the Attribute Generator and the Time Series Generator, each paired with its respective discriminator. The Attribute Generator is designed to produce categorical and continuous attributes, while the Time Series Generator focuses on generating corresponding time series data based on the attributes produced.
Attribute Generator
The Attribute Generator uses a feedforward neural network to output a set of attributes. This network takes a random noise vector as input and transforms it through multiple layers to generate a vector of attributes. These attributes are then passed to the Time Series Generator as conditional inputs, influencing the generation of time series data.
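A minimal sketch of such a feedforward attribute generator follows; the layer sizes and output activation are illustrative assumptions rather than the reference implementation.

import torch
import torch.nn as nn

class AttributeGenerator(nn.Module):
    """Map a noise vector to a vector of normalised attributes in [0, 1]."""
    def __init__(self, noise_dim=8, attr_dim=128, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, attr_dim), nn.Sigmoid(),   # attributes scaled to [0, 1]
        )

    def forward(self, z_attr):
        return self.net(z_attr)

attrs = AttributeGenerator()(torch.randn(4, 8))   # 4 synthetic attribute vectors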
Time Series Generator
The Time Series Generator in DoppelGANger is more complex due to the sequential nature of its output. It starts with a combination of random noise and the output from the Attribute Generator. This combined input is processed through a series of recurrent neural network layers (specifically, LSTM units), which are well-suited for handling sequential data due to their ability to maintain information across time steps. The LSTM layers are designed to progressively refine the generated time series, ensuring that it aligns with the conditioned attributes and mimics the statistical properties of real-world data.
Discriminators
Each generator (Attribute and Time Series) has a corresponding discriminator (feedforward for attributes and recurrent neural network for time series). These discriminators are also neural networks trained to distinguish between real data (from the training dataset) and synthetic data (from the generators). The Attribute Discriminator evaluates the realism of the generated attributes, while the Time Series Discriminator assesses the authenticity of the generated time series data.
Training Process
The training of DoppelGANger follows the standard GAN training regime but with additional complexity due to its dual nature. The generators and discriminators are trained simultaneously in an adversarial manner: the generators try to produce data that is indistinguishable from real data, fooling the discriminators, while the discriminators strive to become better at telling fake data from real data. Training therefore alternates between updating the discriminators on real and generated data, and updating the generators based on the discriminators' feedback. What differentiates DoppelGANger from a standard GAN is its combination of loss functions: an adversarial loss (how well the discriminator is fooled by the synthetic data), a reconstruction loss (how well the generator can reconstruct the original sequence) and, most importantly, an attribute consistency loss, which ensures that the generated time series data are consistent with the input attributes.
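To make that combination concrete, below is a schematic sketch (not the reference DoppelGANger code) of how a generator loss could combine the adversarial, reconstruction, and attribute-consistency terms described above; the helper names, inputs, and weights are illustrative assumptions.

import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()

def generator_loss(d_ts_fake_logits, x_fake, x_real, a_recovered, a_input,
                   w_adv=1.0, w_rec=1.0, w_attr=1.0):
    """Schematic combined generator loss (terms and weights are illustrative).

    d_ts_fake_logits : discriminator scores for the synthetic sequences
    x_fake, x_real   : synthetic and paired real sequences (reconstruction-style term)
    a_recovered      : attributes estimated back from the synthetic sequence
    a_input          : attributes the generation was conditioned on
    """
    adv = bce(d_ts_fake_logits, torch.ones_like(d_ts_fake_logits))  # fool the discriminator
    rec = mse(x_fake, x_real)                                       # stay close to the real sequence
    attr = mse(a_recovered, a_input)                                # respect the conditioning attributes
    return w_adv * adv + w_rec * rec + w_attr * attr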
Mathematical Problem Definition
Let's explore the process in mathematical terms. Let $G_{\mathrm{attr}}$ and $G_{\mathrm{ts}}$ represent the Attribute Generator and Time Series Generator, respectively, and let $D_{\mathrm{attr}}$ and $D_{\mathrm{ts}}$ represent the corresponding discriminators. The input to $G_{\mathrm{attr}}$ is a noise vector $z_{\mathrm{attr}}$, and the output is a set of attributes $a$. The input to $G_{\mathrm{ts}}$ includes a noise vector $z_{\mathrm{ts}}$ and the generated attributes $a$, and it outputs a time series $x$.
The objective function for the GAN setup in DoppelGANger can be represented as follows:
$$\min_{G_{\mathrm{attr}},\,G_{\mathrm{ts}}}\;\max_{D_{\mathrm{attr}},\,D_{\mathrm{ts}}} V(D_{\mathrm{attr}}, D_{\mathrm{ts}}, G_{\mathrm{attr}}, G_{\mathrm{ts}}) = \mathbb{E}_{a,x\sim p_{\mathrm{data}}}\big[\log D_{\mathrm{attr}}(a) + \log D_{\mathrm{ts}}(x)\big] + \mathbb{E}_{z_{\mathrm{attr}},z_{\mathrm{ts}}\sim p_z}\big[\log\big(1 - D_{\mathrm{attr}}(G_{\mathrm{attr}}(z_{\mathrm{attr}}))\big) + \log\big(1 - D_{\mathrm{ts}}(G_{\mathrm{ts}}(z_{\mathrm{ts}}, G_{\mathrm{attr}}(z_{\mathrm{attr}})))\big)\big]$$
Importantly, the term $G_{\mathrm{ts}}(z_{\mathrm{ts}}, a)$ indicates that the generation of the time series $x$ by $G_{\mathrm{ts}}$ is conditioned on the attributes $a$ generated by $G_{\mathrm{attr}}$. This conditioning ensures that the time series data correspond appropriately to the given attributes.
Key Hyperparameters and Their Influence
Learning Rate: Controls how quickly the model updates its weights in response to the error it predicts. A lower learning rate may lead to more stable convergence but slower training.
Batch Size: Affects the gradient estimation each step of training, with larger batch sizes generally providing more stable but slower updates.
Number of Epochs: Determines how many times the training dataset is passed through the network. More epochs can lead to better learning but increase the risk of overfitting.
Noise Dimensions $z_{\mathrm{attr}}$ and $z_{\mathrm{ts}}$: The size of the noise vectors directly influences the diversity of the generated attributes and time series. Larger dimensions can capture more complex patterns but require more model capacity to manage effectively.
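For reference, such settings are often gathered into a single configuration object; the values below are illustrative defaults only, not the settings used in our experiments.

# Illustrative hyperparameter set for a DoppelGANger-style model (values are examples only).
config = {
    "learning_rate": 1e-3,      # lower values: more stable but slower convergence
    "batch_size": 64,           # larger batches: smoother but slower gradient updates
    "epochs": 400,              # more passes over the data, at the risk of overfitting
    "attr_noise_dim": 8,        # size of z_attr
    "ts_noise_dim": 8,          # size of z_ts
    "max_sequence_len": 1488,   # 48 half-hours x 31 days
}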
Next, we present a feasibility study using an example use case to show the potential of such approaches. In practice, such systems would incorporate far more data, refinement, and data sources. For our trial we do not use features such as weekdays/weekends, temperature data, sunlight data, or specific property data (age/efficiency, size, occupancy, gas heating, Economy 7 tariffs), among many others. We simply want to try out the approach with a simple experiment to evaluate its potential.
Use Case: Generating the Energy Profile of an LV Feeder Based on Demographics
In order to test the utility of generating realistic energy profiles based on demographics, we used several different data sources and merged them together. We will first go through the data sources we used, then discuss how we handled, processed, combined, and fed the data into DoppelGANger, and finally present our results from the small-scale experiments we conducted.
Data Sources
First, we acquired two different datasets from UK Power Networks. The first details usage and other information about secondary substations in the UK for January 2024, while the second contains smart-meter energy readings at the LV feeder level.
The Secondary Substation data contains information about the functional location of the substations, voltage, number of transformers, customer count, the postcode of the substation, and many more useful features.
The LV Feeder Smart Meter dataset has half-hourly readings from 1,988 unique LV feeders associated with different secondary substations. While the data details both the import and export of energy, for the purpose of this experiment we only use the total consumption and the secondary substation ID to associate the smart meter readings with the substations.
We can then use the attributes from the substation data to predict the time series features (half-hourly consumption for the whole of January). The problem is that the attributes in the substation dataset are not especially informative, so instead we cross-referenced the postcodes of the secondary substations with the corresponding Census records obtained from the Office for National Statistics (ons.gov.uk) website. Using this data, we built a comprehensive postcode-to-demographics lookup for the postcodes covered in the dataset. This new, attribute-rich dataset, containing key attributes about the secondary substations together with the relevant demographic attributes, constitutes our 128 total attributes.
Data Processing
To make the dataset compatible with DoppelGANger, several preprocessing steps need to be applied. First, after the required postcode attributes are collected, they need to be aggregated and added to the secondary substation records, and, because of DoppelGANger's requirements, they all need to be represented as a number between 0 and 1. Here is how we normalise the Census data features, for those of you looking to reproduce our experiments:
Attribute 0 to 1 normalisation
# `df` is assumed to hold the Census summary for one postcode: the category
# label is in the 'Category' column, the local figure in the third column
# (df.columns[2]) and the UK-wide figure in the fourth (df.columns[3]).
data = {
# Population
'Population': df[df['Category'] == 'Population'][df.columns[2]].iloc[0] / 100,
'Number of households': float(df[df['Category'] == 'Total households'][df.columns[2]].iloc[0]) / float(df[df['Category'] == 'Total households'][df.columns[3]].iloc[0]) * 1000,
'People 0 to 14': df[df['Category'].isin(['Aged 0 to 4', 'Aged 5 to 9', 'Aged 10 to 14'])][df.columns[2]].sum() / 100,
'People 15 to 19': df[df['Category'] == 'Aged 15 to 19'][df.columns[2]].sum() / 100,
'People 20 to 24': df[df['Category'] == 'Aged 20 to 24'][df.columns[2]].sum() / 100,
'People 25 to 29': df[df['Category'] == 'Aged 25 to 29'][df.columns[2]].sum() / 100,
'People 30 to 55': df[df['Category'].isin(['Aged 30 to 34', 'Aged 35 to 39', 'Aged 40 to 44', 'Aged 45 to 49', 'Aged 50 to 54'])][df.columns[2]].sum() / 100,
'People 55 to 70': df[df['Category'].isin(['Aged 55 to 59', 'Aged 60 to 64', 'Aged 65 to 69'])][df.columns[2]].sum() / 100,
'People 70+': df[df['Category'].isin(['Aged 70 to 74', 'Aged 75 to 79', 'Aged 80 to 84', 'Aged 85 and over'])][df.columns[2]].sum() / 100,
'Male%': df[df['Category'] == 'Male'][df.columns[2]].iloc[0] / 100,
'Single %': df[df['Category'] == 'Never married and never registered a civil partnership'][df.columns[2]].iloc[0] / 100,
'Born in UK %': df[df['Category'] == 'Born in the UK'][df.columns[2]].iloc[0] / 100,
'Household size 1-person%': df[df['Category'] == '1 person in household'][df.columns[2]].iloc[0] / 100,
'Household size 2-people %': df[df['Category'] == '2 people in household'][df.columns[2]].iloc[0] / 100,
'Household size 3-people %': df[df['Category'] == '3 people in household'][df.columns[2]].iloc[0] / 100,
'Household size 4+people%': df[df['Category'] == '4 or more people in household'][df.columns[2]].iloc[0] / 100,
'Household is not deprived in any dimension%': df[df['Category'] == 'Household is not deprived in any dimension'][df.columns[2]].iloc[0] / 100,
'Household is deprived in one dimension%': df[df['Category'] == 'Household is deprived in one dimension'][df.columns[2]].iloc[0] / 100,
'Household is deprived in two dimensions%': df[df['Category'] == 'Household is deprived in two dimensions'][df.columns[2]].iloc[0] / 100,
'Household is deprived in three dimensions%': df[df['Category'] == 'Household is deprived in three dimensions'][df.columns[2]].iloc[0] / 100,
'Household is deprived in four dimensions%': df[df['Category'] == 'Household is deprived in four dimensions'][df.columns[2]].iloc[0] / 100,
'Ethnic Asian%': df[df['Category'] == 'Asian, Asian British or Asian Welsh'][df.columns[2]].iloc[0] / 100,
'Ethnic Black%': df[df['Category'] == 'Black, Black British, Black Welsh, Caribbean or African'][df.columns[2]].iloc[0] / 100,
'Ethnic Mixed%': df[df['Category'] == 'Mixed or Multiple ethnic groups'][df.columns[2]].iloc[0] / 100,
'Ethnic White%': df[df['Category'] == 'White'][df.columns[2]].iloc[0] / 100,
'Ethnic Other%': df[df['Category'] == 'Other ethnic group'][df.columns[2]].iloc[0] / 100,
'No Religion%': df[df['Category'] == 'No religion'][df.columns[2]].iloc[0] / 100,
'Christian %': df[df['Category'] == 'Christian'][df.columns[2]].iloc[0] / 100,
'Buddhist %': df[df['Category'] == 'Buddhist'][df.columns[2]].iloc[0] / 100,
'Hindu%': df[df['Category'] == 'Hindu'][df.columns[2]].iloc[0] / 100,
'Jewish%': df[df['Category'] == 'Jewish'][df.columns[2]].iloc[0] / 100,
'Muslim%': df[df['Category'] == 'Muslim'][df.columns[2]].iloc[0] / 100,
'Sikh %': df[df['Category'] == 'Sikh'][df.columns[2]].iloc[0] / 100,
'Other%': df[df['Category'] == 'Other religion'][df.columns[2]].iloc[0] / 100,
'Not answered%': df[df['Category'] == 'Not answered'][df.columns[2]].iloc[0] / 100,
'Very good health %': df[df['Category'] == 'Very good health'][df.columns[2]].iloc[0] / 100,
'Good health%': df[df['Category'] == 'Good health'][df.columns[2]].iloc[0] / 100,
'Fair health%': df[df['Category'] == 'Fair health'][df.columns[2]].iloc[0] / 100,
'Bad health%': df[df['Category'] == 'Bad health'][df.columns[2]].iloc[0] / 100,
'Very bad health%': df[df['Category'] == 'Very bad health'][df.columns[2]].iloc[0] / 100,
'Disabled under the Equality Act %': df[df['Category'] == 'Disabled under the Equality Act'][df.columns[2]].iloc[0] / 100,
'Not disabled under the Equality Act %': df[df['Category'] == 'Not disabled under the Equality Act'][df.columns[2]].iloc[0] / 100,
'Provides no unpaid care%': df[df['Category'] == 'Provides no unpaid care'][df.columns[2]].iloc[0] / 100,
'Provides 19 hours or less unpaid care a week%': df[df['Category'] == 'Provides 19 hours or less unpaid care a week'][df.columns[2]].iloc[0] / 100,
'Provides 20 to 49 hours unpaid care a week%': df[df['Category'] == 'Provides 20 to 49 hours unpaid care a week'][df.columns[2]].iloc[0] / 100,
'Provides 50 or more hours unpaid care a week%': df[df['Category'] == 'Provides 50 or more hours unpaid care a week'][df.columns[2]].iloc[0] / 100,
'Can speak English well%': df[df['Category'].isin(['Can speak English very well', 'Can speak English well'])][df.columns[2]].sum() / 100,
'Cannot speak English well%': df[df['Category'].isin(['Cannot speak English well', 'Cannot speak English'])][df.columns[2]].sum() / 100,
'Whole house or bungalow%': df[df['Category'] == 'Whole house or bungalow'][df.columns[2]].iloc[0] / 100,
'Flat, maisonette or apartment%': df[df['Category'] == 'Flat, maisonette or apartment'][df.columns[2]].iloc[0] / 100,
'A caravan or other mobile or temporary structure%': df[df['Category'] == 'A caravan or other mobile or temporary structure'][df.columns[2]].iloc[0] / 100,
'No cars or vans in household%': df[df['Category'] == 'No cars or vans in household'][df.columns[2]].iloc[0] / 100,
'1 car or van in household%': df[df['Category'] == '1 car or van in household'][df.columns[2]].iloc[0] / 100,
'2 cars or vans in household%': df[df['Category'] == '2 cars or vans in household'][df.columns[2]].iloc[0] / 100,
'3 or more cars or vans in household%': df[df['Category'] == '3 or more cars or vans in household'][df.columns[2]].iloc[0] / 100,
'Does have central heating%': df[df['Category'] == 'Does have central heating'][df.columns[2]].iloc[0] / 100,
'1 bedroom%': df[df['Category'] == '1 bedroom'][df.columns[2]].iloc[0] / 100,
'2 bedrooms%': df[df['Category'] == '2 bedrooms'][df.columns[2]].iloc[0] / 100,
'3 bedrooms%': df[df['Category'] == '3 bedrooms'][df.columns[2]].iloc[0] / 100,
'4+ bedrooms%': df[df['Category'] == '4 or more bedrooms'][df.columns[2]].iloc[0] / 100,
'Occupancy rating +2 %': df[df['Category'] == '+2 or more'][df.columns[2]].iloc[0] / 100,
'Occupancy rating +1 %': df[df['Category'] == '+1'][df.columns[2]].iloc[0] / 100,
'Occupancy rating 0 %': df[df['Category'] == '0'][df.columns[2]].iloc[0] / 100,
'Occupancy rating -1 %': df[df['Category'] == '-1'][df.columns[2]].iloc[0] / 100,
'Occupancy rating -2 %': df[df['Category'] == '-2 or less'][df.columns[2]].iloc[0] / 100,
'Owns outright%': df[df['Category'] == 'Owns outright'][df.columns[2]].iloc[0] / 100,
'Owns with a mortgage or loan or shared ownership%': df[df['Category'] == 'Owns with a mortgage or loan or shared ownership'][df.columns[2]].iloc[0] / 100,
'Social rented%': df[df['Category'] == 'Social rented'][df.columns[2]].iloc[0] / 100,
'Private rented or lives rent free%': df[df['Category'] == 'Private rented or lives rent free'][df.columns[2]].iloc[0] / 100,
'No second address%': df[df['Category'] == 'No second address'][df.columns[2]].iloc[0] / 100,
'Second address is in the UK%': df[df['Category'] == 'Second address is in the UK'][df.columns[2]].iloc[0] / 100,
'Second address is outside UK %': df[df['Category'] == 'Second address is outside the UK'][df.columns[2]].iloc[0] / 100,
'Less than 10km from work %': df[df['Category'] == 'Less than 10km'][df.columns[2]].iloc[0] / 100,
'10km to less than 30km %': df[df['Category'] == '10km to less than 30km'][df.columns[2]].iloc[0] / 100,
'30km and over%': df[df['Category'] == '30km and over'][df.columns[2]].iloc[0] / 100,
'Distance travelled mainly from home%': df[df['Category'] == 'Works mainly from home'][df.columns[2]].iloc[0] / 100,
'Other working distance %': df[df['Category'] == 'Other'][df.columns[2]].iloc[0] / 100,
'Method of travel - Work mainly at or from home %': df[df['Category'] == 'Work mainly at or from home'][df.columns[2]].iloc[0] / 100,
'Underground, metro, light rail, tram %': df[df['Category'] == 'Underground, metro, light rail, tram'][df.columns[2]].iloc[0] / 100,
'Train%': df[df['Category'] == 'Train'][df.columns[2]].iloc[0] / 100,
'Bus%': df[df['Category'] == 'Bus, minibus or coach'][df.columns[2]].iloc[0] / 100,
'Taxi%': df[df['Category'] == 'Taxi'][df.columns[2]].iloc[0] / 100,
'Motorcycle%': df[df['Category'] == 'Motorcycle, scooter or moped'][df.columns[2]].iloc[0] / 100,
'Driving a car or van%': df[df['Category'] == 'Driving a car or van'][df.columns[2]].iloc[0] / 100,
'Passenger in a car or van%': df[df['Category'] == 'Passenger in a car or van'][df.columns[2]].iloc[0] / 100,
'Bicycle%': df[df['Category'] == 'Bicycle'][df.columns[2]].iloc[0] / 100,
'On foot %': df[df['Category'] == 'On foot'][df.columns[2]].iloc[0] / 100,
'Other method of travel to work %': df[df['Category'] == 'Other method of travel to work'][df.columns[2]].iloc[0] / 100,
'Economically active: In employment%': df[df['Category'] == 'Economically active: In employment'][df.columns[2]].iloc[0] / 100,
'Economically active: Unemployed%': df[df['Category'] == 'Economically active: Unemployed'][df.columns[2]].iloc[0] / 100,
'Economically inactive%': df[df['Category'] == 'Economically inactive'][df.columns[2]].iloc[0] / 100,
'Not in employment: Worked in the last 12 months%': df[df['Category'] == 'Not in employment: Worked in the last 12 months'][df.columns[2]].iloc[0] / 100,
'Not in employment: Not worked in the last 12 months%': df[df['Category'] == 'Not in employment: Not worked in the last 12 months'][df.columns[2]].iloc[0] / 100,
'Not in employment: Never worked%': df[df['Category'] == 'Not in employment: Never worked'][df.columns[2]].iloc[0] / 100,
'Managers, directors and senior officials%': df[df['Category'] == '1. Managers, directors and senior officials'][df.columns[2]].iloc[0] / 100,
'Professional occupations%': df[df['Category'] == '2. Professional occupations'][df.columns[2]].iloc[0] / 100,
'Associate professional and technical occupations%': df[df['Category'] == '3. Associate professional and technical occupations'][df.columns[2]].iloc[0] / 100,
'Administrative and secretarial occupations%': df[df['Category'] == '4. Administrative and secretarial occupations'][df.columns[2]].iloc[0] / 100,
'Skilled trades occupations%': df[df['Category'] == '5. Skilled trades occupations'][df.columns[2]].iloc[0] / 100,
'Caring, leisure and other service occupations%': df[df['Category'] == '6. Caring, leisure and other service occupations'][df.columns[2]].iloc[0] / 100,
'Sales and customer service occupations%': df[df['Category'] == '7. Sales and customer service occupations'][df.columns[2]].iloc[0] / 100,
'Process, plant and machine operatives%': df[df['Category'] == '8. Process, plant and machine operatives'][df.columns[2]].iloc[0] / 100,
'Elementary occupations%': df[df['Category'] == '9. Elementary occupations'][df.columns[2]].iloc[0] / 100,
'L1, L2 and L3: Higher managerial, administrative and professional occupations%': df[df['Category'] == 'L1, L2 and L3: Higher managerial, administrative and professional occupations'][df.columns[2]].iloc[0] / 100,
'L4, L5 and L6: Lower managerial, administrative and professional occupations%': df[df['Category'] == 'L4, L5 and L6: Lower managerial, administrative and professional occupations'][df.columns[2]].iloc[0] / 100,
'L7: Intermediate occupations%': df[df['Category'] == 'L7: Intermediate occupations'][df.columns[2]].iloc[0] / 100,
'L8 and L9: Small employers and own account workers%': df[df['Category'] == 'L8 and L9: Small employers and own account workers'][df.columns[2]].iloc[0] / 100,
'L10 and L11: Lower supervisory and technical occupations%': df[df['Category'] == 'L10 and L11: Lower supervisory and technical occupations'][df.columns[2]].iloc[0] / 100,
'L12: Semi-routine occupations%': df[df['Category'] == 'L12: Semi-routine occupations'][df.columns[2]].iloc[0] / 100,
'L13: Routine occupations%': df[df['Category'] == 'L13: Routine occupations'][df.columns[2]].iloc[0] / 100,
'L14.1 and L14.2: Never worked and long-term unemployed%': df[df['Category'] == 'L14.1 and L14.2: Never worked and long-term unemployed'][df.columns[2]].iloc[0] / 100,
'L15: Full-time students%': df[df['Category'] == 'L15: Full-time students'][df.columns[2]].iloc[0] / 100,
'Hours- Part-time: 15 hours or less worked%': df[df['Category'] == 'Part-time: 15 hours or less worked'][df.columns[2]].iloc[0] / 100,
'Hours-Part-time: 16 to 30 hours worked%': df[df['Category'] == 'Part-time: 16 to 30 hours worked'][df.columns[2]].iloc[0] / 100,
'Hours- Full-time: 31 to 48 hours worked%': df[df['Category'] == 'Full-time: 31 to 48 hours worked'][df.columns[2]].iloc[0] / 100,
'Hours - Full-time: 49 or more hours worked%': df[df['Category'] == 'Full-time: 49 or more hours worked'][df.columns[2]].iloc[0] / 100,
'No qualifications%': df[df['Category'] == 'No qualifications'][df.columns[2]].iloc[0] / 100,
'Level 1,2,3 qualification%': df[df['Category'] == 'Level 1, 2 or 3 qualifications'][df.columns[2]].iloc[0] / 100,
'Apprenticeship%': df[df['Category'] == 'Apprenticeship'][df.columns[2]].iloc[0] / 100,
'Level 4 qualifications and above%': df[df['Category'] == 'Level 4 qualifications and above'][df.columns[2]].iloc[0] / 100,
'Other qualifications%': df[df['Category'] == 'Other qualifications'][df.columns[2]].iloc[0] / 100,
}

After this stage is completed, the data can be merged with the relevant features from the substation attributes collected from UK Power Networks. Then all that is left is to match the correct substation attributes to the corresponding unique smart meter time series and reformat everything according to the data format described in the DoppelGANger GitHub repository (fjxmlzn/DoppelGANger: "Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions", IMC 2020 Best Paper Finalist).
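The join logic itself is straightforward; the sketch below illustrates it with pandas, using hypothetical column names and toy values rather than the actual UK Power Networks or ONS schemas.

import pandas as pd

# Tiny illustrative frames (column names are hypothetical, shown only for the join logic).
substations = pd.DataFrame({
    "substation_id": ["S1", "S2"],
    "postcode": ["AB1 2CD", "EF3 4GH"],
    "customer_count": [120, 85],
})
census_lookup = pd.DataFrame({
    "postcode": ["AB1 2CD", "EF3 4GH"],
    "household_size_1_person_pct": [0.31, 0.22],
    "no_cars_pct": [0.18, 0.09],
})
readings = pd.DataFrame({
    "substation_id": ["S1", "S1", "S2"],
    "half_hour": [0, 1, 0],
    "consumption_wh": [1520.0, 1490.0, 980.0],
})

# Attach demographic attributes to each substation via its postcode,
# then match every feeder reading to its substation's attribute row.
attributes = substations.merge(census_lookup, on="postcode", how="left")
dataset = readings.merge(attributes, on="substation_id", how="inner")
print(dataset.head())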
Choosing the normalisation maximum values is straightforward for any values that represent percentages, but for quantities such as the number of households or the power usage, we either used the UK-wide maximum and expressed the value as a percentage of the UK total, or, in the case of power usage, researched the theoretical maximum values for the particular feature. As with everything we do, we aim for scalability, so even though the maximum consumption is typically expected to be in the range of 10-20 kW, we divide the consumption by 100,000 W to ensure the feature embedding can capture outliers in the total active import consumption.
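A minimal sketch of that scaling step, assuming the 100,000 W ceiling described above:

import numpy as np

MAX_CONSUMPTION_W = 100_000  # scaling ceiling described above

def normalise_consumption(watts):
    """Express raw half-hourly consumption as a fraction of 100,000 W (range 0 to 1)."""
    return np.asarray(watts, dtype=float) / MAX_CONSUMPTION_W

def denormalise_consumption(fraction):
    """Convert model outputs back into watts."""
    return np.asarray(fraction, dtype=float) * MAX_CONSUMPTION_W

print(normalise_consumption([3894.6, 12500.0]))   # -> [0.038946 0.125]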
Early results
Although with the technique described above we only had 1,939 records (LV feeders) with 1,488 feature values each (half-hourly ticks for the 31 days of January), we trained DoppelGANger for a few hundred epochs before reviewing the results. To train it, we used all the normalised attributes and the DoppelGANger noise generators in an attempt to generate the half-hourly energy demand of the LV feeder smart meter data based solely on the attributes. Below is a figure that depicts how the data from our newly trained model compares to the real data after renormalisation. It is worth mentioning that the raw output values are still normalised between 0 and 1 and can be converted back, as they represent a fraction of 100,000 W.
As seen in this figure for one of the generated entries, while the result is not perfect, the synthetic generator has captured most of the trends in energy usage. Since demographics alone are insufficient for accurate estimation, we also looked at which other key attributes could benefit the approach. A key finding was that, even with more attributes, more time-series data is required for accurate estimation throughout the month, and that three additional time-series features could greatly benefit the generation process: daylight hours throughout the month, the temperature forecast for the month, and working days versus holidays or weekends. Although we had a plan for extracting and adding these, we only conducted preliminary experiments to evaluate the feasibility of applying this approach in the energy scenario. Below, we also present a day view of the first day of the month, for which we had full real data. While there are some clear inaccuracies, the correlation between the two time series remains high, and they display the same intraday pattern, predicting the peak energy demand at around the right level.
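The day-level comparison can be quantified with a few lines of NumPy; the arrays below are placeholders for the real and synthetic half-hourly profiles of that day.

import numpy as np

# Hypothetical arrays: 48 half-hourly real vs synthetic values for the same day.
real_day = np.random.rand(48)
synth_day = np.random.rand(48)

# Pearson correlation between the two intraday profiles.
corr = np.corrcoef(real_day, synth_day)[0, 1]

# Simple check on whether the peak lands in roughly the same half-hour slot.
peak_offset_slots = abs(int(real_day.argmax()) - int(synth_day.argmax()))
print(f"correlation={corr:.2f}, peak offset={peak_offset_slots} half-hour slots")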
As discussed in the introduction, the main purpose of these experiments was to determine the feasibility of the generative aspect, which we explore next. First, we modify the attributes of the same record shown in the figures above and simply double the number of households.
As expected, the synthetic energy demand jumps substantially. Interestingly, the mean energy demand with the modified attributes is 6,521.2 Wh, while the original is 3,894.6 Wh, which is not exactly double but rather roughly a 67% increase. The standard deviation has decreased from 1,930.3 Wh to 1,471.3 Wh (~24%), and the maximum has increased to 10,429.1 Wh compared to 8,259.9 Wh (only a ~26% increase).
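For transparency, the percentage changes quoted above follow directly from the reported figures:

# Quick check of the changes quoted above (values in Wh, taken from the text).
mean_orig, mean_mod = 3894.6, 6521.2
std_orig, std_mod = 1930.3, 1471.3
max_orig, max_mod = 8259.9, 10429.1

print(f"mean: {(mean_mod / mean_orig - 1) * 100:+.0f}%")   # ~ +67%
print(f"std:  {(std_mod / std_orig - 1) * 100:+.0f}%")     # ~ -24%
print(f"max:  {(max_mod / max_orig - 1) * 100:+.0f}%")     # ~ +26%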
In our exploration, the impact of varying a single attribute reveals some intriguing patterns, notably visible in the synthetic data generated. For example, consider the noticeable trough around 5 PM, which might typically coincide with the end of the workday. Interestingly, this pattern in the synthetic data, while closely mirroring real data, presents a slight deviation—a half-hour difference. This discrepancy may be influenced by several factors such as the percentage of single-person households among others. It's also conceivable that this variance could highlight a potential limitation of the model, possibly stemming from the restricted volume of training data.
Conclusion: Assessing the Promise of DoppelGANger in Energy Planning
Despite the inherent challenges, such as limited data and the model's reliance solely on static attributes (excluding dynamic factors like weather conditions, daily sunrise/sunset, cloudiness, temperature, electricity prices, or special dates), the DoppelGANger approach offers a groundbreaking solution to a critical issue in energy planning. By synthesising realistic time-series data based on demographic attributes, DoppelGANger enables prediction without historical data and supports scenario testing. This capability is vital for effective energy management, particularly in the planning, construction, and development phases of urban and rural areas. In essence, tools like DoppelGANger not only enhance our ability to model and understand complex systems but also underscore the increasing importance of synthetic data in strategic decision-making and policy development. As we refine these models and expand their training datasets, their potential to revolutionise energy planning and management becomes even more pronounced, bridging the gap between theoretical planning and real-world application.




