Searching for the Best Data-Driven Energy Prediction Techniques

From ASHRAE Journal Newsletter, April 27, 2021

More than 3,600 teams from 94 countries competed in The ASHRAE Great Energy Predictor III competition in 2019. Their quest? Discovering the best data-driven building energy prediction techniques. Clayton Miller, Ph.D., Associate Member ASHRAE, who led the event’s technical planning team, talked with ASHRAE Journal about the competition, the results of which were published in an article in Science and Technology for the Built Environment.

1. What is "The ASHRAE Great Energy Predictor III competition"?

ASHRAE hosted the Great Energy Predictor III (GEPIII) machine learning competition in the fall of 2019. Its focus was finding the best data-driven building energy prediction techniques. It attracted 4,370 participants in 3,614 teams from 94 countries. Competitors submitted a total of 39,403 predictions.

2. Why was it important to bring back the competition after 25 years?

The Great Energy Predictor I and II competitions were held in the mid-1990s, led by Jeff Haberl, Ph.D., P.E., Fellow/Life Member ASHRAE, and Jan Kreider, Ph.D., P.E., Life Member ASHRAE, at Texas A&M University and the University of Colorado Boulder, respectively. These competitions set the stage for data-driven building energy prediction innovation in the early days of artificial intelligence research for buildings. Many new techniques, tools and data sources have emerged in the more than 20 years since those early competitions, not least the widespread use of the internet and crowdsourcing platforms like Kaggle, the machine learning competition platform used for GEPIII. Discussions from ASHRAE TC 4.7, Energy Calculations, in 2018 were the catalyst for the competition's resurrection, using these innovations and focusing on a far more extensive data set. Jeff Haberl, the team’s link to the previous competitions, was a strong motivator in renewing the effort in a new form.

3. What was the primary goal of the competition?

The objective for participants was to find the machine learning workflow that produced the lowest prediction error. The error was calculated from how well a contestant's model predicted long-term hourly energy measurements from buildings. The data included 2,380 energy meters measuring electricity, heating and chilled water, and steam consumption in 1,448 buildings from 16 data donor locations. The primary technical goal was discovering which model types, machine learning steps and workflows performed best on this specific application of long-term meter prediction. At the end of the competition, the teams with the most accurate predictions would win $25,000 in prize money in return for sharing their code and explanations of their solutions. These technical objectives are of interest to anyone seeking the most innovative ways of applying machine learning to building energy data.
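The interview does not name the error metric used for scoring. The Kaggle competition page lists it as root mean squared logarithmic error (RMSLE), so a minimal sketch of that calculation may help readers see what "lowest error" means in practice. The function name and the sample readings below are hypothetical and are not taken from the competition data.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error for non-negative meter readings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one reading")
    # log1p(x) = log(1 + x), so zero readings are handled without error.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical hourly meter readings and one model's predictions.
meter_readings = [120.0, 95.5, 0.0, 210.3]
model_predictions = [110.0, 100.0, 5.0, 190.0]
score = rmsle(meter_readings, model_predictions)
print(f"RMSLE: {score:.4f}")
```

Because the error is taken on a logarithmic scale, a miss of a given ratio counts about the same on a small meter as on a large one, which suits a data set that mixes buildings of very different sizes.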

Beyond the technical aspects, a key objective of the planning team was for the competition to push the data science and building science communities closer together through an exchange of concepts, terminologies and techniques. Kaggle’s 5 million users and ASHRAE’s 50,000 members had limited overlap before the competition. Machine learning experts knew little about buildings, and building energy analysts used only the most basic machine learning techniques. The goal was for this competition to be a catalyst for exchange that would continue post-competition.

4. What is the significance of the discoveries?

The most critical technical discoveries from the competition were based on finding which types of models and configurations performed best for this application and the steps in the machine learning process that yielded the best results at this scale. Decision tree ensemble models such as Gradient Boosting Trees were the most popular and effective model type for time-series hourly energy regression in this context. These model types can be implemented using numerous open-source Python and R packages such as XGBoost and LightGBM. Another significant finding was that all the machine learning workflow steps had an impact on model accuracy, and some of those activities required domain knowledge to undertake. For example, one member of the top winning team had some background in metering and understood the best way to preprocess the training data to remove anomalous behavior that would reduce their solution's effectiveness. It was also found that the best solutions were not just a single trained model but large ensembles of models whose predictions were post-processed to create the right balance in the bias-variance trade-off needed to win. These technical insights form the foundation for researchers when approaching building energy prediction for large groups of buildings.
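To make the idea of combining several models' predictions concrete, here is a minimal sketch of one common blending approach: a weighted average taken on a log scale. It is an illustration only and is not the winning teams' actual post-processing. The function name, the weights and the sample predictions are all hypothetical.

```python
import math


def blend_log_space(model_predictions, weights):
    """Weighted average of several models' predictions, taken in log1p space.

    model_predictions: one list of hourly predictions per model.
    weights: one non-negative weight per model (hypothetical values).
    """
    if not model_predictions or len(model_predictions) != len(weights):
        raise ValueError("need one weight per model")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    n_hours = len(model_predictions[0])
    if any(len(preds) != n_hours for preds in model_predictions):
        raise ValueError("all models must predict the same hours")

    blended = []
    for hour in range(n_hours):
        # Average on the log scale, then convert back to energy units.
        log_mean = sum(
            w * math.log1p(preds[hour])
            for preds, w in zip(model_predictions, weights)
        ) / total_weight
        blended.append(math.expm1(log_mean))
    return blended


# Hypothetical hourly predictions from three separately trained models.
model_a = [100.0, 80.0, 60.0]
model_b = [110.0, 75.0, 65.0]
model_c = [90.0, 85.0, 55.0]
ensemble = blend_log_space([model_a, model_b, model_c], weights=[0.5, 0.3, 0.2])
print([round(value, 1) for value in ensemble])
```

Averaging tends to reduce the variance of any single model's errors. In practice the weights would be tuned against held-out data rather than chosen by hand as they are here.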

5. How can the results be used in the future for other research?

The competition results have been shared in several open-source repositories that give future analysts and researchers a starting point for leveraging the discoveries. The primary repository contains the code and detailed documentation of the top five winning solutions and includes several links to YouTube playlists containing detailed explanation videos from the winning teams. Another repository contains data from the competition itself in terms of the contestants, discussion board topics and other information about the planning of the competition. Finally, the competition data was open-sourced in a repository and open-access publication, and it includes additional data sets and documentation not found in the competition.

6. What lessons, facts and/or guidance can an engineer working in the field take away from the results?

From a practical perspective, the biggest takeaway for engineering and energy professionals is that it's essential to learn new tools such as coding as the amount of data in our industry grows. This competition provides hundreds of analysis examples in the form of notebooks created by the contestants, which professionals who want to pick up the Python or R programming languages can clone and learn from. A large percentage of these notebooks, and the discussions that accompany them, target data science beginners. We hope that, thanks to the competition's content, coding and data science skills will become a commonplace part of the hybrid digital skill set of every building performance professional.

7. In addition to the STBE article, where can readers go to learn more?

Beyond the repositories and online sources of information listed previously, several ASHRAE Seminar videos can be watched to learn about the competition's planning and results. The following seminars and presentations can be found in the ASHRAE Technology Portal under ASHRAE Conference Seminars and viewed for more information (subscription required):

Another presentation, Data Science Meets ASHRAE: The Great Energy Predictor Shootout III, from a Lawrence Berkeley National Laboratory Energy Technologies Area seminar, can be found here (no subscription required).

 

Competition Organizing and Technical Committees

Competition planning for GEPIII was led by ASHRAE TC 4.7, with Chris Balbach, Associate Member ASHRAE, leading the operational planning team that included Krishnan Gowri, Ph.D., Fellow ASHRAE; Anthony Fontanini, Ph.D., Member ASHRAE; and Jeff Haberl, Ph.D., P.E., Fellow/Life Member ASHRAE. Clayton Miller, Ph.D., Associate Member ASHRAE, led the technical planning team that included Pandarasamy Arjunan, Ph.D.; Anjukan Kathirgamanathan, Ph.D.; June Young Park, Ph.D.; and Zoltan Nagy, Ph.D., Associate Member ASHRAE.
