Predicting Household Utility Usage with Generalized Boosted Regression Modeling (GBM)

Overview

The goal of this project is to predict household energy consumption for the following year for the treatment and control groups using gradient boosting.

Domain Expertise

This project deals with standard economic time series data. With a background in economics, I am very familiar with this type of data and analysis.

Data Provenance

The professor provided us with an uncleaned dataset, which he had sourced, covering monthly electricity consumption for 23,456 households in 2010 and 2011.

We were provided the following variables for household characteristics:

And the following variables for the observed energy consumption for each household:

Technical Stack

This project was completed in R, using the gbm package for gradient boosting.

Methods

The data was first cleaned and preprocessed. Rows with NA were removed, the dummy variables were converted to factors, and outliers were removed.
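The cleaning steps above can be sketched in base R as follows. This is a minimal illustration, not the exact code used in the project: the column name `treat`, the helper `clean_usage`, and the 1.5 * IQR outlier rule are all assumptions, since the write-up does not say how outliers were identified.

```r
# Hypothetical helper: `dummy_cols` names the 0/1 columns to convert,
# `value_col` is the response column checked for outliers.
clean_usage <- function(df, dummy_cols, value_col = "lusage") {
  # Drop every row that contains at least one NA.
  df <- df[complete.cases(df), , drop = FALSE]

  # Convert dummy (0/1) columns to factors.
  for (col in dummy_cols) {
    df[[col]] <- factor(df[[col]])
  }

  # Assumed outlier rule: keep values within 1.5 * IQR of the quartiles.
  q <- unname(quantile(df[[value_col]], c(0.25, 0.75)))
  fence <- 1.5 * (q[2] - q[1])
  keep <- df[[value_col]] >= q[1] - fence & df[[value_col]] <= q[2] + fence
  df[keep, , drop = FALSE]
}

# Toy data for illustration only; not the project dataset.
toy <- data.frame(
  hh_id  = 1:8,
  treat  = c(0, 1, 0, 1, 0, 1, 0, 1),
  lusage = c(6.1, 6.3, NA, 6.4, 6.2, 6.5, 6.3, 20)
)
cleaned <- clean_usage(toy, dummy_cols = "treat")
```

On the toy data, the row with the missing value and the row with the extreme value of 20 are both dropped. A different outlier rule would only change the `keep` line.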

The monthly household energy consumption, lusage, is the response and is modeled on all variables except hh_id, year, and month. Each treatment group is modeled separately using gradient boosting with a Gaussian distribution, a total of 5000 trees, and an interaction depth of 3.
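A model with these settings could be fit roughly as shown below. This is a sketch under assumptions: the wrapper `fit_group_gbm` and the data frame names in the usage comments are hypothetical, training on one year to predict the next is my guess at the setup, and every `gbm` argument not mentioned above (shrinkage, bag fraction, and so on) is left at the package default.

```r
# Sketch only: requires the gbm package (install.packages("gbm")).
fit_group_gbm <- function(train, drop_cols = c("hh_id", "year", "month")) {
  # Exclude the identifier and time columns so `lusage ~ .` uses the rest.
  predictors <- train[, setdiff(names(train), drop_cols), drop = FALSE]

  gbm::gbm(
    lusage ~ .,
    data = predictors,
    distribution = "gaussian",  # squared-error loss
    n.trees = 5000,             # total number of trees to fit
    interaction.depth = 3       # maximum depth of each tree
  )
}

# Hypothetical usage: `clean_2010` and `clean_2011` are assumed to be the
# cleaned data frames for one treatment group, split by year.
# fit  <- fit_group_gbm(clean_2010)
# pred <- predict(fit, newdata = clean_2011, n.trees = 5000)
```

Dropping the excluded columns before fitting keeps the formula simple and ensures hh_id, year, and month cannot enter the model by accident.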

Results

For the 2011 treatment group, the true values of monthly household energy consumption (lusage) range from 4.516 to 8.059, with a mean of 6.373. The gbm predictions span a slightly wider range, 3.927 to 8.123, with a similar mean of 6.346.
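A range-and-mean comparison like the one above can be produced with a small helper. The function name `compare_distributions` and the toy vectors are hypothetical; in the project, the inputs would be the observed and predicted lusage values for the 2011 treatment group.

```r
# Summarize the minimum, mean, and maximum of two numeric vectors side by side.
compare_distributions <- function(actual, predicted) {
  data.frame(
    series = c("actual", "predicted"),
    min    = c(min(actual),  min(predicted)),
    mean   = c(mean(actual), mean(predicted)),
    max    = c(max(actual),  max(predicted))
  )
}

# Toy values for illustration only; not the project results.
actual    <- c(6.0, 6.5, 7.0)
predicted <- c(5.8, 6.4, 7.1)
summary_table <- compare_distributions(actual, predicted)
print(summary_table)
```

These summaries describe how closely the two distributions line up overall; they do not measure the error for any individual household.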

Conclusion

The predictions closely match the range and mean of the observed values for the 2011 treatment group. The fit is not perfect, as expected, but it offers a fairly accurate prediction of a household's energy consumption for the following year.

Other confounding variables, including government-mandated water limits, were not accounted for; incorporating them may improve future predictions.

© 2024 Serena Alvarez   •  Theme:  Moonwalk