A Python package implementing a data pipeline to process high-resolution power meter data.
The `demand_acep` package implements a data pipeline that performs three tasks: Extraction, Transformation, and Loading (ETL).
- Extract: The high-resolution (~7 Hz) power meter data for each meter and each channel is read from NetCDF files into a pandas DataFrame.
- Transform: The data is down-sampled to a lower resolution (1 minute by default), missing data is filled, and the individual channels are combined into a single down-sampled, filled DataFrame per meter per day, which is exported to a CSV file. So, for each day of data, there is one CSV file per meter containing the data for all channels at the lower resolution.
- Load: All the down-sampled data is loaded (copied, not inserted, for speed) into the time-series database, TimescaleDB. The data was copied back from the database to perform data imputation for the missing days and re-copied to create the complete dataset. The ETL process is summarised in the poster shown below.
All or some of these steps can be re-used or repeated as desired. Further analysis using the complete data was performed and the results are presented in the documentation.
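The Transform step above can be sketched with pandas. This is a minimal, self-contained sketch: the channel name and the data are fabricated here, since the real pipeline reads them from the NetCDF files.

```python
import numpy as np
import pandas as pd

# Fabricate ~7 Hz data for one channel of one meter (~5 minutes of samples);
# in the real pipeline this comes from a NetCDF file.
rng = pd.date_range("2019-01-01", periods=2100, freq="143ms")
channel = pd.DataFrame({"PhaseA_Power": np.sin(np.arange(2100) / 50.0)}, index=rng)

downsampled = channel.resample("1min").mean()  # down-sample to the 1-minute default
filled = downsampled.ffill()                   # fill missing intervals
csv_text = filled.to_csv()                     # in practice: one CSV per meter per day
```

In the real pipeline the down-sampled channels are joined column-wise into one DataFrame per meter per day before export.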
```shell
pip install demand-acep
```

This package has only been tested on Linux.
Usage examples and further analysis can be seen in the scripts folder.
- Extract data to CSV: This script shows how to extract data for a day to CSV. It reads a day of data, performs the transformation, and creates a CSV for each meter as described before.
- Extract data for multiple days in parallel: This script shows how to use the `multiprocessing` library in Python to extract data for multiple days in parallel. The more cores the system has, the faster the total data can be extracted.
- Copy data in parallel to TimescaleDB database: This Jupyter notebook shows how to copy the CSV files to the database in parallel.
- Perform data imputation for long timescales (days to months): This Jupyter notebook shows how to perform data imputation over long timescales, essentially when the data was not downloaded for a particular day or month.
- Read from database to pandas DataFrame: This Jupyter notebook shows how to read the data from a Postgres (TimescaleDB) database into a DataFrame.
The module supports TDD and includes a setup for an automatic test runner. To begin development, install Python 3.6+ using Anaconda and Node.js for your platform, and then do the following:
- Clone the repository on your machine using `git clone https://github.com/demand-consults/demand_acep`. This will create a copy of this repository on your machine.
- Go to the repository folder using `cd demand_acep`.
- Get the Python dependencies using `pip install -r requirements.txt`.
- Get the required Node modules using `npm install`. Install Grunt globally using `npm install -g grunt`. This step and Node.js are only required for automated test running.
- In a dedicated terminal window, run `grunt` on the command line. This will watch for changes to any of the `.py` files in the `demand_acep` folder and run the tests using `pytest`.
- Make tests for the functionality you plan to implement in the `tests` folder, and add the data needed for tests to the `data` folder located in `demand_acep\data`.
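A minimal pytest-style test for the `tests` folder might look like the following. The `downsample` helper here is hypothetical, standing in for a `demand_acep` transform function; adapt the names to the actual API.

```python
# tests/test_downsample.py -- hypothetical example test
import pandas as pd

def downsample(df, freq="1min"):
    # Stand-in for a demand_acep transform helper.
    return df.resample(freq).mean()

def test_downsample_returns_minute_resolution():
    idx = pd.date_range("2019-01-01", periods=120, freq="s")
    df = pd.DataFrame({"power": range(120)}, index=idx)
    out = downsample(df)
    assert len(out) == 2  # 120 seconds of data -> two 1-minute rows
```

With `grunt` running, saving a `.py` file triggers `pytest`, which discovers and runs tests like this automatically.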
The `doc` folder contains the documentation related to the package. To make changes to the documentation, the following workflow is suggested:
- From the root directory of the package, i.e. here, run `grunt doc`. This command watches for changes in the `.rst` files in the `doc` folder and runs `make html`. This has the effect of building your documentation on each save.
- To view the changes, it is suggested to run a local webserver. This can be done by first installing a webserver with `pip install sauth`, and then running it like so: `sauth <username> <password> localhost <port>` from the `doc` folder in a separate terminal window. Specify a username, password, and a port number, for example 8000. Then navigate to http://localhost:8000 in your web browser and enter the username and password you set while running `sauth`. The live changes to the documentation can be viewed by navigating to the `html` folder in the `build` directory located at `doc\build\html`.
- As you make changes to the documentation in the `.rst` files and re-save them, `grunt doc` automatically updates the `html` folder, and the changes can be viewed in the browser by refreshing it.
An R package supporting this project creates diverse plots of peak demand power consumption for several meters per day, weekday, month, and year. These plots feed into benefit-cost analyses and cost-saving plots. In addition, the package forecasts peak power demand using ARIMA on a daily and monthly basis. Correlation and a simple regression are also included.
To use this package, follow these steps:
- Install `devtools`:

```r
install.packages("devtools")
```

- Load `devtools`:

```r
library(devtools)
```

- Install the `demand` package:

```r
install_github("reconjohn/demand")
```

- Load the package:

```r
library(demand)
```
Now you are all set!
Brief description of demand charge using the R package `demand`:
Using the R package `demand`, peak demand, correlation, forecasts, and demand charge were plotted. Refer to the following for more details on the demonstration of code from the `demand` package and its results.
- 0.0.1
- Released to ACEP on 06/21/2019.
Chintan Pathak, Yohan Min, Atinuke Ademola-Idowu - cp84@uw.edu, min25@uw.edu, aidowu@uw.edu.
Distributed under the MIT license. See LICENSE for more information.
- Fork it (https://github.com/demand-consults/demand_acep/fork)
- Create your feature branch (`git checkout -b feature/fooBar`)
- Commit your changes (`git commit -am 'Add some fooBar'`)
- Push to the branch (`git push origin feature/fooBar`)
- Create a new Pull Request
