This site provides a number of different data sets related to Alaskan energy use. They are primarily useful for creating energy models that require weather data and fuel/utility costs. Much of the data comes from the AkWarm Energy Modeling software. There are two types of files on the site:
raw source files that contain the original data, found in the raw subdirectory; and
processed files derived from the original files, found in the proc subdirectory. The processing often selects a subset of fields and sometimes combines datasets.
The processed data is typically available in two formats: a pickled, bz2-compressed Python Pandas DataFrame and a standard CSV file. Information on using the Pandas format is below.
You can access any of the data files by simply downloading them in your browser or another tool that can make HTTP requests. The file names and directory paths are indicated in the various sections below. You can also access the files via programming code, and examples are given below using the Python language. You can download this Jupyter Notebook and execute the sample code provided below to experiment with accessing the data. Here is the link to download this notebook. You will need Python 3.6+ installed to execute the code.
To see the code that was used to process the raw data into processed files, see this Notebook on the associated GitHub Site.
# Execute this cell prior to any of the cells below
import urllib.parse   # urljoin lives in the urllib.parse submodule
import io
import pandas as pd
import requests
These Python functions can be used to retrieve the processed data on the site, either as a Pandas DataFrame or as CSV text.
base_url = 'http://ak-energy-data.analysisnorth.com/'
def get_df(file_path):
    """Returns a Pandas DataFrame that is found at the 'file_path'
    below the Base URL for accessing data. The 'file_path' should end
    with '.pkl' and points to a pickled, compressed (bz2), Pandas DataFrame.
    """
    b = requests.get(urllib.parse.urljoin(base_url, file_path)).content
    df = pd.read_pickle(io.BytesIO(b), compression='bz2')
    return df

def get_csv(file_path):
    """Returns a string in CSV format that is found at the 'file_path'
    below the Base URL for accessing data. The 'file_path' should end
    with '.csv' and points to a CSV data file.
    """
    txt = requests.get(urllib.parse.urljoin(base_url, file_path)).text
    return txt
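The CSV text returned by get_csv can also be turned into a DataFrame when the pickled format is not convenient. The following is a minimal sketch of that step; the helper name csv_text_to_df and the sample text are hypothetical stand-ins, not part of the site's code or data.

```python
import io
import pandas as pd

def csv_text_to_df(csv_text):
    """Hypothetical helper: parse CSV text (such as the string
    returned by get_csv) into a Pandas DataFrame."""
    return pd.read_csv(io.StringIO(csv_text))

# Stand-in sample text; real text would come from a call such as
# get_csv('city-util/proc/city.csv').
sample_text = "Name,Value\nA,1\nB,2\n"
df_sample = csv_text_to_df(sample_text)
print(df_sample.head())
```

Columns that hold Python objects in the pickled files, such as lists, will most likely arrive as plain strings when read from CSV, so the pickled DataFrame is probably the better choice for those fields.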
The following data files contain City and Utility data. The files in the city-util/proc
directory contain processed data derived from the files in the city-util/raw
directory.
├── city-util
│   ├── proc
│   │   ├── city.csv
│   │   ├── city.pkl
│   │   ├── misc_info.csv
│   │   ├── misc_info.pkl
│   │   ├── utility.csv
│   │   └── utility.pkl
│   └── raw
│       ├── City Utility Links.xlsx
│       ├── City.xlsx
│       ├── Misc Info.xlsx
│       └── Utility.xlsx
For example, browser access to the utility.csv file would be:
http://ak-energy-data.analysisnorth.com/city-util/proc/utility.csv
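Some of the raw file names contain spaces, which need to be percent-encoded when a URL is built in code. The sketch below shows one way to do that; data_url is a hypothetical helper and BASE_URL simply repeats the site address used above.

```python
import urllib.parse

BASE_URL = 'http://ak-energy-data.analysisnorth.com/'

def data_url(file_path):
    """Hypothetical helper: full URL for a file path below the base URL.
    Spaces and other unsafe characters are percent-encoded."""
    return urllib.parse.urljoin(BASE_URL, urllib.parse.quote(file_path))

print(data_url('city-util/proc/utility.csv'))
print(data_url('city-util/raw/City Utility Links.xlsx'))
```

A URL built this way could be passed to pd.read_excel for the raw .xlsx files, which requires an Excel engine such as openpyxl to be installed. The sheet layout of the raw files is not described here, so inspect a file before relying on any particular sheet or column.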
# Access as a Pandas DataFrame
df = get_df('city-util/proc/city.pkl')
df.head()
# The first row in the table
df.iloc[0]
# Example that uses Python to access the data as CSV text.
# As mentioned above, the CSV files can be downloaded directly
# with a browser by entering the appropriate URL, such as:
#
# http://ak-energy-data.analysisnorth.com/city-util/proc/city.csv
txt = get_csv('city-util/proc/city.csv')
lines = txt.splitlines()
# Print the first five lines (fewer if the file is shorter)
for line in lines[:5]:
    print(line)
# Access as a Pandas DataFrame
df = get_df('city-util/proc/utility.pkl')
df.head()
# The first row in the table
df.iloc[0]
# The "Blocks" field is a Python list of tuples:
# (rate, upper usage cut-off for the block).
df.iloc[0].Blocks
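To illustrate how a block rate structure like this might be applied, here is a minimal sketch that computes the cost of a given usage. It rests on several assumptions: each tuple is ordered (rate, upper cut-off) as the comment above states, the cut-offs are cumulative, and a cut-off of None or NaN means the block has no upper limit. The function name and the sample numbers are hypothetical; check the actual data and adjust the tuple order if it differs.

```python
import math

def block_rate_cost(blocks, usage):
    """Hypothetical helper: cost of `usage` units under tiered block rates.
    `blocks` is a list of (rate, upper_cutoff) tuples in ascending order."""
    cost = 0.0
    lower = 0.0   # usage already billed in earlier blocks
    for rate, upper in blocks:
        # Assumption: None or NaN marks a block with no upper limit
        unlimited = upper is None or (isinstance(upper, float) and math.isnan(upper))
        top = usage if unlimited else min(usage, upper)
        if top > lower:
            cost += (top - lower) * rate
            lower = top
        if unlimited or usage <= upper:
            break
    return cost

# Made-up example: 0.20 per unit up to 500 units, 0.15 per unit beyond that
sample_blocks = [(0.20, 500.0), (0.15, None)]
print(block_rate_cost(sample_blocks, 700))
```

This ignores any fixed customer charges, demand charges, or surcharges, which a real utility bill calculation would likely need.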
# Access as a Pandas Series
series = get_df('city-util/proc/misc_info.pkl')
# This is really just one record so it is a Pandas Series object
series
All of the Alaska TMY3 (Typical Meteorological Year) files are available, although
only one site is included for Anchorage (International Airport). The files are named
using the TMY3 Site ID. Original source files are in the wx/tmy3/raw
directory. An Excel file containing ASHRAE 2017 Design Heating Temperatures
for each TMY3 site is available (design_temps.xlsx). Further info on the
processed files is below.
└── wx
    └── tmy3
        ├── proc
        │   ├── 700197.csv
        │   ├── 700197.pkl
        │   ├── 700260.csv
        │   ├── 700260.pkl
        │   ├── 700637.csv
        │   ├── 700637.pkl
        │   ... the rest of the Alaska TMY3 files
        │   ├── tmy3_meta.csv
        │   └── tmy3_meta.pkl
        └── raw
            ├── 700197.csv
            ├── 700260.csv
            ├── 700637.csv
            ├── 701043.csv
            ... the rest of the Alaska TMY3 files
            ├── design_temps.xlsx
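Because the files are named by TMY3 Site ID, the paths for any site can be built from its ID. The sketch below shows this naming pattern; tmy3_paths is a hypothetical helper, and it only follows the directory layout listed above.

```python
def tmy3_paths(site_id):
    """Hypothetical helper: relative paths of the TMY3 files for one site,
    following the wx/tmy3 directory layout."""
    site = str(site_id)
    return {
        'pkl': f'wx/tmy3/proc/{site}.pkl',   # processed, pickled DataFrame
        'csv': f'wx/tmy3/proc/{site}.csv',   # processed CSV
        'raw': f'wx/tmy3/raw/{site}.csv',    # original source file
    }

paths = tmy3_paths(700197)
print(paths['pkl'])
```

The 'pkl' path can be passed to get_df and the 'csv' path to get_csv.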
df = get_df('wx/tmy3/proc/700197.pkl')
df.head()
# Units are IP (English), temperature in degrees F, rh in %, wind speed in miles per hour.
# Timestamps are placed in the middle of the hour with the year arbitrarily
# set to 2018.
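Since the processed weather data uses IP units, a model working in SI units needs a conversion step. The following sketch shows one on a synthetic table with mid-hour timestamps in 2018. The column names temp_f and wind_mph are hypothetical, and the sketch assumes the timestamps form the DataFrame index; check the columns and index of the actual file before adapting it.

```python
import pandas as pd

# Synthetic stand-in for a processed TMY3 table: two days of hourly
# records stamped at the middle of each hour. Column names are hypothetical.
idx = pd.date_range('2018-01-01 00:30', periods=48, freq=pd.Timedelta(hours=1))
wx = pd.DataFrame(
    {'temp_f': [32.0, 50.0] * 24, 'wind_mph': [10.0] * 48},
    index=idx,
)

# Degrees Fahrenheit to degrees Celsius
wx['temp_c'] = (wx['temp_f'] - 32.0) * 5.0 / 9.0
# Miles per hour to meters per second (1 mph = 0.44704 m/s)
wx['wind_m_s'] = wx['wind_mph'] * 0.44704

# Daily mean temperature, grouped by calendar day of the timestamp index
daily_mean_c = wx['temp_c'].resample('D').mean()
print(daily_mean_c)
```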
# A file with summary info about each site is also available.
df = get_df('wx/tmy3/proc/tmy3_meta.pkl')
df.head()
Heat pump specification data from NEEP is also available. See http://www.neep.org/initiatives/high-efficiency-products/emerging-technologies/ashp/cold-climate-air-source-heat-pump
for more info about the data. The original NEEP spreadsheet is found at heat-pump/raw/neep_ashp_data.xlsx
├── heat-pump
│   ├── proc
│   │   ├── hp_specs.csv
│   │   └── hp_specs.pkl
│   └── raw
│       └── neep_ashp_data.xlsx
# The processed data only includes the Ductless, mini-split models.
df = get_df('heat-pump/proc/hp_specs.pkl')
df.head()