5. Merge in EIA power plant data

Let’s start looking into power plants, since they’re a large contributor to GHG emissions. Question: are there clear trends in emissions across power plants within the U.S.? To answer this, we need to merge in another data source.

The Energy Information Administration (EIA) has a separate database containing data on every power plant across the U.S. However, the EPA and EIA datasets don't line up directly. The EPA has created a separate file, called a crosswalk, that lists power plant names along with their associated EPA and EIA identifiers. We need to use this crosswalk file to merge the two datasets.

5.1. Data joins and all that fun stuff

Go back to the Data Source page and add a new text file connection. We’ll need to add the crosswalk data file first, then the powerplant generation num file.


First, we'll join the crosswalk data. We want to do a left join, setting Facility ID from our original data source equal to GHGRP Facility ID in the crosswalk data.

Quick Primer on join types
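
A left join keeps every row from the left-hand table (here, our original EPA data) and attaches matching rows from the right-hand table; rows with no match are kept, with the right-hand columns left empty. An inner join keeps only the rows that match on both sides. Tableau does this for you through the join dialog, but the idea can be sketched in a few lines of Python. This is only an illustration: the facility IDs, emissions values, and the "ORIS Code" column name below are made up and are not taken from the real files.

```python
# Toy rows only: IDs, values, and the "ORIS Code" column name are assumptions.
emissions = [
    {"Facility ID": 1001, "Emissions": 500},
    {"Facility ID": 1002, "Emissions": 300},
    {"Facility ID": 1003, "Emissions": 50},   # no crosswalk entry
]
crosswalk = [
    {"GHGRP Facility ID": 1001, "ORIS Code": 10},
    {"GHGRP Facility ID": 1002, "ORIS Code": 20},
]


def join(left, right, left_key, right_key, how="left"):
    """Join two lists of dict rows where left_key equals right_key.

    how="left"  keeps every left row; unmatched rows get None in right columns.
    how="inner" keeps only left rows that have at least one match.
    """
    right_columns = sorted({col for row in right for col in row})
    # Index the right-hand rows by key so each lookup is a dict access.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    result = []
    for lrow in left:
        matches = index.get(lrow[left_key], [])
        if matches:
            for rrow in matches:
                result.append({**lrow, **rrow})
        elif how == "left":
            result.append({**lrow, **{col: None for col in right_columns}})
    return result


left_rows = join(emissions, crosswalk, "Facility ID", "GHGRP Facility ID", how="left")
inner_rows = join(emissions, crosswalk, "Facility ID", "GHGRP Facility ID", how="inner")

print(len(left_rows), len(inner_rows))  # prints: 3 2
```

With a left join, facility 1003 stays in the result even though it has no crosswalk entry, which is why we use it here: we don't want to lose any facilities from the original EPA data.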

Now, add in the powerplant generation num file. We’ll do another left join, setting the Crosswalk ORIS Code equal to the Plant ID in the powerplant generation num file. Again, this connects the real EIA data (contained in powerplant generation num) to the real EPA data (our initial data source).
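
To see how the two joins chain together, here is a rough Python sketch of the same sequence: EPA data to crosswalk, then crosswalk to generation data. Again, this is only an illustration of what Tableau does behind the join dialog. All values are invented, and the column names "ORIS Code" and "Net Generation (MWh)" are assumptions, not the actual headers in the files.

```python
def left_join(left, right, left_key, right_key):
    """Keep every left row; attach matching right rows, or None columns if no match."""
    right_columns = sorted({col for row in right for col in row})
    blank = {col: None for col in right_columns}
    # Index the right-hand rows by their join key.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for lrow in left:
        # Fall back to a single all-None row when there is no match.
        for rrow in index.get(lrow[left_key], [blank]):
            joined.append({**lrow, **rrow})
    return joined


# Toy rows only: every value and the generation column name are hypothetical.
emissions = [
    {"Facility ID": 1001, "Emissions": 500},
    {"Facility ID": 1002, "Emissions": 300},
    {"Facility ID": 1003, "Emissions": 50},
]
crosswalk = [
    {"GHGRP Facility ID": 1001, "ORIS Code": 10},
    {"GHGRP Facility ID": 1002, "ORIS Code": 20},
]
generation = [
    {"Plant ID": 10, "Net Generation (MWh)": 12000},
]

# Step 1: EPA data -> crosswalk (Facility ID = GHGRP Facility ID).
step1 = left_join(emissions, crosswalk, "Facility ID", "GHGRP Facility ID")
# Step 2: crosswalk -> EIA generation data (ORIS Code = Plant ID).
merged = left_join(step1, generation, "ORIS Code", "Plant ID")

for row in merged:
    print(row["Facility ID"], row["ORIS Code"], row["Net Generation (MWh)"])
```

Only facility 1001 ends up with a generation value in this toy data. The other two facilities are still present, just with empty EIA columns, which is the behavior to expect for any facility that is missing from the crosswalk or from the generation file.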


Okay, now back to graphing awesome stuff. On to the next section.
