Geocoding a table of addresses


Geocoding is the process of transforming a description of a location—such as a pair of coordinates, an address, or a name of a place—to a location on the earth's surface. You can geocode by entering one location description at a time or by providing many of them at once in a table. The resulting locations are output as geographic features with attributes, which can be used for mapping or spatial analysis.

You can quickly find various kinds of locations through geocoding. The types of locations that you can search for include points of interest or names from a gazetteer, like mountains, bridges, and stores; coordinates based on latitude and longitude or other reference systems, such as the Military Grid Reference System (MGRS) or the U.S. National Grid system; and addresses, which can come in a variety of styles and formats, including street intersections, house numbers with street names, and postal codes.

Geocoding a table requires a table that stores the addresses you want to geocode and an address locator or a composite address locator. The Geocode Addresses tool matches the addresses against the locator and saves the result for each input record in a new point feature class. When using the ArcGIS World Geocoding Service, this operation consumes credits.

1. Downloading Toxics Release Inventory (TRI) data

For this exercise, we will use the Toxics Release Inventory (TRI) facilities data for 2017 from the TRI Explorer service. The TRI database contains information about releases of specific toxic chemicals from industrial or federal facilities to land, water, and air. Use the data for releases reported in South Carolina (SC) during 2017 that you downloaded in the Add Lat Long Data to ArcGIS Pro exercise. Refer to that exercise if you still need to download the data and format the table. We will geocode the data into a set of point features. A copy of the table should be stored in C:\Working_with_Tabular_Data.


2. Preparing Your Data
Before you can import tabular data into ArcGIS, make sure it is in a file type and format that ArcGIS can recognize, and that your field names are clean and match the geocoding input categories (address, city, state, and ZIP code). First we will format the data fields, and then we will save the file correctly.

Formatting Tabular Files

Addresses must be formatted correctly for ArcGIS to read and locate them, and field names cannot start with a number or contain special characters other than the underscore (_).

Open the TRI_Facilities_2017_Cleaned.csv in Microsoft Excel. Save the file as an Excel workbook with the same name, i.e. TRI_Facilities_2017_Cleaned.xlsx.
Inspect your data and see if it contains the following columns: Address, City, State, Zip_Code. Enter the corresponding data into each column, if needed.
You need to ensure that your data contain the City and State, since the same address can be repeated in several cities and states.
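
The column check above can also be scripted. The following is a minimal sketch, assuming the table is a plain CSV with a header row. The function name check_address_table and the sample rows are made up for illustration and are not real TRI records.

```python
import csv
import io

# Columns the tutorial asks for before geocoding.
REQUIRED_COLUMNS = ["Address", "City", "State", "Zip_Code"]

def check_address_table(csv_text):
    """Return (missing_columns, incomplete_rows) for a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    present = [c for c in REQUIRED_COLUMNS if c in header]
    incomplete = []
    # Row 1 is the header, so data rows start at 2 (as in Excel).
    for row_number, row in enumerate(reader, start=2):
        if any(not (row[c] or "").strip() for c in present):
            incomplete.append(row_number)
    return missing, incomplete

# Made-up sample rows, not real TRI records.
sample = (
    "FACILITY,Address,City,State,Zip_Code\n"
    "Example Plant A,100 Main St,Clemson,SC,29631\n"
    "Example Plant B,200 Oak Ave,,SC,29601\n"
)
missing, incomplete = check_address_table(sample)
print("Missing columns:", missing)
print("Rows with blank address parts:", incomplete)
```

To check the real file, read its text first (for example with open(path, newline="", encoding="utf-8").read()) and pass it to the function. Rows reported as incomplete are the ones to fix in Excel before geocoding.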

Read the following rules for fields and field names carefully:

  • When creating spreadsheets, make sure each field holds fewer than 255 characters. ArcGIS reads only the first 255 characters of a field. Fields with more than 255 characters are converted to BLOB fields and are not readable. Abbreviate, manually truncate, or split any fields longer than 255 characters.
  • Check the numeric field type before and after importing Excel data. ArcGIS typically converts spreadsheet numeric fields to double precision (Double), which may not meet your needs. If necessary, create new fields of the desired type and calculate values into them.
  • Check the format for date fields. ArcGIS uses the Lotus date/time format. In this format, the calendar date is a whole number that represents the number of days since January 1, 1900, plus one day (due to a bug in Lotus 1-2-3 that was carried over to Excel). Time is represented as the decimal portion of a 24-hour day. If date/time data is important, format the input spreadsheet using a standard Excel date/time format. We only have the year information, so we can skip this step.
  • Follow ArcGIS field naming rules when creating Excel column names. The first row of an Excel worksheet sets the name for each column. Column names become field names when an Excel worksheet is imported into ArcGIS. Always follow these naming rules:
    • Column/Field names must begin with a letter.
    • Column/Field names must contain only letters, numbers, and the underscore character. Make sure the field names do not have spaces or other problematic characters (e.g., *, &, !, #).
    • Column/Field names may not consist solely of reserved words (such as date, value, name, text, and year). Do not use these words as field names. See the list of reserved words. ArcGIS typically adds a trailing underscore to reserved-word field names added by copying and pasting from other sources.
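
To make the date rule above concrete, here is a minimal sketch of how a serial date number maps to a calendar date and time. It assumes Excel's default 1900 date system and only handles serial numbers of 61 or more (March 1, 1900 onward), where counting from December 30, 1899 gives the right answer. The function name is our own.

```python
from datetime import datetime, timedelta

# Day 0 for serial numbers of 61 or more (dates from March 1, 1900 on).
# Starting at December 30, 1899 absorbs the nonexistent February 29, 1900.
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial_to_datetime(serial):
    """Convert an Excel/Lotus serial date (1900 date system) to a datetime."""
    if serial < 61:
        raise ValueError("serials below 61 are affected by the 1900 leap-year bug")
    # The whole part is the day count; the decimal part is the time of day.
    return EXCEL_EPOCH + timedelta(days=serial)

print(excel_serial_to_datetime(42736))    # 2017-01-01 00:00:00
print(excel_serial_to_datetime(42736.5))  # 2017-01-01 12:00:00
```

A date column that shows up in ArcGIS as numbers like 42736 has lost its date formatting in the spreadsheet, which is why the rule recommends applying a standard Excel date/time format before importing.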



What problems do you see with the fields?
  • Rename the NAME field to FACILITY.
  • Change the FAC-ID field to FAC_ID.
  • Remove the asterisk (*) in the *OTHER_AMT field.
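
The naming rules can be turned into a quick check. This is a sketch only: the reserved-word set below contains just the five examples named in this tutorial (the full list is longer), and the function name field_name_problems is hypothetical.

```python
import re

# Partial list taken from the tutorial text; the full reserved-word list is longer.
RESERVED_WORDS = {"date", "value", "name", "text", "year"}

def field_name_problems(name):
    """Return a list of rule violations for one column name (empty if clean)."""
    problems = []
    if not re.match(r"[A-Za-z]", name):
        problems.append("does not begin with a letter")
    bad = sorted(set(re.findall(r"[^A-Za-z0-9_]", name)))
    if bad:
        problems.append("contains invalid characters: " + " ".join(bad))
    if name.lower() in RESERVED_WORDS:
        problems.append("is a reserved word")
    return problems

for column in ["NAME", "FAC-ID", "*OTHER_AMT", "FAC_ID"]:
    print(column, "->", field_name_problems(column) or "ok")
```

Running it on the three problem fields flags the same issues that the renaming steps above fix, and the corrected FAC_ID passes.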


Useable File Extensions

Once you have formatted your data, save it using a file type that ArcGIS can recognize. The following file types can be used in ArcGIS, and all of them can be read by Microsoft Excel:
  • .csv
  • .txt
  • .xls
  • .xlsx

Click File, then Save As. Name the file TRI_Facilities_2017_Cleaned_2.xlsx and click Save.

3. Starting a New Project in ArcGIS Pro

From the Start menu, open ArcGIS Pro and sign in using your Clemson ID.

To do this, click the Sign In menu at the top corner.

Click on Enterprise login. In the Your ArcGIS organization’s URL box, enter clemson so that your URL reads: clemson.maps.arcgis.com. Select Continue.

A new window appears. Click on Clemson University.

The Clemson University regular login screen appears. Enter your Clemson username and password. 

ArcGIS Pro automatically opens the start page. Here you find options to either open an existing project or create a project using one of the available templates. These templates provide a starting point for the project. Additional maps, scenes, and catalog views can be added to your project at any time, regardless of the initial template.
On the start page, under Blank Templates, click Map.


On the Create a New Project dialog box, in the Name box, type Geocoding

To save the project to a different location, click the Browse button and browse to the folder Working_with_Tabular_Data on the C drive.

The Create a new folder for this project check box is checked by default. It is usually convenient to keep your project files organized in a folder.
Click OK.
The new project opens with a map view.
If you've opened ArcGIS Pro before, the Contents and Catalog panes may be open. Other panes may be open as well. You'll set the pane state to the default for mapping.
If you don’t see the Catalog or Contents pane, don’t panic! On the ribbon, click the View tab. In the Windows group, click Reset Panes and click Reset Panes for Mapping (Default).
The Contents and Catalog panes are now open if they were not open before. Any other open panes are closed.

4. Set the coordinate system for your map 

In the Contents pane, right-click on Map then click Properties. 
On the Map Properties dialog box, click the Coordinate Systems tab.
The buttons below the Current XY and Current Z headings show the current horizontal and vertical coordinate systems of the map or scene, respectively. There may be no vertical coordinate system defined. Click Details for either coordinate system to see how they are defined. As you see in the picture above, your map has the WGS 1984 Web Mercator Auxiliary Sphere as the default projected coordinate system.
  
Question: What is the vertical coordinate system? 

To change the horizontal or vertical coordinate system, click the button below the Current XY or Current Z heading, then choose an appropriate coordinate system from the corresponding Coordinate Systems Available list. You can enter a search term in the Search box to help locate a specific coordinate system.

Here, we want to choose a projected coordinate system that is appropriate for our study area, which is Clemson, South Carolina.

While you are in the Map Properties window, stay on the Coordinate Systems tab. In the Coordinate Systems Available list, expand Projected Coordinate System, then expand State Plane. Expand the third entry from the top, NAD 1983 (2011) (Meters), and scroll down until you find the projection for South Carolina. Select it and click OK to apply the change.




5. Connecting to the folder and importing your data in ArcGIS

In the Catalog pane, right-click Folders and click Add Folder Connection. Alternatively, on the Insert tab, in the Project group, click Add Folder. Navigate to C:\...\Working_with_Tabular_Data

Right-click TRI_Facilities_2017_Cleaned.csv and click Add To Current Map.
In the Contents pane, right-click TRI_Facilities_2017_Cleaned.csv and click Open to open the table. Confirm once again that all fields display correctly, then close the table view.


6. Geocoding addresses in ArcGIS


Geocoding is the process of transforming a description of a location, such as an address or a name of a place, to a location on the earth’s surface. The resulting locations are output as geographic features with attributes, which can be used for mapping or spatial analysis.

In order to geocode addresses, we need an address reference dataset and an address locator. The reference dataset contains a database with the location of addresses for a particular region or locality. The address locator is the entity that specifies the method to interpret a particular type of address input, relate it with the reference dataset and deliver a matching option back to the user interface.
This picture illustrates an example of how the process works →
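
The roles of the reference dataset and the locator can be illustrated with a toy example. This is a conceptual sketch only and not how ArcGIS locators work internally. The reference addresses and coordinates are invented, the function names are our own, and plain string similarity from Python's difflib stands in for the locator's matching rules.

```python
from difflib import SequenceMatcher

# Hypothetical reference dataset: address text -> (longitude, latitude).
# The addresses and coordinates are invented for illustration.
REFERENCE = {
    "100 MAIN ST, CLEMSON, SC": (-82.80, 34.70),
    "200 OAK AVE, GREENVILLE, SC": (-82.40, 34.85),
    "300 PINE RD, COLUMBIA, SC": (-81.00, 34.00),
}

def normalize(address):
    """Uppercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(address.upper().split())

def geocode(address, min_score=80):
    """Return (matched_address, location, score), or None if no candidate scores high enough."""
    query = normalize(address)
    best = None
    for candidate, location in REFERENCE.items():
        # Similarity between 0 and 100, playing the role of a match score.
        score = round(100 * SequenceMatcher(None, query, candidate).ratio())
        if best is None or score > best[2]:
            best = (candidate, location, score)
    return best if best[2] >= min_score else None

print(geocode("100 main st,  clemson, sc"))    # identical after normalization
print(geocode("100 Main Stret, Clemson, SC"))  # misspelled street type
print(geocode("1 Nowhere Blvd, Atlantis"))     # no acceptable candidate
```

The misspelled address still finds its candidate but with a score below 100, and the last address returns no match. These are the same kinds of outcomes (matched, matched with a lower score, unmatched) that we review later in the rematch step.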




If the Geoprocessing pane is not open, on the Analysis tab, click Tools to open the Geoprocessing pane. 

In the Geoprocessing pane, type geocode addresses in the search box. The first search result is the Geocode Addresses tool.

You can also find this tool under Geocoding Tools. Click on the tool to open it.






Click the Browse button next to the Input Table box. On the Input Table dialog box, under Project, click Folders and browse to C:\Working_with_Tabular_Data. Click TRI_Facilities_2017_Cleaned.csv.

In the Input Address Locator box, click the drop-down next to the Browse button and select Clemson USA Geocoder.

Attention:
Use the ArcGIS World Geocoding Service only for a small number of records. We do not recommend it for large datasets (more than 1000 records). Instead, we use the address locators that are available to us as part of our subscription to Esri software.
To do so, either browse to CCGT drive :\\Geocoding_Data_2017\Geocoding Data and click on USA_StreetAddress.loc, or use Clemson USA Geocoder. You must be signed in to see this option.




Make sure all the address fields correspond with the ones in the Clemson USA Geocoder.



State → State
ZIP → Zip_Code
Address or Place → Address
City → City
County → County

In Output Feature Class, name the output TRI_Facilities_2017_Geocoded and set C:\Working_with_Tabular_Data\Geocoding\Geocoding.gdb as the output location.

Click Run to geocode the table. You should see the blue progress bar moving.
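
The same run can be scripted with arcpy once you are comfortable with the tool. The sketch below rests on several assumptions: it needs the Python environment that ships with ArcGIS Pro, the locator field names are the ones shown in the mapping above, and the "VISIBLE NONE" mapping string follows the pattern used in Esri's examples for arcpy.geocoding.GeocodeAddresses as far as we can tell. To get the exact string for your locator, run the tool once and use Copy Python Command from the geoprocessing history. The helper names build_address_fields and run_geocode are our own.

```python
# Locator field -> table column, as listed in the field mapping above.
FIELD_MAP = {
    "Address or Place": "Address",
    "City": "City",
    "County": "County",
    "State": "State",
    "ZIP": "Zip_Code",
}

def build_address_fields(field_map):
    """Build the semicolon-delimited mapping string passed to the tool."""
    parts = []
    for locator_field, table_column in field_map.items():
        # Locator field names that contain spaces are wrapped in single quotes.
        if " " in locator_field:
            locator_field = f"'{locator_field}'"
        parts.append(f"{locator_field} {table_column} VISIBLE NONE")
    return ";".join(parts)

def run_geocode(in_table, locator, out_feature_class):
    """Run Geocode Addresses; requires an ArcGIS Pro Python environment."""
    import arcpy  # imported here so the rest of the sketch runs without ArcGIS
    arcpy.geocoding.GeocodeAddresses(
        in_table, locator, build_address_fields(FIELD_MAP), out_feature_class
    )

print(build_address_fields(FIELD_MAP))
```

In this exercise, in_table would be the TRI_Facilities_2017_Cleaned.csv file in C:\Working_with_Tabular_Data and out_feature_class would be TRI_Facilities_2017_Geocoded inside C:\Working_with_Tabular_Data\Geocoding\Geocoding.gdb. The locator argument is the path or URL of the locator you selected in the tool, which this sketch deliberately leaves for you to supply.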

When the geoprocessing tool is done running, click on View Details to view the results. You can see that we have:



7. Rematch addresses


After a table of addresses is geocoded, you may find that not all of the addresses or locations in your table were matched to the results you expected; for example, points may not have been created in the location you expected or lack the precision you were hoping for. Inspecting your table may reveal the reason for an unexpected match; for instance, your input may have been missing a city field, or the street name may have been misspelled. For cases such as these, you can review the results, make corrections in your table, and update your geocoding results. To do this, you can use the interactive rematch tool in ArcGIS Pro to manually review addresses to make corrections to your original input and regeocode, reposition the location of the matched address, or select a different candidate. You can also modify the locator's settings and geocode the addresses that were matched to unexpected results. This process is called rematching.

The following steps show how to rematch addresses in a geocoded feature class in ArcGIS Pro using the interactive Rematch pane:

In the Contents pane, right-click TRI_Facilities_2017_Geocoded, click Data, and click Rematch Addresses to open the Rematch Addresses pane. This allows you to view and examine addresses that failed to match, as well as addresses that matched below a specified match level.
In the Contents pane, right-click TRI_Facilities_2017_Geocoded and open the Attribute Table.
Right-click the Score field and click Sort Ascending. This brings the records with the lowest match scores, which are the most likely to be mismatched, to the top of the table.
In the Rematch Addresses pane, click the plus sign at the top right of the tool. Check the box next to the field for the facility name so that you can search for the location of that facility in Google.
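
The sort-and-review idea can be sketched outside ArcGIS as well. The rows below mimic two fields of the geocoded output, Status (M for matched, U for unmatched, T for tied) and Score. Only the first facility name and its score of 85 come from this exercise. The other rows, the function name needs_review, and the review threshold of 90 are assumptions for illustration.

```python
# Hypothetical rows mimicking a few geocoded-output fields (Status, Score).
# Only the first facility name and its score of 85 come from the tutorial;
# the other rows are invented.
results = [
    {"FACILITY": "US MARINE CORPS RECRUIT DEPOT PARRIS ISLAND (RANGES)", "Status": "M", "Score": 85.0},
    {"FACILITY": "EXAMPLE PLANT A", "Status": "M", "Score": 100.0},
    {"FACILITY": "EXAMPLE PLANT B", "Status": "U", "Score": 0.0},
    {"FACILITY": "EXAMPLE PLANT C", "Status": "T", "Score": 97.5},
]

def needs_review(rows, min_score=90):
    """Return unmatched, tied, or low-scoring rows, lowest score first."""
    flagged = [r for r in rows if r["Status"] != "M" or r["Score"] < min_score]
    return sorted(flagged, key=lambda r: r["Score"])

for row in needs_review(results):
    print(row["Score"], row["Status"], row["FACILITY"])
```

Sorting ascending puts the unmatched record first, then the low-scoring match, then the tie, which is the order in which you would work through them in the Rematch Addresses pane.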

Pick the first record that has a low matching score in the Rematch Addresses pane. Select the facility name and Google the name US MARINE CORPS RECRUIT DEPOT PARRIS ISLAND (RANGES).

This facility has a matching score of 85%. You can see that Google finds a corresponding address for this facility. Open the Google search result and copy the facility address.

Now, we need to match this address to an address in our ArcGIS World Geocoding Service.
Paste the facility address into the Address or Place box (#10).
Click Apply to find the results for the address you just entered (#11).


Now you have a list of candidates that could possibly match your record. Pick the one with the highest score and click Match.

Use the Rematch pane to review the incorrectly matched addresses (or unmatched addresses) as needed. The tool allows you to go through the addresses one by one, dealing only with the current problematic address.

You can also review and rematch addresses that tied or matched if you're not satisfied with the match that your locator found automatically.


In some cases an address may be matched to the wrong location. If this occurs and no appropriate candidate exists, use the Unmatch button to remove the match.

For addresses that could not find any candidates, but for which you know the location on the map, you can use the Pick from Map button to match the address by clicking a location on the map.








Congratulations, you are a tiger of a geocoder!