This guide was created by the staff of the GIS/Data Center at Rice University and is to be used for individual educational purposes only. The steps outlined in this guide require access to ArcGIS Pro software and data that is available both online and at Fondren Library. |
The following text styles are used throughout the guide: Explanatory text appears in a regular font.
Folder and file names are in italics. Names of Programs, Windows, Panes, Views, or Buttons are Capitalized. 'Names of windows or entry fields are in single quotation marks.' "Text to be typed appears in double quotation marks." |
The following step-by-step instructions and screenshots are based on the Windows 10 operating system with the Windows Classic desktop theme and ArcGIS Pro 3.2.0 software. If your personal system configuration varies, you may experience minor differences from the instructions and screenshots. |
Before beginning the tutorial, you will copy all of the required tutorial data onto your Desktop. Option 1 is best if you are completing this tutorial in one of our short courses or from the GIS/Data Center and Option 2 is best if you are completing the tutorial from your own computer.
If you are completing this tutorial from a public computer in Fondren Library and are logged on using the gistrain profile, follow the instructions below:
If you are completing this tutorial from a personal computer, you will need to download the tutorial data online by following the instructions below:
You will now add a feature class to a new map that depicts all of the land parcels in downtown Houston. This feature class was originally obtained from the Harris County Appraisal District (HCAD) Public Data GIS website at http://pdata.hcad.org/GIS/index.html. The original parcels shapefile contained all of the parcels in Harris County, but working with such a large file can be time consuming, so, for the purposes of this tutorial, the layer was clipped to contain only those parcels in downtown Houston.
You will now examine the DowntownParcels layer’s attribute table.
The HCAD_NUM field contains the unique 13-digit HCAD account number for each parcel. You are also provided with the owner name and address for each parcel, but no other information. Fortunately, a tremendous amount of information about the parcels can be obtained from HCAD in standalone tabular format. In this case, you would like to know the particular land use for each parcel, such as residential, commercial, or industrial. This information is contained in a separate Land table, which was originally obtained from the HCAD Public Data Download website at the http://pdata.hcad.org/download/index.html.
Notice that the Land table has been added to the bottom of the Contents pane under a new 'Standalone Tables' section. This is because the table does not currently have spatial information associated with it, so it cannot yet be displayed on the map and, therefore, cannot be listed by drawing order.
Notice that you are again provided with the unique 13-digit HCAD account number, but, this time, the field name is ACCOUNT, rather than HCAD_NUM, as it was in the DowntownParcels layer attribute table. When you perform a join, you are able to specify the field names of the two common join fields, so it is okay for them to have different field names, so long as the actual data contained within the two fields is identical.
You are also provided with the land use code and the land use description for each parcel. In order to symbolize the parcels by land use, you will first need to join the land use table to the parcels feature class. You will now explore the various land use codes.
Notice how many different land use categories there are. For example, commercial land use is divided into very specific categories, such as Restaurant, Ice House, Bar/Lounge, Day Care Center, Medical Office, and Bank. While it is great to have this information available, symbolizing so many different categories on a single map can make the map difficult to interpret. Instead, it might be preferable to symbolize the parcels using more generalized land use categories. Fortunately, the City of Houston has created a lookup table that assigns each appraisal district land use code to one of ten generalized land use categories. You will now explore this lookup table.
Notice that the COUNTY field tells you which county uses the particular land use code listed. “HCAD” stands for Harris County. If you scroll down to the bottom of the table, you will notice “FBCAD” and “MCAD” for Fort Bend County and Montgomery County, respectively. The LANDUSE_CODE and LANDUSE_DESCRIPTION fields should look familiar, since they contain the same information as the Land table you looked at previously. The GROUP_CODE and GROUP_DSCR fields store one of ten general land use categories that correspond to each specific land use code. In order to symbolize the parcels using these generalized land use categories, you will also need to join the lookup table to the land use table and the parcels feature class. You will perform the join using the land use code field that is common to both your land use table and your lookup table.
Now that you have examined all of the data tables, you are ready to join them all together. It is possible to join tables together in any order, but you will want to join both land use tables to the back of your DowntownParcels attribute table, so that you can later symbolize the parcels based on their land use. You will first join the Land table to the DowntownParcels feature class using the unique HCAD account number as the common join field.
A tool titled Add Join should pop up on the screen. In the Add Join tool, notice that 'Input Table' is already set to the DowntownParcels layer, since that was the layer used to open the tool.
Notice the four new fields added to the end of the DowntownParcels attribute table showing the land use for every parcel. This is an example of a successful tabular join.
Because you want to symbolize the parcels using the more generalized land use categories, you will now also join the LandUseCode_LookupTable table using the land use code as the common field.
Notice that all of the values in the six newly joined fields are null. This is an indication that something is wrong with the join. The two most common reasons for a join not to work are either the data values in the two common join fields are not identical between tables or the standalone join table itself is not formatted correctly for import into ArcGIS, which would only occur if it is imported from another file format, such as Excel or CSV. In this case, the standalone join table was already in a native ArcGIS file geodatabase, so there should not be a problem with the overall table formatting. Instead, you will look into the actual values stored within the land use code field in both tables for any discrepancies.
You will now examine the two tables more closely to determine why the join resulted in null values.
This fields view displays a list of all the fields in the corresponding attribute table, along with their field properties. In the 'Field Name' column, locate the Land.LAND_USE_C field. In the 'Data Type' column to the right, notice that the data type is Long, which stands for long integer. In other words, the data is stored in a numeric field. Further down the table, locate the LandUseCode_LookupTable.LANDUSE_CODE field. To the right, notice that the data type is Text. Even if the 4-digit codes stored in both fields are identical, the join will not work if one table stores the values as numbers and one stores the values as text. If the data types in both fields had matched, then the join error would likely have been due to the actual data (or 4-digit codes) being different or selecting the incorrect column name on which to perform the join.
In order to successfully perform a join, the corresponding join fields in both tables must be of the same field type. Unlike in Excel, you cannot simply convert an existing field from one type to another. Instead, you must create a new field of the correct field type and copy the data over into the new field. In this case, you will convert the LAND_USE_C field in the Land table to a text field type to match the text field type used in the LandUseCode_LookupTable table.
The new LAND_USE_C_Text field has now been added to the end of the Land table.
The Calculate Field tool should pop-up on the screen.
Notice that the box below now reads LAND_USE_C_Text = !LAND_USE_C!, as shown below. This statement will copy all of the content from the original LAND_USE_C integer field into the new LAND_USE_C_Text text field.
In the Land table view, ensure that the land use codes have, indeed, been copied over into the new LAND_USE_C_Text field and that they are left-aligned as text, rather than right-aligned as numbers.
You will now attempt to join the lookup table again using the newly edited text field, but you will first need to remove the original join.
The Remove Join tool will pop up on the screen.
You will now repeat the process of joining the lookup table.
Notice that the generalized group code and description fields have been added to the end of the attribute table.
The joins that you have created are only stored in this particular map view. In other words, none of the original tables have been modified. If you were to add the DowntownParcels layer to another new map, you would no longer see the joined tables at the back of the layer’s attribute table. In addition, if your joined data tables were ever moved or deleted, the joined data would no longer be visible in this map view. For this reason, it is often a good idea to export data containing successful tabular joins, so that you do not have to worry about losing the joined data in the future.
Notice that all of the joined attributes are now a permanent part of this new feature class attribute table. Since this permanent feature class contains everything you need in one layer, you may now remove your original parcels layer and both joined tables.
You have successfully joined tabular data in ArcGIS Pro!
You will now symbolize the parcels according to their generalized land use categories.
You will want to group “Commercial” and “Office” into one land use category.
In the first exercise using HCAD data, both the layer attribute table and the join table were already formatted properly, so that you could bring them directly into ArcGIS. Unfortunately, such a situation rarely occurs and most data tables require some formatting to prepare them for import into ArcGIS.
In this exercise, you will be mapping land cover by state. The original data table was obtained from the 2011 Statistical Abstract published by the U.S. Census Bureau at http://www.census.gov/compendia/statab/. The particular table was accessed from the “Geography & Environment” section under the “Land, Land Use, and Earthquakes” category and was called table “265, Land Cover/Use by State: 2003”. You will now format this table for import into ArcGIS.
Along the left side of the spreadsheet are the row numbers. Along the top of the spreadsheet are the column letters. When you are asked to click on a row or column, you will be clicking directly on these numbers and letters.
Notice that the first three rows contain title information for the table, but do not contain any data, so you will delete these rows. Notice also that row 1 is barely visible, because its height has been greatly reduced.
Sometimes data tables contain a lot more information than you actually need. While it is okay to import large tables into ArcGIS Pro, it can make future tasks more difficult for you. Every time you have to select a particular field for symbology, labels, or queries, you will have to scroll through a long list of fields, rather than a targeted list containing only your selected variables of interest. For this reason, it is ideal to remove all unnecessary data columns in Excel before importing into ArcGIS.
For this particular exercise, you will map the percent of all land that is developed land and the percent of all nonfederal land that is forest land. Such a task will require keeping the columns titled “State,” “Total surface area,” “Forest land,” “Developed Land,” and “Water Areas.” You will remove all of the rest of the columns.
Ensure that you are left with the five necessary columns previously mentioned unselected.
The selected columns should disappear and you should be left with the columns labeled “State,” “Total surface area,” “Forest land,” “Developed Land,” and “Water Areas.”
When you bring an Excel table into ArcGIS, it assumes that the first row contains the field name and then it assumes the data begins in the second row. Whatever data type it sees in that second row, it assumes is the data type for the entire column. If there is ever more than one row containing title information, ArcGIS will see that title text in the entire second row and assume that all the data contained in the entire table is text, when, in fact, the majority of the data is numerical. In short, all but one field name row must be removed, but, first, you must create an appropriate field name row.
This newly inserted row will contain your field names. For field names, you should not use any spaces or special characters and should not start with a numeral or use more than 13 characters. You may use underscores, mixed case letters, and numerals (in the middle or end of the field name only).
Since you have created a row with appropriate field names, you may now delete all rows above it.
Because you are mapping data for each state, you do not need the total data for the entire United States contained in row 2 or the accidental repeat of the Alabama data in row 3.
Scroll to the bottom of the table and notice there are extra rows that do not contain data here, as well.
At this point, the table is mostly cleaned up, so you will add two new columns to calculate the percent of the total land area that is developed and forest.
Now you will calculate the formula where Per_Forest = Forest/(Total-Water).
Hover your cursor over the small black box in the bottom right corner of the highlighted cell as shown below. Notice that your cursor changes from a thick white cross to a thin black cross.
One last problem with this worksheet, which is difficult to detect, is that there is actually a space stored after every state name. This space will prevent the table from joining properly to another table whose state names do not have a trailing space. You will now use the TRIM function to remove this extra space. The trim function removes all extra spaces from text, leaving only a single space between words and no spaces at the start or end of the text.
At this point your table has everything you need and nothing extra (except for a duplicate state name field, which will be deleted shortly), but it is always recommended to clear all formatting and programming before bringing the Excel file into ArcGIS. In addition, columns B, G, and H currently have formulas, which need to be converted to text before ArcGIS can read them properly. To get rid of all formatting and formulas, you will copy all of the data into a new blank worksheet as text.
You should now have a clean version of your data with no special formatting. Next you will expand the cells to ensure that all of their content is displayed. Ensure that all data is still selected from being pasted. Hover your cursor over one of the lines between any two of the column letters. Notice that your cursor changes from a single arrow pointing down to a double arrow pointing left and right.
The width of all your columns should now be expanded to show all of the content of the cells.
Finally, you will delete the original State column, which still contains trailing spaces.
You are finished formatting your table.
Now you will return to ArcGIS to map the land cover data by state.
This creates a new map in the project in which you will map your state land cover data. You will now save this map with a different name, so as to distinguish this exercise from the last one.
You will now explore the state data.
Examine the attribute table for the US_States layer and notice that the field containing the state names is “NAME10”.
It will take a moment for the table to load. Note that the 'Sheet' drop-down menu is now populated and Sheet1 is selected by default. Because a single Excel file can have multiple worksheets and each worksheet is treated in ArcGIS as a separate table, you must always specify which specific worksheet you want to use.
Examine the US_LandCover table and notice that the field containing the state names is “State”. Using your knowledge from the first exercise, try to join the US_LandCover worksheet to the US_States feature class and symbolize the states by both the percent developed land and the percent forest land. If you get stuck, ask for help.