...

The Spatial Autocorrelation tool evaluates whether data is clustered, dispersed, or randomly distributed, based on both feature locations and feature values simultaneously.
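To make the idea of combining locations and values concrete, here is a minimal sketch of the Global Moran's I index that underlies this kind of analysis. It is not the tool's implementation: the grid, the binary "shared edge" neighbor rule, and the helper names `rook_weights` and `morans_i` are assumptions chosen for illustration, and the sketch skips the z-score and p-value that the tool reports.

```python
def rook_weights(rows, cols):
    """Binary weights on a rows x cols grid: cells sharing an edge are neighbors."""
    weights = {}
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            weights[i] = {}
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    weights[i][rr * cols + cc] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I for a list of values and a dict-of-dicts weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # S0 is the sum of all spatial weights.
    s0 = sum(w for row in weights.values() for w in row.values())
    # Cross-product of deviations for every pair of neighbors.
    cross = sum(w * dev[i] * dev[j]
                for i, row in weights.items()
                for j, w in row.items())
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


w = rook_weights(4, 4)
# Hypothetical patterns: high values grouped on the left vs. a checkerboard.
clustered = [1, 1, 0, 0] * 4
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

print(morans_i(clustered, w))     # positive: similar values sit next to each other
print(morans_i(checkerboard, w))  # negative: neighbors always differ (dispersed)
```

A value near zero would suggest a random arrangement. The tool then converts the index into the z-score and p-value discussed in the report.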

...

In the top left, notice that the z-score is 39.43. In the top right, notice that any z-score larger than 2.58 has a less than 1% chance of occurring randomly. The center section shows the normal distribution curve, with a dotted line marking where the z-score for this analysis falls on that curve. It also illustrates that a positive z-score indicates that the data is clustered, while a negative z-score would indicate that the data is dispersed. Below the diagram is a helpful summary sentence, "Given the z-score of 39.43, there is a less than 1% likelihood that this clustered pattern could be the result of random chance." Because the z-score is more than an order of magnitude larger than 2.58, you will notice, in the top left, that the p-value is reported as 0.00, which means there is essentially no chance that this pattern is random.
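The relationship between the z-score and the p-value can be checked with a few lines of Python. This is a sketch that assumes a two-tailed test against the standard normal distribution; the function name `two_tailed_p` is made up for this example.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# 2.58 is the threshold from the report; 39.43 is the z-score for this analysis.
for z in (1.96, 2.58, 39.43):
    print(z, two_tailed_p(z))
```

For 2.58 the result is just under 0.01, which matches the 1% threshold shown in the report. For 39.43 the result is too small to distinguish from zero, which is why the report displays 0.00.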

Simply knowing that the percentages of people who drove alone to work are highly clustered might not provide you with actionable knowledge, but this result does indicate that there is a strong spatial pattern in the data, which is worth investing the time to investigate further. On the other hand, if the data were randomly distributed, you could stop here. Because you know your data displays strong clustering, you can now ask more interesting questions. Is it the high or low percentages of commuters driving alone that are clustered? Where are they clustered? You will find that all of the spatial statistics tools build upon each other to answer these questions and provide additional knowledge.

...

Again, the z-score indicates that the percent of people who took public transportation to work is highly clustered. The enormous z-score of 55.84 indicates that the public transportation variable is the most highly clustered of the three. Logically, this makes sense, because we would think people living near metro stations or major bus route stops would be more likely to use them and, therefore, people taking public transportation to work would be strongly spatially clustered around those stop locations.

...

This variable displays a reverse pattern in that the significance of clustering decreases as distance increases. In fact, when the distance is greater than approximately 65,000 feet, the pattern becomes random, rather than clustered, as indicated by the yellow data points on the chart. This is the same distance at which driving alone displayed peak clustering. You could rerun the tool and specify a beginning distance and a distance increment to determine if there is a peak distance below 40,000 feet.
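As a rough illustration of how a beginning distance and a distance increment define the distances that get tested, consider the sketch below. The starting distance, increment, band count, and z-scores are hypothetical values, not results from this dataset, and `distance_bands` and `first_peak` are helper names invented for the example.

```python
def distance_bands(beginning, increment, count):
    """Distances tested when starting at `beginning` and stepping by `increment`."""
    return [beginning + i * increment for i in range(count)]


def first_peak(distances, z_scores):
    """Distance of the first local maximum in z_scores, or None if there is none."""
    for i in range(1, len(z_scores) - 1):
        if z_scores[i - 1] < z_scores[i] > z_scores[i + 1]:
            return distances[i]
    return None


# Hypothetical settings that keep every band below 40,000 feet.
bands = distance_bands(beginning=10000, increment=2000, count=15)
print(bands[0], bands[-1])

# Made-up z-scores, one per band, only to show how a peak would be located.
z_scores = [5.0, 6.2, 7.1, 6.8, 6.0, 5.5, 5.1, 4.8, 4.2, 3.9, 3.5, 3.1, 2.8, 2.5, 2.2]
print(first_peak(bands, z_scores))
```

If no band has a higher z-score than both of its neighbors, `first_peak` returns `None`, which corresponds to a chart with no peak in the tested range.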

...

This layer also provides boundaries for all of the census tracts in Harris County, but, this time, with race data attached.

  1. In the Contents pane, right-click the CensusTracts_Race_2014 layer name and select Attribute Table.

The attribute table lists, for each census tract, the percentage of the population belonging to each race. For this analysis, you will study the distribution of the white population.

  1. Close the CensusTracts_Race_2014 table view.

Because hot spot analysis is based on a distance threshold, it is important to understand how the spatial autocorrelation of your data changes with distance. One option, which you will pursue, is to first run the Incremental Spatial Autocorrelation tool on the same variable to help determine an appropriate distance threshold, which can then be entered in the Hot Spot Analysis tool. Another option is to use the Optimized Hot Spot Analysis tool, which will automatically select an optimal distance threshold for you.

...

  1. At the bottom of the Catalog pane, click the Geoprocessing tab.
  2. At the top of the Geoprocessing pane, click the Back arrow button.
  3. Within the Analyzing Patterns toolset, click the Incremental Spatial Autocorrelation tool.
  4. For 'Input Features', use the drop-down menu to select the CensusTracts_Race_2014 layer.
  5. For 'Input Field', use the drop-down menu to select the Percent_White field.
  6. For 'Number of Distance Bands', type 15.
  7. For 'Beginning Distance', type 10000.
  8. For 'Output Report File', type "ISA_White_15.pdf".
  9. Click Run.
  10. Hover over the Completed with warnings message and click the Output Report File hyperlink.
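The same run could, in principle, be scripted from the Python window in ArcGIS Pro. The sketch below is an assumption about how the steps above map onto `arcpy.stats.IncrementalSpatialAutocorrelation`. The positional order follows the tool's documented parameters as best understood here, and the optional parameters not mentioned in the steps are left as empty strings so the tool uses its defaults. Check the tool reference before relying on it. The helper names `isa_arguments` and `run_isa` are hypothetical.

```python
def isa_arguments(layer="CensusTracts_Race_2014", field="Percent_White",
                  bands=15, beginning=10000, report="ISA_White_15.pdf"):
    """Positional arguments mirroring the values entered in the tool dialog."""
    return [
        layer,      # Input Features
        field,      # Input Field
        bands,      # Number of Distance Bands
        beginning,  # Beginning Distance
        "",         # Distance Increment (assumed: empty lets the tool choose)
        "",         # Distance Method (assumed: empty uses the default)
        "",         # Row Standardization (assumed: empty uses the default)
        "",         # Output Table (not requested in the steps above)
        report,     # Output Report File
    ]


def run_isa(args):
    """Run the tool; only works inside an ArcGIS Pro Python environment."""
    import arcpy  # imported here so the helper above works without ArcGIS
    return arcpy.stats.IncrementalSpatialAutocorrelation(*args)


print(isa_arguments())
```

Calling `run_isa(isa_arguments())` inside ArcGIS Pro would be the scripted equivalent of clicking Run, assuming the layer is present in the current map.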

...

Info

If your newly added HSA_White layer changes from the default blue, white, and red symbology to a single color symbology, hover over the Undo button on the Quick Access toolbar, above the Ribbon.

If the undo step says "Symbology - Update layer renderer : HSA_White", then click the Undo button to restore the default symbology.

Notice that there are hot spots in red, which are statistically significant clusters of high white population, along with cold spots in blue, which are statistically significant clusters of low white population. There are also areas in white, which are not statistically significant. If you are familiar with how race is distributed throughout the city, this result should not be too surprising.
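The red, blue, and white classes can be thought of as bins over each feature's z-score. The sketch below assumes the standard two-tailed critical values for 90%, 95%, and 99% confidence (1.65, 1.96, and 2.58) and ignores any multiple-testing correction the tool may apply, so treat it as an approximation of the idea rather than the tool's exact rule. The function names and sample z-scores are made up.

```python
def confidence_bin(z):
    """Map a z-score to a bin from -3 (99% cold) to +3 (99% hot); 0 is not significant."""
    magnitude = abs(z)
    if magnitude >= 2.58:
        level = 3   # 99% confidence
    elif magnitude >= 1.96:
        level = 2   # 95% confidence
    elif magnitude >= 1.65:
        level = 1   # 90% confidence
    else:
        level = 0   # not statistically significant
    return level if z > 0 else -level


def describe(z):
    """Label a z-score as a hot spot, cold spot, or not significant."""
    b = confidence_bin(z)
    if b > 0:
        return "hot spot"
    if b < 0:
        return "cold spot"
    return "not significant"


# Hypothetical z-scores for three census tracts.
for z in (3.4, -2.1, 0.7):
    print(z, confidence_bin(z), describe(z))
```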

...

Again, a new layer has been added to your map, and you may need to undo the layer renderer update if your census tracts are showing as a single color. The resulting high-high and low-low clusters should fairly closely match the hot and cold spots in the prior analysis and are represented in light pink and light blue. The areas of low white population within a cluster of high white population are represented in dark blue, and the areas of high white population within a cluster of low white population are represented in dark red. In this case, many of these outliers are on the periphery of the hot and cold spots, where they transition into areas that are not significant, which makes the results less interesting. Cluster and outlier analysis is often more telling at a smaller geographic unit. For example, if you were to rerun this same analysis at the census block level, you would see individual red blocks with a high concentration of white population within a large neighborhood of light blue with a low white population. These outliers will prompt you to pose interesting questions to try to explain them. In one neighborhood we investigated in Houston, these high-low outliers, or red blocks, corresponded to blocks with high numbers of building permits. Those two pieces of information combined present a compelling picture of gentrification. Instructions for running the analysis on the block-level data will be added to this wiki next week. Because the geographic units are much smaller, calculating the results takes more time.
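The four cluster and outlier categories described above can be summarized in a short sketch. It assumes that each feature is compared with the overall mean, that its neighbors are summarized by their average value, and that statistical significance has already been determined elsewhere. The function name, the category codes, and the sample numbers are illustrative assumptions, not output from the tool.

```python
def cluster_outlier_type(value, neighbor_mean, global_mean, significant):
    """Classify a feature as HH, LL, HL, LH, or NS (not significant)."""
    if not significant:
        return "NS"
    high = value > global_mean
    neighbors_high = neighbor_mean > global_mean
    if high and neighbors_high:
        return "HH"  # high value among high neighbors (light pink)
    if not high and not neighbors_high:
        return "LL"  # low value among low neighbors (light blue)
    # Mixed cases are outliers: the feature differs from its surroundings.
    return "HL" if high else "LH"  # dark red / dark blue


# Hypothetical percent-white values with an assumed overall mean of 40.
print(cluster_outlier_type(85, 78, 40, True))   # high among high
print(cluster_outlier_type(80, 12, 40, True))   # high among low: an outlier
print(cluster_outlier_type(10, 75, 40, True))   # low among high: an outlier
print(cluster_outlier_type(45, 38, 40, False))  # not significant
```

The second case corresponds to the red blocks in the Houston example: a high value surrounded by low values.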