The purpose of this assignment was to learn how to analyze correlation and spatial autocorrelation using Excel, SPSS, and GeoDa. The assignment is divided into two parts. Each part uses different data and requires critical thinking to interpret the correlations and explain the patterns.
Part 1: Correlation
Question 1:
For part one of this assignment, a scatter plot with a trend line was created in Excel from the provided data to show the correlation between distance (ft) and sound level (dB) (Figure 1).
Figure 1: Scatter Plot
Figure 2: Pearson Correlation
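For readers who want to see what the Pearson correlation in Figure 2 measures, here is a minimal Python sketch of the calculation. The distance and sound values below are made up for illustration and are not the assignment's data, and `pearson_r` is a helper name I chose, not something from Excel or SPSS.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustration values, NOT the assignment's data.
distance_ft = [10, 20, 30, 40, 50]
sound_db = [100, 94, 90, 87, 85]

r = pearson_r(distance_ft, sound_db)
print(f"r = {r:.3f}")
```

A value of r close to -1 means the sound level falls steadily as distance increases, which is the kind of relationship a downward-sloping trend line would show.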
Question 2:
After that, a correlation matrix was created in SPSS using census tract and population data for Detroit, Michigan (Figure 3). The data set contains the racial composition of the tracts as well as other variables such as Finance, Bachelor's Degrees, and Median Household Income. The correlation matrix shows the relationship between each pair of these variables.
Figure 3: Correlation Matrix
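A correlation matrix is just the Pearson correlation computed for every pair of variables. The sketch below shows the idea in plain Python. The column names (`bachelors_pct`, `median_income`, `finance_jobs`) and all of the numbers are hypothetical stand-ins, not the Detroit census values, and this is not how SPSS is implemented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name_a: {name_b: r}} for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical tract-level values, one list entry per census tract.
tracts = {
    "bachelors_pct": [5.0, 12.0, 20.0, 31.0, 44.0],
    "median_income": [21000, 30000, 38000, 52000, 70000],
    "finance_jobs": [40, 35, 90, 120, 150],
}

matrix = correlation_matrix(tracts)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in tracts])
```

The diagonal is always 1 (each variable correlates perfectly with itself) and the matrix is symmetric, which is why only half of a correlation matrix needs to be read.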
Part 2: Spatial Autocorrelation
Introduction:
For the second part of this assignment, I have been asked by the Texas Election Commission (TEC) to analyze the patterns in the presidential elections of 1980 and 2012. The TEC has provided data from those election years in the hope that patterns can be identified, so that they can inform the governor of Texas whether voting patterns have changed over the 32 years. The data provided by the TEC are voter turnout and the percent Democratic vote for both election years. The information I need to retrieve myself is the percent Hispanic population for 2010 and the Texas state shapefile, both from the US Census.
Methods:
To start, I downloaded the 2010 percent Hispanic population data for Texas from the US Census website. The download included many variables that were not needed, so I reduced the columns in the Excel file to just the percent Hispanic. Next, I downloaded the Texas shapefile, also from the US Census website. Once all of the data components were collected, I opened ArcMap, added the Texas shapefile, and joined the percent Hispanic data and the data provided by the TEC to it through the Geo_ID field. After all the tables were joined, I exported the combined data as a new shapefile so I could open it in GeoDa and analyze percent Hispanic population, voter turnout, and percent Democratic vote to determine whether each variable is spatially autocorrelated. Once the new Texas shapefile was loaded into GeoDa, I created a spatial weights file using rook contiguity. Then, Moran's I scatter plots were created for voter turnout and the percent Democratic vote for 1980 and 2012, as well as for the percent Hispanic population for 2010 (5 scatter plots total). LISA cluster maps were also created for each of the five variables.
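To make the weights and Moran's I steps more concrete, here is a minimal Python sketch of the calculation. It is not what GeoDa runs internally. It assumes a small made-up grid instead of Texas counties, row-standardized weights, and helper names (`rook_neighbors`, `morans_i`) that I invented for the example. Rook contiguity means two areas are neighbors only if they share an edge, not just a corner.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each grid cell index to the cells that share an edge with it."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            adjacent = []
            # Rook contiguity: up, down, left, right only (no diagonals).
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    adjacent.append(rr * n_cols + cc)
            neighbors[r * n_cols + c] = adjacent
    return neighbors


def morans_i(values, neighbors):
    """Global Moran's I, assuming row-standardized weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0  # sum of w_ij * dev_i * dev_j
    s0 = 0.0     # sum of all weights
    for i, adj in neighbors.items():
        if not adj:
            continue  # a unit with no neighbors contributes nothing
        w = 1.0 / len(adj)  # each row of weights sums to 1
        for j in adj:
            cross += w * dev[i] * dev[j]
            s0 += w
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


grid = rook_neighbors(4, 4)

# Made-up values: high values in the top half, low values in the bottom half.
clustered = [10] * 8 + [0] * 8
# Made-up values: alternating high and low, like a checkerboard.
checker = [(r + c) % 2 for r in range(4) for c in range(4)]

print("clustered:", round(morans_i(clustered, grid), 3))
print("checkerboard:", round(morans_i(checker, grid), 3))
```

The clustered pattern gives a positive Moran's I because similar values sit next to each other, while the checkerboard gives a strongly negative value because every cell is surrounded by the opposite value.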
Results:
***Before displaying the map results, some background on what the LISA cluster maps show is needed. Red represents High-High (+,+) areas, meaning an area of high value is surrounded by other areas of high value. Pink represents High-Low (+,-) areas, meaning an area of high value is surrounded by low values and is considered an outlier. Blue represents Low-Low (-,-) areas, meaning an area of low value is surrounded by other areas of low value. Lastly, light blue represents Low-High (-,+) areas, meaning an area of low value is surrounded by areas of high value and is also considered an outlier. Areas shown in white are not statistically significant. This information is important to know when analyzing LISA maps.
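The four categories above can be sketched in a few lines of Python. This is only an illustration of the labeling logic: it compares each area's value and the average of its neighbors to the overall mean. It leaves out the significance test that decides which areas stay white on the map, the six values and their neighbor lists are made up, and `lisa_quadrants` is a name I chose for the example.

```python
def lisa_quadrants(values, neighbors):
    """Label each unit High-High, High-Low, Low-High, or Low-Low.

    The first word describes the unit itself and the second describes
    the average of its neighbors, both relative to the overall mean.
    Significance testing is deliberately omitted.
    """
    mean = sum(values) / len(values)
    dev = [v - mean for v in values]
    labels = {}
    for i, adj in neighbors.items():
        if not adj:
            labels[i] = "No neighbors"
            continue
        lag = sum(dev[j] for j in adj) / len(adj)  # average neighbor deviation
        own = "High" if dev[i] > 0 else "Low"
        around = "High" if lag > 0 else "Low"
        labels[i] = f"{own}-{around}"
    return labels


# Six made-up areas arranged in a line, each touching the next one.
line_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
example_values = [9, 10, 2, 8, 1, 1]

labels = lisa_quadrants(example_values, line_neighbors)
for unit, label in labels.items():
    print(unit, label)
```

In this made-up example, area 2 is a Low-High outlier (a low value between two high ones) and area 3 is a High-Low outlier, which corresponds to the light blue and pink colors described above.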
1980 Voter Turnout: The scatter plot (Figure 4) and the LISA map (Figure 5) help represent the data visually. The scatter plot shows positive spatial autocorrelation, with a Moran's I of 0.468. The LISA map shows high voter turnout in northern Texas as well as in a small area in the center. Areas of low voter turnout are in southern and eastern Texas.
Figure 4: Scatter Plot of Voter Turnout in 1980
Figure 5: LISA Map of Voter Turnout in 1980
Figure 6: Scatter Plot of Voter Turnout in 2012
Figure 7: LISA Map of Voter Turnout in 2012
1980 Percent Democratic Vote: The scatter plot (Figure 8) for the percent Democratic vote in 1980 shows a stronger positive correlation than the others so far, with a Moran's I of 0.575. The LISA map (Figure 9) shows a low percentage of Democratic voters in the northern part of the Texas panhandle. There is a high percentage of Democratic voters in the southern tip of Texas as well as in parts of the eastern side.
Figure 8: Scatter Plot of Percent Democratic Voters 1980
Figure 9: LISA Map of Percent Democratic Voters 1980
Figure 10: Scatter Plot of Percent Democratic Voters 2012
Figure 11: LISA Map of Percent Democratic Voters 2012
Figure 12: Scatter Plot of Percent Hispanic Populations
Figure 13: LISA Map of Percent Hispanic Populations
Scatter plots and LISA maps are useful ways to represent and visualize different sets of data so that spatial autocorrelation is easier to find. A few relationships can be seen in these maps. First, counties with a high percentage of Democratic voters tend to be counties that also have a high percentage Hispanic population. This is consistent with the tendency of Hispanic voters to vote Democratic. An interesting difference that relates back to the study question is that there were fewer counties with high voter turnout in 2012 than in 1980. This could have been influenced by who was running for president at the time. Another interesting observation is the shift in the location of the percent Democratic vote between 1980 and 2012. There was a high percent Democratic vote on the eastern side of Texas in the 1980 election, but in the 2012 election those same counties became insignificant and counties on the western side of the state became high percent. The cause of that shift would be worth looking into further. Another notable observation comes from Figure 13. This LISA map shows the fewest outliers, but those counties are located in interesting areas where there must be a significant reason for that outcome. Overall, the TEC could inform the governor of Texas that there has been a decrease in voter turnout since 1980 and that counties with a high percentage Hispanic population will most likely also have a high percent Democratic vote.