What is spatial autocorrelation and how is it measured? Assess its importance to spatial analysis as implemented in GIScience

GY7707 Geospatial Analytics

CW Assignment 1

Note: The CW1 assignment below is a CHOICE. Students who have access to R on a

computer may take Task 1, which involves the analysis of a real data set. Students who do

not have access to a sufficiently fast computer and/or R may take Task 2, an essay question.


Task 1: Pipeline Accidents in the US

This part of the assignment makes use of real data supplied by the US Department of

Transportation Pipeline and Hazardous Materials Safety Administration on pipeline accidents

involving gas or oil for the period 1986 to 2017 and gathered together and edited by (and

supplied courtesy of) Dr. Richard Stover. The data set consists of 8890 incidents and is available as

the file pipeline.csv on Blackboard (along with this file). You may like to view the web page at

http://www.biologicaldiversity.org/campaigns/americas_dangerous_pipelines and the

associated video at https://www.youtube.com/watch?v=3rxqUXqPzog&feature=youtu.be.

Select a single US state (NOT North Dakota, and ideally one with sufficient accident data);

each student must have a different state, so check your selection with me. Then use R to produce:


(a) A map of pipeline incidents (20%)

The incident data set includes a variable (coded YES or NO) indicating whether or not

each record has recorded latitude/longitude coordinates. As we want exact coordinates for the

spatial data analysis, you should subset the data to exclude those records which do not (i.e., NO).

It is entirely up to you how to present the data on a map: e.g., you could generate a choropleth

map at county level, but I will be looking for some novelty and effective visualisations

alongside otherwise complete maps (scale, legend, orientation etc.).
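
The subsetting step described above might look something like the following minimal sketch. The column names (state, has_coord, lon, lat) are hypothetical stand-ins, not the real names in pipeline.csv, so check the actual names with names() after reading the file with read.csv(). A small toy data frame is used here so that the example runs on its own.

```r
# Toy stand-in for pipeline.csv; the column names here are hypothetical
# and should be replaced with the real ones (check with names(incidents)).
# With the real file you would start from: incidents <- read.csv("pipeline.csv")
incidents <- data.frame(
  state     = c("TX", "TX", "TX", "OK"),
  has_coord = c("YES", "NO", "YES", "YES"),   # hypothetical YES/NO flag
  lon       = c(-97.7, NA, -95.4, -97.5),
  lat       = c(30.3, NA, 29.8, 35.5)
)

# Keep only the chosen state and only records with recorded coordinates
state_incidents <- subset(incidents, state == "TX" & has_coord == "YES")

# Number of incidents left for mapping
nrow(state_incidents)
```

It is worth reporting how many records are dropped at this step, since it affects how representative the resulting map is.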

(b) A separate spatial analysis of the data set enabling an evaluation of the potential human

and environmental exposure to contamination within the state (80%)

1. It is *YOUR* choice as to what spatial data and analysis steps you include in addition to the

pipeline incident data and state boundary. How you handle the temporal dimension of the

data in your spatial analysis is also an issue. Your analysis should include the kinds of spatial analysis operations

discussed in some of the lectures/practicals, e.g., buffers, overlays, kernel densities etc. The

course text Lovelace et al. (2019) might well be a source of inspiration, or also the Brunsdon

and Comber (2015) book.
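
As one illustration of the operations listed in item 1, a kernel density surface can be estimated from point coordinates. The sketch below is only a starting point: it uses kde2d() from the MASS package (which ships with standard R installations) on made-up coordinates, assumed to be in a projected CRS measured in metres. It is not the method from the lectures or the course texts, and the default bandwidth is used purely for convenience.

```r
library(MASS)  # ships with standard R installations; provides kde2d()

set.seed(1)
# Toy incident coordinates in a projected CRS (metres); purely illustrative
x <- c(rnorm(50, mean = 1000, sd = 200), rnorm(50, mean = 3000, sd = 300))
y <- c(rnorm(50, mean = 2000, sd = 200), rnorm(50, mean = 1000, sd = 300))

# Two-dimensional kernel density estimate on a 50 x 50 grid
dens <- kde2d(x, y, n = 50)

# dens$z is a 50 x 50 matrix of density values over the grid dens$x, dens$y
image(dens, main = "Kernel density of toy incident points")
points(x, y, pch = 20, cex = 0.4)
```

The choice of bandwidth strongly affects the resulting surface, so any value you use in your own analysis should be stated and justified in the write-up.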

2. You should include population data and environmental data such as land use and rivers (as

well as associated water bodies). Spatial data are available from various sources via (for

example) the R osmar package. Population data can be obtained from a variety of sources

(for example http://zevross.com/blog/2015/10/14/manipulating-and-mapping-us-census-data-in-r-using-the-acs-tigris-and-leaflet-packages/

and the UScensus2010 R library (see lecture 7), but there are many others). Other geospatial data are available via


https://ckan.geoplatform.gov/. Remember that such data may well be supplied in different

projections, so you will need to transform them to a common CRS before any spatial analysis. You should be

specific about the sources of any data you have included and include URLs to any data you

have identified and downloaded yourself. You should supply sufficient information so I can

download any data (or obtain within an appropriate R library) that you have used. Remember

to include any source/copyright/credit lines for such data in any maps/visual outputs.
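
The point about projections in item 2 can be sketched as follows, assuming the sf package is installed. The coordinates are made up, and EPSG:5070 (NAD83 / Conus Albers) is only an assumed choice of projected CRS in metres; a CRS appropriate to your own state may well be better. The rivers layer mentioned in the comment is hypothetical.

```r
library(sf)  # assumes the sf package is installed

# Toy incident locations in longitude/latitude (WGS84, EPSG:4326)
incidents <- data.frame(id = 1:2, lon = c(-97.7, -95.4), lat = c(30.3, 29.8))
pts <- st_as_sf(incidents, coords = c("lon", "lat"), crs = 4326)

# Reproject to a projected CRS in metres before measuring distances;
# EPSG:5070 (NAD83 / Conus Albers) is an assumed choice, not a requirement
pts_m <- st_transform(pts, 5070)

# 1 km buffers around each incident, now in metre units
buf <- st_buffer(pts_m, dist = 1000)

# Any other layer must share the same CRS before an overlay, e.g.
# rivers <- st_transform(rivers, st_crs(pts_m))   # 'rivers' is hypothetical
st_crs(buf)$epsg
```

Checking st_crs() on every layer before a buffer or overlay is a cheap way to catch mismatched projections early.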

3. A write-up of the spatial analysis, including what you did and what the analysis revealed

about the risk to human and physical environments. Include your commented R code as

an appendix, or in line with the text and images if you are creating a document using

RMarkdown. Your write-up should ideally be no more than 2500 words (not including code

and references) and include appropriate results in terms of tables and figures.

4. You may like to refer to Obida et al. (2018) and Park et al. (2016) for some ideas about

similar data and some approaches to spatial analysis using the incident data in North Dakota.


Brunsdon, C. and Comber, L. 2015. An introduction to R for spatial analysis and mapping.

Sage: London.

Lovelace, R., Nowosad, J. and Muenchow, J. 2019. Geocomputation with R. CRC Press. This is

also available online: https://geocompr.robinlovelace.net/

Obida, C.B., Blackburn, G.A., Whyatt, J.D. and Semple, K.T. 2018. Quantifying the exposure of

humans and the environment to oil pollution in the Niger Delta using advanced geostatistical

techniques. Environment International, 111: 32-42.

Park, Y.S., Al-Qublan, H., Lee, E. and Egilmez, G. 2016. Interactive spatiotemporal analysis of

oil spills using Comap in North Dakota. Informatics, 3(2), 4. doi:10.3390/informatics3020004.

Task 2: Essay

“What is spatial autocorrelation and how is it measured? Assess its importance to spatial

analysis as implemented in GIScience”

Answer the essay above with reference to a single application context and include examples

and illustrations. Essays should be academic with references and figures and ideally no more

than 2500 words (not including references).

This CW is due to be submitted (electronically via Turnitin) by May 4 2020.

Nick Tate, March 2020