{"id":81,"date":"2017-04-19T11:59:05","date_gmt":"2017-04-19T10:59:05","guid":{"rendered":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/?p=81"},"modified":"2017-04-28T10:12:58","modified_gmt":"2017-04-28T09:12:58","slug":"loading-and-visualizing-shapefiles-in-r","status":"publish","type":"post","link":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/2017\/04\/19\/loading-and-visualizing-shapefiles-in-r\/","title":{"rendered":"Loading and visualizing Shapefiles in R"},"content":{"rendered":"<p>Geographical Information Systems such as <a href=\"http:\/\/qgis.org\">QGIS<\/a> are common in the typical archaeologist&#8217;s toolbox. However, the complexities of our datasets mean that we need additional tools to identify patterns in time and space.<\/p>\n<p>Imagine that you are working with a set of points in a vector format such as a Shapefile. It can be a set of settlements, C14 dates or shipwrecks&#8230;it doesn&#8217;t matter; you will want to combine spatial analysis with other methods. This is particularly true when you are trying to get a feel for the dataset by performing <em>Exploratory Data Analysis<\/em>, so you need to compile summary statistics and create some meaningful visualisations.<\/p>\n<p>My personal approach is to combine QGIS with R and exploit the awesome <a href=\"http:\/\/ggplot2.org\">ggplot2<\/a> package to get some insight into the dataset beyond spatial patterns. R has some nice spatial functionality, but the cool thing is to integrate these methods with other plots to get a general picture of the case study you are working on.<\/p>\n<h2>Example: Aircraft crashes in the Orkney islands<\/h2>\n<p>The Orkney islands were a key location for the British war effort during the two World Wars because they hosted <a href=\"http:\/\/www.scapaflowwrecks.com\">Scapa Flow<\/a>: the main naval base of the Royal Navy. 
For this reason the islands saw intense aerial activity of all sorts: squadrons defending the base, German bombers attacking it, aircraft carrier operations&#8230;you name it. Inevitably this activity generated aircraft crashes due to accidents and combat, and these events have been compiled by different initiatives such as Project <a href=\"https:\/\/canmore.org.uk\/project\/935283\">Adair-Whitaker<\/a> or the <a href=\"http:\/\/www.crashsiteorkney.com\">ARGOS group<\/a>.<\/p>\n<p>I created a nice Shapefile of aircraft crash sites based on the information provided by <a href=\"https:\/\/canmore.org.uk\/\">Canmore<\/a>. How can I explore the dataset beyond the spatial dynamics? How many German aircraft were shot down? Which types and squadrons sustained the most losses? Let&#8217;s take a look!<\/p>\n<h3>Download Data<\/h3>\n<p>The first thing to do is to load the Shapefile. This format consists of several files, so I created a <em>zip<\/em> file containing the whole set. First of all we need to download it to a temporary directory and extract its contents:<\/p>\n[generic linenumbers=&#8221;False&#8221;]\ntmp &lt;- tempdir()<br \/>\nurl &lt;- &quot;https:\/\/github.com\/xrubio\/pastByNumbers\/raw\/master\/data\/aircraft_orkney.zip&quot;<br \/>\n# store the zip inside the temporary directory<br \/>\nfile &lt;- file.path(tmp, basename(url))<br \/>\n# mode = &quot;wb&quot; keeps the binary zip intact on Windows<br \/>\ndownload.file(url, file, mode = &quot;wb&quot;)<br \/>\nunzip(file, exdir = tmp)<br \/>\n[\/generic]\n<h3>Load the spatial data<\/h3>\n<p>We will use the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/rgdal\/index.html\">rgdal<\/a> package to load the contents of the Shapefile into R:<br \/>\n[generic linenumbers=&#8221;False&#8221;]\nlibrary(rgdal)<br \/>\naircraftShp &lt;- readOGR(dsn = tmp, layer=&quot;aircraft_orkney&quot;)<br \/>\nstr(aircraftShp)<br \/>\n[\/generic]\n<p><em>aircraftShp<\/em> is an object of class <em>SpatialPointsDataFrame<\/em> from package <em>sp<\/em>. 
It holds some useful information, such as the bounding box delimiting our spatial entities and the Coordinate Reference System in use. Ideally we would like to get the attributes of the spatial entities as a data frame so we can use them with most R packages:<\/p>\n[generic linenumbers=&#8221;False&#8221;]\naircraft &lt;- as.data.frame(aircraftShp)<br \/>\nstr(aircraft)<br \/>\n[\/generic]\n<h3>Maps and planes<\/h3>\n<p>First of all let&#8217;s plot the spatial coordinates of the aircraft crash sites against a nice background. I defined a rectangle named <em>location<\/em> based on the bounding box of <em>aircraftShp<\/em>:<\/p>\n[generic linenumbers=&#8221;False&#8221;]\naircraftShp@bbox<br \/>\nlocation &lt;- c(-3.85, 58.7, -1.95, 59.4)<br \/>\n[\/generic]\n<p>We can use <a href=\"https:\/\/journal.r-project.org\/archive\/2013-1\/kahle-wickham.pdf\">ggmap<\/a> to download the background:<br \/>\n[generic linenumbers=&#8221;False&#8221;]\nlibrary(ggplot2)<br \/>\nlibrary(ggmap)<br \/>\nbgMap &lt;- get_map(location=location, source=&quot;stamen&quot;, maptype=&quot;watercolor&quot;)<br \/>\nggmap(bgMap) + geom_point(data=aircraft, aes(x=coords.x1, y=coords.x2))<br \/>\n[\/generic]\n<figure id=\"attachment_89\" aria-describedby=\"caption-attachment-89\" style=\"width: 480px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-89\" src=\"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/baseMap.png\" alt=\"\" width=\"480\" height=\"480\" srcset=\"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/baseMap.png 480w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/baseMap-150x150.png 150w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/baseMap-300x300.png 300w, 
http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/baseMap-200x200.png 200w\" sizes=\"auto, (max-width: 480px) 100vw, 480px\" \/><figcaption id=\"caption-attachment-89\" class=\"wp-caption-text\">Aircraft crash sites around Scapa Flow<\/figcaption><\/figure>\n<h3>Exploring other dimensions<\/h3>\n<p>This is ok but it is something we can already do with any GIS so&#8230;what&#8217;s the deal? The thing is that <em>aircraft<\/em> is an R data frame, so we can plot any variable with a graphics library, something that R does way better than any GIS. For example, let&#8217;s take a look at the temporal dimension:<\/p>\n[generic linenumbers=&#8221;False&#8221;]\nggplot(aircraft, aes(x=year, fill=force)) + geom_histogram(binwidth=1) + facet_wrap(~war, scales=&quot;free_x&quot;)<br \/>\n[\/generic]\n<figure id=\"attachment_92\" aria-describedby=\"caption-attachment-92\" style=\"width: 625px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"size-large wp-image-92\" src=\"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-1024x1024.png\" alt=\"\" width=\"625\" height=\"625\" srcset=\"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-1024x1024.png 1024w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-150x150.png 150w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-300x300.png 300w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-768x768.png 768w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-600x600.png 600w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/time-200x200.png 200w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><figcaption id=\"caption-attachment-92\" 
class=\"wp-caption-text\">Losses by force and year<\/figcaption><\/figure>\n<p>We can also create a table of the plane types used by the different British squadrons that suffered losses:<\/p>\n[generic linenumbers=&#8221;False&#8221;]\nggplot(na.omit(aircraft), aes(y=type, fill=force, x=as.factor(sqdn))) + geom_raster()<br \/>\n[\/generic]\n<figure id=\"attachment_91\" aria-describedby=\"caption-attachment-91\" style=\"width: 625px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" class=\"size-large wp-image-91\" src=\"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/sqdns-1024x717.png\" alt=\"\" width=\"625\" height=\"438\" srcset=\"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/sqdns-1024x717.png 1024w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/sqdns-300x210.png 300w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/sqdns-768x538.png 768w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/sqdns-600x420.png 600w, http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-content\/uploads\/sites\/7\/2017\/04\/sqdns-200x140.png 200w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><figcaption id=\"caption-attachment-91\" class=\"wp-caption-text\">Aircraft types per squadron<\/figcaption><\/figure>\n<h3>Combining everything<\/h3>\n<p>We can finally create an infographic combining space, time and other stuff with the package <a href=\"https:\/\/cran.r-project.org\/web\/packages\/gridExtra\/index.html\">gridExtra<\/a>:<\/p>\n<p>First, we create nicer versions of the plots we just saw:<\/p>\n[generic linenumbers=&#8221;False&#8221;]\ng1 &lt;- ggmap(bgMap, extent=&quot;panel&quot;, darken = c(.4,&quot;white&quot;)) + geom_point(data=aircraft, aes(x=coords.x1, y=coords.x2, col=war), size=2) + geom_text(data=aircraft, aes(x=coords.x1, 
y=coords.x2,label=type), family = &quot;Trebuchet MS&quot;, color=&quot;grey40&quot;, check_overlap=TRUE, nudge_y=0.01) + theme(legend.position=&quot;bottom&quot;) + ggtitle(&quot;aircraft losses&quot;) + xlab(&quot;&quot;) + ylab(&quot;&quot;) + scale_color_manual(values=c(&quot;indianred2&quot;, &quot;goldenrod1&quot;))<\/p>\n<p>g2 &lt;- ggplot(na.omit(aircraft), aes(y=type, col=force, x=as.factor(sqdn))) + geom_point(size=3) + theme_bw() + theme(legend.position=&quot;bottom&quot;) + xlab(&quot;squadron&quot;) + ylab(&quot;&quot;) + ggtitle(&quot;Britain &#8211; aircraft types per squadron&quot;)<\/p>\n<p>g3 &lt;- ggplot(aircraft, aes(x=year, fill=force)) + geom_histogram(binwidth=1, col=&quot;black&quot;) + facet_wrap(~war, scales=&quot;free_x&quot;, ncol=1) + theme_bw() + theme(legend.position=&quot;bottom&quot;) + ggtitle(&quot;losses per year&quot;) + xlab(&quot;&quot;) + ylab(&quot;&quot;)<br \/>\n[\/generic]\n<p>And we can then combine the 3 plots into a nice visualization:<br \/>\n[generic linenumbers=&#8221;False&#8221;]\nlibrary(gridExtra)<br \/>\ngrid.arrange(arrangeGrob(arrangeGrob(g1,g2, heights=c(2\/3,1\/3), ncol=1), g3, widths=c(2\/3, 1\/3), ncol=2))<br \/>\n[\/generic]\n<figure style=\"width: 1800px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/github.com\/xrubio\/pastByNumbers\/raw\/master\/gallery\/aircraft_orkney.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-large\" src=\"https:\/\/github.com\/xrubio\/pastByNumbers\/raw\/master\/gallery\/aircraft_orkney.png\" width=\"1800\" height=\"1575\" \/><\/a><figcaption class=\"wp-caption-text\">Aircraft losses in the Orkney islands during the two World Wars<\/figcaption><\/figure>\n<h3>Interpretation<\/h3>\n<p>These visualizations allow us to identify several interesting things going on:<\/p>\n<ul>\n<li>The Second World War had way more crashes than the First one.<\/li>\n<li>It would seem that the Luftwaffe stopped flying over Scapa Flow after the first years of WW2.<\/li>\n<li>The Fleet Air Arm 
had a very large increase in losses after 1941.<\/li>\n<li>Most squadrons flew only one type of plane, the exception being the <em>771 Naval Air Squadron<\/em>. This makes total sense given the diversity of missions flown by <a href=\"https:\/\/en.wikipedia.org\/wiki\/771_Naval_Air_Squadron\">this squadron<\/a>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Geographical Information Systems such as QGIS are common in the typical archaeologist&#8217;s toolbox. However, the complexities of our datasets mean that we need additional tools to identify patterns in time and space. Imagine that you are working with a set of points in a vector format such as a Shapefile. It can be a set [&hellip;]<\/p>\n","protected":false},"author":13,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[9,14,12,10,11,13],"class_list":["post-81","post","type-post","status-publish","format-standard","hentry","category-r","tag-conflict-archaeology","tag-ggplot","tag-orkney","tag-shp","tag-warfare","tag-ww2"],"_links":{"self":[{"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/posts\/81","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/comments?post=81"}],"version-history":[{"count":18,"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/posts\/81\/revisions"}],"predecessor-version":[{"id":88,"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/posts\/81\/revisions\/88"}],"wp:attachment":[{"href":"http:\/\/resea
rch.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/media?parent=81"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/categories?post=81"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/research.shca.ed.ac.uk\/past-by-numbers\/wp-json\/wp\/v2\/tags?post=81"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}