May tropical storms since 1851

Ana became our first named storm of the season on 5/8/2015 as a ‘subtropical’ storm, but then transitioned into a ‘tropical’ storm two days later.

The Atlantic hurricane season hasn’t officially started! It runs from June 1st to November 30th, but that doesn’t mean we can’t get storms outside of hurricane season. We can get a tropical cyclone in any month of the year (and we have had one in each month).

Anywho, I was interested to see how many tropical storms we have had in the month of May since 1851. Our records go back to 1851, which is not far, but it’s what we have. Ah… imagine a world with all the data we could ask for 🙂

[Figure: Rplot, a map of May tropical storm tracks since 1851]

To create this, you first need to “scrape” the data from Unisys. I first saw this code in a post by Arthur Charpentier: http://freakonometrics.hypotheses.org/17113

I did make some of my own adjustments, but the basis is the same.
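
If you want to see what the parsing half of the scrape does without hitting the website, here is a small offline sketch. It uses the same column widths that the script passes to read.fwf, but the header lines and the two data rows are made up for illustration (they are not real Unisys records), and I am assuming the real files have three header lines because the script skips three.

```r
# Column widths used for track.dat in the script: ADV, LAT, LON, TIME, WIND, PR, STAT
widths <- c(4, 6, 8, 12, 4, 6, 20)

# Made-up rows, formatted so that each field is exactly as wide as listed above.
rows <- sprintf("%4d%6.1f%8.1f%12s%4d%6d %-19s",
                1:2, c(30.0, 30.5), c(-78.0, -77.5),
                c("05/08/03Z", "05/08/09Z"),
                c(40L, 50L), c(1005L, 1001L),
                c("SUBTROPICAL STORM", "TROPICAL STORM"))

# Three placeholder header lines stand in for whatever the real file starts with.
con <- textConnection(c("placeholder header 1", "placeholder header 2",
                        "placeholder header 3", rows))
track <- read.fwf(con, skip = 3, widths = widths, stringsAsFactors = FALSE)
close(con)

names(track) <- c("ADV", "LAT", "LON", "TIME", "WIND", "PR", "STAT")

# Character fields keep their padding, so trim them before comparing or slicing.
trim <- function (x) gsub("^\\s+|\\s+$", "", x)
track$STAT  <- trim(track$STAT)
track$month <- as.numeric(substr(trim(track$TIME), 1, 2))
```

The numeric columns come back as numbers, while TIME and STAT come back as padded strings, which is why the trimming step matters before any filtering.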

Then I used ggmap and ggplot2 to create a ‘Google’-style map.

Scraping the data is cumbersome and takes a while to run. But once you have the data, you save it once and can reuse it from then on.
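
The save-once idea can be sketched as a tiny wrapper: read the csv if it already exists, otherwise run the slow step and write the file. This is only a sketch in base R. The names load.or.build, toy.build, and cache.file are hypothetical, and the toy builder stands in for the real scraping loop so the example runs offline.

```r
# Hypothetical helper: reuse a saved csv if present, otherwise build and save it.
load.or.build <- function(path, build.fun){
  if(file.exists(path)){
    return(read.csv(path, stringsAsFactors = FALSE))
  }
  dat <- build.fun()                       # the slow step (e.g. the scraping loop)
  write.csv(dat, path, row.names = FALSE)
  dat
}

# Toy stand-in for the scraper; it counts how often it is actually run.
calls <- 0
toy.build <- function(){
  calls <<- calls + 1
  data.frame(name = c("TOY", "TOY"), year = c(2015, 2015), WIND = c(40, 50),
             stringsAsFactors = FALSE)
}

cache.file <- tempfile(fileext = ".csv")
first  <- load.or.build(cache.file, toy.build)  # builds the data and writes the file
second <- load.or.build(cache.file, toy.build)  # reads the file, no rebuild
```

With a real path instead of tempfile(), the second run of your script skips the scrape entirely.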

The most difficult part of this project is probably the filtering of the data (once you have collected it). It can be tricky, but it is fun.
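
To give a feel for the filtering, here is a toy version in base R (the real script uses plyr). The data frame below is made up: TOY-A is a May storm that reached tropical storm status, TOY-B reached that status but not in May, and TOY-C was active in May but never got past a depression. Only TOY-A should survive the filter.

```r
# Made-up track data, one row per advisory.
tracks <- data.frame(
  name.year = c("TOY-A", "TOY-A", "TOY-A",
                "TOY-B", "TOY-B",
                "TOY-C", "TOY-C"),
  month = c(5, 5, 5, 8, 9, 5, 6),
  STAT  = c("TROPICAL DEPRESSION", "TROPICAL STORM", "HURRICANE-1",
            "TROPICAL STORM", "HURRICANE-2",
            "TROPICAL DEPRESSION", "TROPICAL DEPRESSION"),
  stringsAsFactors = FALSE
)

# Per storm: did it ever reach tropical storm status, and was it active in May?
is.ts  <- tapply(tracks$STAT == "TROPICAL STORM", tracks$name.year, any)
in.may <- tapply(tracks$month == 5, tracks$name.year, any)
keep   <- names(is.ts)[is.ts & in.may]

# Keep every row of the storms that passed both checks.
may.storms <- tracks[tracks$name.year %in% keep, ]
```

Note that both checks are made per storm, not per row, so the whole track of a qualifying storm is kept, including any rows from later months.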


library(XML)
library(gdata)
library(ggmap)
library(ggplot2)
library(plyr)
library(stringr)
library(maps)
library(httr)
###I got a lot of code from the link below.  I have made a few of my own adjustments.
###http://freakonometrics.hypotheses.org/17113

extract.track=function(year=2014,p=TRUE, pacific=FALSE){
 #Defaults to the Atlantic basin unless pacific = TRUE
 loc <- paste("http://weather.unisys.com/hurricane/atlantic/",
 year, "H", "/index.php",sep="")
 H <- TRUE #whether the website requires the H in the path
 if(pacific){
 loc <- paste("http://weather.unisys.com/hurricane/e_pacific/",year,"/index.php",sep="") }

 response<- GET(loc)
 if(!pacific && response$status_code != 200){#HTTP request failed, so retry the Atlantic path without the H
 H <- FALSE
 loc <- paste("http://weather.unisys.com/hurricane/atlantic/",
 year, "/index.php",sep="")
 }

 #1995, 1996, and 2001 require the Atlantic path without the H
 if(!pacific && year %in% c(1995, 1996, 2001)){loc <- paste("http://weather.unisys.com/hurricane/atlantic/",
 year, "/index.php",sep="")}

 #htmlParse(response, encoding = "UTF-8")

 #tabs <- readHTMLTable(htmlParse(loc))
 tabs <- readHTMLTable(loc)[[1]]

 storms <- unlist(strsplit(paste(tabs$Name), split = " "))

 if(any(storms == "KEIT")){ #Takes care of 1988's Keith storm
 storms[which(storms == "KEIT")] <- "KEITH"}

 if(any(storms == "FLOY")){ #Takes care of 1993's Floyd storm
 storms[which(storms == "FLOY")] <- "FLOYD"}

 if(any(storms == "CHARLI")){ #Takes care of 1972's Charlie storm
 storms[which(storms == "CHARLI")] <- "CHARLIE"}

 #storms <- unlist(strsplit(as.character(tabs[[1]]$Name),split=" "))
 #x <- strsplit(paste(tabs[[3]][[2]]), "\\(")
 #y <- unlist(lapply(x, function(l) l[[1]]))
 #storms <- unlist(strsplit(y, split = " "))
 index <- storms%in%c("Tropical","Storm", paste("Hurricane"),
 paste("Hurricane-", 1:5, sep=""),
 "Depression","Subtropical","Extratropical",
 "Low",paste("Storm-",1:6,sep=""), "Xxx",
 "FIFTEE", "SIXTEE", "FOURTE", "NINETE")
 nstorms <- storms[!index]

 #k <- which(nstorms %in% c("ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT", "NINE", "TEN",
 # "ELEVEN", "TWELVE", "THIRTEEN", "FOURTEEN", "FIFTEEN", "SIXTEEN", "SIXTEE", "SEVENTEEN") )

 #if(length(k)>0){ nstorms <- nstorms[-k] }

 TRACK=NULL
 track=NULL
 for(i in length(nstorms):1){

 if(H){
 loc=paste("http://weather.unisys.com/hurricane/atlantic/",year,
 "H", "/",nstorms[i],"/track.dat",sep="") }

 if(!H){
 loc=paste("http://weather.unisys.com/hurricane/atlantic/",year,
 "/",nstorms[i],"/track.dat",sep="") }

 if(year %in% c(1995, 1996, 2001)){loc <- paste("http://weather.unisys.com/hurricane/atlantic/",
 year, "/", nstorms[i], "/track.dat",sep="")}
 #Again, these particular years require this type of path.
 #I found this out just by checking for errors year by year.

 if(pacific){
 loc=paste("http://weather.unisys.com/hurricane/e_pacific/",year,
 "H", "/",nstorms[i],"/track.dat",sep="") }

 track=read.fwf(loc,skip=3,widths = c(4,6,8,12,4,6,20))

 #temp <- scan(loc, skip = 3, what = "character", sep = "\t")
 #temp
 #(str_split_fixed(temp, " ", 7))

 names(track)=c("ADV","LAT","LON","TIME","WIND","PR","STAT")
 track$LAT=as.numeric(as.character(track$LAT))
 track$LON=as.numeric(as.character(track$LON))
 track$WIND=as.numeric(as.character(track$WIND))
 track$PR=as.numeric(as.character(track$PR))
 track$year=year
 track$name=nstorms[i]
 TRACK=rbind(TRACK,track)
 if(p==TRUE){ cat(year,i,nstorms[i],nrow(track),"\n")}}
 return(TRACK)}

#Test out the function for one year
m=extract.track(1995, pacific = FALSE)
head(m)

# returns string w/o leading whitespace
trim.leading <- function (x) sub("^\\s+", "", x)

# returns string w/o leading or trailing whitespace
trim <- function (x) gsub("^\\s+|\\s+$", "", x)

#This loop is slow: expect about 15-20 minutes, depending on your computer.
#You will only have to do this once, though, because we then save the data to use
#later.
atlantic=NULL
#Now get data from 1851 to 2014
for(y in 1851:2014){
 atlantic=rbind(atlantic,extract.track(y))
}

#Create a variable called 'month'
month          <- substr(trim(paste(atlantic$TIME)), 1, 2)
atlantic$month <- month

#Create a variable called 'name.year' - this is crucial for filtering the data later.
atlantic$name.year <- paste(atlantic$name, atlantic$year)

#write to a csv file
write.csv(atlantic, "C:/Users/Lisa/Documents/wx blog/hurricane.csv", row.names = FALSE)

# returns string w/o leading or trailing whitespace
trim <- function (x) gsub("^\\s+|\\s+$", "", x)

#Read in the data you just created
#This step is a little redundant, yes, but I have this code split into
#2 different scripts.  Once you collect the data and write the csv, you will no
#longer need the above code.  You will just be able to read in the data and proceed below.
data <- read.csv("C:/Users/Lisa/Documents/wx blog/hurricane.csv")
#Rebuild name.year as "year name"; this overwrites the "name year" version saved in the csv.
data$name.year <- paste(data$year, data$name)

#Find the first entry of each storm.
origin <- ddply(data, .(name.year), function(x){x[1,]})

#Need to get the storms that developed into a tropical storm or hurricane by May
#data[which(data$month == 5 & data$STAT == "TROPICAL STORM"), ]

#Find the storms that reached tropical storm status and have at least one May observation.
#trim() is applied in case STAT still carries padding from the fixed-width read.
early.names <- ddply(data, .(name.year), function(x){if(any(trim(paste(x$STAT)) == "TROPICAL STORM") &&
 any( x$month == 5)) paste("TRUE") })

#Now get a subset of the large dataset that includes just these May Tropical Storms
subset <- data[data$name.year %in% paste(early.names$name.year),]

#Get the very last observation of each storm so we can mark it with an x
last <- ddply(subset, .(name.year), function(x){x[nrow(x),]})

map <- get_map(location = c(lon = -80, lat = 30), zoom = 4)
p   <- ggmap(map)
p + geom_path(data = subset, aes(LON, LAT, col = name.year), lwd = 1.1) +
 geom_point(data = last, aes(LON, LAT, col=name.year), pch = 8, size = 4) +
 ggtitle("May Tropical Storms \n Since 1851") +
 theme(plot.title = element_text(lineheight=.8, face="bold", size = 25,
 colour="hotpink", family = "Arial Black")) +
 xlab("Source: Unisys") + ylab("")


			
