Summary

The City of Toronto’s open data website contains information on incidents to which Toronto Paramedic Services responds, including the general incident type, priority, and number of units that arrived on scene. The dataset spans 2012 through 2016 and contains 1,217,830 rows.

In this post, the daily count of incidents to which paramedics respond is analysed and broken down by year, month, and day of week. Three observations emerge from this analysis:

  1. The daily incident count increases from 2012 through 2016;
  2. Summer months see a higher daily incident count than winter months; and
  3. Fridays typically have a higher incident count than other weekdays.

Analysis

Data

The raw data used in this analysis is available on the City of Toronto’s open data website. A processed version of the raw data, containing the variables created for the analysis below, is available on my GitHub page.

R libraries

The following libraries were loaded into my R workspace for this analysis.

library(tidyverse)
library(magrittr)
library(car)
library(GGally)
library(extrafont)
library(readxl)

Preprocessing

The data on the City of Toronto’s website is transformed into a single file with daily incident counts.

Ystart <- 2012
Yend <- 2016
# Each year is read from the workbook sheet named after that year
yearly_sheets <- lapply(Ystart:Yend, function(i) {
  read_excel("Data/TPS Incident Data_2012-2016.xlsx", sheet = paste(i))
})
# The individual data frames are combined, new temporal variables are created and used to create a daily incident count variable.
incidentdata_allyears <- do.call(rbind, yearly_sheets)
names(incidentdata_allyears)[2] <- "Date"
incidentdata_allyears <- incidentdata_allyears %>%
  mutate(Date = as.Date(Date),
         Year = format(Date, "%Y"),
         Month = format(Date, "%m"),
         Dayofweek = format(Date, "%u")) %>%  # %u: 1 = Monday, ..., 7 = Sunday
  group_by(Date, Year, Month, Dayofweek) %>%
  summarise(Count = n()) %>%
  ungroup()  # drop the grouping that summarise() leaves behind
# Last row is removed due to ambiguity
incidentdata_allyears <- incidentdata_allyears[-1827, ]
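
To make the transformation above concrete, here is a minimal sketch on a made-up incident log. It uses base R only so that it runs without the tidyverse. The `toy` data frame, its dates, and the `expected` calendar are assumptions for illustration and are not taken from the Toronto data. The sketch also shows one possible way to check whether any calendar day is absent from the daily counts, which may help when judging whether a hard-coded row index such as 1827 points at the intended row.

```r
# Made-up incident log: one row per incident (not the Toronto data)
toy <- data.frame(
  Date = as.Date(c("2016-01-01", "2016-01-01", "2016-01-02",
                   "2016-01-04", "2016-01-04", "2016-01-04"))
)

# Count rows per date: the base-R analogue of group_by(Date) + summarise(n())
counts <- table(toy$Date)
daily <- data.frame(
  Date  = as.Date(names(counts)),
  Count = as.vector(counts)
)

# Derive the same temporal variables as in the preprocessing step
daily$Year      <- format(daily$Date, "%Y")
daily$Month     <- format(daily$Date, "%m")
daily$Dayofweek <- format(daily$Date, "%u")  # 1 = Monday, ..., 7 = Sunday

# Days with no incidents produce no row, so compare against a full calendar
expected <- seq(min(daily$Date), max(daily$Date), by = "day")
missing_days <- expected[!expected %in% daily$Date]

# Number of calendar days from 2012 through 2016 (two leap years included)
n_days <- length(seq(as.Date("2012-01-01"), as.Date("2016-12-31"), by = "day"))

print(daily)
print(missing_days)
print(n_days)
```

In the toy log, 3 January 2016 has no incidents and therefore no row in `daily`. The 2012 to 2016 calendar has 1,827 days, so a daily table for that period has exactly 1,827 rows only if every day is present and no extra dates appear.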

Results

Three figures are produced from the transformed data, one for each of the three observations in the summary section.

# A custom theme is defined 
theme_ai <- function(){
  theme_minimal() +
    theme(
      text = element_text(family = "Segoe UI", color = "gray25"),
      plot.title = element_text(size=16),
      plot.subtitle = element_text(size = 14),
      axis.text = element_text(size=13),
      plot.caption = element_text(color = "gray30", size=12),
      plot.background = element_rect(fill = "gray95"),
      plot.margin = unit(c(5, 10, 5, 10), units = "mm"),
      #axis.line = element_line(color="gray50")
      axis.ticks.x = element_line(color="gray35"),
      panel.grid.major.y = element_line(color = "gray80"))
}
# Daily incident count is plotted against a year factor variable
# (days with 200 or fewer incidents are excluded by the Count > 200 filter)
  ggplot(filter(incidentdata_allyears,Count>200),aes(x=as.factor(Year),group=Year,y=Count))+
    geom_boxplot(fill='gray65')+
    labs(title="Paramedic dispatches are increasing year-over-year",
         subtitle="Daily paramedic dispatches from 2012 through 2016",
         x="",
         y="",
         caption="Source: City of Toronto | Creation: Anthony Ionno")+
    theme_ai()+
    scale_y_continuous(breaks=seq(0,900,50))

# Daily incident count is plotted against a monthly factor variable
  ggplot(filter(incidentdata_allyears,Count>200),aes(x=as.numeric(Month),group=as.numeric(Month),y=Count))+
    geom_boxplot(fill='gray65')+
    labs(title="Paramedic dispatches occur more frequently in the summer",
         subtitle="Daily paramedic dispatches from 2012 through 2016",
         x="",
         y="",
         caption="Source: City of Toronto | Creation: Anthony Ionno")+
    theme_ai()+
    scale_x_continuous(breaks=seq(1,12,1),labels = c("Jan","Feb","Mar","Apr","May","Jun","Jul",
                                                     "Aug","Sep","Oct","Nov","Dec"))+
    scale_y_continuous(breaks=seq(0,900,50))

# Daily incident count is plotted against a weekday factor variable
  ggplot(filter(incidentdata_allyears,Count>200),aes(x=as.numeric(Dayofweek),group=as.numeric(Dayofweek),y=Count))+
    geom_boxplot(fill='gray65')+
    labs(title="Paramedic dispatches occur more frequently on Fridays",
         subtitle="Daily paramedic dispatches from 2012 through 2016",
         x="",
         y="",
         caption="Source: City of Toronto | Creation: Anthony Ionno")+
    theme_ai()+
    scale_x_continuous(breaks=seq(1,7,1),labels = c("Mon","Tues","Weds","Thurs","Fri","Sat","Sun"))+
    scale_y_continuous(breaks=seq(0,900,50))
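
Boxplots compare the groups visually; the same comparison can be sketched numerically by computing the median daily count per group. The example below is a minimal base-R sketch on a made-up data frame, `toy_daily`, which only borrows the column names `Dayofweek` and `Count` from the analysis. The values are invented for illustration and say nothing about the Toronto data.

```r
# Made-up daily counts (not the Toronto data); Dayofweek uses 1 = Monday
toy_daily <- data.frame(
  Dayofweek = c("1", "1", "5", "5", "5", "7"),
  Count     = c(510, 530, 600, 580, 590, 150)
)

# Apply the same Count > 200 filter used for the figures
kept <- toy_daily[toy_daily$Count > 200, ]

# Median daily count per day of week
medians <- tapply(kept$Count, kept$Dayofweek, median)
print(medians)
```

Swapping `toy_daily` for the processed data, and `Dayofweek` for `Year` or `Month`, should give the group medians that correspond to the centre lines of the boxplots.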