Tucson Car Collision Analysis

INFO 526 - Project Final

Project description

Author: Data Dazzlers (Sanja Dmitrovic, Jiayue He, Vidhyananth Sivashanmugam, Naitik Shah, Varun Soni, Mohammad Ali Farmani)

Affiliation: School of Information, University of Arizona

Abstract

In this project, we examine car accident data from the City of Tucson’s police department, which cover accidents from 2018 to 2023. We aim to offer detailed insights into these data by developing an interactive, user-friendly Quarto dashboard. This dashboard will allow users to access statistics related to the frequency, severity, and causes of traffic accidents across different times and locations in Tucson. The goal is to help stakeholders such as city planners, policy-makers, local residents, and fellow students identify measures to improve road safety.

Introduction

In response to the critical issue of road safety in Tucson and as students at the University of Arizona, located in the heart of Tucson, we propose to develop a dynamic and user-friendly Quarto dashboard that offers detailed insights into the frequency, severity, and causes of traffic accidents across different times and locations in Tucson. We aspire to contribute to informed decision-making processes and the implementation of effective road safety measures.

Analyzing traffic collision data within the city not only allows us to apply the theoretical knowledge we’ve gained in our data analysis courses, but also lets us provide insights into urban safety and transportation issues that affect peers and fellow residents. This project is not just an academic exercise; it’s a chance to contribute to a safer, more informed Tucson.

The project uses the collision data set provided by the Tucson Police Department as part of the City of Tucson’s GIS data, which contains the department’s publicly available records of vehicle collisions from 2018 to 2023. Example variables in this data set include date of collision, injury severity, manner of collision, and whether the collision was fatal. A full list of these variables is provided below.

Overall Project Plan

For this project, we follow these four steps to create a user-friendly, interactive Quarto dashboard:

  1. Data Preparation and Analysis

  2. Development of Interactive Visualizations

  3. Accessibility and User Experience

  4. Outreach and Impact

Project Questions

Specifically, we focused on two main questions:

  1. Do day of the week and/or time of day affect the severity and number of accidents?
  2. What is the relationship between the type of violation (e.g., failure to yield, aggressive driving) and whether the accident resulted in a fatality?

Question 1

Do day of the week and/or time of day affect the severity and number of accidents?

Approach: We use time series and heat map plots to see in which months, on which days, and at which times accidents occur most frequently.

First, a monthly time series categorized by year is created to see monthly and yearly trends in accident frequency.
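As a rough illustration of the aggregation behind such a plot, the sketch below counts collisions per month and groups the counts into one series per year. It is a minimal example using only the Python standard library; the dates are toy values invented for illustration, and the function names are hypothetical rather than taken from the project code.

```python
from collections import Counter
from datetime import date

# Toy collision dates, made up purely for illustration (not Tucson data).
collision_dates = [
    date(2019, 10, 31),
    date(2019, 10, 12),
    date(2019, 4, 3),
    date(2020, 4, 15),
    date(2020, 10, 2),
]


def monthly_counts(dates):
    """Count collisions per (year, month) pair."""
    return Counter((d.year, d.month) for d in dates)


def series_by_year(counts):
    """Build one 12-value series per year, January first, zero-filled."""
    years = sorted({year for year, _ in counts})
    return {y: [counts.get((y, m), 0) for m in range(1, 13)] for y in years}


series = series_by_year(monthly_counts(collision_dates))
for year, values in series.items():
    print(year, values)
```

Each year's list can then be drawn as one line of the time series, with months on the horizontal axis.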

Analysis: We find that 2019 has the highest frequency of car accidents and 2020 the lowest. The decrease in 2020 is most likely due to COVID-19 (April 2020 was the first shutdown month), when fewer people were commuting in general. There are also fewer accidents between May and August than in the other months, which could be attributed to fewer people commuting to school during summer break. We see more accidents in October and April, which could be due to occasions such as Halloween and Spring Break.

Then, an hourly heat map categorized by day is plotted to observe peak times and days for accidents.
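The counts behind a day-by-hour heat map can be sketched as a 7 x 24 grid, one row per weekday and one column per hour. The example below is a minimal standard-library sketch; the timestamps are invented for illustration, the one-hour bin width is an assumption, and `weekday_hour_grid` is a hypothetical helper name.

```python
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Toy collision timestamps, made up purely for illustration.
collision_times = [
    datetime(2023, 3, 6, 7, 45),    # a Monday morning
    datetime(2023, 3, 6, 17, 10),   # a Monday evening
    datetime(2023, 3, 7, 17, 50),   # a Tuesday evening
    datetime(2023, 3, 11, 14, 5),   # a Saturday afternoon
]


def weekday_hour_grid(timestamps):
    """Return a 7 x 24 grid of counts: rows are Mon..Sun, columns are hours 0..23."""
    grid = [[0] * 24 for _ in DAYS]
    for ts in timestamps:
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        grid[ts.weekday()][ts.hour] += 1
    return grid


grid = weekday_hour_grid(collision_times)
for day, row in zip(DAYS, grid):
    print(day, row)
```

Each cell of the grid maps directly to one colored tile of the heat map.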

Analysis: Based on the plot, more accidents happen on weekdays between 13:30 and 18:30 than at other times. One possible reason is that many parents pick up their children between 13:30 and 16:30, while 16:30 to 18:30 is the peak period when people leave work. The second-busiest period is 6:30 to 8:30, the morning peak when people travel to work. Drivers should therefore be more careful during peak hours or, if possible, avoid driving during them. We also see fewer accidents on weekends in general, so more caution is warranted on weekdays.

Question 2

What is the relationship between the type of violation (e.g., failure to yield, aggressive driving) and whether the accident resulted in a fatality?

Approach: We use heat maps and bar plots to see the relationship between the type of violation and the resulting injury, including whether the injury was fatal. First, the total number of injuries per injury type is shown in a bar plot.
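The bar heights and percentage labels for a plot like this come from a simple tally of injury categories. The following is a minimal sketch with the standard library; the records are toy values invented for illustration, and the category labels and the `injury_shares` helper are hypothetical.

```python
from collections import Counter

# Toy injury-severity labels, made up purely for illustration.
injury_severity = [
    "No injury", "No injury", "No injury", "No injury",
    "Non-incapacitating injury", "Non-incapacitating injury",
    "Possible injury",
    "Fatal injury",
]


def injury_shares(labels):
    """Return each category's share of all records, in percent, largest first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.most_common()}


shares = injury_shares(injury_severity)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```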

Analysis: We see that 50.2% of drivers are not injured in an accident, while 49.8% are. The most common injury type is non-incapacitating injury, followed by possible injury. We also see that 1.6% of accidents result in a fatal injury.

Then, a heat map of accident type against injury severity is plotted to see which types of accidents lead to which types of injuries.
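A heat map of this kind is a cross-tabulation: one count for every combination of accident type and injury severity. The sketch below shows one way to build that table with the standard library; the records and category labels are invented for illustration, and `cross_tabulate` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy (accident type, injury severity) pairs, made up purely for illustration.
records = [
    ("Left turn", "No injury"),
    ("Left turn", "Fatal injury"),
    ("Left turn", "Possible injury"),
    ("Rear end", "No injury"),
    ("Rear end", "No injury"),
]


def cross_tabulate(pairs):
    """Count records for every (accident type, injury severity) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for accident_type, severity in pairs:
        table[accident_type][severity] += 1
    # Convert to plain dicts so missing combinations are not created on lookup.
    return {accident_type: dict(row) for accident_type, row in table.items()}


table = cross_tabulate(records)
for accident_type, row in table.items():
    print(accident_type, row)
```

Combinations that never occur are simply absent from the table and can be drawn as empty or zero-valued cells.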

Analysis: Based on the plot, we found that left turns cause the most accidents, including the highest number of fatal injuries. Left turns are usually permitted on a green light that also lets oncoming traffic proceed straight, so drivers turning left may overlook oncoming cars. This suggests that drivers should take extra care when turning left.

Then, another bar plot is created to look at the number of fatalities by violation type.
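The ranking shown in such a bar plot can be obtained by keeping only the fatal records and counting them per violation type. This is a minimal standard-library sketch; the records and violation labels are invented for illustration, and `fatalities_by_violation` is a hypothetical helper name.

```python
from collections import Counter

# Toy (violation type, was the accident fatal) pairs, made up for illustration.
records = [
    ("Speeding", True),
    ("Speeding", False),
    ("Failure to yield", True),
    ("Speeding", True),
    ("Following too closely", False),
]


def fatalities_by_violation(rows):
    """Return (violation, fatal count) pairs, largest count first."""
    counts = Counter(violation for violation, fatal in rows if fatal)
    return counts.most_common()


ranked = fatalities_by_violation(records)
for violation, n_fatal in ranked:
    print(violation, n_fatal)
```

Violation types with no fatal records do not appear in the result, so they would need to be added back with a count of zero if the plot should show them.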

Analysis: Based on the plot, we found that violations related to speed, crosswalks, and failure to yield have more fatalities than other violation types.

Finally, various pie charts are plotted to see the percentage of accidents occurring per violation/operator type, such as whether the accident occurred at an intersection or whether the driver was speeding, distracted, or impaired.
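Each of these pie charts reduces to a yes/no share for a single factor. The sketch below computes such shares with the standard library; the records are toy values invented for illustration, and the field names and the `flag_share` helper are hypothetical.

```python
# Toy accident records with yes/no factors, made up purely for illustration.
records = [
    {"intersection": True, "speeding": False, "distracted": True, "impaired": False},
    {"intersection": True, "speeding": True, "distracted": False, "impaired": False},
    {"intersection": False, "speeding": False, "distracted": True, "impaired": False},
    {"intersection": True, "speeding": False, "distracted": False, "impaired": False},
]


def flag_share(rows, flag):
    """Percentage of records in which the given yes/no factor is true."""
    if not rows:
        return 0.0
    return 100 * sum(1 for row in rows if row[flag]) / len(rows)


factor_shares = {
    flag: flag_share(records, flag)
    for flag in ("intersection", "speeding", "distracted", "impaired")
}
for flag, pct in factor_shares.items():
    print(f"{flag}: {pct:.1f}% yes, {100 - pct:.1f}% no")
```

Because one accident can involve several factors at once, the "yes" shares of different factors are computed independently and need not add up to 100%.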

Analysis: From these pie charts, we see that 74.7% of accidents occur at an intersection, 18.5% involve the driver speeding, 6.9% involve the driver being impaired, and 38.1% involve the driver being distracted. Based on these results, we recommend that drivers take particular caution at intersections. It might also be beneficial to enforce stricter regulations on distractions such as phone use.

Conclusion

We demonstrate a Quarto dashboard that allows the user to explore a substantial number of statistics related to car accidents in Tucson. With a deep commitment to applying our analytical skills for the betterment of our community, we believe that this project represents a meaningful opportunity to make a tangible difference in road safety in Tucson. We are excited about the potential of our interactive spatiotemporal visualization tool to inform Tucson drivers about best driving practices that can lead to positive changes in our community.