RMIT University

Applying Big Data Analytics on Motor Vehicle Collision Predictions in New York City

chapter
posted on 2024-11-01, 03:30 authored by Dhanushka Abeyratne, Malka N Halgamuge
This chapter uses big data analysis to observe emerging patterns and trends and to enhance predictions of motor vehicle collisions. The data set, extracted from the New York Police Department (NYPD), consists of 17 attributes and 998,193 collisions in New York City. The data set is tested with three classification algorithms: k-nearest neighbor (kNN), random forest, and Naive Bayes. The outputs are captured using the k-fold cross-validation method and are used to compare classifier accuracy, random forest node accuracy, and processing time. Further, the raw data is analyzed across four vehicle groups to detect significance within the recorded period. Finally, extreme cases of collision severity are identified using outlier analysis. Of the three classifiers, random forest achieves the highest accuracy at 95.03%, followed by kNN at 94.93%; Naive Bayes has the lowest accuracy at 70.13%, although it records the shortest processing time of 5.7938 seconds. Random forest also maintains stable, high accuracy across each node used. Random forest can therefore be identified as the most accurate prediction method among the tested classification methods. Additionally, statistical analysis shows that each vehicle group is significantly related to the recorded period of years (p < 0.001). Overall, this chapter identifies a highly accurate classification model and the significance of vehicle groups, findings that could help minimize road risks and motor vehicle collisions. These results provide new evidence to support future research.
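The comparison described in the abstract (several classifiers scored by k-fold cross-validation, with accuracy and processing time recorded for each) can be illustrated with a minimal sketch. This is not the authors' code: the toy two-cluster data, the function names, and the choice of k = 5 are all assumptions made for illustration, and only kNN and a Gaussian Naive Bayes are implemented here (random forest is omitted to keep the example short). The numbers it prints have no relation to the results reported in the chapter.

```python
import math
import random
import time
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, test_X, n_neighbors=3):
    """Classify each test row by majority vote among its nearest training rows."""
    predictions = []
    for row in test_X:
        nearest = sorted(
            (math.dist(row, other), label) for other, label in zip(train_X, train_y)
        )[:n_neighbors]
        votes = Counter(label for _, label in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


def gaussian_nb_predict(train_X, train_y, test_X):
    """Gaussian Naive Bayes: per-class feature means/variances, then argmax log-posterior."""
    stats = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / len(rows) + 1e-9  # avoid zero variance
            for col, m in zip(zip(*rows), means)
        ]
        stats[label] = (math.log(len(rows) / len(train_y)), means, variances)

    def log_posterior(row, prior, means, variances):
        return prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances)
        )

    return [max(stats, key=lambda c: log_posterior(row, *stats[c])) for row in test_X]


def cross_validate(predict_fn, X, y, k=5):
    """Return (mean accuracy across k folds, elapsed seconds) for one classifier."""
    folds = k_fold_indices(len(X), k)
    start = time.perf_counter()
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        preds = predict_fn(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in held_out],
        )
        correct = sum(p == y[i] for p, i in zip(preds, held_out))
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k, time.perf_counter() - start


def make_toy_data(n_per_class=60, seed=1):
    """Hypothetical stand-in data: two well-separated 2-D Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in enumerate([(0.0, 0.0), (6.0, 6.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for name, fn in [("kNN", knn_predict), ("Naive Bayes", gaussian_nb_predict)]:
        acc, seconds = cross_validate(fn, X, y, k=5)
        print(f"{name}: accuracy={acc:.4f}, time={seconds:.4f}s")
```

Timing the whole cross-validation loop per classifier mirrors the accuracy-versus-processing-time trade-off the abstract reports. On a data set of nearly a million rows, a library implementation would be the practical choice over these from-scratch versions.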

Related Materials

  1. ISBN - Is published in 9781119544456 (urn:isbn:9781119544456)

Start page

219

End page

239

Total pages

21

Outlet

Intelligent Data Analysis: From Data Gathering to Data Comprehension

Editors

Deepak Gupta, Siddhartha Bhattacharyya, Ashish Khanna, and Kalpna Sagar

Publisher

John Wiley & Sons

Place published

Hoboken, United States

Language

English

Copyright

© 2020 John Wiley & Sons Ltd

Former Identifier

2006117565

Esploro creation date

2022-11-26
