Problem
Urban transport authorities and emergency services need reliable indicators to anticipate periods of elevated road risk. This project explores whether daily collision counts can be predicted from a combination of calendar-driven behavioural patterns and weather conditions.
Three key questions guided the analysis:
- Which variables correlate most strongly with collision frequency?
- Can simple statistical models provide usable forecasts?
- Do non-linear machine learning models offer meaningful improvements?
Data
- Time period analysed: 2012–2019
- Target variable: total daily collision count (normalised for modelling)
- Calendar variables: year, month, weekday indicators, day-of-year
- Weather variables: temperature, dew point, precipitation, visibility, wind speed, fog indicators and atmospheric pressure
Daily collision records were joined with weather observations to create a unified dataset suitable for exploratory analysis and predictive modelling.
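A minimal sketch of the join described above, assuming pandas and two hypothetical tables that share a `date` column. The column names (`collisions`, `temp`, `prcp`) and all values are illustrative and are not taken from the project data.

```python
import pandas as pd


def join_collisions_weather(collisions: pd.DataFrame, weather: pd.DataFrame) -> pd.DataFrame:
    """Join daily collision counts with weather observations on the date."""
    # validate="one_to_one" raises if either table contains duplicate dates,
    # which would otherwise silently multiply rows in the merged result.
    merged = collisions.merge(weather, on="date", how="inner", validate="one_to_one")
    return merged.sort_values("date").reset_index(drop=True)


# Tiny made-up inputs; only the two overlapping dates survive the inner join.
collisions = pd.DataFrame({
    "date": pd.to_datetime(["2012-01-01", "2012-01-02", "2012-01-03"]),
    "collisions": [410, 560, 590],
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2012-01-02", "2012-01-03", "2012-01-04"]),
    "temp": [41.2, 38.5, 30.1],
    "prcp": [0.0, 0.12, 0.0],
})
daily = join_collisions_weather(collisions, weather)
```

An inner join drops days that are missing from either source. Whether the project dropped or imputed such days is not stated, so this choice is an assumption.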
Feature Engineering
- One-hot encoding for weekday and monthly patterns
- Creation of temporal indicators including day-of-year
- Normalisation of collision counts to stabilise model training
- Train-test split of 80/20 with a validation subset
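The steps listed above could be sketched roughly as follows. This assumes pandas, a `date` column and a `collisions` column (both hypothetical names), z-score normalisation, and a random shuffle before the 80/20 split. The write-up does not say which normalisation or split strategy was used, so a chronological split would be an equally plausible reading.

```python
import pandas as pd


def build_features(daily: pd.DataFrame) -> pd.DataFrame:
    """Add calendar features and one-hot encode weekday and month."""
    out = daily.copy()
    out["year"] = out["date"].dt.year
    out["day_of_year"] = out["date"].dt.dayofyear
    out["weekday"] = out["date"].dt.day_name()
    out["month"] = out["date"].dt.month
    # One indicator column per weekday and per month present in the data.
    return pd.get_dummies(out, columns=["weekday", "month"], dtype=float)


def split_and_normalise(df: pd.DataFrame, target: str = "collisions",
                        test_frac: float = 0.2, seed: int = 0):
    """Shuffle, split 80/20, and z-score the target using training statistics only."""
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_test = int(round(len(shuffled) * test_frac))
    test = shuffled.iloc[:n_test].copy()
    train = shuffled.iloc[n_test:].copy()
    # Statistics come from the training rows so the test set stays unseen.
    mean, std = train[target].mean(), train[target].std()
    train[target + "_norm"] = (train[target] - mean) / std
    test[target + "_norm"] = (test[target] - mean) / std
    return train, test, (mean, std)


# Made-up ten-day series starting on Tuesday 1 January 2019.
demo = pd.DataFrame({
    "date": pd.date_range("2019-01-01", periods=10, freq="D"),
    "collisions": [520, 610, 605, 640, 450, 400, 590, 600, 615, 630],
})
features = build_features(demo)
train, test, (target_mean, target_std) = split_and_normalise(features)
```

A validation subset could then be carved out of `train`, for example through the `validation_split` argument of Keras `Model.fit`. That placement is also an assumption.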
Exploratory Insights
- Day-of-week behaviour shows the strongest correlation with collisions (r ≈ 0.53)
- Temperature and dew point show weak positive relationships
- Precipitation, fog, wind and visibility show minimal correlation
- Collision distributions remain relatively stable across years
These findings suggest that behavioural patterns, such as commuting activity, dominate collision frequency, while weather conditions act as secondary modifiers.
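For reference, a correlation of this kind can be computed as Pearson's r between a feature column and the daily counts. The sketch below uses a hypothetical weekday indicator and made-up counts. How the project encoded day-of-week for its reported coefficient is not stated, so the indicator is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# One made-up week, Monday to Sunday: 1 marks a working day, 0 a weekend day.
is_weekday = [1, 1, 1, 1, 1, 0, 0]
collisions = [600, 610, 620, 615, 650, 480, 430]
r = pearson_r(is_weekday, collisions)
```

With real data the same function would be applied to each candidate feature in turn, and the resulting coefficients ranked by absolute value.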
Modelling Approach
Two modelling strategies were evaluated.
- Linear regression baseline models implemented in TensorFlow/Keras
- A deep neural network designed to capture non-linear interactions
The neural network used two hidden layers with ReLU activation and was trained with the Adam optimiser and a mean absolute error (MAE) loss.
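A sketch of how the two models might be defined in Keras. The hidden-layer width of 64, the default Adam learning rate, and the use of MAE loss for the linear baseline are assumptions, since the write-up specifies only the layer count, activation, optimiser and loss of the neural network.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_linear_baseline(n_features: int) -> keras.Model:
    """Linear regression expressed as a single Dense unit with no activation."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(), loss="mean_absolute_error")
    return model


def build_dnn(n_features: int, hidden_units: int = 64) -> keras.Model:
    """Two ReLU hidden layers followed by a single linear output unit."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(hidden_units, activation="relu"),  # width is an assumption
        layers.Dense(hidden_units, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer=keras.optimizers.Adam(), loss="mean_absolute_error")
    return model


baseline = build_linear_baseline(n_features=10)
dnn = build_dnn(n_features=10)
```

Either model would then be trained with `model.fit(X_train, y_train, validation_split=0.2, epochs=...)`, where the validation fraction and epoch count are left to the reader.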
Model Performance
The neural network provides a modest improvement over the linear regression baseline by capturing interactions between weekday behaviour and weather conditions.
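One way to quantify such a comparison is to compute MAE for each model on the held-out test set and express the difference relative to the baseline. The helper names and the numbers below are purely illustrative and are not results from this project.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_mae, model_mae):
    """Fraction by which the model's MAE is lower than the baseline's."""
    return (baseline_mae - model_mae) / baseline_mae


# Made-up observations and predictions, for illustration only.
observed = [600, 480, 620, 430]
baseline_pred = [560, 520, 580, 470]
dnn_pred = [590, 500, 600, 450]

baseline_mae = mean_absolute_error(observed, baseline_pred)
dnn_mae = mean_absolute_error(observed, dnn_pred)
gain = relative_improvement(baseline_mae, dnn_mae)
```

If the models are trained on a normalised target, predictions should be converted back to raw counts before computing MAE so the error reads in collisions per day.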
Key Insight
Weekly behavioural patterns are the dominant driver of collision frequency. Weather conditions refine the prediction but do not fundamentally alter the risk structure. For example, weekend days consistently show lower predicted collision counts than mid-week commuting days, even under similar temperature conditions.
Applications
- Resource planning for police and emergency response teams
- Targeted road safety campaigns during high-risk periods
- Baseline forecasting for evaluating traffic safety interventions
- Risk modelling for transport policy and insurance applications
Ethics & Limitations
- Predictions should support operational planning rather than attribute blame to specific communities
- Human oversight is essential when applying predictive risk indicators
- Future improvements could include traffic volume, public events and holiday indicators