Graph-Based Traffic Forecasting on Santiago’s Road Network from GPS Data

* juherrera [at] ug.uchile.cl
Figure: Our complete pipeline for predicting traffic from GPS data via map-matching and STGNNs.

Abstract

Traffic management requires rigorous monitoring and real-time decision-making. Although several prediction models based on spatio-temporal graph neural networks (STGNNs) have been proposed, few have been applied in Latin America or trained on GPS data. We train a model on a dataset prepared for Santiago, Chile, using GPS data from public buses and from the (now defunct) on-demand car rental company Awto. Our hypothesis is that by cross-referencing vehicle position data with OpenStreetMap data, we can accurately predict short-term traffic on the Santiago road network graph via STGNNs.

Dataset Preprocessing

The Awto dataset contains 123,082,800 pulses recorded between December 2022 and February 2023, each with attributes such as position, vehicle ID, speed, and date and time. The OpenStreetMap (OSM) dataset represents the Santiago road network as a graph.

We first match the data points to their corresponding streets (graph edges) on the road network via Longest Common Subsequence (LCSS) map-matching. We exclude nighttime hours and fill edges that have zero or null speed with measurements from previous days during the same time window. We simplify the OSM edges using the OSMnx Python library and construct an adjacency matrix over them using a Gaussian kernel. Essentially, each OSM edge is taken as a node, with weights that decay as the network distance between edges grows.
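The gap-filling step can be illustrated with a minimal pandas sketch. It assumes the matched speeds are stored as a table with one row per timestamp and one column per edge; the function name `fill_from_previous_days` and the toy data are hypothetical, and taking the most recent earlier day is only one possible reading of "previous days".

```python
import numpy as np
import pandas as pd

def fill_from_previous_days(speeds: pd.DataFrame) -> pd.DataFrame:
    """Fill zero/null speeds with the latest earlier value at the same time of day.

    speeds: rows indexed by timestamp (sorted), one column per edge.
    """
    # Treat zero speed as missing, the same way as null.
    cleaned = speeds.mask(speeds == 0)
    # Minutes since midnight identify the time window of each row.
    window = np.asarray(cleaned.index.hour * 60 + cleaned.index.minute)
    # Forward-filling within each window copies the value from the most
    # recent earlier day that has a measurement in that window.
    return cleaned.groupby(window).ffill()

# Toy example: two days, two 15-minute windows, one edge (hypothetical data).
idx = pd.DatetimeIndex([
    "2023-01-01 08:00", "2023-01-01 08:15",
    "2023-01-02 08:00", "2023-01-02 08:15",
])
demo = pd.DataFrame({"edge_a": [30.0, 20.0, 0.0, np.nan]}, index=idx)
filled = fill_from_previous_days(demo)
```

Gaps on the first day of the series have no earlier day to copy from and remain null in this sketch.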

We define the adjacency matrix A as follows:

$$ A_{ij} = \exp\left(-\frac{d_{ij}^2}{\sigma^2}\right) $$

where \( d_{ij} \) is the network distance between edges \( i \) and \( j \), and \( \sigma \) is a scaling parameter.
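As a minimal NumPy sketch of this formula, assuming the pairwise network distances are already available as a dense matrix (the function name, the toy distances, and the value of \( \sigma \) are illustrative, not taken from our pipeline):

```python
import numpy as np

def gaussian_kernel_adjacency(dist, sigma):
    """Compute A_ij = exp(-d_ij^2 / sigma^2) from a pairwise distance matrix."""
    dist = np.asarray(dist, dtype=float)
    return np.exp(-(dist ** 2) / sigma ** 2)

# Toy example with three edges; distances are hypothetical.
dist = np.array([
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
])
A = gaussian_kernel_adjacency(dist, sigma=1.0)
```

Each edge has weight 1 with itself, and the weight shrinks towards 0 as the network distance grows, so a smaller \( \sigma \) yields a sparser effective neighborhood.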

We then aggregate the data into 15-minute intervals, taking the average speed and total traffic per edge per interval. Finally, we standardize the data to have zero mean and unit variance.
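The aggregation and standardization can be sketched in pandas as follows. The column names (`timestamp`, `edge_id`, `speed`) and the toy pulses are hypothetical, and counting matched pulses is an assumed proxy for total traffic; standardizing over all edges and intervals at once is also an assumption.

```python
import pandas as pd

# Hypothetical matched pulses: one row per GPS pulse assigned to an edge.
pulses = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-01-02 08:01", "2023-01-02 08:07",
        "2023-01-02 08:20", "2023-01-02 08:03",
    ]),
    "edge_id": ["a", "a", "a", "b"],
    "speed": [30.0, 50.0, 20.0, 60.0],
})

# Assign each pulse to the start of its 15-minute interval.
interval = pulses["timestamp"].dt.floor("15min").rename("interval")

# Average speed and pulse count per edge per interval.
agg = pulses.groupby(["edge_id", interval]).agg(
    avg_speed=("speed", "mean"),
    traffic=("speed", "count"),
)

# Standardize each feature to zero mean and unit variance.
standardized = (agg - agg.mean()) / agg.std(ddof=0)
```

In practice the mean and standard deviation would typically be computed on the training split only and reused for validation and test data.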

Model Architecture

We use a Spatio-Temporal Graph Neural Network (STGNN) architecture. The model consists of multiple layers of graph neural networks (GNNs) to capture spatial dependencies, followed by temporal layers to capture temporal dependencies. The input to the model is a sequence of historical traffic data (in the form of time windows), and the output is the predicted traffic for future time steps.
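To make the input and output format concrete, the following NumPy sketch slices a graph time series into pairs of history and target windows. The function name and the window lengths (`n_in`, `n_out`) are illustrative assumptions, and the sketch covers only sample construction, not the STGNN itself.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Slice a (T, N, F) series into input/target windows.

    T: time steps, N: graph nodes (road edges), F: features per node.
    Returns X with shape (S, n_in, N, F) and Y with shape (S, n_out, N, F).
    """
    n_samples = series.shape[0] - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested windows")
    # Each sample uses n_in historical steps to predict the next n_out steps.
    X = np.stack([series[s : s + n_in] for s in range(n_samples)])
    Y = np.stack([series[s + n_in : s + n_in + n_out] for s in range(n_samples)])
    return X, Y

# Toy series: 10 intervals, 3 nodes, 2 features (e.g. speed and traffic).
series = np.arange(10 * 3 * 2, dtype=float).reshape(10, 3, 2)
X, Y = make_windows(series, n_in=4, n_out=2)
```

With 15-minute intervals, `n_in=4` and `n_out=2` would correspond to using one hour of history to predict the next 30 minutes.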

Preliminary Results

So far, we have produced a matched dataset containing a graph time series of average speeds for Awto.

Figure: Average speed (left) and total traffic (right) for one day on the matched Awto dataset.
Figure: Distance, speed and traffic distributions: distance to road (left), average speed per edge (center), and daily traffic per edge (right).