Open Traffic Data for Train Delay Prediction - PhD Thesis

Modeling, Analysis, and Application of Open Traffic Data for Train Delay Prediction

Hot PhD Thesis on Modeling, Analysis, and Application of Open Traffic Data for Train Delay Prediction

Research Area: Cloud Computing

Abstract:

Train delays are among the events the public complains about most in urban areas. Train delay prediction is critical for advanced traveler information systems (ATIS), which provide valuable information for enhancing the efficiency and effectiveness of intelligent transportation systems (ITS). However, the train delay prediction problem cannot be easily solved by modeling historical or static data from a single data source. In the big data era, a large amount of data is collected from sensor devices across cyber-physical networks. Multimodal transport management systems offer greater availability of various open data sources, such as General Transit Feed Specification (GTFS) static and real-time feeds. With the development of advanced machine learning techniques, a growing number of open data sources are playing increasingly critical roles in the planning and operation of transportation services. Yet very few existing ‘big data’ methods meet the specific needs of railways.
Lastly, as the first work in this area worldwide, we apply real entropy to measure time series regularity and find the approximate potential predictability of train delays. Unlike existing train delay studies, which have striven to explore sophisticated algorithms, this study focuses on finding the bound of improvement in predicting multi-scenario train delays with different machine learning methods. Motivated by the observation that deep learning methods fail to improve prediction performance when delays occur rarely, we present a novel augmented machine learning approach to further improve the overall prediction accuracy. Our solution proposes a rule-driven automation (RDA) method, including a delay status labeling (DSL) algorithm and the resilience of section (RSE) and resilience of station (RST) indicators, to generate forecasts of train delays. The experimental results demonstrate that the Random Forest based implementation of our RDA method (RF-RDA) can significantly improve the generalization ability of multivariate multi-step forecast models for multi-scenario train delay prediction. The proposed solution surpasses state-of-the-art baselines based on real-world traffic datasets, which treat various real-time delays differently.

Name of the Researcher: Jianqing Wu

Name of the Supervisor(s): Jun Shen, Luping Zhou, Chen Cai

Year of Completion: 2021

University: University of Wollongong


