Research Area 3: Outlier Detection of Large Scale Traffic Data

[ Potential applications of this research: traffic incident detection, traffic control and surveillance systems, and anomaly detection in large-scale data ]

Traffic data are typically massive in size (Fig. 1(b)) yet exceedingly useful for road network management. However, they are full of errors, noise and abnormal traffic behaviors (Fig. 1(c)), which are regarded as outliers because they are inconsistent with the rest of the data. Detecting them is therefore not a trivial problem. I developed a framework based on a nonparametric Bayesian approach, Dirichlet process mixture models (DPMMs), to detect outliers in traffic data (Fig. 1(d)). The DPMM framework is a better choice than the traditional Gaussian mixture model (GMM) because it requires no model order selection (the number of mixture components is inferred from the data), which makes it well suited to tackling under-fitting and over-fitting problems. My recent research [1-2] makes three contributions: (1) a new generic DPMM-based method for unsupervised outlier detection, evaluated on real-world traffic data (764,027 vehicles) with a detection rate of 96.68% in 10-fold cross-validation; (2) the method can be readily extended to describe the entire road network; (3) it outperforms unsupervised approaches such as k-NN and S-estimators, as well as traditional supervised approaches based on Gaussian mixture models and k-means clustering.
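To illustrate why a Dirichlet process mixture needs no order selection, here is a minimal toy sketch. It is not the implementation from [1-2]. It assumes one-dimensional data, a known within-cluster variance, and a conjugate Normal prior on each cluster mean, and it uses a standard collapsed Gibbs sampler. The speeds, the function names (dpmm_gibbs, flag_outliers) and all parameter values (alpha, sigma2, s02, min_size, threshold) are hypothetical choices made for illustration only.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def predictive(x, n, total, sigma2, m0, s02):
    """Posterior-predictive density of x for a cluster holding n points
    whose values sum to `total` (n = 0 gives the prior predictive)."""
    post_var = 1.0 / (1.0 / s02 + n / sigma2)
    post_mean = post_var * (m0 / s02 + total / sigma2)
    return normal_pdf(x, post_mean, post_var + sigma2)


def dpmm_gibbs(data, alpha=1.0, sigma2=4.0, s02=900.0, sweeps=30, seed=0):
    """Collapsed Gibbs sampling for a 1-D DP mixture of Gaussians.
    Returns one cluster label per point; the cluster count is inferred."""
    rng = random.Random(seed)
    m0 = sum(data) / len(data)            # prior mean (assumed: data mean)
    labels = [0] * len(data)              # start with everything in one cluster
    stats = {0: [len(data), sum(data)]}   # label -> [count, sum of values]
    next_label = 1
    for _ in range(sweeps):
        for i, x in enumerate(data):
            # Remove point i from its current cluster.
            s = stats[labels[i]]
            s[0] -= 1
            s[1] -= x
            if s[0] == 0:
                del stats[labels[i]]
            # Chinese-restaurant weights: existing clusters, then a new one.
            options = list(stats)
            weights = [stats[k][0] * predictive(x, stats[k][0], stats[k][1],
                                                sigma2, m0, s02) for k in options]
            options.append(next_label)
            weights.append(alpha * predictive(x, 0, 0.0, sigma2, m0, s02))
            # Draw the new label in proportion to the weights.
            r = rng.random() * sum(weights)
            choice = options[-1]
            for k, w in zip(options, weights):
                r -= w
                if r < 0:
                    choice = k
                    break
            if choice == next_label:
                stats[choice] = [0, 0.0]
                next_label += 1
            stats[choice][0] += 1
            stats[choice][1] += x
            labels[i] = choice
    return labels


def flag_outliers(data, labels, sigma2=4.0, s02=900.0, min_size=5, threshold=1e-4):
    """Flag points with low density under the well-populated clusters."""
    m0 = sum(data) / len(data)
    stats = {}
    for x, k in zip(data, labels):
        n, total = stats.get(k, (0, 0.0))
        stats[k] = (n + 1, total + x)
    major = [(n, total) for n, total in stats.values() if n >= min_size]
    flags = []
    for x in data:
        density = sum(n / len(data) * predictive(x, n, total, sigma2, m0, s02)
                      for n, total in major)
        flags.append(density < threshold)
    return flags


# Hypothetical speeds (km/h): two traffic regimes plus two anomalous readings.
speeds = [30 + 0.1 * (i % 21 - 10) for i in range(40)]
speeds += [60 + 0.1 * (i % 21 - 10) for i in range(40)]
speeds += [120.0, 121.0]
labels = dpmm_gibbs(speeds)
flags = flag_outliers(speeds, labels)
print(len(set(labels)), [x for x, f in zip(speeds, flags) if f])
```

Note that the number of clusters is never passed in: the sampler opens a new cluster whenever a point is poorly explained by the existing ones. The flagging rule used here (low density under clusters with at least min_size members) is one simple choice among several and is an assumption of this sketch.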

We have also been carrying out extended evaluations of various outlier detection methods: one-class SVM and kernel density estimation in [3], and distance-based k-nearest neighbors in [4].
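As a rough illustration of the distance-based k-nearest-neighbors idea, the sketch below scores each point by the distance to its k-th nearest neighbor and flags points whose score is much larger than the median. This is a generic textbook formulation, not necessarily the one evaluated in [4]. The readings, the function names, and the values of k and factor are hypothetical, and the brute-force search would not scale to hundreds of thousands of vehicles.

```python
import math


def knn_outlier_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def knn_outliers(points, k=3, factor=3.0):
    """Flag points whose score exceeds `factor` times the median score."""
    scores = knn_outlier_scores(points, k)
    median = sorted(scores)[len(scores) // 2]
    return [s > factor * median for s in scores]


# Hypothetical (speed in km/h, headway in s) readings; the last is anomalous.
# In practice the features would be rescaled to comparable ranges first.
readings = [(50 + i % 5, 2.0 + 0.1 * (i // 5)) for i in range(25)]
readings.append((110.0, 0.3))
flags = knn_outliers(readings)
print([r for r, f in zip(readings, flags) if f])
```

Using the median as the reference keeps the cut-off robust to the outliers themselves; a fixed top-n ranking of the scores is a common alternative.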

[Fig. 1 (image: RA3_Fig-1)]

References:

[1] H.Y.T. Ngan, N.H.C. Yung, and A.G.O. Yeh, “Modeling of Traffic Data Characteristics based on Dirichlet Process Mixtures,” 8th IEEE Int’l Conf. Automation Science & Engineering (CASE), pp. 224-229, 2012.

[2] H.Y.T. Ngan, N.H.C. Yung, and A.G.O. Yeh, “Detection of Outliers in Traffic Data based on Dirichlet Process Mixture Model,” IET Intelligent Transportation Systems, vol. 9, no. 7, pp. 773-781, 2015.

[3] H.Y.T. Ngan, N.H.C. Yung, and A.G.O. Yeh, “A Comparative Study of Outlier Detection for Large-scale Traffic Data by One-class SVM and Kernel Density Estimation,” IS&T/SPIE Electronic Imaging, 94050I-94050I-10, 2015.

[4] T.T. Dang, H.Y.T. Ngan, and W. Liu, “Distance-based k-nearest Neighbors Outlier Detection Method in Large-scale Traffic Data,” IEEE Int’l Conf. Digital Signal Processing (DSP), pp. 507-710, 2015.
