IEEE/CAA Journal of Automatica Sinica, 2016, Vol. 3, Issue 2: 141-148
Traffic Flow Data Forecasting Based on Interval Type-2 Fuzzy Sets Theory
Runmei Li1 , Chaoyang Jiang1, Fenghua Zhu2, Xiaolong Chen1    
1. Beijing Jiaotong University, Beijing 100044, China;
2. Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Abstract: This paper proposes a long-term forecasting scheme and implementation method for traffic flow data based on interval type-2 fuzzy sets theory. Type-2 fuzzy sets have advantages in modeling uncertainties because their membership functions are themselves fuzzy. The scheme includes a traffic flow data preprocessing module, a type-2 fuzzification module and a long-term traffic flow data forecasting output module, in which the interval approach acts as the core algorithm. The central limit theorem is adopted to convert the mass of traffic flow point data in a given time range into interval data for the same time range (also called confidence interval data), which is used as the input of the interval approach. The confidence interval data retain the uncertainty and randomness of traffic flow while reducing the influence of noise in the detection data. The proposed scheme yields not only a traffic flow forecast but also the possible range of traffic flow variation, described with high precision by upper and lower limit forecasts. The effectiveness of the proposed scheme is verified on actual sample data.
Key words: Interval type-2 fuzzy sets, central limit theorem, confidence interval, long-term prediction
Ⅰ. INTRODUCTION

Several researchers and research groups have worked hard to obtain reliable information about what will occur on a road using predictive traffic modeling[1, 2]. Traffic flow forecasting, as it is called, is divided into long-term forecasting and short-term forecasting[3, 4, 5, 6, 7, 8]. Long-term forecasting predicts 24 hours in advance using historical traffic flow data, and short-term forecasting predicts 1 hour, 45, 30 or 15 minutes in advance using real-time temporal and spatial traffic data.

A fuzzy clustering method based on the geographic information of a given traffic network was used successfully for long-term traffic flow data prediction by Stutz and Runkler[5]. This method gives accurate forecasting results but is not convenient to apply to a different traffic network. Williams[6] developed a multivariate autoregressive integrated moving average model that includes upstream traffic flow data. Thomas et al.[7] applied an integral autoregressive moving average method to long-term prediction with higher accuracy, although its errors grow with the passage of time. Another major class of traffic flow forecasting models is the artificial neural network. Hou et al.[8] applied a neural network method to forecast traffic flow data 24 hours in advance using historical traffic flow data.

In these works, the forecasting result, based on a number of variables measured by devices such as loop detectors embedded in the roadway and digital image acquisition and processing systems, is compared to the actual data of one day to show the accuracy of the methods. However, the urban traffic network system is an open, complex, nonlinear time-varying system. Traffic flow data has great randomness and strong uncertainty due to noise from the data collection system. How can we accurately describe these data with deterministic numbers? Especially for long-term flow forecasting, which is necessary for traffic scheduling applications and traffic operation plans, traffic managers need not only the traffic flow forecasting result but also the possible range of traffic flow variation, described with high precision by upper and lower forecasting results. Moreover, to face the challenge of huge traffic flow data, effective methods with strong data-processing capability are urgently needed. Also, because of the noise, the methods should have high fault tolerance.

Zadeh put forward the concept of the type-2 fuzzy set in 1975 as an extension of the traditional fuzzy set[9]. Differing from a type-1 fuzzy set, its membership grades are type-1 fuzzy sets instead of crisp values. It can be adopted on many occasions, such as for uncertain membership functions and uncertain parameters. As is pointed out in [1, 2], many time series prediction problems in which the membership grades cannot be determined exactly, due to the uncertainty and randomness in real-world data, can be solved with type-2 fuzzy sets. This paper puts forward, for the first time, a long-term traffic flow forecasting scheme and implementation method based on interval type-2 fuzzy sets theory and the central limit theorem.

The scheme consists of three modules: a traffic flow data preprocessing module, a type-2 fuzzification module and a long-term traffic flow data prediction module. The central limit theorem is adopted to convert the mass of traffic flow point data in a given time range into interval data for the same time range (also called confidence interval data), which is used as the input of the interval approach. The confidence interval data retain the uncertainty and randomness of traffic flow while reducing the influence of noise in the detection data.

Compared with traditional forecasting methods, this method has several notable advantages. First, this paper applies interval type-2 fuzzy sets theory to the traffic field. Second, the central limit theorem is adopted to transform point data into interval data for the interval approach; with this improvement, the interval approach can handle point data and can therefore be used in more fields than the original interval approach. Furthermore, the method gives a reasonable forecasting result, namely a region bounded by two forecasting curves. In practice, traffic flow data fluctuates within an approximate range, so the long-term forecasting result accords with the actual fluctuation of traffic flow. Last but not least, the traffic flow data used in the simulation part is a large set of measured data. The proposed method therefore offers big data processing capacity and application convenience along with relatively small and stable errors.

This paper is organized as follows. Section Ⅱ introduces several basic concepts of type-2 fuzzy sets. Section Ⅲ presents the long-term traffic flow data forecasting framework and the implementation of its three modules. Section Ⅳ presents the simulation of the three modules. Section Ⅴ presents several conclusions and topics for further discussion.

Ⅱ. TYPE-2 FUZZY SET CONCEPTS

Zadeh put forward the concept of type-2 fuzzy set in 1975. This section reviews some definitions of type-2 fuzzy sets theory.

Definition 1. A type-2 fuzzy set $\tilde {A}$,is characterized by its membership function $\mu_{\tilde {A}}(x,u)$,where $x\in X$ and $u\in J_x \subseteq [0,1]$,that is to say:

$\begin{equation} \label{eq1} \tilde {A}{=}\{((x,{u}),\mu_{\tilde{A}} (x,{u})){| }\forall x\in {X},\forall {u}\in {J}_x \subseteq [{0},{1}]\}, \end{equation}$ (1)

where $0 \le \mu_{\tilde{A}} (x,u)\le 1$.

Here is another expression[10].

$\begin{equation} \label{eq2} \tilde {A}=\int_{x\in X} {\int_{u\in J_x }} \frac{{\mu_{\tilde {A}}(x,u)}}{(x,u)},\quad J_x \subseteq [0,1]. \end{equation}$ (2)

In the type-2 fuzzy set $\tilde {A}$, the membership degree of every variable $x$ is $u\in J_x \subseteq [0,1]$, which is itself a type-1 fuzzy set. Compared with a type-1 fuzzy set, the membership of a type-2 fuzzy set is a region rather than a curve or a few discrete points, which reflects the uncertainty and randomness in data more effectively. Type-2 fuzzy sets are more powerful in dealing with highly uncertain systems, but they are also more complex to implement and to compute.

In order to reduce the computation complexity, a special type-2 fuzzy set called the interval type-2 fuzzy set[10] is adopted in this paper, as introduced in Definition 2. Compared with ordinary type-2 fuzzy sets, these sets not only reduce the amount of computation greatly, but also retain the capability of processing uncertainty and randomness in the data.

Definition 2. The membership grades of every element in a type-2 fuzzy set are type-1 fuzzy sets. If all the secondary membership grades are equal to 1, the set is called an interval type-2 fuzzy set. Fig. 1 shows a three-dimensional view of an interval type-2 fuzzy set[11].

Fig. 1. Three-dimensional view of a type-2 fuzzy set.

Definition 3. Footprint of uncertainty (FOU) is the union of all the Cartesian products of every point and its primary membership. The corresponding formula can be written as[12]

$\begin{equation} \label{eq3} FOU(\omega )=\bigcup\limits_{x\in X} x \times L_x. \end{equation}$ (3)

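If $\omega $ is a continuous type-2 fuzzy set, then (3) should be defined as[13, 14]

$\begin{equation} \label{eq4} FOU(\omega )=\bigcup\limits_{x\in X} x \times [\underline{\mu }_\omega (x),\overline{\mu }_\omega (x)], \end{equation}$ (4)

where $\underline{\mu }_\omega (x)$ and $\overline{\mu }_\omega (x)$ denote the lower and upper membership functions of $\omega $.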

Distribution of FOU of interval type-2 fuzzy set is uniform.

Definition 4. For continuous domain $X$ and $U$,an embedded type-1 fuzzy set $A_{e} {=}\int_{x\in X} {u/x}$,$u\in J_x$,is embedded in $FOU(\tilde{A})$[10]. If the domain is continuous,there will be an infinite number of $A_e$. Upper and lower membership functions,which are special embedded type-1 fuzzy sets,are two boundaries of FOU.

Definition 5. The centroid of a type-2 fuzzy set $\tilde {A}$ is the union of the centroids of its embedded type-1 fuzzy sets.

$\begin{equation} \label{eq5} C_{\tilde {A}} (x)=\mathop \cup \limits_{\forall A_e } c(A_e )=\{c_l (\tilde {A}),\ldots,c_r (\tilde {A})\}, \end{equation}$ (5)

where $c_l (\tilde {A})=\mathop {\min }\nolimits_{\forall A_e } c(A_e )$ and $c_r (\tilde {A})=\mathop {\max }\nolimits_{\forall A_e } c(A_e )$.

By defuzzifying the upper and lower membership functions, the left and right centroids are obtained. Connecting the centroids of the upper membership functions gives the upper centroid curve. The lower centroid curve is obtained in the same way.

Type-2 fuzzy sets are suited to traffic flow data because of their capacity for handling data with uncertainty and randomness. This paper proposes a long-term traffic flow data forecasting framework based on interval type-2 fuzzy sets theory. Adopting the central limit theorem, this paper converts three months of point data into intervals that are the input of the interval approach. With this improvement, the interval approach can handle point data and can be used in more fields than the original interval approach. Furthermore, the method gives a reasonable forecasting result, namely a region bounded by two forecasting curves. In practice, traffic flow data fluctuates within an approximate range. Hence, the long-term forecasting result accords with the actual fluctuation of traffic flow and expresses the possible variation of traffic flow effectively.

Ⅲ. LONG-TERM TRAFFIC FLOW DATA FORECASTING FRAMEWORK

The long-term traffic flow data forecasting framework consists of three modules, as shown in Fig. 2. Module 1 preprocesses traffic flow data and converts it into confidence intervals. Module 2 fuzzifies the confidence intervals into type-2 fuzzy sets. Module 3 obtains the traffic flow data forecasting results from the type-2 fuzzy sets.

Fig. 2. Long-term traffic flow data forecasting framework.

In the type-2 fuzzification module, type-2 fuzzy sets need to be established, as shown in Fig. 2. Commonly adopted methods include the person FOU approach, the interval end-points approach and the interval approach[15, 16]. The person FOU approach is based on six premises and requires an a priori assumption about whether or not the FOU is symmetric[15]. The interval end-points approach must choose the shape of the FOU ahead of time[16]. This paper adopts the interval approach, which captures the strong points of both the person FOU approach and the interval end-points approach.

The interval approach is rarely used in the traffic field, since its inputs must be interval data. In order to apply type-2 fuzzy sets theory to transportation systems, traffic flow data need to be transformed into intervals. This is a key problem that must be solved when applying type-2 fuzzy sets theory in the transportation field. In the data preprocessing module, this paper adopts the central limit theorem to relax this restriction on the input data of the interval approach, as shown in Fig. 2. With this improvement, the interval approach can handle point data and can be used in more fields than the original interval approach.

In the long-term forecasting module, the method gives a reasonable forecasting result, namely a region bounded by two forecasting curves, as shown in Fig. 2. In practice, traffic flow data fluctuates within an approximate range. Hence, the long-term forecasting result accords with the actual fluctuation of traffic flow. In addition, the errors of the method are relatively small and stable.

A. Traffic Flow Data Preprocessing Module

In order to apply type-2 fuzzy sets theory to transportation systems, traffic flow data need to be transformed into intervals. To describe the fluctuation range of traffic flow data and convert the data into intervals, this paper applies the central limit theorem and confidence intervals to traffic flow data preprocessing.

A confidence interval is an interval estimate of a population parameter computed from sample statistics. It gives the range in which the true value of the parameter falls with a certain probability. That probability, the degree to which the estimate can be trusted, is called the confidence level. This step is in effect a special kind of fuzzification, distinct from type-2 fuzzification.

To obtain the confidence interval,this paper applies central limit theorem using (6) and (7).

Theorem 1. When the sample size $n $ is large enough[17],

$\begin{equation} \label{eq6} T=\frac{\bar {X}-\mu }{\frac{S}{\sqrt n} } \end{equation}$ (6)

approximately obeys the standard normal distribution, so the confidence interval of $\mu $ at confidence level $1-\alpha $ is

$\begin{equation} \label{eq7} \left(\bar {X}-u_{\frac{\alpha} {2}} \frac{S}{\sqrt n },\bar {X}+u_{\frac{\alpha} {2}} \frac{S}{\sqrt n }\right), \end{equation}$ (7)

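where ${\bar{X } }$ and $S$ are the mean and standard deviation of the sample, $n $ is the sample size, $\mu $ is the population mean, and $u_{\frac{\alpha} {2}}$ is the upper $\alpha /2$ quantile of the standard normal distribution, with $1-\alpha $ being the confidence level.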

In this case, the central limit theorem is applied to obtain, from a large amount of data, a confidence interval for the true value of the traffic flow at a chosen confidence level. After this step, traffic flow data is transformed into intervals.
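The computation in (6) and (7) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `clt_confidence_interval` and the sample counts are hypothetical, and the normal quantile is taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def clt_confidence_interval(samples, confidence=0.90):
    """Two-sided confidence interval for the mean, following (7)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    alpha = 1.0 - confidence
    # u_{alpha/2}: upper alpha/2 quantile of the standard normal distribution
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    x_bar = mean(samples)
    s = stdev(samples)  # sample standard deviation S
    half_width = u * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical vehicle counts for one 5-minute slot observed on several days
counts = [1390, 1425, 1402, 1460, 1378, 1415, 1440, 1398]
low, high = clt_confidence_interval(counts, confidence=0.90)
print(f"90% confidence interval: [{low:.1f}, {high:.1f}]")
```

The interval is centred on the sample mean and widens as the confidence level increases, which is the behaviour the preprocessing module relies on.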

B. Type-2 Fuzzification Module

1) Data part of interval approach. The interval approach converts the confidence intervals obtained in the preprocessing module into type-2 fuzzy sets. It consists of two parts: a data part and a fuzzy set part. The data part deletes confidence intervals that are not satisfactory. The fuzzy set part converts the remaining confidence intervals into a type-2 fuzzy set. The data part consists of four steps: bad data processing, outlier processing, tolerance-limit processing and reasonable-interval processing.

Step 1. Bad data processing

The domain of the type-2 fuzzy sets is set to $[0,10]$, so the end points of every interval should fall into this range. In addition, the right end point of an interval should be larger than its left end point. An interval whose right end point is smaller than its left end point is deleted.

Step 2. Outlier processing

Data which is abnormally large or small will be eliminated.

Step 3. Tolerance limit processing

Here some rules proposed by Walpole[17] are used.

$\begin{align} \label{eq8} & {{a}^{(i)}\in [m_l -ks_l ,m_l +ks_l]},\notag\\ & {b^{(i)}\in [m_r -ks_r ,m_r +ks_r]},\notag \\ & {{L}^{(i)}\in [m_L -ks_L ,m_L +ks_L]}, \end{align}$ (8)

where $m_l$ and $s_l$ are the mean and standard deviation of the left end points, respectively, $m_r$ and $s_r$ are the mean and standard deviation of the right end points, respectively, $a^{(i) }$ and $b^{(i) }$ are the left end point and the right end point of the $i$-th interval, respectively, $L^{(i) }$ is the length of the $i$-th interval, $m_L$ and $s_L$ are the mean and standard deviation of the interval lengths, and $k$ is the tolerance factor[10].

Step 4. Reasonable-interval processing

Some rules are adopted in this section.

$\begin{equation}a^{(i) }< \xi ^{* },\quad b^{(i) }> \xi ^{* },\end{equation}$ (9)

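where $\xi ^{* }$ is one of the following values[6], with $\sigma _l$ and $\sigma _r$ denoting the standard deviations of the left and right end points: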

$\begin{align} \xi ^\ast =&\ \frac{m_r \sigma _l^2 -m_l \sigma _r^2 }{\sigma _l^2 -\sigma _r^2 } \notag\\ &\ \pm \frac{\sigma _l \sigma _r [(m_l -m_r )^2+2(\sigma _l^2 -\sigma _r^2 )\ln(\frac{\sigma _l} {\sigma _r })]^{\frac{1}{2}}}{\sigma _l^2 -\sigma _r^2 }. \end{align}$ (10)
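The data part can be sketched as follows for Steps 1, 3 and 4. Several points are assumptions rather than statements from the paper: the tolerance factor `k=2.0` and the sample intervals are hypothetical, $\sigma_l$ and $\sigma_r$ are estimated by sample standard deviations, the root of (10) that lies between $m_l$ and $m_r$ is the one used, and the midpoint is returned when the two deviations coincide (where (10) is undefined). Step 2 is omitted because the text does not state the outlier rule.

```python
from math import isclose, log, sqrt
from statistics import mean, stdev


def remove_bad_intervals(intervals, lo=0.0, hi=10.0):
    """Step 1: keep intervals with lo <= a < b <= hi."""
    return [(a, b) for a, b in intervals if lo <= a < b <= hi]


def _inside(value, m, s, k):
    return m - k * s <= value <= m + k * s


def tolerance_limit_filter(intervals, k):
    """Step 3: keep intervals whose end points and length satisfy (8)."""
    lefts = [a for a, _ in intervals]
    rights = [b for _, b in intervals]
    lengths = [b - a for a, b in intervals]
    m_l, s_l = mean(lefts), stdev(lefts)
    m_r, s_r = mean(rights), stdev(rights)
    m_len, s_len = mean(lengths), stdev(lengths)
    return [
        (a, b) for a, b in intervals
        if _inside(a, m_l, s_l, k)
        and _inside(b, m_r, s_r, k)
        and _inside(b - a, m_len, s_len, k)
    ]


def reasonable_threshold(m_l, s_l, m_r, s_r):
    """Evaluate (10) and return the root lying in [m_l, m_r] (assumed choice)."""
    if s_l <= 0 or s_r <= 0:
        raise ValueError("standard deviations must be positive")
    if isclose(s_l, s_r, rel_tol=1e-9):
        # (10) divides by zero here; fall back to the midpoint (assumption)
        return (m_l + m_r) / 2.0
    d = s_l ** 2 - s_r ** 2
    root = sqrt((m_l - m_r) ** 2 + 2.0 * d * log(s_l / s_r))
    for sign in (1.0, -1.0):
        xi = (m_r * s_l ** 2 - m_l * s_r ** 2 + sign * s_l * s_r * root) / d
        if m_l <= xi <= m_r:
            return xi
    raise ValueError("no root of (10) lies between the two means")


def reasonable_interval_filter(intervals):
    """Step 4: keep intervals with a < xi* < b, as in (9)."""
    lefts = [a for a, _ in intervals]
    rights = [b for _, b in intervals]
    xi = reasonable_threshold(mean(lefts), stdev(lefts),
                              mean(rights), stdev(rights))
    return [(a, b) for a, b in intervals if a < xi < b]


# Hypothetical intervals already mapped to the [0, 10] scale
raw = [(4.0, 5.0), (4.2, 5.1), (3.8, 4.9)] * 3 + [(0.5, 5.0), (5.2, 4.8), (4.0, 11.0)]
step1 = remove_bad_intervals(raw)               # drops (5.2, 4.8) and (4.0, 11.0)
step3 = tolerance_limit_filter(step1, k=2.0)    # drops (0.5, 5.0)
step4 = reasonable_interval_filter(step3)
```

Each step recomputes its statistics from the intervals that survived the previous step, so the order of the filters matters.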

2) Fuzzy set part of interval approach. The interval approach is based on the choice of a type-1 fuzzy set model, and in this part the type-1 fuzzy set of each interval is established. The mean and standard deviation of a type-1 fuzzy set are calculated using (11) and (12). There are three types of type-1 fuzzy set models: symmetric triangle, left-shoulder and right-shoulder. Type-2 fuzzy sets are obtained using the union operation in (13), as shown in Fig. 3.

Fig. 3. An example of the union of type-1 fuzzy sets.
$\begin{align} \label{eq9} &m_{A}= \dfrac{\int_{a_{MF} }^{b_{MF} } {x\mu _A (x){\rm d}x} }{\int_{a_{MF} }^{b_{MF} } {\mu _A (x){\rm d}x} }, \end{align}$ (11)
$\begin{align} \label{eq10}& \sigma _A =\left[{\dfrac{\int_{a_{MF} }^{b_{MF} } {(x-m_A )^2\mu _A (x){\rm d}x} }{\int_{a_{MF} }^{b_{MF} } {\mu _A (x){\rm d}x} }} \right]^{\frac{1}{2}}, \end{align}$ (12)

where $a_{MF} $ and $b_{MF} $ are parameters of a type-1 fuzzy set.
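Equations (11) and (12) can be checked numerically. The sketch below assumes a symmetric triangular membership function supported on $[a_{MF}, b_{MF}]$ and approximates the integrals with the midpoint rule; the helper names `triangle` and `mf_mean_std` are illustrative only.

```python
from math import sqrt


def triangle(a, b):
    """Symmetric triangular membership function supported on [a, b]."""
    c = (a + b) / 2.0
    half = (b - a) / 2.0

    def mu(x):
        return max(0.0, 1.0 - abs(x - c) / half)

    return mu


def mf_mean_std(mu, a, b, steps=10000):
    """Approximate (11) and (12) with the midpoint rule on [a, b]."""
    h = (b - a) / steps
    xs = [a + (i + 0.5) * h for i in range(steps)]
    weights = [mu(x) for x in xs]
    area = sum(weights)
    m = sum(x * w for x, w in zip(xs, weights)) / area
    var = sum((x - m) ** 2 * w for x, w in zip(xs, weights)) / area
    return m, sqrt(var)


m_a, sigma_a = mf_mean_std(triangle(4.0, 6.0), 4.0, 6.0)
print(m_a, sigma_a)
```

For a symmetric triangle the integrals have the closed forms $m_A=(a_{MF}+b_{MF})/2$ and $\sigma_A=(b_{MF}-a_{MF})/(2\sqrt{6})$, which the numerical values should match closely.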

$\begin{align} \mu _C (x)&=\max (\mu _A (x),\mu _B (x))\notag \\ &=\mu _A (x)\vee \mu _B (x), \end{align}$ (13)

$\mu _A (x)$ and $\mu _B (x)$ are membership functions of type-1 fuzzy sets $A$ and $B$. $\mu _C (x)$ is the membership function of the union of $A$ and $B$.
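The union in (13) is a pointwise maximum, which can be sketched as below. The triangular membership functions and their supports are hypothetical examples; only the `max` rule comes from (13).

```python
def triangle(a, b):
    """Symmetric triangular membership function supported on [a, b]."""
    c = (a + b) / 2.0
    half = (b - a) / 2.0

    def mu(x):
        return max(0.0, 1.0 - abs(x - c) / half)

    return mu


def union(*membership_functions):
    """Pointwise maximum of type-1 membership functions, as in (13)."""
    def mu(x):
        return max(f(x) for f in membership_functions)

    return mu


mu_a = triangle(3.0, 5.0)   # peak at 4.0
mu_b = triangle(4.0, 6.0)   # peak at 5.0
mu_c = union(mu_a, mu_b)
print([round(mu_c(x), 2) for x in (3.5, 4.0, 4.5, 5.0, 5.5)])
```

Because `union` accepts any number of functions, the same helper covers the case where the type-1 fuzzy sets of all retained confidence intervals are combined at once.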

C. Traffic Flow Data Forecasting Module

Using (5), a lower centroid curve is obtained by connecting the centroids of the lower membership functions of the type-2 fuzzy sets; it is the lower boundary of the centroids of the type-2 fuzzy sets. In practice, the lower traffic flow forecasting curve is obtained from the lower centroid curve. The upper traffic flow forecasting curve is obtained similarly. The region bounded by the two forecasting curves is the long-term traffic flow data forecasting result.
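A minimal sketch of this step is given below, assuming that each type-2 fuzzy set is available as sampled lower and upper membership functions on a common grid and that a single scale factor maps the $[0,10]$ scale back to traffic counts. The discrete centroid formula, the grid, the membership values and the scale factor of 300 are all illustrative assumptions.

```python
def centroid(xs, mus):
    """Discrete centroid sum(x * mu) / sum(mu) of a sampled membership function."""
    total = sum(mus)
    if total == 0:
        raise ValueError("membership function is zero everywhere")
    return sum(x * m for x, m in zip(xs, mus)) / total


def forecast_curves(fuzzy_sets, xs, scale_factor):
    """Return two curves: centroids of the lower and of the upper membership
    functions, one point per time slot, rescaled to traffic counts."""
    from_lmf, from_umf = [], []
    for lmf, umf in fuzzy_sets:
        from_lmf.append(centroid(xs, lmf) * scale_factor)
        from_umf.append(centroid(xs, umf) * scale_factor)
    return from_lmf, from_umf


# Hypothetical single time slot sampled on the [0, 10] scale
xs = [0.5 * i for i in range(21)]
umf = [max(0.0, 1.0 - abs(x - 5.0) / 2.0) for x in xs]
lmf = [0.5 * max(0.0, 1.0 - abs(x - 4.0) / 1.0) for x in xs]
curve_lmf, curve_umf = forecast_curves([(lmf, umf)], xs, scale_factor=300.0)
print(curve_lmf, curve_umf)
```

With 288 five-minute slots, `fuzzy_sets` would hold 288 pairs and each returned list would trace one curve over 24 hours.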

D. Error Analysis

In order to verify the accuracy and effectiveness of the long-term traffic flow data forecasting, errors are calculated and analyzed. This paper compares the traffic flow data of 24 hours with the forecasting region. Because a traffic flow curve is compared with a traffic flow region, this paper proposes a new error measure in (14).

$\begin{align} & \Delta_{i}=\begin{cases} |X_{1i}-L_{i}|, & X_{1i}<L_{i},\\ |X_{2i}-L_{i}|, & X_{2i}>L_{i},\\ 0, & X_{2i}<L_{i}<X_{1i}, \end{cases} \notag\\ & \delta_{i}=\begin{cases} \dfrac{|X_{1i}-L_{i}|}{L_{i}}, & X_{1i}<L_{i},\\ \dfrac{|X_{2i}-L_{i}|}{L_{i}}, & X_{2i}>L_{i},\\ 0, & X_{2i}<L_{i}<X_{1i}, \end{cases} \end{align}$ (14)

where $X_{1i} $ is the $i$-th upper traffic flow forecasting value, $X_{2i} $ is the $i$-th lower traffic flow forecasting value, and $L_{i}$ is the $i$-th real value of one weekday, $i = 1,2,\ldots,288$.
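The region-based error in (14) can be sketched as follows. The function name and the three sample points are hypothetical, and the sketch assumes every real value $L_i$ is positive so that the relative error is defined.

```python
from statistics import mean


def region_errors(upper, lower, actual):
    """Absolute and relative errors of (14) for each time slot.

    upper[i] is X_1i, lower[i] is X_2i and actual[i] is L_i (assumed > 0).
    """
    abs_err, rel_err = [], []
    for x1, x2, real in zip(upper, lower, actual):
        if real > x1:        # real value above the upper forecasting curve
            d = real - x1
        elif real < x2:      # real value below the lower forecasting curve
            d = x2 - real
        else:                # real value inside the forecasting region
            d = 0.0
        abs_err.append(d)
        rel_err.append(d / real)
    return abs_err, rel_err


# Hypothetical values for three time slots
abs_err, rel_err = region_errors(
    upper=[110.0, 120.0, 130.0],
    lower=[90.0, 100.0, 110.0],
    actual=[100.0, 130.0, 99.0],
)
print(mean(abs_err), mean(rel_err))
```

Averaging the two lists over all 288 slots gives the average absolute error and average relative error reported in the simulation.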

Ⅳ. SIMULATION OF LONG-TERM FORECASTING FRAMEWORK

A. Simulation of Preprocessing Module

To verify the proposed long-term traffic flow data forecasting method, this paper uses actual traffic flow data from January to March 2014. After eliminating holidays, the data of Tuesdays, Wednesdays and Thursdays is selected. As shown in Fig. 4, the raw traffic flow data of the three months exhibits statistical regularity and a fluctuation range. Describing this variation range is a meaningful research question.

Fig. 4. Raw traffic flow data of three months.

As shown in Fig. 4, the measured traffic flow data lie in an approximate range, which shows the great randomness and strong uncertainty due to noise from the data collection system. Therefore, describing these data with deterministic numbers is inappropriate. Effective methods with strong big-data-processing capability are urgently needed to represent traffic flow characteristics, including randomness and uncertainty.

To describe the fluctuation range of traffic flow data and convert the data into intervals, this paper applies the central limit theorem to traffic flow data preprocessing. In order to obtain a type-2 fuzzy set for each 5-minute slot, the confidence level is set to 90%. Applying the central limit theorem in (6) and (7), the confidence interval of 15:01-15:05 is [1278,1550]. That is to say, the real value falls in [1278,1550] with a probability of 90%. The confidence intervals of 15:01-15:55 are obtained similarly, as shown in Table Ⅰ. After this process, traffic flow data is converted from point data into interval data.

TABLE I
CONFIDENCE INTERVALS OF 15:01-15:55
B. Simulation of Type-2 Fuzzification Module

1) Continuous scale. After the preprocessing module, traffic flow data is transformed into confidence intervals. A continuous scale is established for each variable (sometimes a natural scale exists, e.g., length, width, height or temperature). The framework in this paper is described for a continuous scale from 0 to 10; other scales could of course be used. So that the end points of the confidence intervals fall into [0,10], all confidence intervals are divided by a scale factor, as shown in Table Ⅱ. The scale factor is equal to max(right end point)/5 in this paper.
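The scaling rule can be sketched as below. The first interval is the one quoted in the text for 15:01-15:05; the second interval and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
def scale_intervals(intervals, target=5.0):
    """Divide all intervals by scale factor = max(right end point) / target."""
    factor = max(b for _, b in intervals) / target
    scaled = [(a / factor, b / factor) for a, b in intervals]
    return factor, scaled


# (1278, 1550) is taken from the text; (1300, 1600) is a made-up companion
factor, scaled = scale_intervals([(1278.0, 1550.0), (1300.0, 1600.0)])
print(factor)   # scale factor used later to map centroids back to counts
print(scaled)
```

The same factor is kept so that centroids computed on the $[0,10]$ scale can be multiplied back into traffic counts in the forecasting module.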

2) Data part of interval approach. The interval approach eliminates inappropriate data using (8) to (10). The confidence intervals of 15:01-15:55 satisfy these conditions, so all of them are retained.

TABLE Ⅱ
CONFIDENCE INTERVALS ON A CONTINUOUS SCALE

3) Fuzzy set part of interval approach. Every confidence interval corresponds to a type-1 fuzzy set model, as shown in Fig. 5. Applying the union operation in (13) to these type-1 fuzzy sets, the type-2 fuzzy set of 15:26-15:30 in Fig. 6 is obtained. Similarly, $24 \times 12=288$ type-2 fuzzy sets are obtained after handling the traffic flow data. Fig. 7 shows the type-2 fuzzy sets of 0:26-0:30, 1:26-1:30, 2:26-2:30, 3:26-3:30, 4:26-4:30 and 5:26-5:30. The other 282 fuzzy sets are not shown due to space limits.

Fig. 5. Type-1 fuzzy sets corresponding to 11 confidence intervals.

Fig. 6. Type-2 fuzzy set of 15:26-15:30.

Fig. 7. Six type-2 fuzzy sets of five-minutes.
C. Simulation of Long-term Forecasting Module

In the type-2 fuzzification module, 288 type-2 fuzzy sets are obtained. By defuzzifying these type-2 fuzzy sets using (5), 288 centroids of the upper membership functions are obtained. Connecting the 288 centroids gives the upper centroid curve. The lower centroid curve is obtained in the same way, as shown in Fig. 8.

Fig. 8. Centroids of type-2 fuzzy sets of 24 hours.

Multiplying the centroids by the scale factors gives the traffic flow data forecasting curves shown in Fig. 9. The region bounded by the upper and lower traffic flow data forecasting curves is the long-term traffic flow data forecasting result.

Fig. 9. Lower and upper forecasting curves.
D. Error Analysis

In order to verify the accuracy and validity of the long-term traffic flow data forecasting framework, this paper compares the forecasting result with the real data of the following Thursday. Applying (14), the average absolute error and average relative error are obtained.

Firstly, the long-term traffic flow data forecasting scheme adopts the median filtering approach in the data preprocessing module. If the real day's traffic flow data is not filtered, Fig. 10 is obtained, and the average absolute error and average relative error are 68.7277 and 11.59%, respectively. If the real day's traffic flow data is filtered, Fig. 11 is obtained, and the average absolute error and average relative error are 34.5421 and 5.61%, respectively.

Fig. 10. Two forecasting curves and one real day’s traffic flow curve when long-term forecasting frame adopts median filtering approach.

Fig. 11. Two forecasting curves and one real day’s flow curve when both of them adopt median filtering approach.

Secondly, the long-term traffic flow data forecasting scheme does not adopt any filtering approach in the data preprocessing module. If the real day's traffic flow data is not filtered, Fig. 12 is obtained, and the average absolute error and average relative error are 53.2090 and 9.76%, respectively. If the real day's traffic flow data is filtered, Fig. 13 is obtained, and the average absolute error and average relative error are 24.6526 and 4.13%, respectively.

Fig. 12. Two forecasting curves and one real day’s traffic flow curve when neither of them is filtered.

Fig. 13. Two forecasting curves and one real day’s flow curve when the real day’s traffic flow curve adopts median filtering approach.

The result of the long-term traffic flow data forecasting framework is a region, so (14) is applied to compare a region with a curve, as calculated above. For comparison with other literature, this paper also analyzes the errors of each forecasting curve using the traditional equation (15). The average absolute error and average relative error of the upper forecasting curve are 86.5326 and 12.17%, respectively. The average absolute error and average relative error of the lower forecasting curve are 94.9988 and 12.14%, respectively. The region bounded by the two forecasting curves thus has smaller errors than either curve alone. The average absolute error and average relative error in the different cases are listed in Table Ⅲ.

$\begin{align} &\Delta _{i}= \vert X_{i}- L_{i}\vert, \notag\\[2mm] &\delta _{i}= \frac{\vert X_{i} - L_{i}\vert}{L_{i}},\end{align}$ (15)

where $X_{i}$ is the $i$-th traffic flow forecasting value and $L_{i}$ is the $i$-th real value of one weekday, $i = 1,2,{\ldots},288$.

TABLE Ⅲ
AVERAGE ABSOLUTE ERROR AND AVERAGE RELATIVE ERROR IN DIFFERENT CASES

Ⅴ. CONCLUSION

This paper proposes a long-term traffic flow data forecasting framework based on interval type-2 fuzzy sets theory. It deals with a large amount of measured traffic flow data and predicts one weekday's data. The framework consists of a traffic flow data preprocessing module, a type-2 fuzzification module and a long-term traffic flow data forecasting module. A simulation verifies the accuracy and validity of the framework. In order to apply type-2 fuzzy sets theory to transportation systems, traffic flow data need to be transformed into intervals, which is a key problem that must be solved when applying type-2 fuzzy sets theory in the transportation field. The central limit theorem is therefore adopted to relax the restriction on the input data of the interval approach. Applying the central limit theorem, this paper converts traffic flow point data into confidence intervals in the preprocessing module, and these confidence intervals are the inputs of the interval approach in the type-2 fuzzification module. With this improvement, the interval approach can handle point data and can be used in more fields than the original interval approach. Furthermore, the method gives a reasonable forecasting result, namely a region bounded by two forecasting curves. In practice, traffic flow data fluctuates within an approximate range. Hence, the long-term forecasting result accords with the actual fluctuation of traffic flow. To verify the accuracy and validity of the framework, this paper compares the forecast with measured data and finds that the average relative error is consistently small. The method therefore offers relatively small and stable errors, big data processing capacity and application convenience.

On the basis of this paper,further study will explore improved type-2 fuzzy sets. Besides,further study will explore forecasting data update based on data weighting technology.

References
[1] Zhang J P, Wang F Y, Wang K F, Lin W H, Xu X, Chen C. Data-driven intelligent transportation systems: a survey. IEEE Transactions on Intelligent Transportation Systems, 2011, 12(4): 1624-1639
[2] Jagadeesh G R, Dhinesh G R, Srikanthan T. Method for accuracy assessment of aggregated freeway traffic data. IET Intelligent Transport Systems, 2014, 8(4): 407-414
[3] Zhu W, Wang F Y. On three types of covering-based rough sets. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(8): 1131-1144
[4] Zhu W, Wang F Y. The fourth type of covering-based rough sets. Information Sciences, 2012, 201: 80-92
[5] Stutz C, Runkler T. Classification and prediction of road traffic using application-specific fuzzy clustering. IEEE Transactions on Fuzzy Systems, 2002, 10(3): 297-308
[6] Williams B. Multivariate vehicular traffic flow prediction: evaluation of ARIMAX modeling. Transportation Research Record: Journal of the Transportation Research Board, 2001, 1776: 194-200
[7] Thomas T, Weijermars W, van Berkum E. Predictions of urban volumes in single time series. IEEE Transactions on Intelligent Transportation Systems, 2010, 11(1): 71-80
[8] Hou Y, Edara P, Sun C. Traffic flow forecasting for urban work zones. IEEE Transactions on Intelligent Transportation Systems, 2015, 16(4): 1761-1770
[9] Zadeh L A. The concept of a linguistic variable and its application to approximate reasoning-II. Information Sciences, 1975, 8(4): 301-357
[10] Li C D, Zhang G Q, Wang H D, Ren W N. Properties and data-driven design of perceptual reasoning method based linguistic dynamic systems. Acta Automatica Sinica, 2014, 40(10): 2221-2232
[11] Mendel J M, Hagras H, John R I. Guest editorial for the special issue on type-2 fuzzy sets and systems. IEEE Transactions on Fuzzy Systems, 2013, 21(3): 397-398
[12] Mo H, Wang J, Li X, Wu Z L. Linguistic dynamic modeling and analysis of psychological health state using interval type-2 fuzzy sets. IEEE/CAA Journal of Automatica Sinica, 2015, 2(4): 366-373
[13] Mo H, Wang T. Computing with words in generalized interval type-2 fuzzy sets. Acta Automatica Sinica, 2012, 38(5): 707-715 (in Chinese)
[14] Mo H, Wang F Y. Linguistic dynamic systems based on computing with words and their stabilities. Science in China Series F: Information Sciences, 2009, 52(5): 780-796
[15] Liu F, Mendel J M. Encoding words into interval type-2 fuzzy sets using an interval approach. IEEE Transactions on Fuzzy Systems, 2008, 16(6): 1503-1521
[16] Mendel J M. Computing with words and its relationships with fuzzistics. Information Sciences, 2007, 177(4): 988-1006
[17] Dehay D, Leskow J, Napolitano A. Central limit theorem in the functional approach. IEEE Transactions on Signal Processing, 2013, 61(16): 4025-4037