Analysis Of The Techniques Using An Image-CNN Framework To Solve A Time Series Classification Problem
Abstract
Historically, time series classification problems have been modelled using either instance- or feature-based approaches. However, a major drawback of these more traditional approaches has been the need for handcrafted features. The ascent of Deep Learning (DL) in recent years has provided an opportunity to mitigate this problem: with deep learning, the algorithm can learn features itself. One such DL approach is to use a Convolutional Neural Network (CNN) by providing a time series as an input image. This paper is a review of various techniques that use an Image-CNN framework to solve a time series classification problem. The techniques reviewed here first transform 1D time series signals into 2D texture images to reap the advantage of additional feature types, and then model the problem as an image recognition task by taking advantage of CNNs, which do not need hand-crafted features and can unify feature learning and classification into one step. Results of the various techniques are then evaluated on the publicly available UCR data repository, and directions for future work are broached.
Index Terms - Convolutional Neural Networks (CNN), Time Series Classification (TSC), Gramian Angular Fields (GAF), Recurrence Plots (RP), Markov Transition Fields (MTF)

I. Introduction

According to Box, a time series can be defined as a collection of random variables indexed according to the order in which they are obtained in time. Historically, time series methods were applied to problems in the physical and environmental sciences. However, with the advent of sensors over the last few decades, many important real-world pattern recognition tasks deal with time-series analysis. Biomedical signals, drowsiness detection (McDonald et al. 2012), financial data, industrial devices, forecasting and weather are a few such examples (Box, Jenkins, and Reinsel 2008).
There are two separate time series analysis approaches, namely the time domain approach and the frequency domain approach. Time domain approaches include auto-correlation and auto-regression, also known as Box-Jenkins methods. Frequency domain approaches commonly include spectral and wavelet analysis (Box, Jenkins, and Reinsel 2008).

Time series classification approaches can be broadly classified into two kinds, namely instance-based and feature-based. Instance-based approaches such as 1-Nearest Neighbour (1-NN) (Xing, Pei, and Yu 2009) and Dynamic Time Warping (DTW) (Y.-S. Jeong, Jeong, and Omitaomu 2011) use a distance measure to assess the similarity between two time series and assign a label accordingly. Feature-based approaches are two-stage algorithms: first they extract features by transforming the time series into a new space, and then a back-end classifier uses these features to predict a label (Eads et al. 2005).
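To make the instance-based family concrete, the following sketch implements the classic dynamic-programming DTW distance together with a 1-NN classifier built on top of it; the function names and the O(nm) formulation are illustrative, not taken from the reviewed papers.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW distance."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    D = np.full((len(a) + 1, len(b) + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, match.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1, -1]

def nn1_classify(query, train_series, train_labels):
    """1-NN: return the label of the training series closest under DTW."""
    dists = [dtw_distance(query, s) for s in train_series]
    return train_labels[int(np.argmin(dists))]
```

Because DTW allows elastic alignment, two series that differ only by local stretching (e.g. `[1, 2, 3]` vs. `[1, 2, 2, 3]`) have distance zero, which is exactly the invariance instance-based TSC relies on.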
According to Hatami, a major drawback of feature-based classification approaches is that they require 'hand-crafted' features. Because of this, model performance depends on the quality of these hand-crafted features, which are naturally difficult to engineer in time series problems, deteriorating the performance of feature-based approaches in many applications (Zheng et al. 2014). This problem has turned time series classification researchers' attention to deep learning techniques, since with deep learning one can learn a hierarchical feature representation from raw data implicitly (LeCun, Bengio, and Hinton 2015). Most of the attention has been given to Convolutional Neural Networks (CNN). The papers reviewed in this article use varying CNN architectures (Hatami, Gavet, and Debayle 2017).
Encoding 1D time series into 2D texture images
Most of the approaches involving deep learning in TSC so far have focused either on supplying a 1D time series as the input, or on transforming the existing 1D time series into images and supplying those as input. The domain of encoding 1D time series into 2D texture images has been relatively unexplored, except for Wang (Wang and Oates 2015) and Hatami (Hatami, Gavet, and Debayle 2017). In this section, we look at three techniques used to convert 1D time series into 2D images.

Wang also explains how the GAF has some interesting properties; for example, as time increases, values contort at different angles on the circle, similar to water rippling. As the matrix G has large dimension n ∗ n, where n is the length of the time series, a Piecewise Aggregate Approximation from (E. J. Keogh and Pazzani 2000) is used to reduce dimensionality while keeping the trends (Wang and Oates 2015). Figure 1 illustrates the procedure to generate the GAF.
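The GAF construction described by Wang can be sketched as follows: rescale the series to [-1, 1], map each value to a polar angle via arccos, and form the Gramian (summation) image cos(φi + φj), optionally shrinking the series first with PAA. The function names and the `segments` parameter are our own illustrative choices.

```python
import numpy as np

def paa(series, segments):
    """Piecewise Aggregate Approximation: mean of each of `segments` chunks."""
    chunks = np.array_split(np.asarray(series, dtype=float), segments)
    return np.array([c.mean() for c in chunks])

def gramian_angular_field(series, segments=None):
    """Encode a 1D series as an n x n Gramian Angular (Summation) Field."""
    x = np.asarray(series, dtype=float)
    if segments is not None:
        x = paa(x, segments)  # reduce n before building the n x n image
    # Rescale to [-1, 1] so the arccos polar encoding is defined.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))  # polar angle per time step
    # GASF texture image: G[i, j] = cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])
```

The resulting matrix is symmetric, and its main diagonal (cos(2φi)) preserves the original series up to the angular encoding, which is why Wang notes the image remains invertible along the diagonal.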
Markov Transition Field
Most of the inspiration for the MTF is taken from Campanharo (Campanharo 2011), the only difference being that in Wang's case the Markov transition probabilities are represented sequentially and we are not working with networks. (Campanharo 2011) proposed a novel method to map a time series to a network. For 2D images, as in our use case, the method can be stated as follows:
- First, identify Q quantile bins for a given time series X and assign each data point xi to the corresponding bin qj, ∀ j ∈ [1, Q]. If the number of observations n is evenly divisible by the number of bins Q, then an equal number of observations is assigned to each bin. Campanharo then assigns these quantiles to network nodes, which will not be discussed here.
- Treat transitions among these quantile bins as a first-order Markov chain with state space S = {1, 2, 3, . . . , Q}.
- Construct an adjacency matrix W of dimensions Q ∗ Q, where wi,j denotes the probability that a point in quantile qi is followed by a point in quantile qj.
- Normalize this matrix to get a Markov transition matrix W with ∑j wi,j = 1.

However, according to Wang, the transition matrix W, being a first-order Markov process, does not take into account the temporal dependence of the time series data ti, and he therefore extends the transition matrix W along the magnitude axis into the MTF matrix by considering the temporal positions. Lastly, a convolution is performed on the MTF using a blurring kernel to reduce dimensionality (Wang and Oates 2015).
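The steps above, including Wang's extension to temporal positions (M[i, j] = W[q_i, q_j] for time steps i, j), can be sketched as follows; the blurring/aggregation step is omitted, and the quantile-binning details are one reasonable reading of the description rather than the papers' exact code.

```python
import numpy as np

def markov_transition_field(series, n_bins=4):
    """MTF sketch: quantile-bin the series, build the Q x Q Markov
    transition matrix W, then spread it over temporal positions."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Assign each point to one of Q quantile bins (values in 0..Q-1).
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    # Count first-order transitions between consecutive bins.
    W = np.zeros((n_bins, n_bins))
    for i, j in zip(bins[:-1], bins[1:]):
        W[i, j] += 1
    # Row-normalize so each row of W sums to 1 (empty rows stay zero).
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    # MTF: M[i, j] = W[q_i, q_j], an n x n image indexed by time.
    return W[bins[:, None], bins[None, :]]
```

Unlike W itself, the n ∗ n matrix M keeps the time axis, so each pixel encodes the transition probability between the quantiles occupied at two temporal positions.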
Recurrence Plots (RP)

Hatami uses Recurrence Plots (RP) with a slight modification to accomplish the image encoding task. RPs were first proposed by Eckmann (Eckmann, Kamphorst, and Ruelle 1987), where he describes the steps to convert a time series into a recurrence plot for a dynamical system.
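Eckmann's standard construction, which the above refers to, can be sketched as follows: delay-embed the series into states, then mark pixel (i, j) when states i and j fall within a threshold ε of each other. The embedding dimension, delay, and ε values here are illustrative, not the settings used by Hatami.

```python
import numpy as np

def recurrence_plot(series, dim=2, delay=1, eps=0.1):
    """RP sketch: R[i, j] = 1 if delay-embedded states i and j are
    within eps of each other in Euclidean distance."""
    x = np.asarray(series, dtype=float)
    n_states = len(x) - (dim - 1) * delay
    # Time-delay embedding: state t = (x_t, x_{t+delay}, ..., x_{t+(dim-1)*delay}).
    states = np.stack(
        [x[i * delay : i * delay + n_states] for i in range(dim)], axis=1
    )
    # Pairwise distances between states, thresholded at eps.
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (dists <= eps).astype(float)
```

The resulting binary image is symmetric with an all-ones main diagonal (every state recurs with itself), and its texture of diagonal lines is what the CNN is then asked to classify.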
Tiled CNN

Tiled CNNs were first proposed by (Ngiam et al. 2010) to combat the hard-coded nature of convolutions and the weight-tying property of the CNN architecture, which make CNNs very good at capturing translational invariance. To elaborate, CNNs produce the same response regardless of how an input image is shifted; this phenomenon is more formally known as translational invariance.
This robustness of CNNs is due to the inherent nature of convolutions, further strengthened by their architectural construct of applying the same filter to all positions of the image in every layer (also known as weight tying). Ngiam indicates that this inherent ability to capture translational invariance is not necessarily a good thing, as it leads the CNN to avoid capturing more complex invariances such as out-of-plane rotations, scale, etc. He addresses this by proposing 'tiling', which lets the network learn its own invariances from unlabeled data. Tiling lets units further apart share weights while leaving nearby units untied, thus allowing the units of the second layer to learn a complex range of invariances by pooling over simpler units having different basis functions. The distance between the units to be tied is determined by the parameter k, more formally defined as the tile size: k = 1 results in a traditional CNN with all weights tied, while k = ∞ results in none of the units being tied. The greater the value of k, the more complex the invariances learnt. Another parameter defined is the number of maps (l), which is a combination of pooling and simple units.
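The tiling idea can be illustrated with a minimal 1D sketch in which position p uses filter p mod k, so units exactly k apart share weights while neighbours do not; this is our own simplification of Ngiam's scheme (no pooling layer, no learned maps), intended only to show the role of k.

```python
import numpy as np

def tiled_conv1d(x, filters, k):
    """1D 'tiled' convolution sketch: output position p applies filter p % k.

    With k = 1 every position shares one filter (ordinary convolution);
    larger k unties nearby positions, as in Tiled CNNs (Ngiam et al. 2010).
    `filters` is a (k, filter_len) array of k distinct untied filters.
    """
    x = np.asarray(x, dtype=float)
    filters = np.asarray(filters, dtype=float)
    flen = filters.shape[1]
    out = np.empty(len(x) - flen + 1)
    for p in range(len(out)):
        # Units k apart reuse the same weights; closer units differ.
        out[p] = x[p : p + flen] @ filters[p % k]
    return out
```

Setting k = 1 here reproduces a plain (cross-correlation) convolution, which matches the claim that a traditional CNN is the fully-tied special case.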
Future Work
Potential future work can be directed in three ways:

- Implement different image conversion techniques and benchmark them using the provided architectures.
- Benchmark different CNN architectures with these image conversion techniques.
- Model TSC using other neural network approaches such as LSTMs, RNNs, GANs, etc.

Most importantly, such a TS-Image-CNN framework could have a lot of impact in designing drowsiness detection algorithms, which are currently designed using traditional instance- and feature-based frameworks.