"How to Set Up Value Series Feature Extraction Operator with Windowing Operator"

samup4web · November 2012

Hi all,

I am trying to use the Series Extension to perform classification on multivariate time series data. So far, I have been able to get a sliding window using the "Windowing" operator (encode series by examples).
Example of source data:
point1 point2 point3 label
a1 b1 c1 class1
a2 b2 c2 class1
a3 b3 c3 class1
... .... ... ........

Since I wish to perform classification on these data points, I intend to extract features based on the sliding window and then have a dataset similar to the example below (assume window size = 2):

Point1-1 Point1-0 Point1-Extracted-Feat. Point2-1 Point2-0 Point2-Extracted-Feat. Point3-1 Point3-0 Point3-Extracted-Feat label
a1 a2 XXXXXX b1 b2 XXXXXXX c1 c2 XXXXXXXX Class1
a3 a4 XXXXXX b3 b4 XXXXXXX c3 c4 XXXXXXX Class1
... ... ......

With this approach, I can select the extracted features as attributes for the classification process.
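The target layout above can be sketched in plain Python, outside RapidMiner. This is only an illustration under stated assumptions: the function name `window_rows` is hypothetical, the windows do not overlap (step equal to the window size, matching the a1 a2 / a3 a4 rows), the mean stands in for whatever feature is actually extracted, and each window takes the label of its newest example.

```python
from statistics import mean


def window_rows(rows, names, window=2, step=2, feature=mean):
    """Turn time-ordered (value..., label) rows into one row per window.

    For every attribute the output holds the raw window values followed by
    one extracted feature, and the label comes last.
    """
    header = []
    for name in names:
        # Lag columns run from oldest (window-1) to newest (0), as in the post.
        header += [f"{name}-{lag}" for lag in range(window - 1, -1, -1)]
        header.append(f"{name}-feat")
    header.append("label")

    table = []
    for start in range(0, len(rows) - window + 1, step):
        chunk = rows[start:start + window]
        out = []
        for col in range(len(names)):
            series = [r[col] for r in chunk]
            out += series
            out.append(feature(series))  # placeholder feature (assumption)
        out.append(chunk[-1][-1])  # assumption: label of the newest example
        table.append(out)
    return header, table


# Made-up numbers standing in for a1..a4, b1..b4, c1..c4.
rows = [
    (1.0, 10.0, 100.0, "class1"),
    (2.0, 20.0, 200.0, "class1"),
    (3.0, 30.0, 300.0, "class1"),
    (4.0, 40.0, 400.0, "class1"),
]
header, table = window_rows(rows, ["Point1", "Point2", "Point3"])
print(header)
for row in table:
    print(row)
```

Setting `step=1` would give overlapping windows instead, and any function that maps a list of values to a single number can be passed as `feature`.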

So far, I have only been able to get the attributes (Point1-1, Point1-0, Point2-1, Point2-0, Point3-1, Point3-0) using the Windowing operator. But when I attempt to use operators such as Discrete Wavelet Transformation, they only operate on the first example (i.e. a1, a2, b1, b2, c1, c2). I also had to use the "Data to Series" operator to be able to use any of the extraction or transformation operators for series. I don't know if I am using the right approach.

What is the best setup for using a sliding window and extracting features to produce a dataset similar to the example I showed above?

Best Regards
/Sam

wessel · November 2012

Can you modify your example to include the attribute time?

Like, for example, maybe you want to predict the temperature 24 hours from now, based on historic temperature and humidity readings.

So then you have your original data-set like this:

#Time, humidity, temperature
timeValue1, humidityValue1, temperatureValue1
timeValue2, humidityValue2, temperatureValue2
...
timeValueN, humidityValueN, temperatureValueN

Let's write this more briefly:
#t,y,z
t1,y1,z1
t2,y2,z2
...
tN,yN,zN

Now, after windowing with embedding dimension W, your dataset becomes:
#t,y-24,y-25,....,y-W,z-24,z-25,...,z-W,z-0

Here z-0 is the label attribute; it encodes z (temperature) at 0 hours relative to now.
So z-24 indicates the temperature 24 hours ago relative to now (where now refers to the 'now' of the row).

Note that t is an attribute with the role ID, so it doesn't get windowed.
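The layout above can be sketched in plain Python. This is not the Windowing operator itself, only an illustration under assumptions: the function name `lagged_rows` and its parameters `horizon` and `max_lag` are hypothetical, rows are equally spaced in time (one row per hour in the temperature example), and small values (horizon 2, W = 3) replace 24 and W to keep the output short.

```python
def lagged_rows(t, y, z, horizon, max_lag):
    """Build rows of the form (t, y-horizon..y-max_lag, z-horizon..z-max_lag, z-0)."""
    lags = range(horizon, max_lag + 1)
    rows = []
    # The first usable "now" is the row that has max_lag rows before it.
    for i in range(max_lag, len(t)):
        row = [t[i]]                     # ID attribute, copied but not windowed
        row += [y[i - k] for k in lags]  # y-horizon ... y-max_lag
        row += [z[i - k] for k in lags]  # z-horizon ... z-max_lag
        row.append(z[i])                 # z-0: the label at "now"
        rows.append(row)
    return rows


t = [0, 1, 2, 3, 4]
y = [50, 51, 52, 53, 54]  # made-up humidity readings
z = [10, 11, 12, 13, 14]  # made-up temperature readings
rows = lagged_rows(t, y, z, horizon=2, max_lag=3)
for row in rows:
    print(row)
```

Each output row only contains values that are at least `horizon` steps old plus the label, so nothing newer than the horizon is used as an input.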

Best regards,

Wessel

samup4web · November 2012

Hi Wessel,

I already have time (role=id) in my example; I just didn't include it in the illustration.

As I mentioned in my previous post, I already used the Windowing operator, and I have the encoded examples similar to what you described. But I wish to apply series feature extraction to each window of the time series. I need this for classification rather than prediction.

I am currently using the Windowing operator, but I do not know if this is the best approach for my task.

Is there any documentation about the Value Series extension for RapidMiner 5?

wessel · November 2012

Okay, so what type of features do you wish to extract?

Can't you use the raw inputs as input to a classifier, and use those 'predicted labels' as features?
Or handcraft the exact features you want using the script operator?
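As a rough idea of what handcrafted features could look like, here is a minimal sketch written in plain Python rather than inside RapidMiner's script operator. The function name `handcrafted_features` and the choice of features (mean, range, least-squares slope) are assumptions for illustration; the same logic would be applied to the values of one attribute within one window.

```python
def handcrafted_features(window):
    """Return a few simple summary features for one window of numeric values."""
    n = len(window)
    mean = sum(window) / n
    value_range = max(window) - min(window)

    # Least-squares slope of the values against their positions 0..n-1.
    x_mean = (n - 1) / 2
    denom = sum((x - x_mean) ** 2 for x in range(n))
    if denom:
        slope = sum((x - x_mean) * (v - mean) for x, v in enumerate(window)) / denom
    else:
        slope = 0.0  # a single-value window has no trend

    return {"mean": mean, "range": value_range, "slope": slope}


features = handcrafted_features([1.0, 2.0, 4.0, 7.0])
print(features)
```

Each returned value would become one extra attribute per original attribute and window.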

Best regards,

Wessel

MariusHelf · December 2012

Hi,

If I get it right, you want to extract a feature for each of the original attributes, and you want to do this on windowed data. Maybe this thread (click here) can help you.

Best regards,
Marius
