Reverse Mapping of PCA
Hi there, I am fairly new to using RapidMiner and a little stuck.
I have applied PCA on a dataset and retained 2 of the principal components.
I then apply the PCA model to a new dataset to calculate the scores.
I would then like to reverse map the scores to the input variables and compare the original input variables with the newly calculated variables from the reverse mapping.
I am trying to use the errors to do online fault diagnosis.
How can I do the reverse mapping of the scores in RapidMiner?
Thanks
MrFury
Answers
I remember doing this in the past by feeding the PCA a row with all 1's and then using the result in a 'rename'.
Is your goal to get the names correct?
Or to apply a PCA model on unseen data?
Best regards,
Wessel
The goal is to do fault diagnosis on sensor data I am getting from a piece of equipment, so I want to apply the PCA model to unseen time series data.
I am trying to do something similar to what is described in this article: [http://www.wseas.us/e-library/conferences/2010/Merida/CIMMACS/CIMMACS-20.pdf]
Step 1: Capture most of the natural process variance with PCA on training data. The residual is treated as the "noise".
Step 2: Apply the PCA model to a new set of data to calculate the scores.
Step 3: Monitor the T2 or Q statistic to indicate a fault (the error will increase if the model sees something different from what it was trained on).
Step 4: By mapping the scores back into the input variables, one can calculate the error for each variable (E = x - x').
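To make the four steps concrete, here is a minimal sketch in Python with NumPy. It runs outside RapidMiner and is not a RapidMiner operator chain. The function names (fit_pca, reverse_map), the synthetic five-sensor data, and the injected fault are all assumptions made for illustration. The reverse mapping is x' = t P^T + mean, where P holds the retained loadings and t the scores, so E = x - x' is the per-variable error from Step 4.

```python
import numpy as np

def fit_pca(X_train, n_components=2):
    """Step 1: fit PCA on training data via SVD of the centered matrix."""
    mean = X_train.mean(axis=0)
    Xc = X_train - mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                      # loadings, (n_features, n_components)
    variances = s[:n_components] ** 2 / (X_train.shape[0] - 1)
    return mean, P, variances

def reverse_map(X_new, mean, P, variances):
    """Steps 2-4: scores, reconstruction, residuals, Q and T2 statistics."""
    Xc = X_new - mean
    T = Xc @ P                                   # Step 2: scores
    X_hat = T @ P.T + mean                       # Step 4: reverse mapping x'
    E = X_new - X_hat                            # per-variable error E = x - x'
    Q = np.sum(E ** 2, axis=1)                   # Step 3: squared prediction error
    T2 = np.sum(T ** 2 / variances, axis=1)      # Step 3: Hotelling's T2
    return T, X_hat, E, Q, T2

# Synthetic data (an assumption): 5 sensors driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
W = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
              [1.0, 0.5, 0.0, -0.5, -1.0]])
X_train = rng.normal(size=(200, 2)) @ W + 0.01 * rng.normal(size=(200, 5))
X_normal = rng.normal(size=(50, 2)) @ W + 0.01 * rng.normal(size=(50, 5))

# Hypothetical fault: add an offset to sensor index 2 of one sample.
x_fault = X_normal[:1].copy()
x_fault[0, 2] += 5.0

mean, P, variances = fit_pca(X_train)
_, _, E_normal, Q_normal, _ = reverse_map(X_normal, mean, P, variances)
_, _, E_fault, Q_fault, _ = reverse_map(x_fault, mean, P, variances)

print("max Q on normal data:", Q_normal.max())
print("Q on faulty sample:  ", Q_fault[0])
print("largest residual at sensor index:", int(np.argmax(np.abs(E_fault[0]))))
```

In a real setup the control limits for Q and T2 would be estimated from the training data. Note also that a fault in one sensor can smear into the residuals of other sensors, so the largest entry of E is a hint rather than a guaranteed diagnosis.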
Thanks
Pieter
Then hand-craft features.
PC1 is typically the average value of the series.
PC2 is typically the difference between the first part of the series and the second part of the series.
By hand-crafting you can create soft margins.
For example, you can specify that the middle part of the series gets a value close to zero, that the second (last) part gets a value quickly building up to -1, and that the first part starts from 1 and quickly drops to near 0 when the middle part is reached.
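One possible reading of this suggestion, sketched in Python with NumPy: build a weight vector over the window that starts at 1, decays to roughly 0 in the middle, and falls to -1 at the end, then use the plain mean and the weighted sum as two hand-crafted features. The exponential shape, the decay parameter, and the function names are assumptions for illustration, not something specified in the post.

```python
import numpy as np

def handcrafted_weights(n, decay=0.15):
    """Soft 'first part minus last part' weights over a window of length n."""
    t = np.arange(n)
    scale = decay * n
    first = np.exp(-t / scale)              # 1 at the start, near 0 by the middle
    last = -np.exp(-(n - 1 - t) / scale)    # near 0 in the middle, -1 at the end
    return first + last

def handcrafted_features(window):
    """Return (average level, soft first-vs-last difference) for one window."""
    window = np.asarray(window, dtype=float)
    w = handcrafted_weights(len(window))
    return float(window.mean()), float(w @ window)

w = handcrafted_weights(20)
level, trend = handcrafted_features(np.linspace(0.0, 1.0, 20))
print(w.round(3))
print(level, trend)
```

A constant window gives a second feature of about zero because the weights are antisymmetric, while a series that rises over the window gives a negative value and one that falls gives a positive value.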
As a side note, you really want to use supervised learning if possible.
Using unsupervised PCA will always give you vague answers.