PCA in RapidMiner
I would like to identify the themes in a corpus of tweets using PCA. I built a process from the following operators: Read Excel, Nominal to Numerical, and PCA, and connected their ports. The process runs without errors, but I am not sure how to identify the hidden themes from the standard deviation, proportion of variance, and cumulative variance that PCA reports. The proportion of variance ranges from 0 to .001. I set the variance threshold to .95.
Can you please help me? Thank you.
component | std dev | proportion of variance | cumulative variance |
PC 1 | 0.157 | 0.025 | 0.025 |
PC 2 | 0.137 | 0.019 | 0.045 |
PC 3 | 0.123 | 0.016 | 0.06 |
PC 4 | 0.118 | 0.014 | 0.075 |
PC 5 | 0.115 | 0.014 | 0.089 |
PC 6 | 0.112 | 0.013 | 0.101 |
PC 7 | 0.104 | 0.011 | 0.113 |
PC 8 | 0.1 | 0.01 | 0.123 |
PC 9 | 0.098 | 0.01 | 0.133 |
PC 10 | 0.097 | 0.01 | 0.143 |
PC 11 | 0.097 | 0.01 | 0.153 |
PC 12 | 0.093 | 0.009 | 0.161 |
PC 13 | 0.093 | 0.009 | 0.17 |
PC 14 | 0.092 | 0.009 | 0.179 |
PC 15 | 0.09 | 0.008 | 0.187 |
PC 16 | 0.089 | 0.008 | 0.196 |
PC 17 | 0.087 | 0.008 | 0.204 |
PC 18 | 0.087 | 0.008 | 0.211 |
PC 19 | 0.086 | 0.008 | 0.219 |
PC 20 | 0.084 | 0.007 | 0.226 |
PC 21 | 0.083 | 0.007 | 0.234 |
PC 22 | 0.082 | 0.007 | 0.241 |
PC 23 | 0.082 | 0.007 | 0.248 |
PC 24 | 0.081 | 0.007 | 0.254 |
PC 25 | 0.08 | 0.007 | 0.261 |
PC 26 | 0.08 | 0.007 | 0.268 |
Best Answer
Thomas_Ott
Hi jem810,
Have you checked out this post by our partner Simafore? http://www.simafore.com/blog/bid/62911/How-to-run-Principal-Component-Analysis-with-RapidMiner-Part-2
The post goes into depth on how to use PCA with RapidMiner and how to interpret the eigenvectors.
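As a rough illustration of what interpreting the eigenvectors can look like for a term-based data set: each component has one weight per attribute, and the attributes with the largest absolute weights suggest what that component is "about". The sketch below is plain Python, not RapidMiner, and assumes the eigenvector table has been copied out by hand. The terms, the weights, and the `top_terms` helper are all hypothetical and invented for the example.

```python
# Hypothetical eigenvector (loading) table: one weight per term per component.
# The terms and numbers are invented for illustration; they are not from the
# tweet corpus in the question.
loadings = {
    "PC 1": {"flood": 0.62, "rain": 0.55, "rescue": 0.41, "vote": -0.05, "election": 0.02},
    "PC 2": {"flood": 0.03, "rain": -0.08, "rescue": 0.10, "vote": 0.70, "election": 0.68},
}


def top_terms(weights, n=3):
    """Return the n terms with the largest absolute weight, strongest first."""
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return [term for term, _ in ranked[:n]]


# Candidate "theme" per component: its highest-weighted terms.
themes = {pc: top_terms(weights) for pc, weights in loadings.items()}

for pc, terms in themes.items():
    print(pc, terms)
```

Absolute values are used for ranking because a strongly negative weight is as informative as a strongly positive one. The sign only tells you in which direction the term moves along the component.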
Answers
Is it possible to display the results graphically?
@sebastian_gonza Yes, the results can be displayed as a graph.
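For a quick look outside RapidMiner, a scree-style view can also be sketched in a few lines of plain Python. This is only a minimal text-based illustration, not RapidMiner's own charting. It uses the proportion-of-variance values of the first five components from the table in the question, and the `bar` helper and its `scale` parameter are made up for the example.

```python
# Proportion of variance for the first five components, copied from the
# table in the question.
proportion = {
    "PC 1": 0.025,
    "PC 2": 0.019,
    "PC 3": 0.016,
    "PC 4": 0.014,
    "PC 5": 0.014,
}


def bar(value, scale=1000):
    """Render one bar: one '#' per 1/scale of variance explained."""
    return "#" * round(value * scale)


# One line per component: label, bar, and the numeric value.
lines = [f"{pc:<5} {bar(p)} {p:.3f}" for pc, p in proportion.items()]
print("\n".join(lines))
```

A flat profile like this one, where no component stands out, usually means the variance is spread across many components instead of being concentrated in a few dominant ones.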