BUG REPORT: text mining, the clustering process
When I try to run the clustering process for text mining, I get an error message. The process, error message and CSV files are attached below.
Best Answer
jacobcybulski Member, University Professor Posts: 391 Unicorn
Hi, you have not included the actual RMP file, so I am only guessing at what may have gone wrong. Your data has over 20K examples and your text has thousands of unique terms, and k-means clustering is not very good at dealing with thousands of attributes. So I assume you have run out of memory on your computer. To test this, I suggest reducing your sample size to 1000 (just for testing). More importantly, you need to reduce the number of terms generated by the parsing process. I suggest enabling pruning within Process Documents from Data and keeping it simple, e.g. percentual pruning from 5% to 30%, which would possibly bring the number of attributes to fewer than 300. If that works, use 100% of the data. I also note that you have not normalised your data before clustering, so it will be difficult to analyse your data visually. Good luck!
Jacob
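For readers who want to see what percentual pruning does to a vocabulary, here is a minimal Python sketch of the idea, not RapidMiner's implementation. It assumes pruning is based on document frequency (the share of documents a term appears in) and uses plain whitespace tokenisation; the function name `prune_terms` and the toy documents are hypothetical.

```python
from collections import Counter


def prune_terms(docs, min_frac=0.05, max_frac=0.30):
    """Return the terms whose document frequency lies in [min_frac, max_frac].

    Hypothetical helper: tokenisation is a simple lowercase whitespace split,
    which is an assumption and not how any particular tool parses text.
    """
    n_docs = len(docs)
    # Count the number of documents each term appears in (not total occurrences).
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc.lower().split()))
    # Drop terms that are too rare or too common to help separate clusters.
    return sorted(
        term
        for term, count in doc_freq.items()
        if min_frac <= count / n_docs <= max_frac
    )


# Toy data: "text" is in every document, "mining" in half, the rest in one each.
docs = [
    "text mining process",
    "text clustering error",
    "text mining memory",
    "text csv",
]
# Thresholds are widened here only because the toy corpus has four documents.
print(prune_terms(docs, min_frac=0.5, max_frac=0.75))  # ['mining']
```

With only the surviving terms kept as attributes, each example becomes a much shorter vector, which is what makes k-means cheaper in both memory and time.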