How do I use LDA for a single column in an Excel file?
Hello RapidMiner Community,
First of all, a big THANK YOU to anyone who helps. I'm currently writing my thesis and I'm quite new to RapidMiner, but you definitely rock!
I'm trying to use LDA on a single column of an Excel file. The Read Excel operator currently imports only that one column, which contains different texts in 400 rows. I'm using the process shown in the pictures attached below, as it is quite similar to the process shown in a YouTube tutorial by Nasir Soft called "13 -Topic Modeling and Latent Dirichlet Allocation (LDA) | Twitter Mining | Rapidminer Tutorial". I'm not allowed to post links yet, but I will also attach his process (it is the fourth picture).
I think the processes are quite similar, except that I'm using the "Process Documents" operator to preprocess the text data (tokenizing, stemming, ...).
Does anybody know what I'm doing wrong? I don't get any results.
Best Answers
MartinLiebig (Administrator, Moderator, Employee-RapidMiner):
Hi there,
are you sure there is any text going in? It doesn't look too bad at first glance.
Cheers,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
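A quick way to check whether any text is actually going in, outside of RapidMiner, might look like the Python sketch below. It assumes the single Excel column has already been loaded into a plain list of cell values; the function names `tokenize` and `summarize_column` and the sample rows are hypothetical, and the regex tokenizer is only a rough stand-in for RapidMiner's Tokenize operator, not a reproduction of it.

```python
import re

def tokenize(text):
    """Lowercase the text and return runs of letters (digits and punctuation are dropped)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def summarize_column(rows):
    """Count rows that still contain tokens and the average token count of those rows."""
    # Empty cells (None) are treated as documents without any tokens.
    token_lists = [tokenize(str(row)) if row is not None else [] for row in rows]
    non_empty = [tokens for tokens in token_lists if tokens]
    total_tokens = sum(len(tokens) for tokens in non_empty)
    return {
        "rows": len(token_lists),
        "rows_with_text": len(non_empty),
        "avg_tokens": total_tokens / len(non_empty) if non_empty else 0.0,
    }

# Hypothetical sample standing in for the 400-row Excel column.
sample = ["Topic models group related words.", None, "   ", "LDA needs text, not numbers: 42"]
summary = summarize_column(sample)
print(summary)
```

If `rows_with_text` is zero or far below the number of rows, the column is probably being read as empty or with the wrong type, and LDA would have nothing to work with.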
MartinLiebig (Administrator, Moderator, Employee-RapidMiner):
Hey,
check the output of the 'top' port. It tells you the most important words per topic, which is generally very helpful.
I wrote something about it years back on my blog: https://towardsdatascience.com/topic-mining-on-amazon-reviews-ae76fc286c61
Cheers,
Martin
Answers
I tried a simplified approach and that worked. Sadly, the output is not as nicely interpretable as I hoped, but thank you nonetheless.