
How do I use LDA for a single column in an Excel file?

MoeMoe Member Posts: 4 Learner I
Hello RapidMiner Community,

First of all, a big THANK YOU to anyone who helps. I'm currently writing my thesis and I'm quite new to RapidMiner, but you definitely rock!

I'm trying to use LDA on a single column of an Excel file. The Read Excel operator currently imports only that one column, which contains a different text in each of its 400 rows. I'm using the process shown in the pictures attached below, which is quite similar to the process shown in a YouTube tutorial by Nasir Soft called: "13 -Topic Modeling and Latent Dirichlet Allocation (LDA) | Twitter Mining | Rapidminer Tutorial". I'm not allowed to post links yet, but I will also attach his process (it is the fourth picture).

I think the processes are quite similar, except that I'm using the "Process Documents" operator to preprocess the text data (tokenizing, stemming...).

Does anybody know what I'm doing wrong? I don't get any results.

Best Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Solution Accepted
    Hi there,

    Are you sure there is any text going in? The process doesn't look too bad at first glance.

    Cheers,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Solution Accepted
    Hey,
    Check the output of the 'top' port. It lists the most important words per topic, which is generally very helpful for interpreting the topics.

    I wrote something about it years back on my blog: https://towardsdatascience.com/topic-mining-on-amazon-reviews-ae76fc286c61

    Cheers,
    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
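
The two checks suggested in the answers above can also be sketched outside RapidMiner in plain Python. This is only an illustration under assumptions, not the RapidMiner process itself: the `rows` list is a hypothetical stand-in for the single Excel column, the `weights` table is made-up data standing in for what the 'top' port reports, and the helper names `tokenize`, `usable_rows`, and `top_words_per_topic` are invented for this sketch.

```python
import re


def tokenize(text):
    """Lowercase a cell value and split it into runs of letters.

    Non-string cells (for example empty Excel cells read as None)
    yield no tokens.
    """
    if not isinstance(text, str):
        return []
    # [^\W\d_]+ matches runs of Unicode letters, so umlauts survive.
    return re.findall(r"[^\W\d_]+", text.lower())


def usable_rows(rows, min_tokens=1):
    """Return the rows that still contain at least min_tokens tokens."""
    return [row for row in rows if len(tokenize(row)) >= min_tokens]


def top_words_per_topic(topic_word_weights, n=3):
    """Map each topic to its n highest-weighted words.

    Ties are broken alphabetically so the result is deterministic.
    """
    result = {}
    for topic, word_weights in topic_word_weights.items():
        ranked = sorted(word_weights.items(), key=lambda kv: (-kv[1], kv[0]))
        result[topic] = [word for word, _ in ranked[:n]]
    return result


# Hypothetical stand-in for the imported column (assumption, not real data).
rows = ["Great battery life", "", None, "   ", "Screen broke after a week"]
kept = usable_rows(rows)
print(f"{len(kept)} of {len(rows)} rows contain text")

# Hypothetical topic-word weights (assumption, not real LDA output).
weights = {
    "topic_0": {"battery": 0.30, "life": 0.25, "screen": 0.05},
    "topic_1": {"screen": 0.40, "broke": 0.20, "battery": 0.02},
}
print(top_words_per_topic(weights, n=2))
```

If the first check reports that few or no rows contain text, the problem is most likely in the import or preprocessing step rather than in LDA itself. The second function only ranks weights that already exist, so it mirrors how a top-words table is read rather than how the topics are computed.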

Answers

  • MoeMoe Member Posts: 4 Learner I
    Hi Martin,

    I tried a simplified approach and that worked. Sadly, the output is not as easy to interpret as I had hoped, but thank you nonetheless.
  • MoeMoe Member Posts: 4 Learner I
    Thank you @MartinLiebig, I will try that.