
About text mining

morphism Member Posts: 18 Learner I
edited December 2018 in Help

Hello, how are you?


I am interested in text mining with RapidMiner.


Is there any way to apply

"Nonnegative Matrix Factorization", "Probabilistic Latent Semantic Analysis",

or a "Nonlinear Transformation" to a document-term matrix?


I want to do classification, clustering, summarization, information retrieval, etc. on text data.


Thank you in advance and have a nice day.

Answers

  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    RapidMiner has no native operators for these types of text mining functions. However, you could potentially accomplish them with the relevant R or Python packages through the scripting operators. Having said that, these techniques are also somewhat more advanced, or even esoteric, approaches to text mining. Have you tried the more straightforward bag-of-words approach using standard word vector creation (TF-IDF or similar) yet? You might want to start with that and see what kind of results you get before moving on to the more complex approaches.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
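
As a rough illustration of the scripting route described in the answer above, here is a minimal Python sketch of Nonnegative Matrix Factorization applied to a TF-IDF document-term matrix. It uses only NumPy and the classic multiplicative update rules. The toy documents, the helper names `tfidf_matrix` and `nmf`, and the parameters `rank`, `n_iter`, `seed`, and `eps` are all illustrative assumptions, not anything RapidMiner provides, and an established library implementation would likely be a better choice for real data.

```python
import numpy as np

def tfidf_matrix(docs):
    """Return a dense TF-IDF document-term matrix (rows = documents) and its vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    column = {tok: j for j, tok in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(tokenized):
        for tok in toks:
            counts[i, column[tok]] += 1.0
    doc_freq = (counts > 0).sum(axis=0)
    # Smoothed inverse document frequency; one common variant among several.
    idf = np.log((1.0 + len(docs)) / (1.0 + doc_freq)) + 1.0
    return counts * idf, vocab

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative V (n x m) as W (n x rank) @ H (rank x m).

    Uses multiplicative updates that reduce the Frobenius reconstruction error
    while keeping every entry of W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy corpus with two loose themes (pets and markets).
docs = [
    "cat dog pet",
    "dog pet animal",
    "stock market price",
    "market price trade deal",
]

V, vocab = tfidf_matrix(docs)
W, H = nmf(V, rank=2)          # W: document-topic weights, H: topic-term weights
error = np.linalg.norm(V - W @ H)

print("reconstruction error:", round(float(error), 4))
print("document-topic weights:")
print(W.round(2))
```

The rows of `W` can then serve as low-dimensional features for clustering or classification. How such a script is wired into a RapidMiner process through the Python scripting operator depends on the installed extension and is not shown here.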
  • morphism Member Posts: 18 Learner I

    Hello, Telcontar120.

    Thank you for your explanation.

    I am a beginner in text mining with RapidMiner.

    I read in books that

    SVD or Nonnegative Matrix Factorization can be applied before clustering and similar tasks,

    and I guessed that there are no operators for them.

    I wanted to know the full range of text mining functions RapidMiner offers,

    and I wanted to use the Nonnegative Matrix Factorization technique.


    Then I found that RapidMiner has "Singular Value Decomposition (SVD)".


    So could you please explain how SVD can be applied to text mining projects

    such as clustering and classification?
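
For context on the question above: one common way SVD is used in text mining is latent semantic analysis, where the document-term matrix is projected onto its top singular directions and the reduced document vectors are then passed to a clustering or classification step. The following is a minimal sketch of that idea in Python with NumPy, not a description of how the RapidMiner operator works internally. The toy documents and the helper names `count_matrix`, `lsa_embed`, and `cosine_similarity` are illustrative assumptions, and raw term counts are used for brevity where TF-IDF weights could be substituted.

```python
import numpy as np

def count_matrix(docs):
    """Return a dense term-count matrix (rows = documents) and its vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    column = {tok: j for j, tok in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(tokenized):
        for tok in toks:
            X[i, column[tok]] += 1.0
    return X, vocab

def lsa_embed(X, k):
    """Project each document onto the top-k singular directions of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def cosine_similarity(Z):
    """Pairwise cosine similarity between the rows of Z."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    Zn = Z / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return Zn @ Zn.T

# Hypothetical toy corpus with two loose themes (pets and markets).
docs = [
    "cat dog pet",
    "dog pet animal",
    "stock market price",
    "market price trade deal",
]

X, vocab = count_matrix(docs)
Z = lsa_embed(X, k=2)          # one 2-dimensional vector per document
sim = cosine_similarity(Z)

print("reduced document vectors:")
print(Z.round(2))
print("pairwise cosine similarity:")
print(sim.round(2))
```

In the reduced space, documents that share vocabulary end up pointing in similar directions, which is what makes the vectors in `Z` useful as input to a clustering algorithm or as features for a classifier. The number of components `k` is a tuning choice; 2 is used here only because the corpus is tiny.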


