Remove Numeric and other data from Text Mining

aks Member Posts: 3 Learner I
Hi, I am a new user of RP. I have imported a file for sentiment analysis. It is a financial file, and I want to remove the dollar signs and digits ($, 0, 1, ..., 9) from the loaded text. Which operator should I use? Thanks in advance.

Answers

  • aks Member Posts: 3 Learner I
    RP (RapidMiner Platform)
  • kayman Member Posts: 662 Unicorn
    Use the Replace Tokens operator.

    If you click the edit icon and then the drop-down, you get a few pre-selections. Usually the punctuation character class (replaced with a space or similar) works fine in these cases; you may also want to add the digit range 0-9 if needed.
  • Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    You can also just use the Replace operator on the text before tokenizing, replacing every match of the regular expression [0-9]+ in the attribute(s) in question with nothing.

    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
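
For anyone who wants to try the pattern outside RapidMiner first, here is a minimal Python sketch of the same idea: strip digit runs and the dollar sign from the raw text before tokenizing. It is only an illustration, not the RapidMiner operator itself. The function name strip_numeric and the sample string are made up for this example, and the pattern is extended with "$" because the question also asks about the currency symbol.

```python
import re

# Assumption: "$" is added to the [0-9]+ pattern suggested in the thread,
# since the original question also wants the dollar sign removed.
NUMERIC_NOISE = re.compile(r"[$]|[0-9]+")

def strip_numeric(text: str) -> str:
    """Replace '$' and digit runs with a space, then collapse whitespace."""
    # Substituting a space (rather than nothing) keeps neighbouring words
    # from being glued together, e.g. "up5percent".
    cleaned = NUMERIC_NOISE.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()

if __name__ == "__main__":
    sample = "Profit was $100 in 2023"  # hypothetical input
    print(strip_numeric(sample))
```

Replacing with a space follows the "replace with space" suggestion above; using an empty replacement instead matches the plain "remove" variant, at the risk of joining adjacent words.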