"Add spelling filter"

yijun Member Posts: 1 Learner III
edited June 2019 in Help
There is a filter in text processing to remove dictionary words (stop words). Is there a filter to remove non-dictionary words?

One use would be to filter out words that are NOT in a user-supplied file. If the user file is "linux.words", an English dictionary, this would remove non-English words. That is useful for removing bad words from a collection of poorly scanned OCR text files.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi, this is currently not possible out of the box with the Text Processing operators. You can, however, transform the document into an example set and then use the standard RapidMiner operators to remove all words that are not contained in a dictionary.

    Best,
    Marius
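
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of the filter asked about: keep only the tokens that appear in a word list. It is not a RapidMiner operator and not the workflow Marius describes. The function names are hypothetical, and the small inline list stands in for a file such as "linux.words".

```python
import string


def load_dictionary(lines):
    """Build a lowercase set of dictionary words from an iterable of lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def keep_dictionary_words(text, dictionary):
    """Return the whitespace-separated tokens of `text` found in `dictionary`.

    Surrounding punctuation is stripped and the lookup is case-insensitive.
    """
    kept = []
    for raw in text.split():
        token = raw.strip(string.punctuation)
        if token and token.lower() in dictionary:
            kept.append(token)
    return kept


# Assumption: a tiny inline word list used in place of a real dictionary file.
words = load_dictionary(["the", "quick", "brown", "fox"])
print(keep_dictionary_words("The qu1ck brown f0x.", words))
```

With a real word list, you could pass an open file handle to `load_dictionary`, since it accepts any iterable of lines. OCR errors that happen to form valid words would still pass through this filter.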