
Textual ETL: Stemming from dictionary

Wanttoknow Member Posts: 6 Contributor II
edited November 2019 in Help
Hi,

First of all I have to say that RM5.0 is a wonderful tool. :o Congratulations.

I started preprocessing text for classification and I am having some problems with the "Stem (Dictionary)" component.

I am pointing it at a text file containing the patterns, but I am not sure about the syntax of the entries/records in that file. The help is very brief about this.

Right now the first line in my designated TXT file looks like this:

"move: moving moved move"

But it is not replacing any of the terms with their stem.

Any idea?

Answers

  • arminmania Member Posts: 7 Contributor II
    Hi,

    I am not sure, but I think you have to write it as follows:

    move , moving moved move
  • TobiasMalbrecht Moderator, Employee-RapidMiner, Member Posts: 295 RM Product Management
    Hi,
    Wanttoknow wrote:

    Right now the first line in my designated TXT file looks like this:

    "move: moving moved move"
    Did you try putting a blank before the colon?

    Kind regards,
    Tobias
  • Wanttoknow Member Posts: 6 Contributor II
    Well, after a lot of trial and error, this seems to work:

    "
    aanleveren:aanlever.*
    aanleveren:aangelever.*
    zorgverzekering:zorgverzeker.*
    "
    But putting multiple patterns on one line, like this: "aanleveren : aanlever* aangelever*", doesn't work.

  • Wanttoknow Member Posts: 6 Contributor II
    Another question:

    Is it possible to use an external list for the ReplaceToken component? That would be more convenient than entering records with the list editor of the component.
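
For readers trying to reproduce the dictionary format reported as working in this thread (one "stem:pattern" entry per line), here is a minimal Python sketch of how such a file could be interpreted. It is not RapidMiner's implementation. It assumes each pattern is a regular expression that must match the whole token, and the names parse_stem_dictionary and stem_token are hypothetical helpers made up for illustration.

```python
import re

def parse_stem_dictionary(text):
    """Parse lines of the form 'stem:regex' into (stem, compiled pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        stem, pattern = line.split(":", 1)
        rules.append((stem.strip(), re.compile(pattern.strip())))
    return rules

def stem_token(token, rules):
    """Return the stem of the first rule whose pattern matches the whole token.

    Assumption: whole-token matching; the real operator may behave differently.
    """
    for stem, pattern in rules:
        if pattern.fullmatch(token):
            return stem
    return token  # no rule matched, keep the token unchanged

# The entries from the thread, one pattern per line.
DICTIONARY = """
aanleveren:aanlever.*
aanleveren:aangelever.*
zorgverzekering:zorgverzeker.*
"""

rules = parse_stem_dictionary(DICTIONARY)
tokens = ["aangeleverd", "zorgverzekeraar", "fiets"]
stemmed = [stem_token(t, rules) for t in tokens]
print(stemmed)  # ['aanleveren', 'zorgverzekering', 'fiets']
```

If the patterns are indeed treated as regular expressions, this may also explain part of why "aanlever*" did not behave as hoped: in regex syntax that means "aanleve" followed by zero or more "r" characters, not "aanlever" followed by anything, which would be written "aanlever.*".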