Textual ETL: Stemming from dictionary
Wanttoknow
Member Posts: 6 Contributor II
Hi,
First of all, I have to say that RM5.0 is a wonderful tool. Congratulations.
I started preprocessing text for classification and I am having some problems with the "Stem (Dictionary)" component.
I am pointing it to a text file for the patterns, but I am not sure about the syntax of the entries/records in that file. The help is very brief about this.
Right now the first line in my designated TXT file looks like this:
"move: moving moved move"
But it is not replacing any of the terms with their stem.
Any idea?
Answers
I am not sure, but I think you have to write it as follows:
move , moving moved move
Kind regards,
Tobias
"
aanleveren:aanlever.*
aanleveren:aangelever.*
zorgverzekering:zorgverzeker.*
"
But putting multiple patterns on one line, like "aanleveren : aanlever* aangelever*", doesn't work.
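For anyone who wants to check a dictionary file outside RapidMiner, here is a rough Python sketch of how the quoted "stem:pattern" lines could be interpreted. This is not RapidMiner's actual implementation. It assumes each line holds one stem and one regular expression separated by the first colon, and that a token is replaced only when the whole token matches the expression. The names `load_rules` and `stem_token` are made up for this illustration.

```python
import re

def load_rules(text):
    """Parse 'stem:pattern' lines into (stem, compiled regex) pairs.

    Assumption: one rule per line, split at the first colon.
    """
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        stem, pattern = line.split(":", 1)
        rules.append((stem.strip(), re.compile(pattern.strip())))
    return rules

def stem_token(token, rules):
    """Return the stem of the first rule whose regex matches the whole token."""
    for stem, regex in rules:
        if regex.fullmatch(token):
            return stem
    return token  # no rule matched, keep the token unchanged

# The three working lines quoted in the post above.
RULES_TEXT = """\
aanleveren:aanlever.*
aanleveren:aangelever.*
zorgverzekering:zorgverzeker.*
"""

rules = load_rules(RULES_TEXT)
tokens = ["aangeleverd", "aanlevering", "zorgverzekeraar", "fiets"]
stemmed = [stem_token(t, rules) for t in tokens]
print(stemmed)
# prints ['aanleveren', 'aanleveren', 'zorgverzekering', 'fiets']

# The multi-pattern line becomes a single regex containing a space,
# so under this sketch's assumptions no single token can match it.
bad_rules = load_rules("aanleveren : aanlever* aangelever*")
print(stem_token("aanlevering", bad_rules))
# prints aanlevering
```

Under these assumptions, the multi-pattern line fails for two reasons: the space becomes part of the expression, and in regular-expression syntax "aanlever*" means "aanleve" followed by zero or more "r", not "anything starting with aanlever" (that would be "aanlever.*"). Whether RapidMiner behaves exactly this way is a guess based on the behaviour described in this thread.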
Is it possible to use an external list for the ReplaceToken component? That would be more convenient than entering records with the list editor of the component.