prefixing attribute names produced by WVT?
Hi,
I work with a text collection containing a very large number of words, and some of them, such as "label" and "id", are already used as attribute names in the example set. I am getting warnings like the one below from TextInput, and I wonder whether there is an easy way to prefix all attribute names that originate from words (similar to the -P option of Weka's StringToWordVector).
Right now I don't believe it causes much trouble, but maybe I just missed some option in WVT TextInput to fix it.
[Warning] TextInput: The original example example set already contains an attribute named "label".
This is likely to cause trouble. Please rename the attribute in the original example set.
Thank you very much!
/kirke
Answers
One way to add a prefix is to put a "TokenReplace" operator inside "TextInput" as a child and define the replacement with a regular expression. For example, if your words consist only of the letters a to z, you can use "([A-Za-z]+)" as the word pattern and "word_$1" as the replacement, where "word_" is the prefix.
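To see what that pattern and replacement do to a token, here is a minimal sketch in plain Python. It is not the TokenReplace operator itself, only an illustration of the same regular expression; the function name `prefix_token` and the sample tokens are made up for this example. Note that Python's `re` module refers to the captured group as `\1` or `m.group(1)` rather than `$1`.

```python
import re

# Same word pattern as suggested above: one or more ASCII letters.
WORD_PATTERN = re.compile(r"([A-Za-z]+)")


def prefix_token(token: str, prefix: str = "word_") -> str:
    """Prepend `prefix` to every run of ASCII letters in `token`.

    Hypothetical helper for illustration; not part of any RapidMiner API.
    """
    # A function replacement is used so the prefix is inserted literally,
    # without having to escape backslashes or group references in it.
    return WORD_PATTERN.sub(lambda m: prefix + m.group(1), token)


tokens = ["label", "id", "mining"]
prefixed = [prefix_token(t) for t in tokens]
print(prefixed)  # ['word_label', 'word_id', 'word_mining']
```

A token that mixes letters with other characters contains more than one letter run, so it would get the prefix more than once (for example "abc1def" becomes "word_abc1word_def"). That is why the pattern is only suitable if the words really consist of letters alone.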
I hope this helps a bit.