Text Mining - Name Collision with special and regular attributes

text_miner · March 2010

Hi,

Since RapidMiner requires all attribute names to be unique, I've noticed a potential naming conflict when doing text mining. If a special attribute with name X exists, then a regular attribute with the same name cannot also exist (or the regular attribute gets removed when the special attribute is created). For example, the special attributes "id" and "label" are relatively common terms that may also appear in text documents.

Is there any way to specify a prefix/postfix for all special attributes (e.g., metadata_ or specattr_) so name collisions are less likely to occur? If not, could something be added to the configuration options or on the root Process node to allow for this functionality?

Thanks!

land · March 2010

Hi,
the Document processing operators will make sure that no attribute name is used twice. If words like "label" or "id" occur in the text, they will be assigned the attribute name label_0 (or label_1 if label_0 already exists). This mapping is remembered in the word list, so that the attributes get the same names when the model is applied.

Greetings,
Sebastian
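
To make the renaming scheme described above concrete, here is a minimal Python sketch. It is not RapidMiner's actual code: the helper names `unique_name` and `build_word_list` are hypothetical, and it simply assumes that a colliding word gets the first free suffix `_0`, `_1`, ... and that the word-to-attribute mapping is stored for reuse at application time.

```python
def unique_name(word, taken):
    """Return `word` if it is free, otherwise word_0, word_1, ... (hypothetical helper)."""
    if word not in taken:
        return word
    i = 0
    while f"{word}_{i}" in taken:
        i += 1
    return f"{word}_{i}"


def build_word_list(tokens, special_names):
    """Map each distinct token to an attribute name that collides with nothing."""
    taken = set(special_names)   # names already used by special attributes
    word_list = {}               # token -> attribute name, kept for application
    for tok in tokens:
        if tok not in word_list:
            name = unique_name(tok, taken)
            word_list[tok] = name
            taken.add(name)
    return word_list


# Assumed example: "id", "label" and "label_0" are already taken.
word_list = build_word_list(
    ["id", "label", "text", "label"],
    ["id", "label", "label_0"],
)
print(word_list)
```

Because the mapping is stored rather than recomputed, the same word ends up under the same attribute name when the word list is reused on new documents.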
