
[SOLVED] Rename regular attributes generated by Text Processing

Ruca Member Posts: 13 Contributor II
edited November 2018 in Help
Hi all,

I'm new to RapidMiner, and I hope I'm posting this in the right place. First of all, let me congratulate the support team on launching this forum. I hope I can also help solve other people's issues.
Now, to my problem.

I'm using the Text Processing module to create term frequency vectors weighted with TF-IDF.
The idea is to have one vector for each document processed.
Everything is working fine, but instead of the filename of each document as the heading of its vector, I get generic labels (att_1, att_2, att_3, etc.), so I cannot associate each vector with its document.
My objective is to have one column per document and one row per frequent word.
I had to apply a Transpose operation to get the frequent words on rows and the documents on columns.

Can anyone give me a hint on how to rename the labels att_1, att_2, etc. to the correct filenames?

Thank you very much for your help.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Hi, I suppose you are using Process Documents from Files. This operator should keep the filenames of the documents at its output. Use Set Role to define that attribute as id before applying the Transpose operator: the Transpose operator uses the id attribute as column names in the transposed example set.

    Best,
    Marius
  • Ruca Member Posts: 13 Contributor II
    Hi Marius,

    Thank you very much for your clarification.
    As you mentioned, the filenames are kept, but they are at the end of the example set, which is why I had not noticed them.

    Best regards,
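
For readers who want to see the idea behind the accepted answer outside the RapidMiner GUI, here is a minimal sketch in Python with pandas. It is an analogy rather than RapidMiner code: the table, the term columns, the column name `metadata_file`, and the helper `transpose_with_ids` are all hypothetical, chosen only for illustration. Marking the filename column as the row identifier before transposing plays the role that Set Role (id) plays before Transpose.

```python
import pandas as pd

# Hypothetical TF-IDF table: one row per document, one column per term.
# The filename sits in the last column, much like the attribute Ruca
# found at the end of the example set. All names and values are made up.
tfidf = pd.DataFrame(
    {
        "apple": [0.7, 0.0, 0.3],
        "banana": [0.0, 0.9, 0.4],
        "cherry": [0.2, 0.1, 0.0],
        "metadata_file": ["doc_a.txt", "doc_b.txt", "doc_c.txt"],
    }
)


def transpose_with_ids(frame, id_column):
    """Use id_column as the row labels, then transpose.

    After the transpose, the former row labels (the filenames) become
    the column headers and each term becomes a row.
    """
    return frame.set_index(id_column).T


by_document = transpose_with_ids(tfidf, "metadata_file")
print(by_document)
```

Without the `set_index` step, the transposed columns would carry generic positional labels (0, 1, 2), which is the pandas counterpart of the att_1, att_2 headers described in the question.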