
[SOLVED] Text Mining

Rhmanig Member Posts: 9 Contributor II
edited November 2018 in Help
Hi everyone,

I am relatively new to RapidMiner and would appreciate your help with this. I have a large data set (search log data) with the following fields:

userID, Query, ItemRank, ClickURL, QueryTime
(There are multiple records with the same userID.)

I would like to use RapidMiner to extract useful information (for example, to find individuals with specific skills; hard, is it not!).

So far I have cleansed the data, and now I would like to split it by userID.

Then I want to put the data (queries) related to each userID into one document, so I can analyse each individual user's queries. But I am not sure how.

I have read the data from a CSV file, then used DataToDocument to create a document from each record (row). Now I would like to combine the documents with the same userID into one, but that does not seem to be possible.

Can you please suggest a way to achieve this?

Regards

Answers

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    Hello.  Hmmm...that's not the way I would do that.  I would import the csv and then use the "Loop Values" operator on the example set where the attribute would be your UserID.  Inside the loop, use the macro "loop_value" to then filter the example set to just have that UserID.  Then do what you like.

    Scott
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Can't you simply aggregate first, using concat and grouping by userID? Then you would need to replace the added | with a \n or something, and you are done.

    If you provide example data, I could build an example process for you.
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Rhmanig Member Posts: 9 Contributor II
    Thank you guys.
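
For readers who want to see the logic behind the "group by userID and concatenate" suggestion outside RapidMiner, here is a minimal Python sketch. It is not a RapidMiner process and not necessarily what either answer had in mind. The sample rows are made up for illustration, the function name `queries_per_user` is hypothetical, and only the column names come from the question.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; only the column names are taken from the question.
SAMPLE_CSV = """userID,Query,ItemRank,ClickURL,QueryTime
u1,python tutorial,1,http://example.com/a,2006-03-01 10:00:00
u2,cheap flights,2,http://example.com/b,2006-03-01 10:05:00
u1,pandas groupby,3,http://example.com/c,2006-03-01 10:07:00
"""


def queries_per_user(csv_text, sep="\n"):
    """Group rows by userID and join each user's queries into one document."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Collect queries in file order under their userID.
        grouped[row["userID"]].append(row["Query"])
    # One text "document" per user, queries separated by `sep`.
    return {user: sep.join(queries) for user, queries in grouped.items()}


documents = queries_per_user(SAMPLE_CSV)
for user, text in documents.items():
    print(f"--- {user} ---")
    print(text)
```

Joining with a newline directly skips the separate "replace the | with \n" step mentioned above. Passing `sep="|"` would mimic the intermediate concatenated value instead.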