How to aggregate results
I'm processing textual data from several files and subdirectories. I need to know how frequently certain words/phrases occur, and their occurrence density. I get the results as multiple objects (an IOObjectCollection from Loop Files) and don't know how to aggregate them for further processing.
My process is:
Loop files
>> Loop Zip-file entries
---->> Read Document > Process Documents (I got example set as output but no word list!!)
----------------------------------------------------------->>Tokenize > Filter Tokens
I tried to connect this to a Set Role operator, for example, but the attribute I'm looking for (the word I'm searching for) doesn't exist in some files, and I believe this is why I get "Attribute not found". Or maybe I'm missing something here. So, any clue how to aggregate such results?
Best Answers
sgenzer (Community Manager)
Hi @Ayoube, the output of the Loop Files operator (and of many other Loop operators) is a "collection" of IOObjects. This is indicated by double wires at the output port.
For ExampleSets, you can combine them using Append IF all the ExampleSets have exactly the same number and type of attributes. Otherwise I would use the "Union with Append" building block.
Scott
tftemme (RM Research)
You can also use the new Append (SuperSet) operator from the Operator Toolbox extension (available since version 2.0). It can append ExampleSets with different attributes. It also handles the cases where the same attribute role (e.g. label) is assigned to differently named attributes, or where attributes with the same name have different types.
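
For readers who want to see what a superset-style append does to the data, here is a minimal Python sketch of the idea. It is not RapidMiner's implementation and uses no RapidMiner API. The function name `append_superset`, the sample rows, and the choice of 0 as the fill value for a word that does not occur in a file are all assumptions made for illustration.

```python
def append_superset(example_sets, fill=None):
    """Append row lists whose columns differ, filling gaps with `fill`."""
    # Collect the union of all column names, keeping first-seen order.
    columns = []
    for rows in example_sets:
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # Rebuild every row against the full column list.
    return [
        {name: row.get(name, fill) for name in columns}
        for rows in example_sets
        for row in rows
    ]


# Hypothetical per-file word counts; "rapidminer" never occurs in b.txt,
# so that column is missing from the second table.
file_a = [{"file": "a.txt", "data": 3, "rapidminer": 1}]
file_b = [{"file": "b.txt", "data": 5}]

combined = append_superset([file_a, file_b], fill=0)
for row in combined:
    print(row)
```

After the append, every row has the same columns, so a later step that looks up a specific word column no longer fails on files where that word never appeared.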