How to join multiple Excel sheets to combine them into one clustering (k-means)?
Hey there,
I'm trying to join several Excel sheets with the Join operator (two in this example, but the goal is to join a large number of Excel files) so that I can cluster similar documents from different datasets. My problem is that the Join operator overwrites the datasets, which are identical in structure, so the ExampleSet that arrives at the cluster operator is empty. Attached you will find the process I'm using and the datasets.
How do I solve this? Thanks in advance!
Best Answer
MartinLiebig (Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; Posts: 3,533; RM Data Scientist)
Hi,
in this case you likely want to append BEFORE Process Documents, otherwise your normalization equations are off.
Another "trick" is the operator Append (SuperSet), which will allow this operation and add missings. But I suspect this is not what you want.
Best,
Martin
- Sr. Director Data Solutions, Altair RapidMiner -
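To illustrate why appending first matters, here is a minimal Python sketch (not RapidMiner code). It assumes the normalization in question is TF-IDF-like; the sheet contents and the helper names `tokenize` and `idf` are made up for illustration. Vectorizing each sheet on its own gives each sheet its own vocabulary and its own document-frequency weights, so the same token ends up with different weights depending on which sheet it came from.

```python
import math
from collections import Counter


def tokenize(text):
    # Very simple whitespace tokenizer, for illustration only.
    return text.lower().split()


def idf(docs):
    """Inverse document frequency, log(N / df), over a list of raw texts."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        # Count each term at most once per document.
        df.update(set(tokenize(doc)))
    return {term: math.log(n / count) for term, count in df.items()}


# Hypothetical stand-ins for the text column of two Excel sheets.
sheet_a = ["pump failure report", "pump maintenance log"]
sheet_b = ["invoice for pump", "invoice overdue notice"]

idf_a = idf(sheet_a)              # weights if sheet A is processed alone
idf_b = idf(sheet_b)              # weights if sheet B is processed alone
idf_all = idf(sheet_a + sheet_b)  # weights after appending first

# Tokens that exist in only one of the two per-sheet vocabularies.
print(sorted(set(idf_a) ^ set(idf_b)))

# The same token gets three different weights.
print(idf_a["pump"], idf_b["pump"], idf_all["pump"])
```

Only the combined run produces one vocabulary and one set of weights for all documents, which is what a distance-based method such as k-means needs.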
Dortmund, Germany
Answers
My bad, you're absolutely right! Now that I've changed the process, there is another problem. Why are those empty?
~Martin
This is where it breaks. Apparently it is not possible to append ExampleSets with different attribute names (the attributes are now named after the generated text tokens; I will attach a picture for better understanding).
So my question is: How do I append those matrices to cluster them afterwards?
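For reference, the idea of appending over the union of all attribute names can be sketched in plain Python (not RapidMiner code). The function name `append_superset`, the token matrices, and the choice to fill absent tokens with 0 are assumptions for illustration: for token counts, a token that never occurs in a document can reasonably be treated as 0 rather than as a missing value.

```python
def append_superset(tables, fill=0.0):
    """Stack tables (lists of row dicts) over the union of their column names.

    Columns absent from a row are filled with `fill`.
    """
    columns = sorted({name for table in tables for row in table for name in row})
    rows = [
        [row.get(name, fill) for name in columns]
        for table in tables
        for row in table
    ]
    return columns, rows


# Hypothetical token matrices produced separately for two sheets.
matrix_a = [{"pump": 1.0, "failure": 2.0}, {"pump": 1.0, "log": 1.0}]
matrix_b = [{"invoice": 1.0, "pump": 1.0}]

columns, rows = append_superset([matrix_a, matrix_b])
print(columns)
for row in rows:
    print(row)
```

This makes the shapes line up for clustering, but if the values are normalized weights rather than raw counts, each matrix still carries weights computed on its own sheet only, so the rows are not strictly comparable. Appending the raw text before tokenizing avoids that.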