How to format a dataset for RapidMiner
Hello there,
I am absolutely new to RapidMiner, so here is a problem I can't really solve at the moment.
I have a huge dataset in an Excel file (about 300 rows and 100 columns). When I try to use this Excel file as my input file (via the "start data loading wizard"), it doesn't identify each column properly but splits them into many separate columns in RapidMiner. Any suggestions on how to solve this easily?
Thanks
[edit1] Now each column gets identified properly (using the .csv format type), but between each pair of relevant columns I now get a column containing the string ",", and I just can't get rid of it by modifying my .csv file. Any suggestions?
[edit2] OK, now it's getting strange. In column 54 I have a title that contains an opening " and a closing " in the middle of the string. I deleted them because they caused this column to be separated into three different columns, but after deleting the two " characters I get the problem described in [edit1]. Does anyone know why this is so?
When I retype the two " characters in that column, the problem described in [edit1] still exists. I am confused?!
[edit3] I would also like to know how to get the "Input" into the tree view. It is shown in the online tutorial, but I can't find it when building my own setup.
Answers
There's a very simple way of loading Excel files into RapidMiner: just use the ExcelExampleSource. It reads your .xls file directly, without the usual quoting problems of .csv files.
To insert the ExcelExampleSource into the operator tree, select it in the new operator tab on the right and drag it into the tree on the left.
Greetings,
Sebastian
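As a rough illustration of the .csv quoting problems mentioned in the answer, here is a minimal Python sketch using only the standard `csv` module. It is not how RapidMiner parses files. The column titles are made up for the example, and the assumption is simply that a title containing both a comma and embedded " characters (like the one in column 54 of the question) is what confuses a comma-based import.

```python
import csv
import io

# Hypothetical header row: the second title contains a comma and embedded quotes.
titles = ["id", 'size, "large" items', "price"]

# Naive export: join the titles with commas, without any quoting.
naive_line = ",".join(titles)

# Naive import: splitting on every comma breaks the second title apart.
naive_fields = naive_line.split(",")

# Proper export: csv.writer wraps the field in quotes and doubles the embedded quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(titles)
quoted_line = buffer.getvalue().strip()

# Proper import: csv.reader recovers exactly the original three titles.
parsed_fields = next(csv.reader(io.StringIO(quoted_line)))

print(len(naive_fields), naive_fields)
print(quoted_line)
print(len(parsed_fields), parsed_fields)
```

The naive split yields four fields instead of three, while the quoted line round-trips cleanly. Deleting or retyping quotes by hand in a .csv file can easily leave it in an inconsistent state, which is one reason reading the .xls file directly is the simpler route.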