Read in a tab-delimited file in Java
I am new to RapidMiner and have tried using the GUI to create a simple process that checks the binary occurrence of terms in comments. My input file is tab-delimited, in the format <id>tab<comments>.
I used the 'Start Data Loading Wizard' from ExampleSource to load the data, but I need to integrate the process into a Java environment. I read in tutorial.pdf that IOContainer may be able to help, but I am not sure how to go about this.
I tried using ExampleSource directly, but it relies on the attributes file, which in my case will change on every run because I use a different source file each time. I can't possibly use the GUI to generate the aml file before each run of the Java program, so I need the program to read the source file (the tab-delimited file) directly. Is there a way for ExampleSource to do this?
I would appreciate any advice or suggestions.
By the way, is there any way to convert all letters to lower case? I found that Preprocessing.Attributes.Filter.Values has a parameter, convert_to_lowercase, but I could not get it to convert all the comments in the input file to lower case.
Thanks for all your advice.
Answers
Did you try reading the file as CSV? That should also work. Additionally, you can convert all characters to lowercase using the [tt]ToLowerCaseConverter[/tt] during the text preprocessing stage.
Kind regards,
Tobias
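If it helps while the RapidMiner side is being sorted out, here is a minimal plain-Java sketch of reading the <id>tab<comments> file directly, lower-casing each comment, and checking the binary occurrence of a term. It does not use the RapidMiner API at all: the class name TabFileSketch and the methods readComments and containsTerm are hypothetical names made up for this illustration, and UTF-8 encoding with one record per line is assumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class for illustration only; not part of RapidMiner.
final class TabFileSketch {

    // Reads "<id>\t<comments>" lines and returns id -> lower-cased comment,
    // keeping the order in which the ids appear in the input.
    public static Map<String, String> readComments(Reader source) throws IOException {
        Map<String, String> comments = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                // Limit of 2: only the first tab separates id from comment,
                // so any further tabs stay inside the comment text.
                String[] parts = line.split("\t", 2);
                String id = parts[0].trim();
                String comment = parts.length > 1 ? parts[1] : "";
                comments.put(id, comment.toLowerCase(Locale.ROOT));
            }
        }
        return comments;
    }

    // Binary occurrence: true if the term appears as a whole word in the
    // (already lower-cased) comment. Tokens are split on non-word characters.
    public static boolean containsTerm(String comment, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        for (String token : comment.split("\\W+")) {
            if (token.equals(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java TabFileSketch <file.tsv> <term>");
            return;
        }
        try (Reader file = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> entry : readComments(file).entrySet()) {
                // Print id, then 1 or 0 for presence of the term in that comment.
                int occurs = containsTerm(entry.getValue(), args[1]) ? 1 : 0;
                System.out.println(entry.getKey() + "\t" + occurs);
            }
        }
    }
}
```

Because the file is parsed at run time, nothing like an aml file has to be generated beforehand; the source path is just a command-line argument. Note that duplicate ids would overwrite each other in the map, so a list of pairs may suit your data better if ids can repeat.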