"Read CSV to example set"
Hi,
I'm just beginning to experiment with RapidMiner and am having trouble with the "Read CSV" operator.
I can output the data to res (and see the ExampleSet), but when other operators require an example set as input, no data is available. Is this a limitation of Read CSV, or is there a way to make the data available as an example set?
Regards.
Answers
Start RapidMiner and go to Help -> Tutorial. That will load runnable examples, so you get some idea of what RM can and cannot do. Believe me, it saves time in the long run!
If your operator provides an example set to the results port of the process, it will do the same for other operators. Did you check the connection from the output port of "Read CSV" to the input port of the following operator? You might want to post your process (the code from the XML tab) here to reveal possible mistakes in the process design.
Regards
Matthias
Many thanks for your quick reply.
Here is the code (nothing fancy). Doesn't work with CSV Reader but works well with Read Excel or Retrieve.
When you modify a file that has been stored as a Data Table in the repository, do you know how to automatically update this Data Table?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true" height="426" width="673">
      <operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" height="60" name="Read CSV" width="90" x="45" y="120">
        <parameter key="csv_file" value="D:\Data.csv"/>
        <parameter key="date_format" value="yyyyMMdd"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="locale" value="French"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Date.true.date.id"/>
          <parameter key="1" value="Data.true.integer.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" height="76" name="Windowing" width="90" x="179" y="30">
        <parameter key="horizon" value="1"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="Data"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Thank you for your insight. I studied this tutorial last week, and the resource is indeed amazingly powerful and educational. But I haven't found an answer to my current problem. I've posted the code, but I don't think it will help. You can try it yourself with a very simple CSV file: when you hover the mouse cursor over the operator output, it indicates "number of examples=-1".
Regards
Cheers,
Ingo
When I execute the process, it works fine for displaying the data (even though the port shows number of examples=-1). But when I add a Windowing operator, which requires the number of examples to be greater than the horizon (set to 1), it fails.
Cheers
1. Load the data with "Read CSV", add an operator "Store" and save the data set directly again in your repository.
2. Drag the freshly saved data from your repository (it will be transformed into a new operator named "Retrieve" which will load the data for you from the repository)
Try again with this data set loaded with "Retrieve". Expected behaviour: everything works as expected. Reason for your confusion: search the forum for "Repository" and "meta data". Best solution for you: book a training at Rapid-I; it will definitely help.
This would probably also be the best option if you do not know what I mean by "Repository" ;D
Cheers,
Ingo
P.S. (for the more experienced readers here...): I never expected that this definitely very unique and innovative feature of RapidMiner called "meta data propagation" would cause so much uncertainty for some users. I am open to all suggestions on how we could make the difference between "meta data" and "actual data" clearer, and on why it is sometimes impossible to provide meta data (as for CSV files...)
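For readers who want to try the two steps above, here is a rough sketch of how they might look in the XML tab. It is only an illustration built from the process posted earlier in this thread, not a tested process: the repository path "//Local Repository/Data" is hypothetical, the operator classes "store" and "retrieve", the "repository_entry" parameter key and the Store port name "input" are given as I remember them from RapidMiner 5 and should be checked against your own XML tab, and the layout attributes (height, width, x, y) are left out for brevity.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.006">
  <operator activated="true" class="process" compatibility="5.1.006" expanded="true" name="Process">
    <process expanded="true">
      <!-- Step 1: read the CSV file and store it in the repository -->
      <operator activated="true" class="read_csv" compatibility="5.1.006" expanded="true" name="Read CSV">
        <parameter key="csv_file" value="D:\Data.csv"/>
        <parameter key="date_format" value="yyyyMMdd"/>
        <parameter key="locale" value="French"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="Date.true.date.id"/>
          <parameter key="1" value="Data.true.integer.attribute"/>
        </list>
      </operator>
      <operator activated="true" class="store" compatibility="5.1.006" expanded="true" name="Store">
        <!-- hypothetical repository location; use your own -->
        <parameter key="repository_entry" value="//Local Repository/Data"/>
      </operator>
      <!-- Step 2: load the stored entry, which carries meta data, and window it -->
      <operator activated="true" class="retrieve" compatibility="5.1.006" expanded="true" name="Retrieve">
        <!-- must point to the same (hypothetical) entry written by Store -->
        <parameter key="repository_entry" value="//Local Repository/Data"/>
      </operator>
      <operator activated="true" class="series:windowing" compatibility="5.1.002" expanded="true" name="Windowing">
        <parameter key="horizon" value="1"/>
        <parameter key="window_size" value="1"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_attribute" value="Data"/>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_op="Store" to_port="input"/>
      <connect from_op="Retrieve" from_port="output" to_op="Windowing" to_port="example set input"/>
      <connect from_op="Windowing" from_port="example set output" to_port="result 1"/>
    </process>
  </operator>
</process>

Because "Read CSV" and "Store" run each time the process is executed, the stored entry should be refreshed from the CSV file on every run, provided "Store" is executed before "Retrieve". If in doubt, split the two steps into two separate processes and run them one after the other.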
just to be sure... you didn't use the "Window Document" operator after "Read CSV", did you? Which operators did you try?
I hoped you would post your process with this second operator to reveal possible problems
Regards
Matthias
Just read your post at http://rapid-i.com/rapidforum/index.php/topic,2902.msg11559.html#msg11559
Frequent update of my csv files is why I don't use the repository (unless there is a way to easily and automatically update it).
I don't understand why the same data can be output when it is in xls format but not when it is in csv format. Fortunately I have found alternative ways to deal with this issue properly, but I would have preferred (it's not crucial) to output directly from Read CSV.
Many thanks for your support.
Best regards.