CSVExampleSource.PARAMETER_COLUMN_META_DATA
Neomatrix433
Member Posts: 8 Contributor II
Hi, I have a problem with the CSVExampleSource operator. I want to read a CSV file and pass it to the DecisionTreeLearner.
How can I tell my CSVExampleSource operator which columns are attributes and which one is the label?
This is my testdata:
Play;Outlook;Temperature;Humidity;Wind
yes;sunny;85;85.0;false
yes;overcast;80;90.0;true
no;overcast;83;78.0;false
no;rain;70;96.0;false
yes;rain;68;80.0;true
no;rain;65;70.0;true
yes;overcast;64;65.0;true
no;sunny;72;95.0;false
yes;sunny;69;70.0;false
no;sunny;75;80.0;false
yes;sunny;68;70.0;true
no;overcast;72;90.0;true
yes;overcast;81;75.0;true
no;rain;71;80.0;true
My test code, followed by its output:
import java.io.File;
import java.util.LinkedList;
import java.util.List;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Attributes;
import com.rapidminer.example.table.AttributeFactory;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.learner.tree.DecisionTreeLearner;
import com.rapidminer.operator.nio.CSVExampleSource;
import com.rapidminer.operator.preprocessing.filter.ChangeAttributeRole;
import com.rapidminer.tools.Ontology;
import com.rapidminer.tools.OperatorService;

import de.tu_berlin.mf.vlcu.utilityclasses.CSVReader;

public class ProcessCreator {

    public static Process createProcess() {
        // invoke init before using the OperatorService
        RapidMiner.setExecutionMode(ExecutionMode.EMBEDDED_WITH_UI);
        RapidMiner.init();

        String dateipath1 = "./HLB_B3_Messung/Data_Controller/messung1/Testdaten.csv";
        File file = new File(dateipath1);
        CSVReader csvreader = new CSVReader(file.getAbsolutePath());
        csvreader.readCSV();
        String[] header = csvreader.getHeader();
        String[][] data = csvreader.getData();

        Process process = null;
        try {
            // create attribute list
            List<Attribute> attributes = new LinkedList<Attribute>();
            for (int a = 0; a < header.length; a++) {
                // compare the column name, not the array itself
                if (header[a].equals("Play")) {
                    Attribute label = AttributeFactory.createAttribute("label", Ontology.NOMINAL);
                    attributes.add(label);
                } else {
                    attributes.add(AttributeFactory.createAttribute("att" + a, Ontology.REAL));
                }
            }

            // create process
            process = new Process();

            /* Reading Data */
            CSVExampleSource csvdata = OperatorService.createOperator(CSVExampleSource.class);
            // set parameters
            csvdata.setParameter(CSVExampleSource.PARAMETER_CSV_FILE, file.getAbsolutePath());
            csvdata.setParameter(CSVExampleSource.PARAMETER_FIRST_ROW_AS_NAMES, "true");
            csvdata.setParameter(CSVExampleSource.PARAMETER_COLUMN_SEPARATORS, ";");
            csvdata.setParameter(CSVExampleSource.PARAMETER_TRIM_LINES, "true");

            // mark the column "Play" as the label
            ChangeAttributeRole changerole = OperatorService.createOperator(ChangeAttributeRole.class);
            changerole.setParameter(ChangeAttributeRole.PARAMETER_NAME, "Play");
            changerole.setParameter(ChangeAttributeRole.PARAMETER_TARGET_ROLE, Attributes.LABEL_NAME);

            DecisionTreeLearner decisionTree = OperatorService.createOperator(DecisionTreeLearner.class);
            decisionTree.setParameter(DecisionTreeLearner.PARAMETER_CRITERION, "gain_ratio");
            decisionTree.setParameter(DecisionTreeLearner.PARAMETER_MINIMAL_SIZE_FOR_SPLIT, "4");
            decisionTree.setParameter(DecisionTreeLearner.PARAMETER_MINIMAL_LEAF_SIZE, "2");
            decisionTree.setParameter(DecisionTreeLearner.PARAMETER_MAXIMAL_DEPTH, "20");
            decisionTree.setParameter(DecisionTreeLearner.PARAMETER_CONFIDENCE, "0.25");

            // add the operators to the main subprocess
            process.getRootOperator().getSubprocess(0).addOperator(csvdata);
            process.getRootOperator().getSubprocess(0).addOperator(changerole);
            process.getRootOperator().getSubprocess(0).addOperator(decisionTree);

            // connect Read CSV -> Set Role -> Decision Tree
            csvdata.getOutputPorts().getPortByName("output")
                    .connectTo(changerole.getInputPorts().getPortByName("example set input"));
            changerole.getOutputPorts().getPortByName("example set output")
                    .connectTo(decisionTree.getInputPorts().getPortByName("training set"));

            // add other operators and set parameters
            // [...]
        } catch (Exception e) {
            e.printStackTrace();
        }
        return process;
    }

    public static void main(String[] argv) {
        // create process
        Process process = createProcess();
        // print process setup
        System.out.println(process.getRootOperator().createProcessTree(0));
        try {
            // perform process
            IOContainer test = process.run();
            System.out.println(test.toString());
            // to run the process with input created by your application use
            // process.run(new IOContainer(new IOObject[] { ... your objects ... }));
        } catch (OperatorException e) {
            e.printStackTrace();
        }
    }
}
Process[0] (Process)
subprocess 'Main Process'
+- Read CSV[0] (Read CSV)
+- Set Role[0] (Set Role)
+- Decision Tree[0] (Decision Tree)
IOContainer (0 objects):
How can I tell the CSV operator whether a column is nominal, binominal, real or integer?
How can I add metadata to the CSVExampleSource so that the Decision Tree operator builds the tree on the label Play?
I need urgent help; this is for my Bachelor's thesis.
And designing the process in the RapidMiner GUI is not an option!!
Sorry for my bad English!!
Kind regards
Answers
Your output looks like you forgot to connect the output port of the last operator to the process output sink.
Anyway, on to your questions:
1) You need to set the parameter "data_set_meta_data_information" on your CSV operator to match the following list
2) Metadata is for process design help. As you are not using the RapidMiner design perspective to design your process, you won't need it
3) If you have exactly one label in your data, the DecisionTree will automatically use it
As a general rule, if you need help on how certain parameters are named or how to use them, go to the RapidMiner design perspective, set up the operator in question there and then have a look at the process XML (XML tab next to the process). The names and values of all parameters can be seen there.
Regards,
Marco
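To illustrate the "look at the process XML" tip, here is a minimal sketch that lists every parameter key and value found in a process XML string, using only the Java standard library. The sample XML is hand-written for this example and only imitates the general shape of an exported process; the operator class name, the keys and the "name.true.type.role" values are assumptions, so compare them with what your own XML tab shows. ProcessXmlParameters and listParameters are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class ProcessXmlParameters {

    // Hand-written sample (an assumption, not a real RapidMiner export).
    static final String SAMPLE_XML =
          "<process>"
        + "<operator name=\"Read CSV\" class=\"read_csv\">"
        + "<parameter key=\"column_separators\" value=\";\"/>"
        + "<list key=\"data_set_meta_data_information\">"
        + "<parameter key=\"0\" value=\"Play.true.nominal.label\"/>"
        + "<parameter key=\"1\" value=\"Outlook.true.nominal.attribute\"/>"
        + "</list>"
        + "</operator>"
        + "</process>";

    /** Returns one "key = value" line per <parameter> element, in document order. */
    static List<String> listParameters(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> lines = new ArrayList<>();
            NodeList params = doc.getElementsByTagName("parameter");
            for (int i = 0; i < params.getLength(); i++) {
                Element p = (Element) params.item(i);
                Element parent = (Element) p.getParentNode();
                // Entries inside a <list> are prefixed with the key of that list.
                String prefix = parent.getTagName().equals("list")
                        ? parent.getAttribute("key") + "/" : "";
                lines.add(prefix + p.getAttribute("key") + " = " + p.getAttribute("value"));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse process XML", e);
        }
    }

    public static void main(String[] args) {
        for (String line : listParameters(SAMPLE_XML)) {
            System.out.println(line);
        }
    }
}
```

Pasting the XML of a process designed in the GUI into such a helper gives a quick overview of which keys a hand-coded process would have to set.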
I looked in the XML perspective and picked this out:
Then I ran the code, and the output is:
Process[0] (Process)
subprocess 'Main Process'
+- Read CSV[0] (Read CSV)
+- Decision Tree[0] (Decision Tree)
01.03.2013 14:14:33 com.rapidminer.tools.WrapperLoggingHandler log
INFO: No filename given for result file, using stdout for logging results!
01.03.2013 14:14:33 com.rapidminer.Process run
INFO: Process starts
com.rapidminer.operator.UserError: Input example set does not have a label attribute
at com.rapidminer.operator.learner.AbstractLearner.doWork(AbstractLearner.java:139)
at com.rapidminer.operator.Operator.execute(Operator.java:834)
at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
at com.rapidminer.operator.Operator.execute(Operator.java:834)
at com.rapidminer.Process.run(Process.java:925)
at com.rapidminer.Process.run(Process.java:848)
at com.rapidminer.Process.run(Process.java:807)
at com.rapidminer.Process.run(Process.java:802)
at com.rapidminer.Process.run(Process.java:792)
at de.tu_berlin.mf.vlcu.test.ProcessCreator.main(ProcessCreator.java:95)
What is wrong? Is the connection in this line correct? I guess the line is not correct.
This is what I can read in the error message: Input example set does not have a label attribute.
Can anyone show me a code sample that does this correctly?
Thanks and regards
Obviously you cannot use the XML snippet I quoted as a parameter value; I merely pointed out the names of the key/value pairs for the list parameter. See the ParameterTypeList class and its use in the RapidMiner source code for further details. I guess you will need to dig quite a bit through the RapidMiner source code, seeing as for some reason you don't want to use the much less tedious and less error-prone approach of designing your processes in the GUI instead of coding everything by hand from scratch.
Regards,
Marco
IMHO the best tool is a good IDE for navigation, but relevant commit messages would be a welcome change.
PS: my fork is a bit out of date; I have not yet imported the changes since 5.3.5.
and the vision is ok. I use the search function of my IDE; Eclipse is very helpful 8) for finding the relevant places in the RapidMiner code.
Is the difference between 5.2 and 5.3.5 that significant with regard to my problem?
Best regards from Berlin ;D
Please give me a hint on where to find this in the code; I guess it's part of the solution to my problem.
That is also a parameter of the CSVExampleSource operator, and it is the one I showed you earlier. You will have to create a ParameterTypeList object, fill it with the given key/value pairs and then add that object to the CSVExampleSource instance under the "data_set_meta_data_information" key.
Regards,
Marco
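As a rough sketch of what "fill it with the given key/value pairs" could look like for the Play data, the helper below builds the pairs as a plain list of String[] rather than a ParameterTypeList object, so it runs without RapidMiner on the classpath. The value layout "name.true.type.role", the type names and the role names are assumptions modeled on what the XML tab of a GUI-designed process typically shows; MetaDataRows, row and forPlayData are hypothetical names. Check every value against your own process XML.

```java
import java.util.ArrayList;
import java.util.List;

final class MetaDataRows {

    /** One list entry: key = column index, value = "name.selected.valueType.role" (assumed layout). */
    static String[] row(int columnIndex, String name, String valueType, String role) {
        return new String[] {
            String.valueOf(columnIndex),
            name + ".true." + valueType + "." + role
        };
    }

    /** Entries for the five columns of the Play test data, in file order. */
    static List<String[]> forPlayData() {
        List<String[]> rows = new ArrayList<>();
        rows.add(row(0, "Play", "nominal", "label"));
        rows.add(row(1, "Outlook", "nominal", "attribute"));
        rows.add(row(2, "Temperature", "integer", "attribute"));
        rows.add(row(3, "Humidity", "real", "attribute"));
        rows.add(row(4, "Wind", "binominal", "attribute"));
        return rows;
    }

    public static void main(String[] args) {
        for (String[] r : forPlayData()) {
            System.out.println(r[0] + " -> " + r[1]);
        }
        // Assumed hand-off (not verified against a specific RapidMiner version):
        // csvdata.setListParameter("data_set_meta_data_information", forPlayData());
    }
}
```

If the hand-off call in the comment is not available in your version, the ParameterTypeList class mentioned above is the place to look for how a list of pairs is turned into a parameter value.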
I have changed the code, and the result is:
Process[0] (Process)
subprocess 'Main Process'
+- Read CSV[0] (Read CSV)
+- Decision Tree[0] (Decision Tree)
01.03.2013 20:27:44 com.rapidminer.tools.WrapperLoggingHandler log
INFO: No filename given for result file, using stdout for logging results!
01.03.2013 20:27:44 com.rapidminer.Process run
INFO: Process starts
01.03.2013 20:27:44 com.rapidminer.Process run
INFO: Process finished successfully after 0 s
IOContainer (0 objects):
The problem with the label and attributes looks solved...
But the result is still IOContainer (0 objects). What is wrong now?
I am a nervous wreck!
Could this help you to solve this problem?
HTH, gabor
The IOContainer is still empty!! This is the output of cachedExampleSet:
SimpleExampleSet:
14 examples,
4 regular attributes,
special attributes = {
label = #0: Play (nominal/single_value)/values=[yes, no]
}
The correct output should be:
IOContainer (2 objects):
Humidity > 92.500: no {yes=0, no=2}
Humidity ≤ 92.500: yes {yes=7, no=5}
(created by Decision Tree)
SimpleExampleSet:
14 examples,
4 regular attributes,
special attributes = {
label = #0: Play (nominal/single_value)/values=[yes, no]
}
(created by Read CSV)
That is what I get when I create the process in RapidMiner, export it, and run it from my own code!
Sorry, no experience in this regard, but this might help.
Cheers, gabor
Create a simple process in RapidMiner:
and export the process with RapidMiner to the file DecisionTreeprozess.rmp
Then use the process in my own code. Output of this code:
Read CSV
Decision Tree
Process[0] (Process)
subprocess 'Main Process'
+- Read CSV[0] (Read CSV)
+- Decision Tree[0] (Decision Tree)
02.03.2013 14:18:48 com.rapidminer.tools.WrapperLoggingHandler log
INFO: No filename given for result file, using stdout for logging results!
02.03.2013 14:18:48 com.rapidminer.Process run
INFO: Process \Anwendungsentwicklung\eclipse 64 Bit\workspace\VLCU_Neu\.\HLB_B3_Messung\Data_Controller\messung1\DecisionTreeprozess.rmp starts
02.03.2013 14:18:48 com.rapidminer.Process run
INFO: Process \Anwendungsentwicklung\eclipse 64 Bit\workspace\VLCU_Neu\.\HLB_B3_Messung\Data_Controller\messung1\DecisionTreeprozess.rmp finished successfully after 0 s
IOContainer (2 objects):
Humidity > 92.500: no {yes=0, no=2}
Humidity ≤ 92.500: yes {yes=7, no=5}
(created by Decision Tree)
SimpleExampleSet:
14 examples,
4 regular attributes,
special attributes = {
label = #0: Play (nominal/single_value)/values=[yes, no]
}
(created by Read CSV)
Thanks to all helpers!!!
Please close this thread.