Precision Recall Curves
John_De_Jong
Member Posts: 10 Contributor II
Just as we get ROC curves (based on TPR and FPR) when using Cross Validation with a Performance operator, how can we get precision-recall curves? I don't want the averaged precision and recall, but a curve for each fold. Can anyone suggest a way, please?
Thanks
uday
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.001" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<parameter key="parallelize_main_process" value="false"/>
<process expanded="true" height="500" width="752">
<operator activated="true" class="retrieve" compatibility="5.1.001" expanded="true" height="60" name="Retrieve" width="90" x="17" y="58">
<parameter key="repository_entry" value="Acceptor3KData"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.001" expanded="true" height="76" name="Set Role" width="90" x="112" y="165">
<parameter key="name" value="class"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="replace" compatibility="5.1.001" expanded="true" height="76" name="Replace" width="90" x="313" y="210">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="class"/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="nominal"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="file_path"/>
<parameter key="block_type" value="single_value"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="single_value"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="replace_what" value="[0]"/>
<parameter key="replace_by" value="1"/>
</operator>
<operator activated="true" class="x_validation" compatibility="5.1.001" expanded="true" height="130" name="Validation" width="90" x="447" y="75">
<parameter key="create_complete_model" value="false"/>
<parameter key="average_performances_only" value="false"/>
<parameter key="leave_one_out" value="false"/>
<parameter key="number_of_validations" value="5"/>
<parameter key="sampling_type" value="stratified sampling"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<parameter key="parallelize_training" value="false"/>
<parameter key="parallelize_testing" value="false"/>
<process expanded="true" height="500" width="351">
<operator activated="true" class="fast_large_margin" compatibility="5.1.001" expanded="true" height="76" name="Fast Large Margin" width="90" x="130" y="110">
<parameter key="solver" value="L2 SVM Dual"/>
<parameter key="C" value="1.0"/>
<parameter key="epsilon" value="0.01"/>
<list key="class_weights"/>
<parameter key="use_bias" value="true"/>
</operator>
<connect from_port="training" to_op="Fast Large Margin" to_port="training set"/>
<connect from_op="Fast Large Margin" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="108"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="500" width="351">
<operator activated="true" class="apply_model" compatibility="5.1.001" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance_binominal_classification" compatibility="5.1.001" expanded="true" height="76" name="Performance" width="90" x="179" y="30">
<parameter key="main_criterion" value="accuracy"/>
<parameter key="accuracy" value="true"/>
<parameter key="classification_error" value="true"/>
<parameter key="kappa" value="true"/>
<parameter key="AUC (optimistic)" value="true"/>
<parameter key="AUC" value="true"/>
<parameter key="AUC (pessimistic)" value="true"/>
<parameter key="precision" value="true"/>
<parameter key="recall" value="true"/>
<parameter key="lift" value="false"/>
<parameter key="fallout" value="false"/>
<parameter key="f_measure" value="true"/>
<parameter key="false_positive" value="true"/>
<parameter key="false_negative" value="true"/>
<parameter key="true_positive" value="true"/>
<parameter key="true_negative" value="true"/>
<parameter key="sensitivity" value="true"/>
<parameter key="specificity" value="true"/>
<parameter key="youden" value="false"/>
<parameter key="positive_predictive_value" value="true"/>
<parameter key="negative_predictive_value" value="true"/>
<parameter key="psep" value="false"/>
<parameter key="skip_undefined_labels" value="true"/>
<parameter key="use_example_weights" value="true"/>
</operator>
<operator activated="true" class="log" compatibility="5.1.001" expanded="true" height="76" name="Log" width="90" x="179" y="165">
<parameter key="filename" value="C:\Output.log"/>
<list key="log">
<parameter key="recall" value="operator.Validation.value.performance1"/>
<parameter key="precision" value="operator.Validation.value.performance2"/>
</list>
<parameter key="sorting_type" value="none"/>
<parameter key="sorting_k" value="100"/>
<parameter key="persistent" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<connect from_op="Log" from_port="through 1" to_port="averagable 2"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
<portSpacing port="sink_averagable 3" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Replace" to_port="example set input"/>
<connect from_op="Replace" from_port="example set output" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="training" to_port="result 1"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
Answers
Thanks again
Johan
Sorry for the delay, but we are very busy right now with many projects. Unfortunately we cannot spend as much time supporting community members as we would like to.
Currently there is no integrated way of building such curves. There are three possibilities: you could implement it yourself and possibly contribute the code to the public, giving something back to the community; you could ask us for a quote to implement it for you; or you could build a process that creates such a plot as its outcome.
If you do it completely analogously to the AUROC, you would first sort by one of the confidences and then increase a counter whenever the prediction was correct. You could derive a data set from this, but it does require some macro generation, the Set Data operator, and Loop Examples. It will be a quite sophisticated process...
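For illustration only, here is a minimal sketch of that counting idea in plain Java, outside RapidMiner. The confidences and labels below are placeholder values and the class name is made up; the sketch sorts the test examples by the confidence of the positive class and emits one (recall, precision) point per example, which is the same sweep used for ROC construction.

import java.util.Arrays;

// Sketch: build precision-recall points from (confidence, isPositive) pairs
// by sorting on the positive-class confidence and counting true/false positives.
public class PRCurveSketch {

    public static void main(String[] args) {
        // Placeholder data: confidence of the positive class and the true label.
        final double[] confidence = {0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10};
        final boolean[] positive  = {true, true, false, true, false, true, false, false};

        // Sort example indices by confidence, highest first.
        Integer[] order = new Integer[confidence.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(confidence[b], confidence[a]));

        int totalPositives = 0;
        for (boolean p : positive) if (p) totalPositives++;

        int tp = 0, fp = 0;
        System.out.println("recall\tprecision");
        for (int idx : order) {
            if (positive[idx]) tp++; else fp++;
            double precision = tp / (double) (tp + fp);
            double recall = tp / (double) totalPositives;
            System.out.printf("%.3f\t%.3f%n", recall, precision);
        }
    }
}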
Greetings,
Sebastian
I found a way to do it; here it is for others:
1. Modified the LibLinear Java code in the predict method to pass the label and print the decision value for the label
2. Initialized predictionOutputWriter to write to a data file
3. Used http://mark.goadrich.com/programs/AUC/ to get precision-recall curves or accuracy curves from the above predictions
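For anyone wanting to reproduce steps 1 and 2 outside the bundled LibLinear code, here is a rough sketch of just the logging part in plain Java. The class, method, and file names are illustrative placeholders, not the actual RapidMiner or LibLinear API; it only shows writing one true label and one decision value per test example to a tab-separated file that an external curve tool can read.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Sketch: after scoring each test example, persist its true label together with
// the decision value for the positive class, one example per line.
public class PredictionLoggerSketch {

    public static void logPredictions(String path, int[] trueLabels, double[] decisionValues)
            throws IOException {
        try (PrintWriter out = new PrintWriter(new FileWriter(path))) {
            for (int i = 0; i < trueLabels.length; i++) {
                // One line per example: true label, then the decision value.
                out.printf("%d\t%.6f%n", trueLabels[i], decisionValues[i]);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values only.
        int[] labels = {1, 1, 0, 1, 0};
        double[] scores = {1.7, 0.9, 0.4, -0.2, -1.3};
        logPredictions("predictions.tsv", labels, scores);
    }
}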
Sebastian
I would love to contribute this as a package in the summer.
Thanks
Johan
Contributing this would be great. It would probably be easier if the code were put into a single operator that generates an appropriate ResultObject, like the ROC operator does.
Greetings,
Sebastian
Hi, 5 years later; same question. Has there been any progress on this yet? Is there any way of getting a PR curve out of RapidMiner?
Has there been any progress on an operator to get the PRC out of RapidMiner?
Thanks
Narayan
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts