How to interpret a ROC plot?
Hi,
I generated a ROC plot with the process given below. I assume that the red line (ROC) is the proportion of TP against the proportion of FP, but I can't understand what the blue line (ROC (Thresholds)) represents. Can anyone explain?
Regards,
Carlos
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExcelExampleSource" class="ExcelExampleSource">
        <parameter key="excel_file" value="/Users/csoares/Documents/Ensino/DBM/Materiais/Catalog_multi_aula.xls"/>
        <parameter key="sheet_number" value="4"/>
        <parameter key="first_row_as_names" value="true"/>
        <parameter key="create_label" value="true"/>
        <parameter key="label_column" value="2"/>
        <parameter key="create_id" value="true"/>
    </operator>
    <operator name="SimpleValidation" class="SimpleValidation" expanded="yes">
        <parameter key="keep_example_set" value="true"/>
        <parameter key="create_complete_model" value="true"/>
        <operator name="NaiveBayes" class="NaiveBayes">
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <parameter key="keep_model" value="true"/>
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="ClassificationPerformance" class="ClassificationPerformance">
                <parameter key="keep_example_set" value="true"/>
                <parameter key="accuracy" value="true"/>
                <list key="class_weights">
                </list>
            </operator>
            <operator name="ROCChart" class="ROCChart">
                <parameter key="use_model" value="false"/>
            </operator>
        </operator>
    </operator>
</operator>
Answers
I have the same question. I'm sure it's a simple answer, but I can't find an explanation in the documentation.
Thanks!
Hello @michaelgloven - for these kinds of fundamental data science background topics I usually go "old school" with books (yes, paper). My go-to texts are "Data Mining for the Masses" by Dr. Matthew North, and "Predictive Analytics and Data Mining" by Kotu & Deshpande. Both are excellent and are full of explicit examples using RapidMiner. For your question about ROC curves, Chapter 8 of Kotu & Deshpande is all about Model Evaluation, and it starts with a long explanation of ROC.
Scott
Many thanks, Scott. Figure 8.5 on page 269 of the Kotu book also has the ROC (thresholds) curve without explanation. It looks like the inverse of the ROC curve. There is probably a simple explanation, but it is still a mystery to me. I'll check out your second resource suggestion.
Mike
Hi @michaelgloven,
Each point of the ROC curve is the rate of true positives (or proportion of TP, as it is called in the first post) vs. the rate of false positives (proportion of FP) for a specific threshold applied to the confidence of the corresponding classifier.
The ROC (thresholds) curve just shows this confidence threshold (sometimes also called confidence cut).
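To make this concrete, here is a minimal Python sketch of the idea, not RapidMiner's actual implementation. The function name `roc_points` and the toy labels and confidences are made up for illustration. For each confidence cut it counts the examples predicted positive, which gives one (FPR, TPR) point of the ROC curve together with the threshold that produced it.

```python
def roc_points(labels, confidences):
    """Return (threshold, fpr, tpr) triples, one per distinct confidence cut.

    labels: 1 for the positive class, 0 for the negative class.
    confidences: classifier confidence for the positive class.
    Hypothetical helper for illustration only.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Sweep the cut from the highest confidence down to the lowest.
    for threshold in sorted(set(confidences), reverse=True):
        # An example is predicted positive when its confidence reaches the cut.
        tp = sum(1 for y, c in zip(labels, confidences) if c >= threshold and y == 1)
        fp = sum(1 for y, c in zip(labels, confidences) if c >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


# Toy data (invented): six examples, three positive and three negative.
labels = [1, 1, 0, 1, 0, 0]
confidences = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for threshold, fpr, tpr in roc_points(labels, confidences):
    print(f"threshold={threshold:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting TPR against FPR gives the ROC curve. Assuming the chart draws the thresholds line against the same false-positive axis, plotting the threshold column against FPR gives a line that falls as the ROC curve rises, which would explain why it looks like an inverse of the ROC curve.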
Hope this helps,
Best regards,
Fabian
Fabian, appreciate your explanation on the thresholds...I figured it was simple, but needed an expert to point it out!