Inconsistent metadata subsequent to Apply Model operator?
It seems that if one checks the metadata in the data flow after the Apply Model operator, one finds it to be inconsistent. More precisely, in one of my experiments I expected three more columns to be mentioned in the metadata: the confidences for the classes Yes and No, and the prediction itself. None of them was included in the metadata, although all were included in the scored dataset. In particular, the two confidences were not visible to a Select Attributes operator, via which I intended to drop them before storing the scored dataset in a database.
Any comments are welcome.
Best
Dan
Answers
Again, I have to point out that the metadata transformation is only a dry run that does not look at the actual data. If the label attribute's values are not known during the metadata transformation, they cannot be inserted during the dry model application. The model's metadata simply doesn't know which class values it was built on.
Greetings,
Sebastian
Thanks to you both.

For the sake of simplicity, assume you learn a decision tree from the Iris dataset, write the model to a file, and then want to use it for scoring the Iris dataset, dropping the soft predictions / confidence attributes from the scored dataset via the Select Attributes operator, as shown below. How can one alternatively drop the confidence attributes, given that neither they nor the prediction attribute are visible to the Select Attributes operator, which makes it practically useless here?
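Here is roughly the process I mean, written out as process XML (a hand-made sketch, so the compatibility attributes and port names may deviate from what RapidMiner actually generates, and the model path is a placeholder). Since the confidence attributes are invisible in the metadata, the filter expression in Select Attributes had to be typed in by hand:

<?xml version="1.0" encoding="UTF-8"?>
<process version="5.3">
  <operator activated="true" class="process" name="Process">
    <process expanded="true">
      <operator activated="true" class="retrieve" name="Retrieve Iris">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="read_model" name="Read Model">
        <!-- placeholder path: the model file written in the earlier experiments -->
        <parameter key="model_file" value="C:/models/iris_tree.mod"/>
      </operator>
      <operator activated="true" class="apply_model" name="Apply Model"/>
      <operator activated="true" class="select_attributes" name="Select Attributes">
        <!-- typed manually, since the attributes do not show up in the metadata -->
        <parameter key="attribute_filter_type" value="regular_expression"/>
        <parameter key="regular_expression" value="confidence.*|prediction.*"/>
        <!-- prediction/confidence carry special roles, so they must be included -->
        <parameter key="include_special_attributes" value="true"/>
        <!-- keep everything that does NOT match the expression -->
        <parameter key="invert_selection" value="true"/>
      </operator>
      <connect from_op="Retrieve Iris" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Read Model" from_port="output" to_op="Apply Model" to_port="model"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_port="result 1"/>
    </process>
  </operator>
</process>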
Obviously, if one had included in the canvas the process that built the model, instead of reading it from a file, then the metadata would have contained these confidence attributes, and they would have been visible and could have been discarded with the Select Attributes operator (before saving the scored dataset). But the model is only available as a file (in a real application, for instance, the model may have been built by running several experiments until it is satisfactory).
Thanks
Dan
If you saved the model to the repository instead of to a file, the metadata would be preserved. That's exactly why we introduced the repository in the first place...

Furthermore, if you are going to put a process into production, you won't even want to start RapidMiner to get the process results. We designed RapidAnalytics to solve this issue: you can run processes there manually, on a time schedule, or expose them as a web service. The latter gives you a very easy way of integrating RapidMiner into your IT infrastructure. Several formats for delivering the results are supported, including XML, JSON, or directly delivering the plot.

For all of this, RapidAnalytics supports you with a so-called "remote repository" that can be accessed by all team members, with support for user rights. Given that, we regard exporting things into files as the old-fashioned baseline solution that is no longer the preferred way.

And if you really want to have control over the files, take a look at the local repository directory: the content is simply stored in files, but the repository connects it with the metadata.
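For illustration, a minimal sketch of the change (repository paths are placeholders, and the exact XML attributes depend on your version): in the training process, replace Write Model with Store, and in the scoring process, replace Read Model with Retrieve.

<!-- training process: store the model in the repository, so its metadata is kept -->
<operator activated="true" class="store" name="Store Model">
  <parameter key="repository_entry" value="//LocalRepository/models/iris_tree"/>
</operator>

<!-- scoring process: retrieve the model together with its metadata -->
<operator activated="true" class="retrieve" name="Retrieve Model">
  <parameter key="repository_entry" value="//LocalRepository/models/iris_tree"/>
</operator>

With the model coming from the repository, the prediction and confidence attributes show up in the metadata again, and Select Attributes can offer them in its selection dialog as usual.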
Greetings,
Sebastian
Best,
Dan