Working with SPSS & RapidMiner
JEdward
RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
Hi all,
I was wondering if it is possible to work with a combination of SPSS & RapidMiner.
Is it possible to create a model in SPSS and then export it with PMML (or another format) and then open this model into RapidMiner to be worked on, or vice versa?
It would certainly save time without having to rebuild historically created models from scratch in RM & could also enable colleagues using the different systems to collaborate.
Answers
Ernesto
Here is what the wiki says:
----------
The Predictive Model Markup Language (PMML) is an XML-based markup language developed by the Data Mining Group (DMG) to provide a way for applications to define models related to predictive analytics and data mining and to share those models between PMML-compliant applications.
PMML provides applications a vendor-independent method of defining models so that proprietary issues and incompatibilities are no longer a barrier to the exchange of models between applications. It allows users to develop models within one vendor's application and use other vendors' applications to visualize, analyze, evaluate or otherwise use the models. Previously, this was very difficult, but with PMML, the exchange of models between compliant applications is straightforward.
Since PMML is an XML-based standard, the specification comes in the form of an XML schema.
----------
Sounds cool, but not sure I can think of a project where I would use PMML.
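To make the "XML-based" part concrete, here is a minimal sketch of reading a PMML file with Python's standard library. The document below is a hypothetical hand-written example (a one-variable regression); the field names `x` and `y` and the helper names `local_name` and `summarize` are assumptions made for illustration, not something exported by SPSS or RapidMiner.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML document for illustration only.
PMML_DOC = """
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="hypothetical example model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()

# Top-level PMML elements that are not models themselves.
NON_MODEL_ELEMENTS = {
    "Header", "MiningBuildTask", "DataDictionary",
    "TransformationDictionary", "Extension",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize(pmml_text):
    """Return the PMML version, declared data fields, and model element names."""
    root = ET.fromstring(pmml_text)
    fields, models = [], []
    for child in root:
        name = local_name(child.tag)
        if name == "DataDictionary":
            fields = [f.get("name") for f in child
                      if local_name(f.tag) == "DataField"]
        elif name not in NON_MODEL_ELEMENTS:
            models.append(name)
    return {"version": root.get("version"), "fields": fields, "models": models}


summary = summarize(PMML_DOC)
print(summary)
```

This only inspects the file. It says nothing about whether a given tool can import or execute the model it describes.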
Unfortunately, we are currently not able to import PMML models. Actually, I don't think it makes a lot of sense in the current state, because most of the brainwork always goes into preprocessing the data. At least in my experience, it is almost never the case that you can apply a PMML-supported model directly to the data.
If you want to use RapidMiner in this fashion anyway, you might contact us to find out whether we could make it possible.
Greetings,
Sebastian
I think I'll definitely look into ADAPA for scoring some more. Their cloud-based model sounds appealing to me.
One question: although RapidMiner isn't a scoring engine, would I be correct in saying that RapidAnalytics is suited for scoring using models created with RapidMiner?
Sebastian, I'll also send you an email regarding what we can 'make possible'.
Well, RapidMiner is not a scoring engine in the sense of "load in arbitrary PMML models and score our database", but of course you can use RapidMiner, as well as RapidAnalytics, to score your database with models natively created by RapidMiner (or Weka or R). So if you already have your models (in PMML), then going for ADAPA or other engines which can only be used for scoring might indeed be the simplest solution.
But things change, of course, if you also want to create such models or apply data transformations, as Sebastian has pointed out. In that case, the models and preprocessing models can best be applied directly from RapidMiner or RapidAnalytics, since we support many more preprocessing steps than any other solution available.
So it really depends on what you have already and what your data analysis looks like. But in simple scenarios with little data transformation and ETL, where you only create a model and apply it to large data sets, a combination of RapidMiner / RapidAnalytics with a dedicated scoring engine like ADAPA might indeed be the best choice. From my experience, the world is unfortunately not so simple in many cases.
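To illustrate the split between preprocessing and scoring, here is a rough Python sketch. It is not how ADAPA, RapidMiner, or RapidAnalytics work internally; it hand-evaluates a hypothetical PMML linear regression, and the field names (`income_k`, `age`), the coefficients, and the `preprocess` step are all made up for the example. The point is that the PMML file carries the model, while the transformation from raw records to model inputs has to happen somewhere else.

```python
import xml.etree.ElementTree as ET

NS = {"p": "http://www.dmg.org/PMML-4_4"}

# Hypothetical model: prediction = 0.5 + 0.25 * income_k + 0.5 * age
PMML_MODEL = """
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="hypothetical scoring example"/>
  <DataDictionary numberOfFields="3">
    <DataField name="income_k" optype="continuous" dataType="double"/>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="target" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="income_k"/>
      <MiningField name="age"/>
      <MiningField name="target" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="income_k" coefficient="0.25"/>
      <NumericPredictor name="age" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()


def load_linear_model(pmml_text):
    """Read the intercept and coefficients of a PMML linear regression."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefs = {
        pred.get("name"): float(pred.get("coefficient"))
        for pred in table.findall("p:NumericPredictor", NS)
    }
    return intercept, coefs


def preprocess(raw):
    """Hypothetical preprocessing: parse strings, rescale income to thousands.

    This step lives outside the model file in this sketch.
    """
    return {"income_k": float(raw["income"]) / 1000.0, "age": float(raw["age"])}


def score(row, intercept, coefs):
    """Apply the linear model to one preprocessed row."""
    return intercept + sum(c * row[name] for name, c in coefs.items())


intercept, coefs = load_linear_model(PMML_MODEL)
raw_record = {"income": "40000", "age": "30"}
prediction = score(preprocess(raw_record), intercept, coefs)
print(prediction)  # 0.5 + 0.25 * 40.0 + 0.5 * 30.0 = 25.5
```

If `preprocess` is skipped, `score` fails with a `KeyError` on the raw record, which is a small-scale version of the problem described above: the model alone is rarely enough.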
Just my 2c. Cheers,
Ingo