ETL Operations
Hi all, I'm evaluating RM (RapidMiner) for my project, including for typical ETL operations.
I've already read in this forum that RapidMiner's main purpose is not ETL (e.g. http://rapid-i.com/rapidforum/index.php/topic,986.0.html), but I'd like to cover my ETL requirements with it so that I can then use the full power of RapidMiner for data mining.
Below I describe the solutions I found for some ETL operations. Thanks in advance if someone can suggest a more efficient way to do the same thing.
My example has the Iris Data as input and a MySQL table as output.
1. APPEND
The aim is to append new records to a table that has a primary key, but the input set also contains records that can violate referential integrity.
Operator WriteDB:
overwrite and overwrite first, append then (by the way, what is the difference?) ⇒ the table is deleted and filled with the new data ⇒ no good, because I lose the data initially stored in the table
append ⇒ error, due to referential integrity violation
The solution I found is to filter the input data and append only the non-violating records directly in SQL (via the Execute SQL operator).
Is there a better solution for this problem?
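To make the filter-then-append idea concrete, here is a minimal sketch outside RapidMiner. It rests on several assumptions: Python's built-in sqlite3 stands in for MySQL, the violation is taken to be a duplicate primary key, and the table and column names (iris_target, staging, id, sepal_length, label) are hypothetical. The incoming rows are staged first, then only the rows whose key is not yet in the target are appended.

```python
import sqlite3


def append_new_rows(conn, rows):
    """Append only the rows whose primary key is not already in iris_target.

    iris_target, staging and the column names are hypothetical examples.
    Returns the number of rows actually appended.
    """
    cur = conn.cursor()
    before = cur.execute("SELECT COUNT(*) FROM iris_target").fetchone()[0]

    # Stage the incoming rows in a temporary table without a key constraint.
    cur.execute(
        "CREATE TEMP TABLE staging (id INTEGER, sepal_length REAL, label TEXT)"
    )
    cur.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)

    # Append only rows whose id is absent from the target table.
    cur.execute(
        """
        INSERT INTO iris_target (id, sepal_length, label)
        SELECT s.id, s.sepal_length, s.label
        FROM staging AS s
        WHERE NOT EXISTS (SELECT 1 FROM iris_target AS t WHERE t.id = s.id)
        """
    )

    cur.execute("DROP TABLE staging")
    conn.commit()

    after = cur.execute("SELECT COUNT(*) FROM iris_target").fetchone()[0]
    return after - before
```

Rows already stored in the target are left untouched, so nothing is lost as it would be with overwrite. The same INSERT ... SELECT ... WHERE NOT EXISTS statement should carry over to MySQL, though I have not verified that here.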
2. MAPPING
The aim is to transfer (append) records from a source table to a target table with different field names.
How can I build a mapping so that the data is appended correctly, in this case using the WriteDB operator with the append method (assuming there is no referential integrity problem in this case)?
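To illustrate what I mean by a mapping, here is a minimal sketch done directly in SQL rather than with WriteDB. Again the assumptions are mine: Python's built-in sqlite3 stands in for MySQL, and the table names and the COLUMN_MAP entries (a1, a2, label mapped to sepal_length, sepal_width, species) are hypothetical. The mapping is a dictionary from source field name to target field name, and it is turned into a single INSERT ... SELECT statement.

```python
import sqlite3

# Hypothetical mapping: source field name -> target field name.
COLUMN_MAP = {"a1": "sepal_length", "a2": "sepal_width", "label": "species"}


def append_with_mapping(conn, source, target, column_map):
    """Append all rows of `source` to `target`, renaming fields via column_map.

    Table and column names cannot be passed as SQL parameters, so they are
    interpolated into the statement and must come from a trusted source.
    """
    # Both lists are built in the same dictionary order, so the positions match.
    src_cols = ", ".join(f'"{name}"' for name in column_map)
    tgt_cols = ", ".join(f'"{name}"' for name in column_map.values())
    sql = f'INSERT INTO "{target}" ({tgt_cols}) SELECT {src_cols} FROM "{source}"'
    conn.execute(sql)
    conn.commit()
```

A call would look like append_with_mapping(conn, "iris_source", "iris_target", COLUMN_MAP). Note that the double-quoted identifiers follow SQLite; MySQL quotes identifiers with backticks by default, so the quoting would need adjusting there.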
Thanks in advance.