The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here

MySQL and RapidMiner

Edin_KlapicEdin_Klapic Employee-RapidMiner, RMResearcher, Member Posts: 299 RM Data Scientist
edited November 2018 in Knowledge Base

Question

What do I need to know if I want to use MySQL database with RapidMiner?

Answer

In general there are two things you need to know and take care of.

 

1. Setting up the connection to the database

From within RapidMiner Studio you can connect by creating a database connection as described in this Knowledge Base article.

If you want to install RapidMiner Server on MySQL you need to download a MySQL connector and follow the installation instructions described in this article.

 

2. Settings for storing large(r) datasets

  • The MySQL variable 'max_allowed_packet' defines the size of one packet which is delivered or received from the MySQL Server. As of MySQL version 5.7 this variable is set to 4MB or 16MB by default and can have a maximum value of 1GB. So if storing a RapidMiner ExampleSet in MySQL aborts with something like:
    com.mysql.jdbc.PacketTooBigException: Packet for query is too large (30237356 > 16777216). You can change this value on the server by setting the max_allowed_packet' variable.
    you may want to adapt this value in the options file of your MySQL Server environment and restart the Server. If you use the MySQL Workbench you may find the settings here:

max_packet.PNG

 

 

  • The RapidMiner Server variable 'com.rapidanalytics.db.max_rowlength' reflects the max number of bytes per row RapidMiner uses to store an Object. Depending on the storage engine MySQL uses, one it may only be possible to store up to 1000 columns per ExampleSet. Please refer to your database administrator for information about the max number of columns your database allows. To avoid conflicts in advance we recommend to lower the value for this variable from the default of 20,000 to 15,000 or 10,000 Bytes and ensure that you only store ExampleSets with a number of columns which do not exceed the maximum value.
  1. Login as user 'admin' in the RapidMiner Server Webinterface
  2. Navigate to 'Administration' => 'System Settings'.
  3. If the variable 'com.rapidanalytics.db.max_rowlength' is not listed - Click on 'Add property'
  4. Enter com.rapidanalytics.db.max_rowlength in the field Property and 15,000 in the field Value
  5. Click 'Submit'variabe.PNG

 

 Now you should be all set for your work with RapidMiner and MySQL :smileyhappy:

 

Best,

Edin

Comments

  • luc_bartkowskiluc_bartkowski Member Posts: 46 Maven

    @Edin_Klapic,

     

    Please note also that MySQL views do not appear in the "tables" list of the "Read database" operator.

    But you can include a SQL "SELECT * FROM view_name" in the "SQL Query" section of this operator.

     

    Greetings,

    Luc 

Sign In or Register to comment.