I have funny characters in my example sets. I suspect an encoding problem.
Problem:
The encoding settings of the database, of a database connection configured in RapidMiner Studio or Server, or of the JBoss instance that hosts RapidMiner Server are incorrect. Many file input operators can also specify an encoding, so a wrong setting there can cause the same symptom.
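The effect can be reproduced outside RapidMiner. The following minimal Python sketch (the sample string is made up for illustration) shows how text written as UTF-8 turns into "funny characters" when a reader assumes a different encoding, and how it is recovered once both sides agree:

```python
# Sample text containing a non-ASCII character (illustrative value).
original = "Müller"

# The writer stores the text as UTF-8 bytes.
stored = original.encode("utf-8")

# A reader that wrongly assumes Latin-1 gets garbled output.
garbled = stored.decode("latin-1")

# A reader that uses the matching encoding gets the original text back.
recovered = stored.decode("utf-8")

print(garbled)    # MÃ¼ller
print(recovered)  # Müller
```

If your example sets show patterns such as "Ã¼" where "ü" is expected, a mismatch of this kind is the likely cause.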
Solution:
You should use the utf8 encoding wherever possible. Database settings can be made per:
- Database: In MySQL, use “ALTER DATABASE xxx DEFAULT CHARACTER SET utf8”.
- Table: Newly created tables inherit the default character set unless a different one is specified in the CREATE statement.
- RapidMiner Studio JDBC connection: Set the appropriate connection properties (see below for a list). In RapidMiner Studio this is possible via Tools > Manage Database Connections > Advanced.
The encodingName you want to use is almost always utf8. The exact name of the JDBC property depends on the database. Known values are:
- MySQL: characterEncoding
- MS SQL Server via jTDS driver: CHARSET
- Oracle: charset
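As a rough sketch, the list above can be kept as a small lookup that returns the property name and value to enter in the Advanced dialog. The helper name `encoding_property` and the dictionary keys are hypothetical; only the property names come from the list above.

```python
from typing import Tuple

# Property names as listed above; the dictionary keys are illustrative labels.
JDBC_ENCODING_PROPERTY = {
    "mysql": "characterEncoding",
    "mssql-jtds": "CHARSET",
    "oracle": "charset",
}


def encoding_property(database: str, encoding_name: str = "utf8") -> Tuple[str, str]:
    """Return the (property name, value) pair for the given database label."""
    key = database.lower()
    if key not in JDBC_ENCODING_PROPERTY:
        raise ValueError(f"no known encoding property for {database!r}")
    return JDBC_ENCODING_PROPERTY[key], encoding_name


name, value = encoding_property("MySQL")
print(f"{name} = {value}")
```

For databases not in the list, check the documentation of the JDBC driver for the corresponding property.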
Processes can configure the encoding via parameters of input operators.