Update Database is unexpectedly slow

BalazsBarany · March 2019

I had a situation today where a big table (nicely set up with indexes) had to be updated with Update Database, using the integer ID field as the key (attribute filter type: single, attribute: id). RapidMiner also displays the ID as an integer.

It was much slower than it should have been: a test update run directly against the database finished in milliseconds, but the same update through RapidMiner took seconds.

Having already seen this kind of unexpected slowness with other software, I checked the access statistics in the database. The table showed many (and growing) full table scans, but a constant number of index scans. So the index of the primary key field was not being used.
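For anyone who wants to repeat this check, here is a minimal sketch of how the scan counters could be read and compared. It assumes PostgreSQL, where the standard pg_stat_user_tables view exposes seq_scan and idx_scan. The table name big_table and the helper index_is_ignored are made up for illustration, and the %s placeholder assumes a psycopg-style driver.

```python
# Hypothetical table name; the post does not name the actual table.
TABLE_NAME = "big_table"

# pg_stat_user_tables is a standard PostgreSQL statistics view.
# seq_scan counts sequential (full table) scans, idx_scan counts index scans.
# The %s placeholder assumes a psycopg-style driver (an assumption).
SCAN_STATS_SQL = """
SELECT relname, seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE relname = %s
"""


def index_is_ignored(before, after):
    """Return True if only sequential scans grew between two snapshots.

    Each snapshot is a (seq_scan, idx_scan) tuple read from the view,
    once before and once after running Update Database.
    """
    seq_before, idx_before = before
    seq_after, idx_after = after
    return seq_after > seq_before and idx_after == idx_before
```

Run the query once before and once after the RapidMiner process and pass both rows to index_is_ignored. Growing seq_scan with a flat idx_scan is the pattern described above.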

Creating a second index on the id field with the double precision (floating point) type helped. The throughput of Update Database improved a thousandfold, the table statistics showed the index being used, and the number of full table scans stayed constant. This is of course not a nice workaround, but it helped me.
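One way such a second index could be created in PostgreSQL is an expression index on the id column cast to double precision. This is a sketch under that assumption, not necessarily the exact statement used here. The table name big_table, the generated index name and the helper expression_index_sql are all hypothetical.

```python
def expression_index_sql(table, column="id", index_name=None):
    """Build a CREATE INDEX statement on the column cast to double precision.

    PostgreSQL requires the extra pair of parentheses around an index
    expression. Identifiers are inserted as-is (no quoting), so only pass
    trusted names.
    """
    if index_name is None:
        # Hypothetical naming scheme, chosen only for this example.
        index_name = f"{table}_{column}_float8_idx"
    return (
        f"CREATE INDEX {index_name} "
        f"ON {table} (({column}::double precision));"
    )


# "big_table" is a placeholder; substitute the real table name.
print(expression_index_sql("big_table"))
```

The existing integer index stays in place. The new one only serves comparisons where the id is treated as a floating point number.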

Update Database uses "prepared statements". In these, each column value is passed with an assigned data type. Whether the database can still use an index when that type doesn't match the column type depends on how compatible the two types are. Here, the integer index was not used with the floating point data type.
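The effect can probably be reproduced without RapidMiner by preparing the same update twice in psql, once with an integer parameter and once with a double precision one, and comparing the plans. The sketch below only generates the SQL. The table big_table, the updated column payload and the helper explain_prepared_update are hypothetical, and the statements are an approximation of what Update Database sends, not its actual SQL.

```python
def explain_prepared_update(table, param_type, column="id", set_column="payload"):
    """Return SQL that prepares an UPDATE with a typed key parameter
    and asks PostgreSQL for its plan.

    EXPLAIN without ANALYZE only plans the statement, so no rows change.
    """
    # Statement name derived from the type, e.g. upd_double_precision.
    name = "upd_" + param_type.replace(" ", "_")
    return [
        f"PREPARE {name}({param_type}, text) AS "
        f"UPDATE {table} SET {set_column} = $2 WHERE {column} = $1;",
        f"EXPLAIN EXECUTE {name}(42, 'x');",
        f"DEALLOCATE {name};",
    ]


# Compare the plan for a matching and a mismatching parameter type.
for param_type in ("integer", "double precision"):
    for statement in explain_prepared_update("big_table", param_type):
        print(statement)
```

With an integer column and an integer index, the integer variant should be able to use an index scan. For the double precision variant PostgreSQL compares the column as a floating point number, so a sequential scan is the likely plan unless a matching index exists.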

Hopefully this will be fixed in one of the next releases. Until then, you can work around the problem by creating a second index with the appropriate data type in your database to speed up Update Database.

This is on PostgreSQL 10; your database might work differently.

Open for Voting · Last Updated May 2019