"Unable to write binary data into Postgres"
jonmillard
Member Posts: 1 Learner III
Hi Folks,
I'm new to RapidMiner, so I hope you'll forgive me if I missed something fundamental here. I searched for earlier reports of this problem, but nothing relevant came up.
I have written a basic job that:
1. Reads URLs from a database table in Postgres [Read Database]
2. Gets the pages into a variable [Get Pages]
3. Subsets the attribute set [Select Attributes]
4. Attempts to write the subsetted attributes, including the retrieved page, into another database table in Postgres
The fourth step above returns the following error / stack trace:
++++++++++++++++++++++++++++++++++
Exception: com.rapidminer.operator.UserError
Message: Database error occurred: ERROR: invalid byte sequence for encoding "UTF8": 0x00
Stack trace:
com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:115)
com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:66)
com.rapidminer.operator.io.AbstractWriter.doWork(AbstractWriter.java:69)
com.rapidminer.operator.Operator.execute(Operator.java:833)
com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
com.rapidminer.operator.Operator.execute(Operator.java:833)
com.rapidminer.Process.run(Process.java:925)
com.rapidminer.Process.run(Process.java:848)
com.rapidminer.Process.run(Process.java:807)
com.rapidminer.Process.run(Process.java:802)
com.rapidminer.Process.run(Process.java:792)
com.rapidminer.gui.ProcessThread.run(ProcessThread.java:63)
Cause
Exception: org.postgresql.util.PSQLException
Message: ERROR: invalid byte sequence for encoding "UTF8": 0x00
Stack trace:
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2103)
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1836)
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:512)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:334)
com.rapidminer.tools.jdbc.DatabaseHandler.applyBatchInsertIntoTable(DatabaseHandler.java:692)
com.rapidminer.tools.jdbc.DatabaseHandler.createTable(DatabaseHandler.java:591)
com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:107)
com.rapidminer.operator.io.DatabaseExampleSetWriter.write(DatabaseExampleSetWriter.java:66)
com.rapidminer.operator.io.AbstractWriter.doWork(AbstractWriter.java:69)
com.rapidminer.operator.Operator.execute(Operator.java:833)
com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
com.rapidminer.operator.Operator.execute(Operator.java:833)
com.rapidminer.Process.run(Process.java:925)
com.rapidminer.Process.run(Process.java:848)
com.rapidminer.Process.run(Process.java:807)
com.rapidminer.Process.run(Process.java:802)
com.rapidminer.Process.run(Process.java:792)
com.rapidminer.gui.ProcessThread.run(ProcessThread.java:63)
++++++++++++++++++++++++++++++++++
I do understand why this is occurring: the destination column in Postgres is of type 'text', whereas at least one of the 'pages' is actually binary (in this case an XLS file). When the input contains null bytes (0x00), the insert is rejected and the JDBC driver throws an exception. I did try setting the destination field to 'bytea', but then either RapidMiner or the JDBC driver (can't recall which, sorry) complains that the datatype is not compatible and throws an exception as well. I would prefer to store the downloaded content exactly as obtained (as a record, for later use), but I cannot find an operator that will cast a 'text' variable to a byte array or something similar.
Any thoughts on what might be going wrong here or what I ought to try?
Thanks in advance,
Jon
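For illustration only, here is a minimal Python sketch of one possible workaround for the error described above: Base64-encoding the downloaded bytes so they can be stored in a 'text' column without any 0x00 bytes, and decoding them again later. It is not a RapidMiner operator and not a confirmed fix for this process. The function names and the sample payload are hypothetical, and it assumes the page content is available as raw bytes outside RapidMiner.

```python
import base64


def contains_nul(payload: bytes) -> bool:
    """Return True if the payload holds a 0x00 byte, which a Postgres 'text' column rejects."""
    return b"\x00" in payload


def to_text_safe(payload: bytes) -> str:
    """Base64-encode raw bytes; the result is plain ASCII and therefore valid UTF-8."""
    return base64.b64encode(payload).decode("ascii")


def from_text_safe(encoded: str) -> bytes:
    """Recover the original bytes from the Base64 text."""
    return base64.b64decode(encoded)


# Hypothetical stand-in for a downloaded XLS file:
# the OLE2 file signature followed by a few NUL bytes.
sample = bytes.fromhex("d0cf11e0a1b11ae1") + b"\x00" * 8

encoded = to_text_safe(sample)    # safe to insert into a 'text' column
restored = from_text_safe(encoded)  # identical to the downloaded bytes
```

The trade-off is size: Base64 text is roughly a third larger than the original bytes, and the stored value has to be decoded before use. Writing the raw bytes to a 'bytea' column avoids that, but requires the writing side to bind the value as binary rather than as a string.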