Apache Derby
smackdown33
Member Posts: 19 Maven
Hi,
I was wondering whether anyone knows how to use an Apache Derby database with RapidMiner.
Thanks for your help.
Answers
I've created a database of attributes in Java using Apache Derby; the Derby side of things is then shut down. I then give RapidMiner the URL of the database, but it cannot connect.
Any ideas?
Thanks
I'm not familiar with Derby, but have some experience connecting to other databases through RM. Can you post the XML for the RM process you're actually trying? That would help remove some of the ambiguities in your question.
Does Derby have a JDBC driver? If so, then in theory at least RM should be able to read data through a DatabaseExampleSource operator.
Also, maybe this is a wording problem, but when you say "the Derby side of things is then shutdown", that makes me think that the Derby database process itself isn't running, in which case connecting to it is obviously a problem that would have nothing to do with RM :-) Can you describe what you mean another way?
Keith
Thanks for the reply. I've got RapidMiner to recognise Apache Derby by including the Derby client in RapidMiner's lib folder and restarting it.
However, I am now faced with another issue when loading the data. When I try to run a simple test, I get the following error:
[Fatal] IndexOutOfBoundsException occured in 1st application of DatabaseExampleSource (DatabaseExampleSource)
[Fatal] Process failed: operator cannot be executed (Index: 10, Size: 10). Check the log messages...
Would anyone be able to help with this, please?
Thanks
[Fatal] UserError occured in 1st application of DatabaseExampleSource (DatabaseExampleSource)
[Fatal] Process failed: Database error occurred: Lexical error at line 1, column 15. Encountered: "`" (96), after : "".
Does anyone have any ideas, or know of this problem?
Thanks
Keith,
I'm pretty new to this, so I've been trying to run the feature selection process in the wizard. The database table consists of 801 columns and 1300 rows.
I hope this is the information you're after.
Thanks
What Keith was trying to say is that you should post the complete XML description of your process. Then it only takes us seconds to load your process in RapidMiner and take a closer look.
To post it here, open your process in RapidMiner and switch to the XML tab. Then select the complete text and copy it to the clipboard.
Here in the forum, press the # button and paste the complete process inside the inserted code tag.
Greetings,
Sebastian
OK, sorry, I was a bit confused there. Thanks for the reply. Please see the process below. Any help would be greatly appreciated, thanks.
In the RM GUI, select the DatabaseExampleSource operator in your process chain, and click the Parameters tab. You should see a bunch of entries including, most critically, "database_system", "database_url", "username", and "password". This is how to tell RM what kind of database you're running, how to find it, and what credentials to use to connect. There are also "query" or "table_name" parameters that specify either a SQL query to execute, or the name of the table to retrieve.
The database_url parameter will be specific to your driver+database, so you may need to consult the Derby documentation to find out what the proper JDBC connection string looks like.
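As a rough illustration of what such a connection string can look like: Derby's documented JDBC URLs take the form jdbc:derby:<database> for an embedded database and jdbc:derby://<host>:<port>/<database> for a database served by the Derby Network Server, whose default port is 1527. The sketch below only assembles those strings. The helper names and the attributesDB database name are made up for illustration, and which form applies depends on how the Derby database in question is actually run.

```python
def derby_embedded_url(database, create=False):
    """Build a JDBC URL for an embedded Derby database (a name or a path)."""
    url = f"jdbc:derby:{database}"
    if create:
        # ";create=true" asks Derby to create the database if it is missing.
        url += ";create=true"
    return url


def derby_client_url(database, host="localhost", port=1527):
    """Build a JDBC URL for a database served by the Derby Network Server."""
    return f"jdbc:derby://{host}:{port}/{database}"


if __name__ == "__main__":
    # "attributesDB" is a hypothetical database name.
    print(derby_embedded_url("attributesDB"))
    print(derby_client_url("attributesDB"))
```

Either string is the kind of value that would go into the database_url parameter. The client form assumes a Derby Network Server is running and reachable at the given host and port.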
When you specify the parameters, the XML for that operator should look something like the following (I'm using SQL Server, not Derby, but it should give you an idea).
Hope that helps,
Keith
[Fatal] UserError occured in 1st application of DatabaseExampleSource (DatabaseExampleSource)
[Fatal] Process failed: Database error occurred: Lexical error at line 1, column 15. Encountered: "`" (96), after : "".
Any help would be greatly appreciated.
Thanks to those who have already given their time and knowledge.
I'm not familiar with Derby, but am used to hoovering stuff to and from databases, and it looks like column 15 of the Location table is presenting an orphaned text delimiter, leaving unterminated text. To test just the connection try selecting just one column, other than 15! If that works you'll then need to SQueLch clean column 15.
[Fatal] UserError occured in 1st application of DatabaseExampleSource (DatabaseExampleSource)
[Fatal] Process failed: Database error occurred: Lexical error at line 1, column 15. Encountered: "`" (96), after : "".
Does the problem still occur if you uncheck "work_on_database", for either the full table, or the 1 column case? 800 columns and 1300 rows isn't really a lot of data, so I wouldn't think it was necessary to run with work_on_database on to save memory.
If you are successfully loading the data, perhaps you can try immediately writing it back out to a CSV file using CSVExampleSetWriter. Then alter the process so that it reads the data in from the CSV file rather than from the database. If you still get a problem, then the problem isn't the database connection itself, but something in the data.
Keith
My real dataset will still have around 1300 examples, but the number of attributes will range from 7200 to 108000. When running the feature selection algorithm on such a dataset I run out of memory, and I'm currently running 12 GB of 1866 MHz DDR3 on an i7.
Any ideas?
Thanks for your help, much appreciated.
Keith,
To my knowledge, no row-oriented database can handle so many columns anyway. But if Derby is able to, please tell me...
You should not use work_on_database, but instead use the CachedDatabaseExampleSource. It will only hold a subset in memory. But this does not solve the problem with the number of columns...
Greetings,
Sebastian
I've subsequently reduced my dataset to only 1 row and 4 columns and used "DatabaseExampleSource" with "work_on_database" selected, and I still received the lexical error. With work_on_database unchecked everything works fine.
I've also tried this with CachedDatabaseExampleSource, and it too comes back with a lexical error, but for line 1, column 13.
Any ideas?
Thanks
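One possible reading of the error, not confirmed anywhere in this thread: "line 1, column 15" may refer to a character position in the SQL text sent to Derby rather than to a column of the table, and character 96 is the backtick. Derby delimits identifiers with double quotes and does not accept backticks, so a statement that quotes a table name MySQL-style would be rejected by its parser. The query string below is purely hypothetical (the SQL that RapidMiner actually generates is not shown in the thread); it only demonstrates that a backtick directly after "SELECT * FROM " falls on column 15.

```python
def char_at_column(sql, column):
    """Return the character at a 1-based column of a single-line SQL string."""
    return sql[column - 1]


def requote_for_derby(sql):
    """Naively swap backtick delimiters for the double quotes Derby expects.

    Illustration only: this would also rewrite backticks inside string literals.
    """
    return sql.replace("`", '"')


# Hypothetical statement; only the table name "Location" comes from the thread.
hypothetical_sql = "SELECT * FROM `Location`"

if __name__ == "__main__":
    ch = char_at_column(hypothetical_sql, 15)
    print(repr(ch), ord(ch))
    print(requote_for_derby(hypothetical_sql))
```

If this reading is right, the data itself is not at fault, which would fit the observation that the error persists with a 1-row table and disappears when work_on_database is unchecked.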
To put an end to this discussion:
a) Your database doesn't support your targeted number of attributes anyway. So it doesn't make sense to waste time trying to "work on database". You simply have to load the data via a file, or put it together in RapidMiner, storing all data in main memory.
b) If you have such a large number of attributes, you cannot use LinearRegression. To compute the coefficients, a square matrix whose dimension is the number of attributes must be created. This will definitely not fit into any memory if you have 100,000 attributes.
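To put rough numbers on point b): assuming the matrix is stored densely as 8-byte double-precision values (an assumption; the thread does not say how RapidMiner stores it), an n x n matrix needs n * n * 8 bytes. The helper name below is made up for illustration.

```python
BYTES_PER_DOUBLE = 8  # assumed storage: dense matrix of 64-bit floats


def square_matrix_bytes(n_attributes):
    """Bytes needed for a dense n x n matrix of doubles."""
    return n_attributes * n_attributes * BYTES_PER_DOUBLE


if __name__ == "__main__":
    # Attribute counts mentioned in the thread.
    for n in (7_200, 100_000, 108_000):
        gb = square_matrix_bytes(n) / 1e9
        print(f"{n:>7} attributes: {gb:,.1f} GB")
```

Under that assumption, 100,000 attributes come to 80 GB for the matrix alone, far beyond the 12 GB mentioned earlier, while 7,200 attributes would need roughly 0.4 GB.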
To come to a conclusion: use another learner, such as the SVM. Use its linear kernel and the results will be comparable. This should solve your problems.
Greetings,
Sebastian