Reading a SQL BAK file for analysis
I have a couple of SQL BAK files that contain the data I am required to perform analytics on. I do not have access to a SQL Server instance to restore the files to and then connect to for the analysis.
Is it possible to read the files directly into RapidMiner to perform the analysis? I do not seem to be able to read the BAK files in, as all the SQL operators require a connection to a SQL server.
Best Answers
robin Member Posts: 100 Guru
@Thomas_Ott wrote:When you say SQL BAK files, do you mean they have a ".bak" extension on them? If so, they're likely to be backups of the SQL database.
You can't directly import them into RM, but what you could try is loading them into a new local database and then connecting RM to that database.
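For anyone following this route, here is a minimal T-SQL sketch of restoring a .bak into a local SQL Server instance. The database name and file paths are hypothetical; adjust them to your own backup.

```sql
-- Inspect the logical file names stored inside the backup first
-- (hypothetical path; replace with the location of your .bak file):
RESTORE FILELISTONLY
FROM DISK = N'C:\backups\analytics.bak';

-- Restore the database, relocating the data and log files locally.
-- The logical names ('AnalyticsDB_Data', 'AnalyticsDB_Log') must match
-- what RESTORE FILELISTONLY reported above:
RESTORE DATABASE AnalyticsDB
FROM DISK = N'C:\backups\analytics.bak'
WITH MOVE 'AnalyticsDB_Data' TO N'C:\data\AnalyticsDB.mdf',
     MOVE 'AnalyticsDB_Log'  TO N'C:\data\AnalyticsDB.ldf',
     RECOVERY;
```

Once the restore completes, RM can connect to the local instance like any other database.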
Thanks Thomas, I was dreading that this was the answer. I assume it would be the same for a .sql backup file?
Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
Well, .sql files are a bit different. They're just SQL code to execute something on the database. Those you can use with the Execute SQL operator: just set up the database connection and select the .sql file under the "query file" parameter.
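To make that concrete, a .sql file pointed to by the "query file" parameter is just a plain-text file of statements to run against the connected database. A hypothetical example (table and column names invented for illustration):

```sql
-- Example contents of a .sql query file that Execute SQL would run.
-- Flag customers who ordered within the last six months
-- (dbo.Customers, Status and LastOrderDate are hypothetical):
UPDATE dbo.Customers
SET Status = 'active'
WHERE LastOrderDate >= DATEADD(month, -6, GETDATE());
```

Note that this still requires a live database connection; the operator executes the file's statements on the server rather than reading data from the file itself.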
Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
So I'm not sure what you want to do. If I understand correctly, you are loading in backed-up database files. What do you plan on doing after you load them in? ETL? Building a model?
The Learning Curve operator is used with classification models to see whether you get more performance by adding more data. Sometimes this is the case, sometimes not. The Sample operator lets you take a representative sample so you can build a process and then apply it to a larger data set.
If you run out of memory, then you might want to take a sample and eventually offload the process to a RapidMiner Server. Just make sure the RapidMiner Server sits on a box with more memory so it can handle the processing.
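If the restored database is too large to pull into RM whole, the sampling can also happen on the server side before the data ever reaches RapidMiner. A rough T-SQL sketch, with a hypothetical table name:

```sql
-- Pull a representative subset instead of the full table.
-- TABLESAMPLE is fast but page-based, so the sample is only
-- approximately random (dbo.Transactions is hypothetical):
SELECT *
FROM dbo.Transactions TABLESAMPLE (10 PERCENT);

-- For a genuinely random sample, order by a fresh GUID per row.
-- This scans the whole table, so it is slower on large tables:
SELECT TOP 10000 *
FROM dbo.Transactions
ORDER BY NEWID();
```

Either query can be used as the source for RM's Read Database operator, keeping the in-memory footprint small while the process is being built.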
Answers
When you say SQL BAK files, do you mean they have a ".bak" extension on them? If so, they're likely to be backups of the SQL database.
You can't directly import them into RM, but what you could try is loading them into a new local database and then connecting RM to that database.
That approach works well on smaller databases. Once I start dealing with backed-up SQL databases larger than 300 MB in size, I run out of memory.
RM recommends that I use a learning curve or sampling operator to load the data. What would you recommend?
The simple answer is to use the tools for the roles they are intended for. Don't try to force RM to jump through hoops it is not designed for.
We're always amazed at how our community and users actually "use" or "abuse" RapidMiner.