
Reading a SQL BAK file for analysis

robin Member Posts: 100 Guru
edited November 2018 in Help

I have a couple of SQL BAK files that contain the information I need for the analytics I am required to perform. I personally do not have access to a SQL Server instance to restore the files and then connect to them for the analysis.

 

Is it possible to read the files directly into RapidMiner to perform the analysis? I do not seem to be able to read the BAK files in, as all the SQL operators require a connection to a SQL server.


Best Answers

  • robin Member Posts: 100 Guru
    Solution Accepted

    @Thomas_Ott wrote:

    When you say SQL BAK files, do you mean they have a ".bak" extension on them?  If so, they're likely to be backups of the SQL database. 

     

    You can't directly import them into RM, but what you could try is loading them into a new database locally and then connecting RM to that database.


    Thanks Thomas, I was dreading that this would be the answer. I assume it would be the same for a .sql backup file?
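
    For reference, restoring a .bak into a local SQL Server instance (the free Express edition works) so that RM can connect to it might look like the T-SQL sketch below. The database name, logical file names, and paths here are hypothetical; RESTORE FILELISTONLY reports the actual logical names stored inside the backup.

        -- Inspect the backup to find the logical file names it contains
        RESTORE FILELISTONLY
        FROM DISK = 'C:\backups\mydata.bak';

        -- Restore into a new local database, relocating the data and log files
        -- (MyData and MyData_log are hypothetical logical names taken from
        -- the FILELISTONLY output above)
        RESTORE DATABASE MyDataLocal
        FROM DISK = 'C:\backups\mydata.bak'
        WITH MOVE 'MyData' TO 'C:\sqldata\MyDataLocal.mdf',
             MOVE 'MyData_log' TO 'C:\sqldata\MyDataLocal.ldf',
             RECOVERY;

    Once the restore finishes, a standard Read Database connection in RM pointed at MyDataLocal can query the restored tables.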

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    Well, .sql files are a bit different. They're just the SQL code to execute something on the database. Those you can use with the Execute SQL operator: just make the database connection and select the .sql file under the "query file" parameter.
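
    To make the distinction concrete: a .sql backup is just a script, typically the DDL plus the INSERT statements needed to rebuild the data, and Execute SQL simply replays it against the connected database. A minimal, hypothetical example of what such a file might contain:

        -- hypothetical contents of a .sql dump file
        CREATE TABLE customers (
            id   INT PRIMARY KEY,
            name VARCHAR(100)
        );

        INSERT INTO customers (id, name) VALUES (1, 'Alice');
        INSERT INTO customers (id, name) VALUES (2, 'Bob');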

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    Solution Accepted

    So I'm not sure what you want to do. If I understand correctly, you are loading in backed-up database files. What do you plan on doing after you load them in? ETL? Building a model?

     

    The Learning Curve operator is used in classification models to see whether you gain performance by adding more data. Sometimes this is the case, sometimes not. The Sample operator lets you take a representative sample so you can build a process and then apply it to a larger data set.

     

    If you run out of memory, then you might want to take a sample and eventually offload the process to a RapidMiner Server. Just make sure the RapidMiner Server sits on a machine with more memory so it can handle the processing.
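
    If memory is the bottleneck even after restoring locally, the sampling can also be pushed into the query itself so that only a subset ever reaches RM. A sketch in SQL Server syntax (the table name is hypothetical):

        -- Roughly 10 percent of the table; page-based, so fast but not row-exact
        SELECT *
        FROM dbo.transactions TABLESAMPLE (10 PERCENT);

        -- Or a fixed-size random sample (slower, since it shuffles the whole table)
        SELECT TOP (50000) *
        FROM dbo.transactions
        ORDER BY NEWID();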

Answers


  • robin Member Posts: 100 Guru

     


    @Thomas_Ott wrote:

    Well, .sql files are a bit different. They're just the SQL code to execute something on the database. Those you can use with the Execute SQL operator: just make the database connection and select the .sql file under the "query file" parameter.



     

    That approach works well on the smaller databases. Once I start dealing with backed-up SQL databases that are larger than 300 MB in size, I run out of memory.

     

    RM recommends that I use a Learning Curve or sampling operator to load the data. What would you recommend?

  • robin Member Posts: 100 Guru

    The simple answer is to use the tools for the roles they are intended for. Don't try to force RM to jump through hoops it is not designed for.



  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    We're always amazed at how our community and users actually "use" or "abuse" RapidMiner. :)
