
"Automated short term gas production forecasting using machine learning/big data/data mining"

maurits_freriks Member Posts: 28 Learner III
edited June 2019 in Help

Hi, 

 

First, let me introduce myself briefly. I'm Maurits Freriks, a Business Analytics student at VU Amsterdam. I'm currently doing a three-month internship, in which I have to investigate whether short-term gas production forecasting can be automated. In other words: a prediction based on historical data. I have a little experience with RapidMiner, but not much. So first of all, I'm wondering whether this problem can be solved with RapidMiner.

 

What I've done so far:

- I've received a dataset with historical values from the last 3 years. The data comes from measurement points, for example the gas flow over a specific time series, temperature (degrees), pressure, etc.

- I've divided this dataset into a smaller dataset containing only 1 month of data.

- I've built a process with the small dataset and the Polynomial Regression operator. I got a solution with some coefficients, but when I tested it on the total dataset, the deviation was too high, so the formula was useless.

 

Before spending more and more time in RapidMiner, my question is whether there are recommendations on which operators to use. For example, do I have to make a test set and a training set? If yes, is it right to divide the total dataset into 80% training and 20% test?

 

I appreciate your attention, effort and time. Hopefully someone could help me out! 

And by the way: sorry for my English!!

 

With kind regards,

 

Maurits Freriks 

Answers

  • kypexin RapidMiner Certified Analyst, Member Posts: 291 Unicorn

    Hi @maurits_freriks

     

    From the description of your task, it seems that you could use the RapidMiner Time Series extension to predict production volumes. It's hard to give practical advice without seeing the actual data, but this type of prediction is quite common in some domains; you can search this forum for 'time series prediction' and you'll find dozens of practical solutions on different data. This could be a pretty good starting point for your problem as well.

     

    PS: I have personally only played around a bit with the Time Series extension, but I know that many people here on the forum are very skilled in this topic; as I mentioned, it would be helpful if you could also share the data itself.

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hello @maurits_freriks - welcome to the community and very glad that you're using RapidMiner to solve your problem.  :)  I had a client a while ago who was in the oil & gas industry and I think you are on the right path.  To help choose a model, I would recommend using the mod.rapidminer.com page.  As for splitting the data and other "best practices", please go through all the tutorial processes.  They were written by data scientists and are very well done.


    Good luck!


    Scott

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist

    Dear Maurits,

     

    Great to have you here! Have a look at my recent blog post on validation: https://towardsdatascience.com/when-cross-validation-fails-9bd5a57f07b5. It has a different focus, but the use case was similar.

     

    Cheers,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
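
On the 80/20 question from the original post: for time-ordered data such as these measurements, one common option is to split chronologically (train on the earlier part, test on the later part) rather than at random, so that the test set resembles a real forecast. The following is only a minimal sketch in plain Python, not a RapidMiner process; the function name `chronological_split` and the sample readings are hypothetical and made up for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

Row = TypeVar("Row")


def chronological_split(
    rows: Sequence[Row], train_fraction: float = 0.8
) -> Tuple[List[Row], List[Row]]:
    """Split time-ordered rows into an earlier training part and a later test part.

    Assumes `rows` is already sorted by timestamp (oldest first).
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


# Made-up (hour, flow) readings, sorted by time; not real production data.
readings = [(hour, 100.0 + hour) for hour in range(10)]
train, test = chronological_split(readings)
print(len(train), len(test))
```

With this kind of split, every training row is older than every test row, which a random 80/20 split does not guarantee.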
  • maurits_freriks Member Posts: 28 Learner III

     

    Hi @kypexin,

     

    Thanks for your quick reply. I've attached a screenshot of my dataset. Both flows are exactly the same; the only difference is the measurement. Using the historical flow from the day before and the current pressure, CO2 and temperature (degrees), I would like to make a prediction. Is this still possible with the RapidMiner Time Series extension?

     

    I've searched a bit for the term "time series", but I didn't find any answers that helped me understand the method.
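
The setup described in this post (yesterday's flow plus today's pressure, CO2 and degrees as inputs, today's flow as the value to predict) can be pictured as a table of lagged rows. The sketch below is plain Python, not a RapidMiner process, and makes several assumptions: one record per day, "degrees" read as temperature, and hypothetical field names (`flow`, `pressure`, `co2`, `temperature`). The sample values are made up. It also computes the error of the naive "today equals yesterday" forecast, which is a useful baseline for any model to beat.

```python
from typing import Dict, List


def add_previous_day_flow(days: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Pair each day's measurements with the flow of the day before.

    Assumes `days` is sorted by date; the first day is dropped because
    it has no previous day.
    """
    rows = []
    for previous, current in zip(days, days[1:]):
        rows.append({
            "flow_prev_day": previous["flow"],      # lagged input
            "pressure": current["pressure"],        # same-day inputs
            "co2": current["co2"],
            "temperature": current["temperature"],
            "target_flow": current["flow"],         # value to predict
        })
    return rows


def persistence_mae(rows: List[Dict[str, float]]) -> float:
    """Mean absolute error of the naive forecast 'today equals yesterday'."""
    errors = [abs(r["target_flow"] - r["flow_prev_day"]) for r in rows]
    return sum(errors) / len(errors)


# Made-up daily measurements, for illustration only.
days = [
    {"flow": 100.0, "pressure": 50.0, "co2": 1.2, "temperature": 10.0},
    {"flow": 110.0, "pressure": 52.0, "co2": 1.1, "temperature": 8.0},
    {"flow": 105.0, "pressure": 51.0, "co2": 1.3, "temperature": 9.0},
]
rows = add_previous_day_flow(days)
baseline_error = persistence_mae(rows)
print(len(rows), baseline_error)
```

Each row in `rows` is then an ordinary regression example, so any regression learner can be trained on it.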

  • maurits_freriks Member Posts: 28 Learner III

    Hi @sgenzer,

     

    Thanks for your quick reply! I really appreciate your effort!

    Could you be so kind as to share the contact details of your client via PM? Maybe he could help me out and give some tips and tricks!

     

    Thanks!
