
Information for a bachelor thesis

hoetzels Member Posts: 2 Contributor I
edited November 2018 in Help

Hello everybody,

At the moment I am writing my bachelor thesis for a German company.

My topic is to show ways in which large amounts of data can be summarized. The data are not stored in a database; they arrive, for example, in an email inbox as PDF or Office (Word/Excel) files. The person who sends the data should not have to do any work to convert or fit the data into a special format.

Is it possible to use a RapidMiner program to extract the crucial information from a mass of data? And can I trace information back to its source document?

I would be very grateful for any information.

Thanks

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,

    Yes, this is possible in general. All you need is to design a process that extracts the important content from the text documents. If you then install RapidAnalytics, it can automatically listen to an email inbox and retrieve and process each incoming mail.
    The real problem lies in finding a good data mining process for the content extraction...

    Greetings,
      Sebastian
  • hoetzels Member Posts: 2 Contributor I
    Thanks for the response.

    What do you mean by a good data mining process (just in a few words)?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    That's easy: a good process is one that fulfills all goals of a given task with low memory consumption and runtime. Non-functional properties, such as a simple process setup that is easy to maintain, can be added too.

    Greetings,
    Sebastian
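
On the original question of tracing information back to the document: outside of RapidMiner, the idea can be illustrated with a small plain-Python sketch. It is not a RapidMiner or RapidAnalytics process, and it makes several assumptions: the mail is already available as raw bytes, the names `SourceRecord`, `collect_attachments` and `build_demo_mail` are hypothetical, and the actual text extraction from PDF or Office files is left out. It only shows how each attachment could be stored together with enough provenance (message ID, sender, file name, checksum) to link any later result back to its source.

```python
import hashlib
from dataclasses import dataclass
from email import message_from_bytes, policy
from email.message import EmailMessage


@dataclass
class SourceRecord:
    """Provenance for one attachment (hypothetical structure)."""
    message_id: str
    sender: str
    filename: str
    content_type: str
    sha256: str
    payload: bytes


def collect_attachments(raw_mail: bytes) -> list[SourceRecord]:
    """Parse one raw mail and return a provenance record per attachment."""
    msg = message_from_bytes(raw_mail, policy=policy.default)
    records = []
    for part in msg.iter_attachments():
        data = part.get_payload(decode=True)  # decoded attachment bytes
        records.append(SourceRecord(
            message_id=str(msg["Message-ID"]),
            sender=str(msg["From"]),
            filename=part.get_filename() or "unnamed",
            content_type=part.get_content_type(),
            sha256=hashlib.sha256(data).hexdigest(),
            payload=data,
        ))
    return records


def build_demo_mail() -> bytes:
    """Build an in-memory demo mail with one fake PDF attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "inbox@example.com"
    msg["Subject"] = "Monthly report"
    msg["Message-ID"] = "<demo-1@example.com>"
    msg.set_content("Report attached.")
    msg.add_attachment(
        b"%PDF-1.4 demo bytes",  # placeholder bytes, not a real PDF
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    for rec in collect_attachments(build_demo_mail()):
        print(rec.message_id, rec.filename, rec.content_type, rec.sha256[:12])
```

Whatever extraction step runs afterwards can carry the `message_id` and `sha256` along with each extracted value, so a summarized figure can still be traced to the exact file it came from.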