Information for bachelor thesis
Hello everybody,
at the moment I am writing my bachelor thesis for a German company.
My topic is to show ways in which huge amounts of data can be summarized. The data are not stored in a database; they arrive, for example, in an email inbox as PDF or Office (Word/Excel) files. The person who sends the data should not have to do any work to change the data or fit it into a special format.
Is it possible to use a RapidMiner program to extract the crucial information from a mass of data? And can I trace the information back to the source document?
I would be very grateful for any information.
Thanks
Answers
Yes, this is possible in general. All you need to do is design a process that can extract the important content from the text documents. If you then install RapidAnalytics, it can automatically listen to an email inbox and retrieve and process each incoming mail.
The real problem lies in finding a good data mining process for the content extraction...
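To make the idea of content extraction with traceability a bit more concrete, here is a minimal sketch in plain Python rather than RapidMiner. It assumes the text has already been pulled out of the PDF or Office files; the file names, the stop-word list, and the functions `tokenize` and `summarize` are all hypothetical. It simply counts terms and remembers which document each term came from, which is one simple way to trace a result back to its source.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, very small stop-word list; a real process would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is",
              "for", "on", "by", "was", "were", "with"}

def tokenize(text):
    """Lower-case the text and return its alphabetic tokens without stop words."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def summarize(documents, top_n=3):
    """Return (term, count, source documents) for the top_n most frequent terms.

    documents: dict mapping a document identifier (e.g. a file name)
    to the plain text already extracted from that document.
    """
    counts = Counter()
    sources = defaultdict(set)  # term -> identifiers of documents containing it
    for doc_id, text in documents.items():
        for token in tokenize(text):
            counts[token] += 1
            sources[token].add(doc_id)
    return [(term, n, sorted(sources[term]))
            for term, n in counts.most_common(top_n)]

# Made-up example documents, standing in for mail attachments.
docs = {
    "invoice_01.pdf": "Invoice for the delivery of steel pipes",
    "report_02.docx": "Delivery report: steel pipes were delivered on time",
    "note_03.xlsx": "Budget for the office",
}

for term, n, origin in summarize(docs):
    print(term, n, origin)
```

Each summarized term carries the list of documents it was found in, so a reader of the summary can go back to the original file. Plain term frequency is only a stand-in here; the actual extraction logic is the hard part.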
Greetings,
Sebastian
What do you mean by a good data mining process (just in a few words)?
That's easy: a good process is one that fulfills all goals of a given task with low memory consumption and runtime. Some non-functional properties, such as an easy process setup that keeps it maintainable, can be added too.
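If you want to compare candidate processes on those two criteria, a rough sketch like the following can help. It is plain Python, not RapidMiner; `measure` and `count_words` are hypothetical names, and `count_words` is only a placeholder for a real extraction step. It uses the standard `time` and `tracemalloc` modules to record runtime and peak memory of a single run.

```python
import time
import tracemalloc

def measure(func, *args, **kwargs):
    """Run func once and return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        # Always stop tracing, even if func raised an exception.
        tracemalloc.stop()
    return result, elapsed, peak

def count_words(texts):
    """Hypothetical stand-in for a content-extraction process."""
    return sum(len(t.split()) for t in texts)

result, seconds, peak = measure(count_words, ["steel pipes delivered"] * 1000)
print(f"result={result} runtime={seconds:.4f}s peak={peak} bytes")
```

A single run is noisy, so for a fair comparison you would repeat the measurement several times on representative data.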
Greetings,
Sebastian