Summary of texts in RapidMiner
m_keshavarz_com
Member Posts: 28 Learner III
Hello,
I searched the forum but did not find what I was looking for, so I apologize if this question has been asked before.
I want to summarize my articles so that I can then analyze them, but I do not know how to summarize text in RapidMiner.
Is this possible?
I know the Aylien extension offers sentiment analysis, but I do not know how to summarize with it.
Can anyone help me?
Or should I use Python or R? Would it be possible to share a simple example?
Thank you
Answers
Hi @m_keshavarz_com,
First, my hunch is that summarizing a text is feasible with RapidMiner's native operators (I am thinking of Tokenize in non letters mode and Tokenize in linguistic sentences mode...).
In the meantime, here is a Python script that uses the NLTK library.
There are a few things to do before running this process:
1. Install Python on your computer.
2. Install NLTK on your computer (pip install nltk).
3. Download and install the necessary NLTK packages (stopwords etc.): to do this, uncomment and execute these
2 lines of code in the Execute Python operator:
Once these packages have installed successfully, comment these 2 lines out again.
4. Set your "text attribute" (with quotes) and your "sum up ratio" in the parameters of the Set Macros operator:
Note: As a guide, you can vary the "sum up ratio" between 0.1 (very short summary) and 10 (very long summary, close to the original text).
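For readers who cannot open the attached process, here is a minimal sketch of one common approach, frequency-based extractive summarization. It is not the script from the process; the function names, the small stopword list, and the `keep_fraction` parameter are all assumptions made for illustration (and `keep_fraction` is not the same thing as the "sum up ratio" above). To run without any downloads, it uses a regular expression and a hard-coded stopword set where an NLTK version would typically use `nltk.sent_tokenize` and `nltk.corpus.stopwords` (after `nltk.download("punkt")` and `nltk.download("stopwords")`).

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); NLTK's stopwords corpus is far larger.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was", "with",
}


def content_words(text):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, keep_fraction=0.3):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    # Word frequencies over the whole text.
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    n_keep = max(1, round(len(sentences) * keep_fraction))
    # Rank sentence indices by score, then sort the kept indices
    # so the summary follows the order of the original text.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_keep])
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(text, keep_fraction=0.5)` keeps roughly half of the sentences. Because the code works with sentence indices rather than the sentence strings themselves, the summary keeps the order of the source text.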
For example, here is the result with a "sum up ratio" of 1:
I hope it helps,
Regards,
Lionel
Hi,
I forgot to share the process.
Regards,
Lionel
Hi @m_keshavarz_com,
I have good news and bad news:
the good news is that summarizing a text is theoretically possible with RapidMiner's native operators;
the bad news is that the sentences of the resulting summary come out in the wrong order.
Here is the process:
You can adapt it to work with Twitter.
Regards,
Lionel
Dear all,
In my previous post, I shared a process to "sum up" a text using only RapidMiner's operators.
The sentences of the resulting summary are out of order (not in the same order as in the original text, which is... unfortunate).
After some investigation, the "guilty party" is the Process Documents to Data operator (combined with Tokenize (linguistic sentences)).
Indeed, after processing, this operator sorts the "sentence attributes" alphabetically, so the original order is lost.
So my question is: is there a way to preserve the original order of the sentences? In other words, can these two operators return the results (the "sentence attributes") in the same order as in the original text?
Thank you for your answers.
Regards,
Lionel
NB : I am, of course, open to alternative ways to "sum up" a text...
NB2 : If needed, the process is shared in my previous post.
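On the ordering problem described above, one possible workaround (a sketch, not a setting of the RapidMiner operators) is to restore the order afterwards, for example in an Execute Python operator: look up where each selected sentence first occurs in the original text and sort by that position. The function name `restore_order` is hypothetical, and the sketch assumes the sentence strings survive tokenization verbatim, so that they can still be found in the original text.

```python
def restore_order(original_text, sentences):
    """Sort sentences by where each one first appears in the original text."""

    def position(sentence):
        idx = original_text.find(sentence)
        # Sentences not found verbatim (e.g. altered by preprocessing) go last.
        return idx if idx >= 0 else len(original_text)

    # sorted() is stable, so sentences with equal positions keep their input order.
    return sorted(sentences, key=position)


original = "Zebras run fast. Apples are red. Monkeys climb trees."
# Simulate the alphabetical ordering described in the post.
alphabetical = sorted(["Zebras run fast.", "Apples are red.", "Monkeys climb trees."])
restored = restore_order(original, alphabetical)
print(restored)
```

If the same sentence occurs twice in the text, both copies map to the first occurrence, so this only approximates the original order in that case.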