question about FOG/readability testing

kevinace · November 2020

Dear all,
I am writing a research paper on text readability.
1. How do I use RapidMiner to count complex words (words with more than 3 syllables)? I googled "rapidminer readability" and "rapidminer complex words" but did not find a relevant page.
2. How do I measure whether the target content contains sentiment words? (I have the lists of keywords in 4 different tabs of an Excel spreadsheet.)

I have seen a few websites that offer free measurement of the FOG index (and others), but if I simply use others' work, there is no fun in learning RapidMiner. So, thanks in advance for any advice for a beginner like me.

Thanks!

Kevin

MartinLiebig · December 2020

Hi,

RapidMiner has (mostly) two ways of storing data: ExampleSets, aka data tables, which are the blue table objects, and Documents, which are grey and are designed to handle documents.

On 1): Read Office gives you a Document, not an ExampleSet. To use it with Append, you need to convert it into a table with Documents to Data.

On 2): Extract Sentiment wraps existing models, each with a given dictionary. You can create your own model using Dictionary Based Sentiment.

On 3): Likely yes. How complex this is depends on your equation.

On 4): You can do things like extract the length of the document, the length of sentences and so on, yes. I think you cannot easily split words into syllables (but I might be mistaken here).

Cheers,

Martin
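
For points 3) and 4), here is a minimal Python sketch of one way the counting could be done outside the built-in operators. It is not a RapidMiner feature: the function names `count_syllables` and `fog_index` are made up for illustration, and the syllable count is a rough vowel-group heuristic that will be wrong for some English words. The usual Gunning fog definition treats words of three or more syllables as complex, so the threshold is left as a parameter; the refinements of the full definition (skipping proper nouns, compound words and common suffixes) are not implemented. If the Python Scripting extension is installed, something like this could presumably be called from an Execute Python operator, but that wiring is an assumption and is not shown.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels in the word."""
    w = re.sub(r"[^a-z]", "", word.lower())
    if not w:
        return 0
    n = len(VOWEL_GROUP.findall(w))
    # A final 'e' is usually silent ("measure"), except in "-le" ("table").
    if w.endswith("e") and not w.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def fog_index(text: str, min_syllables: int = 3) -> float:
    """Gunning fog: 0.4 * (words per sentence + percentage of complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= min_syllables]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))


if __name__ == "__main__":
    sample = "The cat sat. Readability is complicated."
    print(count_syllables("readability"), round(fog_index(sample), 2))
```

To count only words with more than 3 syllables, as in the original question, call `fog_index(text, min_syllables=4)`.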

MartinLiebig · November 2020

Hi,

There is no built-in readability index for documents.

For sentiment, check the operators Extract Sentiment and Dictionary Based Sentiment. Both are part of the Operator Toolbox extension.

Cheers,

Martin
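
To illustrate the general idea behind scoring text against your own keyword lists, here is a small Python sketch. It does not show how Extract Sentiment or Dictionary Based Sentiment work internally; it only counts how often keywords from each list occur. The function name `count_category_hits` and the category names are hypothetical. A spreadsheet with one list per tab could be loaded into the same shape with `pandas.read_excel(path, sheet_name=None)`, which returns one DataFrame per sheet; the column layout of the actual file is unknown, so that step is left out.

```python
import re
from collections import Counter


def count_category_hits(text, categories):
    """Count how many tokens of `text` appear in each keyword category.

    `categories` maps a category name (e.g. one per spreadsheet tab)
    to a list of single-word keywords. Matching is case-insensitive.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {
        name: sum(counts[k] for k in {kw.lower() for kw in keywords})
        for name, keywords in categories.items()
    }


if __name__ == "__main__":
    # Hypothetical keyword lists standing in for the spreadsheet tabs.
    lists = {"positive": ["good", "wonderful"], "negative": ["bad", "error"]}
    print(count_category_hits("A wonderful result, not a bad one. Good!", lists))
```

Multi-word keywords and negation ("not bad") are not handled in this sketch.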

kevinace · November 2020

Dear Martin,

1. How do I set up the parameters for Append and Extract Sentiment?
What I did: Read Office file - Append - Extract Sentiment - res
Read Office file: an article.doc
Append: auto (default)
Error: wrong data: wrong input of type 'document' at port. (see screenshot)

2. I tried the Sentiment Analysis template, which came up with a 'positive' prediction, confidence (-ve) 0.413, confidence (+ve) 0.587, which is wonderful.
But is there a way I can import multiple data sets?
Also, is there a way to replace the sentiment keywords with the 4-tab Excel spreadsheet I have prepared?

3. Although there is no built-in readability index in RapidMiner, if I have the formula ready, is there a way to use it?

4. Although there is no built-in readability index, is there a way to count syllables in RapidMiner? For example, there are 10 words in total, 3 of which have more than 3 syllables.

Much appreciated!

Kevin
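
As a worked example of the kind of formula meant in point 3, take the numbers from point 4 (10 words, 3 of them complex) and assume, purely for illustration, that they form a single sentence. The standard Gunning fog formula would then give:

```latex
\[
\text{FOG} = 0.4\left(\frac{\text{words}}{\text{sentences}}
           + 100\cdot\frac{\text{complex words}}{\text{words}}\right)
           = 0.4\left(\frac{10}{1} + 100\cdot\frac{3}{10}\right)
           = 0.4\,(10 + 30) = 16
\]
```

The sentence count is an assumption; with more sentences the first term, and so the index, would be lower.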
