Analyze Comment Data from Facebook
Hello Rapidminer community!
I have been trying to figure out for some days how to:
1. Pull comments from Facebook posts into Rapidminer (Solved!)
2. Use that post text to extract keywords and the corresponding sentiment (including number of times a keyword is mentioned and the number of times it had a positive / negative / neutral usage).
The 1st step has already been solved. It is the 2nd that I am having issues with. I have already been able to get the sentiment for each comment, but extracting the keywords has been an issue. I would like the information to output like the process in this link:
http://blog.aylien.com/building-a-text-analysis-process-for-customer-reviews-in-rapidminer/
I have attached a picture of the data I have extracted from Facebook so you can understand where I am beginning. Also included is the csv of this information.
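In case it helps clarify what I mean by step 2, here is a rough sketch in Python of the kind of tally I am after (the comments, sentiment labels, and stopword list are made up for illustration; in my actual process the sentiment comes from the Aylien operator):

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (comment text, sentiment label) pairs,
# standing in for the Facebook comments plus the sentiment output.
rows = [
    ("love the new exhibit", "positive"),
    ("the exhibit queue was awful", "negative"),
    ("exhibit opens tomorrow", "neutral"),
]

# Tiny illustrative stopword list.
STOPWORDS = {"the", "was", "new"}

def keyword_sentiment_counts(rows):
    """Count how often each keyword appears, overall and per sentiment."""
    totals = Counter()
    by_sentiment = defaultdict(Counter)
    for text, sentiment in rows:
        for token in text.lower().split():
            if token in STOPWORDS:
                continue
            totals[token] += 1
            by_sentiment[token][sentiment] += 1
    return totals, by_sentiment

totals, by_sentiment = keyword_sentiment_counts(rows)
print(totals["exhibit"])                    # 3 mentions in total
print(by_sentiment["exhibit"]["negative"])  # 1 negative usage
```

So for each keyword I want the total mention count plus how many of those mentions were positive, negative, or neutral.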
Any help would be greatly appreciated!
Sincerely,
Gloria
Best Answer
sgenzer (Community Manager)
hello @Gloria_Fock - great. That makes life a lot easier. This should get you in a good direction. You'll have to reset the Aylien connections back to yours and the csv to where it is stored on your computer.
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_csv" compatibility="8.0.001" expanded="true" height="68" name="Read CSV" width="90" x="45" y="85">
<parameter key="csv_file" value="/Users/GenzerConsulting/Desktop/Documenta14_smalltrst.csv"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
<parameter key="1" value="Comment"/>
</list>
<parameter key="encoding" value="UTF-8"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="level.true.integer.attribute"/>
<parameter key="1" value="id.true.integer.attribute"/>
<parameter key="2" value="parent_id.true.polynominal.attribute"/>
<parameter key="3" value="object_id.true.polynominal.attribute"/>
<parameter key="4" value="object_type.true.polynominal.attribute"/>
<parameter key="5" value="query_status.true.polynominal.attribute"/>
<parameter key="6" value="query_time.true.polynominal.attribute"/>
<parameter key="7" value="query_type.true.polynominal.attribute"/>
<parameter key="8" value="name.true.attribute_value.attribute"/>
<parameter key="9" value="message.true.polynominal.attribute"/>
</list>
</operator>
<operator activated="true" class="com.aylien.textapi.rapidminer:aylien_sentiment" compatibility="0.2.000" expanded="true" height="68" name="Analyze Sentiment (2)" width="90" x="179" y="85">
<parameter key="connection" value="Aylien Jan 2016"/>
<parameter key="input_attribute" value="message"/>
</operator>
<operator activated="true" class="text:data_to_documents" compatibility="7.5.000" expanded="true" height="68" name="Data to Documents" width="90" x="313" y="85">
<parameter key="select_attributes_and_weights" value="true"/>
<list key="specify_weights">
<parameter key="message" value="1.0"/>
</list>
</operator>
<operator activated="true" class="com.aylien.textapi.rapidminer:aylien_document_classify_by_taxonomy" compatibility="0.2.000" expanded="true" height="82" name="Categorize (Document)" width="90" x="447" y="85">
<parameter key="connection" value="Aylien Jan 2016"/>
</operator>
<operator activated="true" class="text:documents_to_data" compatibility="7.5.000" expanded="true" height="82" name="Documents to Data" width="90" x="581" y="85">
<parameter key="text_attribute" value="Text"/>
</operator>
<operator activated="true" class="nominal_to_text" compatibility="8.0.001" expanded="true" height="82" name="Nominal to Text" width="90" x="715" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Text"/>
</operator>
<operator activated="true" class="text:process_document_from_data" compatibility="7.5.000" expanded="true" height="82" name="Process Documents from Data" width="90" x="849" y="85">
<parameter key="keep_text" value="true"/>
<parameter key="prune_method" value="percentual"/>
<list key="specify_weights"/>
<process expanded="true">
<operator activated="true" class="text:tokenize" compatibility="7.5.000" expanded="true" height="68" name="Tokenize" width="90" x="45" y="34"/>
<operator activated="true" class="text:filter_stopwords_english" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (English)" width="90" x="179" y="34"/>
<operator activated="true" class="text:filter_stopwords_german" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (German)" width="90" x="380" y="34"/>
<operator activated="true" class="text:filter_stopwords_arabic" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (Arabic)" width="90" x="514" y="34"/>
<connect from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
<connect from_op="Filter Stopwords (English)" from_port="document" to_op="Filter Stopwords (German)" to_port="document"/>
<connect from_op="Filter Stopwords (German)" from_port="document" to_op="Filter Stopwords (Arabic)" to_port="document"/>
<connect from_op="Filter Stopwords (Arabic)" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read CSV" from_port="output" to_op="Analyze Sentiment (2)" to_port="Example Set"/>
<connect from_op="Analyze Sentiment (2)" from_port="Example Set" to_op="Data to Documents" to_port="example set"/>
<connect from_op="Data to Documents" from_port="documents" to_op="Categorize (Document)" to_port="documents 1"/>
<connect from_op="Categorize (Document)" from_port="documents" to_op="Documents to Data" to_port="documents 1"/>
<connect from_op="Documents to Data" from_port="example set" to_op="Nominal to Text" to_port="example set input"/>
<connect from_op="Nominal to Text" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
<connect from_op="Process Documents from Data" from_port="example set" to_port="result 1"/>
<connect from_op="Process Documents from Data" from_port="word list" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>

Scott
Answers
hello @Gloria_Fock welcome to the community! I'd recommend posting your XML process here (see "Read Before Posting" on right when you reply). This way we can replicate what you're doing and help you better.
Scott
Here we go!
This is just the part where I input the Facebook comments from a database and process the sentiment. I would like to then take the output from this process and make it look like the data in this tutorial.
http://blog.aylien.com/building-a-text-analysis-process-for-customer-reviews-in-rapidminer/
The author has posted his sample process at the bottom. I have been trying to make this work for days now but this is a bit above my pay grade :womanembarrassed: so any help would be appreciated!
Sincerely,
Gloria
oh and you do know that we have a Facebook operator now, right?
https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_facebook
Scott
Hi Scott!
I would like to thank you for your prompt reply and the helpful answer! This has indeed worked for what I am attempting to use Rapidminer for. As for the Facebook operator, I was only able to get it to return a maximum of 25 results per query. Instead, I used Facepager to extract the data I needed from Facebook.
Thanks again and have a great weekend,
Gloria
oh good! And...what is Facepager?
Scott