Analyze Comment Data from Facebook
Hello Rapidminer community!
I have been trying to figure out for some days how to:
1. Pull comments from Facebook posts into Rapidminer (Solved!)
2. Use that post text to extract keywords and the corresponding sentiment (including number of times a keyword is mentioned and the number of times it had a positive / negative / neutral usage).
The 1st step has already been solved. It is the 2nd that I am having issues with. I have already been able to get the sentiment for each comment, but extracting the keywords has been an issue. I would like the information to output like the process in this link:
http://blog.aylien.com/building-a-text-analysis-process-for-customer-reviews-in-rapidminer/
I have attached a picture of the data I have extracted from Facebook so you can understand where I am beginning. Also included is the csv of this information.
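In case it helps clarify what I mean by step 2, here is a rough sketch in Python of the kind of tally I am after (the comments, sentiment labels, and stopword list are made up for illustration; in my actual process the sentiment comes from the Aylien operator):

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (comment text, sentiment label) pairs,
# standing in for the Facebook comments plus the sentiment output.
rows = [
    ("love the new exhibit", "positive"),
    ("the exhibit queue was awful", "negative"),
    ("exhibit opens tomorrow", "neutral"),
]

# Tiny illustrative stopword list.
STOPWORDS = {"the", "was", "new"}

def keyword_sentiment_counts(rows):
    """Count how often each keyword appears, overall and per sentiment."""
    totals = Counter()
    by_sentiment = defaultdict(Counter)
    for text, sentiment in rows:
        for token in text.lower().split():
            if token in STOPWORDS:
                continue
            totals[token] += 1
            by_sentiment[token][sentiment] += 1
    return totals, by_sentiment

totals, by_sentiment = keyword_sentiment_counts(rows)
print(totals["exhibit"])                    # 3 mentions in total
print(by_sentiment["exhibit"]["negative"])  # 1 negative usage
```

So for each keyword I want the total mention count plus how many of those mentions were positive, negative, or neutral.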
Any help would be greatly appreciated!
Sincerely,
Gloria
Best Answer
sgenzer (Community Manager)
hello @Gloria_Fock - great. That makes life a lot easier. This should get you in a good direction. You'll have to reset the Aylien connections back to yours and the csv to where it is stored on your computer.
<?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_csv" compatibility="8.0.001" expanded="true" height="68" name="Read CSV" width="90" x="45" y="85">
<parameter key="csv_file" value="/Users/GenzerConsulting/Desktop/Documenta14_smalltrst.csv"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
<parameter key="1" value="Comment"/>
</list>
<parameter key="encoding" value="UTF-8"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="level.true.integer.attribute"/>
<parameter key="1" value="id.true.integer.attribute"/>
<parameter key="2" value="parent_id.true.polynominal.attribute"/>
<parameter key="3" value="object_id.true.polynominal.attribute"/>
<parameter key="4" value="object_type.true.polynominal.attribute"/>
<parameter key="5" value="query_status.true.polynominal.attribute"/>
<parameter key="6" value="query_time.true.polynominal.attribute"/>
<parameter key="7" value="query_type.true.polynominal.attribute"/>
<parameter key="8" value="name.true.attribute_value.attribute"/>
<parameter key="9" value="message.true.polynominal.attribute"/>
</list>
</operator>
<operator activated="true" class="com.aylien.textapi.rapidminer:aylien_sentiment" compatibility="0.2.000" expanded="true" height="68" name="Analyze Sentiment (2)" width="90" x="179" y="85">
<parameter key="connection" value="Aylien Jan 2016"/>
<parameter key="input_attribute" value="message"/>
</operator>
<operator activated="true" class="text:data_to_documents" compatibility="7.5.000" expanded="true" height="68" name="Data to Documents" width="90" x="313" y="85">
<parameter key="select_attributes_and_weights" value="true"/>
<list key="specify_weights">
<parameter key="message" value="1.0"/>
</list>
</operator>
<operator activated="true" class="com.aylien.textapi.rapidminer:aylien_document_classify_by_taxonomy" compatibility="0.2.000" expanded="true" height="82" name="Categorize (Document)" width="90" x="447" y="85">
<parameter key="connection" value="Aylien Jan 2016"/>
</operator>
<operator activated="true" class="text:documents_to_data" compatibility="7.5.000" expanded="true" height="82" name="Documents to Data" width="90" x="581" y="85">
<parameter key="text_attribute" value="Text"/>
</operator>
<operator activated="true" class="nominal_to_text" compatibility="8.0.001" expanded="true" height="82" name="Nominal to Text" width="90" x="715" y="85">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Text"/>
</operator>
<operator activated="true" class="text:process_document_from_data" compatibility="7.5.000" expanded="true" height="82" name="Process Documents from Data" width="90" x="849" y="85">
<parameter key="keep_text" value="true"/>
<parameter key="prune_method" value="percentual"/>
<list key="specify_weights"/>
<process expanded="true">
<operator activated="true" class="text:tokenize" compatibility="7.5.000" expanded="true" height="68" name="Tokenize" width="90" x="45" y="34"/>
<operator activated="true" class="text:filter_stopwords_english" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (English)" width="90" x="179" y="34"/>
<operator activated="true" class="text:filter_stopwords_german" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (German)" width="90" x="380" y="34"/>
<operator activated="true" class="text:filter_stopwords_arabic" compatibility="7.5.000" expanded="true" height="68" name="Filter Stopwords (Arabic)" width="90" x="514" y="34"/>
<connect from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
<connect from_op="Filter Stopwords (English)" from_port="document" to_op="Filter Stopwords (German)" to_port="document"/>
<connect from_op="Filter Stopwords (German)" from_port="document" to_op="Filter Stopwords (Arabic)" to_port="document"/>
<connect from_op="Filter Stopwords (Arabic)" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<connect from_op="Read CSV" from_port="output" to_op="Analyze Sentiment (2)" to_port="Example Set"/>
<connect from_op="Analyze Sentiment (2)" from_port="Example Set" to_op="Data to Documents" to_port="example set"/>
<connect from_op="Data to Documents" from_port="documents" to_op="Categorize (Document)" to_port="documents 1"/>
<connect from_op="Categorize (Document)" from_port="documents" to_op="Documents to Data" to_port="documents 1"/>
<connect from_op="Documents to Data" from_port="example set" to_op="Nominal to Text" to_port="example set input"/>
<connect from_op="Nominal to Text" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
<connect from_op="Process Documents from Data" from_port="example set" to_port="result 1"/>
<connect from_op="Process Documents from Data" from_port="word list" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>

Scott
Answers
hello @Gloria_Fock welcome to the community! I'd recommend posting your XML process here (see "Read Before Posting" on right when you reply). This way we can replicate what you're doing and help you better.
Scott
Here we go!
This is just the part where I input the Facebook comments from a database and process the sentiment. I would like to then take the output from this process and make it look like the data in this tutorial.
http://blog.aylien.com/building-a-text-analysis-process-for-customer-reviews-in-rapidminer/
The author has posted his sample process at the bottom. I have been trying to make this work for days now but this is a bit above my pay grade :womanembarrassed: so any help would be appreciated!
Sincerely,
Gloria
oh and you do know that we have a Facebook operator now, right?
https://marketplace.rapidminer.com/UpdateServer/faces/product_details.xhtml?productId=rmx_facebook
Scott
Hi Scott!
I would like to thank you for your prompt reply and the helpful answer! This has indeed worked for what I am attempting to use Rapidminer for. As for the Facebook operator, I was only able to get it to return a maximum of 25 results per query. Instead, I used Facepager to extract the data I needed from Facebook.
Thanks again and have a great weekend,
Gloria
oh good! And...what is Facepager?
Scott