Comparing movie performance
faizharry4
Member Posts: 5 Learner III
Hi, I'm doing a project in RapidMiner using Search Twitter and sentiment analysis. I'm trying to find a way to show that Marvel movies are better than DC movies, and I'm also trying to extract new attributes from the data that has been collected. For example: what kinds of common words are used to describe the Avengers? Which words are used in positive, negative, and neutral tweets? So far I have no idea how to do that. I have already collected the data using Search Twitter and sentiment analysis, but the later part is a puzzler. Can you please help me?
Answers
@faizharry4 that's an interesting problem. It'll be hard comparing sentiment for Spiderman tweets vs Superman tweets. Have you thought about extracting the sentiment scores for DC vs Marvel movies and doing a weighted rolling average? Like 1,000 positive out of 20,000 tweets for DC vs 500 positive out of 6,000 tweets for Marvel, doing it per day and trending it. This way you might be able to see a rate of change before and after a movie is released.
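For anyone who wants to prototype the per-day idea outside RapidMiner, here is a minimal Python sketch. It assumes each tweet has already been reduced to a `(day, polarity)` pair with polarity values `"positive"`, `"negative"`, or `"neutral"`; the function names and the toy data are made up for illustration. Pooling the counts over the window gives a rate weighted by tweet volume, so a day with 5 tweets does not count as much as a day with 5,000.

```python
from collections import defaultdict
from datetime import date


def daily_counts(tweets):
    """Return {day: [positive_count, total_count]} from (day, polarity) pairs."""
    counts = defaultdict(lambda: [0, 0])
    for day, polarity in tweets:
        counts[day][1] += 1
        if polarity == "positive":
            counts[day][0] += 1
    return dict(counts)


def rolling_positive_rate(counts, window=3):
    """Volume-weighted positive rate over the last `window` observed days.

    Note: the window covers days that have tweets, not calendar days.
    """
    days = sorted(counts)
    rates = {}
    for i, day in enumerate(days):
        span = days[max(0, i - window + 1): i + 1]
        positive = sum(counts[d][0] for d in span)
        total = sum(counts[d][1] for d in span)
        rates[day] = positive / total
    return rates


# Toy data, invented for illustration only.
sample = [
    (date(2018, 4, 26), "positive"),
    (date(2018, 4, 26), "negative"),
    (date(2018, 4, 27), "positive"),
    (date(2018, 4, 27), "positive"),
    (date(2018, 4, 27), "positive"),
    (date(2018, 4, 27), "neutral"),
]
trend = rolling_positive_rate(daily_counts(sample), window=2)
for day, rate in trend.items():
    print(day, round(rate, 3))
```

Running this once per movie and plotting the two trends side by side would give the before/after release comparison described above.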
Basically I'm trying to compare Infinity War vs Justice League. What I have done so far is retrieve data from Twitter using Search Twitter, analyze sentiment using Aylien, then use Data to Documents, then Categorize (Document), followed by the Documents to Data operator, and finally Write Excel to store the retrieved data. So now I have 200 tweets for each movie, and I'm stuck on the next move, which is how to compare the two movies.
@faizharry4 200 tweets for each movie sounds awfully low. Maybe start generating a Wordlist for each movie and see what are the most common words used to describe each movie?
@Thomas_Ott the 200 tweets are only a starting point before it gets expanded. I will add more tweets once I have figured out the solution. Anyway, as you suggested, how do I generate a Wordlist for each movie and see the most common words used to describe each movie in RapidMiner?
And can we import data directly from Metacritic, IMDb, and Rotten Tomatoes so that I can compare the performance of the two movies, and then import other data from any website that has the gross of both films?
@faizharry4 use the Process Documents from Data operator, and embed a tokenizer and other text processing operators. Then output the WOR port.
@faizharry4 also, you can get IMDb and Rotten Tomatoes info by using the Web Mining extension, you just have to create the process.
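To make the word list idea concrete, here is a rough Python equivalent of "tokenize, filter, count". This is only a sketch of the concept, not what the RapidMiner operator does internally; the stopword set, the regular expressions, and the function names are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword set; a real run would use a much fuller list.
STOPWORDS = {"the", "and", "was", "are", "for", "with", "this", "that", "but"}


def tokenize(text):
    """Lowercase, drop URLs, @mentions and #hashtags, keep alphabetic tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if len(t) > 2 and t not in STOPWORDS]


def word_list(tweets, top_n=10):
    """Return the top_n (word, count) pairs across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts.most_common(top_n)


# Toy tweets, invented for illustration only.
tweets = [
    "Infinity War was amazing",
    "amazing cast, amazing ending http://t.co/x @marvel",
]
print(word_list(tweets, top_n=3))
```

Building one such list per movie and comparing the top entries is the "most common words per movie" comparison suggested earlier in the thread.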
@Thomas_Ott thanks. I have tried to create a process for the word count, but I came up blank. I tried to do a word association (which words are associated with positive, negative, and neutral polarity), but the result is empty.
@faizharry4 you need the Process Documents from Data operator, not Process Documents from Files.
Also, you will probably need to use a Nominal to Text conversion operator.
@Thomas_Ott I've tried other methods, but it seems my luck is not there; it still won't give the result that I want. Using sentiment analysis, it categorizes the polarity based on the tweet. Is it possible to find out which words are associated with neutral, positive, and negative?
@faizharry4 if you're passing the sentiment into the Process Documents operator, try setting it as a label role. Or, if you are using the Extract Sentiment operator and set the Vector Creation to Binary Occurrences, you can output the EXA port and see the sentiment for the tweet ID and what word it's attached to.
Like so:
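As a rough illustration of the same idea outside RapidMiner (this is a plain Python sketch, not the process from the thread), binary occurrences can be mimicked by counting each word at most once per tweet and grouping the counts by the tweet's sentiment label. The function name, the input shape `(text, polarity)`, and the toy rows are assumptions made for this example.

```python
import re
from collections import Counter, defaultdict


def words_by_sentiment(rows, top_n=5):
    """For each polarity, count how many tweets contain each word.

    rows: iterable of (text, polarity) pairs.
    Using a set per tweet makes the occurrence binary: a word repeated
    inside one tweet still counts once for that tweet.
    """
    tweet_counts = defaultdict(Counter)
    for text, polarity in rows:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        tweet_counts[polarity].update(tokens)
    return {p: c.most_common(top_n) for p, c in tweet_counts.items()}


# Toy rows, invented for illustration only.
rows = [
    ("great great movie", "positive"),
    ("great cast", "positive"),
    ("boring movie", "negative"),
]
for polarity, words in words_by_sentiment(rows).items():
    print(polarity, words)
```

Words that rank high for one polarity and low for the others are the candidates for "words associated with positive, negative, or neutral".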