The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Predict missing values
hatsjikidee
Member Posts: 3 Learner I
Hello all,
I have a dataset with about 3000 records of rated songs. About half are rated, the other half is not. I'm trying to build a model that predicts the empty ratings based on what users rated. I have done the following:
My question is, is this correct? Do I need to make adjustments to make it more correct? Because when I for example already change the k I get different values. And another question: how do I show only the values that have been predicted instead of a full overview, including the already filled in values.
Thanks in advance!
I have a dataset with about 3000 records of rated songs. About half are rated, the other half is not. I'm trying to build a model that predicts the empty ratings based on what users rated. I have done the following:
My question is, is this correct? Do I need to make adjustments to make it more correct? Because when I for example already change the k I get different values. And another question: how do I show only the values that have been predicted instead of a full overview, including the already filled in values.
Thanks in advance!
Tagged:
0
Best Answer
-
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn@hatsjikidee,
OK, I understand. In theory, your method is the good one....but as you mentionned for each k value, you have different results, but you can not evaluate the "performance" of each prediction.
From my point of view, to create a real recommender model, you need descriptive features of your song(s). For example
you need an associated dataset with for each song, its style (pop, rock etc.), its lenght, its author etc.
Hope this helps,
Regards,
Lionel
PS : There is a useful ressource (a book) for you :
- "RapidMiner, Data mining use cases and business analytics applications", (Chapter 9 : Constructing Recommender Systems in RapidMiner) , from Markus Hofmann and Ralf Klinkenberg.
- the associated extension "Recommenders" (to install from the MarketPlace).8
Answers
If you have some descriptive features of your songs, you can build a model based on your labeled data (your rated songs) and then apply this model to the unlabelled data (the unrated songs).
To help you further can you share your data ?
Hope this helps,
Regards,
Lionel
The dataset has 3 attributes:
Song name - Rating - Name (of the rater)
Every user has about 40 songs, of which 20 rated and 20 not. So the goal is to predict the missing ones based on what the user rated on the ones he did rate. Hope this gives more clarification.
I don't know if you are doing this as part of a class or just for the fun of doing it but in real life part of being a Data Scientist is analysing the problem, identify the data that may or may not predict an outcome and then extract it from it source. Sometimes the Example Set includes all the attributes you may need and sometimes you need to go out to the internet and find it to enhance your analysis.
Hope this helps and if you need help texts us and we'll be glad to guide you on the process.
Best Regards.