[SOLVED] How is term frequency calculated?
kasper2304
Dear Rapid forum
Sorry for raising this question again, but I simply cannot figure out how RapidMiner calculates term frequency.
The question has been addressed here before:
http://rapid-i.com/rapidforum/index.php/topic,4825.0.html
I have set up a really simple example in RapidMiner to try this out: one document contains 5 different terms and 5 terms in total. This yields a tf score of 0.447, and I simply cannot figure out how this happens. It should not be that difficult, but apparently it is...
Best
Kasper
Answers
http://www.youtube.com/watch?v=ToxzfYECxOU
Roland
He only talks about the TF-IDF score, not the TF score. I know they are closely related, but I think I figured it out in the meantime:
In my case the document has 5 terms, so the total number of terms is 5. Each term occurs only once, which gives the following equation for tf:
tf = count(term_i) / sqrt(total number of terms) = 1 / sqrt(5) ≈ 0.447
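For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula as stated in this post. It is not taken from the RapidMiner source. The function names and the example tokens are made up for illustration. The second function is an assumption on my side: it divides by the Euclidean length of the count vector instead, which gives the same value whenever every term occurs exactly once, so this example alone cannot tell the two apart.

```python
import math
from collections import Counter


def tf_sqrt_total(tokens):
    """tf = count(term) / sqrt(total number of terms), as written in the post."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / math.sqrt(total) for term, n in counts.items()}


def tf_unit_length(tokens):
    """Assumed alternative: divide each count by the Euclidean norm of all counts."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(n * n for n in counts.values()))
    return {term: n / norm for term, n in counts.items()}


# Hypothetical document: 5 different terms, 5 terms in total.
doc = ["alpha", "beta", "gamma", "delta", "epsilon"]

tf_a = tf_sqrt_total(doc)
tf_b = tf_unit_length(doc)

print(round(tf_a["alpha"], 3))  # 0.447
print(round(tf_b["alpha"], 3))  # 0.447
```

With a document where some term repeats, the two functions give different values, so that would be the case to test in RapidMiner if you want to know which normalization is actually used.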
I think my problem was that I was trying to read the source code but did not fully understand it.
Anyway, thanks for the hint.
I consider the question answered.
Kasper
Glad you figured it out. Thanks for the details.
Roland