TFIDF Mean

jaskiemr Member Posts: 8 Contributor II
I ran TF-IDF on some text in four files:

1) alpha bravo
2) alpha bravo
3) alpha bravo charlie delta
4) alpha bravo charlie delta

How is the "statistic" field in the Meta Data View output calculated? Is the mean shown there the TF-IDF measure, f[ij] / f[dj] * log(D / f)?

When I run it on "charlie" from above, RapidMiner gives 0.354. When I do the calculation by hand, 1/4 * log(4 / 2), I get 0.075. Is the value normalized somehow, or is the log the natural log or base 2?

Thank you for any input.
        mj
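
For reference, the hand calculation in the question can be checked against the three usual log bases. This is only a sketch of that arithmetic in Python; the function name `tfidf_by_base` is made up for illustration and none of it is RapidMiner code.

```python
import math

tf = 1 / 4      # "charlie" occurs once among the 4 terms of document 3
ratio = 4 / 2   # 4 documents in total, 2 of them contain "charlie"

def tfidf_by_base(tf, ratio):
    """Return tf * log(ratio) for base 10, base e, and base 2."""
    return {
        "log10": tf * math.log10(ratio),
        "ln": tf * math.log(ratio),
        "log2": tf * math.log2(ratio),
    }

values = tfidf_by_base(tf, ratio)
for name, value in values.items():
    print(f"{name}: {value:.3f}")
```

The 0.075 in the question corresponds to a base-10 log. Neither the natural log nor base 2 yields 0.354, so the choice of base alone does not account for the difference.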

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    as I already explained in another topic, the mean is simply the statistical mean of all values in this attribute. Please take a look in the other topic for more information.

    Greetings,
      Sebastian
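
As a rough illustration of "the mean of all values in this attribute", the sketch below builds a TF-IDF vector for each of the four documents and averages the "charlie" column. It assumes that each document vector is scaled to unit (Euclidean) length before averaging. That normalization step is not stated anywhere in this thread; it is an assumption, chosen because it reproduces the reported 0.354. All names in the code are illustrative.

```python
import math

# The four documents from the question.
docs = [
    ["alpha", "bravo"],
    ["alpha", "bravo"],
    ["alpha", "bravo", "charlie", "delta"],
    ["alpha", "bravo", "charlie", "delta"],
]

vocab = sorted({term for doc in docs for term in doc})
n_docs = len(docs)
# Document frequency: number of documents containing each term.
df = {term: sum(term in doc for doc in docs) for term in vocab}

def tfidf_vector(doc):
    """TF-IDF vector for one document, scaled to unit length (assumed step)."""
    raw = [doc.count(t) / len(doc) * math.log(n_docs / df[t]) for t in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    # Documents whose terms all have IDF 0 stay as all-zero vectors.
    return [v / norm if norm > 0 else 0.0 for v in raw]

matrix = [tfidf_vector(doc) for doc in docs]

# The "attribute" is one column of the matrix; its mean is taken over documents.
charlie = [row[vocab.index("charlie")] for row in matrix]
mean_charlie = sum(charlie) / len(charlie)
print(round(mean_charlie, 3))  # 0.354
```

Under this assumption the log base cancels out during normalization: documents 3 and 4 each get 1/sqrt(2) ≈ 0.707 for "charlie", documents 1 and 2 get 0, and the mean over the four documents is about 0.354.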