WORD FREQUENCIES PROBLEM
NewbieStudent
Member Posts: 2 Learner II
in Help
Hi, does anyone know how I can calculate the frequencies of males and females in each row? I want to create two new columns, female and male, holding the counts for each row.
Answers
This is a nice challenge. It can work with a combination of Split, De-Pivot, Split, Aggregate and Pivot.
First you should check your import process. participant_gender should be the attribute name, not the first data entry.
Do you have an ID elsewhere in your data? If not, you can use Generate ID to identify the rows.
Then use Split with the || separator. Split uses regular expression syntax, and | is a special character there, so the separator has to be written escaped as \|\|. This will create a number of additional columns, each holding a single entry such as X::Male or Y::Female.
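To see why the escaping matters, here is a minimal sketch in Python rather than RapidMiner. It assumes the cells look like the X::Male||Y::Female pattern described above; the sample value is made up. Python's regular expressions treat | the same way for this pattern, so the escaped form splits on the literal two-character separator.

```python
import re

# Hypothetical cell value; the poster's real data is not shown in the thread.
cell = "0::Male||1::Female"

# In a regular expression a bare "|" means "or", so each pipe is escaped
# to match the literal separator "||".
parts = re.split(r"\|\|", cell)
print(parts)

# re.escape builds the escaped pattern for you if you prefer not to type it.
pattern = re.escape("||")
print(pattern)
```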
You can then use De-Pivot to put these columns into the rows based on the ID. You will get multiple entries for every ID.
These entries can then be split again with Split, this time on the separator ::, which leaves the gender in an attribute of its own. You can then use Aggregate to group by the ID and the gender and count the entries. If you need the counts in new attributes, use Pivot to turn the gender values into columns.
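For anyone who wants to check the logic outside RapidMiner, the same chain of steps can be sketched in plain Python. This is an illustration only, not the RapidMiner process itself: the sample values, the function name gender_frequencies and the lower-cased female/male keys are assumptions, and every cell is assumed to be a non-missing string.

```python
from collections import Counter

# Hypothetical sample values; the poster's actual data is not shown in the thread.
rows = [
    "0::Male",
    "0::Male||1::Female",
    "0::Female||1::Male||2::Male",
]

def gender_frequencies(cells):
    # Generate ID + Split on "||" + De-Pivot: one (id, entry) pair per person.
    long_rows = [
        (row_id, entry)
        for row_id, cell in enumerate(cells, start=1)
        for entry in cell.split("||")
        if entry.strip()
    ]
    # Second Split on "::" keeps the gender; Aggregate counts per (id, gender).
    counts = Counter(
        (row_id, entry.split("::")[-1].strip().lower())
        for row_id, entry in long_rows
    )
    # Pivot: one output row per id, with the genders as columns.
    return [
        {
            "id": row_id,
            "female": counts[(row_id, "female")],
            "male": counts[(row_id, "male")],
        }
        for row_id in range(1, len(cells) + 1)
    ]

for row in gender_frequencies(rows):
    print(row)
```

A Counter returns 0 for a missing key, so a row with no female (or no male) entries still gets a 0 in that column instead of a gap.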
Regards,
Balázs
You can use this approach.
To learn more, check out our free text mining course: https://academy.rapidminer.com/learn/course/text-and-web-mining-with-rapidminer/text-and-web-mining/lets-get-started