Automate manual iterations // Text analysis
Hello Community,
I am facing an automation problem.
Initial situation:
- I want to check how often words from a given list (categories) occur in my Example Sets (100 texts).
- I have 7 categories with labelled words (architecture, activities, culinary etc.)
- So the point is to check in 100 texts how many words occur per category.
Example (Culinary category):
- In the Culinary list there are words like "curry", "spaghetti" and "pizza".
- In the first text there are e.g. 2 hits (curry + pizza). So the final result for text 1 will be 2 for the category "Culinary".
- Next follows the 2nd category "Architecture" and so on.
- When all categories have been passed through, the second text is considered.
At the end we have a result showing how often words from each category occur in the individual texts. From this we can then conclude what weight each category has for a given text.
The process already runs, but only manually: I have to change the parameters by hand 100 x 7 times (Filter Example Range), which is not very nice. Is there a way to run the lists against each other automatically?
Routine (idea):
1. take category 1 and check in which texts from 1-100 the words occur how often.
2. take category 2 and check in which texts from 1-100 the words occur how often.
I hope you understand my problem and can help me! I have attached all relevant data.
Best regards,
Patrick
Best Answer
MarcoBarradas (Administrator, Employee-RapidMiner, RapidMiner Certified Analyst, Member; Posts: 272; Unicorn)
@Hyperrick, you need to add the Append operator after each Loop Values operator, because the output of Loop Values is a collection.
Append converts that collection into a single example set. If the attribute names do not match, you'll need to use Append Robust from the Operator Toolbox instead.
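To illustrate the difference between a collection and an appended example set, here is a rough Python analogy. It is not RapidMiner code: the row layout (`text_id`, `category`, `hits`) and the helper `append_robust` are made up for this sketch, and the "robust" variant only assumes that mismatched attributes should be filled with missing values.

```python
from itertools import chain

# Hypothetical loop output: one small table (list of rows) per iteration.
collection = [
    [{"text_id": 1, "category": "Culinary", "hits": 2}],
    [{"text_id": 1, "category": "Architecture", "hits": 0}],
]

# Plain append: stack all tables of the collection into one table.
appended = list(chain.from_iterable(collection))


def append_robust(tables):
    """Stack tables whose columns differ, filling gaps with None (assumed behaviour)."""
    columns = []
    for table in tables:
        for row in table:
            for key in row:
                if key not in columns:
                    columns.append(key)
    return [{c: row.get(c) for c in columns} for table in tables for row in table]


mixed = append_robust([[{"a": 1}], [{"b": 2}]])
```

After the append, downstream steps see one table instead of a list of tables, which is what operators expecting a single example set need.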
Answers
You'll need a Data Set that could look something like
Category     | Word
Architecture | Word1
Architecture | Word2
Architecture | ...n
Food         | Word1
Food         | Word2
Food         | ...n
With Loop Values, pick Category as the attribute to loop over. In the inner process, filter the examples down to the category that the macro holds in the current iteration, and then run your process on them.
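As a rough illustration of that loop logic outside RapidMiner, here is a minimal Python sketch. The sample texts, the Architecture words, and the names `tokenize` and `count_hits` are assumptions made up for the example. It counts every token occurrence, so a word that appears twice in a text counts as two hits.

```python
import re
from collections import defaultdict

# Category/word table in the long format described above (sample values).
category_words = [
    ("Culinary", "curry"), ("Culinary", "spaghetti"), ("Culinary", "pizza"),
    ("Architecture", "cathedral"), ("Architecture", "tower"),
]

# Hypothetical texts keyed by text number.
texts = {
    1: "We had curry for lunch and pizza near the old tower.",
    2: "The cathedral tour ended with spaghetti.",
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_hits(category_words, texts):
    # Group words by category, like filtering on the current loop value.
    words_by_category = defaultdict(set)
    for category, word in category_words:
        words_by_category[category].add(word.lower())

    results = []
    for category, words in words_by_category.items():  # one pass per category
        for text_id, text in texts.items():             # all texts per category
            hits = sum(1 for token in tokenize(text) if token in words)
            results.append({"text_id": text_id, "category": category, "hits": hits})
    return results


results = count_hits(category_words, texts)
```

With these sample texts, text 1 gets 2 hits for Culinary (curry + pizza), matching the example in the question.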
Hope this helps.
Thanks for your response.
I prepared the data as you said and now have both lists prepared as collections.
Process overview (the first part, highlighted in yellow, works with your solution):
Result 1: Collection of categories with words; ID is "word", label is "url":
Result 2: Collection of texts with words; ID is "word", label is "url":
The next step is to join only the matching words using an inner join. Unfortunately, I have no idea how to build the process from here. The join fails with the following error message:
Do I have to build another Loop operator around the join (and the following operators, "Aggregate" etc.)?
Kind regards,
Patrick
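For reference, the join-then-aggregate step can be sketched in plain Python as follows. This is only an analogy under assumptions: both inputs are single tables rather than collections, "word" is the join key, and the `category` attribute, the sample rows, and the `inner_join` helper are invented for illustration.

```python
from collections import Counter

# Hypothetical category table: one row per (word, category).
category_table = [
    {"word": "curry", "category": "Culinary"},
    {"word": "pizza", "category": "Culinary"},
    {"word": "tower", "category": "Architecture"},
]

# Hypothetical text table: one row per (word, url of the text).
text_table = [
    {"word": "curry", "url": "text-1"},
    {"word": "pizza", "url": "text-1"},
    {"word": "lunch", "url": "text-1"},
    {"word": "tower", "url": "text-2"},
]


def inner_join(left, right, key):
    """Keep only rows whose key value appears in both tables."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


joined = inner_join(text_table, category_table, "word")

# Aggregate: count matched words per (text, category).
counts = Counter((row["url"], row["category"]) for row in joined)
```

Because the join runs once over the full tables, no extra loop around it is needed in this sketch; words without a category (like "lunch") simply drop out.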