operator cannot be executed (Duplicate attribute name: cluster)
dranammari
Member Posts: 13 Contributor II
Hi everybody,
I am running a RapidMiner process that uses k-means clustering to cluster a set of discussions. I want to extract the cluster centroids and save them to a CSV file for further processing in Java, so I added two operators: Extract Cluster Prototypes and Write CSV. The process now fails with a Process Failed error. Here are the log messages:
Jan 17, 2012 2:34:08 PM SEVERE: Process failed: operator cannot be executed (Duplicate attribute name: cluster). Check the log messages...
Jan 17, 2012 2:34:08 PM SEVERE: Here: Process[1] (Process)
subprocess 'Main Process'
+- Read Database[1] (Read Database)
+- Rename[1] (Rename)
+- Set Role[1] (Set Role)
+- Data to Documents[1] (Data to Documents)
+- Process Documents[1] (Process Documents)
subprocess 'Vector Creation'
| +- Extract Content[9708] (Extract Content)
| +- Tokenize[9708] (Tokenize)
| +- Transform Cases[9708] (Transform Cases)
| +- Filter Stopwords (English)[9708] (Filter Stopwords (English))
| +- Filter Stopwords (Dictionary)[9708] (Filter Stopwords (Dictionary))
| +- Filter Tokens (by Length)[9708] (Filter Tokens (by Length))
| +- Generate n-Grams (Terms)[9708] (Generate n-Grams (Terms))
+- Clustering[1] (k-Means)
==> +- Extract Cluster Prototypes[1] (Extract Cluster Prototypes)
+- Write CSV[0] (Write CSV)
+- Select Attributes[0] (Select Attributes)
+- Write Database[0] (Write Database)
As you can see, the error says that an operator cannot be executed (Duplicate attribute name: cluster), and the arrow in the log points to the Extract Cluster Prototypes operator.
Can you please tell me what the problem might be and how to solve it? Is this a bug in the Extract Cluster Prototypes operator?
Without the Extract Cluster Prototypes operator, the process runs successfully and generates the clustering model.
Many thanks,
Ahmad
Answers
Please post your process setup so we can have a look at it. Just copy the contents of the XML tab at the top of the process pane into your next post (use the #-button above the input field here in the forum).
Best,
Marius
It looks like I discovered the reason for the error, which seems somewhat weird to me!
In the text analysis step (the Process Documents operator), one of the resulting tokens is 'cluster', which is the same name as the 'cluster' attribute that stores the cluster number of each document after clustering. How did I discover this? I added the word 'cluster' to the stop word dictionary I am using for the Filter Stopwords (Dictionary) operator, and the process ran successfully! So the Extract Cluster Prototypes operator fails if the ExampleSet contains a token attribute named 'cluster', because that name is also used for the attribute storing the cluster labels.
Is this a bug in the operator? Of course, I should not have to treat 'cluster' as a stop word to solve this problem, since it is obviously not one!
Here is the process in XML:
Many thanks,
Ahmad
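The collision Ahmad describes can be reproduced outside RapidMiner with a small Java sketch. This is not RapidMiner's API or its actual implementation; the class `DuplicateAttributeDemo` and its methods are hypothetical names that only model the idea that every token becomes a column and that clustering then adds one more column called "cluster".

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DuplicateAttributeDemo {

    // Assumption: stands in for the attribute that the clustering step adds.
    public static final String CLUSTER_ATTRIBUTE = "cluster";

    /**
     * Builds the column names of a toy example set: one column per token,
     * followed by the cluster attribute. Fails if two columns share a name.
     */
    public static Set<String> buildAttributeNames(List<String> tokens) {
        Set<String> names = new LinkedHashSet<>();
        for (String token : tokens) {
            addUnique(names, token);
        }
        addUnique(names, CLUSTER_ATTRIBUTE);
        return names;
    }

    // Adds a name, or throws if that name is already taken.
    private static void addUnique(Set<String> names, String name) {
        if (!names.add(name)) {
            throw new IllegalArgumentException("Duplicate attribute name: " + name);
        }
    }

    public static void main(String[] args) {
        // No token is called "cluster": the extra column can be added.
        System.out.println(buildAttributeNames(List.of("forum", "post", "reply")));

        // One token is called "cluster": the extra column collides with it.
        try {
            buildAttributeNames(List.of("forum", "cluster", "reply"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this toy model, the first call succeeds and the second fails with a message of the same shape as the one in the log, which matches the observation that filtering out the token 'cluster' makes the error disappear.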
The Clustering operator always names its output attribute "cluster", which in your case is a bit suboptimal. You could try renaming the attribute generated by Process Documents (if it exists) before applying clustering, with a construction like this (not tested because I don't have your data, but it should work):
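The process XML that accompanied this reply is not included in this copy of the thread. Purely as an illustration of the renaming idea, and not as RapidMiner code, the following Java sketch renames any attribute called "cluster" before a clustering step would add its own. The class name `RenameBeforeClustering`, the method `renameReserved`, and the suffix `_token` are all hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class RenameBeforeClustering {

    // Name that the clustering step is said to use for its output attribute.
    public static final String RESERVED = "cluster";

    /**
     * Returns a copy of the attribute names in which an attribute named
     * "cluster" gets the given suffix appended, so the reserved name is free.
     * All other names are copied unchanged.
     */
    public static List<String> renameReserved(List<String> attributeNames, String suffix) {
        List<String> renamed = new ArrayList<>(attributeNames.size());
        for (String name : attributeNames) {
            renamed.add(name.equals(RESERVED) ? name + suffix : name);
        }
        return renamed;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("forum", "cluster", "reply");
        // The token attribute is renamed; "cluster" is now available again.
        System.out.println(renameReserved(tokens, "_token")); // [forum, cluster_token, reply]
    }
}
```

The sketch does not check whether the new name (here `cluster_token`) already exists among the attributes; a real process would need a suffix that cannot occur as a token.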