Renaming the first attribute(s) in an ExampleSet based on position, not name
Hi there, I have a few inconsistent datasets due to some legacy styles when importing Excel files. Typically my files contain 5 standard attributes (the first 5 in the example set) and then one or more additional attributes/columns.
The problem is that attributes/columns 1 and 2 are occasionally named incorrectly, without any real pattern behind it, so the idea is to rename my first attribute to just "A" and my second attribute to "B". But I can't find a way to rename an attribute based on its position in the set. There are plenty of ways if you know the name, but that's exactly the problem in my case. I thought of looping over the set, but there too there seems to be no easy way to find out the actual position of the attribute.
Any advice?
Best Answer
Telcontar120 (RapidMiner Certified Analyst, RapidMiner Certified Expert)
Maybe I don't entirely understand your problem, but doesn't "Rename by Generic Names" do what you want? You specify the "root" ("att" by default) and it appends a number that corresponds to the attribute's position in the dataset. As usual, you can apply it to all attributes or filter for the ones you want using a regex, a value type, or just a list of specific attributes.
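To make the idea of position-based naming concrete, here is a minimal sketch in plain Python. It is not RapidMiner code and not how the operator is implemented. The function names, the sample header, and the assumption that numbering starts at 1 are all made up for illustration. It only builds an old-name to new-name mapping from positions, which could then feed any rename-by-name step.

```python
def positional_rename_map(attribute_names, new_names):
    """Map the first len(new_names) attribute names to new names by position."""
    if len(new_names) > len(attribute_names):
        raise ValueError("more new names than attributes")
    # zip stops at the shorter list, so only the leading attributes are mapped
    return dict(zip(attribute_names, new_names))


def generic_rename_map(attribute_names, root="att"):
    """Map every attribute to root + its position (1-based here, an assumption)."""
    return {
        name: f"{root}{pos}"
        for pos, name in enumerate(attribute_names, start=1)
    }


# Hypothetical header in which the first two columns are misnamed
header = ["Nmae", "cust id", "Date", "Region", "Amount", "Extra 1"]

first_two = positional_rename_map(header, ["A", "B"])
generic_first_two = generic_rename_map(header[:2])

print(first_two)          # {'Nmae': 'A', 'cust id': 'B'}
print(generic_first_two)  # {'Nmae': 'att1', 'cust id': 'att2'}
```

The point of the sketch is that the current (possibly wrong) names never have to be known in advance; only the order of the columns matters.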
Answers
Hi!
If your Excel tables have a fixed structure, you could just specify the correct metadata instead of reading the column names from the file, and avoid the whole problem.
If that's not applicable, you can use Data to Weights and then Weights to Data to get the list of attributes as an example set, and then Generate ID to number them. That should help you find the attribute ranges you're looking for.
Regards,
Balázs
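The attribute-list approach can be pictured with a small plain-Python sketch. This is only an illustration of the logic (list the attribute names, number them, pick a range by number), not RapidMiner code. The helper names and the sample header are hypothetical, and numbering from 1 is an assumption.

```python
def attributes_with_ids(attribute_names):
    """Return (id, name) pairs, numbering attributes from 1 in their original order."""
    return list(enumerate(attribute_names, start=1))


def names_in_range(attribute_names, first, last):
    """Return the names whose 1-based position lies in [first, last]."""
    return [
        name
        for pos, name in attributes_with_ids(attribute_names)
        if first <= pos <= last
    ]


# Hypothetical header: 5 standard attributes followed by extra ones
header = ["Nmae", "cust id", "Date", "Region", "Amount", "Extra 1", "Extra 2"]

standard = names_in_range(header, 1, 5)
additional = names_in_range(header, 6, len(header))

print(standard)    # ['Nmae', 'cust id', 'Date', 'Region', 'Amount']
print(additional)  # ['Extra 1', 'Extra 2']
```

Once the names at the positions of interest are known, they can be passed to any operator that works by name.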