Nominal statistics problems
Hi, I'm accessing a database that contains both numerical and nominal data. When I execute the process in the RapidMiner GUI, I get results like mode and average in the statistics column of the Meta Data View tab. The problem is that with the RapidMiner API I can't get statistics like mode and least (which the GUI does show me); only AVERAGE works.
What am I doing wrong?
thx
This works, but ...
File f = new File("operadores2.xml");
Process process = new Process(f);
IOContainer ioc = process.run();
ExampleSet ses = ioc.get(ExampleSet.class);
ses.recalculateAllAttributeStatistics();
ExampleTable ext = ses.getExampleTable();
System.out.println(ses.getStatistics(ext.getAttribute(14), Statistics.AVERAGE));
This doesn't work (attribute number 7 is nominal):
File f = new File("operadores2.xml");
Process process = new Process(f);
IOContainer ioc = process.run();
ExampleSet ses = ioc.get(ExampleSet.class);
ses.recalculateAllAttributeStatistics();
ExampleTable ext = ses.getExampleTable();
System.out.println(ses.getStatistics(ext.getAttribute(7), Statistics.MODE));
Answers
Perhaps the documentation does not say it explicitly enough, but never use the ExampleTable unless you are going to construct a completely new data set. NEVER. In 99% of cases that's the wrong way.
Here it causes problems because the statistics are based on the attributes inside an example set. This is because an example set might not cover all rows and all columns of an ExampleTable. Since you should never operate on ExampleTables, there's no use in calculating statistics on them.
So here's the way you should do it:
There is no easy way of accessing the x-th attribute, because in general we want to avoid that. Otherwise you would have to guarantee that the order of the attributes is always constant. You should use the name of an attribute instead.
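The point about views and name-based lookup can be illustrated with a minimal plain-Java sketch. It does not use the RapidMiner API: `ColumnView` is a hypothetical stand-in for an example set that exposes only some columns of a wider table, and its method names are made up for illustration. In RapidMiner itself the equivalent lookup is presumably `exampleSet.getAttributes().get("name")`, but treat that as an assumption and check it against the API documentation of your version.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an example set: a named view over some columns of a wider table.
class ColumnView {
    private final Map<String, double[]> columnsByName = new LinkedHashMap<>();

    // Builds a view that exposes only the selected columns of the underlying table.
    ColumnView(List<String> tableColumnNames, double[][] tableColumns, List<String> selected) {
        for (String name : selected) {
            int tableIndex = tableColumnNames.indexOf(name);
            if (tableIndex < 0) {
                throw new IllegalArgumentException("No such column: " + name);
            }
            columnsByName.put(name, tableColumns[tableIndex]);
        }
    }

    // Name-based lookup: stays valid even if the view hides or reorders table columns.
    double average(String name) {
        double[] values = columnsByName.get(name);
        if (values == null) {
            throw new IllegalArgumentException("Column not in view: " + name);
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Number of columns visible through this view (may be fewer than in the table).
    int size() {
        return columnsByName.size();
    }
}
```

Here a view that selects only "sales" from a table with columns "id", "city" and "sales" still answers `average("sales")` correctly, whereas a table index such as 2 would mean nothing inside that one-column view.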
Greetings,
Sebastian
I have printed the results to standard output and I get this: 0.0 or 1.0 for the nominal attributes. In the GUI I get more information, for example (mode = Barcelona(52), least = Madrid(1)).
My question is: is it possible to get the values Barcelona and 52, or not?
thanks again
Yes, that's possible. You have to use the nominal mapping to get the name of the mode value, and use the COUNT statistic to get the numbers. Here's how to do it:
Greetings,
Sebastian
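As a rough illustration of the idea in this answer, here is a self-contained plain-Java sketch. It is not the RapidMiner API: `NominalColumn` and all of its methods are hypothetical. It shows why a mode statistic on a nominal attribute comes back as a bare number such as 0.0 or 1.0 (it is an index into the nominal mapping), and how the index is turned into a name and a count. In RapidMiner the corresponding calls are presumably `attribute.getMapping().mapIndex(index)` and `getStatistics(attribute, Statistics.COUNT, valueName)`; those are assumptions to verify against your version's API documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a nominal attribute: rows store indices into a value mapping.
class NominalColumn {
    private final List<String> mapping = new ArrayList<>(); // index -> nominal value
    private final List<Integer> rows = new ArrayList<>();   // one mapping index per row

    // Appends a row, registering the value in the mapping on first sight.
    void add(String value) {
        int index = mapping.indexOf(value);
        if (index < 0) {
            mapping.add(value);
            index = mapping.size() - 1;
        }
        rows.add(index);
    }

    // Number of rows holding the given nominal value (0 if the value is unknown).
    int count(String value) {
        int index = mapping.indexOf(value);
        int n = 0;
        for (int r : rows) {
            if (r == index) {
                n++;
            }
        }
        return n;
    }

    // Mapping index of the most frequent value; ties go to the lowest index.
    int modeIndex() {
        return extremeIndex(true);
    }

    // Mapping index of the least frequent value; ties go to the lowest index.
    int leastIndex() {
        return extremeIndex(false);
    }

    // Translates a mapping index back into the nominal value's name.
    String name(int index) {
        return mapping.get(index);
    }

    // Formats a value the way the GUI shows it, e.g. "Barcelona(52)".
    String describe(int index) {
        String value = name(index);
        return value + "(" + count(value) + ")";
    }

    private int extremeIndex(boolean most) {
        if (mapping.isEmpty()) {
            throw new IllegalStateException("Column has no values");
        }
        int best = 0;
        for (int i = 1; i < mapping.size(); i++) {
            int candidate = count(mapping.get(i));
            int current = count(mapping.get(best));
            if (most ? candidate > current : candidate < current) {
                best = i;
            }
        }
        return best;
    }
}
```

With 52 rows of "Barcelona" and one row of "Madrid", `describe(modeIndex())` yields "Barcelona(52)" and `describe(leastIndex())` yields "Madrid(1)", matching what the GUI displays.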
thx