[Solved] Data type "real" vs. "numeric"
Dear community,
At first sight this seems to be a very basic question, but even after reviewing the manual and searching this forum I was not able to find an answer:
What is the difference between "real" and "numeric" values in RapidMiner? When should I use which one?
Real numbers can by definition be any arbitrary point on a number line. In contrast, the manual describes numeric values as "for numerical values in general". This sounds pretty much like the same thing to me... but probably it is not...
Cheers
Sachs
Answers
aborg is right. Numeric can be either real or integer, whereas real can only be... real ;-)
Best,
Nils
Thank you for your response.
From a mathematical point of view, the integers are a subset of the reals. So I guess RapidMiner distinguishes between real and numerical for performance reasons then? When should I use which one?
Best regards
Sachs
But until they do so, here are my thoughts.
The classification by maths is irrelevant here.
In computer science floats and integers are different!
See the Java documentation on how floats and integers affect division, multiplication, etc.
As with anything with performance, test and find out!
The Java Virtual Machine basically makes it impossible to give a general statement about performance.
As a rule of thumb, you always want to use floats, unless you have specific reasons to use integers.
When I use "Generate ID" and then "Generate Attributes" @ id=id/2, the id attribute is automatically converted from integer to real.
Try using "Read CSV" @ datamanagement = int_array; you will get some funny results when you try to read floats.
Best regards,
Wessel
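To illustrate Wessel's point about division, here is a minimal Java sketch. The class and method names are hypothetical and are not part of RapidMiner. It only shows plain Java arithmetic: dividing two ints discards the fraction, dividing a double keeps it, and casting a double to int truncates. That behaviour is presumably why `id/2` is converted to real, and why float data stored in an int array looks wrong.

```java
// Illustration only: class and method names are hypothetical, not RapidMiner API.
final class DivisionDemo {

    // Integer division discards the fractional part (truncates toward zero).
    static int halveAsInt(int id) {
        return id / 2;
    }

    // Dividing a double keeps the fractional part.
    static double halveAsReal(double id) {
        return id / 2;
    }

    public static void main(String[] args) {
        System.out.println(halveAsInt(7));   // 3
        System.out.println(halveAsReal(7));  // 3.5
        System.out.println((int) 3.99);      // 3, a narrowing cast truncates
    }
}
```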
Hi Wessel,
I understand your arguments, and I agree that under these circumstances it is probably reasonable to run a test.
So if I have float values (e.g. temperature readings, stock market data or weight data) I could use either the numeric or the real data type, right? I would just test performance and go for the faster one.
Best regards
Sachs
In the RapidMiner type hierarchy, Numerical is the supertype of Integer and Real, and its direct use should be avoided. We are planning to deprecate the direct use of Numerical and recommend using only the well-defined subtypes, i.e. Integer and Real.
Best regards,
Marius
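The hierarchy Marius describes can be pictured with a small Java sketch. This is a hypothetical model for illustration, not RapidMiner's actual classes: the `ValueType` enum and its `isA` method are made up here. It shows that Integer and Real each count as Numerical, while Numerical on its own says nothing about which of the two a column holds.

```java
// Hypothetical model of the described hierarchy; not RapidMiner's real API.
enum ValueType {
    NUMERICAL(null),     // general supertype
    INTEGER(NUMERICAL),  // well-defined subtype
    REAL(NUMERICAL);     // well-defined subtype

    private final ValueType parent;

    ValueType(ValueType parent) {
        this.parent = parent;
    }

    // True if this type is the given type or one of its descendants.
    boolean isA(ValueType other) {
        for (ValueType t = this; t != null; t = t.parent) {
            if (t == other) {
                return true;
            }
        }
        return false;
    }
}
```

With this model, `ValueType.REAL.isA(ValueType.NUMERICAL)` is true, whereas `ValueType.NUMERICAL.isA(ValueType.REAL)` is false.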
Alright, then I'll go for "real" in my case.
Thanks a lot!
Cheers
Sachs