Importing .dat data set
Hello. There is a fairly popular retail dataset from anonymized Belgian stores, which can be found here:
http://fimi.ua.ac.be/data/retail.dat.gz
The first lines of the file are:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
30 31 32
33 34 35
36 37 38 39 40 41 42 43 44 45 46
38 39 47 48
38 39 48 49 50 51 52 53 54 55 56 57 58
32 41 59 60 61 62
3 39 48
63 64 65 66 67 68
32 69
48 70 71 72
39 73 74 75 76 77 78 79
But I don't know the format of the retail.dat file.
How can it be imported into RapidMiner?
I could not find any description of this format.
Thanks
Answers
I only made a few spot checks, but it looks like the files are regular Excel files with a header (despite having a .dat file extension). If so, you would want to use the "Read Excel" operator. Before importing, however, you will need to save the files with an Excel extension: Open the .dat file with Excel -> Save as -> Remove the .dat from the file name -> Select an Excel file format -> Save.
The RapidMiner team has made a nice series of introductory videos, which can be found on YouTube. The one dealing with importing data is here: https://www.youtube.com/watch?v=1EZk9w1ln0g&index=2&list=PLssWC2d9JhOZLbQNZ80uOxLypglgWqbJA
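As an alternative, the sample lines quoted in the question look like a plain-text transaction list: one basket per line, with item IDs separated by spaces. If that holds for the whole file (an assumption based only on that excerpt), a small script could reshape it into a two-column CSV that the "Read CSV" operator can load. Below is a minimal Python sketch; the function names and the column names transaction_id and item_id are made up for illustration and are not part of the dataset or of RapidMiner.

```python
import csv
import gzip


def read_transactions(lines):
    """Parse one basket per line; items are whitespace-separated integer IDs."""
    baskets = []
    for line in lines:
        items = line.split()
        if items:  # skip blank lines
            baskets.append([int(x) for x in items])
    return baskets


def to_long_rows(baskets):
    """Flatten baskets into (transaction_id, item_id) pairs.

    The transaction ID is simply the 0-based position of the basket
    in the file (an assumption; the file itself carries no IDs).
    """
    return [(tid, item) for tid, basket in enumerate(baskets) for item in basket]


def convert(src_path, dst_path):
    """Read a .dat or .dat.gz transaction file and write a long-format CSV."""
    opener = gzip.open if src_path.endswith(".gz") else open
    with opener(src_path, "rt") as src:
        baskets = read_transactions(src)
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["transaction_id", "item_id"])
        writer.writerows(to_long_rows(baskets))


# Quick check on three of the lines quoted in the question.
sample = ["30 31 32", "33 34 35", "38 39 47 48"]
print(to_long_rows(read_transactions(sample)))
```

Calling convert("retail.dat.gz", "retail.csv") would then produce a file with one row per (transaction, item) pair. The long format is used here because the baskets have different lengths, so a fixed-width table with one column per position would leave many cells empty.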