Spend Data classification to UNSPSC
Hi all - I'm new and this is my first post.
Do you think that spend data classification to UNSPSC (or any other given taxonomy) can be achieved using Rapid-i?
Before would look like this:
Description             Supplier
cartouche pour 5SIMX    INMAC
After would look like this:
Description             Supplier    UNSPSC (could be eClass or any other taxonomy, including in-house)
cartouche pour 5SIMX    INMAC       44103105 Ink cartridges
Answers
RapidMiner was originally designed to learn classification tasks from examples of data previously classified by humans. It seems to me that your problem does not have this data? It looks more like some sort of lookup problem in a large directory. If I'm wrong, feel free to explain how the new attributes were assigned to the "cartouche pour 5SIMX" item.
But anyway, Haddock is right: you might achieve anything with RapidMiner; the question is more or less how complex things may become.
And something you might be interested in, even before thinking about RapidMiner: it's open source, so if you offer its results to your customers inside a program or website, you might have to give them access to your code...
Greetings,
Sebastian
Let me give you another example along with the process:
Example 1
Description = VAIO FW48E/H Laptop
Supplier = Sony
To classify this example manually one would recognize the word Laptop and so classify it against UNSPSC code = 43211509 UNSPSC Description = Laptop / Notebook PCs.
Example 2
Description = VAIO FW48E/H
Supplier = Sony
To classify this example manually one would recognize the word VAIO and so classify it against UNSPSC code = 43211509 UNSPSC Description = Laptop / Notebook PCs.
Example 3
Description = FW48E/H
Supplier = Sony
To classify this example manually one would need to look on the Sony website to find out what FW48E/H relates to before being able to classify it against UNSPSC code = 43211509 UNSPSC Description = Laptop / Notebook PCs.
Or looking at it from another angle:
If we see Sony as a supplier then, based on previous experience of them as a supplier, one would expect them to be supplying games, computers, TVs, etc.
If one then sees VAIO in the description, we would know that it's referring to a laptop because of our previous knowledge of what a VAIO is when supplied by Sony.
What I need to be able to achieve is to look at a line of data (Description + Supplier) and then allocate a UNSPSC code/description to it. Having done this once, I then need the software to be able to learn that the words Laptop and VAIO are related to the UNSPSC code for laptops, and that the presence of Sony as a supplier just strengthens the case.
This would be achieved by reading the text found in Description and Supplier, then identifying the words VAIO, Laptop and Sony in the text strings, before using those to classify to the UNSPSC code and description.
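To make the idea above concrete, here is a minimal sketch in plain Python (not RapidMiner, and not anything confirmed in this thread) of learning word-to-code associations from already classified lines. It is a small naive Bayes text classifier written from scratch; the names `tokenize` and `NaiveBayes` are made up for illustration, and the two training rows are simply the examples quoted in this thread, so a real setup would need far more labelled data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(description, supplier):
    """Split the description into lower-case word tokens and add a supplier token."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return words + ["supplier=" + supplier.strip().lower()]


class NaiveBayes:
    """Hypothetical helper: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()             # labelled lines per UNSPSC code
        self.token_counts = defaultdict(Counter)  # token frequencies per code
        self.vocab = set()                        # every token seen in training

    def fit(self, rows):
        for description, supplier, code in rows:
            self.class_counts[code] += 1
            for tok in tokenize(description, supplier):
                self.token_counts[code][tok] += 1
                self.vocab.add(tok)

    def predict(self, description, supplier):
        total = sum(self.class_counts.values())
        best_code, best_score = None, -math.inf
        for code, n in self.class_counts.items():
            score = math.log(n / total)  # prior from how often the code occurs
            denom = sum(self.token_counts[code].values()) + len(self.vocab)
            for tok in tokenize(description, supplier):
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((self.token_counts[code][tok] + 1) / denom)
            if score > best_score:
                best_code, best_score = code, score
        return best_code


# Training rows: (Description, Supplier, UNSPSC code), taken from this thread.
rows = [
    ("VAIO FW48E/H Laptop", "Sony", "43211509"),
    ("cartouche pour 5SIMX", "INMAC", "44103105"),
]
model = NaiveBayes()
model.fit(rows)

print(model.predict("VAIO FW48E/H", "Sony"))      # Example 2 from above
print(model.predict("cartouche 5SIMX", "INMAC"))
```

Adding the supplier as its own token is one simple way to let "Sony" strengthen the case for a code without deciding it alone. A line whose tokens were never seen in training falls back to the class priors, which is the situation of Example 3 when nothing similar has been classified before.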
I hope this has explained things a little better.
Thanks in advance for any additional advice.
Although it may be possible to do brain surgery with a power drill, it may not always be optimal so to do.
It is no different with using RM in this scenario, because RM's main purpose is to winkle out patterns in data; but if any string could have UNSPSC code 43211509, what patterns would there be in your data?
Indeed in the darker parts of eastern France the Sony Vaio might be the appellation of a strong cheese, why not?
Just ponderin'
I think you might do this using the TextPlugin and TextClassification. But I share Haddock's doubt that this will perform very well on new data. Again, this very much depends on the data you have and on the data you are going to classify, so I cannot predict anything without taking a look at the complete data. There might be some useful information one could extract from the data.
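One rough way to judge in advance whether a text classifier has any chance on new data is to check how many words in the new descriptions were never seen in the already classified ones. The sketch below is plain Python, not part of RapidMiner or the TextPlugin; the function name `unseen_token_rate` and the "Bravia KDL-40" description are hypothetical and only there for illustration.

```python
import re


def tokens(text):
    """Set of lower-case word tokens in a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unseen_token_rate(train_descriptions, new_descriptions):
    """Fraction of tokens in the new descriptions that never occur in training."""
    known = set()
    for description in train_descriptions:
        known |= tokens(description)
    total = unseen = 0
    for description in new_descriptions:
        for tok in tokens(description):
            total += 1
            if tok not in known:
                unseen += 1
    return unseen / total if total else 0.0


# Already classified descriptions, taken from this thread.
train = ["VAIO FW48E/H Laptop", "cartouche pour 5SIMX"]

print(unseen_token_rate(train, ["VAIO FW48E/H"]))   # every token was seen before
print(unseen_token_rate(train, ["Bravia KDL-40"]))  # hypothetical line, nothing seen before
```

A rate close to 1 would suggest that the new lines share almost no vocabulary with the classified ones, so a learner has nothing to generalise from and the task is closer to a lookup problem.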
If you want, you might contact us for setting up a small start-up project, where we could test together if it works, or you might try it yourself.
Greetings,
Sebastian
Simply write an email to contact@rapid-i.com. It will be received by the people responsible, and I will explain to them all I know about your problem so they can decide how to proceed.
Greetings,
Sebastian