Future product sales prediction
Hello,
I am new to RapidMiner and data science in general. I study Business Administration and have decided to write my bachelor thesis on predictive analytics. For this purpose, I would like to use the sales data of a retail company to build a model in RapidMiner that predicts stock levels for future smartphone models.
I wanted to post some screenshots, but I cannot because I am too new to this community. I have attached a PDF that tells the story alongside some screenshots.
I have already found some very helpful information in this forum and in the RapidMiner learning environment. But now I am stuck, partly because I lack some basic knowledge.
I have received sales data for certain product categories covering the last three years from a retailer. The data contains one line per sale plus some information about the product sold. I edited the data in Excel and split some technical specifications into individual attributes. I replaced the "capacity" specification in gigabytes (32GB, 64GB, ...) with the classes "low", "medium" and "high" in order to eliminate discrepancies resulting from technical progress. As a further attribute, I added the age of the product at the date of sale. My aim was to describe the individual attributes of a product as generally as possible. For legal reasons, I am unable to upload the entire dataset.
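To make the preparation step concrete, here is a minimal Python sketch of the two derived attributes described above. The thresholds in `capacity_class`, the field names and the example dates are assumptions for illustration only; the post does not say where "low" ends or "high" begins, and the real preparation was done in Excel.

```python
from datetime import date

def capacity_class(gigabytes: int) -> str:
    """Map a raw capacity in GB to a coarse class.

    The cut-off values are hypothetical, not taken from the post.
    """
    if gigabytes <= 32:
        return "low"
    if gigabytes <= 128:
        return "medium"
    return "high"

def age_in_weeks(release: date, sale: date) -> int:
    """Age of the product at the date of sale, in whole weeks."""
    return (sale - release).days // 7

# One hypothetical sale line.
sale = {
    "capacity_gb": 64,
    "release_date": date(2016, 9, 16),
    "sale_date": date(2017, 3, 3),
}

features = {
    "capacity": capacity_class(sale["capacity_gb"]),
    "age_weeks": age_in_weeks(sale["release_date"], sale["sale_date"]),
}
print(features)
```

Fixed cut-offs like these do not by themselves account for technical progress; defining the classes relative to the release year would be one way to do that, but that is a design choice the post leaves open.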
My idea was then to sum the sales per week for each product, since the sales figures of one week are very close to the basic stock level. I also removed attributes that, in my opinion, are not relevant for further processing.
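The weekly aggregation can be sketched in a few lines of Python. This is only an illustration of the idea, not the RapidMiner process from the attached PDF; the product names, dates and amounts are invented, and grouping by ISO calendar week is an assumption.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sale lines: (product, sale date, amount sold).
sales = [
    ("phone_a", date(2017, 3, 6), 2),
    ("phone_a", date(2017, 3, 8), 1),
    ("phone_a", date(2017, 3, 14), 4),
    ("phone_b", date(2017, 3, 7), 3),
]

def weekly_totals(rows):
    """Sum the amounts per (product, ISO year, ISO week)."""
    totals = defaultdict(int)
    for product, sale_date, amount in rows:
        iso = sale_date.isocalendar()  # (ISO year, ISO week, ISO weekday)
        totals[(product, iso[0], iso[1])] += amount
    return dict(totals)

totals = weekly_totals(sales)
for key in sorted(totals):
    print(key, totals[key])
```

Each resulting row plays the role of one "sum(Amount)" example for the model: one product, one week, one total.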
In the prediction process, I would like to train a model on the prepared data and apply it to example data for a new product.
For the data of the new product, I used the same attributes as in the training data and wanted a prediction for all 52 weeks of a year.
With the help of some tutorials, I then created a process that trains a linear regression model on the prepared data and applies it to the data I created for a possible new product.
Now I am faced with the problem that the weekly sales volumes "sum(Amount)" generated by this process are not very realistic: the values differ too little from week to week. The output is, for example, between 9 and 13, while the training data for these specific attributes ranges between 1 and 72.
My questions are: am I on the right track at all? And is linear regression the right model for my task?
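One possible reason for such a narrow output range, which nothing in the post confirms, is that the attributes explain only a small part of the week-to-week variation. In that case a least-squares line stays close to the mean of the training target. The sketch below fits a one-variable linear regression by hand on invented numbers and compares the spread of the predictions with the spread of the observed values; the data and variable names are hypothetical.

```python
# Hypothetical training data: week number and weekly sales.
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
sold = [5, 72, 1, 30, 8, 60, 3, 20]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(weeks, sold)
predicted = [intercept + slope * x for x in weeks]

observed_spread = max(sold) - min(sold)
predicted_spread = max(predicted) - min(predicted)

print(f"observed range:  {min(sold)} to {max(sold)}")
print(f"predicted range: {min(predicted):.1f} to {max(predicted):.1f}")
```

If the real data behaves like this, the narrow predictions are what a linear model is expected to produce rather than a mistake in the process, and the more useful questions become which attributes carry signal and whether a different model type fits the sales curve better.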
Many thanks for your help
Andy
Answers
Scott