Auto Model Performance. Is it training, testing, or validation?
Best Answer
varunm1 Member Posts: 1,207 Unicorn
@Konradlk
Here you go. I tried a couple of neural network configurations with different layer sizes and additional layers. The best performance (in my trials) came from a single layer with 2 neurons. Adding more neurons or layers reduces the test performance, which suggests overfitting.
The attached process seemed optimal, with an RMSE of 0.023 and a squared correlation of 0.5. You can try other models and compare them with the neural network to see whether the RMSE decreases and the squared correlation increases. A higher squared correlation and a lower RMSE are better.
Below are the test data performances (RMSE and squared correlation, respectively):
- NN with a single layer and 4 neurons: 0.025, 0.430
- NN with a single layer and 10 neurons: 0.027, 0.419
- NN with two layers and 2 neurons in each layer: 0.027, 0.395
- NN with a single layer and 2 neurons: 0.023, 0.50
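For reference, these two metrics can also be reproduced outside RapidMiner. Below is a minimal sketch in Python with NumPy and scikit-learn; the label and prediction arrays are hypothetical placeholders, not values from this thread.

```python
# Minimal sketch: RMSE and squared (Pearson) correlation for a regression.
# y_true and y_pred are placeholder arrays, not data from this thread.
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([0.10, 0.12, 0.08, 0.15, 0.11])  # hypothetical labels
y_pred = np.array([0.11, 0.10, 0.09, 0.13, 0.12])  # hypothetical predictions

rmse = np.sqrt(mean_squared_error(y_true, y_pred))        # lower is better
squared_corr = np.corrcoef(y_true, y_pred)[0, 1] ** 2     # higher is better
print(f"RMSE: {rmse:.3f}, squared correlation: {squared_corr:.3f}")
```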
Hope this helps.
Regards,
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Answers
Auto Model divides the original dataset into a 60:40 split (train:test). The validation in Auto Model is a multi hold-out set validation. The model is trained on the 60% of the data, and the 40% test data is divided into 7 subsets. Once the model is trained, it is used to make predictions on each of the 7 subsets independently, and the performances of these 7 subsets are averaged. So the performance you see in Auto Model comes from the test data, using a multi hold-out validation method.
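To make the mechanics concrete, here is a minimal sketch of that multi hold-out validation in Python with scikit-learn (not the actual AutoModel implementation); the synthetic data and the linear model are placeholder assumptions.

```python
# Sketch of multi hold-out validation: 60/40 train/test split,
# test data divided into 7 subsets, per-subset RMSEs averaged.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                              # placeholder data
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.6, random_state=0)                  # 60:40 split

model = LinearRegression().fit(X_train, y_train)           # train once on 60%

# Score each of the 7 hold-out subsets independently, then average.
rmses = [np.sqrt(mean_squared_error(y_sub, model.predict(X_sub)))
         for X_sub, y_sub in zip(np.array_split(X_test, 7),
                                 np.array_split(y_test, 7))]
print(f"mean hold-out RMSE: {np.mean(rmses):.3f}")
```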
Hope this helps. Please let me know if you need more information.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
If you click on "Performance" for each model, you can find different performance metrics such as accuracy, precision, and recall.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Hi, I'm looking to get the performance vector for each step of the process, i.e. the performance vectors for training, validation, and testing.
I was previously using a process a coworker left me, and they explicitly said they need errors for all three stages. I'm sorry if this is unclear; I don't have the greatest understanding of this and am trying to learn very quickly.
My goal is to run several different prediction models and compare the performance of the different models.
The picture below is what I was left with. I can post more information if necessary.
Your process is correct.
You effectively have (a sketch of the same idea follows the list):
- the training performance (given by the Performance operator in the "training" part of the Cross Validation operator)
- the validation performance (given by the Performance operator in the "testing" part of the Cross Validation operator)
- the testing performance (given by the Performance operator in the main process)
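For anyone following along outside RapidMiner, here is a rough scikit-learn analogue of those three performance vectors; the synthetic data and the MLPRegressor stand-in are assumptions, not the poster's actual setup.

```python
# Sketch: training performance, cross-validation (validation) performance,
# and performance on a separate test set. Data here is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))                              # placeholder data
y = X.sum(axis=1) + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=1)

# Validation performance: averaged over cross-validation folds.
cv_rmse = -cross_val_score(model, X_train, y_train, cv=10,
                           scoring="neg_root_mean_squared_error").mean()

model.fit(X_train, y_train)
train_rmse = np.sqrt(mean_squared_error(y_train, model.predict(X_train)))
test_rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
print(f"train {train_rmse:.3f}  validation {cv_rmse:.3f}  test {test_rmse:.3f}")
```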
Do you encounter any errors with this process?
Regards,
Lionel
I do encounter errors when I try to swap the neural network for Deep Learning, a Generalized Linear Model, or an SVM.
The problem is that no matter which predictive model I run, I get exactly the same errors for each performance test.
When I run Auto Model I get different errors for each model, but not when I change them in my process. I change the models by simply swapping the neural network box for whatever else I want to run.
Can you share the details of those errors? If possible, provide us with the data and the .rmp file so we can debug.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
The method of validation is different in AutoModel and in your process:
- In AutoModel, a split validation with a multi hold-out set validation is performed, as described by Varun. You can open the process generated by AutoModel to understand how your model is validated.
- In your process, you are using a Cross Validation.
Although performance should not differ significantly between the two, the use of two different validation methods can explain the differences.
Moreover, you are applying a preprocessing step to your data (normalization). To my knowledge, AutoModel does not apply such a preprocessing step by default. This difference in preprocessing can explain the difference in the performance results. Once again, you can open the process generated by AutoModel and compare it to your process.
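As an illustration of why a normalization step can move the numbers, here is a small scikit-learn sketch comparing the same model with and without normalization, using a Pipeline so the scaler is fitted on training folds only; the data is synthetic and the model choice is an assumption.

```python
# Sketch: same model evaluated with and without a normalization step.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3)) * [1, 100, 0.01]      # mixed feature scales
y = X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.1, size=300)

for pipe, name in [
    (make_pipeline(MLPRegressor(max_iter=2000, random_state=2)), "raw"),
    (make_pipeline(StandardScaler(),
                   MLPRegressor(max_iter=2000, random_state=2)), "normalized"),
]:
    rmse = -cross_val_score(pipe, X, y, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: RMSE {rmse:.3f}")
```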
But so that we can reproduce what you observe and find out exactly what is going on, can you share your data and your process (the process from your screenshot)?
Regards,
Lionel
Once again, thank you both for your time and help. I am going to attach my .rmp file and both Excel files I use. If either of you can help me figure out how to get decent results for a neural network and at least one other predictive model, I would be so grateful.
For both of the Excel files, only the last sheet is used.
Do you have any reference performance values, or values you are aiming for? I modified your process and added an Optimize Parameters (Grid) operator for the neural network. I didn't change the layer information inside the neural network, such as adding neurons or layers.
I attached the working process without errors. You can change the layers in the Neural Network operator inside Optimize Parameters (Grid) to see how different layer configurations perform. I will try other settings; you can add layers and try as well. Use squared correlation and RMSE as your performance evaluation metrics.
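For readers without the attached .rmp file, here is a rough scikit-learn analogue of that Optimize Parameters (Grid) setup, sweeping the hidden layer configurations mentioned earlier in the thread; the synthetic data and MLPRegressor are stand-ins, not the actual process.

```python
# Sketch: grid search over hidden layer configurations, scored by RMSE.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 5))                              # placeholder data
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=400)

param_grid = {
    # one or two layers with varying neuron counts, as tried above
    "hidden_layer_sizes": [(2,), (4,), (10,), (2, 2)],
}
search = GridSearchCV(
    MLPRegressor(max_iter=2000, random_state=3),
    param_grid, cv=5, scoring="neg_root_mean_squared_error")
search.fit(X, y)
print("best layers:", search.best_params_,
      "RMSE:", f"{-search.best_score_:.3f}")
```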
Please let us know if you have more questions.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing