Best Answer
rjones13
Hi @Qween,
Could you explain what you mean by "manually calculated"? It would help me understand what's going wrong.
The Replace Missing Values operator accounts for the fact that different datasets end up with different columns, because their content differs. Rather than allow missing values, we replace them with zeros, which better represents what the data is showing (a word that doesn't occur has a count of zero). The Nominal to Binominal operator declares that the label is either positive or negative. Technically it's not needed, but it's good practice to declare it.
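To illustrate the idea outside RapidMiner, here is a minimal Python sketch. It is not part of the attached process: the function names and sample sentences are made up for illustration, and simple word counts stand in for whatever vector creation the process actually uses. Two documents produce different word columns, and any column a document lacks is filled with zero rather than left missing.

```python
from collections import Counter

def word_vector(text):
    # Hypothetical helper: count how often each word occurs in a document.
    return Counter(text.lower().split())

def align_vectors(vectors):
    # Build the union of all columns (words), then fill gaps with 0,
    # which mirrors replacing missing values with zeros.
    columns = sorted(set().union(*vectors))
    rows = [[vec.get(col, 0) for col in columns] for vec in vectors]
    return columns, rows

# Made-up example documents with only partly overlapping vocabulary.
train_doc = word_vector("great film great cast")
test_doc = word_vector("boring film")

columns, rows = align_vectors([train_doc, test_doc])
print(columns)
for row in rows:
    print(row)
```

Each row now has the same columns, so a model trained on one set of documents can score another without hitting missing attributes.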
Best,
Roland
Answers
Could you post your process and data? Just trying to understand why you've got two Process Documents subprocesses in your process.
Best,
Roland
Thank you for your response.
Review1 & 2 are for the first "Process Documents from Files" operator, and the Test file is the test data that I used for the second "Process Documents from Files" operator.
I've had a look, and realised the slight issue with your process. When you're loading in your test data, you've assigned the class to be "text". So when it comes to scoring, it thinks there's a third class called text whereas your model has been trained to predict positive or negative. You'll just need to adjust your process to account for this and either split your test data or run some further processing. I've attached my attempt below.
Best,
Roland
Thank you very much for your explanation.
I've tried the process multiple times, changing some of the parameters, but unfortunately it did not work. It gave me a message saying that SVM needs to be labeled. It was also looking for the testing data, which is the file named 'test'.
Could you try importing the attached process? I'd split the test data into positive and negative, as shown in the screenshot below.
Let me know if this now works.
Best,
Roland
Thank you for your response. Unfortunately, it didn't work. At the beginning, it gave me a pop-up message about the dummy operator (please see the first attachment). Then, when I tried to take the dummy operators out of the process, another message from SVM appeared (please see the second attachment). I've tried other approaches as well, but I couldn't get it to run, and I'm not sure why it doesn't work for me when it works for you.
Unfortunately I don't see your attachments, but I think the process I'd built used operators from the Operator Toolbox extension. Please could you try installing this extension and then importing again?
Best,
Roland
I'm sorry, I forgot to attach them. However, I installed the Operator Toolbox extension and I still get the same thing.
Regards,
Could you add a breakpoint after "Nominal to Binominal" and run the process again? Please share a screenshot of the result.
Thanks,
Roland
I added the breakpoint after "Nominal to Binominal" but I got the same thing.
Regards,
Was there any data present when you reached the breakpoint? I asked you to try that so we could see whether the problem was missing data.
Could you also confirm that you've changed the 4 "Process Documents from Files" operators to point to the appropriate file locations on your machine?
Best,
Roland
Oh yes, I realized that the software was pulling the data from your machine. I've fixed that, and I got the attached results (Accuracy = 0.00%), which doesn't make sense, as I calculated it manually and it is supposed to be
Regards,
I can see you've assigned the classes "test_neg" and "test_pos" to your testing datasets. Please could you change these to "neg_reviews" and "pos_reviews"? With this workflow, "test_neg" and "neg_reviews" are treated as different classes, hence the supposed 0% accuracy: the class labels have to be consistent between training and testing. Hopefully with this fix you'll see the same 100% accuracy I did!
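As a rough illustration of why the label names matter, here is a small Python sketch. It is not taken from the process itself: the `accuracy` function and the four made-up predictions are assumptions for the example. Accuracy is just the share of rows where the true label string equals the predicted one, so "test_pos" can never match a prediction of "pos_reviews", even when the model is right.

```python
def accuracy(true_labels, predicted_labels):
    # Fraction of examples whose true label exactly equals the prediction.
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)

# Hypothetical predictions: the model only knows its training classes.
predicted = ["pos_reviews", "neg_reviews", "pos_reviews", "neg_reviews"]

# Same documents, labelled with class names the model never saw.
labels_mismatched = ["test_pos", "test_neg", "test_pos", "test_neg"]

# Same documents, labelled consistently with the training data.
labels_consistent = ["pos_reviews", "neg_reviews", "pos_reviews", "neg_reviews"]

acc_mismatched = accuracy(labels_mismatched, predicted)
acc_consistent = accuracy(labels_consistent, predicted)
print(acc_mismatched, acc_consistent)
```

With the mismatched names every comparison fails, so the score is 0% regardless of how good the model is; with consistent names the same predictions score as correct.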
Best,
Roland
Sorry for taking up your time. I've just changed it. However, below is a screenshot of what I got as a result. It still doesn't match the manual one, which is
Regards,
I'm slightly surprised that we've ended up with different results. Just checking: you didn't adjust the model at all? As a manual check, could you look at the scored data coming out of the Performance operator:
And then share the results so we can see which files vary:
This is a bit weird. Yes, I got different results, although I believe the process and the files are the same. Below are the results I got from the model.
Regards,
Could you share the process again, with all the files corresponding to each operator, so I can confirm that I haven't messed up somewhere? Would that be possible?
Thank you for your help and support.
I agree it's odd, as from your screenshot it looks like 100% accuracy. I've attached a zip file here with the process and the files as I've organised them. All you should need to do is change the file paths for the 4 "Process Documents from Files" operators to match your system.
Best,
Roland
Thank you very much. It worked. However, the accuracy result is 100%, but when I calculated it manually, it is supposed to be 70%. I'm not sure whether something still needs to be changed, such as the parameters or anything else?
Regards,