Use a trained (SVM) model in different processes
Hi,
I finally managed to get a sentiment analysis going, using the SVM classification algorithm.
My problem now is that the dataset I want to analyze is huge (>400m tweets). The SVM model also took a very long time to train, as the training set already had more than 100k rows.
As the main dataset is in a PostgreSQL database, I will probably just query specific days. I want to avoid RapidMiner rebuilding the SVM model every time.
How can I do this?
Thanks in advance
Edit: maybe I can add some questions here:
Which classification algorithm do you suggest? Which one is the fastest, which one is the most accurate, and which one offers the best combination of both?
How can I optimize the performance of the whole process? As I want to analyze 400m tweets, my first priority is getting the whole analysis done in a realistic timeframe, say 2-3 days of processing time at most (on a normal notebook). Do you think that is even possible?
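To put that target in numbers, here is a quick back-of-the-envelope calculation in Python. It is only a rough sketch: the function name `required_rate` is made up for illustration, and it assumes the notebook runs non-stop for the whole period with no time lost to database queries or restarts.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def required_rate(total_items: int, days: float) -> float:
    """Items per second needed to finish total_items within the given number of days."""
    return total_items / (days * SECONDS_PER_DAY)


TOTAL_TWEETS = 400_000_000  # the ">400m tweets" from the post, taken as a lower bound

for days in (2, 3):
    rate = required_rate(TOTAL_TWEETS, days)
    print(f"{days} days: about {rate:,.0f} tweets/second")
```

So the scoring step alone would have to sustain somewhere between roughly 1,500 and 2,300 tweets per second, including text preprocessing.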