Prediction Column out of Binary Machine Learning Classification Problem

SA_H Member Posts: 29 Contributor II
edited October 2019 in Help
For example, in the case of Logistic Regression, we can get coefficients that can be multiplied by the predictors to get the final output in the form of an attribute in a CSV file or an image. Please let me know if it is scientifically correct to get the weights/rules out of trained SVM, ANN, KNN, and NB models, multiply each predictor by its weight/rule, and sum over all predictors. I mean (predictor 1 * its weight + predictor 2 * its weight + predictor 3 * its weight + ...).
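
As a minimal sketch of the weighted sum described above, assuming a logistic regression: the coefficients, intercept, and example row below are made-up values, and the function names are hypothetical rather than taken from any tool. The linear score is passed through the logistic function to obtain a probability.

```python
import math

# Hypothetical coefficients and intercept; in practice these would come
# from a trained logistic regression model.
coefficients = [0.8, -1.2, 0.5]
intercept = -0.3


def linear_score(predictors, weights, bias):
    """predictor 1 * weight 1 + predictor 2 * weight 2 + ... + bias."""
    return sum(x * w for x, w in zip(predictors, weights)) + bias


def predict_proba(predictors, weights, bias):
    """Map the linear score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_score(predictors, weights, bias)))


row = [1.0, 0.5, 2.0]  # one example with three predictor values
score = linear_score(row, coefficients, intercept)
probability = predict_proba(row, coefficients, intercept)
print(f"score={score:.2f}, probability={probability:.3f}")
```

Applying `linear_score` to every row of a data set would give the output attribute the question asks about, but only because this model is linear in its predictors.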

Answers

  • rfuentealba RapidMiner Certified Analyst, Member, University Professor Posts: 568 Unicorn

    At first glance, operating on weights/rules didn't sound logical to me: Decision Trees try to fit examples into one category or another, treating data as categorical rather than numerical. Logistic Regression, on the other hand, operates on numerical data, so working with the weights makes more sense there.

    However, Gradient Boosted Trees work in a similar fashion. That is, boosting gives more weight to examples that are difficult to classify and less weight to the easier ones. It wouldn't hurt to run a quick test and see how the predictors behave with your data. The keyword to continue researching is Boosting.

    Hope it helps,

    Rodrigo.
  • SA_H Member Posts: 29 Contributor II
    Thank you @rfuentealba for your help. Could you please let me know which classifiers other than DT allow extraction of weights/rules?
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi,
    your approach of coefficient * value only works for linear models. The strength of most machine learning models is that they are non-linear. That's the cool part. Breaking down non-linear, multivariate methods into single factors is 'tricky' to 'impossible'.
    Nevertheless, have a look at the WEI ports of the operators and at operators like Tree to Rules (or so?). They may help.

    Cheers,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • SA_H Member Posts: 29 Contributor II
    So a linear ML model such as a linear SVM could work the same way, but non-linear models could not.
  • varunm1 Member Posts: 1,207 Unicorn
    edited October 2019
    Hello @summer_helmi

    SVM is one of the linear models, but it can also handle non-linear problems using the kernel trick. Non-linear algorithms each have their own way of working: for example, a decision tree works based on split criteria and a neural network works based on hidden unit activations.
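
    To illustrate the difference, here is a rough sketch with made-up numbers (the weights, support vectors, dual coefficients, and `gamma` are all hypothetical). A linear SVM scores an example with one weight per predictor, while an RBF-kernel SVM scores it through its similarity to each support vector, so its coefficients belong to support vectors rather than to predictors.

```python
import math


def linear_svm_score(x, w, b):
    """Linear SVM: one weight per predictor, score = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def rbf_kernel(a, b, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def kernel_svm_score(x, support_vectors, dual_coefs, b, gamma):
    """Kernel SVM: score = sum_i dual_coef_i * K(sv_i, x) + b.

    Each dual coefficient (alpha_i * y_i) is attached to a support
    vector, not to a predictor.
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + b


# Hypothetical values, chosen only for illustration.
x = [1.0, 2.0]
print(linear_svm_score(x, w=[0.5, -0.25], b=0.1))
print(kernel_svm_score(x, support_vectors=[[1.0, 2.0], [3.0, 0.0]],
                       dual_coefs=[1.0, -1.0], b=0.0, gamma=0.5))
```

    In the kernel case there is no per-predictor weight to multiply with, which is why the "predictor * weight" recipe only carries over to the linear variant.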

    So basically, every class of algorithms has its own way of working.

    For your initial question: yes, it's scientifically correct to get feature weights from an algorithm, as the weights are calculated with proven methods. But it is not always correct to multiply the weights by the features; that is only correct for the class of linear models (e.g. GLM) that are based on linear equations.
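
    A small experiment can make this concrete. The sketch below uses the classic XOR toy data (not data from this thread) and hypothetical helper names: it searches a grid of weights for a "predictor * weight" rule and compares the best one with a 1-nearest-neighbour model, which has no weights at all.

```python
import itertools

# XOR data: the class depends on the interaction of the two predictors.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def linear_rule(x, w1, w2, bias):
    """Predict class 1 when the weighted sum exceeds zero."""
    return 1 if w1 * x[0] + w2 * x[1] + bias > 0 else 0


def one_nearest_neighbour(x, training):
    """Predict the label of the closest training point (1-NN)."""
    nearest = min(training,
                  key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2)
    return nearest[1]


# Try a grid of weights and biases from -2.0 to 2.0 in steps of 0.25.
grid = [i / 4 for i in range(-8, 9)]
best_linear = max(
    sum(linear_rule(x, w1, w2, b) == y for x, y in points)
    for w1, w2, b in itertools.product(grid, repeat=3)
)

# Scored on the training points themselves, only to show that the
# model can represent XOR; this is not a proper evaluation.
knn_correct = sum(one_nearest_neighbour(x, points) == y for x, y in points)

print(f"best weighted-sum rule: {best_linear}/4 correct")
print(f"1-NN: {knn_correct}/4 correct")
```

    XOR is not linearly separable, so no weighted sum classifies all four points correctly, while the non-linear model does. Its predictions simply cannot be rewritten as one weight per predictor.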



    Regards,
    Varun
    https://www.varunmandalapu.com/

    Be Safe. Follow precautions and Maintain Social Distancing
