
What seems to be the problem in this case?

cjjc20001 Member Posts: 8 Contributor II
I am trying LightGBM on a dataset, and it gives the following error.




The sample data include gender, degree concentration, etc. Most attributes are ready-made options from a survey, where the participant just selects the most appropriate option.

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi,

    It looks like your text field has categories in the application data that weren't present in the training data.

    BR,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • cjjc20001 Member Posts: 8 Contributor II
    I think the problem is that some values occur only once in the data. If the sampling does not place such a row in the training set, the value is marked as unrecognized during validation. When I removed the split, it worked. However, I need to both train and test the model. I also tried cross-validation, but it has the same problem. What is the solution for this?
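
One possible way to handle the situation discussed above, sketched in plain Python rather than as a RapidMiner process: learn the set of category labels from the training data only, and map every rare or previously unseen label to a single reserved code. The names `build_vocab`, `encode`, `OTHER`, the `min_count` threshold, and the example labels are all hypothetical and chosen for illustration; how the RapidMiner LightGBM operator treats unknown nominal values internally is not confirmed here.

```python
from collections import Counter

# Reserved integer code for rare or unseen labels (an assumption of this sketch).
OTHER = 0


def build_vocab(train_values, min_count=2):
    """Map each label seen at least min_count times in training to a code >= 1."""
    counts = Counter(train_values)
    frequent = sorted(label for label, n in counts.items() if n >= min_count)
    return {label: code for code, label in enumerate(frequent, start=1)}


def encode(values, vocab):
    """Encode labels as integers; anything missing from vocab becomes OTHER."""
    return [vocab.get(value, OTHER) for value in values]


# Hypothetical survey column, already split into training and validation parts.
train = ["Finance", "Marketing", "Finance", "Marketing", "Astrophysics"]
test = ["Finance", "Linguistics", "Astrophysics"]

vocab = build_vocab(train)          # built from the training part only
train_codes = encode(train, vocab)  # the singleton "Astrophysics" becomes OTHER
test_codes = encode(test, vocab)    # the unseen "Linguistics" also becomes OTHER

print(vocab)
print(train_codes)
print(test_codes)
```

With cross-validation, the same idea would apply per fold: build the vocabulary from that fold's training part and use it to encode the held-out part, so the validation data never contains a code the model has not been given a slot for.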