
Generating Simulated DataFrame

cedric_anover Member Posts: 5 Contributor I
edited December 2018 in Product Feedback - Resolved

This is an operator idea: the operator takes a data set as input, analyzes/estimates the distribution of each attribute (column), and then outputs another DataFrame (which may have a different number of rows) with the same attributes/columns but different, simulated examples/observations.

 

Input:

  • DF(Type: DataFrame)

Parameters:

  • nrow (Type: Int) = nrow of DF (by default)

Output:

  • DF_Out (Type: DataFrame)
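
One possible reading of the interface above, as a minimal sketch only: each column is resampled independently from its own empirical distribution, which is the simplest form of "estimating the distribution" and not necessarily what the proposal intends. The function name `simulate_columns`, the `seed` parameter, and the use of a plain dict of column lists as a stand-in for a DataFrame are assumptions made for illustration.

```python
import random


def simulate_columns(df, nrow=None, seed=None):
    """Return a new table with the same columns as `df` and `nrow` simulated rows.

    `df` is a dict mapping column name -> list of values (a stand-in for a
    DataFrame). Each column is sampled with replacement from its observed
    values, independently of every other column.
    """
    rng = random.Random(seed)

    # All columns must describe the same number of rows.
    lengths = {len(values) for values in df.values()}
    if len(lengths) != 1:
        raise ValueError("df must have at least one column, all of equal length")
    n_in = lengths.pop()
    if n_in == 0:
        raise ValueError("df must contain at least one row")

    # nrow defaults to the number of rows in the input, as in the proposal.
    if nrow is None:
        nrow = n_in

    return {name: rng.choices(values, k=nrow) for name, values in df.items()}


# Usage: simulate 10 rows from a 4-row input.
df = {"age": [23, 35, 41, 58], "plan": ["a", "b", "b", "c"]}
df_out = simulate_columns(df, nrow=10, seed=0)
```

Because the columns are drawn independently, any relationship between `age` and `plan` in the input is not carried over to the output.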
0 votes

Declined · Last Updated

no comments or votes in over a year - closing this idea for now. Please comment if still relevant.

Comments

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist

    Dear @cedric_anover ,

    What do you mean by "analyze/estimate the distribution"? A simple check against common distributions such as Normal, Poisson, or Cauchy?

    I think this is rather academic, because real-life distributions aren't that simple. If you don't use the histogram as an estimate of the pdf, you run into problems. In any case, these simulations do not take dependencies between two (or more) attributes into account. To do it more correctly, you would have to use techniques like Markov chains, I suppose.

     

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    pending response from user
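
Martin's point about dependencies between attributes can be illustrated with a small sketch. The data below is synthetic and made up for the example, and the row-wise resampling shown as a contrast is a plain bootstrap, not the Markov chain approach he mentions. The helper `pearson` and all variable names are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)
n = 2000

# Synthetic input: y depends strongly on x.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [2 * xi + rng.gauss(0, 0.5) for xi in x]

# Per-column simulation: each attribute is resampled on its own,
# so the pairing between x and y is lost.
x_ind = rng.choices(x, k=n)
y_ind = rng.choices(y, k=n)

# Row-wise bootstrap: whole rows are resampled together,
# so the pairing between x and y is kept.
idx = rng.choices(range(n), k=n)
x_row = [x[i] for i in idx]
y_row = [y[i] for i in idx]

print("original   :", round(pearson(x, y), 3))
print("per-column :", round(pearson(x_ind, y_ind), 3))
print("row-wise   :", round(pearson(x_row, y_row), 3))
```

The per-column result should come out near zero while the row-wise result stays close to the original correlation. A row-wise bootstrap only reuses observed rows, though, so it does not generate genuinely new examples.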
