Features affecting Bottom Line (Revenue)
Experts,
Can you please help me work out feature weights, i.e. the contributing factors that affect revenue? We would like to understand why some instances of revenue are low and some are high, and what the differentiator is. Please see the sample data. I want to see which features are affecting the revenue percentages.
Can you please advise how to approach this? I have a lot of nominal attributes; should I convert everything to numerical, etc.? Can you point me to a sample process, please?
As always, thank you for your valuable advice and time.
Answers
These will only show you individual relationships. Your question may actually be about what combinations of factors are most associated with Revenue. If that is the case and you are interested in exploring multivariate relationships, then that is basically a supervised machine learning problem. In that case, you probably want to build a simple predictive model to start, using a highly interpretable algorithm. I suggest a simple Decision Tree model so you can get a sense of what combinations of factors are associated with different levels of Revenue.
In both cases, looking at the tutorial processes contained in RapidMiner will be useful for understanding the basic setup and use in RapidMiner.
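To make the tree idea a little more concrete outside RapidMiner, here is a minimal Python sketch of the least-squares criterion that a regression tree uses to pick a split. It is not RapidMiner's implementation. The records and the attribute names `segment`, `sector` and `revenue_pct` are hypothetical placeholders, not the poster's data. It also shows that nominal attributes can be used as they are, without converting them to numbers.

```python
from collections import defaultdict


def sse(values):
    """Sum of squared deviations from the mean (least-squares impurity)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_gain(rows, feature, target):
    """Reduction in squared error when rows are grouped by one nominal feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    total = sse([row[target] for row in rows])
    within = sum(sse(group) for group in groups.values())
    return total - within


def rank_features(rows, features, target):
    """Sort features by how much a split on each one reduces squared error."""
    gains = {f: split_gain(rows, f, target) for f in features}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical records, for illustration only.
rows = [
    {"segment": "A", "sector": "X", "revenue_pct": 0.02},
    {"segment": "A", "sector": "Y", "revenue_pct": 0.02},
    {"segment": "B", "sector": "X", "revenue_pct": 0.01},
    {"segment": "B", "sector": "Y", "revenue_pct": 0.01},
]

ranking = rank_features(rows, ["segment", "sector"], "revenue_pct")
for name, gain in ranking:
    print(f"{name}: {gain:.6f}")
```

In this toy data `segment` explains all of the variation and `sector` none, so `segment` ranks first. A real tree repeats this choice inside each branch, which is how it surfaces combinations of factors. One caveat: attributes with many distinct values tend to score well on this measure by chance, so pruning and validation still matter.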
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.1.000-BETA2">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.1.000-BETA2" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="120"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.1.000-BETA2" expanded="true" height="68" name="Retrieve Polynomial" width="90" x="112" y="85">
        <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
      </operator>
      <operator activated="true" class="concurrency:cross_validation" compatibility="8.2.000" expanded="true" height="145" name="Validation" width="90" x="380" y="34">
        <parameter key="split_on_batch_attribute" value="false"/>
        <parameter key="leave_one_out" value="false"/>
        <parameter key="number_of_folds" value="10"/>
        <parameter key="sampling_type" value="shuffled sampling"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="concurrency:parallel_decision_tree" compatibility="9.1.000-BETA2" expanded="true" height="103" name="Decision Tree" width="90" x="179" y="34">
            <parameter key="criterion" value="least_square"/>
            <parameter key="maximal_depth" value="10"/>
            <parameter key="apply_pruning" value="true"/>
            <parameter key="confidence" value="0.1"/>
            <parameter key="apply_prepruning" value="true"/>
            <parameter key="minimal_gain" value="0.01"/>
            <parameter key="minimal_leaf_size" value="2"/>
            <parameter key="minimal_size_for_split" value="4"/>
            <parameter key="number_of_prepruning_alternatives" value="3"/>
          </operator>
          <connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
          <connect from_op="Decision Tree" from_port="model" to_port="model"/>
          <portSpacing port="source_training set" spacing="0"/>
          <portSpacing port="sink_model" spacing="0"/>
          <portSpacing port="sink_through 1" spacing="0"/>
          <description align="left" color="green" colored="true" height="113" resized="true" width="284" x="33" y="148">Builds a model on the current training data set (90 % of the data by default, 10 times).&lt;br&gt;&lt;br&gt;Make sure that you only put numerical attributes into a linear regression!</description>
        </process>
        <process expanded="true">
          <operator activated="true" class="apply_model" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
            <list key="application_parameters"/>
            <parameter key="create_view" value="false"/>
          </operator>
          <operator activated="true" class="performance" compatibility="9.1.000-BETA2" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
            <parameter key="use_example_weights" value="true"/>
          </operator>
          <connect from_port="model" to_op="Apply Model" to_port="model"/>
          <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
          <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
          <connect from_op="Performance" from_port="performance" to_port="performance 1"/>
          <connect from_op="Performance" from_port="example set" to_port="test set results"/>
          <portSpacing port="source_model" spacing="0"/>
          <portSpacing port="source_test set" spacing="0"/>
          <portSpacing port="source_through 1" spacing="0"/>
          <portSpacing port="sink_test set results" spacing="0"/>
          <portSpacing port="sink_performance 1" spacing="0"/>
          <portSpacing port="sink_performance 2" spacing="0"/>
          <description align="left" color="blue" colored="true" height="107" resized="true" width="333" x="28" y="139">Applies the model built from the training data set on the current test set (10 % by default).&lt;br/&gt;The Performance operator calculates performance indicators and sends them to the operator result.</description>
        </process>
        <description align="center" color="transparent" colored="false" width="126">A cross validation including a linear regression.</description>
      </operator>
      <connect from_op="Retrieve Polynomial" from_port="output" to_op="Validation" to_port="example set"/>
      <connect from_op="Validation" from_port="model" to_port="result 1"/>
      <connect from_op="Validation" from_port="test result set" to_port="result 2"/>
      <connect from_op="Validation" from_port="performance 1" to_port="result 3"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>
RegressionTree

segment = global: 0.018 {count=4}
segment = local
|   Sector = AD: 0.016 {count=3}
|   Sector = ES: 0.011 {count=2}
segment = med: 0.020 {count=10}
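As a rough guide to reading the RegressionTree output above, the sketch below transcribes it into a nested Python structure and walks it for a single record. It assumes each leaf shows the predicted revenue value followed by the number of training records that reached it, which is how I read the `{count=...}` notation. The helper names `predict` and `node_mean` are made up for this illustration.

```python
# Tree transcribed from the printed output; a leaf is (predicted value, count).
tree = {
    "attribute": "segment",
    "branches": {
        "global": (0.018, 4),
        "local": {
            "attribute": "Sector",
            "branches": {"AD": (0.016, 3), "ES": (0.011, 2)},
        },
        "med": (0.020, 10),
    },
}


def predict(node, record):
    """Follow the record's attribute values down to a leaf and return its value."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    value, _count = node
    return value


def node_mean(node):
    """Return (count-weighted mean, total count) over all leaves below a node."""
    if isinstance(node, tuple):
        return node
    weighted_sum = 0.0
    total_count = 0
    for child in node["branches"].values():
        value, count = node_mean(child)
        weighted_sum += value * count
        total_count += count
    return weighted_sum / total_count, total_count


print(predict(tree, {"segment": "local", "Sector": "ES"}))
print(node_mean(tree["branches"]["local"]))
print(node_mean(tree))
```

Read this way, `Sector` only matters inside the `local` segment, where AD and ES differ. For `global` and `med` the tree found no further split worth making under the chosen pruning settings.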