Using mod output to isolate variables
I am not sure I am using the correct terms here, so I will try to be descriptive.
I want to isolate (or control for) selected variables. That is, from a dataset (x,y,z,result) I want to create a model and then plot (x,result) with y,z held fixed. The final plot will be extrapolated to a wider range of x values.
I am at the point of having the model created (I used linear regression for testing purposes). The remaining steps, which I can't find out how to perform, are to create the appropriate dataset and apply the model.
Is there a data generator suitable for that or should I manually create the tables like (1,0,0),(2,0,0),(3,0,0),(4,0,0)... ?
After data is entered, or generated, how do I use the "mod" output to predict the result value?
Finally, am I using a totally wrong approach for the task I am trying to achieve? Is there a better way to visualize that kind of dependency than this one?
Thank you all in advance.
Answers
Can you post a few rows of example data?
Best regards,
Wessel
I mostly need to visualize mes=f(res), for a given set (t,r).
Physical modeling of the problem says I should expect mes=a*res^2+b*res+c, but given the effect of t and the fact that the orders of magnitude are so different, it is quite difficult to see. The values of a and b are not independent of t and r.
I thought of collecting enough examples to have many cases with the same t,r, but that seems impossible.
I'm sure I'm missing something.
I generated an attribute res^2 and ran linear regression.
Then I made a scatter plot.
I used [0,1] normalization (range transformation) so that everything fits on the same scale.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Process">
<process expanded="true" height="409" width="840">
<operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve" width="90" x="153" y="146">
<parameter key="repository_entry" value="//RS/A"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.008" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
<parameter key="name" value="mes"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.1.008" expanded="true" height="76" name="Generate Attributes" width="90" x="313" y="30">
<list key="function_descriptions">
<parameter key="res^2" value="res^2"/>
</list>
</operator>
<operator activated="true" class="linear_regression" compatibility="5.1.008" expanded="true" height="94" name="Linear Regression" width="90" x="447" y="30">
<parameter key="feature_selection" value="none"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="581" y="120">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.1.008" expanded="true" height="94" name="Normalize" width="90" x="715" y="30">
<parameter key="include_special_attributes" value="true"/>
<parameter key="method" value="range transformation"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Linear Regression" to_port="training set"/>
<connect from_op="Linear Regression" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Linear Regression" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Normalize" to_port="example set input"/>
<connect from_op="Apply Model" from_port="model" to_port="result 1"/>
<connect from_op="Normalize" from_port="example set output" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="162"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
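For readers without RapidMiner at hand, the same steps can be sketched in Python with NumPy. This is only an illustration under assumptions: the data is synthetic and noise-free, the column names res, t, r, mes follow the thread, and the helper names (fit_linear, predict, min_max) are made up for this sketch. It also shows one way to build the "x varies, everything else fixed" table asked about in the original question and to apply the model to it.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept (intercept is the last coefficient)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted coefficients to new rows."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def min_max(v):
    """Range transformation to [0, 1]."""
    return (v - v.min()) / (v.max() - v.min())

# Synthetic stand-in for the real data set (assumption, not the poster's data).
rng = np.random.default_rng(0)
n = 200
res = rng.uniform(0, 10, n)
t = rng.uniform(20, 30, n)
r = rng.uniform(1, 5, n)
mes = 0.5 * res**2 + 2.0 * res + 3.0 * t - 1.0 * r + 4.0

# "Generate Attributes": add res^2, then fit the linear regression.
X = np.column_stack([res, res**2, t, r])
coef = fit_linear(X, mes)

# Data to score: res sweeps a wider range while t and r stay fixed.
res_grid = np.linspace(0, 20, 41)
t_fixed, r_fixed = 25.0, 3.0
X_grid = np.column_stack([
    res_grid,
    res_grid**2,
    np.full_like(res_grid, t_fixed),
    np.full_like(res_grid, r_fixed),
])

# "Apply Model" followed by "Normalize" (range transformation).
mes_pred = predict(coef, X_grid)
mes_scaled = min_max(mes_pred)
```

Plotting res_grid against mes_pred (or mes_scaled) gives the curve for the chosen t and r. Note that with this model t and r only shift the curve up or down; they do not change its shape.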
The problem is that with mes=f(res)=a*res^2 + b*res + c the values of a,b,c are not independent of t,r.
Doing that kind of regression would only account for c=g(t,r).
What I want to do is find a and b, given the values of t,r.
To put it in a proper form, the function is:
mes(res, t, r) = a(t,r)*res^2 + b(t,r)*res + c(t,r)
The goal is to find a(t,r), b(t,r) and c(t,r).
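One possible way to attack this with ordinary linear regression is to assume a simple form for the three coefficient functions. The sketch below assumes, purely for illustration, that a, b and c are each linear in t and r (for example a(t,r) = a0 + a1*t + a2*r). Under that assumption mes is still linear in the nine unknown parameters, so the rows do not need to share the same (t,r). Whether the assumption holds for the real measurements is not known; the data here is synthetic and the function names (design, fit, coefficients_at) are hypothetical.

```python
import numpy as np

def design(res, t, r):
    """Nine-column design matrix: (1, t, r) times res^2, times res, and times 1."""
    g = np.column_stack([np.ones_like(res), t, r])
    return np.column_stack([g * res[:, None] ** 2, g * res[:, None], g])

def fit(res, t, r, mes):
    """Least-squares fit; rows of the result are a, b, c, columns are (const, t, r)."""
    theta, *_ = np.linalg.lstsq(design(res, t, r), mes, rcond=None)
    return theta.reshape(3, 3)

def coefficients_at(theta, t, r):
    """Evaluate a(t,r), b(t,r), c(t,r) for one (t, r) pair."""
    return theta @ np.array([1.0, t, r])

# Synthetic, noise-free data with randomly scattered t and r (assumption).
true_theta = np.array([
    [0.5, 0.01, -0.02],   # a0, a1, a2
    [2.0, 0.10, 0.30],    # b0, b1, b2
    [4.0, 3.00, -1.00],   # c0, c1, c2
])
rng = np.random.default_rng(1)
n = 300
res = rng.uniform(0, 10, n)
t = rng.uniform(0, 10, n)
r = rng.uniform(0, 5, n)
abc = np.column_stack([np.ones(n), t, r]) @ true_theta.T  # per-row a, b, c
mes = abc[:, 0] * res**2 + abc[:, 1] * res + abc[:, 2]

theta = fit(res, t, r, mes)
a, b, c = coefficients_at(theta, t=5.0, r=2.0)
```

In RapidMiner terms this would correspond to generating the product attributes (res^2*t, res^2*r, res*t, res*r) before the Linear Regression operator. With noisy real data the fit would of course be less clean than in this sketch.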
This does not seem like a problem suitable for RapidMiner.
You can use a fuzzy neural network, or a genetic algorithm to solve this problem.
But you will have to write your own Java code.
Best regards,
Wessel
A visual representation of mes vs res for 4-5 pairs of (r,t) would be enough.
Still, is it reasonable to ask for enough data values with the same t and r? Gathering 100 examples for each pair will take around 8 months.
And then I could train the model 5 times and get the required 5 values for a,b,c.
I was hoping I could find a workaround and work with randomly collected values, but it seems quite difficult.
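The "train once per (t,r) pair" idea described above can be sketched as follows. This is a minimal illustration, assuming the examples really do share exact (t,r) values; the data is synthetic and noise-free, and fit_per_pair is a hypothetical helper name. np.polyfit with degree 2 returns the coefficients highest power first, i.e. (a, b, c).

```python
import numpy as np

def fit_per_pair(res, t, r, mes):
    """Fit mes = a*res^2 + b*res + c separately for every distinct (t, r) pair."""
    fits = {}
    pairs = np.unique(np.column_stack([t, r]), axis=0)
    for t_val, r_val in pairs:
        mask = (t == t_val) & (r == r_val)
        a, b, c = np.polyfit(res[mask], mes[mask], 2)
        fits[(float(t_val), float(r_val))] = (a, b, c)
    return fits

# Synthetic example: 5 (t, r) pairs, 20 measurements each (assumption).
true_abc = {
    (20.0, 1.0): (0.5, 2.0, 10.0),
    (20.0, 2.0): (0.6, 1.5, 12.0),
    (25.0, 1.0): (0.7, 1.0, 14.0),
    (25.0, 2.0): (0.8, 0.5, 16.0),
    (30.0, 3.0): (0.9, 0.1, 18.0),
}
res_vals = np.linspace(0.0, 10.0, 20)
res, t, r, mes = [], [], [], []
for (t_val, r_val), (a, b, c) in true_abc.items():
    res.append(res_vals)
    t.append(np.full(20, t_val))
    r.append(np.full(20, r_val))
    mes.append(a * res_vals**2 + b * res_vals + c)
res, t, r, mes = (np.concatenate(v) for v in (res, t, r, mes))

fits = fit_per_pair(res, t, r, mes)
```

Each entry of fits then gives one curve that can be plotted as mes vs res for its (t,r) pair. A quadratic has only three parameters, so far fewer than 100 examples per pair may be enough if the measurement noise is small, but that depends on the data.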