"Understanding Linear Regression Model"
Hello,
In the Linear Regression operator, the resulting model has the following columns:
- Attribute
- Coefficient
- Std. Error
- Std. Coefficient
- Tolerance
- t-Stat
- p-Value
- Code
Here is my simple process for investigation:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
<context>
<input>
<location>//NewLocalRepository/wiki_regression_example_mass_height</location>
</input>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
<process expanded="true" height="427" width="675">
<operator activated="true" class="linear_regression" compatibility="5.2.008" expanded="true" height="94" name="Linear Regression" width="90" x="179" y="30">
<parameter key="feature_selection" value="none"/>
<parameter key="eliminate_colinear_features" value="false"/>
<parameter key="ridge" value="0.0"/>
</operator>
<connect from_port="input 1" to_op="Linear Regression" to_port="training set"/>
<connect from_op="Linear Regression" from_port="model" to_port="result 1"/>
<connect from_op="Linear Regression" from_port="weights" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
The data (from Wikipedia http://en.wikipedia.org/wiki/Simple_linear_regression#Numerical_example), which you can store in your repository and use as an input:
<object-stream>
<com.rapidminer.example.set.SimpleExampleSet id="1" serialization="custom">
<com.rapidminer.operator.AbstractIOObject>
<default>
<source>Linear Regression</source>
</default>
</com.rapidminer.operator.AbstractIOObject>
<com.rapidminer.operator.ResultObjectAdapter>
<default>
<annotations id="2">
<keyValueMap id="3"/>
</annotations>
</default>
</com.rapidminer.operator.ResultObjectAdapter>
<com.rapidminer.example.set.AbstractExampleSet>
<default>
<idMap id="4"/>
<statisticsMap id="5">
<entry>
<string>Mass</string>
<linked-list id="6">
<NumericalStatistics id="7">
<sum>931.1700000000001</sum>
<squaredSum>58498.5439</squaredSum>
<valueCounter>15</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="8">
<sum>931.1700000000001</sum>
<squaredSum>58498.5439</squaredSum>
<totalWeight>15.0</totalWeight>
<count>15.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="9">
<minimum>52.21</minimum>
<maximum>74.46</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="10">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</linked-list>
</entry>
<entry>
<string>Height</string>
<linked-list id="11">
<NumericalStatistics id="12">
<sum>24.759999999999998</sum>
<squaredSum>41.0532</squaredSum>
<valueCounter>15</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="13">
<sum>24.759999999999998</sum>
<squaredSum>41.0532</squaredSum>
<totalWeight>15.0</totalWeight>
<count>15.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="14">
<minimum>1.47</minimum>
<maximum>1.83</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="15">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</linked-list>
</entry>
</statisticsMap>
</default>
</com.rapidminer.example.set.AbstractExampleSet>
<com.rapidminer.example.set.SimpleExampleSet>
<default>
<attributes class="SimpleAttributes" id="16">
<attributes class="linked-list" id="17">
<AttributeRole id="18">
<special>false</special>
<attribute class="NumericalAttribute" id="19" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="20">
<keyValueMap id="21"/>
</annotations>
<attributeDescription id="22">
<name>Height</name>
<valueType>2</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>0</index>
</attributeDescription>
<constructionDescription>Height</constructionDescription>
<statistics class="linked-list" id="23">
<NumericalStatistics id="24">
<sum>24.759999999999998</sum>
<squaredSum>41.0532</squaredSum>
<valueCounter>15</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="25">
<sum>24.759999999999998</sum>
<squaredSum>41.0532</squaredSum>
<totalWeight>15.0</totalWeight>
<count>15.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="26">
<minimum>1.47</minimum>
<maximum>1.83</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="27">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="28"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
</attribute>
</AttributeRole>
<AttributeRole id="29">
<special>true</special>
<specialName>label</specialName>
<attribute class="NumericalAttribute" id="30" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="31">
<keyValueMap id="32"/>
</annotations>
<attributeDescription id="33">
<name>Mass</name>
<valueType>2</valueType>
<blockType>1</blockType>
<defaultValue>0.0</defaultValue>
<index>1</index>
</attributeDescription>
<constructionDescription>Mass</constructionDescription>
<statistics class="linked-list" id="34">
<NumericalStatistics id="35">
<sum>931.1700000000001</sum>
<squaredSum>58498.5439</squaredSum>
<valueCounter>15</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="36">
<sum>931.1700000000001</sum>
<squaredSum>58498.5439</squaredSum>
<totalWeight>15.0</totalWeight>
<count>15.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="37">
<minimum>52.21</minimum>
<maximum>74.46</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="38">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="39"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
</attribute>
</AttributeRole>
</attributes>
</attributes>
<exampleTable class="com.rapidminer.example.table.MemoryExampleTable" id="40">
<attributes id="41">
<NumericalAttribute id="42" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="43">
<keyValueMap id="44"/>
</annotations>
<attributeDescription reference="22"/>
<constructionDescription>Height</constructionDescription>
<statistics class="linked-list" id="45">
<NumericalStatistics id="46">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<valueCounter>0</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="47">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<totalWeight>0.0</totalWeight>
<count>0.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="48">
<minimum>Infinity</minimum>
<maximum>-Infinity</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="49">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="50"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
</NumericalAttribute>
<NumericalAttribute id="51" serialization="custom">
<com.rapidminer.example.table.AbstractAttribute>
<default>
<annotations id="52">
<keyValueMap id="53"/>
</annotations>
<attributeDescription reference="33"/>
<constructionDescription>Mass</constructionDescription>
<statistics class="linked-list" id="54">
<NumericalStatistics id="55">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<valueCounter>0</valueCounter>
</NumericalStatistics>
<WeightedNumericalStatistics id="56">
<sum>0.0</sum>
<squaredSum>0.0</squaredSum>
<totalWeight>0.0</totalWeight>
<count>0.0</count>
</WeightedNumericalStatistics>
<com.rapidminer.example.MinMaxStatistics id="57">
<minimum>Infinity</minimum>
<maximum>-Infinity</maximum>
</com.rapidminer.example.MinMaxStatistics>
<UnknownStatistics id="58">
<unknownCounter>0</unknownCounter>
</UnknownStatistics>
</statistics>
<transformations id="59"/>
</default>
</com.rapidminer.example.table.AbstractAttribute>
</NumericalAttribute>
</attributes>
<unusedColumnList class="linked-list" id="60"/>
<dataList id="61">
<com.rapidminer.example.table.DoubleArrayDataRow id="62">
<data id="63">
<double>1.47</double>
<double>52.21</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="64">
<data id="65">
<double>1.5</double>
<double>53.12</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="66">
<data id="67">
<double>1.52</double>
<double>54.48</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="68">
<data id="69">
<double>1.55</double>
<double>55.84</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="70">
<data id="71">
<double>1.57</double>
<double>57.2</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="72">
<data id="73">
<double>1.6</double>
<double>58.57</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="74">
<data id="75">
<double>1.63</double>
<double>59.93</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="76">
<data id="77">
<double>1.65</double>
<double>61.29</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="78">
<data id="79">
<double>1.68</double>
<double>63.11</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="80">
<data id="81">
<double>1.7</double>
<double>64.47</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="82">
<data id="83">
<double>1.73</double>
<double>66.28</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="84">
<data id="85">
<double>1.75</double>
<double>68.1</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="86">
<data id="87">
<double>1.78</double>
<double>69.92</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="88">
<data id="89">
<double>1.8</double>
<double>72.19</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
<com.rapidminer.example.table.DoubleArrayDataRow id="90">
<data id="91">
<double>1.83</double>
<double>74.46</double>
</data>
</com.rapidminer.example.table.DoubleArrayDataRow>
</dataList>
<columns>2</columns>
</exampleTable>
</default>
</com.rapidminer.example.set.SimpleExampleSet>
</com.rapidminer.example.set.SimpleExampleSet>
</object-stream>
My problem is that in the Wikipedia article the std. error for, e.g., Height is 3.1539 and for the constant it is 8.63185, while in the RM results I see 0.961 and 1.558 respectively. I was wondering whether I set some parameters wrong. (I have set the ridge to 0, so I think the Tikhonov regularization [http://en.wikipedia.org/wiki/Tikhonov_regularization] reduces to ordinary linear regression; feature selection and collinear-feature elimination are also switched off.)
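To make the comparison concrete, here is a minimal Python sketch (not RapidMiner code) of the textbook standard-error formulas for simple linear regression, applied to the 15 Height/Mass rows from the object stream. The function name `simple_ols` and the dictionary keys are made up for illustration. Under these formulas, 3.1539 and 8.63185 come out as the squared standard errors (variances) of the slope and the intercept; their square roots are roughly 1.776 and 2.938, which still differ from the 0.961 and 1.558 shown by RM.

```python
import math

# Height (m) and Mass (kg), copied from the data rows in the object stream.
height = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
          1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
mass = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
        63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]


def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    # Textbook variances of the two estimates.
    var_slope = sigma2 / sxx
    var_intercept = sigma2 * (1.0 / n + mean_x ** 2 / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "var_slope": var_slope,
        "var_intercept": var_intercept,
        "se_slope": math.sqrt(var_slope),
        "se_intercept": math.sqrt(var_intercept),
    }


fit = simple_ols(height, mass)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

The sketch only reproduces the unregularized textbook case; it says nothing about how the operator itself arrives at its Std. Error column.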
As far as I can see, the standard coefficient is computed like this (https://github.com/aborg0/RapidMiner-Unuk/blob/master/src/com/rapidminer/operator/learner/functions/linear/LinearRegression.java#L329):
coeff*stddev/mean
What does this mean? When is this useful?
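For comparison, here is a small Python sketch (not RapidMiner code) that evaluates the quoted expression next to the conventional standardized (beta) coefficient, coeff * sd(x) / sd(y). Which stddev and mean the linked Java line actually uses is an assumption here: both are taken from the Height attribute, and all variable names are made up for illustration. The conventional version expresses the slope in standard-deviation units, so coefficients of attributes on different scales can be compared; in a single-attribute regression it equals the Pearson correlation.

```python
import math

# Height (m) and Mass (kg), copied from the data rows in the object stream.
height = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
          1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
mass = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
        63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


mean_x, mean_y = mean(height), mean(mass)
sxx = sum((x - mean_x) ** 2 for x in height)
syy = sum((y - mean_y) ** 2 for y in mass)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(height, mass))

# Unstandardized slope of Mass on Height.
coeff = sxy / sxx

# Expression as quoted in the post.
# ASSUMPTION: stddev and mean are both those of the Height attribute.
quoted = coeff * sample_std(height) / mean_x

# Conventional standardized (beta) coefficient: slope in sd units.
beta = coeff * sample_std(height) / sample_std(mass)

# Pearson correlation, for comparison with beta.
pearson_r = sxy / math.sqrt(sxx * syy)

print(f"coeff * stddev / mean : {quoted:.4f}")
print(f"coeff * sd(x) / sd(y) : {beta:.4f}")
print(f"Pearson r             : {pearson_r:.4f}")
```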
(I have also checked the code for the std. error, but it is much harder to interpret, and it seems to have no obvious connection to the Wikipedia definition. In the Tikhonov regularization article I could not find a formula for the std. error.)
Could you help me understand these results?
Thanks, gabor