How can I see multicollinearity?
Hi, I'm a beginner.
My dataset has 17,379 rows in total.
I tried to open the scatter matrix and the heatmap because I wanted to check the relationships between the variables.
But neither the scatter matrix nor the heatmap was displayed.
Instead, the following messages were shown.
<heatmap>
Plot Heatmap does only support more than 2,000 rows if aggregation is enabled.
<scatter matrix>
Plot Scatter Matrix does not support more than 10,000 rows with the current configuration.
My data is a time series covering 2011-2012,
so it is unclear how I could cut it down to about 2,000 rows.
In this case, what should I do?
Additionally, how can the VIF value be calculated in RapidMiner?
I would appreciate an answer.
Thank you.
Best Answer
BalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
Hi!
In the Preferences (Settings => Preferences => User Interface) there's a setting "Visualizations row limit modifier". You can enter a higher value there if you are confident that your computer is powerful enough to process and visualize more data. The default is a safety limit to avoid overwhelming older computers.
With higher limits you should be able to get the charts you need.
About the VIF: RapidMiner is not a classical statistics application. It doesn't do regression analysis the way those programs do.
That said, you can calculate it in a process using the formula from https://www.statisticshowto.com/variance-inflation-factor/: loop through the attributes, run a regression with the current attribute as the label and the remaining attributes as predictors, take the R² value, and compute VIF = 1 / (1 - R²) for that attribute.
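To make the loop concrete, here is a minimal sketch of the same calculation outside RapidMiner, in Python with NumPy. It is only an illustration of the formula, not a RapidMiner process: the function name `variance_inflation_factors` and the synthetic columns `a`, `b`, `c` are made up for the example, and it assumes all attributes are numeric and none of them is constant.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = examples, columns = attributes)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = np.empty(n_cols)
    for j in range(n_cols):
        y = X[:, j]                        # current attribute acts as the label
        others = np.delete(X, j, axis=1)   # remaining attributes are the predictors
        design = np.column_stack([np.ones(n_rows), others])  # add an intercept term
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)    # ordinary least squares
        residuals = y - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((y - y.mean()) ** 2).sum())  # assumed non-zero (no constant column)
        r_squared = 1.0 - ss_res / ss_tot
        # VIF = 1 / (1 - R^2); a perfect fit means infinite inflation
        vifs[j] = np.inf if r_squared >= 1.0 else 1.0 / (1.0 - r_squared)
    return vifs

# Synthetic example data (not the poster's dataset): c is almost a + b,
# so all three columns are strongly collinear.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + b + rng.normal(scale=0.1, size=500)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(vifs)
```

Each value can then be compared against whatever threshold you use; a commonly quoted rule of thumb treats values above roughly 5 to 10 as a sign of multicollinearity. A calculation like this uses all rows directly, so the visualization row limits do not come into play.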
Regards,
Balázs