How to model a large data set with different "components"?
dan_ferraro24
Member Posts: 1 Learner III
in Help
Greetings Community,
I was hesitant to post this because I thought for sure it would be covered. But thus far, in the existing videos and posts, I haven't seen my specific question covered (even if some similar subject matter has been discussed).
I am modeling sports data: baseball and football, for daily fantasy sports purposes. The overall direction is: I want to build a baseline projection model (time series/pattern recognition), and then add some form of regression to that baseline projection in order to account for game-specific matchup variables.
My question is less about the types of models or the theory, and more basic: how can I create models for individual players from a dataset that includes multiple players, in the easiest way possible?
I want to build both models (projection and regression) based on player-specific data. I don't want to create models for "all third basemen" or "all running backs". I want to create them specific to individual players. However, I don't want to save individual data files for each player. That process, while likely not too hard with some engineering, seems like a waste of time. There has to be a better way.
I have large data sets with all the variables and historical data tied to individual players for individual games, connected and organized. A row would read something like (game-specific date; player ID; team ID; points scored; then all the stats and situational variables related to that game). Each player has their own line for a specific game/date.
How would someone with more experience suggest I set up my process, or leverage certain models, to get player-specific results from a single run through a larger data set?
From my research I have a hunch that a macro and loop setup could be used to limit my overall data to a player-specific set of examples based on the macro list. But is there a better, more streamlined way?
Last note: my question (again) is less about using specific operators. I have used single-player data sets with success, following the instructions for time series, regression, and SVMs (THANKS THOMAS OTT). Now I need the best way to move from single-player data sets to larger data sets. I will need to update these daily or weekly, hence my quest for simplicity if possible.
Thanks (and sorry if I am in the wrong place with a bad question)
-Dan
Answers
I am not completely sure I understood you, but I think you have misunderstood the concept of predictive analytics. In predictive analytics you usually generate a model that represents the general underlying rules, like "Old players with an injury in the last three months underperform".
To do this you need a dataset like this
PlayerId | Age | TouchDownsLastThreePlayDays | Preferred System | ...
and, most importantly, a performance/cost/value you want to predict. Then you take this general rule (e.g. an SVM model) and apply it to your specific data: a person who is X years old, likes to play system B, and had 2 touchdowns in the last three play days. Then you get a prediction of 1 [a.u.] for it.
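As a rough illustration of "one general rule learned from all players, then applied to one specific player", here is a minimal Python sketch. It is not a RapidMiner process and not an SVM: the "model" is just the average of the target per segment, used as a deliberately simple stand-in. The column names (age, injured_last_3_months, points), the age threshold of 30, and the sample rows are all made up for the example.

```python
from collections import defaultdict


def segment(row):
    # Hypothetical segmentation: "old" means 30 or older (an assumption).
    return (row["age"] >= 30, row["injured_last_3_months"])


def train_general_model(rows):
    """Learn one rule table from ALL players: mean points per segment."""
    totals = defaultdict(lambda: [0.0, 0])  # segment -> [sum of points, row count]
    for row in rows:
        entry = totals[segment(row)]
        entry[0] += row["points"]
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def predict(model, row, default=None):
    """Apply the general rule to one specific player's attributes."""
    return model.get(segment(row), default)


# Made-up training data: one row per player, with the target "points".
training_rows = [
    {"player_id": "A", "age": 33, "injured_last_3_months": True, "points": 6.0},
    {"player_id": "B", "age": 31, "injured_last_3_months": True, "points": 8.0},
    {"player_id": "C", "age": 24, "injured_last_3_months": False, "points": 15.0},
    {"player_id": "D", "age": 26, "injured_last_3_months": False, "points": 13.0},
]

model = train_general_model(training_rows)

# A player the model has never seen, described only by the attributes.
new_player = {"player_id": "E", "age": 35, "injured_last_3_months": True}
prediction = predict(model, new_player)
print(prediction)
```

The point is only the shape of the workflow: the model never looks at the player ID, so it can score players it has not seen before.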
Can your idea fit into this schema?
Cheers
Martin
Dortmund, Germany
Therefore taking a dataset like:
PlayerId|Age|TouchDownsLastThreePlayDays|Preferred System|...|GameID|GameDate|GameRuns|GamePass|GameWasMoM
You can use Loop Values to loop over the individual PlayerId values and generate a model from each individual player's data.
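Outside of RapidMiner, the same "loop over each player ID and train on that player's rows" idea can be sketched in plain Python. This is only an analogy to Loop Values, not its implementation. The single-feature least-squares fit stands in for whatever learner you actually use, and the names (player_id, opp_rank, points, min_games) and the sample rows are invented for the example.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Feature never varies for this player: fall back to the mean.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def train_per_player(rows, feature, target, min_games=2):
    """Group one big table by player and fit a separate model per player."""
    by_player = defaultdict(list)
    for row in rows:
        by_player[row["player_id"]].append(row)

    models = {}
    for player_id, games in by_player.items():
        if len(games) < min_games:
            continue  # too little history to fit anything for this player
        xs = [g[feature] for g in games]
        ys = [g[target] for g in games]
        models[player_id] = fit_line(xs, ys)  # (intercept, slope)
    return models


# Made-up data: one row per player per game, all players in one table.
rows = [
    {"player_id": "P1", "opp_rank": 1, "points": 10.0},
    {"player_id": "P1", "opp_rank": 2, "points": 12.0},
    {"player_id": "P1", "opp_rank": 3, "points": 14.0},
    {"player_id": "P2", "opp_rank": 1, "points": 5.0},
    {"player_id": "P2", "opp_rank": 3, "points": 9.0},
    {"player_id": "P3", "opp_rank": 2, "points": 7.0},
]

models = train_per_player(rows, feature="opp_rank", target="points")
for player_id, (intercept, slope) in sorted(models.items()):
    print(player_id, intercept, slope)
```

Because the grouping happens in memory, nothing has to be saved per player, and a daily or weekly refresh is just a rerun over the updated table.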
I would also suggest looking into ways of using the other players' data to fill in attributes for players who don't have a large amount of data of their own. For example: Player X is currently performing well and has stats similar to Player Z in the 1983 season. If the same trend holds, Player X should burn out before the end of the season, so he should be sold from the fantasy league within 2 weeks, while his price is at its peak.
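One simple way to find "a player with similar stats" is a nearest-neighbour lookup over per-season stat vectors. The sketch below assumes each player-season has already been reduced to a tuple of numbers. The keys, the two stats, and their values are made up. In practice the stats would need to be put on a comparable scale first, otherwise the one with the largest range dominates the distance.

```python
import math


def most_similar(target_id, profiles, k=1):
    """Return the k profile IDs closest to target_id by Euclidean distance."""
    target = profiles[target_id]
    distances = [
        (math.dist(target, vector), other_id)
        for other_id, vector in profiles.items()
        if other_id != target_id
    ]
    return [other_id for _, other_id in sorted(distances)[:k]]


# Hypothetical player-season profiles: (batting average, home runs).
# Not rescaled here, to keep the example short.
profiles = {
    "X_2024": (0.30, 25.0),
    "Z_1983": (0.31, 24.0),
    "Y_2010": (0.22, 8.0),
}

neighbours = most_similar("X_2024", profiles)
print(neighbours)
```

The history of the closest match could then be used as extra training data, or as a prior, for the player with the short record.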