"Question about Clustering Data before running a model"
Hi, I have a problem with biological data, and I will explain it with a simple example.
I have 10 proteins, and every two proteins belong to one organism.
For example, proteins 1 and 2 belong to human, proteins 3 and 4 belong to mouse, and so on.
I have five organisms, which make up my label, and my goal is to build a model that predicts these five organisms.
The problem is that when I run this data, every protein is analyzed independently, and the final result consists of 10 proteins belonging to 5 organisms, even though every two proteins are linked and should be analyzed together. What I want is for every two proteins from the same organism to go into one group, so that I get 5 groups, classified by the organism of their proteins, to be analyzed by the model.
I want to know: is there any way to cluster these proteins and similar data?
Answers
Can you please post your table structure with one or two rows of example data, and the desired outcome?
And can you please use dots and line breaks, otherwise your posts are pretty hard to understand.
Best,
Marius
And what I need is for samples in the same cluster to be analyzed together.
You can do so by installing the Series extension and using the Windowing operator. Set both window_size and step_size to 2, because you always have 2 lines which belong together.
You may have to add a Select Attributes or some Rename and Set Role operators after the Windowing operator, but that should be pretty straightforward.
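To illustrate the idea outside RapidMiner, here is a minimal Python sketch of what non-overlapping windowing with a window size and step size of 2 does to the rows. It is not the Windowing operator itself: the function `window_rows`, the column names, the `_0`/`_1` suffixes, and the example values are all hypothetical, and it assumes the two rows of each organism sit next to each other in the table.

```python
def window_rows(rows, window_size=2, step_size=2):
    """Flatten each window of consecutive rows into a single wide row."""
    windows = []
    # Only full windows are produced; a trailing partial window is skipped.
    for start in range(0, len(rows) - window_size + 1, step_size):
        wide = {}
        for offset, row in enumerate(rows[start:start + window_size]):
            for name, value in row.items():
                # Suffix each attribute with its position inside the window.
                wide[f"{name}_{offset}"] = value
        windows.append(wide)
    return windows


# Hypothetical example data: two proteins per organism, in consecutive rows.
proteins = [
    {"protein": "P1", "feature": 0.12, "organism": "human"},
    {"protein": "P2", "feature": 0.15, "organism": "human"},
    {"protein": "P3", "feature": 0.80, "organism": "mouse"},
    {"protein": "P4", "feature": 0.77, "organism": "mouse"},
]

grouped = window_rows(proteins)
for row in grouped:
    # Keep a single label per window, similar in spirit to the
    # Select Attributes / Rename / Set Role clean-up after windowing.
    row["organism"] = row.pop("organism_0")
    del row["organism_1"]
    print(row)
```

Each output row then describes one pair of proteins with a single organism label, so a model sees one example per organism group instead of one per protein.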
Does that operator do what you need?
Best, Marius
And my second question: what if the number of rows per group is not always 2?
Can't it be done by using the Set Role operator and selecting the batch or cluster role for the cluster column?