Row name changes in Turbo Prep and apply calculations to all cells
Hi,
Brief summary of my data set:
I have 6152 columns/attributes and 156 rows/examples. The columns are DNA names and the rows are patient IDs. Simply put, each value is the number of mRNAs for a given DNA and patient, and the values have been log-transformed.
I have two separate questions about RapidMiner:
1. Every time I run the process, the first column of the result table holds the row names, which is what I want. However, after I click on Turbo Prep, the row name column always becomes the last column. I have tried the "Set Role" operator, setting the target role to "label" and to "id", but the column still ends up last. How can I fix this?
2. I want to apply the same formula to all the cells/values in my table (except the column names and row names). I want to reverse a log base 10 transform, in other words, replace every value x with 10^x. How can I achieve that? I was thinking of using the GENERATE function in Turbo Prep, but it seems to create only one new column at a time, and my data set has too many columns for that.
Thank you in advance!
Answers
I don't think RapidMiner uses row names the way R does. If your mRNA data has meaningful row names, you can convert them into a regular column of the data and assign that column a special role (the role can be called rname or name).
If you want to apply a formula like 10^(att) to 6000+ columns, I suggest the "Loop Attributes" operator.
Inside Loop Attributes, use "Generate Attributes" to do the transformation on each column in turn.
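To make the loop idea concrete, here is a rough sketch in plain Python of what looping over the attributes and applying 10^x to each one amounts to. It is only an illustration of the logic, not a RapidMiner process: the function name invert_log10, the column names patient_id, GENE_A and GENE_B, and the sample values are all made up for the example.

```python
# Hypothetical illustration: every name and value below is invented for the example.

def invert_log10(rows, id_column):
    """Return a new table in which every value except the ID column is replaced by 10**value."""
    result = []
    for row in rows:
        new_row = {}
        for name, value in row.items():
            if name == id_column:
                # Leave the row-name / ID column untouched.
                new_row[name] = value
            else:
                # Undo the log10 transform: x -> 10^x.
                new_row[name] = 10 ** value
        result.append(new_row)
    return result


# Tiny stand-in for the real 156 x 6152 table.
table = [
    {"patient_id": "P001", "GENE_A": 2.0, "GENE_B": 0.0},
    {"patient_id": "P002", "GENE_A": 3.0, "GENE_B": 1.0},
]

restored = invert_log10(table, id_column="patient_id")
print(restored[0])
```

The point is that the values are overwritten column by column rather than added as thousands of extra columns, and the ID column is skipped explicitly, which corresponds to giving it a special role so the loop does not touch it.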
Cheers,
YY