The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Wrinting an excel file with multiple sheets
Contributor I
Hi
I'm looking for an option to write an excel/csv file with multiple sheets in it. The current "Write Excel" operator in Rapidminer can only write to a single sheet (under sheet name).
Is it possible to merge 2 excel files with different tab names into 1 file containing data from all tabs (across files) in Append mode ?
Thanks
Mandeep
Tagged:
0
Answers
hello @mandeep_kumar - welcome to the community. There are several threads on this topic including one exactly about this two weeks ago. Did you do a search?
https://community.rapidminer.com/t5/forums/searchpage/tab/message?advanced=false&allow_punctuation=false&q=excel%20multiple%20sheets
Scott
@sgenzer
I have gone through these threads and i don't think there is any option except to have Execute R/python script that can perform.
Do you know if there are any operators (in rapidminer) that can write data in multiple sheets in a single excel file.
thanks
Mandeep
Hi @mandeep_kumar,
To perform this task, you will need of Advanced Reporting Extension (to download and install from the MarketPlace).
Here a process, where 2 exampleSets are written in two differents sheets of a same Excel file.
NB : You have to create preliminarily in the Excel file, the sheets in which you want to write.
I hope this help,
Regards,
Lionel
I share you point that having it out of the box available would be a great thing, but till then this may help also.
Find attached a working sample writing different datasets to different tabs
<?xml version="1.0" encoding="UTF-8"?><process version="9.1.000"> <context> <input/> <output/> <macros/> </context> <operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process"> <parameter key="logverbosity" value="init"/> <parameter key="random_seed" value="2001"/> <parameter key="send_mail" value="never"/> <parameter key="notification_email" value=""/> <parameter key="process_duration_for_mail" value="30"/> <parameter key="encoding" value="SYSTEM"/> <process expanded="true"> <operator activated="true" class="generate_data" compatibility="9.1.000" expanded="true" height="68" name="Generate Data" width="90" x="112" y="136"> <parameter key="target_function" value="random"/> <parameter key="number_examples" value="100"/> <parameter key="number_of_attributes" value="5"/> <parameter key="attributes_lower_bound" value="-10.0"/> <parameter key="attributes_upper_bound" value="10.0"/> <parameter key="gaussian_standard_deviation" value="10.0"/> <parameter key="largest_radius" value="10.0"/> <parameter key="use_local_random_seed" value="false"/> <parameter key="local_random_seed" value="1992"/> <parameter key="datamanagement" value="double_array"/> <parameter key="data_management" value="auto"/> </operator> <operator activated="true" class="generate_data" compatibility="9.1.000" expanded="true" height="68" name="Generate Data (2)" width="90" x="112" y="238"> <parameter key="target_function" value="random"/> <parameter key="number_examples" value="100"/> <parameter key="number_of_attributes" value="5"/> <parameter key="attributes_lower_bound" value="-10.0"/> <parameter key="attributes_upper_bound" value="10.0"/> <parameter key="gaussian_standard_deviation" value="10.0"/> <parameter key="largest_radius" value="10.0"/> <parameter key="use_local_random_seed" value="false"/> <parameter key="local_random_seed" value="1992"/> <parameter key="datamanagement" value="double_array"/> <parameter key="data_management" value="auto"/> </operator> <operator activated="true" class="generate_data" compatibility="9.1.000" expanded="true" height="68" name="Generate Data (3)" width="90" x="112" y="340"> <parameter key="target_function" value="random"/> <parameter key="number_examples" value="100"/> <parameter key="number_of_attributes" value="5"/> <parameter key="attributes_lower_bound" value="-10.0"/> <parameter key="attributes_upper_bound" value="10.0"/> <parameter key="gaussian_standard_deviation" value="10.0"/> <parameter key="largest_radius" value="10.0"/> <parameter key="use_local_random_seed" value="false"/> <parameter key="local_random_seed" value="1992"/> <parameter key="datamanagement" value="double_array"/> <parameter key="data_management" value="auto"/> </operator> <operator activated="true" class="python_scripting:execute_python" compatibility="9.1.000" expanded="true" height="145" name="Execute Python (2)" width="90" x="313" y="136"> <parameter key="script" value="import pandas as pd import xlsxwriter import os user = os.getlogin() tpath = 'C:\\Users\\' + user + '\\data\\tmp' def rm_main(d3,d1,d2): writer = pd.ExcelWriter(tpath + '/multitab.xlsx', engine='xlsxwriter') d1.to_excel(writer, 'set1') d2.to_excel(writer, 'set2') d3.to_excel(writer, 'set3') # Save the result writer.save() return"/> <parameter key="use_default_python" value="true"/> <parameter key="package_manager" value="conda (anaconda)"/> </operator> <connect from_op="Generate Data" from_port="output" to_op="Execute Python (2)" to_port="input 1"/> <connect from_op="Generate Data (2)" from_port="output" to_op="Execute Python (2)" to_port="input 2"/> <connect from_op="Generate Data (3)" from_port="output" to_op="Execute Python (2)" to_port="input 3"/> <portSpacing port="source_input 1" spacing="0"/> <portSpacing port="sink_result 1" spacing="0"/> </process> </operator> </process>