Google Analytics xlsx format import issue
Hello
We export data from Google Analytics / Webmaster Tools / AdWords... Export -> Excel (xlsx format)
We tried the "Read Excel" operator on this file, but it gives an error; the Import Config Wizard gets stuck, too.
Please find a file attached.
What are we doing wrong?
Thanks,
Antal
PS
It comes from a Google Enterprise account, but I'm afraid files from normal accounts are the same.
Best Answer
Marco_Boeck Administrator, Moderator, Employee-RapidMiner, Member, University Professor Posts: 1,996 RM Engineering
Hi,
thanks for the report!
If you open the .xlsx file with Excel, it will immediately be modified. If you now save it again (without doing anything except having opened it), it will load successfully in Studio. So I guess the format you get from Google does not comply with the ECMA-376, 4th Edition standard.
I'm not sure we can circumvent that problem on our side, so my advice would be to file a bug report with Google so that their export actually complies with the standard definition.
Regards,
Marco
Answers
Hello Marco
Thank you for the quick turnaround. This is actually what we did; on the other hand:
- we have 1000s of analytics reports (weekly), with automatic updates
- the size roughly doubles after open/save
Anyhow thanks for your suggestion!
Cheers,
Antal
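Since an .xlsx file is a ZIP package, one way to see what Excel changes on open/save (and where the extra size comes from) is to compare the parts inside the original export and the re-saved copy. Below is a minimal Python sketch using only the standard library; the function names `list_parts` and `diff_parts` are made up for illustration, and it only lists differences, it does not repair the file.

```python
import zipfile


def list_parts(path):
    """Return {part name: uncompressed size} for an .xlsx (ZIP) package."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: info.file_size for info in zf.infolist()}


def diff_parts(original_path, resaved_path):
    """Compare the parts of two .xlsx packages by name and uncompressed size."""
    original = list_parts(original_path)
    resaved = list_parts(resaved_path)
    common = set(original) & set(resaved)
    return {
        "only_in_original": sorted(set(original) - set(resaved)),
        "only_in_resaved": sorted(set(resaved) - set(original)),
        "size_changed": sorted(n for n in common if original[n] != resaved[n]),
    }
```

Calling `diff_parts("google_export.xlsx", "resaved_in_excel.xlsx")` (both file names are placeholders) shows which parts Excel added, dropped or rewrote, which could be useful detail for a bug report to Google.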
Perhaps exporting in a different format would relieve the necessity of a workaround? I believe Google Analytics also allows report exports in other simpler formats, such as csv and tsv, both of which are also readable by RapidMiner.
Regards,
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Hello,
Good idea, thank you for sharing!
However in this situation we have to consider other factors:
- All the data has been generated / saved in xlsx for years
- Google csv has other issues: for example, the character encoding sometimes changes "randomly" (utf-16, utf-8, ISO-whatever...), which makes things a little challenging
- parallel reporting / BI / predictive tools use these xlsx files
- +++
Originally I wanted to make things easier by using xlsx, because of the csv issues we have been encountering for months now.
Testing tsv is coming up next.
Thanks,
Cheers,
Antal
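For the changing csv/tsv encodings mentioned above, one possible approach is to normalize every export to UTF-8 before importing it. This is only a sketch under assumptions: it detects UTF-8 and UTF-16 by their byte order mark, then tries plain UTF-8, and finally falls back to ISO-8859-1, which is a guess on my part and may not match what Google actually emits. UTF-16 without a BOM is not detected. The function names are made up for illustration.

```python
import codecs

# Byte order marks checked first; the codecs named here strip the BOM on decode.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_export(raw: bytes) -> str:
    """Decode exported bytes using the BOM if present, else UTF-8, else ISO-8859-1."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return raw.decode(encoding)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: treat anything else as ISO-8859-1 (this never raises,
        # but the result is wrong if the real encoding is something else).
        return raw.decode("iso-8859-1")


def normalize_to_utf8(src_path: str, dst_path: str) -> None:
    """Rewrite a csv/tsv export as UTF-8 without a BOM, keeping line endings."""
    with open(src_path, "rb") as src:
        text = decode_export(src.read())
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

Running `normalize_to_utf8` over the weekly exports as a preprocessing step would let the csv reader be configured with a single fixed encoding.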