I regularly export reports that are 300,000+ lines, so the only workable option is to output to CSV. I got really frustrated because certain characters (diacritics, Cyrillic letters, Greek letters) were corrupted when the file was opened in Excel, showing something like Г„/Г¤ or Г–/Г¶ in place of the original characters.
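As an aside, that kind of garbling is exactly what happens when UTF-8 bytes get decoded with a legacy single-byte code page. A minimal Python sketch reproduces it (cp1251 is just an assumed example of such a code page, not necessarily the one your Excel uses):

# Mojibake demo: UTF-8 bytes decoded with a legacy single-byte code page.
# cp1251 is assumed here purely for illustration.
for original in ["Ä", "ä", "Ö", "ö"]:
    utf8_bytes = original.encode("utf-8")    # e.g. Ä becomes b'\xc3\x84'
    garbled = utf8_bytes.decode("cp1251")    # those bytes render as 'Г„'
    print(original, "->", garbled)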
I raised a ticket with Docebo help as it seemed to be a reporting issue with CSV file exports. However, thanks to their (and Shaune Peebles') help, I was amazed to discover that Docebo reports do output correctly in CSV UTF-8 format, which, in theory, should remove the character corruption.
I was totally dumbfounded to learn from Microsoft support that it's a known, but little-publicized, issue: Excel misreads the UTF-8 byte order mark (BOM) that tells it to use 65001: Unicode (UTF-8) and tries to open the file as a standard CSV in the default code page, which causes the corruption.
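If you want to check whether your exported file actually carries that marker, the UTF-8 BOM is simply the three bytes EF BB BF at the very start of the file. A quick Python sketch (the file name is a placeholder for your own export):

# Check whether an exported CSV starts with the UTF-8 BOM (EF BB BF).
# "report.csv" is a placeholder for your exported file.
with open("report.csv", "rb") as f:
    has_bom = f.read(3) == b"\xef\xbb\xbf"
print("UTF-8 BOM present:", has_bom)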
OK, so for those of you who have not worked this out yet:
Don't just open the CSV file and let Excel import it automatically (it will get it wrong). Instead:
- Save the exported file as a csv
- Open Excel
- Import the data using Data --> Import External Data --> Import Data
- Select the file type of "csv" and browse to your file
- In the import wizard, change the File Origin to "65001: Unicode (UTF-8)" (or choose the correct language character identifier)
- Change the Delimiter to comma
- Select where to import the data and click Finish
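If you just want to sanity-check the export outside of Excel, reading it explicitly as UTF-8 shows the characters intact. A small Python sketch (the file name is a placeholder; "utf-8-sig" simply strips the BOM if one is present):

# Read the export explicitly as UTF-8 and peek at the first few rows.
# "report.csv" is a placeholder for your exported file.
import csv

with open("report.csv", newline="", encoding="utf-8-sig") as f:
    for i, row in enumerate(csv.reader(f)):
        print(row)
        if i >= 4:    # only show the first five rows
            break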
Hope this helps
Gary