Special character formatting issues with CSV reports in Microsoft Excel


Userlevel 5
Badge +1

I regularly export reports that are 300,000+ lines and so the only workable option is to output to CSV. I got really frustrated when certain  characters like diacritics, cyrillic letters, Greek letters when opened in Excel were corrupted  showing something like Г„/Г¤, Г–/Г¶.

I raised a ticket with Docebo help as it seemed to be a reporting  issue with CSV file exports. However, thanks to their (and Shaune Peebles) help. I was amazed to discover that Docebo reports do output correctly with CSV UTF-8  format. which, in theory, should remove the character corruption.

I was totally dumbfounded to learn from Microsoft support that it’s a known, but little publicized, issue that Excel misreads the UTF (BOM) marker that says it needs to use 65001: Unicode (UTF-8) and tries to open the file as standard CSV (which causes the corruption).

OK so for those of you who have not worked this out yet :

Don’t open a CSV file and let Excel import it (it will get it wrong)

  1. Save the exported file as a csv
  2. Open Excel
  3. Import the data using Data-->Import External Data --> Import Data
  4. Select the file type of "csv" and browse to your file
  5. In the import wizard change the File Origin to "65001 UTF" (or choose correct language character identifier)
  6. Change the Delimiter to comma
  7. Select where to import to and Finish

Hope this helps

 

Gary


10 replies

Userlevel 4
Badge

That’s great Gary. Very Helpful to know as we have staff in Eastern Europe with many accented characters. so far Excel export has been fine as we’re not that big but a time will come…

Richard

Userlevel 2

Gary, this has saved me so many hours and issues managing reports and CSV upload updates when reconfiguring platforms. Thank you very much for sharing! Jenny

Userlevel 6
Badge +1

Hey @Gary Jarvis; really glad to hear this, mate! I’ll be sure to share with the product team before our next catch-up.  

We’re here is you have any questions in the meantime :)

Userlevel 7
Badge +4

I’m so glad you found a solution and that it is working for you! This would be very frustrating for me as well! 

Userlevel 7
Badge +3

This is an awesome tip, thank you @Gary Jarvis!

 

I’m curious if you’ve ever experienced issues importing CSVs into Docebo that are saved properly as CSV UTF-8 from Excel or other tools like Google Sheets/Numbers? Throughout my years of working with Docebo, I recall a handful of instances where PUs or Superadmins had issues mapping fields appropriately when trying to import users via CSV and have never been able to solve the reason why. I’m wondering if this could be related?

Userlevel 7
Badge +7

HI @Gary Jarvis do you have any thoughts on IMPORTING data with accents? I did this using a CSV (not CSV-UTF8) with the UTF-8 setting on the import and all accents now are corrupt...they look like this:

 

this is supposed to be, è or é

pretty much all accents are converted to “?” for the most part

What’s strange is if I manually input the character, it sticks even when using the export to XLS option under the User Management menu. 

Exporting to CSV brings in the wonky characters you mention above...

Userlevel 7
Badge +7

Hi all, so after running a few tests I found that for accented characters, saving your input file in CSV UTF-8 format combined with the UTF-8 setting in the import tool DOES preserve the accents for the on-screen.

Userlevel 5
Badge +1

Yep i am afraid Microsoft has a lot to answer for here.:rage:  Docebo exports correctly however Excel  completely ignores the CSV type

When ever you import any CSV to Microsoft Excel you must use the import function 

  1. Open Excel
  2. Import the data using Data-->Import External Data --> Import Data
  3. Select the file type of "csv" and browse to your file
  4. In the import wizard change the File Origin to "65001 UTF" (or choose correct language character identifier)
  5. Change the Delimiter to comma
  6. Select where to import to and Finish

Also when you save any Excel worksheet as a CSV you must also select the “save as” option and then select  CSV UTF8 (Comma delimited)  as Excel will default to standard CSV and then mess up your character sets.

Its strange that Google sheets dos not have this issue :nerd:

Happy reporting and importing

Best regards,

Gary

 

Userlevel 7
Badge +7

Thanks for the details @Gary Jarvis...agree that MS Excel needs a bit of kick in the behind to fix these annoyances...these types of issues have been around for years unfortunately. 

Userlevel 3
Badge +1

Thank you so much for this info.  I had this problem and this helped tremendously.  Just wanted to update the steps since Microsoft made some changes to their menus.

Here are the steps that was provided for the problems we were having with the names:

  1. Download the zipped CSV from the site and unpack it.
  2. Open a new excel sheet.
  3. From the Data ribbon, select From Text/CSV
  4. Find and select the downloaded CSV file.
  5. At the pop up, select 65001 UTF and Comma for the delimeter and Load.
  6. Once loaded confirm the characters look normal and then save the file to a XLXS format by the standard Save As method.

Hope this helps someone else!

Reply