Ex23 - Error while breaking code

Rathish · March 25, 2018, 11:51pm

Hi All,

While playing around with the code and rewriting it my own way, I encountered an error similar to the one mentioned here. In my case, though, I got a UnicodeDecodeError.
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 812-813: illegal UTF-16 surrogate
When I compared my code with the original, I found that I had made a mistake while opening the file.
lang = open("files/languages.txt", encoding=enc_form)
So instead of explicitly passing encoding="utf-8", I passed the encoding given on the command line ('enc_form' was unpacked from argv).
In this scenario, the output was perfect when I used utf-8, and the error appeared when I used utf-16. I think I am missing the logic. My question is, why are we hard-coding utf-8 there? Is it because the file was saved in utf-8 format, or because Python sees everything as utf-8? Also, when I googled the error, someone suggested using iso-8859-1 (whatever it is!) but that didn't help at all. Please help.

Thank you.

zedshaw · March 30, 2018, 3:30pm

That’s super weird because the file I give you is actually utf-8, so unless you’re changing it to be utf-16 you shouldn’t get this error. How are you grabbing the file?

Rathish · March 31, 2018, 2:25pm

I did the usual 'Right click > Save link as' (from Safari books, if it helps). The only difference, as I mentioned originally, is that I didn't hard-code encoding="utf-8" in the open() call. Instead, I passed the encoding through a command-line argument.

zedshaw · April 22, 2018, 12:22am

Well, it’s not hard coding if it’s the encoding of the file I give you. If I have utf-8, and then you remove that and it breaks, then you broke it. Time to put it back or come up with another fix.
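
For anyone who wants to see the point above in isolation, here is a minimal sketch, not the exercise's code. It assumes a short hypothetical sample string in place of the real files/languages.txt, and the helper name try_decode is made up for illustration. It encodes the text as UTF-8 (as the file on disk is) and then tries to decode those same bytes with each of the encodings mentioned in this thread.

```python
# Hypothetical sample text standing in for files/languages.txt (assumption).
sample = "Afrikaans\nአማርኛ\nбългарски\n"

# The bytes as they would sit on disk in a UTF-8 encoded file.
raw = sample.encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


for enc in ("utf-8", "utf-16", "iso-8859-1"):
    text = try_decode(raw, enc)
    if text is None:
        print(f"{enc}: decode error")
    elif text == sample:
        print(f"{enc}: round-trips correctly")
    else:
        print(f"{enc}: decoded without error, but the text is wrong")
```

Only the encoding the bytes were written with gives back the original text. iso-8859-1 maps every possible byte to some character, so it never raises, but the non-ASCII text comes out scrambled, which is probably why that suggestion "didn't help". The encoding passed to open() has to match how the file was saved; it is a property of the file, not of Python.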