Problems with utf-8, Exercise 23

WinWin32 · August 9, 2018, 3:13am

So, I can’t use utf-8 in my script. It keeps saying it’s an “invalid continuation byte”.

WinWin32 · August 9, 2018, 3:46am

Fixed my own problem.

Instead of downloading the languages.txt file, I had copied the contents into a .txt file.
Silly me.

zedshaw · August 9, 2018, 4:31pm

Ah that’s another way it gets messed up.

tradegreek · August 28, 2018, 7:16pm

I get the same error. How did you download the file? I can’t find a download button. I also copied it into a file and named it languages.txt, so I am not sure why it doesn’t work.

tradegreek · August 28, 2018, 7:25pm

I was not able to download it via my PDF reader. I used another one that let me save the link target, which worked.

Does anyone know why this error happened? Maybe I am just super ignorant, but I am not sure why downloading would work when copying the document and saving the file under the correct name does not.

zedshaw · August 29, 2018, 5:36pm

Easiest is to put your mouse over this:

https://learnpythonthehardway.org/python3/languages.txt

DON’T CLICK IT! Right click, save as. Now it should be perfect.
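
As for why copying breaks it, here is a minimal sketch of one likely cause. It assumes the editor saved the pasted text in a legacy Windows encoding such as cp1252 instead of UTF-8; that is an assumption, since the actual editor and its settings are not known. The helper name `try_decode` is made up for this illustration.

```python
# Hypothetical illustration: the same word saved under two encodings.
word = "Aragonés"

utf8_bytes = word.encode("utf-8")      # bytes as they appear in a UTF-8 file
legacy_bytes = word.encode("cp1252")   # assumption: what an editor may write after copy/paste


def try_decode(raw: bytes) -> str:
    """Decode raw as strict UTF-8 and return the text, or the error reason."""
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


print(utf8_bytes, "->", try_decode(utf8_bytes))
print(legacy_bytes, "->", try_decode(legacy_bytes))
```

In cp1252 the letter é is the single byte 0xe9. A UTF-8 decoder reads 0xe9 as the start of a multi-byte character and expects a continuation byte next, but finds a plain `s`, so strict decoding fails with "invalid continuation byte". Downloading the file keeps the original UTF-8 bytes untouched.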

azaza · September 23, 2018, 1:53pm

Thanks - that fixed my problem…

gtracy · November 15, 2019, 4:19pm

Thanks for posting this. I was doing it wrong. Got it now. Also, thanks for writing the Python3 book. I also have your original one. Been resisting moving to three, but now I’ve found “resistance is futile. Prepare to be absorbed.” Cheers.

zedshaw · November 16, 2019, 4:41am

Ah, I just replied to you in help so you can ignore the reply. This also solves it.

RayL · January 4, 2020, 1:23pm

I did not get the identical output that you show as exercise 23 on page 79 of “Learn PYTHON 3 the HARD WAY”. I did as you suggest in this post (it’s “Save link as…” – not “save as”) to download the file of languages. I also was very careful to make sure I wrote the code as you show in the book.
Here is my output:

[code]
PS C:\Users\Ray\lpthw> ex23.py utf-8 strict
b'Afrikaans' <===> Afrikaans
b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b' <===> አማርኛ
b'\xd0\x90\xd2\xa7\xd1\x81\xd1\x88\xd3\x99\xd0\xb0' <===> Аҧсшәа
b'\xd8\xa7\xd9\x84\xd8\xb9\xd8\xb1\xd8\xa8\xd9\x8a\xd8\xa9' <===> العربية
b'Aragon\xc3\xa9s' <===> Aragonés

[code]

Look at line 5, where my output is: “b'Aragon\xc3\xa9s'” …
Your line 5 output is: “'V\xc3\xb5ro'” …

I’m wondering if others got different output as well?
Ray
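
One quick check, as a sketch only: decoding both byte strings shows whether either one is damaged. The variable names `mine` and `book` are made up here, and the byte values are copied from the two lines quoted in this post.

```python
# The two byte strings being compared, typed with straight quotes.
mine = b'Aragon\xc3\xa9s'
book = b'V\xc3\xb5ro'

for raw in (mine, book):
    text = raw.decode("utf-8", errors="strict")
    # Round trip: encoding the decoded text should give back the same bytes.
    assert text.encode("utf-8") == raw
    print(raw, "<===>", text)
```

Both decode cleanly under strict UTF-8, to two different language names. That suggests the outputs come from different lines of the file and not from an encoding problem. Which entry the book prints fifth is not something this sketch can tell.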