Ex23: The strings aren’t correctly decoded

I ran the ex23.py script, and only the strings written in English are output correctly. All the other strings are unintelligible.

" b’Afrikaans’ <==> Afrikaans
b’\xc3\xa1\xc5\xa0 \xc3\xa1\xcb\x86\xe2\x80\xba\xc3\xa1\xcb\x86\xc2\xad\xc3\xa1\xc5\xa0\xe2\x80\xba’ <==> አማርኛ
b’\xc3\x90\xc2\x90\xc3\x92\xc2\xa7\xc3\x91\xc2\x81\xc3\x91\xcb\x86\xc3\x93\xe2\x84\xa2\xc3\x90\xc2\xb0’ <==> Аҧсшәа
b’\xc3\x98\xc2\xa7\xc3\x99\xe2\x80\x9e\xc3\x98\xc2\xb9\xc3\x98\xc2\xb1\xc3\x98\xc2\xa8\xc3\x99\xc5\xa0\xc3\x98\xc2\xa9’ <==> العربية
b’Aragon\xc3\x83\xc2\xa9s’ <==> Aragonés
b’Arpetan’ <==> Arpetan
b’Az\xc3\x89\xe2\x84\xa2rbaycanca’ <==> AzÉ™rbaycanca
b’Bamanankan’ <==> Bamanankan
b’\xc3\xa0\xc2\xa6\xc2\xac\xc3\xa0\xc2\xa6\xc2\xbe\xc3\xa0\xc2\xa6\xe2\x80\x9a\xc3\xa0\xc2\xa6\xc2\xb2\xc3\xa0\xc2\xa6\xc2\xbe’ <==> বাংলা
b’B\xc3\x83\xc2\xa2n-l\xc3\x83\xc2\xa2m-g\xc3\x83\xc2\xba’ <==> Bân-lâm-gú
b’\xc3\x90\xe2\x80\x98\xc3\x90\xc2\xb5\xc3\x90\xc2\xbb\xc3\x90\xc2\xb0\xc3\x91\xe2\x82\xac\xc3\x91\xc6\x92\xc3\x91\xc2\x81\xc3\x90\xc2\xba\xc3\x90\xc2\xb0\xc3\x91\xc2\x8f’ <==> Беларуская
b’\xc3\x90\xe2\x80\x98\xc3\x91\xc5\xa0\xc3\x90\xc2\xbb\xc3\x90\xc2\xb3\xc3\x90\xc2\xb0\xc3\x91\xe2\x82\xac\xc3\x91\xc2\x81\xc3\x90\xc2\xba\xc3\x90\xc2\xb8’ <==> Български
b’Boarisch’ <==> Boarisch
b’Bosanski’ <==> "

It seems like a problem with my system, but how do I fix it?

That’s correct. It has to be that way: the output is showing how strings in scripts other than the Roman alphabet are stored as bytes in computers.
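As a rough illustration of that text-to-bytes round trip, here is a minimal sketch. The sample word and the variable names are only picked for the example, and it assumes UTF-8 as the encoding:

```python
# Round trip between text (str) and bytes, assuming UTF-8.
name = "Aragonés"

# encode: text -> bytes. The accented letter becomes two bytes in UTF-8.
raw_bytes = name.encode("utf-8")

# decode: bytes -> text, using the same encoding.
cooked = raw_bytes.decode("utf-8")

print(raw_bytes, "<==>", cooked)
print(len(name), "characters,", len(raw_bytes), "bytes")
```

Plain ASCII letters stay one byte each, which is why the English-looking names appear unchanged on the left side.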

Hello @thesuzan
Welcome to the forum.

To help you better, it is best if we can see your code.
Usually it is a tiny thing that breaks the code, and it is sometimes impossible to see it yourself.
Other people, or even you, often spot it once it is posted here on the forum.
But sometimes one has to copy the code and run it to see where the problem is.

So please do like this:
Copy your code.
Put it between:

[code] 
and
[/code]

It will look like this:

def my_code():
   print("something")

It is also a good idea to tell us which system you are on.
Some problems are specific to Mac, others to Windows.
Linux also has issues.


@Dynamo_Nishant But are the strings on the right side of the arrow supposed to show the actual names of the languages?

@ulfen69 Here’s the code:

import sys
script, input_encoding, error = sys.argv

def main(language_file, encoding, errors):
    line = language_file.readline()
    
    if line:
        print_line(line, encoding, errors)
        return main(language_file, encoding, errors)

def print_line(line, encoding, errors):
    next_lang = line.strip()
    raw_bytes = next_lang.encode(encoding, errors=errors)
    cooked_string = raw_bytes.decode(encoding, errors=errors)

    print(raw_bytes, "<==>", cooked_string)

languages = open("languages.txt", encoding="utf-8")

main(languages, input_encoding, error)

System Info

“Microsoft Windows 10 Home
Version 1903
OS Build 18362.175”

I’m actually running the scripts on an Ubuntu subsystem on Windows, but trying the same code in CMD and in Anaconda Prompt gave the same results.

I’m not sure about Linux, but on Windows 10 you need to do two things to get the languages to print correctly at the command line: run “chcp 65001” to set the code page to UTF-8, and install the locale (font) files for each language. You get them for free from the Microsoft Store: go to Settings -> Time & Language -> Language, and under the “Windows Display Language” drop-down there is a link to get what they call Local Experience Packs. You would have to get one for each language you want to display. Hope this helps, and maybe you could pass this on to others who want to see the exercise work fully.
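To see why the console encoding matters, here is a small sketch that can be run from the same prompt. It reports which encoding Python thinks the console uses, then simulates what a console stream would receive under two different encodings. The helper name bytes_sent_to_console is made up for this example, and the simulation is only an approximation of a real console:

```python
import io
import sys

# The encoding Python believes the console is using (may be None when redirected).
print("console encoding reported by Python:", getattr(sys.stdout, "encoding", None))

def bytes_sent_to_console(text, encoding):
    """Return the bytes a text stream configured with `encoding` would emit."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding, errors="replace")
    stream.write(text)
    stream.flush()
    stream.detach()  # keep the buffer open after the wrapper goes away
    return buffer.getvalue()

# With UTF-8 the accented letter survives; with ASCII it is replaced by "?".
print(bytes_sent_to_console("Aragonés", "utf-8"))
print(bytes_sent_to_console("Aragonés", "ascii"))

# On Python 3.7+ the real stream can also be switched, if you want to try it:
# sys.stdout.reconfigure(encoding="utf-8")
```

If the first line does not report utf-8 (or cp65001) after running chcp 65001, the code page change probably did not reach the session Python is running in.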

@KennethB So, I need to download all the language packs for the languages that I want to display? That seems to be really tedious.

Hi, I think we had this problem a while back, and changing the command prompt font in Windows to a monospace font fixed it. It’s worth a try.
I can’t remember the name of the monospace font in Windows PowerShell.

@thesuzan, for the ones you want to display in the console or PowerShell, yes. Not all of the packs, just the ones you want to display correctly. For example, a friend of mine grew up in Japan and wanted to display Kanji at the command prompt because they didn’t want people reading over their shoulder. The system still used the English language; it just used the Kanji characters for the English letters.


I tried that, but it didn’t work.


But this exercise is trying to display a lot of different languages. There are 97 of them that I want to display correctly. So I need to download all 97 of them? And I don’t even know what languages they are.

Try this shell out instead:

https://cmder.net/

I think it supports utf-8 better, and it’s also just better than raw powershell.


@zedshaw, sorry, that doesn’t fix it either.

Then welcome to Windows. I’d just move on with your life and not worry about it. This is a constant problem with Windows shells, and the main reason that forcing everyone to use Unicode tends to alienate 50% of the computing world. It’s not important enough to get hung up on for 11 days. In fact, I’d say your main goal is not to obsessively solve everything, but to take notes on the things you didn’t solve and move on. Otherwise you’ll get stuck and give up over things that aren’t really that important.

Sorry @zedshaw, you sound offended. I’m on chapter 40 in your book. It’s just that I would feel way better if everything worked well.

I am totally not offended by you because none of this is your fault. You’re just an unsuspecting victim of Microsoft’s incompetence. I am offended at Windows PowerShell and Python for totally getting this wrong and making it difficult for people. PowerShell has had broken Unicode for decades and they refuse to fix it, so we get unsuspecting people like you trying to fix it when it’s nearly impossible to fix reliably. No other platform has this problem, so I end up just having to tell people to let it go and move on with their lives, since fixing the real problem would require convincing Microsoft to actually support Unicode in PowerShell without having to download 97 fonts.

My advice to you, though, is to accept that sometimes things just don’t work well and move on. In fact, things not working perfectly is the norm in programming, so learning to accept it when you can’t change it will help you quite a lot. You can focus your energy on things that you can fix and learn, and ignore things you can’t, like Unicode in PowerShell.

That’s all. If you’re on Ex40 then you’re doing great. I was under the impression that you were stuck here for 11 days and needed a push to get past it, but if you’re on 40 then you’re doing it right. Carry on.
