I read the threads and I have a similar problem.
I use the raw bytes as input: I took the last ten lines from languages.txt, which we converted in the first part.
Whenever I use .readline() or .strip() I get a string, and no amount of encoding seems to give me the raw bytes back.
I thought this would be really easy.
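For what it's worth, here is a minimal sketch of why .readline() hands back a string: a file opened in text mode decodes every line, while the same file opened with "rb" returns bytes. The file sample.txt and its contents are made up for this demonstration and are not part of the exercise.

```python
import os
import tempfile

# Hypothetical sample file, written only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Türkçe\n")

# Text mode: readline() decodes the line and returns str.
with open(path, encoding="utf-8") as f:
    text_line = f.readline()

# Binary mode: readline() returns the undecoded bytes.
with open(path, "rb") as f:
    byte_line = f.readline()

print(type(text_line), repr(text_line.strip()))
print(type(byte_line), repr(byte_line.strip()))
```

Note that binary mode only helps when the file holds the encoded text itself, not a printed b'...' representation of it.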
I'm pasting some crude code here, but the best I can do so far is get a double-quoted byte string, which doesn't look great …
OUTPUT
b"b'\xd0\xa2\xd0\xbe\xd2\xb7\xd0\xb8\xd0\xba\xd3\xa3'\n"
b"b'T\xc3\xbcrk\xc3\xa7e'\n"
b"b'\xd0\xa3\xd0\xba\xd1\x80\xd0\xb0\xd1\x97\xd0\xbd\xd1\x81\xd1\x8c\xd0\xba\xd0\xb0'\n"
b"b'\xd8\xa7\xd8\xb1\xd8\xaf\xd9\x88'\n"
b"b'Ti\xe1\xba\xbfng Vi\xe1\xbb\x87t'\n"
b"b'V\xc3\xb5ro'\n"
b"b'\xe6\x96\x87\xe8\xa8\x80'\n"
b"b'\xe5\x90\xb4\xe8\xaf\xad'\n"
b"b'\xd7\x99\xd7\x99\xd6\xb4\xd7\x93\xd7\x99\xd7\xa9'\n"
b"b'\xe4\xb8\xad\xe6\x96\x87'\n"
#############
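My guess at what the output above means, as a small sketch: if a line in the input file is the printed text of a bytes literal (an assumption, I have not confirmed how the file was written), then .encode() encodes the characters of that literal, prefix and quotes included, instead of recovering the original bytes. The sample line below is typed in by hand to mimic one such line.

```python
# Assumption: this mimics one line of the input file, i.e. the characters
# b, ', T, \, x, c, 3, ... and not the encoded bytes themselves.
line = r"b'T\xc3\xbcrk\xc3\xa7e'" + "\n"

# encode() turns each character of the literal into a byte.
encoded = line.encode()

# The literal's own b' prefix has become part of the data.
print(encoded[:2])

# The text of the literal is much longer than the bytes it describes.
print(len(line.strip()), "characters in the literal")
print(len("Türkçe".encode("utf-8")), "bytes in the UTF-8 encoding")
```

That would explain the nested b"b'...'" in the printout: the outer b"..." is the real bytes object, the inner b'...' is just text that came from the file.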
import sys

script, input_encoding, error = sys.argv


def main(language_file, encoding, errors):
    line = language_file.readline()
    ## line is a string - let's turn it to bytes here
    raw_bytes = line.encode()
    if line:
        print_line(raw_bytes, encoding, errors)
        return main(language_file, encoding, errors)


## can I do the conversion in here? I wonder …
def print_line(raw_bytes, encoding, errors):
    #print(raw_bytes)
    raw = raw_bytes.strip()
    #print(raw)
    #print(type(raw))
    ## have bytes now, let's turn to string?
    cooked_string = raw.decode(encoding, errors=errors)
    print(cooked_string)
    print(type(cooked_string))


languages = open("input_bytes.txt", encoding="utf-8")
main(languages, input_encoding, error)
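One possible way out, as a hedged sketch rather than a definitive answer: if each line of input_bytes.txt really is the text of a bytes literal (an assumption based on the output above), ast.literal_eval from the standard library can parse that text back into a real bytes object, which can then be decoded as usual. The helper name parse_bytes_literal and the sample line are made up for illustration.

```python
import ast


def parse_bytes_literal(text):
    # Hypothetical helper: parse the text of a bytes literal into bytes.
    # literal_eval only evaluates Python literals, never arbitrary code.
    value = ast.literal_eval(text.strip())
    if not isinstance(value, bytes):
        raise ValueError(f"not a bytes literal: {text!r}")
    return value


# Assumption: this mimics one line read from the input file in text mode.
line = r"b'V\xc3\xb5ro'" + "\n"

raw_bytes = parse_bytes_literal(line)
cooked_string = raw_bytes.decode("utf-8", errors="strict")

print(raw_bytes)
print(cooked_string)
```

In the script above, this would stand in for the line.encode() call in main, with the if line: check moved ahead of it so the empty string at end of file is never parsed.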