My attempt to improve oop_test.py

AdamHe · January 31, 2020, 2:11am

Hello everyone, I’m new to this community, and while I’ve attempted to read everything I can before posting, please accept my apologies in advance if I’m breaking the rules.

My project is to improve upon oop_test.py, as I found it very, very helpful in drilling concepts into my thick skull. The improvement I want to implement is that when the user input is incorrect, the program shows the user the correct answer and the differences between the correct answer and what the user input.

To this end, I’ve googled away and dropped down several rabbit holes, but I do have a functioning version. It is not, however, as pretty or efficient as I would like, so I am turning to you fine folks for assistance.

My issues with this code are as follows:

I have two different methods of comparing the correct answer with the user input. I would like to have a single method.
The output of the second method I’m using, symmetric_difference, is not as readable for the user as I would like.

I’ve commented my code as best I can, and I have placed it below (feedback on my commenting skills would also be helpful!):

import random
from urllib.request import urlopen
import sys

WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []

PHRASES = {
    "class %%%(%%%):":
      "Make a class named %%% that is-a %%%.",
    "class %%%(object):\n\tdef __init__(self, ***)" :
      "class %%% has-a __init__ that takes self and *** params.",
    "class %%%(object):\n\tdef ***(self, @@@)":
      "class %%% has-a function *** that takes self and @@@ params.",
    "*** = %%%()":
      "Set *** to an instance of class %%%.",
    "***.***(@@@)":
      "From *** get the *** function, call it with params self, @@@.",
    "***.*** = '***'":
      "From *** get the *** attribute and set it to '***'."
}

# do they want to drill phrases first
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
else:
    PHRASE_FIRST = False

# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(str(word.strip(), encoding="utf-8"))


def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []

    for i in range(0, snippet.count("@@@")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(
            random.sample(WORDS, param_count)))

    for sentence in snippet, phrase:
        # this is how you duplicate a list or string
        result = sentence[:]

        # fake class names
        for word in class_names:
            result = result.replace("%%%", word, 1)

        # fake other names
        for word in other_names:
            result = result.replace("***", word, 1)

        # fake parameter lists
        for word in param_names:
            result = result.replace("@@@", word, 1)

        results.append(result)

    return results


# keep going until they hit CTRL-D
try:
    while True:
        snippets = list(PHRASES.keys())
        random.shuffle(snippets)

        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question

            print(f"\n{question}")

            # New code begins here.  Set the user input to a variable.

            user_input = input("\n> ")

            # Create a variable to hold the correct answer without tabs or line feeds.
            # Done because I think the user input won't have either of these.

            stripped_answer = answer.replace('\n', ' ').replace('\t', '')

            # Print both the correct answer and the user input.

            print(f"\nCorrect answer: \"{stripped_answer}\"\n")

            print(f"Your answer: \"{user_input}\"\n")

            # Comparing the variables; the correct answer and what user input.

            if user_input != stripped_answer:

                print("Incorrect answer, please try again.\n")

                # Goal is to show the user where they got the answer wrong.

                # First we're creating lists to hold the correct answer
                # and the user input strings parsed on spaces.

                user_input_list = list()

                for user_inputListItem in user_input.split(' '):

                    user_input_list.append(user_inputListItem)

                stripped_answer_list = list()

                for stripped_answerListItem in stripped_answer.split(' '):

                    stripped_answer_list.append(stripped_answerListItem)

                # If the two lists have identical length,
                # then iteratively compare each item (string) in the list
                # and print out the user input when they are different.

                if len(user_input_list) == len(stripped_answer_list):

                    for i in user_input_list:

                        if i != stripped_answer_list[user_input_list.index(i)]:

                            print(f"This text you wrote is the problem: \"{i}\"\n")

                            print(f"It should read: \"{stripped_answer_list[user_input_list.index(i)]}\"\n")

                # Here's where I am having some issues.
                # If the user input list is longer than the correct answer list,
                # then compare lists using symmetric_difference.
                # Note that symmetric_difference only uses sets, so I had to convert the lists.
                # Problems with this approach:
                    # 1. Every member of a set is unique, so duplicated text strings are removed.
                    # 2. Output is hard to follow - user has to find which text string is part of the
                    #    correct answer or the user input.

                else:

                        print(f"The issue is here: {set(user_input_list).symmetric_difference(set(stripped_answer_list))}")

            else:

                print("You got the correct answer - well done!\n")

except EOFError:
    print("\nBye")

Screenshot of the code in action:

Any help would be greatly appreciated - thank you!

-Adam

AdamHe · January 31, 2020, 2:24am

In case it is not immediately obvious, here are the reasons for my answers in the screenshot:

First answer: Just making sure that the correct answer works. It does.

Second answer: An example of the user making a typo. Here I am making sure that the if len(user_input_list) == len(stripped_answer_list): code block works properly when the condition of the if is TRUE.

Third answer: An example of the user making what I think is a very common mistake, attempting to “get” a function by calling “self”. This inserts an additional element into the user_input_list and exercises the else: section of the if len(user_input_list) == len(stripped_answer_list): code block when the if is FALSE and the user input has more elements than the answer.

Fourth answer: I answered this correctly because it wasn’t of the format I desired.

Fifth answer: I omitted an equals sign, as I felt this would exercise the code in a subtly different way than the third answer, illustrating the situation when the stripped_answer_list has more elements than the user_input_list.

Hopefully that helps!

-Adam

florian · January 31, 2020, 9:18am

Hi Adam,
this is interesting, I like it!

I have a few tips:

Firstly, user_input_list = user_input.split() would suffice to split a string into a list of words because the split method already returns a list, and it splits on whitespace by default.
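To make the difference between the two calls concrete, here is a small sketch. The sample string is made up for illustration; it has two spaces after "class" so the two behaviours diverge.

```python
# Hypothetical input with a doubled space after "class".
typed = "class  Banana(Fruit):"

# Splitting on a single space keeps an empty string for the doubled space.
on_single_space = typed.split(' ')

# Splitting with no argument treats any run of whitespace as one separator.
on_whitespace = typed.split()

print(on_single_space)  # ['class', '', 'Banana(Fruit):']
print(on_whitespace)    # ['class', 'Banana(Fruit):']
```

Either way the result is already a list, so no extra loop is needed to build one.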

The for-loop after that is kind of upside down. It’s not just a cosmetic problem because you’ll get bugs when there are identical items in one of the lists. index always returns the first occurrence of the searched item. Try it this way:

for i in range(len(list1)):
    if list1[i] != list2[i]:
        # report the difference
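Here is a small sketch of why the index lookup goes wrong when a word is repeated, next to a position-based comparison. The two word lists are invented for the demonstration, and both are assumed to have the same length.

```python
# Made-up lists of equal length; the word "cat" appears twice in the typed list.
typed = ["cat", "cat", "dog"]
expected = ["cat", "dog", "dog"]

# Lookup via index(): typed.index("cat") is 0 for BOTH occurrences,
# so the second "cat" is compared with expected[0] and the error is missed.
missed = [word for word in typed if word != expected[typed.index(word)]]

# Position-based comparison: each word is compared with the word at the
# same position in the other list.
found = [(i, t, e) for i, (t, e) in enumerate(zip(typed, expected)) if t != e]

print(missed)  # []
print(found)   # [(1, 'cat', 'dog')]
```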

Then, using sets to get all elements that are not in both lists is an interesting idea, but it’s also kind of ugly because you have a hard time telling what exactly was wrong. Why don’t you just scan the two lists and report the first difference, regardless of whether the lists are equally long or not?
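If a single comparison path for lists of any length is the goal, one option is the standard library's difflib module. This is a different technique from anything used in the script so far, and the helper name describe_differences is hypothetical; treat it as a rough sketch rather than a drop-in replacement.

```python
import difflib


def describe_differences(expected_words, typed_words):
    """Return readable messages for every place the typed words differ."""
    matcher = difflib.SequenceMatcher(a=expected_words, b=typed_words,
                                      autojunk=False)
    messages = []
    # Each opcode says how to turn expected_words into typed_words.
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        expected = " ".join(expected_words[i1:i2])
        typed = " ".join(typed_words[j1:j2])
        if tag == "replace":
            messages.append(f'replace "{typed}" with "{expected}"')
        elif tag == "delete":
            # Present in the expected answer, absent from the typed one.
            messages.append(f'missing "{expected}"')
        elif tag == "insert":
            # Present in the typed answer, absent from the expected one.
            messages.append(f'extra "{typed}"')
    return messages


expected = "cat.purr = 'loud'".split()
print(describe_differences(expected, "cat.purr 'loud'".split()))
print(describe_differences(expected, "cat.self.purr = 'loud'".split()))
```

Unlike a set, this keeps word order and duplicates, and it says which side each word came from.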

As for the lists: You could get cleaner tokens if you split the strings on word boundaries instead of whitespace, but you’d have to take a look at the re module. Try this in the shell: (The hieroglyphs in the second line are a regular expression that matches at each word boundary except for the beginning of the string.)

>>> import re
>>> pattern = re.compile(r'(?<!^)\b')
>>> code = "class Banana(Fruit)"
>>> pattern.split(code)
['class', ' ', 'Banana', '(', 'Fruit', ')']

I hope this helps a bit!

AdamHe · February 1, 2020, 2:15am

This is very useful - thank you. My code is now updated with this change.

Why don’t you just scan the two lists and report the first difference, regardless of whether the lists are equally long or not?

I did think about doing this, but I wanted to get all the errors in the user input, not just the first one.

The for-loop after that is kind of upside down. It’s not just a cosmetic problem because you’ll get bugs when there are identical items in one of the lists. index always returns the first occurrence of the searched item.

More good info - thank you! Code updated.

I have updated my code with this, thank you! I have a lot to learn on how lists work, and your assistance has been huge for me.

One of my major issues, and the reason for this check to ensure that the lists have the same number of elements ( if len(user_input_list) == len(stripped_answer_list): ) is that I don’t know how to iterate through the lists if there are different numbers of elements in each. I could append the shorter of the two lists with empty strings entries to make them match in size, but that seems … clumsy? Not sure what the right way to approach this problem is, so I simply IF cased it, and bounced the different length lists to the symmetric_difference code.
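One way to walk two lists of different lengths without padding them by hand is itertools.zip_longest from the standard library, which supplies a placeholder once the shorter list runs out. A minimal sketch; the helper name mismatches and the empty-string fill value are assumptions, not part of the original script.

```python
from itertools import zip_longest


def mismatches(typed_words, expected_words):
    """Collect (position, typed, expected) for every position that differs."""
    problems = []
    # fillvalue="" stands in for the missing words of the shorter list.
    pairs = zip_longest(typed_words, expected_words, fillvalue="")
    for position, (typed, expected) in enumerate(pairs):
        if typed != expected:
            problems.append((position, typed, expected))
    return problems


print(mismatches("cat.purr 'loud'".split(), "cat.purr = 'loud'".split()))
# [(1, "'loud'", '='), (2, '', "'loud'")]
```

Neither input list is modified, and the same loop handles equal and unequal lengths, so the length check and the symmetric_difference branch are no longer needed for this purpose.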

As for the lists: You could get cleaner tokens if you split the strings on word boundaries instead of whitespace, but you’d have to take a look at the re module.

The reason I decided not to separate out the characters like parentheses as elements in the list is that I wanted to be able to determine if I’d put the correct spaces into the answer. Correct me if I’m wrong, but if I were using the regular expression you provided to populate the answer and user input lists, the code would not be able to detect the difference between these two lines:

"class Banana(Fruit)"

"class Banana ( Fruit )"

Not catching this distinction seems like a problem to me. If I’m wrong on that, please let me know; definitely still learning a lot, as you can tell.
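For what it’s worth, a quick check suggests the word-boundary pattern does keep the spacing, because the whitespace ends up inside the tokens. A minimal sketch, assuming Python 3.7 or later (earlier versions of re.split do not split on empty matches); the variable names are only for illustration.

```python
import re

# Raw string, so \b means "word boundary" rather than a backspace character.
pattern = re.compile(r'(?<!^)\b')

tight = pattern.split("class Banana(Fruit)")
spaced = pattern.split("class Banana ( Fruit )")

print(tight)   # ['class', ' ', 'Banana', '(', 'Fruit', ')']
print(spaced)  # ['class', ' ', 'Banana', ' ( ', 'Fruit', ' )']
print(tight == spaced)  # False
```

The two token lists differ at the positions holding the parentheses, so a position-by-position comparison would still flag the extra spaces.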

Thank you again! Updated code below if anyone cares to read it.

import random
from urllib.request import urlopen
import sys

WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []

PHRASES = {
    "class %%%(%%%):":
      "Make a class named %%% that is-a %%%.",
    "class %%%(object):\n\tdef __init__(self, ***)" :
      "class %%% has-a __init__ that takes self and *** params.",
    "class %%%(object):\n\tdef ***(self, @@@)":
      "class %%% has-a function *** that takes self and @@@ params.",
    "*** = %%%()":
      "Set *** to an instance of class %%%.",
    "***.***(@@@)":
      "From *** get the *** function, call it with params self, @@@.",
    "***.*** = '***'":
      "From *** get the *** attribute and set it to '***'."
}

# do they want to drill phrases first
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
else:
    PHRASE_FIRST = False

# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(str(word.strip(), encoding="utf-8"))


def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []

    for i in range(0, snippet.count("@@@")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(
            random.sample(WORDS, param_count)))

    for sentence in snippet, phrase:
        # this is how you duplicate a list or string
        result = sentence[:]

        # fake class names
        for word in class_names:
            result = result.replace("%%%", word, 1)

        # fake other names
        for word in other_names:
            result = result.replace("***", word, 1)

        # fake parameter lists
        for word in param_names:
            result = result.replace("@@@", word, 1)

        results.append(result)

    return results


# keep going until they hit CTRL-D
try:
    while True:
        snippets = list(PHRASES.keys())
        random.shuffle(snippets)

        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question

            print(f"\n{question}")

            # New code begins here.  Set the user input to a variable.

            user_input = input("\n> ")

            # Create a variable to hold the correct answer without tabs or line feeds.
            # Done because I think the user input won't have either of these.

            stripped_answer = answer.replace('\n', ' ').replace('\t', '')

            # Print both the correct answer and the user input.

            print(f"\nCorrect answer: \"{stripped_answer}\"\n")

            print(f"Your answer: \"{user_input}\"\n")

            # Comparing the variables; the correct answer and what user input.

            if user_input != stripped_answer:

                print("Incorrect answer, please try again.\n")

                # Goal is to show the user where they got the answer wrong.

                # First we're creating lists to hold the correct answer
                # and the user input strings parsed on spaces.

                user_input_list = list()

                user_input_list = user_input.split()

                stripped_answer_list = list()

                stripped_answer_list = stripped_answer.split()

                # If the two lists have identical length,
                # then iteratively compare each item (string) in the list
                # and print out the user input when they are different.

                if len(user_input_list) == len(stripped_answer_list):

                    for i in range(len(user_input_list)):

                        if user_input_list[i] != stripped_answer_list[i]:

                            print(f"This text you wrote is the problem: \"{user_input_list[i]}\"\n")

                            print(f"It should read: \"{stripped_answer_list[i]}\"\n")

                # Here's where I am having some issues.
                # If the user input list is longer than the correct answer list,
                # then compare lists using symmetric_difference.
                # Note that symmetric_difference only uses sets, so I had to convert the lists.
                # Problems with this approach:
                    # 1. Every member of a set is unique, so duplicated text strings are removed.
                    # 2. Output is hard to follow - user has to find which text string is part of the
                    #    correct answer or the user input.

                else:

                        print(f"The issue is here: {set(user_input_list).symmetric_difference(set(stripped_answer_list))}")

            else:

                print("You got the correct answer - well done!\n")

except EOFError:
    print("\nBye")

AdamHe · February 1, 2020, 2:21am

Here’s a screenshot demonstrating why I want to have the for loop through every string in the list and not just stop on the first issue:

florian · February 1, 2020, 10:05am

I see. Regular expressions are really powerful; you could try something like this:

>>> s = "class  Banana (Fruit ):"
>>> p = re.compile(r'\w+|\s+|\W')
>>> p.findall(s)
['class', '  ', 'Banana', ' ', '(', 'Fruit', ' ', ')', ':']

This regex matches any sequence of one or more word characters OR any sequence of one or more whitespace characters OR any single non-word character. The findall method simply finds all matches in a given string.

florian · February 1, 2020, 10:09am

The first line is redundant. split returns a list anyway.

AdamHe · February 5, 2020, 9:36pm

Thank you for helping me improve my code! I’ve updated my oop_test_adamedit.py file with this change, removing both statements establishing the lists.

Agreed, but for what I’m after, parsing the answer on spaces seems adequate. I’ve updated my code again; this time I’ve added two new functions: one to equalize the length of the lists and one to print out where the text is wrong.

I think I’ve succeeded in removing the symmetric_difference code path entirely, but I would very much like to hear what you (or anyone who is more knowledgeable than I am) have to say about it.

Thank you @florian!

Code:

import random
from urllib.request import urlopen
import sys

WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []

PHRASES = {
    "class %%%(%%%):":
      "Make a class named %%% that is-a %%%.",
    "class %%%(object):\n\tdef __init__(self, ***)" :
      "class %%% has-a __init__ that takes self and *** params.",
    "class %%%(object):\n\tdef ***(self, @@@)":
      "class %%% has-a function *** that takes self and @@@ params.",
    "*** = %%%()":
      "Set *** to an instance of class %%%.",
    "***.***(@@@)":
      "From *** get the *** function, call it with params self, @@@.",
    "***.*** = '***'":
      "From *** get the *** attribute and set it to '***'."
}

# do they want to drill phrases first
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
else:
    PHRASE_FIRST = False

# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(str(word.strip(), encoding="utf-8"))


def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []

    for i in range(0, snippet.count("@@@")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(
            random.sample(WORDS, param_count)))

    for sentence in snippet, phrase:
        # this is how you duplicate a list or string
        result = sentence[:]

        # fake class names
        for word in class_names:
            result = result.replace("%%%", word, 1)

        # fake other names
        for word in other_names:
            result = result.replace("***", word, 1)

        # fake parameter lists
        for word in param_names:
            result = result.replace("@@@", word, 1)

        results.append(result)

    return results


# New code begins here.  Defining functions...


def EqualizeListLengths(List1, List2):

    if len(List1) == len(List2):

        return 1

    elif len(List1) > len(List2):

#        print(f"Length of list 1: {len(List1)}.\n\nLength of list 2: {len(List2)}."

        while len(List1) != len(List2):

            List2.append(' ')

    elif len(List2) > len(List1):

#        print(f"Length of list 2: {len(List2)}.\n\nLength of list 1: {len(List1)}."

        while len(List2) != len(List1):

            List1.append(' ')

    else:

        print("You should never see this message.")


    IncorrectAnswer(List1, List2)



def IncorrectAnswer(List1, List2):

    if len(user_input_list) == len(stripped_answer_list):

        print("\nIncorrect answer, please try again.\n")

        # Goal is to show the user where they got the answer wrong.

        # If the two lists have identical length,
        # then iteratively compare each item (string) in the list
        # and print out the user input when they are different.

        # Print both the correct answer and the user input.

        print(f"Correct answer: \"{stripped_answer}\"")

        print(f"Your answer: \t\"{user_input}\"\n")

        for i in range(len(user_input_list)):

            if user_input_list[i] != stripped_answer_list[i]:

                print(f"This text you wrote is a problem: \t\"{user_input_list[i]}\"")

                print(f"The text should read: \t\t\t\"{stripped_answer_list[i]}\"\n")

    else:

        EqualizeListLengths(List1, List2)



# keep going until they hit CTRL-D
try:
    while True:
        snippets = list(PHRASES.keys())
        random.shuffle(snippets)

        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question

            print(f"\n{question}")

            # Set the user input to a variable.

            user_input = input("\n> ")

            # Create a variable to hold the correct answer without tabs or line feeds.
            # Done because I think the user input won't have either of these.

            stripped_answer = answer.replace('\n', ' ').replace('\t', '')


            # First we're creating lists to hold the correct answer
            # and the user input strings parsed on spaces.

            user_input_list = user_input.split()

            stripped_answer_list = stripped_answer.split()

            # Comparing the variables; the correct answer and what user input.

            if user_input != stripped_answer:

                IncorrectAnswer(user_input_list, stripped_answer_list)

                # Here's where I am having some issues.
                # If the user input list is longer than the correct answer list,
                # then compare lists using symmetric_difference.
                # Note that symmetric_difference only uses sets, so I had to convert the lists.
                # Problems with this approach:
                    # 1. Every member of a set is unique, so duplicated text strings are removed.
                    # 2. Output is hard to follow - user has to find which text string is part of the
                    #    correct answer or the user input.


#                else:

#                    EqualizeListLengths(user_input_list, stripped_answer_list)

#                    for i in range(len(user_input_list)):

#                        if user_input_list[i] != stripped_answer_list[i]:

#                            print(f"This text you wrote is the problem: \"{user_input_list[i]}\"\n")

#                            print(f"It should read: \"{stripped_answer_list[i]}\"\n")

#                   print(f"The issue is here: {set(user_input_list).symmetric_difference(set(stripped_answer_list))}")

            else:

                print("\nYou got the correct answer - well done!\n")

except EOFError:
    print("\nBye")

zedshaw · February 5, 2020, 11:07pm

These comments are all good, but I’ll throw in a few pointers:

You name things WithCamelCase when they should be using underscore_style_naming. IncorrectAnswer should be incorrect_answer. List1 should be list1 (actually it should be a better name than just 1 or 2).
This code is kind of old. You could probably do away with most of convert by just using .format() on a dict.

Try those out.
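As a rough idea of what the .format() suggestion could look like, here is a minimal sketch. It assumes templates with ordinary named placeholders ({cls}, {parent}, {name}) instead of the %%% / *** / @@@ markers, and a small stand-in word list; the names fill_templates and SAMPLE_WORDS are hypothetical.

```python
import random

# Stand-in for the downloaded word list (assumption for this sketch).
SAMPLE_WORDS = ["apple", "banana", "cherry", "door", "egg", "frog"]


def fill_templates(snippet, phrase, words=SAMPLE_WORDS):
    """Fill the code snippet and its English phrase with the same random names."""
    picked = random.sample(words, 3)
    names = {
        "cls": picked[0].capitalize(),
        "parent": picked[1].capitalize(),
        "name": picked[2],
    }
    # Keys that a template does not use are simply ignored by format().
    return snippet.format(**names), phrase.format(**names)


code, english = fill_templates(
    "class {cls}({parent}):",
    "Make a class named {cls} that is-a {parent}.",
)
print(code)
print(english)
```

Because both strings are formatted from the same dict, the question and the answer always agree on the names.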

AdamHe · February 7, 2020, 7:45pm

Fixed - thank you for helping me get better at writing comprehensible code.

I’ve adopted some notation conventions in my naming; if there’s a better method for me to use, please let me know, this is something I want to make sure I have good habits about.

This led me down quite the rabbit hole. During my journey of discovery I learned all kinds of new things, from how format() works to format_map(), to the fact that dictionaries have a uniqueness requirement upon their keys (which seems completely reasonable and intuitive now that I write it down).

From what I can tell, str.format(**dict_name) seems to me to be functionally equivalent to str.format_map(dict_name) - but as a newb, I am likely missing something important here - please let me know!
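For a plain dict the two calls do give the same result. The difference shows up with dict subclasses: format_map() uses the mapping object directly, while ** unpacking copies the items into an ordinary dict first. A small sketch; the class KeepUnknown and the template are made up for illustration.

```python
class KeepUnknown(dict):
    """Dict subclass that leaves unknown placeholders untouched."""

    def __missing__(self, key):
        return "{" + key + "}"


template = "class {cls}({parent}):"
names = {"cls": "Banana", "parent": "Fruit"}

# With a plain dict, both spellings produce the same string.
same = template.format(**names) == template.format_map(names)

# format_map() consults __missing__ for the absent "parent" key.
partial = KeepUnknown(cls="Banana")
kept = template.format_map(partial)

print(same)  # True
print(kept)  # class Banana({parent}):
```

Calling template.format(**partial) instead raises KeyError, because the unpacked keyword arguments no longer know about __missing__.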

I am also interested in any critique anyone has on how I’ve gone about updating the code.

Something I am a bit unsure about: I felt it necessary to put a number after each instance of @@@ / *** / %%% in the PHRASES strings as otherwise it seemed every instance of the string was replaced with the dictionary item, not just the first instance. Perhaps this is a difference between str.format(**dict_name) and str.format_map(dict_name)? I have not yet tested with both methods and I did use format_map() for my initial development.
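On the numbering: a replacement field is looked up by name every time it appears, so every field with the same name receives the same value. This holds for both format() and format_map(). A short sketch with made-up field names:

```python
names = {"obj": "cat", "attr": "purr"}

# The field {obj} appears twice, so "cat" is substituted twice.
repeated = "{obj}.{attr} = '{obj}'".format(**names)

# Distinct names (here numbered) allow distinct values.
numbered = "{obj0}.{attr0} = '{obj1}'".format(obj0="cat", attr0="purr", obj1="loud")

print(repeated)  # cat.purr = 'cat'
print(numbered)  # cat.purr = 'loud'
```

This is unlike str.replace(old, new, 1), which the original convert relied on to consume one marker at a time, so numbering the placeholders is a reasonable way to keep them apart.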

Regardless, thank you for sending me on this quest of knowledge @zedshaw - it has been a fascinating exploration.

I’ve kept the debugging print statements I used in this version of my code:

import random
from urllib.request import urlopen
import sys

WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []

PHRASES = {
    "class {%%%0}({%%%1}):":
      "Make a class named {%%%0} that is-a {%%%1}.",
    "class {%%%0}(object):\n\tdef __init__(self, {***0})" :
      "class {%%%0} has-a __init__ that takes self and {***0} params.",
    "class {%%%0}(object):\n\tdef {***0}(self, {@@@0})":
      "class {%%%0} has-a function {***0} that takes self and {@@@0} params.",
    "{***0} = {%%%0}()":
      "Set {***0} to an instance of class {%%%0}.",
    "{***0}.{***1}({@@@0})":
      "From {***0} get the {***1} function, call it with params self, {@@@0}.",
    "{***0}.{***1} = '{***2}'":
      "From {***0} get the {***1} attribute and set it to '{***2}'."
}

# do they want to drill phrases first
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
else:
    PHRASE_FIRST = False

# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(str(word.strip(), encoding="utf-8"))


# The convert function has been deprecated
# Do not use without updating to take into account the numbers added to PHRASES strings.
def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []

    for i in range(0, snippet.count("@@@")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(
            random.sample(WORDS, param_count)))

    for sentence in snippet, phrase:
        # this is how you duplicate a list or string
        result = sentence[:]

        # fake class names
        for word in class_names:
            result = result.replace("{%%%}", word, 1)

        # fake other names
        for word in other_names:
            result = result.replace("{***}", word, 1)

        # fake parameter lists
        for word in param_names:
            result = result.replace("{@@@}", word, 1)

        results.append(result)

        print(f"The results: {results}")

    return results


# New code begins here.  Defining functions...

def convert_version2(cv2_snippet, cv2_phrase):

    cv2_class_names = [q.capitalize() for q in
                    random.sample(WORDS, cv2_snippet.count("%%%"))]

    cv2_other_names = random.sample(WORDS, cv2_snippet.count("***"))

    cv2_results = []

    cv2_param_names = []

    cv2_the_dict = dict()


    for i in range(0, cv2_snippet.count("@@@")):
        cv2_param_count = random.randint(1,3)
        cv2_param_names.append(', '.join(
            random.sample(WORDS, cv2_param_count)))


    for cv2_sentence in cv2_snippet, cv2_phrase:

        cv2_result = cv2_sentence[:]

        print(f"cv2_snippet: {cv2_snippet}")

        print(f"cv2_phrase: {cv2_phrase}")

        print(f"The cv2_class_names: {cv2_class_names}")

        print(f"The cv2_other_names: {cv2_other_names}")

        print(f"The cv2_param_names: {cv2_param_names}")


        counter = 0

        for word in cv2_class_names:

            print(f"The word: {word}")

            print(f"The counter: {counter}")

            the_key = f"%%%{counter}"

            print(f"The key: {the_key}")

            cv2_the_dict.update({the_key: word})

            print(f"The dict: {cv2_the_dict}")

            counter = counter + 1


        counter = 0

        for word in cv2_other_names:

            print(f"The word: {word}")

            print(f"The counter: {counter}")

            the_key = f"***{counter}"

            print(f"The key: {the_key}")

            cv2_the_dict.update({the_key: word})

            print(f"The dict: {cv2_the_dict}")

            counter = counter + 1


        counter = 0

        for word in cv2_param_names:
            print(f"The word: {word}")

            print(f"The counter: {counter}")

            the_key = f"@@@{counter}"

            print(f"The key: {the_key}")

            cv2_the_dict.update({the_key: word})

            print(f"The dict: {cv2_the_dict}")

            counter = counter + 1


        print(f"The result: {cv2_result}")

        # This works.
        cv2_result = cv2_result.format(**cv2_the_dict)

        # But this also works.
        # cv2_result = cv2_result.format_map(cv2_the_dict)

        print(f"The updated result: {cv2_result}")

        cv2_results.append(cv2_result)

        print(f"The cv2_results: {cv2_results}")

    return cv2_results



def equalize_list_lengths(ell_input_list, ell_stripped_list):

    if len(ell_input_list) == len(ell_stripped_list):

        return 1

    elif len(ell_input_list) > len(ell_stripped_list):

        while len(ell_input_list) != len(ell_stripped_list):

            ell_stripped_list.append(' ')

    elif len(ell_stripped_list) > len(ell_input_list):

        while len(ell_stripped_list) != len(ell_input_list):

            ell_input_list.append(' ')

    else:

        print("You should never see this message.")


    incorrect_answer(ell_input_list, ell_stripped_list)



def incorrect_answer(ia_input_list, ia_stripped_list):

    if len(ia_input_list) == len(ia_stripped_list):

        print("\nIncorrect answer, please try again.\n")

        # Goal is to show the user where they got the answer wrong.

        # If the two lists have identical length,
        # then iteratively compare each item (string) in the list
        # and print out the user input when they are different.

        # Print both the correct answer and the user input.

        print(f"Correct answer: \"{stripped_answer}\"")

        print(f"Your answer: \t\"{user_input}\"\n")

        for i in range(len(ia_input_list)):

            if ia_input_list[i] != ia_stripped_list[i]:

                print(f"This text you wrote is a problem: \t\"{ia_input_list[i]}\"")

                print(f"The text should read: \t\t\t\"{ia_stripped_list[i]}\"\n")

    else:

        equalize_list_lengths(ia_input_list, ia_stripped_list)



# keep going until they hit CTRL-D
try:
    while True:
        snippets = list(PHRASES.keys())
        random.shuffle(snippets)

        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert_version2(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question

            print(f"\n{question}")

            # Set the user input to a variable.

            user_input = input("\n> ")

            # Create a variable to hold the correct answer without tabs or line feeds.
            # Done because I think the user input won't have either of these.

            stripped_answer = answer.replace('\n', ' ').replace('\t', '')


            # First we're creating lists to hold the correct answer
            # and the user input strings parsed on spaces.

            user_input_list = user_input.split()

            stripped_answer_list = stripped_answer.split()

            # Compare the correct answer with what the user typed.

            if user_input != stripped_answer:

                incorrect_answer(user_input_list, stripped_answer_list)

            else:

                print("\nYou got the correct answer - well done!\n")

except EOFError:
    print("\nBye")

And I’ve removed the debug print statements from this one:

import random
from urllib.request import urlopen
import sys

WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []

PHRASES = {
    "class {%%%0}({%%%1}):":
      "Make a class named {%%%0} that is-a {%%%1}.",
    "class {%%%0}(object):\n\tdef __init__(self, {***0})" :
      "class {%%%0} has-a __init__ that takes self and {***0} params.",
    "class {%%%0}(object):\n\tdef {***0}(self, {@@@0})":
      "class {%%%0} has-a function {***0} that takes self and {@@@0} params.",
    "{***0} = {%%%0}()":
      "Set {***0} to an instance of class {%%%0}.",
    "{***0}.{***1}({@@@0})":
      "From {***0} get the {***1} function, call it with params self, {@@@0}.",
    "{***0}.{***1} = '{***2}'":
      "From {***0} get the {***1} attribute and set it to '{***2}'."
}

# do they want to drill phrases first
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True
else:
    PHRASE_FIRST = False

# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(str(word.strip(), encoding="utf-8"))


# Deprecated: convert() predates the numbered placeholders in PHRASES.
# Do not use it without updating its replace() patterns to match them.
def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []

    for i in range(0, snippet.count("@@@")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(
            random.sample(WORDS, param_count)))

    for sentence in snippet, phrase:
        # this is how you duplicate a list or string
        result = sentence[:]

        # fake class names
        for word in class_names:
            result = result.replace("{%%%}", word, 1)

        # fake other names
        for word in other_names:
            result = result.replace("{***}", word, 1)

        # fake parameter lists
        for word in param_names:
            result = result.replace("{@@@}", word, 1)

        results.append(result)

    return results


# New code begins here.  Defining functions...

def convert_version2(cv2_snippet, cv2_phrase):

    cv2_class_names = [q.capitalize() for q in
                    random.sample(WORDS, cv2_snippet.count("%%%"))]

    cv2_other_names = random.sample(WORDS, cv2_snippet.count("***"))

    cv2_results = []

    cv2_param_names = []

    cv2_the_dict = dict()


    for i in range(0, cv2_snippet.count("@@@")):

        cv2_param_count = random.randint(1,3)

        cv2_param_names.append(', '.join(
            random.sample(WORDS, cv2_param_count)))


    for cv2_sentence in cv2_snippet, cv2_phrase:

        cv2_result = cv2_sentence[:]

        counter = 0

        for word in cv2_class_names:

            the_key = f"%%%{counter}"

            cv2_the_dict.update({the_key: word})

            counter = counter + 1

        counter = 0


        for word in cv2_other_names:

            the_key = f"***{counter}"

            cv2_the_dict.update({the_key: word})

            counter = counter + 1


        counter = 0

        for word in cv2_param_names:

            the_key = f"@@@{counter}"

            cv2_the_dict.update({the_key: word})

            counter = counter + 1


        # Fill the numbered placeholders from the dict.
        cv2_result = cv2_result.format(**cv2_the_dict)

        # Equivalent alternative that passes the dict directly:
        # cv2_result = cv2_result.format_map(cv2_the_dict)

        cv2_results.append(cv2_result)

    return cv2_results



def equalize_list_lengths(ell_input_list, ell_stripped_list):

    if len(ell_input_list) == len(ell_stripped_list):

        return 1

    elif len(ell_input_list) > len(ell_stripped_list):

        while len(ell_input_list) != len(ell_stripped_list):

            ell_stripped_list.append(' ')

    elif len(ell_stripped_list) > len(ell_input_list):

        while len(ell_stripped_list) != len(ell_input_list):

            ell_input_list.append(' ')

    else:

        print("You should never see this message.")


    incorrect_answer(ell_input_list, ell_stripped_list)



def incorrect_answer(ia_input_list, ia_stripped_list):

    if len(ia_input_list) == len(ia_stripped_list):

        print("\nIncorrect answer, please try again.\n")

        # Goal is to show the user where they got the answer wrong.

        # If the two lists have identical length,
        # then iteratively compare each item (string) in the list
        # and print out the user input when they are different.

        # Print both the correct answer and the user input.

        print(f"Correct answer: \"{stripped_answer}\"")

        print(f"Your answer: \t\"{user_input}\"\n")

        for i in range(len(ia_input_list)):

            if ia_input_list[i] != ia_stripped_list[i]:

                print(f"This text you wrote is a problem: \t\"{ia_input_list[i]}\"")

                print(f"The text should read: \t\t\t\"{ia_stripped_list[i]}\"\n")

    else:

        equalize_list_lengths(ia_input_list, ia_stripped_list)



# keep going until they hit CTRL-D
try:
    while True:
        snippets = list(PHRASES.keys())
        random.shuffle(snippets)

        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert_version2(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question

            print(f"\n{question}")

            # Set the user input to a variable.

            user_input = input("\n> ")

            # Create a variable to hold the correct answer without tabs or line feeds.
            # Done because I think the user input won't have either of these.

            stripped_answer = answer.replace('\n', ' ').replace('\t', '')


            # First we're creating lists to hold the correct answer
            # and the user input strings parsed on spaces.

            user_input_list = user_input.split()

            stripped_answer_list = stripped_answer.split()

            # Compare the correct answer with what the user typed.

            if user_input != stripped_answer:

                incorrect_answer(user_input_list, stripped_answer_list)

            else:

                print("\nYou got the correct answer - well done!\n")

except EOFError:
    print("\nBye")
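The equalize_list_lengths / incorrect_answer pair could probably be folded into a single comparison function. What follows is only a minimal sketch, assuming itertools.zip_longest is acceptable for padding the shorter list; report_differences is a hypothetical name that is not part of the script above, and it takes both strings as arguments instead of reading the user_input and stripped_answer globals.

```python
from itertools import zip_longest


def report_differences(user_input, correct_answer):
    """Return one message per word position where the two answers differ.

    Hypothetical helper, not part of the original script.
    """
    messages = []
    user_words = user_input.split()
    correct_words = correct_answer.split()

    # zip_longest pads the shorter list with fillvalue, so the lists
    # never need to be equalized by hand.
    pairs = zip_longest(user_words, correct_words, fillvalue="")
    for position, (typed, expected) in enumerate(pairs, start=1):
        if typed != expected:
            messages.append(
                f'Word {position}: you wrote "{typed}", expected "{expected}"')
    return messages


for line in report_differences("class Foo(bar)", "class Foo(Bar): pass"):
    print(line)
```

Because the function returns a list of messages, the main loop can decide how to print them, and an empty list means the answers match word for word.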

zedshaw · February 13, 2020, 12:43pm

Hey, that’s great. I’d say you’ve probably beaten this to death, but if you want to push further and really use format, you’d probably also have to change the input data. This is just off the top of my head, but look at the %%%, ***, and @@@ strings being replaced and you’ll see that they’re being used for “class name”, “function name”, and “variable name”.

Next, you’d take the input words and make a list of dicts that holds a name for each of these things. So, something like:

random_names = []
for each word:
   sample = {'class': random.sample(words), 'function': random.sample(words), etc};
   random_names.append(sample)

That way you end up with a list of already randomized dicts, each with a name already set for every one of those roles.

Then your convert function becomes nothing more than:

snip_naming = random.sample(random_names)
format_snip = snippet.format(snip_naming)
phrase_naming = random.sample(random_names)
format_phrase = phrase.format(phrase_naming)

None of that is Python code that works, but the idea is that if you just change the data, this becomes easier. Turn it into a list of dictionaries with the settings already made and you’re done, for the most part.
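A runnable sketch of that idea might look like the following. It rests on several assumptions: a small inline WORDS list stands in for the downloaded words.txt, the placeholders are renamed by role ({cls}, {parent}, {var}, {func}, {params} are illustrative names, not from the original script), and make_naming is a hypothetical helper. Unlike the outline above, it reuses one naming dict for both the snippet and the phrase, since picking two different dicts would make the question and the answer disagree.

```python
import random

# Assumption: a small inline word list stands in for the downloaded words.txt.
WORDS = ["apple", "bridge", "candle", "dragon", "engine", "forest", "garden"]

# Assumption: placeholders renamed by role; these field names are illustrative.
PHRASES = {
    "class {cls}({parent}):":
        "Make a class named {cls} that is-a {parent}.",
    "{var} = {cls}()":
        "Set {var} to an instance of class {cls}.",
    "{var}.{func}({params})":
        "From {var} get the {func} function, call it with params self, {params}.",
}


def make_naming(words):
    """Build one dict with a random name already chosen for every role."""
    cls, parent, var, func = random.sample(words, 4)
    params = ", ".join(random.sample(words, random.randint(1, 3)))
    return {"cls": cls.capitalize(), "parent": parent.capitalize(),
            "var": var, "func": func, "params": params}


# A list of pre-randomized dicts, one per word.
random_names = [make_naming(WORDS) for _ in WORDS]


def convert(snippet, phrase):
    """Fill both strings from the same naming so question and answer agree."""
    naming = random.choice(random_names)
    # str.format ignores keyword arguments the template does not use.
    return snippet.format(**naming), phrase.format(**naming)


snippet, phrase = random.choice(list(PHRASES.items()))
question, answer = convert(snippet, phrase)
print(question)
print(answer)
```

With the data shaped this way, convert needs no counters or replace loops; all of the randomizing happens once, when random_names is built.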