Ex48 the fifth test

GK77 · May 16, 2019, 1:15pm

I’m having problems with the 5th test, the numbers one. I know I have to convert the string from the user into an integer with int('string'), but I am not sure where in the function I should do this. I have tried this so far:

def scan(sentence):

    results = []
    words = sentence.split()
    for word in words:
        word_type = lexicon.get(word)
        if word_type == ['number']:
            star = int(word)
        else:
            results.append((word_type, word))

    return results

I have also put strings in the lexicon dictionary

zedshaw · May 16, 2019, 3:15pm

That’s the best spot, but you can just append the int right there. The other option is to realize that any time you do int() on something that’s not an integer it throws an exception. So you could do this:

try:
   results.append((word_type, int(word)))
except ValueError:
   results.append((word_type, word))

You can also use the isdigit() function on word to ask if it’s all digits (aka a number).
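To make the difference between the two approaches concrete, here is a minimal sketch. The helper names convert_with_try and convert_with_isdigit are made up for illustration and are not part of the exercise:

```python
def convert_with_try(word):
    """Return int(word) if the conversion works, otherwise the word unchanged."""
    try:
        return int(word)
    except ValueError:
        return word


def convert_with_isdigit(word):
    """Return int(word) only when every character in word is a digit."""
    if word.isdigit():
        return int(word)
    return word


# Compare both helpers on a number, a plain word, and a signed number.
for sample in ["1234", "bear", "-5"]:
    print(sample, convert_with_try(sample), convert_with_isdigit(sample))
```

The two agree on "1234" and "bear" but differ on "-5": int() accepts the leading minus sign, while isdigit() returns False because "-" is not a digit.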

GK77 · May 20, 2019, 4:58pm

I can’t pass the ‘number’ test, I am trying to use the isdigit() method as below:

def scan(sentence):

    results = []
    words = sentence.split()
    for word in words:
        if word.isdigit() == True:
            word = int(word)
        else:
            word_type = lexicon.get(word)
            return results

zedshaw · May 20, 2019, 6:09pm

Ok, first you don’t need to do this:

if word.isdigit() == True:

You can just do:

if word.isdigit():

Ok, next, you’ve gotta debug this. You have to do a print of your results to see what is being returned. I’ll tell you right now that you are doing a return in the else but not in the if word.isdigit(). That’s going to give you an empty result because you default to results = [] at the top and never add anything to it.

Take some time to debug this yourself. Remember to do printing of variables and run it repeatedly changing code to see what’s wrong. Don’t just stare at it.

GK77 · May 28, 2019, 4:33pm

I’m not sure how to get the program to print variables, if I run it nothing happens.

zedshaw · May 29, 2019, 3:14am

Try this out:

https://docs.python.org/3/library/pdb.html

At the top of your code put:

import pdb

Then, at the spot that you know it’s running, put this:

pdb.set_trace()

When that line of code is hit, you’ll drop into the python debugger. “drop into” means your program stops running and pauses, and then you get something like the Python shell. Then you can actually print out variables, run python code, and step through your code.
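Here is a rough sketch of where those two lines could go. The USE_DEBUGGER switch is a hypothetical addition for this example so the code can run without pausing; the loop body is a placeholder, not the real scanner:

```python
import pdb

# Hypothetical switch for this sketch: set to True to pause in the debugger.
USE_DEBUGGER = False


def scan(sentence):
    results = []
    for word in sentence.split():
        if USE_DEBUGGER:
            pdb.set_trace()  # execution pauses here on every pass of the loop
        results.append(word)  # placeholder for the real scanning logic
    return results


print(scan("go north 3"))
```

With the switch turned on, you get a (Pdb) prompt each time the loop reaches that line. There, p word prints a variable, n runs the next line, c continues to the next pause, and q quits.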

Now, the other thing is, you have to also learn to use print as it’s very useful. So you say “it’s not running at all and I don’t know why”. That means you have to find out why. So, you put a line like this:

print(">>>>>>>>>>>>>>>>>>>>>>>>>> I ran")

Put that at the top. Does it print? Move it down to the bottom, and then see if it runs. You can think of this as a way to figure out where it’s running and then where it finally doesn’t run. Basically you need to do a print to see where it “magically” stops running like you expect. Once you identify where the code doesn’t work how you think, you can do more prints to find out what’s going on. So, you can do things like this:

def scan(sentence):
    results = []
    words = sentence.split()
    print(">>>> before for, words=", words)
    for word in words:
        if word.isdigit() == True:
            word = int(word)
            print(">>> word=", word)
        else:
            word_type = lexicon.get(word)
            print(">>>> return, word=", word, "results=", results)
            return results

See how I put a print with information so I can trace each step. This is the key, and also the same thing you do with the pdb debugger. You print information. Also, do you see how your return results is probably not at the right indentation? It’s under the else but you probably mean it to be under the function.

Finally, you can’t fix code without trying changes. You can’t fix code by staring at it and reading it. You have to change it. You have to add prints. You have to try ideas. Change this. Change that. See what happens.

Try those things.

GK77 · July 1, 2019, 12:34pm

I still can’t get this to work, I keep getting the same error:

File "/Users/geetakotecha/Documents/lpthw/projects/scanner/tests/lexicon_tests.py", line 38, in test_numbers

('number', 91234)])

AssertionError: Lists differ: [('number', 3)] != [('number', 3), ('number', 91234)]

Second list contains 1 additional elements.

First extra element 1:

('number', 91234)

[('number', 3)]

[('number', 3), ('number', 91234)]

Here is my code:

lexicon = {
           "north": 'direction',
           "south": 'direction',
           "east": 'direction',
           "west": 'direction',
           "go": 'verb',
           "eat": 'verb',
           "kill": 'verb',
           "the": 'stop',
           "in": 'stop',
           "of": 'stop',
           "bear": 'noun',
           "princess": 'noun',
            1234: 'number',
            3: 'number',
            91234: 'number'
}


def scan(sentence):

    results = []

    for word in sentence.split():
        try:
            results.append(('number', int(word)))
            print(word)
        except ValueError:
            for item in sentence:
                if word.lower() in item:
                    found_category = category
                    break
                else:
                    found_category = 'error'
                    results.append((found_category, word))

        return results

zedshaw · July 3, 2019, 9:03pm

I think maybe this code is halfway between versions: in an earlier one you hard-coded the numbers (that means you just wrote them in directly rather than computing them) as entries in your dictionary of words. Then it looks like you added code to do the int() conversion but left the numbers in.

First, take those numbers out of the dictionary and see what that does for you.
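A quick way to see why the integer keys don't help, using a cut-down, hypothetical lexicon with one integer key like the ones in the code above:

```python
# Cut-down, hypothetical lexicon with one integer key.
lexicon = {"north": "direction", 3: "number"}

words = "north 3".split()
print(words)                # split() always produces strings: ['north', '3']
print("3" in lexicon)       # False: the string "3" is not the integer key 3
print(int("3") in lexicon)  # True: only the converted integer matches the key
```

Since the scanner only ever sees strings from split(), an integer key is never matched directly, and once int() does the conversion the key isn't needed anyway.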

Next, you have a kind of strange logic that I’ll write out like this:

for each word in the sentence
- try to make a number
- value error not a number
  - for each word in the sentence AGAIN

See how that inner (second) for-loop is going over every word in the sentence extra times? In fact, I think if you have 10 words, this would go through them potentially 10*10 times, or 100 times. In fact, you can test it:

$ python
>>> words = [1,2,3,4,5,6,7,8,9,10]
>>> count = 0
>>> for i in words:
...     for j in words:
...         count += 1
...         print(i, j)
... 
1 1
1 2
1 3
...
10 9
10 10
>>> print(count)
100
>>>

So the next thing is for you to get rid of that inner for-loop and then I think it’ll start to make more sense and work better. I do think there’s potentially another issue, but this should stop giving you extra results you don’t want.
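One way to drop the inner loop, sketched under the assumption that the lexicon maps lowercase words to categories. The categorize helper and the three-word lexicon are hypothetical; the point is only that a dictionary can be asked about a key directly, so no second loop is needed:

```python
# Hypothetical three-word lexicon, just for this sketch.
lexicon = {"north": "direction", "go": "verb", "bear": "noun"}


def categorize(word):
    # dict.get returns the second argument when the key is missing
    return lexicon.get(word.lower(), "error")


print(categorize("North"))
print(categorize("banana"))
```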

I’m also a big fan of taking code you wrote down a wrong path and deleting it, then writing it again. Think of first versions of code like a sketch drawing for a painting. It could end up wrong, but it’s just a quick sketch so you can toss it and do another one as a way to study the problem. I frequently toss out code and erase what I did and try again. Give that a shot. You may like it.

Final advice, after you delete this version of scan and start over, write out comments explaining what it should do, like this:

def scan(sentence):
    # go through each word in the sentence
    # try to convert it to a number
    # if that fails, then lower case it and see if it's in the dictionary
    # if it is then add it to the result
    # if not then add an error

You can even copy that, but try doing it on your own. Writing the steps as comments lets you think the problem through in a language you already think in all day long, so it’s easier. Once it seems right, then put your code under each comment to make the comment work. Once your tests are passing, delete the comments and clean up.
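For reference, here is one possible way those comments could turn into code. This is only a sketch: it assumes a cut-down lexicon of lowercase words, and it keeps the original casing of each word in the result, which may or may not be what your tests expect:

```python
# Cut-down lexicon for this sketch; the real one has more words.
lexicon = {"north": "direction", "go": "verb", "the": "stop", "bear": "noun"}


def scan(sentence):
    results = []
    # go through each word in the sentence
    for word in sentence.split():
        try:
            # try to convert it to a number
            results.append(("number", int(word)))
        except ValueError:
            # if that fails, then lower case it and see if it's in the dictionary
            word_type = lexicon.get(word.lower())
            if word_type is not None:
                # if it is then add it to the result
                results.append((word_type, word))
            else:
                # if not then add an error
                results.append(("error", word))
    # return once, after the loop has seen every word
    return results


print(scan("go north 3 91234 ASDF"))
```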

Try that.

GK77 · July 29, 2019, 10:54am

I have tried the following code:

lexicon = {
           "north": 'direction',
           "south": 'direction',
           "east": 'direction',
           "west": 'direction',
           "go": 'verb',
           "eat": 'verb',
           "kill": 'verb',
           "the": 'stop',
           "in": 'stop',
           "of": 'stop',
           "bear": 'noun',
           "princess": 'noun',
            1234: 'number',
            3: 'number',
            91234: 'number'
}


def scan(sentence):

    # go through each word in the sentence
    words = sentence.split()
    for word in words:
    # try to convert it to a number
        try:
            int(word)
    # if that fails, then lower case it and see if it's in the dictionary
        except:
            if word.lower() in lexicon:
                # if it is then add it to the result
                return results
                # if not then add an error
            else:
                print("error")

but I am still getting the same error

zedshaw · July 30, 2019, 10:26pm

Ok you have to work on this more. It looks like you did about 20% of what you need to solve it. Let’s take a look:

def scan(sentence):

    # go through each word in the sentence
    words = sentence.split()
    for word in words:
    # try to convert it to a number
        try:
            int(word)
    # if that fails, then lower case it and see if it's in the dictionary
        except:
            if word.lower() in lexicon:
                # if it is then add it to the result
                return results
                # if not then add an error
            else:
                print("error")

Where are you making the results variable? It’s not in there at all. Why do you have int(word) and it’s not assigned to anything? How are you storing the results? Why are you doing return inside your for-loop, which exits the whole function immediately? Where are your print() debugging statements showing what you tried to log to see how this is working?

If you work on this more and do more gradual debugging you’d figure this out. It seems though that you have a couple misconceptions, and you tend to stop at the first problem and go ask for help instead of trying to solve it.

In your previous code you also had this:

def scan():
    results = []
    for word in words:
        # lots of code
        return results

Do you see the error there? Here, what if I did this:

def scan():
    results = []
    for word in words:
        # lots of code
    return results

Your code runs the for-loop, processes the first word, and since return results is under the for-loop (meaning, it’s indented 4 chars further inside the for-loop), then that gets run as part of the for-loop, not after. That causes your for-loop to run once, and return one thing.

My code has the return on the same level as the for-loop and so the whole for-loop runs, then it returns the results one time.
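You can run the two shapes side by side to see the difference. The function names first_only and all_words are made up for this sketch, and the loop body just copies each word:

```python
def first_only(words):
    results = []
    for word in words:
        results.append(word)
        return results  # inside the loop: the function exits on the first pass


def all_words(words):
    results = []
    for word in words:
        results.append(word)
    return results  # after the loop: runs once, when every word is processed


print(first_only(["3", "91234"]))
print(all_words(["3", "91234"]))
```

The first call gives back a single item and the second gives back both, which is the same pattern as the failing test where [('number', 3)] came back instead of both numbers.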

Let’s look at your new code for another thing:

Why do you have this:

def scan(sentence):
    for word in words:
        #### code cut
            if word.lower() in lexicon:
                return results  # <-------- why?
            else:
                print("error")

Again, you have a return inside your for-loop, so return exits the whole function which means it jumps out of the for-loop and out of the function and returns that value. It seems you think return does something different, so I’m curious what you think it does. It could be you just have a simple misconception of return, and because of this you keep putting return inside a for-loop and wondering why it doesn’t run. Figure out what your error is regarding return and your for-loop gets fixed.

Also, you’re probably staring at code trying to fix it. That’s not going to fix it though, because you have to log what your code is doing and watch it run, or run it in a debugger. Try this:

def scan(sentence):

    # go through each word in the sentence
    words = sentence.split()
    for word in words:
        print(">>>> top of loop", word)
        try:
            int(word)
        except:
            print(">>>> Not an integer.")
            if word.lower() in lexicon:
                print(">>>> In the lexicon")
                # if it is then add it to the result
                return results
                # if not then add an error
            else:
                print("error")

If you did that, you’d see the error pretty quick. You’d run it a bunch, see what’s going on, wonder why it keeps stopping, then realize it’s the return (hopefully).