Exercise 20, Functions and Files (open, readline), 'embedding' part of the code, unexpected result

Berend · January 9, 2018, 7:38pm

I’m enjoying learning Python 3 the hard way. I’m now playing around with exercise 20 and have run into something I don’t understand.

I’ve condensed exercise 20:

from sys import argv
script, input_file = argv

def print_a_line(f):
    print(f.readline())

current_file = open(input_file)

print("Let's print three lines:")

print_a_line(current_file)
print_a_line(current_file)
print_a_line(current_file)

What I see:

$ python3.6 ex20a.py test.txt
Let’s print three lines:
This is line 1

This is line 2

This is line 3

I tried to shorten the code by ‘embedding’ the open function and the readline function in the print command. Here’s my new code:

from sys import argv
script, input_file = argv

print("Let's print three lines:")

print(open(input_file).readline())
print(open(input_file).readline())
print(open(input_file).readline())

I expected this to give exactly the same result as the original code. However, this is what I see:

$ python3.6 ex20b.py test.txt
Let’s print three lines:
This is line 1

This is line 1

Why does the first code print the first three lines of the input_file whereas the second code prints the first line of the input_file three times?

I hope you can help me grasp what’s happening and why the two codes give different output. I appreciate your help very much!

Many thanks and best wishes from Amsterdam, the Netherlands.

Berend

GrahamH · January 10, 2018, 12:44am

I think the issue might be that there is nothing telling Python what the current_line is and additionally, to take the current line and increment it by 1. In the book there is:

current_line = 1
print_a_line(current_line, current_file)

current_line = current_line + 1
print_a_line(current_line, current_file)

current_line = current_line + 1
print_a_line(current_line, current_file)

Which clearly tells Python that the current line is line 1 originally, and then current_line increments as the interpreter cycles through the iterations and the script executes. From what I can tell from your example, Python only knows to print out line 1 of the file.

–Graham

Berend · January 10, 2018, 7:38am

Thanks for your help @GrahamH!

If I’m not mistaken, current_line is ‘just’ a counter which is printed as a reference, but doesn’t have any relation to the readline method which is used to print a line. If (in the original exercise 20 in the book) I set the starting value of current_line to 2, it still prints out lines 1, 2 and 3.

It looks as if the original exercise opens the input_file only once, whereas my code re-opens input_file on every print command, starting again at the beginning of the file. But why?

zedshaw · January 11, 2018, 2:48am

Yes, that’s right. It’s just a counter you’re manually incrementing each time.

GrahamH · January 11, 2018, 6:31am

Which also means I need to go back through that exercise and make sure I fully understand what’s happening. Just for fun I removed the + 1 to see if that would replicate this behaviour, and all that happened is that the extra blank line between each line wasn’t printed. Berend, care to share the entire .py file? Figuring it out will be a great learning experience and I’m actually quite curious.

Berend · January 11, 2018, 6:20pm

Thanks so much for helping me! Below is the code that outlines the question I have.

from sys import argv

script, input_file = argv

def print_a_line(f):
    print(f.readline())

current_file = open(input_file)

#In the lines below, I followed the code in the book but removed the counter.
#It prints the first three lines of the input_file.

print("Let's print three lines:")

print_a_line(current_file)
print_a_line(current_file)
print_a_line(current_file)

# in the code below, replaced current_file by open(input_file)
# I thought I could do this because current_file = open(input_file)
# however, this doesn't work. Instead of printing the first three lines of input_file,
# this prints the first line of input_file three times. But why?

print("Let's print three lines:")

print_a_line(open(input_file))
print_a_line(open(input_file))
print_a_line(open(input_file))

GrahamH · January 12, 2018, 4:40am

I know what’s going on here, but as to how to fix it I’m still a little fuzzy. Because print_a_line is being called on open(input_file), the file is opened again on each call, and the function still only prints one line (in this case the first line, because that’s what print_a_line does with a freshly opened file). In the original exercise there is a rewind function, so I took a bit of a cue from that (very loosely, I’ll add, as it gave me an idea of what’s going on) and came up with this:

from sys import argv

script, input_file = argv

def print_a_line(f):
    print(f.readline())

def print_all(f):
    print(f.read())

current_file = open(input_file)

#In the lines below, I followed the code in the book but removed the counter.
#It prints the first three lines of the input_file.

print("Let's print three lines:")

print_a_line(current_file)
print_a_line(current_file)
print_a_line(current_file)

# in the code below, replaced current_file by open(input_file)
# I thought I could do this because current_file = open(input_file)
# however, this doesn't work. Instead of printing the first three lines of input_file,
# this prints the first line of input_file three times. But why?

print("Let's print three lines:")

print_all(open(input_file))

This will accomplish what you want it to do. I’m 100% positive that there’s a way to get Python to do what you want without defining the print_all function but the mental gymnastics at my level to get it to work is so much more than the 3 lines of code to just get the job done, plus one of the purposes of a function is to make your life easier.

Does that make sense? I’d love to see what you come up with to solve this another way.

–Graham

PaulC · January 24, 2018, 7:47pm

I have further questions, perhaps more basic, on ex20. Is this a good place to pose them?

(Very new here, and very much wanting to play nicely. Please feel free to correct ANYthing, and be kind.)

pdc

zedshaw · January 25, 2018, 9:26pm

So, I think you should watch the video for this. I believe this is the one where I explain that the concept of files in programming comes from old tape drives for mainframes. Just to recap:

1. Files used to be huge tapes that you had to fast forward, rewind, open, and close. If you wanted the 300th byte of a file you had to seek(300) to get it, and the drive would actually spin up to there like a VHS.
2. This legacy continues, so when you call f.readline() you’re saying “read one line, advancing the tape to right after the newline, and then wait for me to call readline again.”
3. When you get to the end of a file you are at the end of a tape, so you need to then rewind it if you want to read the file again.
4. And if you want to write onto a file, you need to specify if you are going to erase it from the front (‘w’ mode) or append to the end (‘a’ mode).
5. Finally, when you open a file the first time it automatically starts at the beginning unless you say otherwise or call seek().
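
As a rough illustration of the tape analogy in the recap above, here is a minimal sketch that watches the position move as lines are read. The sample file and the names sample_path, start, after_first and so on are assumptions made up for this example; they are not from the book.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("This is line 1\nThis is line 2\nThis is line 3\n")
    sample_path = tmp.name

f = open(sample_path)
start = f.tell()            # a freshly opened file starts at position 0
first = f.readline()        # reads up to and including the first "\n"
after_first = f.tell()      # the position has moved past that newline
second = f.readline()       # continues from where the last read stopped
f.seek(0)                   # "rewind the tape" to the beginning
first_again = f.readline()  # so the first line comes back again
f.close()
os.remove(sample_path)      # clean up the temporary sample file

print(start, after_first)
print(repr(first), repr(second), repr(first_again))
```

The repr() calls make the trailing "\n" visible. That newline is kept by readline(), which is why print(f.readline()) shows a blank line after each line of the file.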

So, your first code does this:

1. Python, open this file.
2. Python read a line and print it. Python reads one byte at a time until it hits \n and then waits.
3. Python read the next line and print it. Continues where it left off at #2 reading chars until \n.
4. Python read the next line and print it. Does it one more time.

But your second code does this:

1. Python, open this file.
2. Python read a line and print it. Remember it starts at 0, reads until \n, then waits.
3. Since you passed the file as a parameter to this function, it only exists as long as that function runs, so Python goes “Ok you’re done with that I’ll close it now.”
4. Python, open the file again, CREATING A WHOLE NEW FILE HANDLE STARTING AT 0.
5. Python, read a line and print it. This new file is at 0, so reads the first line.
6. This file also only exists during the function call, so THIS ONE GETS CLOSED TOO.
7. Python, open the file again, CREATES A WHOLE NEW FILE STARTING AT 0.
8. Python, read the first line again.
9. Done with that file after the function exits so close it.

If you don’t want this to happen, then just do it the first way.
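
To see the two behaviours side by side, here is a small sketch under a few assumptions: the sample file is a temporary one created just for the demo, and read_a_line is a hypothetical variant of print_a_line that returns the line instead of printing it, so the results can be compared.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("This is line 1\nThis is line 2\nThis is line 3\n")
    sample_path = tmp.name

def read_a_line(f):
    # Like print_a_line from the thread, but returns the line instead of printing it.
    return f.readline()

# First way: one file handle, so every call continues where the last one stopped.
with open(sample_path) as current_file:
    same_handle = [read_a_line(current_file) for _ in range(3)]

# Second way: a new handle per call, so every call starts again at position 0.
new_handles = []
for _ in range(3):
    with open(sample_path) as fresh_file:
        new_handles.append(read_a_line(fresh_file))

os.remove(sample_path)  # clean up the temporary sample file

print(same_handle)   # three different lines
print(new_handles)   # the first line, three times
```

The with blocks close each handle explicitly, which makes the "open, read one line, close" cycle of the second way visible in the code rather than leaving it to Python's cleanup.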

PaulC · January 26, 2018, 4:35pm

Zed,

Thanks for your answer. It’s pretty neat that you take the time to answer these, and I will do my best not to abuse your availability.

My first computer project involved punch cards (and PAL, I think?) a few years ago. So, the whole idea of tapes, seeking etc, is something almost nostalgic for me. But your explanation is clear and kind of heartwarming.

My question was more basic, unfortunately. The “f” referred to in the program - what is it? I am guessing that the first use of it is as a file, the file used as part of argv in the lines above it?

And then the “seek” and “f:read” stuff? … Maybe we don’t need to know what they are yet, but I think I got lost on what “f” is actually for in Python, except for understanding much earlier that it is used to format strings (for example, in the first few exercises).

To me it looks as though “f” is a parameter for the function “print_a_line” that you are defining? (Is that the way to say it?) But it’s also some sort of modifier of “readline” further down for example? What do you call these?

Thanks again for your time and attention to what feels like an embarrassingly basic question. If there was some other place I could have researched rather than asking, please let me know.

Thanks,
Paul

Berend · January 26, 2018, 7:07pm

Thanks so much for your time and trouble to answer my question. I really enjoy learning Python the hard way and appreciate your efforts in making this forum a great extension of the textbook. Thanks!

zedshaw · February 3, 2018, 2:27am

@PaulC, I think maybe our wires got crossed, as I was actually replying to @Berend above, not really to your question. This forum has kind of weird threading, so you have to look at the top right of the post where it shows that I’m replying to Berend (find my name on my comment, look straight right).

So, to answer your question I need to know a bit more. Can you do this:

[code]
Paul puts his code here
[/code]

Put a couple lines with what you’re asking about, since I can’t really tell what you mean by what lines you talk about, but if you drop two lines it makes it way easier.

PaulC · February 7, 2018, 6:44pm

Zed,

(Sorry for taking so long to respond … )

I think I’ve figured it out. For me there seem to have been 2 things that were confusing:

First, the use of the character “f” in the book (and Python?) varied from context to context, without enough explanation (for me at least). I understood where it was being used in print() strings, eventually. But then in Ex20 it showed up again as a parameter in some of the file handling functions, at the same time as you were introducing (or maybe I was noticing for the first time?) the first “.” functions. And it does seem that “f” is frequently used in file reading, seeking, etc.

I’m still struggling with that format a bit, but it feels a little bit easier working with these now that I’ve made it thru about 20 more exercises, and realized that sometimes, “f” is just a parameter in a function.

This is a great program. Thanks for what you’ve done!

If I had one concern now, at this point, it would be retention. I’m struggling a little bit at keeping what I’ve learned in my own longer term memory. Because I’m working through all of this alone, there’s no one to bounce things off. I can read and watch and do exercises without talking to another human. Being someone who learns easily through discussion, that’s a little challenging for me. So … Ex41 helps with that kind of thing a lot. I’m just starting in on that now, and can see how it’s going to help a lot.

Zed, is there someplace I can supplement the exercises you’ve presented with more of the same? I want to reinforce what I’ve learned further, and don’t want to get too much further into the book before I solidify the gains I’ve made so far. Does that make sense?

Again, thanks for what you’ve done. It’s the most successful program I’ve used so far in trying to learn how to program.

Cheers,

pdc

zedshaw · February 9, 2018, 2:44am

So, how a symbol is used matters, and it’s important to get used to the same symbol meaning multiple things. That’s just a factor of programming languages having access to a limited number of characters. If you see something like this:

f = open("stuff.txt")

You say, “f is a variable because I used = to assign a value to it (create it).” Whenever you see = used like this, you make a variable.

But, if you see it like this:

print(f"I'm printing {somevar}")

Then the f is not a variable but rather a modifier on strings that says, “I wish to FORMAT this string by putting variables right inside (in this case somevar).”

Which means, if you have this:

f = open("stuff.txt")
print(f"The file contents: {f.read()}")

You are combining the two. Think of that like a puzzle to solve so you can figure what f means what at what place.
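
Here is a runnable version of that puzzle, as a sketch. To avoid needing a real stuff.txt, it assumes io.StringIO as a stand-in for open("stuff.txt"); the text inside it and the value of somevar are made up for the example.

```python
import io

# io.StringIO behaves like an open text file; it stands in for open("stuff.txt") here.
f = io.StringIO("hello from the file\n")   # here f is a variable (created with =)
somevar = 42

plain = f"I'm printing {somevar}"              # here f is the string-format prefix
combined = f"The file contents: {f.read()}"    # prefix f outside, variable f inside {}

print(plain)
print(combined)
```

A quick way to tell them apart: an f touching an opening quote is the format prefix, while an f followed by a dot (f.read(), f.readline()) is a name that refers to some object, in this case the file.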

As for the follow on material to get better, I have this book:

https://learncodethehardway.org/more-python/