For loop comprehension

Good day,

I have been taking a course in machine learning, and there are a lot of times where I have to write out for loops to make a plot, like so:

fig, axes = plt.subplots(2, 4, figsize=(20, 8))
for axx, n_hidden_nodes in zip(axes, [10, 100]):
    for ax, alpha in zip(axx, [0.0001, 0.01, 0.1, 1]):
        mlp = MLPClassifier(solver='lbfgs', random_state=0,
                            hidden_layer_sizes=[n_hidden_nodes, n_hidden_nodes],
                            alpha=alpha)
        # X_train and y_train are assumed to be defined earlier.
        mlp.fit(X_train, y_train)
        mglearn.plots.plot_2d_separator(mlp, X_train, fill=True, alpha=.3, ax=ax)
        mglearn.discrete_scatter(X_train[:, 0], X_train[:, 1], y_train, ax=ax)
        ax.set_title(f"n_hidden=[{n_hidden_nodes}, {n_hidden_nodes}]\nalpha={alpha:.4f}")

Because I have also been following the Learn More Python the Hard Way course, which talks a lot about condensing for loops into a single line or list comprehension, I have been wondering whether I could shorten the first three lines into a single line.

I guess you could compress at least the first two lines, but why bother? This is totally fine. Utmost brevity is not an end in itself. Rather than trying to be super smart now, ask yourself what sort of code you would like to decipher in two weeks when you’ve forgotten everything.
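
For what it's worth, here is a rough sketch of what compressing the two loop headers could look like, assuming itertools.product for the parameter grid and a flattened sequence of axes (with matplotlib that would be something like axes.ravel()). Plain strings stand in for the axes, and the names flat_axes and assignments are made up for illustration, so the snippet runs without matplotlib.

```python
from itertools import product

hidden_sizes = [10, 100]
alphas = [0.0001, 0.01, 0.1, 1]

# Stand-in for axes.ravel(): a 2 x 4 grid flattened row by row.
flat_axes = [f"ax[{r}][{c}]" for r in range(2) for c in range(4)]

# product() varies its last argument fastest, which matches the
# row-major order of the flattened grid.
assignments = [
    (ax, n_hidden_nodes, alpha)
    for ax, (n_hidden_nodes, alpha) in zip(flat_axes, product(hidden_sizes, alphas))
]

for ax, n_hidden_nodes, alpha in assignments:
    print(ax, n_hidden_nodes, alpha)
```

It is shorter in terms of loop headers, but arguably harder to read than the two explicit loops, which is the point above.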

I don’t know how much the AI people care about computational resources, but leaving these kinds of nested loops explicit like this can also make it easier to estimate the time complexity of a piece of code.

Perhaps the argument for a comprehension would be performance, but as this loop is only nested one layer deep I think you’re OK.

Some guy at work had a loop that was nested to three layers and the performance impact was noticeable. (Annoyingly he was ‘too busy’ to actually benchmark the loops and improve them…)

I struggle to think of an example where compressing explicit for loops could actually improve performance. In most cases it’s a purely cosmetic change, right? The program logic doesn’t change. Both of these snippets do the same thing and I’d expect them to take the same time. But the second one is a pain to visually parse.

for row in matrix:
    for cell in row:
        do_something(cell)

for cell in (cell for row in matrix for cell in row):
    do_something(cell)

A difference in performance might occur if the comprehension uses generators while the explicit form uses lists.
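
A small sketch of that difference, using a made-up expensive() function that records each call: the list comprehension does all the work up front, while the generator expression only does as much as the consumer asks for.

```python
calls = []

def expensive(x):
    """Hypothetical stand-in for costly work; records every call."""
    calls.append(x)
    return x * x

numbers = range(1_000)

# List comprehension: every element is computed before anything else happens.
eager = [expensive(n) for n in numbers]
calls_eager = len(calls)

calls.clear()

# Generator expression: elements are computed on demand, so stopping
# early means the remaining elements are never computed.
lazy = (expensive(n) for n in numbers)
first_big = next(value for value in lazy if value > 100)
calls_lazy = len(calls)

print(calls_eager, calls_lazy, first_big)
```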


I’ve heard that deeply nested loops, regardless of how you implement them, are considered a sign of bad design. But of course that means depths like 4 or 5 layers, not 2.

I hear you. I guess I am asking the question because in all the exercises I keep using the good ol’ normal for loop, instead of getting a feeling for when I might use a compressed version. But maybe there just haven’t been that many cases where it would be useful, and so it hasn’t become something I reach for intuitively.

And you should! That’s what it’s there for, and every programmer knows it and can read your code.

Python’s list comprehensions are most useful in situations where you want to construct sequences or dicts for later use.

squares = [n**2 for n in range(10)]

is a lot more concise – and clearer – than

squares = []
for n in range(10):
    squares.append(n**2)

It’s a nice bit of a functional/declarative mixin in a language that otherwise doesn’t really inherit from that paradigm. But that also means that you shouldn’t overuse it. Python’s control statements are there for a good reason, there’s no need to shun them.

I think the best way to tackle these is to assign each one to variables, kind of going inside out. This blog post does an alright job:

https://towardsdatascience.com/11-examples-to-master-python-list-comprehensions-33c681b56212

But it’s tough to think about multiple nested ones. I think the best approach is this:

  1. Take one of your for loops from the inside and convert it to a single list comprehension calling a function. Store it in a variable and use that for the outer for-loop.
  2. Do the same thing with the next outer for-loop, but using this list comprehension variable as its source.
  3. Continue until the for-loops are all converted to list comprehensions assigned to variables that use each other.

The trick is to go inside-out.
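
A rough sketch of those steps on a small nested list. The matrix, do_something and process_row names are made up for illustration, and I wrap the inner comprehension in a helper function so the outer one can reuse it.

```python
matrix = [[1, 2, 3], [4, 5, 6]]

def do_something(cell):
    """Hypothetical per-cell work."""
    return cell * 10

# Starting point: two explicit nested loops.
result_loops = []
for row in matrix:
    new_row = []
    for cell in row:
        new_row.append(do_something(cell))
    result_loops.append(new_row)

# Step 1: turn the innermost loop into a comprehension that calls a function.
def process_row(row):
    return [do_something(cell) for cell in row]

# Step 2: turn the next loop outwards into a comprehension built on step 1.
result_comprehension = [process_row(row) for row in matrix]

print(result_comprehension)
```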

Then, I’d say use some of that data science to see if it is faster or not.
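
One way to check, sketched with the standard library's timeit module. The two functions are made-up examples, and the numbers will vary by machine and Python version, so treat any result as a measurement of your setup rather than a general rule.

```python
import timeit

def with_loop(n=1_000):
    # Explicit loop with append.
    squares = []
    for i in range(n):
        squares.append(i ** 2)
    return squares

def with_comprehension(n=1_000):
    # Same result built with a list comprehension.
    return [i ** 2 for i in range(n)]

# Run each version 200 times and report the total time in seconds.
loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)

print(f"loop:          {loop_time:.4f} s")
print(f"comprehension: {comp_time:.4f} s")
```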

I wasn’t explicit enough with the use case I was describing, and to be fair it’s not relevant to the OP. The guy I referred to was using three nested loops to traverse an Excel-imported list, compare it against a second imported list, and then create a third list of any differences between the two. It was woeful, and using set.difference() would have been the obvious choice if he had been aware of it, especially as the list indexing was not important.
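
Roughly what I had in mind, with made-up data in place of the imported lists:

```python
# Hypothetical stand-ins for the two imported lists.
first_list = ["alice", "bob", "carol", "dave"]
second_list = ["bob", "dave", "erin"]

# Items that appear in one list but not the other; order and
# duplicates are discarded because sets are unordered and unique.
only_in_first = set(first_list).difference(second_list)
only_in_second = set(second_list).difference(first_list)

# Items that appear in exactly one of the two lists.
in_either_but_not_both = set(first_list).symmetric_difference(second_list)

print(sorted(only_in_first))
print(sorted(only_in_second))
print(sorted(in_either_but_not_both))
```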