CSCI 1360E: Foundations for Informatics and Analytics
We've covered lists, tuples, sets, and dictionaries. These are the foundational data structures in Python. In this lecture, we'll go over some more advanced topics that are related to these data structures. By the end of this lecture, you should be able to work with range(), zip(), and enumerate().
The unifying theme with all these collections we've been discussing (lists, tuples, sets), in the context of looping, is that they're all examples of iterators.
Just think of iterators as "things I can loop over". (Python's official word for such an object is "iterable"; this lecture uses the two terms loosely.)
Apart from directly iterating over these collections as in the last lecture, the most common iterator you'll use is the range function.
range()
range is a super-handy function for producing sequential numbers to loop over. Here's an example:
for i in range(10):
    print(i, end = " ")
0 1 2 3 4 5 6 7 8 9
Using just this example, stop and think about how range might work.
In short, range() generates sequential numbers for you to loop over.
If you give range one number, like in our example (range(10)), it will give you back numbers starting at 0 (inclusive) and going up to the number you provided (exclusive). So in our example, it gives us all the integers starting at and including 0, and up to (but excluding) 10.
If you give range two numbers, it treats them as a starting number (inclusive) and an ending number (exclusive; e.g., 10 like before).
for i in range(5, 10):
    print(i, end = " ")
5 6 7 8 9
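As a quick way to check both forms side by side, the sketch below wraps each call in list() so the numbers can be inspected all at once. The variable names are only for illustration.

```python
# list() forces range to hand over every value so we can inspect them.
one_arg = list(range(10))      # start defaults to 0, stop is exclusive
two_args = list(range(5, 10))  # explicit start (inclusive) and stop (exclusive)

print(one_arg)
print(two_args)
print(len(range(5, 10)))  # number of values = stop - start
```

Note that the ending number never shows up in either result.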
Here's some good news: if we get right down to it, having done loops and lists already, there's nothing new about list comprehensions.
Here's the bad news: it's a different, and possibly less-easy-to-understand, but much more concise way of creating lists. We'll go over it bit by bit.
Let's look at an example: creating a list of squared values (numbers raised to the second power).
squares = []
for element in range(10):
    squares.append(element ** 2)
print(squares)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Stop for a moment and convince yourself why this works.
Now, let's break it down, line by line.
for element in range(10):
This is a regular loop header. We're looping over range(10), or an iterator that contains numbers [0, 10) by 1s (only one number is given, so it starts at 0 inclusive and goes to 10 exclusive).
In each iteration, the current value from range(10) is stored in element.
So far, so good? Good, next line!
squares.append(element ** 2)
Inside the loop, we compute element ** 2, the square of the current element, and add that value to the end of the list squares, using the append method we've seen before.
Stop for a minute and consider how this line could be split into two (think: create a variable, assign it the square of the current element, and then append that variable to the list).
Hopefully that's all fairly clear! In a list comprehension, we'll see these same pieces show up again, just in a slightly different order.
squares = [element ** 2 for element in range(10)] # List comprehension!
print(squares)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
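There it is: a list comprehension. Let's break it down.
Notice first that the whole expression is surrounded by the square brackets [ ] of a list. This is for the exact reason you'd think: we're building a list!
The loop header, for element in range(10), sits on the right, just as it appeared in the regular loop.
The expression on the left, element ** 2, is what gets computed and stored in the new list for each element.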
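To see the same pieces in a different setting, here is a small sketch that builds a list of word lengths twice: once with a regular loop and once with a list comprehension. The words list and the variable names are made up for illustration.

```python
words = ["list", "tuple", "set", "dictionary"]  # hypothetical example data

# Verbose version: empty list, loop header, append.
lengths_loop = []
for word in words:
    lengths_loop.append(len(word))

# Comprehension version: expression on the left, loop header on the right.
lengths_comp = [len(word) for word in words]

print(lengths_loop)
print(lengths_comp)
print(lengths_loop == lengths_comp)
```

Both versions should produce the same list; the comprehension just rearranges the pieces onto one line.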
Here's another example: adding 10 to every element in the squares list from before.
new_counts = [item + 10 for item in squares]
print(new_counts)
[10, 11, 14, 19, 26, 35, 46, 59, 74, 91]
The part on the right (for item in squares) goes through the list squares one element at a time, storing each one in the variable item.
The part on the left takes the current element (item) and adds 10 to it.
Stop here and convince yourself that this is working as expected. For fun, try to rewrite this loop in the more verbose way: by defining an empty list, writing a "regular" loop header, and then appending the elements (plus 10 each) to the list.
Generators are cool twists on lists. They've been around since Python 2 but took on a whole new life in Python 3.
That said, if you ever get confused about generators, just think of them as lists. This can potentially get you in trouble with weird errors, but 90% of the time it'll work every time.
Let's start with an example using range():
x = range(10)
As we know, this will create an iterator with the numbers 0 through 9, inclusive, and assign it to the variable x.
print(x)
range(0, 10)
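In the same way that lists are iterators, so are generators. What x holds here is a close relative: strictly speaking, Python 3 calls it a range object rather than a generator, but it hands out its numbers lazily in the same way.
So range() gives us something that behaves a lot like a generator! Great! ...what does that mean, exactly?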
For most practical purposes, generators and lists are indistinguishable. However, there are some key differences to be aware of:
Generators are "lazy". When you loop over range(10), not all 10 numbers are immediately computed; in fact, none of them are. They're computed on-the-fly in the loop itself! This really comes in handy if, say, you wanted to loop through 1 trillion numbers, or call range(1000000000000). With vanilla lists, this would immediately create 1 trillion numbers in memory and store them, taking up a whole lot of space. With generators, only 1 number is ever computed at a given loop iteration. Huge memory savings!
This "laziness" means you cannot directly index a generator, as you would a list, since the numbers are generated on-the-fly during the loop. (A range object is the exception: it supports indexing even though it is lazy.)
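To get a rough feel for the memory savings, the sketch below compares the size of a range object with the size of the equivalent list, using sys.getsizeof from the standard library. It uses one million numbers rather than one trillion so the list can actually be built; the exact byte counts depend on your Python version, so none are shown here.

```python
import sys

n = 1_000_000  # one million, small enough to build a real list for comparison

lazy_numbers = range(n)         # no numbers are computed yet
eager_numbers = list(range(n))  # all one million numbers stored in memory

lazy_size = sys.getsizeof(lazy_numbers)
eager_size = sys.getsizeof(eager_numbers)

print(lazy_size, "bytes for the range object")
print(eager_size, "bytes for the list (not counting the integers themselves)")
```

The range object stays the same small size no matter how large n gets, while the list grows with n.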
The other point of interest with generators:
How do we build generators? Aside from range(), that is.
Remember list comprehensions? Just replace the brackets of a list comprehension [ ] with parentheses ( ).
x = [i for i in range(10)] # Brackets -> list
print(x)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x = (i for i in range(10)) # Parentheses -> generator
print(x)
<generator object <genexpr> at 0x10976b410>
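Here is a small sketch of how such a generator behaves once you start pulling values from it. It uses the built-in next() function, which this lecture has not introduced: it asks an iterator for its next value. The variable names are only for illustration.

```python
squares_gen = (i ** 2 for i in range(5))  # parentheses -> generator

first = next(squares_gen)     # ask for one value
rest = list(squares_gen)      # consume whatever is left
leftover = list(squares_gen)  # nothing remains: a generator only works once

print(first)
print(rest)
print(leftover)

# Indexing a generator is not supported and raises a TypeError.
another_gen = (i for i in range(5))
try:
    another_gen[2]
except TypeError as err:
    print("TypeError:", err)
```

The empty leftover list is the kind of "weird" behavior that can bite you if you think of a generator as just a list.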
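In sum, use lists if:
you need to index elements directly, as in some_list[431]
On the other hand, use generators if:
you only need to loop over the values in order and want to avoid storing them all in memory at once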
There are a few other advanced looping mechanisms in Python that are a little complex, but can make your life a lot easier when used correctly (especially if you're a convert from something like C++ or Java).
The first one, variable unpacking, isn't a looping mechanism per se, but it is incredibly useful and is often used in the context of looping.
Imagine you have a tuple of a handful of items; for the sake of example, we'll say this tuple stores three elements: first name, last name, and favorite programming language.
The tuple might look like this:
t = ("shannon", "quinn", "python")
Now I want to pull the elements out of the tuple and work with them independently, one at a time. You already know how to do this:
first_name = t[0]
last_name = t[1]
language = t[2]
print(language)
python
...but using three lines of code, one per variable, to extract all the elements from the tuple into their own variables is kind of clunky.
Luckily, there's a technique called "variable unpacking" that allows us to compress those three lines down to one:
fname, lname, language = t # THREE variables undergoing assignment!
print(fname)
shannon
This does exactly the same thing as before. By presenting three variables on the left-hand side, we're telling Python to pull out elements of the tuple at positions 0, 1, and 2.
(Variable unpacking is always assumed to start at position 0 of the structure on the right-hand side.)
We'll see more examples of this in practice using the looping tools later in the lecture.
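One detail worth knowing before moving on: the number of variables on the left has to match the number of elements on the right. The sketch below shows a matching unpack and then a mismatched one, using a made-up two-element tuple.

```python
point = (3, 4)  # hypothetical two-element tuple

x, y = point    # two variables on the left, two elements on the right
print(x, y)

# Asking for three variables from a two-element tuple raises a ValueError.
message = ""
try:
    a, b, c = point
except ValueError as err:
    message = str(err)
    print("ValueError:", message)
```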
zip()
zip() is a small function that packs a big punch. It "zips" multiple lists together into something like one big mega-list, for the sole purpose of being able to iterate through them all simultaneously.
Directly as a result, variable unpacking can play a huge role in using this function! So if you didn't quite follow variable unpacking, please ask!
Here's an example: first names, last names, and favorite programming languages.
# Three lists, each with four elements.
# First element of "first_names" corresponds to
# the first element of "last_names" and the first
# element of "fave_langs".
first_names = ['Shannon', 'N.K.', 'Natasha', 'Benjamin']
last_names = ['Quinn', 'Jemisin', 'Romanov', 'Button']
fave_langs = ['Python', 'Java', 'Assembly', 'Go']
I want to loop through these three lists simultaneously, so I can print out the person's first name, last name, and their favorite language on the same line. Since I know they're the same length, I can zip them together and, combined with a neat use of variable unpacking, do all of this in three lines:
zipped_lists = zip(first_names, last_names, fave_langs)
for fname, lname, lang in zipped_lists:
    print(fname, lname, lang)
Shannon Quinn Python
N.K. Jemisin Java
Natasha Romanov Assembly
Benjamin Button Go
There's a lot happening here, so take it in chunks:
zip(first_names, last_names, fave_langs): This zips together the three lists, so that the elements at position 0 all line up, then the elements at position 1, then position 2, and so on.
for fname, lname, lang in zipped_lists: On each iteration, variable unpacking pulls one lined-up group of three elements into fname, lname, and lang.
This is tricky! It's totally ok if you don't fully absorb it just yet. Open up a blank notebook on JupyterHub or fire up this lecture in Binder and experiment with these examples!
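If it helps to see what zip actually produces, the sketch below wraps it in list() so the lined-up groups become visible as tuples. It also shows what happens when the lists are not the same length. The two small lists are made up for illustration.

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

pairs = list(zip(letters, numbers))  # list() shows what zip produces
print(pairs)

# When the lists differ in length, zip stops at the shortest one.
short = [10, 20]
truncated = list(zip(letters, short))
print(truncated)
```

Each tuple in pairs is exactly what gets unpacked into the loop variables on one iteration.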
enumerate()
Of course, there are always those situations where it's really, really nice to have an index variable in the loop. Let's take a look at that previous example:
zipped_lists = zip(first_names, last_names, fave_langs)
for fname, lname, lang in zipped_lists:
    print(fname, lname, lang)
Shannon Quinn Python
N.K. Jemisin Java
Natasha Romanov Assembly
Benjamin Button Go
This is great if all I want to do is loop through the lists simultaneously. But what if the ordering of the elements matters? For example, I want to prefix each sentence with the line number. How can I track what index I'm on in a loop if I don't use range()?
enumerate() handles this. By wrapping the object we loop over inside enumerate(), on each loop iteration we not only get the next object of interest, but also the index of that object. To wit:
x = ['a', 'list', 'of', 'strings']
for index, element in enumerate(x):
    print(index, element) # prints the number first, list element second
0 a
1 list
2 of
3 strings
This comes in handy anytime you need to loop through a list or generator, but also need to know what index you're on.
And note again: we're using variable unpacking in the loop header. enumerate essentially performs an "invisible" zip() on the iterator you supply, and zips it up with numbers, one per element of the iterator.
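To make the "invisible zip" idea concrete, the sketch below writes that zip out by hand and compares it with what enumerate gives. It also uses enumerate's optional start argument, which the lecture has not mentioned, to begin counting from 1.

```python
x = ['a', 'list', 'of', 'strings']

with_enumerate = list(enumerate(x))
with_zip = list(zip(range(len(x)), x))  # the "invisible" zip, written out by hand

print(with_enumerate)
print(with_enumerate == with_zip)

# The optional start argument changes the first index.
for number, element in enumerate(x, start=1):
    print(number, element)
```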
break and continue
These are two commands that give you much greater control over loop behavior, beyond just what you specify in the header.
With for loops, you specify how many times to run the loop.
With while loops, you iterate until some condition is met.
For the vast majority of cases, this works well. But sometimes you need just a little more control for special circumstances.
There are some instances where, barring some external intervention, you really do want to just loop forever:
while True:
    # "True" can't ever be "False", so this is quite literally an infinite loop!
    pass
How do you get out of this infinite loop? With a break statement.
while True:
    print("Help! I'm stuck in an infinite loop!")
    break
print("Whew, now I'm outside the loop!")
Help! I'm stuck in an infinite loop!
Whew, now I'm outside the loop!
Just break. That will snap whatever loop you're currently in and immediately dump you out just after it.
Same thing with for loops:
for i in range(100000): # Loop 100,000 times!
    print(i, end = " ")
    if i == 5:
        break
print(i)
0 1 2 3 4 5 5
Don't worry about that weird if there--we'll cover that soon enough--but briefly, it tests if our loop variable i has the value 5. If so, then that break underneath is executed.
That's why, even though range clearly says it's supposed to loop until 100,000, the loop stops as soon as i reaches 5.
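A common use of break is to stop searching once you've found what you were looking for. The sketch below is one way that might look; the target value and the checked counter are made up for illustration.

```python
names = ['Shannon', 'N.K.', 'Natasha', 'Benjamin']
target = 'Natasha'  # hypothetical value to search for
checked = 0         # counts how many names we looked at

for name in names:
    checked += 1
    if name == target:
        break       # found it, so skip the rest of the list

print(name, checked)
```

The last name in the list is never examined, because the loop ends the moment the target turns up.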
Similar to break is continue, though you use this when you essentially want to "skip" certain iterations.
continue will also halt the current iteration, but instead of ending the loop entirely, it basically skips you on to the next iteration of the loop without executing any code that may be below it.
for i in range(100):
    continue
    print("This will never be printed, because 'continue' skips it.")
print(i)
99
Notice how the print statement inside the loop is never executed, but our loop counter i is still incremented through the very end.
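In practice, continue usually sits under an if, so that only some iterations are skipped. The sketch below, with made-up variable names, skips the even numbers and keeps the odd ones.

```python
kept = []
for i in range(10):
    if i % 2 == 0:
        continue    # even number: skip straight to the next iteration
    kept.append(i)  # only reached for odd numbers

print(kept)
```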
Some questions to discuss and consider:
1: I want a list of all possible combinations of (x, y) values for the range [0, 9]. Show how this can be done with a list comprehension using two for-loops.
2: Without consulting Google University, consider how generators might work under the hood. How do you think they're implemented?
3: Go back to the example with three lists (first names, last names, and programming languages). Show how you could use enumerate to prepend a line number (the current index of the lists) to the sentence printed for each person, e.g.: "17: Joe Schmo's favorite language is C++."
4: Show how you would use variable unpacking to extract all the values from a list with 5 elements.
5: We've seen how range works by giving it 1 or 2 numbers, but it can also take a 3rd number: an increment amount. This specifies how much the value should change between each number; the default is 1 but by specifying a third number you can change that. Show how to generate a list of all the even numbers between 4 and 40 using range.
# YOUR CODE HERE or YOUR ANSWER HERE. Also, be sure to delete or comment out the line that says raise NotImplementedError().