List Comprehension and Beyond — Understand 4 Key Related Techniques in Python

 4 years ago
source link: https://towardsdatascience.com/list-comprehension-and-beyond-understand-4-key-related-techniques-in-python-3bff0f0a3ccb?gi=46cc0350b7a7
Intermediate Python Knowledge

They’re easier than you may think

Photo by Filbert Mangundap on Unsplash

When we’re learning Python, list comprehension is a tricky technique that can take some time to fully understand. Once we’ve learned it, we like to use it because it’s a neat way to showcase our Python expertise. In particular, when beginners read our code, they’re often amazed that such a concise way of creating lists exists. What they probably don’t know is that understanding the syntax of list comprehension also helps with a few other important Python techniques. In this article, let’s explore them together.

1. List Comprehension

Let’s first review what list comprehension is. It has the following basic syntax: [expression for item in iterable] . In essence, it goes over an iterable, evaluates the expression for each item, and returns a list of the results. Consider the example below.

>>> # create a list of words
>>> words = ["quixotic", "loquacious", "epistemology", "liminal"]
>>> # create a list of numbers counting the letters
>>> letter_counts = [len(x) for x in words]
>>> letter_counts
[8, 10, 12, 7]

In the above code, we create a list of numbers called letter_counts , each being the letter count of the corresponding word in the words list. Pretty straightforward, right?

Let’s make something more interesting. In the code below, we create a list of uppercased words, keeping only the words that pass a test by adding an if clause at the end of the comprehension.

>>> # create a list of uppercased words with letter count > 8
>>> uppercased = [x.upper() for x in words if len(x) > 8]
>>> uppercased
['LOQUACIOUS', 'EPISTEMOLOGY']
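
For readers who find it easier to think in loops, the filtered comprehension above is roughly equivalent to the sketch below. The name uppercased_loop is only illustrative and is not part of the original example.

```python
words = ["quixotic", "loquacious", "epistemology", "liminal"]

# Comprehension form, as in the example above
uppercased = [x.upper() for x in words if len(x) > 8]

# Equivalent explicit loop (uppercased_loop is an illustrative name)
uppercased_loop = []
for x in words:
    if len(x) > 8:  # the trailing "if" acts as a filter
        uppercased_loop.append(x.upper())

print(uppercased_loop)  # ['LOQUACIOUS', 'EPISTEMOLOGY']
```

Both forms build the same list; the comprehension simply folds the loop, the filter, and the append into one expression.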

2. Dictionary Comprehension

Besides the list comprehension, Python has a similar technique for creating dictionaries termed dictionary comprehension. It has the following basic syntax: {exp_key: exp_value for item in iterable} . As you can see, the expression is similar to list comprehension in that both have the iteration part (i.e., for item in iterable ).

There are two differences. First, we use curly brackets for dictionary comprehension instead of square brackets for list comprehension. Second, dictionary comprehension has two expressions with one for the key and the other for the value, as opposed to one expression in the list comprehension.

Let’s see the example below. We have a list of tuples with each holding the student’s name and score. Next, we use the dictionary comprehension technique to create a dictionary with the names being the keys and the scores being the values.

>>> # create a list of tuples having student names and scores
>>> scores = [("John", 88), ("David", 95), ("Aaron", 94)]
>>> # create a dictionary using name as key and score as value
>>> dict_scores = {x[0]: x[1] for x in scores}
>>> dict_scores
{'John': 88, 'David': 95, 'Aaron': 94}

To make our example more interesting, we can use a conditional expression (Python’s value_if_true if condition else value_if_false form) inside the dictionary comprehension. The same works in a list comprehension. Consider the example below, which still uses the scores list.

>>> # create the dictionary using name as key and grade as value
>>> dict_grades = {x[0]: "Pass" if x[1] >= 90 else "Fail" for x in scores}
>>> dict_grades
{'John': 'Fail', 'David': 'Pass', 'Aaron': 'Pass'}
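
As a possible refinement, the indexing with x[0] and x[1] can be replaced by tuple unpacking in the for clause, which many readers find easier to follow. This is a sketch of the same two dictionaries; the loop variable names name and score are illustrative choices.

```python
scores = [("John", 88), ("David", 95), ("Aaron", 94)]

# Unpack each (name, score) tuple directly in the for clause
dict_scores = {name: score for name, score in scores}

# Parentheses around the conditional expression are optional,
# but they make it clear that the whole expression is the value
dict_grades = {name: ("Pass" if score >= 90 else "Fail") for name, score in scores}

print(dict_scores)  # {'John': 88, 'David': 95, 'Aaron': 94}
print(dict_grades)  # {'John': 'Fail', 'David': 'Pass', 'Aaron': 'Pass'}
```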

3. Set Comprehension

Lists, dictionaries, and sets are three of the major built-in collection data structures in Python. Since there are list and dictionary comprehensions, it’s not surprising that there is also set comprehension.

The set comprehension has the following syntax: {expression for item in iterable} . The syntax is almost identical to the list comprehension except that it uses curly brackets instead of square brackets. Let’s see how it works with the following example.

>>> # create a list of words of random letters
>>> nonsenses = ["adcss", "cehhe", "DesLs", "dddd"]
>>> # create a set of words of unique letters for each word
>>> unique_letters = {"".join(set(x)) for x in nonsenses}
>>> unique_letters
{'d', 'cdas', 'eLsD', 'ceh'}

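In the above code, we create a list of nonsense words with random letters called nonsenses . We then create a set called unique_letters , in which each item consists of the unique letters of one word. Because sets are unordered, the order of the letters within each item, and of the items in the printed set, may differ from what is shown here.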
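
Since the iteration order of a set is arbitrary, the exact strings produced by "".join(set(x)) can vary between runs. If a reproducible result is wanted, one option is to sort the letters first, as in this sketch (the use of sorted is an addition for illustration, not part of the original example).

```python
nonsenses = ["adcss", "cehhe", "DesLs", "dddd"]

# sorted(set(x)) puts each word's unique letters in a fixed order
unique_letters = {"".join(sorted(set(x))) for x in nonsenses}

# Sort the outer set as well so the printed result is stable
print(sorted(unique_letters))  # ['DLes', 'acds', 'ceh', 'd']
```

Uppercase letters sort before lowercase ones, which is why 'DLes' comes first.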

One thing to note is that in Python, sets can’t have items with duplicate values, and thus set comprehension will remove duplicate items automatically for us. Please see the code below for this feature.

>>> # create a list of numbers
>>> numbers = [(12, 20, 15), (11, 9, 15), (11, 13, 22)]
>>> # create a set of unique numbers
>>> unique_numbers = {x for triple in numbers for x in triple}
>>> unique_numbers
{9, 11, 12, 13, 15, 20, 22}

In the above code, we create a set called unique_numbers from numbers , a list of tuples. As you can see, the duplicate numbers in the list (11 and 15) have only one copy each in the set.

One new thing here is that we use a nested comprehension, which has the following syntax: expression for items in iterable for item in items . This technique is useful in cases where the iterable contains other collections (e.g., list in list or tuple in list ). Notably, we can use this nested form in list, dictionary, and set comprehensions.
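
The order of the for clauses in a nested comprehension matches the order of the equivalent nested loops, outermost first. The sketch below spells that out; unique_loop and flat_list are illustrative names that do not appear in the original example.

```python
numbers = [(12, 20, 15), (11, 9, 15), (11, 13, 22)]

# Nested set comprehension, as in the example above
unique_numbers = {x for triple in numbers for x in triple}

# Equivalent explicit loops: the for clauses appear in the same order
unique_loop = set()
for triple in numbers:
    for x in triple:
        unique_loop.add(x)

# The same nesting in a list comprehension keeps the duplicates
flat_list = [x for triple in numbers for x in triple]

print(flat_list)               # [12, 20, 15, 11, 9, 15, 11, 13, 22]
print(sorted(unique_numbers))  # [9, 11, 12, 13, 15, 20, 22]
```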

4. Generator Expression

We learned that we use curly brackets for the set comprehension and square brackets for the list comprehension. What if we use parentheses, like (expression for item in iterable) ? That question leads us to the generator expression, which some people also call a generator comprehension.

In other words, when we use parentheses, we’re actually writing a generator expression, which creates a generator. Generators are “lazy” iterators in Python. This means that a generator can be used wherever an iterator is needed, but it doesn’t produce an item until that item is requested (this is why it’s called “lazy”, a piece of programming jargon). Let’s see the example below.

>>> # create the generator and get the item
>>> squares_gen = (x*x for x in range(3))
>>> squares_gen.__next__()
0
>>> squares_gen.__next__()
1
>>> squares_gen.__next__()
4
>>> squares_gen.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

In the above code, we create a generator called squares_gen using the generator expression. By calling its __next__ method (which is what the built-in next() function calls for us), we’re able to retrieve the next item from the generator. However, when the generator runs out of items, it raises a StopIteration exception, indicating that all items have been used up.
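
In everyday code, generators are usually consumed with the built-in next() function or a for loop rather than by calling __next__ directly. The following sketch shows both, along with the optional default argument of next() that avoids the StopIteration exception.

```python
squares_gen = (x * x for x in range(3))

# next() is the idiomatic way to request one item
first = next(squares_gen)
print(first)  # 0

# A for loop consumes the remaining items and handles StopIteration itself
remaining = []
for square in squares_gen:
    remaining.append(square)
print(remaining)  # [1, 4]

# With a default value, next() returns it instead of raising StopIteration
leftover = next(squares_gen, None)
print(leftover)  # None
```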

Because of their lazy evaluation, generators are a memory-efficient way to iterate over an enormous number of items without holding all of them in memory at once. For example, suppose we work with a very large file: reading the entire file into memory may exhaust the computer’s RAM and make it unresponsive. Instead, we can use a generator expression, like (row for row in open(filename)) , which lets us process the file row by row and keeps memory usage low.
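
The file-reading idea can be sketched as follows. The temporary file and its three lines are made up purely so the snippet runs on its own, and wrapping open() in a with block is an addition here so that the file is closed once the generator has been consumed.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is runnable (illustrative data)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    filename = tmp.name

with open(filename) as f:
    # Rows are read lazily, one at a time, as sum() asks for them
    row_lengths = (len(row.rstrip("\n")) for row in f)
    total_chars = sum(row_lengths)

os.remove(filename)  # clean up the throwaway file
print(total_chars)  # 14
```

The generator has to be consumed inside the with block, because it reads from the file only when items are requested.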

To illustrate how the generator expression works, let’s consider a simplified example. In the code below, we create a list and a generator of 100 million numbers, each being a square. When we check their sizes, the generator object is far smaller than the list, and the list figure counts only the list’s own storage, not the integer objects it refers to.

>>> # a list of 100 million numbers
>>> numbers_list = [x*x for x in range(100000000)]
>>> numbers_list.__sizeof__()
859724448
>>> # a generator of 100 million numbers
>>> numbers_gen = (x*x for x in range(100000000))
>>> numbers_gen.__sizeof__()
96
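
Building a list of 100 million items takes a while and a lot of RAM, so the comparison can also be tried at a smaller scale. This sketch assumes n = 100,000 (an arbitrary choice for speed) and uses sys.getsizeof, the standard library function for measuring an object’s own size. The exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

n = 100_000  # far smaller than the article's 100 million, chosen to run quickly

numbers_list = [x * x for x in range(n)]
numbers_gen = (x * x for x in range(n))

# sys.getsizeof reports the size of the object itself, in bytes;
# for the list, that excludes the integer objects it refers to
list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)

print(list_size > gen_size)  # True
```

The generator’s size stays the same no matter how large n is, because it stores only its current state rather than the items.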

If our goal is to calculate the sum of these numbers, both options give the same result. Importantly, though, once the sum has been calculated the generator is exhausted and will not yield any more items, just as we saw with the StopIteration exception above. If you need to go over the items multiple times, you can either create a list or create a new generator each time you need it, with the latter being the more memory-efficient way.

>>> # calculate the sum
>>> sum(numbers_list)
333333328333333350000000
>>> sum(numbers_gen)
333333328333333350000000
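
The exhaustion behavior is easy to see on a small range. In this sketch, make_squares is a hypothetical helper introduced only to show one way of getting a fresh generator each time one is needed.

```python
numbers_gen = (x * x for x in range(5))

first_sum = sum(numbers_gen)
second_sum = sum(numbers_gen)  # the generator is already exhausted
print(first_sum)   # 30
print(second_sum)  # 0


def make_squares(n):
    """Return a new generator of squares on every call (hypothetical helper)."""
    return (x * x for x in range(n))


# Each call produces an independent generator, so both passes see all items
print(sum(make_squares(5)))  # 30
print(max(make_squares(5)))  # 16
```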

Conclusions

In this article, we studied four important techniques in Python, all of which share the same core syntax: for item in iterable . Here’s a quick recap of these techniques and highlights of their use cases.

  • List comprehension : [expression for item in iterable] — a concise way to create lists
  • Dictionary comprehension : {exp_key: exp_value for item in iterable} — a concise way to create dictionaries
  • Set comprehension : {expression for item in iterable} — a concise way to create sets (no duplicate items)
  • Generator expression : (expression for item in iterable) — a concise way to create generators (memory efficient)
