Python's "in" and "not in" Operators: Check for Membership

 2 years ago
source link: https://realpython.com/python-in-operator/

Getting Started With Membership Tests in Python

Sometimes you need to find out whether a value is present in a collection of values or not. In other words, you need to check if a given value is or is not a member of a collection of values. This kind of check is commonly known as a membership test.

Arguably, the natural way to perform this kind of check is to iterate over the values and compare them with the target value. You can do this with the help of a for loop and a conditional statement.

Consider the following is_member() function:

>>> def is_member(value, iterable):
...     for item in iterable:
...         if value is item or value == item:
...             return True
...     return False
...

This function takes two arguments, the target value and a collection of values, which is generically called iterable. The loop iterates over iterable while the conditional statement checks if the target value is equal to the current value. Note that the condition checks for object identity with is or for value equality with the equality operator (==). These are slightly different but complementary tests.

If the condition is true, then the function returns True, breaking out of the loop. This early return short-circuits the loop operation. If the loop finishes without any match, then the function returns False:

>>> is_member(5, [2, 3, 5, 9, 7])
True

>>> is_member(8, [2, 3, 5, 9, 7])
False

The first call to is_member() returns True because the target value, 5, is a member of the list at hand, [2, 3, 5, 9, 7]. The second call to the function returns False because 8 isn’t present in the input list of values.
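The identity check in is_member() isn’t redundant. As a small illustration, the sketch below uses float("nan"), a value that compares unequal to itself, to show where the two tests diverge. The equality_only_member() helper is hypothetical and exists only for contrast:

```python
def is_member(value, iterable):
    for item in iterable:
        if value is item or value == item:
            return True
    return False


def equality_only_member(value, iterable):
    # Hypothetical variant that skips the identity check.
    for item in iterable:
        if value == item:
            return True
    return False


nan = float("nan")
values = [1.0, nan, 3.0]

print(nan == nan)                         # False: NaN never equals itself
print(is_member(nan, values))             # True: the identity check finds the same object
print(equality_only_member(nan, values))  # False: equality alone misses it
print(nan in values)                      # True: the built-in operator agrees with is_member()
```

This is why the two tests are described as complementary: identity catches objects that can’t be matched through equality.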

Membership tests like the ones above are so common and useful in programming that Python has dedicated operators to perform these types of checks. You can get to know the membership operators in the following table:

Operator | Description | Syntax
in | Returns True if the target value is present in a collection of values. Otherwise, it returns False. | value in collection
not in | Returns True if the target value is not present in a given collection of values. Otherwise, it returns False. | value not in collection

As with Boolean operators, Python favors readability by using common English words instead of potentially confusing symbols as operators.

Note: Don’t confuse the in keyword when it works as the membership operator with the in keyword in the for loop syntax. They have entirely different meanings. The in operator checks if a value is in a collection of values, while the in keyword in a for loop indicates the iterable that you want to draw from.

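Like many other operators, in and not in are binary operators. That means you can create expressions by connecting two operands. In this case, the left operand is the target value that you want to look for, and the right operand is the collection of values where the target value may be found.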

The syntax of a membership test looks something like this:

value in collection

value not in collection

In these expressions, value can be any Python object. Meanwhile, collection can be any data type that can hold collections of values, including lists, tuples, strings, sets, and dictionaries. It can also be an instance of a user-defined class that implements the .__contains__() method or that otherwise supports iteration.
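As a rough sketch of user-defined support, the two classes below enable membership tests in different ways: EvenNumbers defines .__contains__() directly, while Countdown only defines .__iter__(), so Python falls back to iterating over it. Both class names are made up for illustration:

```python
class EvenNumbers:
    """Hypothetical container that treats every even integer as a member."""

    def __contains__(self, value):
        return isinstance(value, int) and value % 2 == 0


class Countdown:
    """Hypothetical iterable that doesn't define .__contains__()."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yields start, start - 1, ..., 1
        return iter(range(self.start, 0, -1))


print(4 in EvenNumbers())     # True
print(7 in EvenNumbers())     # False
print(3 in Countdown(5))      # True
print(9 not in Countdown(5))  # True
```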

If you use the in and not in operators correctly, then the expressions that you build with them will always evaluate to a Boolean value. In other words, those expressions will always return either True or False. On the other hand, if you try to find a value in something that doesn’t support membership tests, then you’ll get a TypeError. Later, you’ll learn more about the Python data types that support membership tests.
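To see the failure case, here’s a minimal sketch that runs a membership test against an integer, which is neither a container nor an iterable. The exact wording of the error message can vary between Python versions, so the example prints whatever message it receives:

```python
number = 42
raised = False

try:
    5 in number  # an int doesn't support membership tests
except TypeError as error:
    raised = True
    print(f"TypeError: {error}")

print(raised)  # True
```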

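Because membership operators always evaluate to a Boolean value, you can use them wherever a Boolean expression fits, such as in conditions that you combine with and, or, and not. Strictly speaking, Python’s language reference groups in and not in with the comparison operators.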

Now that you know what membership operators are, it’s time to learn the basics of how they work.

Python’s in Operator

To better understand the in operator, you’ll start by writing some small demonstrative examples that determine if a given value is in a list:

>>> 5 in [2, 3, 5, 9, 7]
True

>>> 8 in [2, 3, 5, 9, 7]
False

The first expression returns True because 5 appears inside your list of numbers. The second expression returns False because 8 isn’t present in the list.

According to the in operator documentation, an expression like value in collection is equivalent to the following code:

any(value is item or value == item for item in collection)

The generator expression wrapped in the call to any() lazily yields the Boolean values that result from checking if the target value has the same identity or is equal to the current item in collection. The call to any() checks if any one of the resulting Boolean values is True, in which case the function returns True right away without evaluating the remaining items. If all the values are False, then any() returns False.
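The following sketch makes the evaluation order of that any() expression visible. The matches() helper and the checked list are hypothetical additions that record every item compared before any() stops:

```python
checked = []


def matches(value, item):
    # Hypothetical helper that records each item it compares.
    checked.append(item)
    return value is item or value == item


collection = [2, 3, 5, 9, 7]
found = any(matches(5, item) for item in collection)

print(found)    # True
print(checked)  # [2, 3, 5]
```

The items 9 and 7 never reach matches() because any() stops at the first true result.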

Python’s not in Operator

The not in membership operator does exactly the opposite. With this operator, you can check if a given value is not in a collection of values:

>>> 5 not in [2, 3, 5, 9, 7]
False

>>> 8 not in [2, 3, 5, 9, 7]
True

In the first example, you get False because 5 is in [2, 3, 5, 9, 7]. In the second example, you get True because 8 isn’t in the list of values. This negative logic may seem like a tongue twister. To avoid confusion, remember that you’re trying to determine if the value is not part of a given collection of values.

Note: The not value in collection construct works the same as the value not in collection one. However, the former construct is more difficult to read. Therefore, you should use not in as a single operator instead of using not to negate the result of in.
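As a quick check, the snippet below evaluates both spellings side by side. Python parses not value in numbers as not (value in numbers), so the results always agree:

```python
numbers = [2, 3, 5, 9, 7]

for value in (5, 8):
    preferred = value not in numbers
    harder_to_read = not value in numbers  # parsed as: not (value in numbers)
    print(value, preferred, harder_to_read)
```

Many linters flag the second spelling, which is another reason to prefer not in.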

With this quick overview of how membership operators work, you’re ready to go to the next level and learn how in and not in work with different built-in data types.

Using in and not in With Different Python Types

All built-in sequences—such as lists, tuples, range objects, and strings—support membership tests with the in and not in operators. Collections like sets and dictionaries also support these tests. By default, membership operations on dictionaries check whether the dictionary has a given key or not. However, dictionaries also have explicit methods that allow you to use the membership operators with keys, values, and key-value pairs.

In the following sections, you’ll learn about a few particularities of using in and not in with different built-in data types. You’ll start with lists, tuples, and range objects to kick things off.

Lists, Tuples, and Ranges

So far, you’ve coded a few examples of using the in and not in operators to determine if a given value is present in an existing list of values. For these examples, you’ve explicitly used list objects. So, you’re already familiar with how membership tests work with lists.

With tuples, the membership operators work the same as they would with lists:

>>> 5 in (2, 3, 5, 9, 7)
True

>>> 5 not in (2, 3, 5, 9, 7)
False

There are no surprises here. Both examples work the same as the list-focused examples. In the first example, the in operator returns True because the target value, 5, is in the tuple. In the second example, not in returns the opposite result.

For lists and tuples, the membership operators use a search algorithm that iterates over the items in the underlying collection. Therefore, as your iterable gets longer, the search time increases in direct proportion. Using Big O notation, you’d say that membership operations on these data types have a time complexity of O(n).
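One way to observe this linear behavior without relying on timings is to count comparisons. The sketch below uses a hypothetical Counted class whose .__eq__() method increments a counter, then searches lists of growing size for a value that’s never present:

```python
class Counted:
    """Hypothetical wrapper that counts equality comparisons."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return isinstance(other, Counted) and self.value == other.value


def comparisons_for_miss(size):
    items = [Counted(number) for number in range(size)]
    Counted.comparisons = 0
    found = Counted(-1) in items  # -1 is never in the list
    return found, Counted.comparisons


for size in (10, 100, 1000):
    print(size, comparisons_for_miss(size))
```

In the worst case, where the value is missing, the number of comparisons equals the number of items in the list.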

If you use the in and not in operators with range objects, then you get a similar result:

>>> 5 in range(10)
True

>>> 5 not in range(10)
False

>>> 5 in range(0, 10, 2)
False

>>> 5 not in range(0, 10, 2)
True

When it comes to range objects, using membership tests may seem unnecessary at first glance. Most of the time, you’ll know the values in the resulting range beforehand. But what if you’re using range() with offsets that are determined at runtime?

Note: When creating range objects, you can pass up to three arguments to range(). These arguments are start, stop, and step. They define the number that starts the range, the number at which the range must stop generating values, and the step between the generated values. These three arguments are commonly known as offsets.

Consider the following examples, which use random numbers to determine offsets at runtime:

>>> from random import randint

>>> 50 in range(0, 100, randint(1, 10))
False

>>> 50 in range(0, 100, randint(1, 10))
False

>>> 50 in range(0, 100, randint(1, 10))
True

>>> 50 in range(0, 100, randint(1, 10))
True

On your machine, you might get different results because you’re working with random range offsets. In these specific examples, step is the only offset that varies. In real code, you could have varying values for the start and stop offsets as well.

For range objects, the algorithm behind the membership tests computes the presence of a given integer value arithmetically. It checks that the value falls within the bounds of the range and that the expression (value - start) % step == 0 holds, which depends on the offsets used to create the range at hand. This makes membership tests very efficient when they operate on range objects. In this case, you’d say that their time complexity is O(1).
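To make the arithmetic concrete, here’s a minimal sketch of that kind of check. The in_range() helper is hypothetical, assumes integer arguments and a nonzero step, and isn’t how CPython implements it internally:

```python
def in_range(value, start, stop, step=1):
    """Hypothetical helper: arithmetic membership check for integers.

    Assumes integer arguments and a nonzero step.
    """
    if step > 0:
        within_bounds = start <= value < stop
    else:
        within_bounds = stop < value <= start
    return within_bounds and (value - start) % step == 0


print(in_range(5, 0, 10))      # True, same as 5 in range(10)
print(in_range(5, 0, 10, 2))   # False, same as 5 in range(0, 10, 2)
print(in_range(4, 10, 0, -2))  # True, same as 4 in range(10, 0, -2)
```

No matter how long the range is, the helper performs the same handful of operations, which is what constant time means here.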

Note: Lists, tuples, and range objects have an .index() method that returns the index of the first occurrence of a given value in the underlying sequence. This method is useful for locating a value in a sequence.

Some may think that they can use the method to determine if a value is in a sequence. However, if the value isn’t in the sequence, then .index() raises a ValueError:

>>> (2, 3, 5, 9, 7).index(8)
Traceback (most recent call last):
    ...
ValueError: tuple.index(x): x not in tuple

You probably don’t want to figure out whether a value is in a sequence or not by raising exceptions, so you should use a membership operator instead of .index() for this purpose.

Remember that the target value in a membership test can be of any type. The test will check if that value is or isn’t in the target collection. For example, say that you have a hypothetical app where the users authenticate with a username and a password. You can have something like this:

# users.py

username = input("Username: ")
password = input("Password: ")

users = [("john", "secret"), ("jane", "secret"), ("linda", "secret")]

if (username, password) in users:
    print(f"Hi {username}, you're logged in!")
else:
    print("Wrong username or password")

This is a naive example. It’s unlikely that anyone would handle their users and passwords like this. But the example shows that the target value can be of any data type. In this case, you use a tuple of strings representing the username and the password of a given user.

Here’s how the code works in practice:

$ python users.py
Username: john
Password: secret
Hi john, you're logged in!

$ python users.py
Username: tina
Password: secret
Wrong username or password

In the first example, the username and password are correct because they’re in the users list. In the second example, the username doesn’t belong to any registered user, so the authentication fails.

In these examples, it’s important to note that the order in which the data is stored in the login tuple is critical because something like ("john", "secret") isn’t equal to ("secret", "john") in tuple comparison even if they have the same items.

In this section, you’ve explored examples that showcase the core behavior of membership operators with common Python built-in sequences. However, there’s a built-in sequence left. Yes, strings! In the next section, you’ll learn how membership operators work with this data type in Python.

Strings

Python strings are a fundamental tool in every Python developer’s tool kit. Like tuples, lists, and ranges, strings are also sequences because their items or characters are sequentially stored in memory.

You can use the in and not in operators with strings when you need to figure out if a given character is present in the target string. For example, say that you’re using strings to set and manage user permissions for a given resource:

>>> class User:
...     def __init__(self, username, permissions):
...         self.username = username
...         self.permissions = permissions
...

>>> admin = User("admin", "wrx")
>>> john = User("john", "rx")

>>> def has_permission(user, permission):
...     return permission in user.permissions
...

>>> has_permission(admin, "w")
True
>>> has_permission(john, "w")
False

The User class takes two arguments, a username and the user’s permissions. To provide the permissions, you use a string in which w means that the user has write permission, r means that the user has read permission, and x implies execution permissions. Note that these letters are the same ones that you’d find in the Unix-style file-system permissions.

The membership test inside has_permission() checks whether the current user has a given permission or not, returning True or False accordingly. To do this, the in operator searches the permissions string to find a single character. In this example, you want to know if the users have write permission.

However, your permission system has a hidden issue. What would happen if you called the function with an empty string? Here’s your answer:

>>> has_permission(john, "")
True

Because an empty string is always considered a substring of any other string, an expression like "" in user.permissions will return True. Depending on who has access to your users’ permissions, this behavior of membership tests may imply a security breach in your system.
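One possible way to close this gap is to validate the requested permission before running the substring test. The sketch below is a variation on has_permission() and assumes a hypothetical VALID_PERMISSIONS constant that lists the allowed flags:

```python
VALID_PERMISSIONS = frozenset("rwx")  # assumed set of single-letter flags


class User:
    def __init__(self, username, permissions):
        self.username = username
        self.permissions = permissions


def has_permission(user, permission):
    # Reject anything that isn't exactly one known permission flag.
    if permission not in VALID_PERMISSIONS:
        raise ValueError(f"unknown permission: {permission!r}")
    return permission in user.permissions


john = User("john", "rx")
print(has_permission(john, "r"))  # True
print(has_permission(john, "w"))  # False

try:
    has_permission(john, "")
except ValueError as error:
    print(error)  # unknown permission: ''
```

Because membership in a frozenset compares whole items rather than substrings, the empty string and multi-character strings like "rx" are both rejected.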

You can also use the membership operators to determine if a string contains a substring:

>>> greeting = "Hi, welcome to Real Python!"

>>> "Hi" in greeting
True
>>> "Hi" not in greeting
False

>>> "Hello" in greeting
False
>>> "Hello" not in greeting
True

For the string data type, an expression like substring in string is True if substring is part of string. Otherwise, the expression is False.

Note: Unlike other sequences like lists, tuples, and range objects, strings provide a .find() method that you can use when searching for a given substring in an existing string.

For example, you can do something like this:

>>> greeting.find("Python")
20

>>> greeting.find("Hello")
-1

If the substring is present in the underlying string, then .find() returns the index at which the substring starts in the string. If the target string doesn’t contain the substring, then you get -1 as a result. So, an expression like string.find(substring) >= 0 would be equivalent to a substring in string test.

However, the membership test is way more readable and explicit, which makes it preferable in this situation.

An important point to remember when using membership tests on strings is that string comparisons are case-sensitive:

>>> "PYTHON" in greeting
False

This membership test returns False because string comparisons are case-sensitive, and "PYTHON" in uppercase isn’t present in greeting. To work around this case sensitivity, you can normalize all your strings using either the .upper() or .lower() method:

>>> "PYTHON".lower() in greeting.lower()
True

In this example, you use .lower() to convert the target substring and the original string into lowercase letters. This conversion sidesteps the case sensitivity in the implicit string comparison.
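For text that goes beyond ASCII, lowercasing isn’t always enough. As a hedged illustration, the sketch below uses the built-in str.casefold() method, which applies a more aggressive normalization. The German sample text is made up for this example:

```python
text = "Die Straße ist lang"
target = "STRASSE"

# .lower() keeps "ß" as is, so the substring isn't found.
print(target.lower() in text.lower())        # False

# .casefold() maps "ß" to "ss", so the test succeeds.
print(target.casefold() in text.casefold())  # True
```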

Generators

Generator functions and generator expressions create memory-efficient iterators known as generator iterators. To be memory efficient, these iterators yield items on demand without keeping a complete series of values in memory.

In practice, a generator function is a function that uses the yield statement in its body. For example, say that you need a generator function that takes a list of numbers and returns an iterator that yields square values from the original data. In this case, you can do something like this:

>>> def squares_of(values):
...     for value in values:
...         yield value ** 2
...

>>> squares = squares_of([1, 2, 3, 4])

>>> next(squares)
1
>>> next(squares)
4
>>> next(squares)
9
>>> next(squares)
16
>>> next(squares)
Traceback (most recent call last):
    ...
StopIteration

This function returns a generator iterator that yields square numbers on demand. You can use the built-in next() function to retrieve consecutive values from the iterator. When the generator iterator is completely consumed, it raises a StopIteration exception to communicate that no more values are left.

You can use the membership operators on the generator iterator that a generator function like squares_of() returns:

>>> 4 in squares_of([1, 2, 3, 4])
True
>>> 9 in squares_of([1, 2, 3, 4])
True
>>> 5 in squares_of([1, 2, 3, 4])
False

The in operator works as expected when you use it with generator iterators, returning True if the value is present in the iterator and False otherwise.

However, there’s something you need to be aware of when checking for membership on generators. A generator iterator will yield each item only once. If you consume all the items, then the iterator will be exhausted, and you won’t be able to iterate over it again. If you consume only some items from a generator iterator, then you can iterate over the remaining items only.

When you use in or not in on a generator iterator, the operator will consume it while searching for the target value. If the value is present, then the operator will consume all the values up to the target value. The rest of the values will still be available in the generator iterator:

>>> squares = squares_of([1, 2, 3, 4])

>>> 4 in squares
True

>>> next(squares)
9
>>> next(squares)
16
>>> next(squares)
Traceback (most recent call last):
    ...
StopIteration

In this example, 4 is in the generator iterator because it’s the square of 2. Therefore, in returns True. When you use next() to retrieve a value from squares, you get 9, which is the square of 3. This result confirms that you no longer have access to the first two values. You can continue calling next() until you get a StopIteration exception when the generator iterator is exhausted.

Likewise, if the value isn’t present in the generator iterator, then the operator will consume the iterator completely, and you won’t have access to any of its values:

>>> squares = squares_of([1, 2, 3, 4])

>>> 5 in squares
False

>>> next(squares)
Traceback (most recent call last):
    ...
StopIteration

In this example, the in operator consumes squares completely, returning False because the target value isn’t in the input data. Because the generator iterator is now exhausted, a call to next() with squares as an argument raises StopIteration.

You can also create generator iterators using generator expressions. These expressions use the same syntax as list comprehensions but replace the square brackets ([]) with round brackets (()). You can use the in and not in operators with the result of a generator expression:

>>> squares = (value ** 2 for value in [1, 2, 3, 4])
>>> squares
<generator object <genexpr> at 0x1056f20a0>

>>> 4 in squares
True

>>> next(squares)
9
>>> next(squares)
16
>>> next(squares)
Traceback (most recent call last):
    ...
StopIteration

The squares variable now holds the iterator that results from the generator expression. This iterator yields square values from the input list of numbers. Generator iterators from generator expressions work the same as generator iterators from generator functions. So, the same rules apply when you use them in membership tests.
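If you need to run several membership tests against the same generated values, then one option is to store them in a list first, trading memory for repeatability. This is only a sketch of that workaround, reusing squares_of() from above:

```python
def squares_of(values):
    for value in values:
        yield value ** 2


# Materialize the generator once so the values survive repeated tests.
squares = list(squares_of([1, 2, 3, 4]))

print(4 in squares)  # True
print(4 in squares)  # True again: nothing was consumed
print(5 in squares)  # False
print(squares)       # [1, 4, 9, 16]
```

This approach only makes sense when the generated data is finite and small enough to fit in memory.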

Another critical issue can arise when you use the in and not in operators with generator iterators. This issue can appear when you’re working with infinite iterators. The function below returns an iterator that yields infinite integers:

>>> def infinite_integers():
...     number = 0
...     while True:
...         yield number
...         number += 1
...

>>> integers = infinite_integers()
>>> integers
<generator object infinite_integers at 0x1057e8c80>

>>> next(integers)
0
>>> next(integers)
1
>>> next(integers)
2
>>> next(integers)
3
>>> next(integers)
4

The infinite_integers() function returns a generator iterator, which is stored in integers. This iterator yields values on demand, but remember, there will be infinite values. Because of this, it won’t be a good idea to use the membership operators with this iterator. Why? Well, if the target value isn’t in the generator iterator, then you’ll run into an infinite loop that’ll make your execution hang.
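If you still need a membership test on such an iterator, then you can bound the search first. The sketch below shows two possible approaches with itertools. The limit of 1,000 items is an arbitrary assumption, and the second approach only works because the values increase steadily:

```python
from itertools import islice, takewhile


def infinite_integers():
    number = 0
    while True:
        yield number
        number += 1


# Option 1: only inspect the first 1,000 items.
print(500 in islice(infinite_integers(), 1000))  # True
print(-1 in islice(infinite_integers(), 1000))   # False

# Option 2: stop as soon as the values exceed the target.
target = 7.5
print(target in takewhile(lambda n: n <= target, infinite_integers()))  # False
```

Both tests finish even when the target value is missing, because the wrapped iterator is finite.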

Dictionaries and Sets

Python’s membership operators also work with dictionaries and sets. If you use the in or not in operators directly on a dictionary, then it’ll check whether the dictionary has a given key or not. You can also do this check using the .keys() method, which is more explicit about your intentions.

You can also check if a given value or key-value pair is in a dictionary. To do these checks, you can use the .values() and .items() methods, respectively:

>>> likes = {"color": "blue", "fruit": "apple", "pet": "dog"}

>>> "fruit" in likes
True
>>> "hobby" in likes
False
>>> "blue" in likes
False

>>> "fruit" in likes.keys()
True
>>> "hobby" in likes.keys()
False
>>> "blue" in likes.keys()
False

>>> "dog" in likes.values()
True
>>> "drawing" in likes.values()
False

>>> ("color", "blue") in likes.items()
True
>>> ("hobby", "drawing") in likes.items()
False

In these examples, you use the in operator directly on your likes dictionary to check whether the "fruit", "hobby", and "blue" keys are in the dictionary or not. Note that even though "blue" is a value in likes, the test returns False because it only considers the keys.

Next up, you use the .keys() method to get the same results. In this case, the explicit method name makes your intentions much clearer to other programmers reading your code.

To check if a value like "dog" or "drawing" is present in likes, you use the .values() method, which returns a view object with the values in the underlying dictionary. Similarly, to check if a key-value pair is contained in likes, you use .items(). Note that the target key-value pairs must be two-item tuples with the key and value in that order.
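The short sketch below illustrates the ordering requirement for key-value pairs. It also shows one way to find which keys hold a given value, using a list comprehension that’s an addition for illustration:

```python
likes = {"color": "blue", "fruit": "apple", "pet": "dog"}

print(("color", "blue") in likes.items())  # True
print(("blue", "color") in likes.items())  # False: key and value are swapped

# Find every key that maps to a given value.
keys_for_blue = [key for key, value in likes.items() if value == "blue"]
print(keys_for_blue)  # ['color']
```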

If you’re using sets, then the membership operators return the same results as they would with lists or tuples:

>>> fruits = {"apple", "banana", "cherry", "orange"}

>>> "banana" in fruits
True
>>> "banana" not in fruits
False

>>> "grape" in fruits
False
>>> "grape" not in fruits
True

These examples show that you can also check whether a given value is contained in a set by using the membership operators in and not in.
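One particularity worth knowing is that sets look up items by hash, so the target value must be hashable. The sketch below shows what happens with an unhashable target like a list. The exact error message can differ between Python versions:

```python
fruits = {"apple", "banana", "cherry", "orange"}

print("banana" in fruits)  # True

raised = False
try:
    ["banana"] in fruits  # a list can't be hashed
except TypeError as error:
    raised = True
    print(f"TypeError: {error}")

# The same test on a list just compares items and returns False.
print(["banana"] in list(fruits))  # False
```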

Now that you know how the in and not in operators work with different built-in data types, it’s time to put these operators into action with a couple of examples.

