
How to Check if a Pandas Column contains a value?

 1 year ago
source link: https://thispointer.com/how-to-check-if-a-pandas-column-contains-a-value/

A pandas DataFrame is a two-dimensional data structure with labeled axes (rows and columns). In this article, we will discuss different ways to check if a pandas column contains a particular value (string/int) or any of several specified values.

This article covers four ways to check for a value in a DataFrame column. Let’s discuss them one by one.

Check if DataFrame Column contains a value using in or not-in operator

We can select a DataFrame column as a Series object. Therefore, we can use the Series member functions on a DataFrame column.

The in operator, applied to the values attribute of a Series, can be used to check if a DataFrame column contains a given value. (Applied to the Series itself, in tests the index labels rather than the values.) The not in operator checks for the non-existence of a given value in the same way.
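The reason the examples in this section go through .values can be seen in a short sketch. It assumes a small, hypothetical Series called names that is not part of the article's DataFrame; applied to the Series itself, in looks at the index labels, while applied to .values it looks at the stored data.

```python
import pandas as pd

# Hypothetical column used only for illustration (default index 0, 1, 2)
names = pd.Series(['Reema', 'Lissa', 'Jaya'])

# `in` on the Series itself tests the index labels
label_found = 0 in names              # True: 0 is an index label
value_missed = 'Reema' in names       # False: 'Reema' is a value, not a label

# `in` on the underlying array tests the stored values
value_found = 'Reema' in names.values  # True

print(label_found, value_missed, value_found)
```

This is why a bare `'Reema' in df['Name']` can silently give the wrong answer when the intent is to search the column's contents.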

Syntax of in or not-in operator

if search_value in dataFrame[column_name].values:
    print('Element exists in Dataframe')

Example of in operator

A pandas script to check for the string ‘Reema’ in the DataFrame column ‘Name’, using the in operator.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
    'Date_of_Birth': ['1980-04-01', '1988-06-24', '1992-10-07', '1990-12-25', '1989-02-28']
})

# show the dataframe
print(df)

# find 'Reema' in the 'Name' column
if 'Reema' in df['Name'].values:
    print("\nReema exists in DataFrame")

In the above script, we use the in operator to check whether the string ‘Reema’ exists in the ‘Name’ column of the DataFrame.

Output

   Rollno   Name Date_of_Birth
0       1  Reema    1980-04-01
1       2  Lissa    1988-06-24
2       3   Jaya    1992-10-07
3       4  susma    1990-12-25
4       5  Rekha    1989-02-28

Reema exists in DataFrame

Example of not-in operator

A pandas script to check that the string ‘Leena’ does not exist in the DataFrame column ‘Name’, using the not in operator.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
    'Date_of_Birth': ['1980-04-01', '1988-06-24', '1992-10-07', '1990-12-25', '1989-02-28']
})

# show the dataframe
print(df)

# confirm 'Leena' is absent from the 'Name' column
if 'Leena' not in df['Name'].values:
    print("\nLeena does not exist in DataFrame")

In the above script, we use the not in operator to check for the non-existence of the string ‘Leena’ in the DataFrame column ‘Name’.

Output

   Rollno   Name Date_of_Birth
0       1  Reema    1980-04-01
1       2  Lissa    1988-06-24
2       3   Jaya    1992-10-07
3       4  susma    1990-12-25
4       5  Rekha    1989-02-28

Leena does not exist in DataFrame
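The same check works for integer columns such as ‘Rollno’. The sketch below wraps it in a small helper; the name column_has_value is a hypothetical one introduced here for illustration and is not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
})

def column_has_value(frame, column, value):
    """Hypothetical helper: True if `value` occurs anywhere in frame[column]."""
    return bool(value in frame[column].values)

has_roll_3 = column_has_value(df, 'Rollno', 3)          # 3 is in the column
missing_roll_9 = not column_has_value(df, 'Rollno', 9)  # 9 is not in the column

print(has_roll_3, missing_roll_9)
```

Wrapping the result in bool() keeps the return type a plain Python bool regardless of what the underlying array comparison yields.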

Check if DataFrame Column contains a value using unique()

The pandas.Series.unique() function returns the unique values of a Series object, in order of appearance. Applying the in operator to the result of unique() gives True if the search value is found among the values of the Series.

We can select a DataFrame column as a Series object. Therefore, we can use the Series member functions on a DataFrame column.

Syntax of unique() Function with in operator

search_value in dataFrame[column_name].unique()

Example of unique() Function with in operator

A pandas script to check if the string ‘Reema’ exists in the DataFrame column ‘Name’.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
    'Date_of_Birth': ['1980-04-01', '1988-06-24', '1992-10-07', '1990-12-25', '1989-02-28']
})

# show the dataframe
print(df)

# find 'Reema' among the unique values of the 'Name' column
if 'Reema' in df['Name'].unique():
    print('Value exists in column')
else:
    print('Value does not exist in column')

Output

   Rollno   Name Date_of_Birth
0       1  Reema    1980-04-01
1       2  Lissa    1988-06-24
2       3   Jaya    1992-10-07
3       4  susma    1990-12-25
4       5  Rekha    1989-02-28
Value exists in column

In the above script, df[‘Name’] returns a Series object with all values from the column ‘Name’. The pandas.Series.unique() function returns the unique values of that Series. The condition of the if-statement evaluates to True because ‘Reema’ is in the ‘Name’ column of the DataFrame.
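When several values need to be looked up against the same column, one option is to compute the unique values once and keep them in a Python set. The sketch below is only an illustration of that idea; the DataFrame with repeated names and the variable known_names are assumptions made for this example.

```python
import pandas as pd

# Hypothetical column with repeated names
df = pd.DataFrame({'Name': ['Reema', 'Lissa', 'Jaya', 'Reema', 'Lissa']})

# Distinct values, in order of first appearance
unique_names = df['Name'].unique()
print(unique_names)

# Build the set once, then test as many candidates as needed
known_names = set(unique_names)
for candidate in ['Reema', 'Leena']:
    print(candidate, candidate in known_names)
```

For a single lookup this adds nothing over the in operator; it is mainly of interest when the same column is searched many times.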

Check if DataFrame Column contains a value using pandas.Series.isin()

The pandas.Series.isin() function checks whether each element of a DataFrame column is one of a given list of values. It returns a boolean Series in which each True marks a column element that matches one of the given values.

Syntax of pandas.Series.isin() Function

df['column_name'].isin([search_value1,search_value2,..])

Example of pandas.Series.isin() Function

A pandas script to check if the strings ‘Reema’ or ‘Jaya’ exist in the DataFrame column ‘Name’ using the pandas.Series.isin() function.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
    'Date_of_Birth': ['1980-04-01', '1988-06-24', '1992-10-07', '1990-12-25', '1989-02-28']
})

# show the dataframe
print(df)

# For each row, check if 'Name' is one of the given values
boolValues = df['Name'].isin(['Reema', 'Jaya'])

print(boolValues)

# Check if any of 'Reema' or 'Jaya' exists
# in the 'Name' column of DataFrame
if boolValues.any():
    print('A value exists in column')
else:
    print('None of the values exist in column')

Output

   Rollno   Name Date_of_Birth
0       1  Reema    1980-04-01
1       2  Lissa    1988-06-24
2       3   Jaya    1992-10-07
3       4  susma    1990-12-25
4       5  Rekha    1989-02-28
0     True
1    False
2     True
3    False
4    False
Name: Name, dtype: bool
A value exists in column
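The boolean Series returned by isin() can also be used to select the matching rows, and swapping the roles of the two sides shows which of the searched values are actually present. The following is a minimal sketch of both uses; the list wanted and the other variable names are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
})

# Hypothetical search list; 'Leena' is deliberately absent from the column
wanted = ['Reema', 'Jaya', 'Leena']

# Rows whose 'Name' is one of the wanted values
matches = df[df['Name'].isin(wanted)]
print(matches)

# For each wanted value, is it present anywhere in the column?
found = pd.Series(wanted).isin(df['Name'])
all_present = bool(found.all())
missing = [name for name, ok in zip(wanted, found) if not ok]
print(all_present, missing)
```

Note that any() on the row-wise result answers "does at least one wanted value occur?", whereas the reversed call is needed to answer "do all wanted values occur?".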

Check if DataFrame Column contains a value using contains() Function

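The pandas Series.str.contains() function tests whether a pattern occurs within each string of a Series. By default the pattern is treated as a regular expression and may match anywhere in the string, so it finds substrings as well as whole values. We can use it to check if a string exists in a column or not.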

Syntax of contains() Function

DataFrame[DataFrame[column_name].str.contains(search_value)]

Example of contains() Function

A pandas script to check if the string ‘Reema’ exists in the DataFrame column ‘Name’ using the contains() function.

import pandas as pd

# create a dataframe
df = pd.DataFrame({
    'Rollno': [1, 2, 3, 4, 5],
    'Name': ['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha'],
    'Date_of_Birth': ['1980-04-01', '1988-06-24', '1992-10-07', '1990-12-25', '1989-02-28']
})

# show the dataframe
print(df)

# Check if string 'Reema' exists in the
# 'Name' column of DataFrame
if df['Name'].str.contains('Reema').any():
    print('Value exists in column')
else:
    print('Value does not exist in column')

Output

   Rollno   Name Date_of_Birth
0       1  Reema    1980-04-01
1       2  Lissa    1988-06-24
2       3   Jaya    1992-10-07
3       4  susma    1990-12-25
4       5  Rekha    1989-02-28
Value exists in column

In the above script, we have used the contains() function to search for ‘Reema’ in the ‘Name’ column of the DataFrame. The contains() function returns a boolean Series that is True for each row whose value contains the specified string; calling any() on it tells us whether at least one row matched.
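Because contains() matches substrings and treats the pattern as a regular expression by default, its optional parameters are worth a quick look. The sketch below assumes a hypothetical Series names that includes one missing entry, and shows the case, regex and na parameters alongside a plain equality test for exact matches.

```python
import pandas as pd

# Hypothetical column with one missing entry
names = pd.Series(['Reema', 'Lissa', 'Jaya', 'susma', 'Rekha', None])

# Substring match: 'Re' occurs in both 'Reema' and 'Rekha';
# na=False makes the missing entry count as "no match"
with_re = names.str.contains('Re', na=False)

# case=False ignores letter case, so 'SUSMA' matches 'susma'
any_case = names.str.contains('SUSMA', case=False, na=False)

# regex=False treats '.' as a literal dot instead of "any character"
literal_dot = names.str.contains('.', regex=False, na=False)

# An exact, whole-value comparison uses == rather than contains()
exact = names == 'Reema'

print(with_re.tolist())
print(any_case.tolist())
print(literal_dot.any(), exact.sum())
```

If only whole-value matches are wanted, the in operator or isin() from the earlier sections is usually the more direct choice.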

Summary

We learned how to check if a pandas column contains a particular value. We discussed what a pandas DataFrame is and how to find a value in a particular column. Happy Learning.
