

Pandas Dataframe.loc[] – thisPointer
source link: https://thispointer.com/pandas-dataframe-loc/
In this article, we will discuss how to use the loc property of the Dataframe with examples.
In Pandas, the Dataframe provides a property loc[], to select the subset of Dataframe based on row and column names/labels. We can choose single or multiple rows & columns using it. Let’s learn more about it,
Syntax:
    Dataframe.loc[row_segment, column_segment]
    Dataframe.loc[row_segment]
The column_segment argument is optional. Therefore, if column_segment is not provided, loc[] will select the subset of the Dataframe based on the row_segment argument only, and all columns are included.
Arguments:
- row_segment:
  - It contains information about the rows to be selected. Its value can be,
    - A single label like ‘A’ or 7 etc.
      - In this case, it selects the single row with the given label name.
      - For example, if only ‘B’ is given, then only the row with label ‘B’ is selected from the Dataframe.
    - A list/array of label names like, [‘B’, ‘E’, ‘H’]
      - In this case, multiple rows will be selected based on the row labels given in the list.
      - For example, if [‘B’, ‘E’, ‘H’] is given as argument in the row segment, then the rows with label names ‘B’, ‘E’ and ‘H’ will be selected.
    - A slice object with labels like -> ‘a’:‘e’
      - This case will select multiple rows i.e. from the row with label ‘a’ up to and including the row with label ‘e’. Unlike positional slicing, both end labels are part of the result.
      - For example, if ‘B’:‘E’ is provided in the row segment of loc[], it will select the range of rows from label ‘B’ to label ‘E’, both included.
      - For selecting all rows, provide the value ( : )
    - A boolean sequence of the same size as the number of rows.
      - In this case, it will select only those rows for which the corresponding value in the boolean array/list is True.
    - A callable function:
      - It can be a lambda function or a general function, which accepts the calling dataframe as an argument and returns a valid selector in any one of the formats mentioned above.
- column_segment:
  - It is optional.
  - It contains information about the columns to be selected. Its value can be,
    - A single label like ‘A’ or 7 etc.
      - In this case, it selects the single column with the given label name.
      - For example, if only ‘Age’ is given, then only the column with label ‘Age’ is selected from the Dataframe.
    - A list/array of label names like, [‘Name’, ‘Age’, ‘City’]
      - In this case, multiple columns will be selected based on the column labels given in the list.
      - For example, if [‘Name’, ‘Age’, ‘City’] is given as argument in the column segment, then the columns with label names ‘Name’, ‘Age’, and ‘City’ will be selected.
    - A slice object with labels like -> ‘a’:‘e’
      - This case will select multiple columns i.e. from the column with label ‘a’ up to and including the column with label ‘e’.
      - For example, if ‘Name’:‘City’ is provided in the column segment of loc[], it will select the range of columns from label ‘Name’ to label ‘City’, both included.
      - For selecting all columns, provide the value ( : )
    - A boolean sequence of the same size as the number of columns.
      - In this case, it will select only those columns for which the corresponding value in the boolean array/list is True.
    - A callable function:
      - It can be a lambda function or a general function that accepts the calling dataframe as an argument and returns a valid selector in any one of the formats mentioned above.
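The selector formats listed above can be sketched on a small Dataframe. This is only an illustration: the data and the variable names are made up for this sketch and are not part of the article's later examples.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, used only in this sketch)
df = pd.DataFrame(
    {"Name": ["jack", "Riti", "Vikas", "Neelu"],
     "Age": [34, 30, 31, 32],
     "City": ["Sydney", "Delhi", "Mumbai", "Bangalore"]},
    index=["a", "b", "c", "d"])

single_row = df.loc["b"]                           # single label -> one row as a Series
listed_rows = df.loc[["d", "a"]]                   # list of labels -> rows in the order given
sliced_rows = df.loc["b":"d"]                      # label slice -> both end labels included
masked_rows = df.loc[[True, False, True, False]]   # boolean sequence, one flag per row
called_rows = df.loc[lambda d: d["Age"] > 30]      # callable returning a boolean Series

print(list(sliced_rows.index))   # ['b', 'c', 'd']
```

The same five formats are accepted in the column segment, for example df.loc[:, "Name":"Age"].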
Returns :
It returns the selected subset of the dataframe based on the provided row and column labels. The result is a single value when both segments are single labels, a Series when only one of them is a single label, and a Dataframe otherwise.
Also, if column_segment is not provided, it returns the selected rows with all columns, based on the row_segment argument.
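A minimal sketch of how the type of the result depends on the selectors. The two-row frame is hypothetical and exists only for this illustration.

```python
import pandas as pd

# Hypothetical two-row frame, used only in this sketch
df = pd.DataFrame({"Name": ["jack", "Riti"], "Age": [34, 30]}, index=["a", "b"])

cell = df.loc["a", "Age"]    # two single labels -> a single value
row = df.loc["a"]            # one row label -> Series
frame = df.loc[["a"]]        # list with one label -> Dataframe with one row

print(type(row).__name__, type(frame).__name__)   # Series DataFrame
```

Wrapping a single label in a list is a common way to keep the result two-dimensional.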
Error scenarios:
Dataframe.loc[row_segment, column_segment] will raise a KeyError if a single label, or any label in a list, is not present in the Dataframe.
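The error case can be sketched as follows. The frame and the label ‘z’ are made up for this illustration, and checking membership in df.index first is just one possible way to avoid the exception.

```python
import pandas as pd

# Hypothetical frame, used only in this sketch
df = pd.DataFrame({"Age": [34, 30]}, index=["a", "b"])

# Asking for a label that does not exist raises KeyError
try:
    df.loc["z"]
    found = True
except KeyError:
    found = False

# One way to guard the lookup: test the label against the index first
label = "z"
result = df.loc[label] if label in df.index else None

print(found, result)   # False None
```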
Let’s understand more about it with some examples,
Pandas Dataframe.loc[] – Examples
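We have divided the examples into three parts i.e.
- Select a few rows from Dataframe
- Select a few columns from Dataframe
- Select a subset of Dataframe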
Let’s look at these examples one by one. But before that we will create a Dataframe from list of tuples,
    import pandas as pd

    # List of Tuples
    students = [('jack', 34, 'Sydeny', 'Australia'),
                ('Riti', 30, 'Delhi', 'India'),
                ('Vikas', 31, 'Mumbai', 'India'),
                ('Neelu', 32, 'Bangalore', 'India'),
                ('John', 16, 'New York', 'US'),
                ('Mike', 17, 'las vegas', 'US')]

    # Create a DataFrame from list of tuples
    df = pd.DataFrame(students,
                      columns=['Name', 'Age', 'City', 'Country'],
                      index=['a', 'b', 'c', 'd', 'e', 'f'])

    print(df)
Output:
        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c  Vikas   31     Mumbai      India
    d  Neelu   32  Bangalore      India
    e   John   16   New York         US
    f   Mike   17  las vegas         US
Select a few rows from Dataframe
Here we will provide only row segment argument to the Dataframe.loc[]. Therefore it will select rows based on given names and all columns.
Select a single row of Dataframe
To select a row from the dataframe, pass the row name to the loc[]. For example,
    # Select the row with label name 'c'
    row = df.loc['c']

    print(row)
Output:
    Name        Vikas
    Age            31
    City       Mumbai
    Country     India
    Name: c, dtype: object
It returned the row with label name ‘c’ from the Dataframe, as a Series object.
Select multiple rows from Dataframe based on list of names
Pass a list of row label names to the row_segment of loc[]. It will return a subset of the Dataframe containing only mentioned rows. For example,
    # Select multiple rows from Dataframe by label names
    subsetDf = df.loc[['c', 'f', 'a']]

    print(subsetDf)
Output:
        Name  Age       City    Country
    c  Vikas   31     Mumbai      India
    f   Mike   17  las vegas         US
    a   jack   34     Sydeny  Australia
It returned a subset of the Dataframe containing only three rows with labels ‘c’, ‘f’ and ‘a’.
Select multiple rows from Dataframe based on name range
Pass a name range -> start:end in the row segment of loc[]. It will return a subset of the Dataframe containing the rows from name start to name end of the original dataframe, with both end labels included. For example,
    # Select rows of Dataframe based on row label range
    subsetDf = df.loc['b':'f']

    print(subsetDf)
Output:
        Name  Age       City  Country
    b   Riti   30      Delhi    India
    c  Vikas   31     Mumbai    India
    d  Neelu   32  Bangalore    India
    e   John   16   New York       US
    f   Mike   17  las vegas       US
It returned a subset of the Dataframe containing only five rows from the original dataframe i.e. rows from label ‘b’ to label ‘f’.
Select rows of Dataframe based on bool array
Pass a boolean array/list in the row segment of loc[]. It will return a subset of the Dataframe containing only the rows for which the corresponding value in the boolean array/list is True. For example,
    # Select rows of Dataframe based on bool array
    subsetDf = df.loc[[True, False, True, False, True, False]]

    print(subsetDf)
Output:
        Name  Age      City    Country
    a   jack   34    Sydeny  Australia
    c  Vikas   31    Mumbai      India
    e   John   16  New York         US
Select rows of Dataframe based on Callable function
Create a lambda function that accepts a dataframe as an argument, applies a condition on a column, and returns a bool list. This bool list will contain True only for those rows where the condition is True. Pass that lambda function to loc[], and only those rows will be selected for which the condition is True in the list.
For example, select only those rows where column ‘Age’ has a value of more than 25,
    # Select rows of Dataframe based on callable function
    subsetDf = df.loc[lambda x: (x['Age'] > 25).tolist()]

    print(subsetDf)
Output:
        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c  Vikas   31     Mumbai      India
    d  Neelu   32  Bangalore      India
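As a side note to the callable example, the conversion to a list is not strictly needed: a callable may return the boolean Series itself, and the same Series can also be passed to loc[] directly. A short sketch using the article's data (the names by_callable and by_mask are only illustrative):

```python
import pandas as pd

students = [('jack', 34, 'Sydeny', 'Australia'),
            ('Riti', 30, 'Delhi', 'India'),
            ('Vikas', 31, 'Mumbai', 'India'),
            ('Neelu', 32, 'Bangalore', 'India'),
            ('John', 16, 'New York', 'US'),
            ('Mike', 17, 'las vegas', 'US')]
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b', 'c', 'd', 'e', 'f'])

# Callable that returns a boolean Series instead of a list
by_callable = df.loc[lambda x: x['Age'] > 25]

# The same boolean Series passed directly as the row segment
by_mask = df.loc[df['Age'] > 25]

print(by_callable.equals(by_mask))   # True
```

The callable form is mostly useful in method chains, where the intermediate dataframe has no name of its own.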
Select a few Columns from Dataframe
Here we will provide the ( : ) in the row segment argument of the Dataframe.loc[]. Therefore it will select all rows, but only a few columns based on the names provided in column_segment.
Select a single column of Dataframe
To select a column from the dataframe, pass ( : ) in the row segment and the column name in the column segment of loc[]. For example,
    # Select single column from Dataframe by column name
    column = df.loc[:, 'Age']

    print(column)
Output:
    a    34
    b    30
    c    31
    d    32
    e    16
    f    17
    Name: Age, dtype: int64
It returned the column ‘Age’ from Dataframe, as a Series object.
Select multiple columns from Dataframe based on list of names
Pass a list of column names to the column_segment of loc[]. It will return a subset of the Dataframe containing only mentioned columns. For example,
    # Select multiple columns from Dataframe based on list of names
    subsetDf = df.loc[:, ['Age', 'City', 'Name']]

    print(subsetDf)
Output:
       Age       City   Name
    a   34     Sydeny   jack
    b   30      Delhi   Riti
    c   31     Mumbai  Vikas
    d   32  Bangalore  Neelu
    e   16   New York   John
    f   17  las vegas   Mike
It returned a subset of the Dataframe containing only three columns.
Select multiple columns from Dataframe based on name range
Pass a name range -> start:end in the column segment of loc[]. It will return a subset of the Dataframe containing the columns from name start to name end of the original dataframe, with both end labels included. For example,
    # Select multiple columns from Dataframe by name range
    subsetDf = df.loc[:, 'Name':'City']

    print(subsetDf)
Output:
        Name  Age       City
    a   jack   34     Sydeny
    b   Riti   30      Delhi
    c  Vikas   31     Mumbai
    d  Neelu   32  Bangalore
    e   John   16   New York
    f   Mike   17  las vegas
It returned a subset of the Dataframe containing only three columns, i.e., ‘Name’ to ‘City’.
Select columns of Dataframe based on bool array
Pass a boolean array/list in the column segment of loc[]. It will return a subset of the Dataframe containing only the columns for which the corresponding value in the boolean array/list is True. For example,
    # Select columns of Dataframe based on bool array
    subsetDf = df.loc[:, [True, True, False, False]]

    print(subsetDf)
Output:
        Name  Age
    a   jack   34
    b   Riti   30
    c  Vikas   31
    d  Neelu   32
    e   John   16
    f   Mike   17
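In practice the boolean array for columns is usually computed rather than typed by hand. A small sketch, assuming we want every column whose name starts with ‘C’ (this condition is chosen only for illustration):

```python
import pandas as pd

students = [('jack', 34, 'Sydeny', 'Australia'),
            ('Riti', 30, 'Delhi', 'India'),
            ('Vikas', 31, 'Mumbai', 'India')]
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b', 'c'])

# Boolean array with one entry per column: True where the name starts with 'C'
mask = df.columns.str.startswith('C')

# Use the computed array in the column segment
subsetDf = df.loc[:, mask]

print(list(subsetDf.columns))   # ['City', 'Country']
```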
Select a subset of Dataframe
Here we will provide the row and column segment arguments of the Dataframe.loc[]. It will return a subset of Dataframe based on the row and column names provided in row and column segments of loc[].
Select a Cell value from Dataframe
To select a single cell value from the dataframe, just pass the row and column name in the row and column segment of loc[]. For example,
    # Select a Cell value from Dataframe by row and column name
    cellValue = df.loc['c', 'Name']

    print(cellValue)
Output:
Vikas
It returned the cell value at (‘c’, ‘Name’).
Select subset of Dataframe based on row/column names in list
Select a subset of the dataframe. This subset should include the following rows and columns,
- Rows with names ‘b’, ‘d’ and ‘f’
- Columns with name ‘Name’ and ‘City’
    # Select subset of Dataframe based on row/column names in list
    subsetDf = df.loc[['b', 'd', 'f'], ['Name', 'City']]

    print(subsetDf)
Output:
        Name       City
    b   Riti      Delhi
    d  Neelu  Bangalore
    f   Mike  las vegas
It returned a subset containing only the rows ‘b’, ‘d’ and ‘f’ with the columns ‘Name’ and ‘City’.
Select subset of Dataframe based on row/column name range
Select a subset of the dataframe. This subset should include the following rows and columns,
- Rows from name ‘b’ to ‘e’
- Columns from name ‘Name’ to ‘City’
    # Select subset of Dataframe based on row and column label name range.
    subsetDf = df.loc['b':'e', 'Name':'City']

    print(subsetDf)
Output:
        Name  Age       City
    b   Riti   30      Delhi
    c  Vikas   31     Mumbai
    d  Neelu   32  Bangalore
    e   John   16   New York
It returned the rows from ‘b’ to ‘e’ and the columns from ‘Name’ to ‘City’, with both end labels included in each range.
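The row and column segments can also mix formats. As a sketch, a boolean condition in the row segment can be combined with a list of names in the column segment; the condition on ‘Country’ is chosen only for illustration.

```python
import pandas as pd

students = [('jack', 34, 'Sydeny', 'Australia'),
            ('Riti', 30, 'Delhi', 'India'),
            ('Vikas', 31, 'Mumbai', 'India'),
            ('Neelu', 32, 'Bangalore', 'India'),
            ('John', 16, 'New York', 'US'),
            ('Mike', 17, 'las vegas', 'US')]
df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b', 'c', 'd', 'e', 'f'])

# Rows where Country is 'India', restricted to the columns 'Name' and 'City'
subsetDf = df.loc[df['Country'] == 'India', ['Name', 'City']]

print(list(subsetDf.index))   # ['b', 'c', 'd']
```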
Pro Tip: Changing the values of Dataframe using loc[]
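Assigning a value through loc[] writes directly into the original Dataframe object. Note that a subset stored in a separate variable is not guaranteed to be a view of the original, so changes made to that variable may not be reflected in the original Dataframe. For example, let’s select the row with label ‘c’ from the dataframe using loc[] and change its content,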
    print(df)

    # Change the contents of row 'c' to 0
    df.loc['c'] = 0

    print(df)
Output:
        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c  Vikas   31     Mumbai      India
    d  Neelu   32  Bangalore      India
    e   John   16   New York         US
    f   Mike   17  las vegas         US

        Name  Age       City    Country
    a   jack   34     Sydeny  Australia
    b   Riti   30      Delhi      India
    c      0    0          0          0
    d  Neelu   32  Bangalore      India
    e   John   16   New York         US
    f   Mike   17  las vegas         US
Values assigned through loc[] on the dataframe itself change the content of the original dataframe.
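The difference between assigning through loc[] and editing a separately stored subset can be sketched as below. The frame is hypothetical, and calling copy() on the subset is an assumption made here to keep the behaviour explicit and independent of the pandas version.

```python
import pandas as pd

# Hypothetical frame, used only in this sketch
df = pd.DataFrame({"Name": ["jack", "Riti", "Vikas"], "Age": [34, 30, 31]},
                  index=["a", "b", "c"])

# Assigning through loc[] on the dataframe itself updates the original
df.loc["b", "Age"] = 99

# An explicit copy of a subset is independent of the original
subset = df.loc[["a", "b"]].copy()
subset.loc["a", "Age"] = 0

print(df.loc["b", "Age"], df.loc["a", "Age"])   # 99 34
```

When the original should change, assign through df.loc[...] in a single step instead of editing a stored subset.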
Summary:
We learned about how to use the Dataframe.loc[] with several examples.