Pandas Tutorial Part #14 – Sorting DataFrame

 2 years ago
source link: https://thispointer.com/pandas-tutorial-part-x-sorting-dataframe/
This tutorial discusses different ways to sort a DataFrame, either row-wise or column-wise.

First of all, we will create a DataFrame from a list of tuples:

import pandas as pd

# List of Tuples
employees = [(11, 'Jack',    44, 'Sydney',   19),
             (12, 'Riti',    41, 'Delhi',    17),
             (13, 'Aadi',    46, 'New York', 11),
             (14, 'Mohit',   45, 'Delhi',    15),
             (15, 'Veena',   43, 'Delhi',    14),
             (16, 'Shaunak', 42, 'Mumbai',   10),
             (17, 'Shaun',   40, 'Colombo',  12)]

# Create a DataFrame object with custom row index labels
df = pd.DataFrame(employees,
                  columns=['ID', 'Name', 'Age', 'City', 'Experience'],
                  index=['b', 'd', 'a', 'c', 'g', 'f', 'e'])

# Display the DataFrame
print(df)

Output:

   ID     Name  Age      City  Experience
b  11     Jack   44    Sydney          19
d  12     Riti   41     Delhi          17
a  13     Aadi   46  New York          11
c  14    Mohit   45     Delhi          15
g  15    Veena   43     Delhi          14
f  16  Shaunak   42    Mumbai          10
e  17    Shaun   40   Colombo          12

This DataFrame has seven rows and five columns. Now let’s see how we can sort this DataFrame based on its values or labels.

Sort all rows of DataFrame by a column

In Pandas, the DataFrame provides the sort_values() method, which sorts the DataFrame by values along the given axis. To sort the DataFrame created above by the column ‘Experience’, pass the column name in a list to the by parameter, i.e., df.sort_values(by=['Experience']). This returns a DataFrame whose rows are sorted by the values in the ‘Experience’ column. For example,

# Sort DataFrame by column 'Experience'
df = df.sort_values(by=['Experience'])

# Display the DataFrame
print(df)

Output:

   ID     Name  Age      City  Experience
f  16  Shaunak   42    Mumbai          10
a  13     Aadi   46  New York          11
e  17    Shaun   40   Colombo          12
g  15    Veena   43     Delhi          14
c  14    Mohit   45     Delhi          15
d  12     Riti   41     Delhi          17
b  11     Jack   44    Sydney          19

It sorted the DataFrame along the ‘index’ axis, i.e., it reordered the rows by the values in the column ‘Experience’. In this example, we sorted the DataFrame by a numeric column. We can also sort a DataFrame by a string column. For example,

# Sort DataFrame by column 'Name'
df = df.sort_values(by=['Name'])

# Display the DataFrame
print(df)

Output:

   ID     Name  Age      City  Experience
a  13     Aadi   46  New York          11
b  11     Jack   44    Sydney          19
c  14    Mohit   45     Delhi          15
d  12     Riti   41     Delhi          17
e  17    Shaun   40   Colombo          12
f  16  Shaunak   42    Mumbai          10
g  15    Veena   43     Delhi          14

It sorted the DataFrame by the column ‘Name’. This column contains string values; therefore, the sort_values() method sorted the rows of the DataFrame in alphabetical order of the ‘Name’ values.
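
The examples above assign the result back to df because sort_values() returns a new DataFrame and, by default, leaves the original untouched. The following is a minimal sketch of that behavior, assuming the same sample data as in this tutorial; the name sorted_df is only illustrative.

```python
import pandas as pd

# Same sample data as used throughout this tutorial
employees = [(11, 'Jack',    44, 'Sydney',   19),
             (12, 'Riti',    41, 'Delhi',    17),
             (13, 'Aadi',    46, 'New York', 11),
             (14, 'Mohit',   45, 'Delhi',    15),
             (15, 'Veena',   43, 'Delhi',    14),
             (16, 'Shaunak', 42, 'Mumbai',   10),
             (17, 'Shaun',   40, 'Colombo',  12)]

df = pd.DataFrame(employees,
                  columns=['ID', 'Name', 'Age', 'City', 'Experience'],
                  index=['b', 'd', 'a', 'c', 'g', 'f', 'e'])

# sort_values() returns a new, sorted DataFrame (sorted_df is an illustrative name)
sorted_df = df.sort_values(by=['Name'])

# Row labels of the sorted copy, in alphabetical order of 'Name'
print(sorted_df.index.tolist())

# Row labels of the original DataFrame, still in creation order
print(df.index.tolist())
```

If the sorted result is not assigned to a variable, it is simply discarded and df keeps its original row order.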

Sort all rows of DataFrame by a column in Descending Order

To sort the DataFrame in descending order, pass the argument ascending=False to the sort_values() method. For example,

# Sort DataFrame by column 'Experience' in descending order
df = df.sort_values(by=['Experience'], ascending=False)

# Display the DataFrame
print(df)

Output:

   ID     Name  Age      City  Experience
b  11     Jack   44    Sydney          19
d  12     Riti   41     Delhi          17
c  14    Mohit   45     Delhi          15
g  15    Veena   43     Delhi          14
e  17    Shaun   40   Colombo          12
a  13     Aadi   46  New York          11
f  16  Shaunak   42    Mumbai          10

It sorted all the rows of DataFrame along the column ‘Experience’ in descending order.
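
Since by takes a list of columns, ascending can also be given as a list of booleans, one per column. The sketch below, which assumes the same sample data, sorts by ‘City’ in ascending order and breaks ties by ‘Experience’ in descending order; the name by_city is only illustrative.

```python
import pandas as pd

# Same sample data as used throughout this tutorial
employees = [(11, 'Jack',    44, 'Sydney',   19),
             (12, 'Riti',    41, 'Delhi',    17),
             (13, 'Aadi',    46, 'New York', 11),
             (14, 'Mohit',   45, 'Delhi',    15),
             (15, 'Veena',   43, 'Delhi',    14),
             (16, 'Shaunak', 42, 'Mumbai',   10),
             (17, 'Shaun',   40, 'Colombo',  12)]

df = pd.DataFrame(employees,
                  columns=['ID', 'Name', 'Age', 'City', 'Experience'],
                  index=['b', 'd', 'a', 'c', 'g', 'f', 'e'])

# 'City' ascending; within the same city, 'Experience' descending.
# The ascending list must have the same length as the by list.
by_city = df.sort_values(by=['City', 'Experience'],
                         ascending=[True, False])

print(by_city[['City', 'Experience']])
```

The three ‘Delhi’ rows end up next to each other, ordered from the most to the least experienced employee.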

Sort DataFrame by row index labels

In Pandas, the DataFrame provides a method sort_index(), and it sorts the DataFrame by index labels along the given axis. By default, it sorts the rows of DataFrame based on row index labels. For example,

# Sort DataFrame by the Row Index labels
df = df.sort_index()

# Display the DataFrame
print(df)

Output:

   ID     Name  Age      City  Experience
a  13     Aadi   46  New York          11
b  11     Jack   44    Sydney          19
c  14    Mohit   45     Delhi          15
d  12     Riti   41     Delhi          17
e  17    Shaun   40   Colombo          12
f  16  Shaunak   42    Mumbai          10
g  15    Veena   43     Delhi          14

It sorted all the rows of DataFrame by the row index labels.
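
Like sort_values(), sort_index() accepts an ascending argument. As a small sketch under the same sample data, the rows can be ordered by index label in reverse; the name reversed_df is only illustrative.

```python
import pandas as pd

# Same sample data as used throughout this tutorial
employees = [(11, 'Jack',    44, 'Sydney',   19),
             (12, 'Riti',    41, 'Delhi',    17),
             (13, 'Aadi',    46, 'New York', 11),
             (14, 'Mohit',   45, 'Delhi',    15),
             (15, 'Veena',   43, 'Delhi',    14),
             (16, 'Shaunak', 42, 'Mumbai',   10),
             (17, 'Shaun',   40, 'Colombo',  12)]

df = pd.DataFrame(employees,
                  columns=['ID', 'Name', 'Age', 'City', 'Experience'],
                  index=['b', 'd', 'a', 'c', 'g', 'f', 'e'])

# Sort the rows by index label, from 'g' down to 'a'
reversed_df = df.sort_index(ascending=False)

print(reversed_df)
```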

Sort DataFrame by column names

Pass the axis=1 argument in the sort_index() method of DataFrame. It will sort the DataFrame by the column names. For example,

# Sort DataFrame by the Column Names
df = df.sort_index(axis=1)

# Display the DataFrame
print(df)

Output:

   Age      City  Experience  ID     Name
b   44    Sydney          19  11     Jack
d   41     Delhi          17  12     Riti
a   46  New York          11  13     Aadi
c   45     Delhi          15  14    Mohit
g   43     Delhi          14  15    Veena
f   42    Mumbai          10  16  Shaunak
e   40   Colombo          12  17    Shaun

It sorted the columns of the DataFrame alphabetically by column name. The order of the rows is not changed by this call.
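
To check that axis=1 only reorders the columns, the column and row labels can be compared before and after the call. This is a minimal sketch assuming the same sample data; the name by_columns is only illustrative.

```python
import pandas as pd

# Same sample data as used throughout this tutorial
employees = [(11, 'Jack',    44, 'Sydney',   19),
             (12, 'Riti',    41, 'Delhi',    17),
             (13, 'Aadi',    46, 'New York', 11),
             (14, 'Mohit',   45, 'Delhi',    15),
             (15, 'Veena',   43, 'Delhi',    14),
             (16, 'Shaunak', 42, 'Mumbai',   10),
             (17, 'Shaun',   40, 'Colombo',  12)]

df = pd.DataFrame(employees,
                  columns=['ID', 'Name', 'Age', 'City', 'Experience'],
                  index=['b', 'd', 'a', 'c', 'g', 'f', 'e'])

# Sort along axis=1, i.e. reorder the columns by their names
by_columns = df.sort_index(axis=1)

# Column names are now in alphabetical order
print(by_columns.columns.tolist())

# Row labels are in the same order as in df
print(by_columns.index.tolist())
```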

Summary

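We learned how to sort a DataFrame in Pandas: by column values with sort_values(), in ascending or descending order, and by row index labels or column names with sort_index().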
