Pandas Tutorial Part #10 – Add/Remove DataFrame Rows & Columns

 2 years ago
source link: https://thispointer.com/pandas-tutorial-part-10-add-remove-modify-dataframe/
In this tutorial, we will learn how to add a new row or column to a DataFrame and change the values of existing rows and columns.

First of all, we will create a DataFrame, and then we will discuss how to add or modify its rows and columns,

import pandas as pd

# List of Tuples
students = [('jack',    34, 'Sydney',   'Australia'),
            ('Riti',    30, 'Delhi',    'India'),
            ('Vikas',   31, 'Mumbai',   'India'),
            ('Neelu',   32, 'Bangalore','India'),
            ('John',    16, 'New York',  'US'),
            ('Mike',    17, 'Las Vegas', 'US')]

# Create a DataFrame object
df = pd.DataFrame( students,
                   columns=['Name', 'Age', 'City', 'Country'],
                   index=  ['a', 'b', 'c', 'd', 'e', 'f'])

# Display the DataFrame
print(df)

Output

    Name  Age       City    Country
a   jack   34     Sydney  Australia
b   Riti   30      Delhi      India
c  Vikas   31     Mumbai      India
d  Neelu   32  Bangalore      India
e   John   16   New York         US
f   Mike   17  Las Vegas         US

This DataFrame contains four columns and six rows.
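The row and column counts can be checked with the shape attribute. The following is a minimal sketch that rebuilds the same DataFrame so that it runs on its own; the names n_rows and n_cols are chosen here only for illustration.

```python
import pandas as pd

# Same data as in the tutorial's DataFrame
students = [('jack',  34, 'Sydney',    'Australia'),
            ('Riti',  30, 'Delhi',     'India'),
            ('Vikas', 31, 'Mumbai',    'India'),
            ('Neelu', 32, 'Bangalore', 'India'),
            ('John',  16, 'New York',  'US'),
            ('Mike',  17, 'Las Vegas', 'US')]

df = pd.DataFrame(students,
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b', 'c', 'd', 'e', 'f'])

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```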

Add a column to the DataFrame

To add a new column to the DataFrame, pass the column name to the subscript operator ([]) of the DataFrame and assign the new values to it. Let’s see an example, where we add a new column ‘Budget’ to the DataFrame created above,

# Add a new column to the DataFrame
df['Budget'] = [2000, 3000, 4000, 3500, 4500, 2900]

# Display the DataFrame
print(df)

Output

    Name  Age       City    Country  Budget
a   jack   34     Sydney  Australia    2000
b   Riti   30      Delhi      India    3000
c  Vikas   31     Mumbai      India    4000
d  Neelu   32  Bangalore      India    3500
e   John   16   New York         US    4500
f   Mike   17  Las Vegas         US    2900

Each value in the list was assigned to the corresponding row of the new column. The list must therefore contain one value per row.
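One caveat with list assignment is that the list needs one value per row. The sketch below, on a small three-row DataFrame made up for illustration, assigns a list that is too short and catches the resulting ValueError; the names error_message and column_added are hypothetical helpers, not part of the tutorial.

```python
import pandas as pd

# Small illustrative DataFrame with three rows
df = pd.DataFrame({'Name': ['jack', 'Riti', 'Vikas'],
                   'Age':  [34, 30, 31]},
                  index=['a', 'b', 'c'])

error_message = None
try:
    # Only two values for three rows, so the assignment is rejected
    df['Budget'] = [2000, 3000]
except ValueError as err:
    error_message = str(err)
    print('Could not add column:', error_message)

# The failed assignment leaves the DataFrame unchanged
column_added = 'Budget' in df.columns
print(column_added)
```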

Add a new column with the same values

To add a new column to the DataFrame with the same value in each row, pass the column name to the subscript operator ([]) of the DataFrame and assign a scalar value. For example,

# Add a new column to the DataFrame
df['Marks'] = 0

# Display the DataFrame
print(df)

Output

    Name  Age       City    Country  Budget  Marks
a   jack   34     Sydney  Australia    2000      0
b   Riti   30      Delhi      India    3000      0
c  Vikas   31     Mumbai      India    4000      0
d  Neelu   32  Bangalore      India    3500      0
e   John   16   New York         US    4500      0
f   Mike   17  Las Vegas         US    2900      0

It added a new column ‘Marks’ to the DataFrame with the same value, zero, in each row.
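The subscript assignment above modifies the DataFrame in place. If the original should stay untouched, one possible alternative (not covered in this tutorial) is DataFrame.assign(), which returns a new DataFrame with the extra column. A minimal sketch on a two-row DataFrame made up for illustration:

```python
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'Name': ['jack', 'Riti'],
                   'Age':  [34, 30]},
                  index=['a', 'b'])

# assign() returns a copy with the new column; the scalar is
# repeated for every row, just like df['Marks'] = 0
with_marks = df.assign(Marks=0)

print(with_marks)
print(list(df.columns))  # the original DataFrame has no 'Marks' column
```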

Changing values of an existing column

If the column name passed to the subscript operator ([]) of the DataFrame already exists, the assignment replaces the values of that column. For example, let’s change the values of column ‘Age’,

# Change the values of a column
df['Age'] = [31, 35, 36, 34, 31, 37]

# Display the DataFrame
print(df)

Output

    Name  Age       City    Country  Budget  Marks
a   jack   31     Sydney  Australia    2000      0
b   Riti   35      Delhi      India    3000      0
c  Vikas   36     Mumbai      India    4000      0
d  Neelu   34  Bangalore      India    3500      0
e   John   31   New York         US    4500      0
f   Mike   37  Las Vegas         US    2900      0

As the column ‘Age’ already exists in the DataFrame, all of its values were replaced.
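The new values do not have to be a literal list; they can also be computed from the existing column. The following sketch, on a small DataFrame made up for illustration, increases every age by one year.

```python
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame({'Name': ['jack', 'Riti'],
                   'Age':  [34, 30]},
                  index=['a', 'b'])

# The right-hand side is evaluated first, then the result
# replaces the existing 'Age' column
df['Age'] = df['Age'] + 1

print(df)
```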

Add a new Row to the DataFrame

To add a new row to the DataFrame, pass the new row’s index label to the loc[] indexer of the DataFrame and assign the row values. For example,

# Add a new Row to the DataFrame
df.loc['g'] = ['Aadi', 35, 'Delhi', 'India']

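The list has four values, so it matches the original four-column DataFrame, not the one extended with ‘Budget’ and ‘Marks’ above. The complete example therefore creates the DataFrame afresh and then adds a new row to it,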

import pandas as pd

# List of Tuples
students = [('jack',    34, 'Sydney',   'Australia'),
            ('Riti',    30, 'Delhi',    'India'),
            ('Vikas',   31, 'Mumbai',   'India'),
            ('Neelu',   32, 'Bangalore','India'),
            ('John',    16, 'New York',  'US'),
            ('Mike',    17, 'Las Vegas', 'US')]

# Create a DataFrame object
df = pd.DataFrame( students,
                   columns=['Name', 'Age', 'City', 'Country'],
                   index=  ['a', 'b', 'c', 'd', 'e', 'f'])

# Display the DataFrame
print(df)

# Add a new Row to the DataFrame
df.loc['g'] = ['Aadi', 35, 'Delhi', 'India']

# Display the DataFrame
print(df)

Output

    Name  Age       City    Country
a   jack   34     Sydney  Australia
b   Riti   30      Delhi      India
c  Vikas   31     Mumbai      India
d  Neelu   32  Bangalore      India
e   John   16   New York         US
f   Mike   17  Las Vegas         US



    Name  Age       City    Country
a   jack   34     Sydney  Australia
b   Riti   30      Delhi      India
c  Vikas   31     Mumbai      India
d  Neelu   32  Bangalore      India
e   John   16   New York         US
f   Mike   17  Las Vegas         US
g   Aadi   35      Delhi      India

It added a new row with the index label ‘g’, and the list values became the values of the new row. Make sure that the number of items in the list equals the number of columns in the DataFrame; otherwise, pandas raises a ValueError like this,

raise ValueError("cannot set a row with mismatched columns")
ValueError: cannot set a row with mismatched columns
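To see how this error could be handled, here is a minimal sketch on a two-row DataFrame made up for illustration. It tries to add a row with only three values to a four-column DataFrame and catches the ValueError; the names new_row and row_added are hypothetical.

```python
import pandas as pd

# Small illustrative DataFrame with four columns
df = pd.DataFrame([('jack', 34, 'Sydney', 'Australia'),
                   ('Riti', 30, 'Delhi',  'India')],
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b'])

# Only three values for four columns
new_row = ['Aadi', 35, 'Delhi']

row_added = False
try:
    df.loc['g'] = new_row
    row_added = True
except ValueError as err:
    print('Could not add row:', err)

print(row_added)
```

Comparing len(new_row) with len(df.columns) before the assignment is another way to avoid the error.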

Add a new Row with the same values

Instead of passing a sequence, we can also assign a scalar value to df.loc[row_name]. It will add a new row with the same value in every column. For example,

# Add a new Row to the DataFrame
df.loc['h'] = 0

# Display the DataFrame
print(df)

Output

    Name  Age       City    Country
a   jack   34     Sydney  Australia
b   Riti   30      Delhi      India
c  Vikas   31     Mumbai      India
d  Neelu   32  Bangalore      India
e   John   16   New York         US
f   Mike   17  Las Vegas         US
g   Aadi   35      Delhi      India
h      0    0          0          0

It added a new row with the index label ‘h’, and all the values in the new row are 0.

Changing the existing row values

If the row index label passed to the loc[] indexer of the DataFrame already exists, the assignment replaces the contents of that row. For example, let’s change the values of row ‘b’,

# Change the values of existing row
df.loc['b'] = ['Justin', 45, 'Tokyo', 'Japan']

# Display the DataFrame
print(df)

Output

     Name  Age       City    Country
a    jack   34     Sydney  Australia
b  Justin   45      Tokyo      Japan
c   Vikas   31     Mumbai      India
d   Neelu   32  Bangalore      India
e    John   16   New York         US
f    Mike   17  Las Vegas         US
g    Aadi   35      Delhi      India
h       0    0          0          0

As row ‘b’ already exists in the DataFrame, all of its values were replaced.
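The same loc[] indexer can also change a single cell rather than a whole row, by passing both a row label and a column name. This goes slightly beyond the examples above and is shown only as a sketch, on a small DataFrame made up for illustration.

```python
import pandas as pd

# Small illustrative DataFrame
df = pd.DataFrame([('jack', 34, 'Sydney', 'Australia'),
                   ('Riti', 30, 'Delhi',  'India')],
                  columns=['Name', 'Age', 'City', 'Country'],
                  index=['a', 'b'])

# Change only the 'Age' value in row 'b'; other cells stay as they are
df.loc['b', 'Age'] = 46

print(df)
```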

Summary:

We learned how to add new rows and columns to a Pandas DataFrame, and how to change the values of existing rows and columns.

