Read CSV file into a NumPy Array in Python

 1 year ago
source link: https://thispointer.com/read-csv-file-into-a-numpy-array-in-python/
In this article, we will learn how to read a CSV file into a NumPy array in Python.

What is a CSV File?

A CSV file is a comma-separated values file. The CSV format stores tabular data as plain text: each line is one row, and the values within a row are separated by commas.

Suppose we have a CSV file, data.csv, with the following contents:

1,2,3,4,5
6,7,8,9,0
2,3,4,5,6
4,5,6,7,7

Now we want to load this CSV file into a NumPy Array.
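To follow along, the sample file can be created from Python. This is a minimal sketch that writes the rows 1,2,3,4,5 through 4,5,6,7,7 to data.csv in the current working directory; the variable names sample_rows and csv_path are illustrative and not part of the original article.

```python
from pathlib import Path

# Sample rows used throughout the examples (illustrative variable name).
sample_rows = [
    "1,2,3,4,5",
    "6,7,8,9,0",
    "2,3,4,5,6",
    "4,5,6,7,7",
]

# Write the rows as plain text, one CSV row per line.
csv_path = Path("data.csv")
csv_path.write_text("\n".join(sample_rows) + "\n")

# Read the file back to confirm that it is ordinary comma-separated text.
print(csv_path.read_text())
```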

There are multiple ways to read a CSV file into a NumPy array in Python. Let's discuss these methods one by one, each with an approach and a working code example.

Read CSV File into a NumPy Array using loadtxt()

The numpy module has a loadtxt() function, which loads data from a text file. Each row in the text file must have the same number of values.

Syntax of loadtxt() function

numpy.loadtxt(fname, delimiter, skiprows)
  • Parameters:
    • fname = Name or path of the file to be loaded.
    • delimiter = The string used to separate values. By default, any whitespace acts as the delimiter.
    • skiprows = The number of rows to skip at the start of the file.
  • Returns:
    • The data read from the file, as a NumPy array.

Approach:

  1. Import the numpy library.
  2. Pass the path of the CSV file, with a comma (,) as the delimiter, to the loadtxt() function.
  3. Print the array returned by loadtxt().

Source Code

import numpy as np

# Reading csv file into numpy array 
arr = np.loadtxt("data.csv", delimiter=",")

# printing the array
print(arr)

Output:

[[1. 2. 3. 4. 5.]
 [6. 7. 8. 9. 0.]
 [2. 3. 4. 5. 6.]
 [4. 5. 6. 7. 7.]]
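The skiprows parameter is useful when the file starts with a header line. The following is a minimal sketch under stated assumptions: the CSV text with the header a,b,c is made up for illustration and is read from an in-memory io.StringIO object instead of a file on disk, and dtype=int is passed so that the values are not converted to floats.

```python
import io

import numpy as np

# Hypothetical CSV content with one header line (not the article's data.csv).
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# skiprows=1 skips the header line; dtype=int keeps the values as integers.
arr = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1, dtype=int)

print(arr)
```

Without skiprows=1, loadtxt() would try to convert the header text to numbers and fail.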

Read CSV File into a NumPy Array using genfromtxt()

The numpy module has a genfromtxt() function, which also loads data from a text file. Unlike loadtxt(), genfromtxt() can handle rows with missing values as specified.

Syntax of genfromtxt() function

numpy.genfromtxt(fname, delimiter)
  • Parameters:
    • fname = Name or path of the file to be loaded.
    • delimiter = The string used to separate values. By default, any whitespace acts as the delimiter.
  • Returns:
    • The data read from the file, as a NumPy array.

Approach:

  1. Import the numpy library.
  2. Pass the path of the CSV file, with a comma (,) as the delimiter, to the genfromtxt() function.
  3. Print the array returned by genfromtxt().

Source Code

import numpy as np

# Reading csv file into numpy array 
arr = np.genfromtxt("data.csv", delimiter=",")

# printing the array
print(arr)

Output:

[[1. 2. 3. 4. 5.]
 [6. 7. 8. 9. 0.]
 [2. 3. 4. 5. 6.]
 [4. 5. 6. 7. 7.]]
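The sample file has no gaps, so the output above does not show how genfromtxt() treats missing values. The sketch below uses made-up CSV text with two empty fields, read from an in-memory io.StringIO object. By default the empty fields become nan; the filling_values argument replaces them with a chosen value (0 here, an arbitrary choice for illustration).

```python
import io

import numpy as np

# Hypothetical CSV content with two missing fields (not the article's data.csv).
csv_text = "1,2,3\n4,,6\n7,8,\n"

# Default behaviour: missing fields are read as nan.
arr = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# filling_values substitutes a value for every missing field.
filled = np.genfromtxt(io.StringIO(csv_text), delimiter=",", filling_values=0)

print(arr)
print(filled)
```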

Read CSV File into a NumPy Array using read_csv()

The pandas module has a read_csv() function that reads a comma-separated values (CSV) file into a DataFrame. The values property of the DataFrame then gives the data as a NumPy array.

Syntax of read_csv() function

pandas.read_csv(filepath_or_buffer, sep, header)
  • Parameters:
    • filepath_or_buffer = Name or path of the CSV file to be loaded.
    • sep = The string used to separate values, i.e. the delimiter. By default the delimiter is a comma (,).
    • header = The row number to use for the column names. Pass None when the file has no header row.
  • Returns:
    • Returns a DataFrame.

Approach:

  1. Import the pandas and numpy libraries.
  2. Pass the path of the CSV file, with header set to None, to the read_csv() function.
  3. Use the values property of the DataFrame to get a NumPy array from it.
  4. Print the NumPy array.

Source Code

import numpy as np
import pandas as pd

# Reading csv file into numpy array
arr = pd.read_csv('data.csv', header=None).values

# printing the array
print(arr)

Output:

[[1 2 3 4 5]
 [6 7 8 9 0]
 [2 3 4 5 6]
 [4 5 6 7 7]]
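Note that the array above holds integers, whereas loadtxt() and genfromtxt() return floats by default. As a hedged alternative to the values property, a DataFrame also offers the to_numpy() method, which accepts a dtype argument. The sketch below uses made-up CSV text read from an in-memory io.StringIO object.

```python
import io

import pandas as pd

# Hypothetical CSV content without a header row (not the article's data.csv).
csv_text = "1,2,3\n4,5,6\n"

df = pd.read_csv(io.StringIO(csv_text), header=None)

# to_numpy() returns the DataFrame's data as a NumPy array.
arr = df.to_numpy()

# Passing dtype converts the values, here to floats.
arr_float = df.to_numpy(dtype=float)

print(arr)
print(arr_float)
```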

Read CSV File into a NumPy Array using file handling and fromstring()

Python supports file handling and provides built-in functions for reading and writing files. The numpy module provides the fromstring() function, which creates a NumPy array from a string. To convert a CSV file into a NumPy array this way, open the file, convert each row into a NumPy array with fromstring(), and then combine the per-row arrays into a single array.

Syntax of fromstring() function

numpy.fromstring(string, sep)
  • Parameters:
    • string = A string containing the data.
    • sep = The string separating numbers in the data. It must be passed explicitly when parsing text, e.g. sep="," for a CSV row.
  • Returns:
    • Returns a NumPy array.

Syntax of open() function

open(file, mode)
  • Parameters:
    • file = Name or path of the file to be opened.
    • mode = The access mode in which the file is opened. By default the file is opened in read mode.
  • Returns:
    • Returns a file object.

Approach:

  1. Import the numpy library.
  2. Open the CSV file in read mode and read it row by row.
  3. Pass each row of the CSV file, with sep=",", to the fromstring() function.
  4. Append the NumPy array returned by fromstring() to a list.
  5. Repeat steps 3 and 4 until the last row of the CSV file.
  6. Convert the list into a NumPy array and print it.

Source Code

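import numpy as np

# Read the CSV file row by row; the with block closes the file afterwards
rows = []
with open('data.csv') as file_data:
    for row in file_data:
        # Parse one comma-separated line into a 1-D array of floats
        rows.append(np.fromstring(row, sep=","))

# Combine the per-row arrays into one 2-D array and print it
print(np.array(rows))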

Output:

[[1. 2. 3. 4. 5.]
 [6. 7. 8. 9. 0.]
 [2. 3. 4. 5. 6.]
 [4. 5. 6. 7. 7.]]
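The step of joining the per-row arrays can also be done with np.vstack(). This is a sketch under stated assumptions, not the article's own code: the CSV text is made up and read from an in-memory io.StringIO object, and blank lines are skipped explicitly so that they do not produce empty arrays.

```python
import io

import numpy as np

# Hypothetical CSV content with a trailing blank line (not the article's data.csv).
csv_text = "1,2,3\n4,5,6\n\n"

rows = []
for line in io.StringIO(csv_text):
    line = line.strip()
    if not line:
        # Skip blank lines instead of parsing them.
        continue
    # Parse one comma-separated line into a 1-D float array.
    rows.append(np.fromstring(line, sep=","))

# Stack the 1-D row arrays vertically into one 2-D array.
arr = np.vstack(rows)

print(arr)
```

np.vstack() requires every row to have the same length, which matches the requirement stated earlier for loadtxt().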

Summary

Great, you made it! We have discussed several ways to read a CSV file into a NumPy array in Python. Happy learning.
