

Get the number of rows for a parquet file
source link: http://www.donghao.org/2021/12/17/get-the-number-of-rows-for-a-parquet-file/

We were using Pandas to get the number of rows for a parquet file:
import pandas as pd
df = pd.read_parquet("my.parquet")
print(df.shape[0])
This is easy, but it takes a lot of time and memory when the parquet file is very large, because the whole file is loaded into a DataFrame. For example, it may take more than 100GB of memory just to read a 10GB parquet file.
If we only need the number of rows, not the data itself, PyArrow is a better solution:
import pyarrow.parquet as pq
table = pq.read_table("my.parquet", columns=[])
print(table.num_rows)
This method takes only a couple of seconds and uses about 2GB of memory for the same parquet file.