An Overview of Python’s Datatable package
source link: https://www.tuicool.com/articles/ai2equZ
“There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” (Eric Schmidt)
If you are an R user, chances are that you have already been using the data.table package. The data.table package extends data.frame, the tabular data structure built into base R. It is also the go-to package for R users when it comes to fast aggregation of large data (including 100GB in RAM).
R’s data.table is a versatile, high-performance package, valued for its ease of use, convenience and programming speed. It is well known in the R community, with over 400k downloads per month and almost 650 CRAN and Bioconductor packages using it (source).
So, what is in it for Python users? The good news is that data.table has a Python counterpart called datatable, which has a clear focus on big data support, high performance, both in-memory and out-of-memory datasets, and multi-threaded algorithms. In a way, it can be called data.table’s younger sibling.