
An Overview of Python’s Datatable package

 4 years ago
source link: https://www.tuicool.com/articles/ai2equZ

“There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days”: Eric Schmidt

If you are an R user, chances are that you have already been using the data.table package. data.table is an extension of R’s built-in data.frame class. It is also the go-to package for R users when it comes to fast aggregation of large data (including 100GB in RAM).

R’s data.table is a versatile, high-performance package, valued for its ease of use, convenience and programming speed. It is well known in the R community, with over 400k downloads per month and almost 650 CRAN and Bioconductor packages using it (source).

So, what is in it for Python users? The good news is that data.table has a Python counterpart called datatable, which has a clear focus on big data support, high performance, both in-memory and out-of-memory datasets, and multi-threaded algorithms. In a way, it can be called data.table’s younger sibling.

