
When processing fairly large datasets, is single-machine Spark faster than pandas?

 2 years ago
source link: https://www.v2ex.com/t/844667


  MTMT · 10 minutes ago via Android · 16 views

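I recently need to process CSV files ranging from a few dozen GB up to about 200 GB. pandas feels slow, and it seems to use only a single core? Are there any better packages for this? Would single-machine Spark be up to the job? The machine has close to 500 GB of RAM.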
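
To make the single-core concern concrete, here is a minimal sketch, using only Python's standard library, of how one machine can spread a CSV aggregation over several cores: split the file into byte ranges that end on line boundaries, then sum one column per range in separate processes. This is not how pandas or Spark work internally. The function names (`byte_ranges`, `sum_column`, `parallel_sum`) are hypothetical, and the sketch assumes a UTF-8 file with one header row and no quoted fields that contain newlines.

```python
import csv
import os
from concurrent.futures import ProcessPoolExecutor


def byte_ranges(path, n_parts):
    """Split the data rows of a CSV into n_parts (start, end) byte ranges.

    Every boundary falls at the start of a line. Assumes one header row
    and no newlines inside quoted fields.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        header_end = len(f.readline())  # first data row starts after the header
        bounds = [header_end]
        for i in range(1, n_parts):
            target = header_end + (size - header_end) * i // n_parts
            f.seek(max(target, bounds[-1]))
            f.readline()  # move forward to the next line start
            bounds.append(f.tell())
    bounds.append(size)
    return list(zip(bounds[:-1], bounds[1:]))


def _lines(f, end):
    """Yield decoded lines from binary file f until byte offset end."""
    while f.tell() < end:
        yield f.readline().decode("utf-8")


def sum_column(path, start, end, col_index):
    """Sum one numeric column over the rows stored in bytes [start, end)."""
    total = 0.0
    with open(path, "rb") as f:
        f.seek(start)
        for row in csv.reader(_lines(f, end)):
            if row:  # skip blank lines
                total += float(row[col_index])
    return total


def parallel_sum(path, col_index, workers=4):
    """Sum a column using one process per byte range."""
    ranges = byte_ranges(path, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(sum_column, path, s, e, col_index)
                   for s, e in ranges]
        return sum(f.result() for f in futures)
```

Calling `parallel_sum("data.csv", 1, workers=8)` (a hypothetical file name) would sum the second column with eight processes. On platforms that start worker processes with `spawn`, the call should sit under an `if __name__ == "__main__":` guard.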

