Pandas Explained in Pictures: A Guide Worth Bookmarking

source link: https://www.51cto.com/article/716991.html

2022-08-24 11:54:10
This article walks through common Pandas operations with the help of visual diagrams.

Pandas is one of the most common tools in data mining, so it pays to know its core functions well.

sort_values

(dogs[dogs['size'] == 'medium']
 .sort_values('type')
 .groupby('type').median(numeric_only=True)
)

Steps:

  •  Filter the rows where size is 'medium'
  •  Sort those rows by the type column
  •  Group by type and compute the median of each group
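As a rough illustration of the chain above, the sketch below runs it on a small made-up table. The rows and the numeric columns are assumptions for demonstration only, since the article's real dogs data appears only in its figures. `numeric_only=True` is passed so that text columns such as breed are left out of the median.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "breed": ["Labrador Retriever", "Border Collie", "Australian Shepherd",
              "Basset Hound", "Chihuahua", "German Shepherd"],
    "type": ["sporting", "herding", "herding", "hound", "toy", "herding"],
    "size": ["medium", "medium", "medium", "medium", "small", "large"],
    "longevity": [12.0, 13.0, 12.0, 11.0, 16.5, 9.5],
    "height": [57, 51, 54, 36, 13, 63],
})

medians = (
    dogs[dogs["size"] == "medium"]   # keep only medium-sized dogs
    .sort_values("type")             # order the remaining rows by type
    .groupby("type")                 # one group per type
    .median(numeric_only=True)       # median of each numeric column
)
print(medians)
```

With this sample, the herding group contains two medium dogs, so its medians fall halfway between their values.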

selecting a column

dogs['longevity']

groupby + mean

dogs.groupby('size').mean(numeric_only=True)

Steps:

  •  Split the data into groups by size
  •  Aggregate within each group (here, the mean)
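The split-then-aggregate idea can be tried on a tiny table. This is only a sketch: the four rows below are invented, and `numeric_only=True` is used so that any non-numeric columns would be skipped rather than raise an error in recent pandas versions.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "size": ["small", "medium", "medium", "large"],
    "longevity": [16.0, 12.0, 13.0, 10.0],
    "height": [13, 57, 51, 63],
})

# One row per size; each cell is the mean of that column within the group.
means = dogs.groupby("size").mean(numeric_only=True)
print(means)
```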

grouping multiple columns

dogs.groupby(['type', 'size'])

groupby + multi aggregation

(dogs
 .sort_values('size')
 .groupby('size')['height']
 .agg(['sum', 'mean', 'std'])
)
Steps:

  •  Sort the data by the size column
  •  Group by size
  •  Compute the sum, mean, and standard deviation of height within each group
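A minimal sketch of the multi-aggregation chain, using invented heights. `agg` with a list of function names returns one column per function; `std` is the sample standard deviation, so a group with a single row yields NaN.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "size": ["small", "small", "medium", "medium", "large"],
    "height": [13, 27, 51, 57, 63],
})

stats = (
    dogs
    .sort_values("size")            # sort rows by size
    .groupby("size")["height"]      # group, then select the height column
    .agg(["sum", "mean", "std"])    # three aggregates per group
)
print(stats)
```

Since `groupby` sorts the group keys by default, the explicit `sort_values` mainly mirrors the article's diagram rather than changing the result here.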

filtering for columns

df.loc[:, df.loc['two'] <= 20]
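To make the column filter concrete, here is a small sketch with an invented frame whose row labels are 'one', 'two', and 'three'. `df.loc['two']` pulls out one row as a Series indexed by column name, and comparing it with 20 gives a boolean mask over the columns.

```python
import pandas as pd

# Hypothetical frame; only the row label 'two' comes from the article.
df = pd.DataFrame(
    {"a": [1, 10, 100], "b": [2, 25, 200], "c": [3, 15, 300]},
    index=["one", "two", "three"],
)

mask = df.loc["two"] <= 20   # a: True, b: False, c: True
kept = df.loc[:, mask]       # all rows, only the columns where mask is True
print(kept)
```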

filtering for rows

dogs.loc[(dogs['size'] == 'medium') & (dogs['longevity'] > 12), 'breed']
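The row filter combines two boolean conditions with `&` and then picks a single column. The sketch below uses made-up rows; note that each condition needs its own parentheses, because `&` binds more tightly than `==` and `>`.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "breed": ["Labrador Retriever", "Border Collie", "Chihuahua"],
    "size": ["medium", "medium", "small"],
    "longevity": [12.0, 13.0, 16.5],
})

is_medium = dogs["size"] == "medium"
long_lived = dogs["longevity"] > 12

# Rows where both conditions hold, restricted to the breed column.
breeds = dogs.loc[is_medium & long_lived, "breed"]
print(breeds)
```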

dropping columns

dogs.drop(columns=['type'])

joining

ppl.join(dogs)
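`join` aligns the two frames on their index and, by default, keeps every row of the left frame. The sketch below assumes two small invented frames with overlapping integer indexes; the names and breeds are placeholders.

```python
import pandas as pd

# Hypothetical frames that share index labels 0 and 1.
ppl = pd.DataFrame({"name": ["Ana", "Ben", "Cy"]}, index=[0, 1, 2])
dogs = pd.DataFrame({"breed": ["Beagle", "Poodle"]}, index=[0, 1])

# Left join on the index: row 2 has no match in dogs, so breed is NaN.
joined = ppl.join(dogs)
print(joined)
```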

merging

ppl.merge(dogs, left_on='likes', right_on='breed', how='left')
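Unlike `join`, `merge` matches on column values. In this sketch (invented data again), each person's `likes` value is looked up in the `breed` column of dogs; `how='left'` keeps people whose favourite breed is missing from dogs.

```python
import pandas as pd

# Hypothetical frames; only the column names likes/breed come from the article.
ppl = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "likes": ["Beagle", "Poodle", "Beagle"],
})
dogs = pd.DataFrame({
    "breed": ["Beagle", "Bulldog"],
    "size": ["small", "medium"],
})

merged = ppl.merge(dogs, left_on="likes", right_on="breed", how="left")
print(merged)  # Ben's breed and size are NaN because Poodle is not in dogs
```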

pivot table

dogs.pivot_table(index='size', columns='kids', values='price')
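A pivot table reshapes and aggregates at once: one row per `size`, one column per `kids` value, and each cell is an aggregate of `price` (the mean, unless `aggfunc` says otherwise). The values below are invented for illustration.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "size": ["small", "small", "medium", "medium", "medium"],
    "kids": ["high", "low", "high", "high", "low"],
    "price": [500, 700, 600, 800, 900],
})

# Two medium/high rows (600 and 800) are averaged into a single cell.
table = dogs.pivot_table(index="size", columns="kids", values="price")
print(table)
```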

melting

dogs.melt()
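Called with no arguments, `melt` turns every column into (variable, value) pairs, giving a long two-column frame. The sketch below uses an invented table and also shows the `id_vars` variant, which is not in the article but is often what is wanted in practice.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "breed": ["Beagle", "Poodle"],
    "size": ["small", "medium"],
    "kids": ["high", "low"],
})

long_all = dogs.melt()                      # 2 rows x 3 columns -> 6 rows
long_by_breed = dogs.melt(id_vars="breed")  # keep breed as an identifier column
print(long_all)
print(long_by_breed)
```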

pivoting

dogs.pivot(index='size', columns='kids')
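`pivot` reshapes without aggregating, so each (index, columns) pair must occur at most once; duplicates raise a ValueError. In this sketch with made-up prices, omitting `values` spreads every remaining column, which produces two-level column labels.

```python
import pandas as pd

# Hypothetical sample rows with unique (size, kids) pairs.
dogs = pd.DataFrame({
    "size": ["small", "small", "medium", "medium"],
    "kids": ["high", "low", "high", "low"],
    "price": [500, 700, 600, 900],
})

wide = dogs.pivot(index="size", columns="kids")
print(wide)  # columns are ('price', 'high') and ('price', 'low')
```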

stacking column index

dogs.stack()

unstacking row index

dogs.unstack()
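Stacking and unstacking are inverse moves between column labels and row-index levels. The sketch below assumes a small invented frame indexed by breed: `stack` moves the column labels into an inner index level, and `unstack` moves that level back out to the columns.

```python
import pandas as pd

# Hypothetical frame indexed by breed, not the article's dataset.
dogs = pd.DataFrame(
    {"height": [13, 57], "longevity": [16, 12]},
    index=["Chihuahua", "Labrador"],
)

stacked = dogs.stack()      # Series indexed by (breed, column name)
wide = stacked.unstack()    # inner level goes back to the columns
print(stacked)
print(wide)
```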

resetting index

dogs.reset_index()

setting index

dogs.set_index('breed')
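Setting and resetting the index undo each other, which the following sketch shows on an invented two-row table. Both calls return a new frame and leave the original unchanged.

```python
import pandas as pd

# Hypothetical sample rows, not the article's dataset.
dogs = pd.DataFrame({
    "breed": ["Beagle", "Poodle"],
    "size": ["small", "medium"],
})

indexed = dogs.set_index("breed")   # breed becomes the row index
restored = indexed.reset_index()    # breed returns as an ordinary column
print(indexed)
print(restored)
```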
Editor: 庞桂玉  Source: Python开发者
