Visualization (2): seaborn_聊伟的博客-CSDN博客


Source: CSDN技术社区 | Published: 2021-03-25

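III. The powerful seaborn

Seaborn is a Python library for making attractive and informative statistical graphics. It is built on top of matplotlib and integrates closely with the PyData stack, including support for numpy and pandas data structures and for statistical routines from scipy and statsmodels. Seaborn aims to make visualization a central part of exploring and understanding data. Its plotting functions operate on dataframes and arrays that contain a whole dataset, and internally perform the aggregation and statistical model fitting needed to produce an informative plot. If matplotlib "tries to make easy things easy and hard things possible", seaborn tries to make a well-defined set of hard things easy too. Seaborn should be seen as a complement to matplotlib, not a replacement for it, and it performs well for data visualization.

So I will start with matplotlib.

A summary of matplotlib plotting and visualization essentials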

http://python.jobbole.com/85106/

legend and legend_handler

https://matplotlib.org/api/legend_api.html

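Displaying Chinese correctly in matplotlib figures

To display Chinese characters and the minus sign correctly in a figure, apply the following settings:

import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # render Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly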
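If the font settings should apply to one figure only, the same keys can be set temporarily. This is a minimal sketch, assuming that scoping the settings with plt.rc_context suits your workflow; the variable names are illustrative, and the SimHei font still has to be installed for Chinese glyphs to render.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

default_minus = plt.rcParams["axes.unicode_minus"]

# Settings passed to rc_context are active only inside the with-block.
with plt.rc_context({"font.sans-serif": ["SimHei"], "axes.unicode_minus": False}):
    inside_minus = plt.rcParams["axes.unicode_minus"]
    fig, ax = plt.subplots()
    ax.plot([-2, -1, 0, 1, 2])
    plt.close(fig)

# After the block, the previous value is back in place.
restored_minus = plt.rcParams["axes.unicode_minus"]
```

Setting plt.rcParams directly, as above, changes the defaults for the rest of the session instead.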

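1. Create a figure

import matplotlib.pyplot as plt
%matplotlib inline

fig = plt.figure()

2. Add subplots

ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)

Or create the figure and a 2x2 grid of subplots in one call; axes[0, 1] indexes a single subplot:

fig, axes = plt.subplots(2, 2)

Or:

# matplotlib provides subplot(geo) and subplots(n_row, n_col) for drawing subplots
# geo combines the row count, the column count and the index; see the documentation for details
# subplot
ax1 = plt.subplot(121)
ax2 = plt.subplot(122)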
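The ways of creating subplots listed above can be compared side by side. This is a small sketch with illustrative variable names (fig_a, fig_b, top_right are not from the original post).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Way 1: add subplots one at a time to an existing figure.
fig_a = plt.figure()
ax1 = fig_a.add_subplot(2, 2, 1)  # same as add_subplot(221)
ax2 = fig_a.add_subplot(2, 2, 2)

# Way 2: create the figure and the full grid at once.
# For a 2x2 grid, axes is a 2x2 NumPy array of Axes objects.
fig_b, axes = plt.subplots(2, 2)
top_right = axes[0, 1]
top_right.set_title("axes[0, 1]")

grid_shape = axes.shape
n_axes_a = len(fig_a.axes)
n_axes_b = len(fig_b.axes)
plt.close("all")
```

add_subplot only creates the cells that are requested, while plt.subplots always fills the whole grid.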

3. Plotting

# Draw two simple line plots on the same axes. 'ko--' is shorthand for color, marker and linestyle.
from numpy.random import randn

plt.plot(randn(50).cumsum(), 'ko--')
plt.plot(randn(50).cumsum(), 'go--')
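To make the shorthand concrete, the sketch below draws the same data twice, once with the format string and once with explicit keyword arguments, and checks that both lines end up with the same style. The variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(50).cumsum()

fig, ax = plt.subplots()
# Format string: k = black, o = circle marker, -- = dashed line.
(short_line,) = ax.plot(y, "ko--")
# The same style written out with keyword arguments.
(long_line,) = ax.plot(y, color="k", marker="o", linestyle="--")

same_style = (
    short_line.get_color() == long_line.get_color()
    and short_line.get_marker() == long_line.get_marker()
    and short_line.get_linestyle() == long_line.get_linestyle()
)
plt.close(fig)
```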

Another version:

data1 = randn(30).cumsum()
data = randn(30).cumsum()
ax = plt.subplot(111)
ax.plot(data1, 'k--', label='Default1')
ax.plot(data, linestyle='dashed', c='green', marker='o', label='Default2')  # label is the legend entry
ax.legend(loc='best')  # loc sets the position; labels are only shown once ax.legend() is called
plt.show()

Plotting in a subplot works the same way as plotting on the whole figure; only the prefix changes.

import numpy as np

fig2 = plt.figure(2)
ax1 = fig2.add_subplot(221)
ax1.hist(randn(100), bins=20, color='k', alpha=.5)
ax2 = fig2.add_subplot(222)
ax2.scatter(np.arange(30), np.arange(30) + 3 * randn(30))
ax3 = fig2.add_subplot(223)
ax3.plot(randn(50).cumsum(), 'k--')

3. Adjusting the spacing between subplots

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        axes[i, j].hist(randn(500), bins=50, color='k', alpha=.6)
plt.subplots_adjust(wspace=0, hspace=0)  # spacing between subplots, as a fraction of the average axes width and height
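The effect of wspace=0 can be checked numerically. This sketch, with illustrative variable names, measures the horizontal gap between two neighboring subplots after the adjustment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for ax in axes.flat:
    ax.hist(rng.standard_normal(500), bins=50, color="k", alpha=0.6)

# Remove all padding between the subplots.
fig.subplots_adjust(wspace=0, hspace=0)

# Gap between the right edge of the top-left subplot and the
# left edge of the top-right subplot, in figure coordinates.
gap = axes[0, 1].get_position().x0 - axes[0, 0].get_position().x1
wspace = fig.subplotpars.wspace
plt.close(fig)
```

With shared axes the inner tick labels are hidden, which is why the panels can touch without overlapping text.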

4. Styles

%pylab inline
%matplotlib inline
import pandas as pd
import numpy as np

# Generate some Laplace-distributed data
data1 = np.random.laplace(size=10000)
# And some normally distributed data with mean 100 and standard deviation 24, i.e. the familiar IQ data
data2 = np.random.normal(100, 24, 10000)

4.1 drawstyle for line plots

plt.plot(data, 'k-', drawstyle='steps-post', label='steps-post')
plt.plot(data, linestyle='dashed', c='green', marker='o', label='Default2')
plt.legend(loc='best')

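# Histogram styles
# hist draws the distribution; bins=20 splits the range into 20 bins
# The x-axis shows the bins; the y-axis shows how many data points fall into each bin
# Pass density=True to draw a probability distribution instead, with probability density on the y-axis
# subplot
plt.subplot(121)  # one row, two columns; this is the first subplot
plt.title('data1')  # title
r = plt.hist(data1, bins=20, label='14', histtype='stepfilled')
# Draw the legend; change bbox_to_anchor to adjust its position
plt.legend(loc='lower left', bbox_to_anchor=(0.3, -0.2))
plt.subplot(122)  # the second subplot in the one-row, two-column grid
plt.title('data2')
r = plt.hist(data2, bins=20, label='24')
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))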
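The difference between a count histogram and a density histogram can be verified from the values that hist returns. This is a small sketch with illustrative variable names: the counts add up to the sample size, while the density bars integrate to 1.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(100, 24, 10_000)

fig, (ax_count, ax_density) = plt.subplots(1, 2)

# Default: the bar height is the number of points in each bin.
counts, edges, _ = ax_count.hist(sample, bins=20)

# density=True: the bar height is a probability density.
dens, dens_edges, _ = ax_density.hist(sample, bins=20, density=True)

total_count = counts.sum()
area = (dens * np.diff(dens_edges)).sum()  # sum of height times bin width
plt.close(fig)
```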

5. matplotlib inside pandas (the content of chapter 1)

# pandas DataFrames also have matplotlib plotting built in
plt.figure()
df1, df2 = map(pd.DataFrame, (data1, data2))  # convert data1 and data2 to pandas DataFrames
# Create the subplots
ax = plt.subplot(121)  # alternatively: fig, axes = plt.subplots(1, 2)
df1[0].hist(ax=ax)  # with subplots, use ax=axes[0] here
ax.set_title('data1')
# subplot 2
ax = plt.subplot(122)
ax.set_title('data2')
df2[0].hist(ax=ax)  # with subplots: ax=axes[1]

6. Adding text

6.1 Simple version

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(randn(1000).cumsum(), 'r--', label='two')
ax.plot(randn(1000).cumsum(), 'k.', label='three')
ax.text(400, 20, 'hi', fontsize=15)  # x, y, text, font size
ax.legend(loc='best')  # pick the legend position automatically
plt.show()

6.2 ax.annotate

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(randn(1000).cumsum(), 'r--', label='two')
ax.plot(randn(1000).cumsum(), 'k.', label='_three_')  # a label that starts with an underscore is left out of the legend
ax.annotate('Hello world!', xy=(400, 20), fontsize=12,  # text, the data point it refers to, font size
            xytext=(400, 50))  # where the text is placed
ax.legend(loc='best')  # pick the legend position automatically
plt.show()

# Adding arrows or other markers
import pandas as pd
from datetime import datetime

fig = plt.figure()
ax = fig.add_subplot(111)
data = pd.read_csv(r'C:\Users\11488\Desktop\book\ch09\stock_px.csv', index_col=0, parse_dates=True)
spx = data['SPX']
spx.plot(ax=ax, style='k-')
crisis_data = [
    (datetime(2007, 10, 11), 'Peak of bull market'),
    (datetime(2008, 3, 12), 'Bear Stearns Fails'),
    (datetime(2008, 9, 15), 'Lehman Bankruptcy')
]
for date, label in crisis_data:
    ax.annotate(label, xy=(date, spx.asof(date) + 75),
                xytext=(date, spx.asof(date) + 225),
                arrowprops=dict(facecolor='black', headwidth=4, width=2,
                                headlength=4),
                horizontalalignment='left', verticalalignment='top')
# Zoom in on 2007-2010
ax.set_xlim(['1/1/2007', '1/1/2011'])
ax.set_ylim([600, 1800])
ax.set_title('Important dates in the 2008-2009 financial crisis')
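The example above depends on a local CSV file. For readers without that file, the following sketch applies the same annotate pattern to a synthetic random-walk series. The data, the event labels and the variable names are all made up for illustration and are not real market data.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the SPX column (assumption: random walk, not real prices).
dates = pd.date_range("2007-01-01", "2010-12-31", freq="B")
rng = np.random.default_rng(0)
prices = pd.Series(1200 + 10 * rng.standard_normal(len(dates)).cumsum(), index=dates)

fig, ax = plt.subplots()
ax.plot(prices.index, prices.values, "k-")

# Hypothetical events to mark on the curve.
events = [
    (datetime(2008, 3, 12), "Event A"),
    (datetime(2008, 9, 15), "Event B"),
]

annotations = []
for date, label in events:
    y = prices.asof(date)  # last available value on or before the date
    ann = ax.annotate(
        label,
        xy=(date, y + 75),       # arrow tip
        xytext=(date, y + 225),  # text position
        arrowprops=dict(facecolor="black", headwidth=4, width=2, headlength=4),
        horizontalalignment="left",
        verticalalignment="top",
    )
    annotations.append(ann)

plt.close(fig)
```

Series.asof is what lets an event date that falls on a non-trading day still resolve to a price.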

Next up is the powerful seaborn.

1 set_style() and set()

set_style() sets the theme. Seaborn has five preset themes: darkgrid, whitegrid, dark, white, and ticks. The default is darkgrid.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('whitegrid')
plt.plot(np.arange(10))
plt.show()
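What a theme changes can be inspected without drawing anything, because sns.axes_style returns the theme as a dictionary of matplotlib settings. The sketch below, with illustrative variable names, checks which of the five themes turn the grid on and shows that axes_style also works as a context manager for a temporary theme.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

themes = ["darkgrid", "whitegrid", "dark", "white", "ticks"]

# axes_style(name) returns the rc settings that set_style(name) would apply.
grid_on = {name: bool(sns.axes_style(name)["axes.grid"]) for name in themes}

# Used in a with-statement, the theme applies only to plots created inside it.
with sns.axes_style("ticks"):
    fig, ax = plt.subplots()
    ax.plot(np.arange(10))
    plt.close(fig)
```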

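set() takes parameters that control the background style, the color palette and more, so it is the more commonly used of the two.

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style='white', palette='muted', color_codes=True)  # set() configures the theme and the palette together
plt.plot(np.arange(10))
plt.show()

sns.set(style='dark', color_codes=True)  # called before the with-block, because set() also resets the palette
with sns.color_palette('husl', 8):
    plt.plot(np.arange(10))
    plt.show()

For details, see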

http://seaborn.pydata.org/generated/seaborn.color_palette.html#seaborn.color_palette
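color_palette can be used in two ways: as a function that returns a list of RGB colors, and as a context manager that temporarily changes matplotlib's color cycle. A minimal sketch of both follows; the helper current_cycle is hypothetical and only reads the active color cycle.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgb
import seaborn as sns


def current_cycle():
    """Return the active matplotlib color cycle as a list of RGB tuples."""
    return [to_rgb(c) for c in plt.rcParams["axes.prop_cycle"].by_key()["color"]]


# As a function: eight evenly spaced hues from the husl color space.
palette = sns.color_palette("husl", 8)
n_colors = len(palette)

# As a context manager: the cycle changes inside the block only.
before = current_cycle()
with sns.color_palette("husl", 8):
    inside = current_cycle()
after = current_cycle()
```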

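2 distplot() and kdeplot()

distplot() is an enhanced hist, and kdeplot() draws a density curve. distplot() has been deprecated since seaborn 0.11 in favor of histplot() and displot(), and newer versions of kdeplot() take fill in place of shade.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df_iris = pd.read_csv(r'C:\Users\11488\Desktop\Data set\seaborn-data-master\iris.csv')
fig, axes = plt.subplots(1, 2)
sns.distplot(df_iris['petal_length'], ax=axes[0], kde=True, rug=True)  # kde: density curve; rug: rug marks along the axis
sns.kdeplot(df_iris['petal_length'], ax=axes[1], shade=True)  # shade: fill the area under the curve
plt.show()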

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style='darkgrid', color_codes=True)
sns.color_palette('husl', 8)  # only returns the palette; sns.set_palette('husl', 8) would apply it
rs = np.random.RandomState(10)
d = rs.normal(size=100)
f, axes = plt.subplots(2, 2, figsize=(7, 7), sharex=True)
sns.distplot(d, kde=False, color='b', ax=axes[0, 0])
sns.distplot(d, hist=False, rug=True, color='r', ax=axes[0, 1])
sns.distplot(d, hist=False, color='g', kde_kws={'shade': True}, ax=axes[1, 0])
sns.distplot(d, color='m', ax=axes[1, 1])
plt.show()

3 Box plots: boxplot()

sns.boxplot(x=df_iris['species'], y=df_iris['sepal_width'])
plt.show()

import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

os.chdir(r'C:\Users\11488\Desktop\Data set\seaborn-data-master')
tips = pd.read_csv('tips.csv')
sns.set(style='ticks')  # set the theme
sns.boxplot(x='day', y='total_bill', hue='sex', data=tips, palette='husl')  # palette: color palette
plt.show()
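The grouped box plot can also be tried without tips.csv. This sketch builds a small random DataFrame with the same column names; the values and the name tips_like are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for tips.csv (assumption: random values, same column names).
rng = np.random.default_rng(1)
n = 80
tips_like = pd.DataFrame({
    "day": rng.choice(["Sat", "Sun"], size=n),
    "sex": rng.choice(["Female", "Male"], size=n),
    "total_bill": rng.gamma(shape=4.0, scale=5.0, size=n),
})

fig, ax = plt.subplots()
# x picks the category axis, hue splits each category into side-by-side boxes.
sns.boxplot(x="day", y="total_bill", hue="sex", data=tips_like, palette="husl", ax=ax)

xlabel = ax.get_xlabel()
ylabel = ax.get_ylabel()
has_legend = ax.get_legend() is not None
plt.close(fig)
```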

4 Joint distributions: jointplot()

tips = pd.read_csv('tips.csv')
sns.jointplot(x='total_bill', y='tip', data=tips)  # older seaborn versions print the correlation coefficient in the top-right corner
plt.show()

tips = pd.read_csv('tips.csv')
with sns.axes_style('white'):
    sns.jointplot(x='total_bill', y='tip', data=tips, kind='reg')  # kind: the kind of plot to draw, one of 'scatter', 'reg', 'resid', 'kde', 'hex'
plt.show()

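5 Heat maps: heatmap()  # heat maps are commonly used to show the relationships between variables

data = pd.read_csv('car_crashes.csv')
data = data.select_dtypes('number').corr()  # correlation matrix of the numeric columns only
sns.heatmap(data, robust=True, cmap='YlGnBu', vmin=0, vmax=1, cbar=True)
plt.show()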
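The two steps behind the heat map, computing a correlation matrix and coloring it, can be seen on a small synthetic table. This is a sketch with made-up column names and data; annot=True is added here to print the coefficient in each cell.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (assumption): b depends strongly on a, c is independent noise.
rng = np.random.default_rng(2)
n = 200
a = rng.normal(size=n)
df = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.1, size=n),
    "c": rng.normal(size=n),
    "label": ["XX"] * n,  # a text column, excluded before calling corr()
})

# Keep numeric columns only, then compute pairwise Pearson correlations.
corr = df.select_dtypes("number").corr()

fig, ax = plt.subplots()
sns.heatmap(corr, cmap="YlGnBu", vmin=0, vmax=1, annot=True, cbar=True, ax=ax)
n_meshes = len(ax.collections)
plt.close(fig)
```

Because vmin=0, negative correlations are all drawn in the lightest color; drop vmin and vmax if the sign matters.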

6 pairplot()

data = pd.read_csv('iris.csv')
sns.set()  # use the default color scheme
sns.pairplot(data, hue='species')  # hue selects the column used for grouping
plt.show()

sns.pairplot(data, vars=['sepal_width', 'sepal_length'], hue='species', palette='Set1')
plt.show()

7 FacetGrid()

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

tips = pd.read_csv('tips.csv')
g = sns.FacetGrid(tips, col='time', row='smoker')
g = g.map(plt.hist, 'total_bill', color='b')
plt.show()

# hue groups the data; col_order sets the column order by hand
import numpy as np

bins = np.arange(0, 65, 5)  # bin edges; the original leaves bins undefined, so this value is an assumption
g = sns.FacetGrid(tips, col='smoker', hue='time', col_order=['Yes', 'No'])
g = g.map(plt.hist, 'total_bill', bins=bins, stacked=True).add_legend()

# Set the marker size, line width and edge color
kws = dict(s=50, linewidth=.5, edgecolor='w')
g = sns.FacetGrid(tips, col='sex', hue='time', palette='Set1', hue_order=['Dinner', 'Lunch'])
# Add a legend and set the titles; more general position arguments are not recognized here
g = (g.map(plt.scatter, 'total_bill', 'tip', **kws).add_legend().set_titles('hi', fontsize=12, loc='right'))

pal = dict(Lunch='seagreen', Dinner='gray')
g = sns.FacetGrid(tips, col='sex', hue='time', palette=pal,
                  hue_order=['Dinner', 'Lunch'])
g = (g.map(plt.scatter, 'total_bill', 'tip', **kws).add_legend().set_axis_labels('Total bill (US Dollars)', 'Tip'))  # add axis labels
