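【1.1】Introduction to Pandas

I. Overview of the pandas library

  • Pandas is a third-party Python library that provides high-performance, easy-to-use data types and analysis tools
  • Pandas is built on NumPy and is commonly used together with NumPy and Matplotlib
  • Official site: http://pandas.pydata.org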
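To make the relationship with NumPy concrete, here is a minimal sketch showing that a Series keeps its values in a NumPy array and can be passed to NumPy functions. The variable names and sample values are illustrative only and are not from the original notes.

```python
import numpy as np
import pandas as pd

# Build a Series from a plain Python list
s = pd.Series([1.5, 2.5, 3.5])

# .values exposes the underlying data as a NumPy array
arr = s.values
print(type(arr))

# NumPy functions accept a Series directly
total = np.sum(s)
print(total)
```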

1.1 Installation

pip install pandas

Error 1:

Collecting pandas
  Could not fetch URL https://pypi.python.org/simple/pandas/: There was a problem confirming the ssl certificate: [SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:590) - skipping
  Could not find a version that satisfies the requirement pandas (from versions: )
No matching distribution found for pandas

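Fix: upgrade pip. This error usually means the client is negotiating a TLS version older than the one PyPI accepts.

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
sudo python get-pip.py

Then run the install again: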

pip install pandas --user

Error 2:

Uninstalling python-dateutil-1.5:
Could not install packages due to an EnvironmentError: [('/Sys

Fix:

sudo pip uninstall python-dateutil-1.5

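# Output (pip skips the uninstall because `python-dateutil-1.5` is not a package name; the package itself is named `python-dateutil`)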
The directory '/Users/tanqianshan/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Skipping python-dateutil-1.5 as it is not installed.
The directory '/Users/tanqianshan/Library/Caches/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
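Once the install finishes without errors, a quick way to confirm that pandas is usable from the current interpreter is to import it and print its version. This is only a sanity check, and the version printed depends on what pip installed.

```python
import pandas as pd

# A successful import means pandas is available to this interpreter
version = pd.__version__
print(version)
```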

1.2 Basic usage

Import the library:

import pandas as pd

Example:

import pandas as pd
d = pd.Series(range(10))

print(d)

0      0
1      1
2      2
3      3
4      4
5      5
6      6
7      7
8      8
9      9
dtype: int64

print(d.cumsum())

0       0
1       1
2       3
3       6
4      10
5      15
6      21
7      28
8      36
9      45
dtype: int64

II. The two data types of the pandas library

The two data types:

  • Series
  • DataFrame
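As a rough illustration of the two types, the sketch below builds a small Series and a small DataFrame. The labels, column names, and values are made-up toy data chosen only for the example.

```python
import pandas as pd

# Series: one-dimensional values, each paired with an index label
s = pd.Series([9, 8, 7], index=["a", "b", "c"])
print(s["b"])  # look up a value by its label

# DataFrame: a two-dimensional table with a row index and named columns
df = pd.DataFrame({"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]})
print(df.shape)

# Each column of a DataFrame is itself a Series
col = df["x"]
print(type(col).__name__)
```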

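Operations built on these data types:

basic operations, arithmetic operations, feature (statistical) operations, and relational operations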
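The four categories can be sketched with one call each. Which pandas method belongs to which category is an interpretation made for this example, not a classification given in the original notes, and the data is toy data.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 6, 8]})

# Basic operation: select a column and take its first rows
first_two = df["x"].head(2)

# Arithmetic operation: element-wise addition, aligned on the index
total = df["x"] + df["y"]

# Feature-type operations: a summary statistic and a cumulative sum
mean_x = df["x"].mean()
running = df["x"].cumsum()

# Relational operation: correlation between two columns
corr_xy = df["x"].corr(df["y"])

print(first_two.tolist())
print(total.tolist())
print(mean_x)
print(running.tolist())
print(corr_xy)
```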

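NumPy                                    Pandas
Basic data type                          Extended data types
Focuses on how data is structured        Focuses on how data is applied
Dimensions: relationships among data     Relationships between data and index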
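The last row of the comparison is easiest to see in arithmetic. In this minimal sketch with toy values, NumPy combines elements by position, while pandas matches elements by index label even when the rows are in a different order.

```python
import numpy as np
import pandas as pd

# NumPy adds by position: element 0 with element 0, and so on
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
by_position = a + b

# pandas adds by index label, regardless of the order of the rows
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["c", "b", "a"])
by_label = s1 + s2

print(by_position.tolist())
print(by_label.to_dict())
```

Here the label "a" pairs 1 with 30 and the label "c" pairs 3 with 10, which is not what a positional sum would give.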

References

Beijing Institute of Technology, 嵩山, www.python123.org
