[3.3.5] Pandas: DataFrame.dropna

Remove missing values.

DataFrame.dropna(self, axis=0, how='any', thresh=None, subset=None, inplace=False)

1. Parameters

Name	Description	Type / Default Value	Required / Optional
axis	Whether rows or columns that contain missing values are removed. 0 or 'index': drop rows that contain missing values. 1 or 'columns': drop columns that contain missing values.	{0 or 'index', 1 or 'columns'}, default 0	Optional
how	Whether a row or column is removed when it has at least one NA or only when all of its values are NA. 'any': drop the row or column if any NA value is present. 'all': drop it only if all values are NA.	{'any', 'all'}, default 'any'	Optional
thresh	Keep only the rows or columns that have at least this many non-NA values. Recent pandas versions do not allow combining it with how.	int, default None	Optional
subset	Labels along the other axis to consider, e.g. when dropping rows, a list of the columns to check.	array-like, default None	Optional
inplace	If True, modify the DataFrame in place and return None.	bool, default False	Optional

2. Examples

import numpy as np
import pandas as pd


df = pd.DataFrame({"name": ['Superman', 'Batman', 'Spiderman'],
                   "toy": [np.nan, 'Batmobile', 'Spiderman toy'],
                   "born": [pd.NaT, pd.Timestamp("1956-06-26"),
                            pd.NaT]})
df

Output:

name	toy	born
0	Superman	NaN	NaT
1	Batman	Batmobile	1956-06-26
2	Spiderman	Spiderman toy	NaT

2.1 Drop the rows where at least one element is missing:

df.dropna()

name	toy	born
1	Batman	Batmobile	1956-06-26

2.2 Drop the columns where at least one element is missing:

df.dropna(axis='columns')

name
0	Superman
1	Batman
2	Spiderman
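
The axis and thresh parameters can also be combined. The following is a minimal sketch on the same sample data (the variable name `result` is only illustrative) that keeps the columns with at least 2 non-missing values instead of requiring every value to be present:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Superman", "Batman", "Spiderman"],
                   "toy": [np.nan, "Batmobile", "Spiderman toy"],
                   "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT]})

# Non-missing counts per column: name=3, toy=2, born=1.
# Keep only the columns that have at least 2 non-missing values.
result = df.dropna(axis="columns", thresh=2)
print(result.columns.tolist())  # ['name', 'toy']
```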

2.3 Drop the rows where all elements are missing:

df.dropna(how='all')

name	toy	born
0	Superman	NaN	NaT
1	Batman	Batmobile	1956-06-26
2	Spiderman	Spiderman toy	NaT
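
No row in the sample frame is entirely missing, so how='all' returns it unchanged. A small sketch with a hypothetical frame `df_all` (its data is made up for illustration and is not part of the original example) shows a case where a row is actually dropped:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the last row has no valid value in any column.
df_all = pd.DataFrame({"name": ["Superman", "Batman", None],
                       "toy": [np.nan, "Batmobile", np.nan],
                       "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT]})

# how='all' drops only the rows in which every value is missing,
# so row 0 (which still has a name) is kept and row 2 is removed.
kept = df_all.dropna(how="all")
print(kept.index.tolist())  # [0, 1]
```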

2.4 Keep only the rows with at least 2 non-NA values:

df.dropna(thresh=2)

name	toy	born
1	Batman	Batmobile	1956-06-26
2	Spiderman	Spiderman toy	NaT
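
One way to read thresh is as a lower bound on the number of non-missing values per row. The sketch below, assuming the same sample data, counts those values explicitly and checks that a boolean mask gives the same result as dropna(thresh=2); the names `valid_counts`, `by_mask` and `by_thresh` are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Superman", "Batman", "Spiderman"],
                   "toy": [np.nan, "Batmobile", "Spiderman toy"],
                   "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT]})

# Number of non-missing values in each row.
valid_counts = df.notna().sum(axis=1)
print(valid_counts.tolist())  # [1, 3, 2]

# Keeping rows with at least 2 valid values matches dropna(thresh=2).
by_mask = df[valid_counts >= 2]
by_thresh = df.dropna(thresh=2)
print(by_mask.equals(by_thresh))  # True
```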

2.5 Define in which columns to look for missing values:

df.dropna(subset=['name', 'born'])

name	toy	born
1	Batman	Batmobile	1956-06-26
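
Missing values in columns outside subset are ignored. As a brief sketch on the same data (the choice of columns here is only for illustration), checking 'name' and 'toy' keeps row 2 even though its 'born' value is missing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Superman", "Batman", "Spiderman"],
                   "toy": [np.nan, "Batmobile", "Spiderman toy"],
                   "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT]})

# Only 'name' and 'toy' are checked; the missing 'born' in row 2 is ignored.
result = df.dropna(subset=["name", "toy"])
print(result.index.tolist())           # [1, 2]
print(result["born"].isna().tolist())  # [False, True]
```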

2.6 Modify the DataFrame in place, so that the same variable holds only the rows with valid entries:

df.dropna(inplace=True)
df

name	toy	born
1	Batman	Batmobile	1956-06-26
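
Because inplace=True returns None, assigning its result back to a variable is a common mistake. The sketch below, on the same sample data, contrasts the default behaviour (a new DataFrame is returned) with the in-place call; the names `cleaned` and `returned` are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["Superman", "Batman", "Spiderman"],
                   "toy": [np.nan, "Batmobile", "Spiderman toy"],
                   "born": [pd.NaT, pd.Timestamp("1956-06-26"), pd.NaT]})

# Default: a new DataFrame is returned and df itself is left untouched.
cleaned = df.dropna()
print(len(df), len(cleaned))  # 3 1

# inplace=True: df itself is modified and the call returns None.
returned = df.dropna(inplace=True)
print(returned is None, len(df))  # True 1
```

Reassigning with df = df.dropna() gives the same final contents of df without relying on inplace.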
