Hierarchical Indexing
code: Python
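import numpy as np
import pandas as pd

data = pd.Series(np.random.randn(9),
                 index=[['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd', 'd'],
                        [1, 2, 3, 1, 3, 1, 2, 2, 3]])
data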
--------------------------------------------------------------------------
a  1   -0.204708
   2    0.478943
   3   -0.519439
b  1   -0.555730
   3    1.965781
c  1    1.393406
   2    0.092908
d  2    0.281746
   3    0.769023
dtype: float64
--------------------------------------------------------------------------
code: Python
data.unstack()
--------------------------------------------------------------------------
          1         2         3
a -0.204708  0.478943 -0.519439
b -0.555730       NaN  1.965781
c  1.393406  0.092908       NaN
d       NaN  0.281746  0.769023
--------------------------------------------------------------------------
code: Python
data.unstack().stack()
--------------------------------------------------------------------------
a  1   -0.204708
   2    0.478943
   3   -0.519439
b  1   -0.555730
   3    1.965781
c  1    1.393406
   2    0.092908
d  2    0.281746
   3    0.769023
dtype: float64
--------------------------------------------------------------------------
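The unstack/stack round trip can be checked on a small, deterministic Series. This is a minimal sketch rather than part of the original listing: the level names `outer` and `inner` and the values are made up for illustration, and `dropna()` is called explicitly on the assumption that not every pandas version discards NaN entries while stacking.

```python
import numpy as np
import pandas as pd

# Fixed values (instead of random ones) so the result is reproducible.
index = pd.MultiIndex.from_arrays(
    [["a", "a", "b"], [1, 2, 2]],
    names=["outer", "inner"],  # hypothetical level names
)
s = pd.Series([10.0, 20.0, 30.0], index=index)

# unstack() pivots the inner level into columns; the missing
# ("b", 1) combination shows up as NaN.
wide = s.unstack()

# stack() pivots the columns back into the inner index level.
# dropna() is explicit so the result does not depend on whether
# the installed pandas version drops NaN during stack().
roundtrip = wide.stack().dropna()
```

Here `roundtrip` holds the same three values, under the same index tuples, as `s`.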
code: Python
frame = pd.DataFrame(np.arange(12).reshape((4, 3)),
                     index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                     columns=[['Ohio', 'Ohio', 'Colorado'],
                              ['Green', 'Red', 'Green']])
frame
--------------------------------------------------------------------------
     Ohio     Colorado
    Green Red    Green
a 1     0   1        2
  2     3   4        5
b 1     6   7        8
  2     9  10       11
--------------------------------------------------------------------------
code: Python
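# Name the row and column levels
frame.index.names = ['key1', 'key2']
frame.columns.names = ['state', 'color']
frame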
--------------------------------------------------------------------------
state      Ohio     Colorado
color     Green Red    Green
key1 key2
a    1        0   1        2
     2        3   4        5
b    1        6   7        8
     2        9  10       11
--------------------------------------------------------------------------
code: Python
frame['Ohio']
--------------------------------------------------------------------------
color      Green  Red
key1 key2
a    1         0    1
     2         3    4
b    1         6    7
     2         9   10
--------------------------------------------------------------------------
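The row and column indexes can also be built explicitly with `pd.MultiIndex.from_arrays`, which attaches the level names at construction time. The sketch below assumes the same layout as the frame in this listing. The variable names `rows`, `cols`, `ohio`, and `ohio_red` are illustrative only.

```python
import numpy as np
import pandas as pd

# Build both axes up front, with level names attached.
rows = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], [1, 2, 1, 2]], names=["key1", "key2"]
)
cols = pd.MultiIndex.from_arrays(
    [["Ohio", "Ohio", "Colorado"], ["Green", "Red", "Green"]],
    names=["state", "color"],
)
frame = pd.DataFrame(np.arange(12).reshape((4, 3)), index=rows, columns=cols)

# An outer column label selects the sub-frame beneath it; the
# outer level is removed from the resulting columns.
ohio = frame["Ohio"]

# A full (state, color) tuple selects a single column as a Series.
ohio_red = frame[("Ohio", "Red")]
```

Building the `MultiIndex` objects separately makes them reusable across several frames that share the same labels.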
Reordering and Sorting Levels
code: Python
frame.swaplevel('key1', 'key2')
--------------------------------------------------------------------------
state      Ohio     Colorado
color     Green Red    Green
key2 key1
1    a        0   1        2
2    a        3   4        5
1    b        6   7        8
2    b        9  10       11
--------------------------------------------------------------------------
code: Python
frame.sort_index(level=1)
--------------------------------------------------------------------------
state      Ohio     Colorado
color     Green Red    Green
key1 key2
a    1        0   1        2
b    1        6   7        8
a    2        3   4        5
b    2        9  10       11
--------------------------------------------------------------------------
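`swaplevel` only relabels the rows and leaves their order alone, so it is often followed by `sort_index` to put the new outer level in order. The sketch below illustrates that combination on a frame with flat, hypothetical column labels (`x`, `y`, `z`) to keep the example short.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=pd.MultiIndex.from_arrays(
        [["a", "a", "b", "b"], [1, 2, 1, 2]], names=["key1", "key2"]
    ),
    columns=["x", "y", "z"],  # hypothetical column labels
)

# Swap the two row levels; the data rows stay in their original order.
swapped = frame.swaplevel("key1", "key2")

# Sort on the new outer level (key2); ties are broken by key1.
sorted_swapped = swapped.sort_index(level=0)
```

After sorting, the rows are ordered `(1, 'a')`, `(1, 'b')`, `(2, 'a')`, `(2, 'b')`.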
Summary Statistics by Level
code: Python
# sum(level=...) was removed in pandas 2.0; group by the index level instead
frame.groupby(level='key2').sum()
--------------------------------------------------------------------------
state  Ohio     Colorado
color Green Red    Green
key2
1         6   8       10
2        12  14       16
--------------------------------------------------------------------------
code: Python
# Column-wise equivalent: transpose, group by the column level, transpose back
frame.T.groupby(level='color').sum().T
--------------------------------------------------------------------------
color      Green  Red
key1 key2
a    1         2    1
     2         8    4
b    1        14    7
     2        20   10
--------------------------------------------------------------------------
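A self-contained sketch of per-level aggregation with `groupby(level=...)`, with the totals checked against the output shown in this section. For the column-wise case the frame is transposed, grouped, and transposed back. This is one possible spelling, chosen here because it does not rely on the `axis` argument of `groupby`.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=pd.MultiIndex.from_arrays(
        [["a", "a", "b", "b"], [1, 2, 1, 2]], names=["key1", "key2"]
    ),
    columns=pd.MultiIndex.from_arrays(
        [["Ohio", "Ohio", "Colorado"], ["Green", "Red", "Green"]],
        names=["state", "color"],
    ),
)

# Sum rows that share the same key2 value.
by_key2 = frame.groupby(level="key2").sum()

# Sum columns that share the same color: group the transposed
# frame by the 'color' level, then transpose back.
by_color = frame.T.groupby(level="color").sum().T
```

Other reductions such as `mean()` or `max()` can be substituted for `sum()` in the same pattern.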
Using a DataFrame's Columns as the Index
code: Python
frame = pd.DataFrame({'a': range(7), 'b': range(7, 0, -1),
                      'c': ['one', 'one', 'one', 'two', 'two',
                            'two', 'two'],
                      'd': [0, 1, 2, 0, 1, 2, 3]})
frame
--------------------------------------------------------------------------
   a  b    c  d
0  0  7  one  0
1  1  6  one  1
2  2  5  one  2
3  3  4  two  0
4  4  3  two  1
5  5  2  two  2
6  6  1  two  3
--------------------------------------------------------------------------
code: Python
# Build a hierarchical index from columns 'c' and 'd'
frame2 = frame.set_index(['c', 'd'])
frame2
--------------------------------------------------------------------------
       a  b
c   d
one 0  0  7
    1  1  6
    2  2  5
two 0  3  4
    1  4  3
    2  5  2
    3  6  1
--------------------------------------------------------------------------
code: Python
# Keep the columns that were used to build the index
frame.set_index(['c', 'd'], drop=False)
--------------------------------------------------------------------------
       a  b    c  d
c   d
one 0  0  7  one  0
    1  1  6  one  1
    2  2  5  one  2
two 0  3  4  two  0
    1  4  3  two  1
    2  5  2  two  2
    3  6  1  two  3
--------------------------------------------------------------------------
code: Python
# The inverse of set_index:
# move the index levels back into columns
frame2.reset_index()
--------------------------------------------------------------------------
     c  d  a  b
0  one  0  0  7
1  one  1  1  6
2  one  2  2  5
3  two  0  3  4
4  two  1  4  3
5  two  2  5  2
6  two  3  6  1
--------------------------------------------------------------------------
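As a quick check that `reset_index` undoes `set_index`, the sketch below rebuilds the same frame and compares the round-tripped result with the original. The names `indexed`, `restored`, `same_data`, and `partial` are illustrative. The comparison reorders the columns first, because the former index levels come back at the front.

```python
import pandas as pd

frame = pd.DataFrame({
    "a": range(7),
    "b": range(7, 0, -1),
    "c": ["one", "one", "one", "two", "two", "two", "two"],
    "d": [0, 1, 2, 0, 1, 2, 3],
})

# Move columns 'c' and 'd' into a two-level row index.
indexed = frame.set_index(["c", "d"])

# Move both levels back; they are inserted as the leading columns.
restored = indexed.reset_index()

# Same data as the original once the column order is aligned.
same_data = restored[frame.columns].equals(frame)

# Only one level can be moved back by naming it.
partial = indexed.reset_index(level="d")
```

In `partial`, `c` remains the row index while `d` becomes an ordinary column again.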