
<Python, pandas> Using variables in DataFrame.query()

Python pandas

How to use a variable inside .query() (pandas' rough equivalent of grep).

Prefix the variable name with @.

Example:

In [100]: df = pd.DataFrame({'a':[1,2,'X'],'b':[4,'X',5]})

In [101]: df
Out[101]: 
   a  b
0  1  4
1  2  X
2  X  5

In [102]: df.query('a == "X"')
Out[102]: 
   a  b
2  X  5

In [103]: s = 'X'

In [104]: df.query('a == @s')
Out[104]: 
   a  b
2  X  5
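
The same thing as a standalone script, in case it helps. This is only a minimal sketch: the function name `rows_where_a_equals` is made up for illustration and is not part of pandas. It suggests that `@` also picks up a local variable (here, a function argument) and gives the same rows as the hard-coded literal.

```python
import pandas as pd


def rows_where_a_equals(df, value):
    """Return the rows whose column 'a' equals `value` (hypothetical helper)."""
    # `@value` refers to the local Python variable `value`, not to a column.
    return df.query('a == @value')


df = pd.DataFrame({'a': [1, 2, 'X'], 'b': [4, 'X', 5]})

by_literal = df.query('a == "X"')          # value written into the expression
by_variable = rows_where_a_equals(df, 'X')  # value supplied through @

print(by_variable)
```

Because the value goes through `@`, there is no need to worry about quoting it inside the expression string.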

That works.
So what about column names?

In [105]: s = 'a'

In [106]: df.query('@s == "X"')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
C:\Anaconda3\Lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1875             try:
-> 1876                 return self._engine.get_loc(key)
   1877             except KeyError:

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4027)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3852)()

pandas\index.pyx in pandas.index.Int64Engine._check_type (pandas\index.c:7570)()

KeyError: False

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-106-12d4482fac32> in <module>()
----> 1 df.query('@s == "X"')

C:\Anaconda3\Lib\site-packages\pandas\core\frame.py in query(self, expr, inplace, **kwargs)
   2141 
   2142         try:
-> 2143             new_data = self.loc[res]
   2144         except ValueError:
   2145             # when res is multi-dimensional loc raises, but this is sometimes a

C:\Anaconda3\Lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1284             return self._getitem_tuple(key)
   1285         else:
-> 1286             return self._getitem_axis(key, axis=0)
   1287 
   1288     def _getitem_axis(self, key, axis=0):

C:\Anaconda3\Lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
   1428         # fall thru to straight lookup
   1429         self._has_valid_type(key, axis)
-> 1430         return self._get_label(key, axis=axis)
   1431 
   1432 

C:\Anaconda3\Lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
     91             raise IndexingError('no slices here, handle elsewhere')
     92 
---> 93         return self.obj._xs(label, axis=axis)
     94 
     95     def _get_loc(self, key, axis=0):

C:\Anaconda3\Lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, copy, drop_level)
   1742                                                       drop_level=drop_level)
   1743         else:
-> 1744             loc = self.index.get_loc(key)
   1745 
   1746             if isinstance(loc, np.ndarray):

C:\Anaconda3\Lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1876                 return self._engine.get_loc(key)
   1877             except KeyError:
-> 1878                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1879 
   1880         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4027)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3852)()

pandas\index.pyx in pandas.index.Int64Engine._check_type (pandas\index.c:7570)()

KeyError: False

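Ugh.
No good. Judging from `KeyError: False`, `@s` seems to be treated as the string value 'a' rather than as a column reference, so the whole expression collapses to a single False, which is then looked up as a row label. A bit underwhelming... For column names it is better to build the expression with format().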

In [107]: df.query('{} == "X"'.format(s))
Out[107]: 
   a  b
2  X  5
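
Putting the two tricks together as a rough sketch: format() for the column name, `@` for the value. The helper names `filter_by_column` and `filter_by_column_mask` are hypothetical, chosen just for this example. The second helper skips .query() entirely and uses an ordinary boolean mask, which may be simpler when the column name lives in a variable anyway.

```python
import pandas as pd


def filter_by_column(df, column, value):
    """Rows where df[column] == value, via .query() (hypothetical helper)."""
    # The column name is pasted into the expression text with format();
    # the value still goes through @, so it needs no manual quoting.
    return df.query('{} == @value'.format(column))


def filter_by_column_mask(df, column, value):
    """Same filter with a plain boolean mask, no expression string needed."""
    return df[df[column] == value]


df = pd.DataFrame({'a': [1, 2, 'X'], 'b': [4, 'X', 5]})

print(filter_by_column(df, 'a', 'X'))
print(filter_by_column_mask(df, 'b', 'X'))
```

One caveat with the format() approach: the column name ends up as bare text in the expression, so this sketch assumes it is a valid Python identifier. Newer pandas versions (0.25 and later) accept backtick-quoted names in .query() for columns with spaces; the session above looks older than that, so treat that option as version-dependent.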