<Python, selenium> Selecting a class name that contains a space

How do you select a class name that has a blank in it with Selenium?

First, use XPath.

In [133]: from selenium import webdriver

In [144]: d = webdriver.Chrome()

In [145]: d.get('http://nekoyukimmm.hatenablog.com/entry/2017/04/27/164336')

In [146]: d.find_element_by_xpath('//*[@id="box2-inner"]/div[3]')
Out[146]: <selenium.webdriver.remote.webelement.WebElement (session="a489f57900d9be9af6a6356176a89bd9", element="0.23540204403336795-1")>

In [148]: d.find_element_by_xpath('//*[@id="box2-inner"]/div[3]').text
Out[148]: 'カテゴリー\nselenium (2) Python (305) sentdex (2) pandas (86) Office2016 (1) numpy (7) pyplot (4) Javascript (15) jQuery (11) JSFiddle (3) flask (26) Vim (62) gspread (2) requests (2) Json (7) Google Cloud Platform (4) dos (5) conda (7) Github (11) msys2 (21) Anaconda (5) handsontable (1) Bootstrap (1) Google App Engine (5) Windows (24) matplotlib (27) seaborn (12) ATOM (5) datetime (2) SQLite (7) peewee (2) Apache (1) Beautiful Soup (7) Visual Studio (1) Linux (29) ftp (1) Werkzeug (1) html (11) tqdm (1) Unix Command (2) ssh (2) zsh (10) iPython (22) pacman (4) Cheatsheet (5) css (5) regexp (8) jinja (2) highcharts (8) socket (2) mermaid (2) Markdown (7) CDN (1) Bash (20) Network (2) Outlook (3) Pygments (1) Web (1) tmux (2) pip (5) Solarized (4) KDE (1) mintty (2) Putty (1) Go (2) peco (2) dotfiles (1) Jupyter (5) VBA (5) Excel (7) Surfingkeys (2) NeoBundle (2) SystemVerilog (1) x240 (4) MinGW+mintty (8) Sphinx (2) X (2) japandas (1) urllib (2) readability-lxml (2) html2text (1) Chrome (2) plotly (4) statistics (1) Office2013 (4) cron (1) bokeh (1) pptx (4) PIL (2) pdb (1) docopt (1) cgi (1) Jedi (4) XML (1) mpld3 (2) tput (1) iconv (1) mutt (1) Vundle (1) sed (1) Gow (2) ggplot (1) Perl (3)'

Next, try find_element_by_class_name.
Ugh, an error.

In [155]: d.find_element_by_class_name('hatena-module hatena-module-category')
---------------------------------------------------------------------------
InvalidSelectorException                  Traceback (most recent call last)

In that case, apparently you use find_element_by_xpath with div[@class='hage hage'], which matches the whole class attribute string exactly.

In [157]: d.find_element_by_xpath('//div[@class="hatena-module hatena-module-category"]')
Out[157]: <selenium.webdriver.remote.webelement.WebElement (session="a489f57900d9be9af6a6356176a89bd9", element="0.23540204403336795-1")>

In [158]: d.find_element_by_xpath('//div[@class="hatena-module hatena-module-category"]').text
Out[158]: 'カテゴリー\nselenium (2) Python (305) sentdex (2) pandas (86) Office2016 (1) numpy (7) pyplot (4) Javascript (15) jQuery (11) JSFiddle (3) flask (26) Vim (62) gspread (2) requests (2) Json (7) Google Cloud Platform (4) dos (5) conda (7) Github (11) msys2 (21) Anaconda (5) handsontable (1) Bootstrap (1) Google App Engine (5) Windows (24) matplotlib (27) seaborn (12) ATOM (5) datetime (2) SQLite (7) peewee (2) Apache (1) Beautiful Soup (7) Visual Studio (1) Linux (29) ftp (1) Werkzeug (1) html (11) tqdm (1) Unix Command (2) ssh (2) zsh (10) iPython (22) pacman (4) Cheatsheet (5) css (5) regexp (8) jinja (2) highcharts (8) socket (2) mermaid (2) Markdown (7) CDN (1) Bash (20) Network (2) Outlook (3) Pygments (1) Web (1) tmux (2) pip (5) Solarized (4) KDE (1) mintty (2) Putty (1) Go (2) peco (2) dotfiles (1) Jupyter (5) VBA (5) Excel (7) Surfingkeys (2) NeoBundle (2) SystemVerilog (1) x240 (4) MinGW+mintty (8) Sphinx (2) X (2) japandas (1) urllib (2) readability-lxml (2) html2text (1) Chrome (2) plotly (4) statistics (1) Office2013 (4) cron (1) bokeh (1) pptx (4) PIL (2) pdb (1) docopt (1) cgi (1) Jedi (4) XML (1) mpld3 (2) tput (1) iconv (1) mutt (1) Vundle (1) sed (1) Gow (2) ggplot (1) Perl (3)'

It worked.

Thanks, Stack Overflow.

stackoverflow.com
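
The error happens because find_element_by_class_name expects a single class name, while 'hatena-module hatena-module-category' is two classes separated by a space. The exact-match XPath works, but it would stop matching if the class order changed or another class were added. Below is a minimal sketch of two more tolerant selectors. The helper names css_for_classes and xpath_for_classes are made up for this example and are not part of Selenium.

```python
def css_for_classes(class_attr):
    """Turn 'a b' into the CSS selector '.a.b' (element has both classes)."""
    return "".join("." + name for name in class_attr.split())


def xpath_for_classes(class_attr, tag="*"):
    """Build an XPath that matches each class as a whole token, in any order."""
    tests = [
        "contains(concat(' ', normalize-space(@class), ' '), ' {} ')".format(name)
        for name in class_attr.split()
    ]
    return "//{}[{}]".format(tag, " and ".join(tests))


classes = "hatena-module hatena-module-category"
css = css_for_classes(classes)
xpath = xpath_for_classes(classes, tag="div")
print(css)
print(xpath)

# With the driver `d` from the session above (not run here):
#   d.find_element_by_css_selector(css)
#   d.find_element_by_xpath(xpath)
# Newer Selenium 4 releases removed the find_element_by_* helpers; there:
#   from selenium.webdriver.common.by import By
#   d.find_element(By.CSS_SELECTOR, css)
```

The CSS form is usually the shorter one to type. The XPath form is handy when the rest of the lookup is already written in XPath.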

<Python, selenium> Tried driving Chrome.

Giving selenium a quick try.

>pip install selenium

Then grab chromedriver.exe from the link below.

sites.google.com

Drop it into /usr/local/bin, which is on the PATH.
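
If webdriver.Chrome() later complains that it cannot find the driver, a quick check from Python shows whether the executable is really visible on the PATH. This is a minimal sketch using only the standard library. The helper name find_on_path is made up for this example.

```python
import shutil


def find_on_path(executable):
    """Return the full path of `executable` if it is on PATH, else None."""
    return shutil.which(executable)


location = find_on_path("chromedriver")
if location is None:
    print("chromedriver is not on PATH")
else:
    print("chromedriver found at", location)
```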

Then:

In [1]: from selenium import webdriver

In [6]: d = webdriver.Chrome()

In [7]: d.get('http://www.yahoo.co.jp')

In [9]: d.quit()

Oh, it works.

Selenium - Web Browser Automation
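
In a script, as opposed to an interactive session, it is easy to leave a Chrome window behind when an exception skips d.quit(). Below is a minimal sketch of a wrapper that always quits. The names browser and FakeDriver are made up for this example, and FakeDriver only stands in for webdriver.Chrome so the sketch runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def browser(factory):
    """Create a driver via `factory` and always call quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for webdriver.Chrome so the sketch runs without a browser."""

    def __init__(self):
        self.calls = []

    def get(self, url):
        self.calls.append(("get", url))

    def quit(self):
        self.calls.append(("quit",))


with browser(FakeDriver) as d:
    d.get("http://www.yahoo.co.jp")
    fake = d

print(fake.calls)

# Real use (needs selenium and chromedriver, not run here):
#   from selenium import webdriver
#   with browser(webdriver.Chrome) as d:
#       d.get('http://www.yahoo.co.jp')
```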

<Python, pandas, sentdex> resample

Tried resampling (resample), which pulls aggregated values out of a target DataFrame at a new frequency.

In [20]: import datetime as dt

In [21]: import pandas as pd

In [22]: import pandas_datareader.data as web

In [23]: s = dt.datetime(2000,1,1)

In [24]: e = dt.datetime(2016,12,31)


In [30]: df = web.DataReader('TXN', 'yahoo', s, e)

In [31]: df.resample('M').mean().head()
Out[31]: 
                  Open        High         Low       Close    Volume  \
Date                                                                   
2000-01-31  105.481250  107.803125  102.500000  105.242965   9585750   
2000-02-29  138.125000  143.434375  135.156250  140.340625  10091150   
2000-03-31  172.709239  177.774457  165.434783  171.730978  11085226   
2000-04-30  150.115132  156.256579  144.059211  150.125000  12271800   
2000-05-31  126.215909  128.931818  121.397727  124.840909   8540727   

            Adj Close  
Date                   
2000-01-31  41.146986  
2000-02-29  54.886878  
2000-03-31  67.163569  
2000-04-30  58.715439  
2000-05-31  56.198717  

In [32]: df['Adj Close'].resample('M').mean().head()
Out[32]: 
Date
2000-01-31    41.146986
2000-02-29    54.886878
2000-03-31    67.163569
2000-04-30    58.715439
2000-05-31    56.198717
Freq: M, Name: Adj Close, dtype: float64

In [33]: df['Adj Close'].resample('M').ohlc().head()
Out[33]: 
                 open       high        low      close
Date                                                  
2000-01-31  40.218784  44.519213  36.553646  42.140764
2000-02-29  44.242913  64.971085  44.242913  64.971085
2000-03-31  64.286664  73.379683  57.491344  62.575612
2000-04-30  58.957959  65.117746  54.362563  63.718405
2000-05-31  62.593675  62.593675  50.074940  56.529912

The place that always helps me out.

sinhrks.hatenablog.com

Manual.

pandas.DataFrame.resample — pandas 0.19.2 documentation

Table of anchored offsets.
http://pandas.pydata.org/pandas-docs/stable/timeseries.html#anchored-offsets
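
The session above depends on a remote data source that may not always be reachable, so here is a minimal sketch of the same resample calls on synthetic data. The values and the name prices are made up. Passing pd.offsets.MonthEnd() instead of the 'M' alias is a choice made for this example, on the assumption that the offset object is accepted the same way across pandas versions.

```python
import pandas as pd

# 60 daily points: 2000-01-01 .. 2000-02-29 (2000 is a leap year)
idx = pd.date_range("2000-01-01", periods=60, freq="D")
prices = pd.Series(range(60), index=idx, dtype=float, name="price")

month_end = pd.offsets.MonthEnd()

# One row per month, labelled with the month-end date
monthly_mean = prices.resample(month_end).mean()

# First, max, min and last value within each month
monthly_ohlc = prices.resample(month_end).ohlc()

print(monthly_mean)
print(monthly_ohlc)
```

January holds the values 0 to 30 and February 31 to 59, so the result is easy to check by hand.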

<pandas, Python, sentdex> Python Programming for Finance

www.youtube.com

Tried it.

In [1]: import datetime as dt

In [2]: import matplotlib.pyplot as plt

In [3]: from matplotlib import style

In [4]: import pandas as pd

In [5]: import pandas_datareader.data as web

In [6]: style.use('ggplot')

In [7]: s = dt.datetime(2000,1,1)

In [8]: e = dt.datetime(2016,12,31)

In [9]: df = web.DataReader('TSLA', 'yahoo', s, e)

In [10]: df.head()
Out[10]: 
                 Open   High        Low      Close    Volume  Adj Close
Date                                                                   
2010-06-29  19.000000  25.00  17.540001  23.889999  18766300  23.889999
2010-06-30  25.790001  30.42  23.299999  23.830000  17187100  23.830000
2010-07-01  25.000000  25.92  20.270000  21.959999   8218800  21.959999
2010-07-02  23.000000  23.10  18.709999  19.200001   5139800  19.200001
2010-07-06  20.000000  20.00  15.830000  16.110001   6866900  16.110001

Hmm, I see.

www.youtube.com

Then I did part 3 as well.

In [11]: df['100ma'] = df['Adj Close'].rolling(window=100, min_periods=0).mean()

In [12]: df.head()
Out[12]: 
                 Open   High        Low      Close    Volume  Adj Close  \
Date                                                                      
2010-06-29  19.000000  25.00  17.540001  23.889999  18766300  23.889999   
2010-06-30  25.790001  30.42  23.299999  23.830000  17187100  23.830000   
2010-07-01  25.000000  25.92  20.270000  21.959999   8218800  21.959999   
2010-07-02  23.000000  23.10  18.709999  19.200001   5139800  19.200001   
2010-07-06  20.000000  20.00  15.830000  16.110001   6866900  16.110001   

                100ma  
Date                   
2010-06-29  23.889999  
2010-06-30  23.860000  
2010-07-01  23.226666  
2010-07-02  22.220000  
2010-07-06  20.998000  

In [13]: ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)

In [14]: ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)

In [15]: ax1.plot(df.index, df['Adj Close'])
Out[15]: [<matplotlib.lines.Line2D at 0xb9c5c88>]

In [16]: ax1.plot(df.index, df['100ma'])
Out[16]: [<matplotlib.lines.Line2D at 0xb463f98>]

In [17]: ax2.bar(df.index, df['Volume'])
Out[17]: <Container object of 1640 artists>

In [18]: plt.show()

[Figure: Adj Close and 100ma in the top panel, Volume bars in the bottom panel]
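
The first 100ma values in Out[12] are not NaN because min_periods=0 lets the window average whatever rows exist so far. Below is a pure-Python sketch of that behaviour, fed with the first five Adj Close values from Out[12]. The helper rolling_mean is made up for illustration and is not how pandas implements rolling.

```python
def rolling_mean(values, window):
    """Mean over the last `window` values, using fewer rows near the start."""
    out = []
    for i in range(len(values)):
        # Slice is shorter than `window` until enough rows have accumulated
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


adj_close = [23.889999, 23.830000, 21.959999, 19.200001, 16.110001]
ma = rolling_mean(adj_close, window=100)
for value in ma:
    print(round(value, 6))
```

With fewer than 100 rows available, each value is simply the mean of all rows up to that date, which is what the 100ma column shows.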

<Python, pandas> Shifting vertically.

Shift vertically.

In [22]: df = pd.DataFrame({'a':[1,2,3,4,5,6]})

In [23]: df
Out[23]: 
   a
0  1
1  2
2  3
3  4
4  5
5  6

In [24]: df.shift(-1)
Out[24]: 
     a
0  2.0
1  3.0
2  4.0
3  5.0
4  6.0
5  NaN

In [25]: df.shift(1)
Out[25]: 
     a
0  NaN
1  1.0
2  2.0
3  3.0
4  4.0
5  5.0

Hmm.

You can shift horizontally too.

In [26]: df = pd.DataFrame([[1,2,3],[4,5,6]])

In [27]: df
Out[27]: 
   0  1  2
0  1  2  3
1  4  5  6

In [28]: df.shift(-1,axis=1)
Out[28]: 
     0    1   2
0  2.0  3.0 NaN
1  5.0  6.0 NaN

In [29]: df.shift(-1,axis=0)
Out[29]: 
     0    1    2
0  4.0  5.0  6.0
1  NaN  NaN  NaN

I see.

Manual for shift.

pandas.DataFrame.shift — pandas 0.19.2 documentation
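
A common use of shift is comparing each row with the previous one. Below is a minimal sketch on the same frame. The column name change is made up for this example, and fill_value assumes pandas 0.24 or later, where it fills the gap left by the shift so the column keeps its integer dtype.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6]})

# Difference from the previous row; the first row has no predecessor, so NaN
df['change'] = df['a'] - df['a'].shift(1)

# fill_value replaces the NaN that shift introduces, so 'a' stays integer
shifted = df[['a']].shift(1, fill_value=0)

print(df)
print(shifted)
```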

<Python, numpy> Infinity

Did you know? In Python, infinity is apparently expressed as np.inf or float('inf').

In [1]: float('inf')
Out[1]: inf

In [2]: float('inf') == 0
Out[2]: False

In [3]: float('inf') < 1
Out[3]: False

In [4]: float('inf') > 1
Out[4]: True

In [5]: import numpy as np

In [6]: np.inf
Out[6]: inf

In [7]: float('inf') == np.inf
Out[7]: True

In [8]: np.inf < 50000000 * 500000000
Out[8]: False

In [9]: -np.inf
Out[9]: -inf

In [10]: -np.inf < 0
Out[10]: True

In [11]: type(np.inf)
Out[11]: float

In [12]: np.isinf(np.inf)
Out[12]: True

In [13]: np.isinf(float('inf'))
Out[13]: True

Manual for numpy.isinf.

numpy.isinf — NumPy v1.12 Manual

Addendum.

In [14]: 0 / np.inf
Out[14]: 0.0

In [15]: np.inf / np.inf
Out[15]: nan

In [16]: 1 / np.inf
Out[16]: 0.0

In [17]: np.inf - np.inf
Out[17]: nan

In [18]: 1 * np.inf
Out[18]: inf

In [19]: 0 * np.inf
Out[19]: nan

In [20]: np.inf * np.inf
Out[20]: inf
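
The nan results in Out[15], Out[17] and Out[19] are the indeterminate forms of floating-point arithmetic: inf / inf, inf - inf and 0 * inf have no defined value. Below is a minimal sketch, using only the standard library, of one practical use of infinity (a starting value for a minimum search) and of the fact that nan never compares equal to itself. The helper name smallest is made up for this example.

```python
import math


def smallest(values):
    """Return the minimum, starting the search from +infinity."""
    best = math.inf  # same value as float('inf')
    for v in values:
        if v < best:
            best = v
    return best


print(smallest([3, 1, 2]))       # 1
print(smallest([]))              # inf: nothing was smaller
print(math.isinf(float('inf')))  # True

nan = math.inf - math.inf
print(nan == nan)                # False: nan never equals itself
print(math.isnan(nan))           # True: use isnan to test for it
```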