output_notebook()
ohlcv
Plotting OHLC data
Functions to plot time series in OHLC format (Open, High, Low, Close) and OHLCV format (OHLC plus Volume).
candlestick_plot
def candlestick_plot(
df:DataFrame, # df with datetime index, and at least the following columns: 'Open', 'High', 'Low', 'Close', 'Volume'
width:int=950, # width of the plot figure
height:int=600, # height of the plot figure
chart_title:str='-', # title of the chart
fig:bokeh.plotting._figure.figure | None=None, # figure to allow superposition of other lines on candlestick plot
)->None:
Create a candlestick chart (Bokeh) using a dataframe with ‘Open’, ‘High’, ‘Low’, ‘Close’, ‘Volume’.
Before using the function in a notebook, you must load BokehJS by calling Bokeh's output_notebook().
Let’s load a test DataFrame and plot it.
df = load_test_df()
display(df.head(10))
candlestick_plot(df.head(10), width=800, height=400, chart_title='Candlestick plot')
| dt | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2018-10-22 | 2759.02 | 2779.27 | 2747.27 | 2754.48 | 26562 |
| 2018-10-23 | 2753.11 | 2755.36 | 2690.69 | 2743.45 | 38777 |
| 2018-10-24 | 2744.83 | 2748.58 | 2651.23 | 2672.80 | 41777 |
| 2018-10-25 | 2670.80 | 2722.90 | 2657.93 | 2680.71 | 39034 |
| 2018-10-26 | 2675.59 | 2692.34 | 2627.59 | 2663.57 | 61436 |
| 2018-10-29 | 2667.70 | 2707.00 | 2603.33 | 2639.17 | 44960 |
| 2018-10-30 | 2639.55 | 2689.50 | 2633.05 | 2688.50 | 52786 |
| 2018-10-31 | 2688.88 | 2736.76 | 2681.25 | 2704.75 | 32374 |
| 2018-11-01 | 2707.13 | 2741.58 | 2706.88 | 2731.90 | 29565 |
| 2018-11-02 | 2725.28 | 2766.28 | 2699.96 | 2723.76 | 41892 |
Note that the plot uses a full DatetimeIndex x axis, so missing bars simply appear as empty slots. This makes it possible to compare plots of time series that have different missing bars.
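To illustrate what a full datetime axis with empty slots means, here is a minimal sketch that reindexes a small frame onto a continuous daily range. It is not taken from candlestick_plot itself; the frame `bars` and the variable names are hypothetical, and only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical daily bars with a gap: there is no row for 2018-10-24.
bars = pd.DataFrame(
    {"Close": [2754.48, 2743.45, 2680.71]},
    index=pd.DatetimeIndex(["2018-10-22", "2018-10-23", "2018-10-25"], name="dt"),
)

# Build the continuous daily range between the first and last bar,
# then reindex on it: dates without a bar become NaN rows.
full_index = pd.date_range(bars.index.min(), bars.index.max(), freq="D", name="dt")
bars_full = bars.reindex(full_index)

print(bars_full)
```

Two series reindexed onto the same range share identical x positions, so a gap in one does not shift the bars of the other.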
Handling OHLCV data
Functions performing transformations and analyses on OHLCV data.
resample_ohlcv
def resample_ohlcv(
df:DataFrame, # `df` with datetime index, and at least the following columns: 'Open', 'High', 'Low', 'Close'. Optional: 'Volume'
rule_str:str='W-FRI', # `DateOffset` alias for resampling. Default: 'W-FRI'. Other common values: 'D', 'B', 'W', 'M'
)->DataFrame: # resampled `df` with columns 'Open', 'High', 'Low', 'Close'. Optional 'Volume'
Resample a DataFrame with OHLCV format according to given rule string.
The resampling is applied to each of the OHLC columns and to the optional Volume column. The aggregation applies first() to Open, max() to High, min() to Low, last() to Close and sum() to Volume.
The list of all DateOffset alias strings can be found in the pandas documentation.
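The aggregation described above can be sketched with plain pandas as follows. This is an illustrative approximation, not the actual source of resample_ohlcv; the function name `resample_ohlcv_sketch` and the frame `daily` are hypothetical. The sample rows reuse the first five bars of the test DataFrame shown earlier.

```python
import pandas as pd

def resample_ohlcv_sketch(df: pd.DataFrame, rule_str: str = "W-FRI") -> pd.DataFrame:
    """Illustrative only: aggregate OHLC(V) bars per resampling period."""
    # One aggregation per column, as described in the text above.
    agg = {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
    if "Volume" in df.columns:  # Volume is optional
        agg["Volume"] = "sum"
    return df.resample(rule_str).agg(agg)

# First five daily bars of the test DataFrame (Monday to Friday).
daily = pd.DataFrame(
    {
        "Open": [2759.02, 2753.11, 2744.83, 2670.80, 2675.59],
        "High": [2779.27, 2755.36, 2748.58, 2722.90, 2692.34],
        "Low": [2747.27, 2690.69, 2651.23, 2657.93, 2627.59],
        "Close": [2754.48, 2743.45, 2672.80, 2680.71, 2663.57],
        "Volume": [26562, 38777, 41777, 39034, 61436],
    },
    index=pd.date_range("2018-10-22", periods=5, freq="D", name="dt"),
)

weekly = resample_ohlcv_sketch(daily, "W-FRI")
print(weekly)
```

With these five rows the sketch yields a single weekly bar labelled 2018-10-26, whose Volume is the sum of the five daily volumes.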
The test df has one bar (row) for each day. We can resample to aggregate data per week, where one week ends on Friday.
df_wk = resample_ohlcv(df, rule_str='W-FRI')
df_wk.head(5)
| W-FRI | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2018-10-26 | 2759.02 | 2779.27 | 2627.59 | 2663.57 | 207586 |
| 2018-11-02 | 2667.70 | 2766.28 | 2603.33 | 2723.76 | 201577 |
| 2018-11-09 | 2721.51 | 2817.01 | 2713.14 | 2778.60 | 118857 |
| 2018-11-16 | 2777.10 | 2794.23 | 2669.14 | 2740.15 | 170290 |
| 2018-11-23 | 2732.15 | 2746.53 | 2625.66 | 2630.36 | 132017 |
print('Days of the week for initial df:',list(df.index.day_of_week[:5]))
print('Days of the week for sampled df:',list(df_wk.index.day_of_week[:5]))
Days of the week for initial df: [0, 1, 2, 3, 4]
Days of the week for sampled df: [4, 4, 4, 4, 4]
autocorrelation_ohlcv
def autocorrelation_ohlcv(
df:DataFrame, # `df` with `DatetimeIndex` and columns 'Open', 'High', 'Low', 'Close'
max_lag:int=10, # Maximum lag to consider for the autocorrelation
ohlc_col:str='Close', # Column to use for the autocorrelation. Default: 'Close'. Options: 'Open', 'High', 'Low', 'Close'
)->Series:
Return the autocorrelation of the column selected with ohlc_col, for each lag from 1 to max_lag.
autocorrelation_ohlcv(df, max_lag=5, ohlc_col='Open')
1 0.948029
2 0.884840
3 0.826346
4 0.764883
5 0.713426
Name: Autocorrelation, dtype: float64
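One way to obtain a Series of this shape is sketched below. It assumes the autocorrelation is the Pearson correlation between the column and a lagged copy of itself, as computed by pandas' Series.autocorr; the source does not confirm that autocorrelation_ohlcv works this way. The function name `autocorrelation_sketch` and the frame `ramp` are hypothetical.

```python
import pandas as pd

def autocorrelation_sketch(df: pd.DataFrame, max_lag: int = 10, ohlc_col: str = "Close") -> pd.Series:
    """Illustrative only: lag-k autocorrelation of one column, for k = 1..max_lag."""
    series = df[ohlc_col]
    # Series.autocorr(lag) correlates the series with itself shifted by `lag` rows.
    values = {lag: series.autocorr(lag=lag) for lag in range(1, max_lag + 1)}
    return pd.Series(values, name="Autocorrelation")

# Hypothetical data: a straight ramp, which is perfectly correlated with its lags.
ramp = pd.DataFrame(
    {"Close": [float(i) for i in range(1, 11)]},
    index=pd.date_range("2018-10-22", periods=10, freq="D", name="dt"),
)

result = autocorrelation_sketch(ramp, max_lag=3, ohlc_col="Close")
print(result)
```

The result is indexed by lag and named Autocorrelation, matching the layout of the output shown above.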