ohlcv

A set of functions and classes for handling Open, High, Low, Close and Volume (OHLCV) data.

Plotting OHLC data

Functions to plot time series in OHLC format (Open, High, Low, Close) and OHLCV format (the same, plus Volume).


candlestick_plot


def candlestick_plot(
    df:DataFrame, # df with datetime index, and at least the following columns: 'Open', 'High', 'Low', 'Close', 'Volume'
    width:int=950, # width of the plot figure
    height:int=600, # height of the plot figure
    chart_title:str='-', # title of the chart
    fig:bokeh.plotting._figure.figure | None=None, # figure to allow superposition of other lines on candlestick plot
)->None:

Create a candlestick chart (Bokeh) using a dataframe with ‘Open’, ‘High’, ‘Low’, ‘Close’, ‘Volume’.

Before using the function in a notebook, you must load BokehJS, with:

from bokeh.io import output_notebook
output_notebook()
Loading BokehJS ...

Let’s load a test DataFrame and plot it.

df = load_test_df()
display(df.head(10))
candlestick_plot(df.head(10), width=800, height=400, chart_title='Candlestick plot')
               Open     High      Low    Close  Volume
dt
2018-10-22  2759.02  2779.27  2747.27  2754.48   26562
2018-10-23  2753.11  2755.36  2690.69  2743.45   38777
2018-10-24  2744.83  2748.58  2651.23  2672.80   41777
2018-10-25  2670.80  2722.90  2657.93  2680.71   39034
2018-10-26  2675.59  2692.34  2627.59  2663.57   61436
2018-10-29  2667.70  2707.00  2603.33  2639.17   44960
2018-10-30  2639.55  2689.50  2633.05  2688.50   52786
2018-10-31  2688.88  2736.76  2681.25  2704.75   32374
2018-11-01  2707.13  2741.58  2706.88  2731.90   29565
2018-11-02  2725.28  2766.28  2699.96  2723.76   41892

Note that the plot uses a full DatetimeIndex x axis, so missing bars simply appear as empty slots. This makes it possible to compare plots of time series that have different missing bars.
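
To make the idea of "empty" bars concrete, here is a minimal pandas-only sketch. It does not reproduce the internals of candlestick_plot; it only illustrates how rows that are absent from the data become visible gaps once the series is laid out on a complete date axis. The business-day calendar (freq="B") is an assumption made for this illustration, and the names bars, full_index, aligned and missing_dates are hypothetical.

```python
import pandas as pd

# Three of the daily bars shown above; 2018-10-23 and 2018-10-24 are left out on purpose.
bars = pd.DataFrame(
    {"Close": [2754.48, 2680.71, 2663.57]},
    index=pd.to_datetime(["2018-10-22", "2018-10-25", "2018-10-26"]),
)

# Assumption: a business-day calendar stands in for the "full" x axis.
full_index = pd.date_range(bars.index.min(), bars.index.max(), freq="B")

# Reindexing keeps one slot per date; dates without a bar become NaN (an empty slot).
aligned = bars.reindex(full_index)
missing_dates = aligned.index[aligned["Close"].isna()]

print(missing_dates.strftime("%Y-%m-%d").tolist())
# → ['2018-10-23', '2018-10-24']
```

Two series aligned on the same full index keep their bars at the same x positions, even when each one is missing different dates.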

Handling OHLCV data

Functions performing transformations and analysis on OHLCV data.


resample_ohlcv


def resample_ohlcv(
    df:DataFrame, # `df` with datetime index, and at least the following columns: 'Open', 'High', 'Low', 'Close'. Optional 'Volume'
    rule_str:str='W-FRI', # `DateOffset` alias for resampling. Default: 'W-FRI'. Other common values: 'D', 'B', 'W', 'M'
)->DataFrame: # resampled `df` with columns 'Open', 'High', 'Low', 'Close'. Optional 'Volume'

Resample a DataFrame with OHLCV format according to given rule string.

Resampling is applied to each of the OHLC columns and to the optional Volume column. Within each resampling period, Open takes the first() value, High the max(), Low the min(), Close the last() value, and Volume the sum().

The full list of DateOffset aliases can be found in the pandas documentation, in the offset aliases section of the time series user guide.
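
The aggregation described above can be sketched with plain pandas. This is a minimal illustration under the assumption that the function relies on DataFrame.resample followed by agg; the actual implementation may differ. The function name resample_ohlcv_sketch and the variables daily and weekly are hypothetical, and the data is the ten daily bars from the test DataFrame shown earlier.

```python
import pandas as pd

def resample_ohlcv_sketch(df: pd.DataFrame, rule_str: str = "W-FRI") -> pd.DataFrame:
    """Hypothetical stand-in for resample_ohlcv, built on pandas resample/agg."""
    # One aggregation per column: first, max, min, last for OHLC.
    agg = {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
    # Volume is optional; when present it is summed over the period.
    if "Volume" in df.columns:
        agg["Volume"] = "sum"
    return df.resample(rule_str).agg(agg)

# The ten daily bars displayed earlier (two full trading weeks).
daily = pd.DataFrame(
    {
        "Open": [2759.02, 2753.11, 2744.83, 2670.80, 2675.59,
                 2667.70, 2639.55, 2688.88, 2707.13, 2725.28],
        "High": [2779.27, 2755.36, 2748.58, 2722.90, 2692.34,
                 2707.00, 2689.50, 2736.76, 2741.58, 2766.28],
        "Low": [2747.27, 2690.69, 2651.23, 2657.93, 2627.59,
                2603.33, 2633.05, 2681.25, 2706.88, 2699.96],
        "Close": [2754.48, 2743.45, 2672.80, 2680.71, 2663.57,
                  2639.17, 2688.50, 2704.75, 2731.90, 2723.76],
        "Volume": [26562, 38777, 41777, 39034, 61436,
                   44960, 52786, 32374, 29565, 41892],
    },
    index=pd.to_datetime([
        "2018-10-22", "2018-10-23", "2018-10-24", "2018-10-25", "2018-10-26",
        "2018-10-29", "2018-10-30", "2018-10-31", "2018-11-01", "2018-11-02",
    ]),
)

# Two weekly bars, labelled with the Friday that closes each week.
weekly = resample_ohlcv_sketch(daily, rule_str="W-FRI")
print(weekly)
```

With this input, the sketch yields two rows labelled 2018-10-26 and 2018-11-02, with weekly volumes of 207586 and 201577.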

The test df has one bar (row) for each day. We can resample to aggregate data per week, where one week ends on Friday.

df_wk = resample_ohlcv(df, rule_str='W-FRI')
df_wk.head(5)
               Open     High      Low    Close  Volume
W-FRI
2018-10-26  2759.02  2779.27  2627.59  2663.57  207586
2018-11-02  2667.70  2766.28  2603.33  2723.76  201577
2018-11-09  2721.51  2817.01  2713.14  2778.60  118857
2018-11-16  2777.10  2794.23  2669.14  2740.15  170290
2018-11-23  2732.15  2746.53  2625.66  2630.36  132017
print('Days of the week for initial df:',list(df.index.day_of_week[:5]))
print('Days of the week for sampled df:',list(df_wk.index.day_of_week[:5]))
Days of the week for initial df: [0, 1, 2, 3, 4]
Days of the week for sampled df: [4, 4, 4, 4, 4]

autocorrelation_ohlcv


def autocorrelation_ohlcv(
    df:DataFrame, # `df` with a `DatetimeIndex` and 'Open', 'High', 'Low', 'Close' columns
    max_lag:int=10, # Maximum lag to consider for the autocorrelation
    ohlc_col:str='Close', # Column to use for the autocorrelation. Default: 'Close'. Options: 'Open', 'High', 'Low', 'Close'
)->Series:

Return the autocorrelation of the column selected with ohlc_col, for each lag from 1 to max_lag.

autocorrelation_ohlcv(df, max_lag=5, ohlc_col='Open')
1    0.948029
2    0.884840
3    0.826346
4    0.764883
5    0.713426
Name: Autocorrelation, dtype: float64
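
For readers who want to see what such a computation can look like, here is a minimal sketch. It assumes the autocorrelation at each lag is the Pearson correlation between the column and a lagged copy of itself, which is what pandas' Series.autocorr computes; the library may use a different estimator. The name autocorrelation_sketch and the variables closes and acf are hypothetical.

```python
import pandas as pd

def autocorrelation_sketch(df: pd.DataFrame, max_lag: int = 10, ohlc_col: str = "Close") -> pd.Series:
    """Hypothetical stand-in: Pearson autocorrelation for lags 1..max_lag."""
    column = df[ohlc_col]
    # Series.autocorr(lag) correlates the series with itself shifted by `lag` rows.
    values = {lag: column.autocorr(lag=lag) for lag in range(1, max_lag + 1)}
    return pd.Series(values, name="Autocorrelation")

# Closing prices of the ten daily bars displayed earlier.
closes = pd.DataFrame(
    {"Close": [2754.48, 2743.45, 2672.80, 2680.71, 2663.57,
               2639.17, 2688.50, 2704.75, 2731.90, 2723.76]},
    index=pd.to_datetime([
        "2018-10-22", "2018-10-23", "2018-10-24", "2018-10-25", "2018-10-26",
        "2018-10-29", "2018-10-30", "2018-10-31", "2018-11-01", "2018-11-02",
    ]),
)

# One value per lag, indexed by the lag itself.
acf = autocorrelation_sketch(closes, max_lag=3, ohlc_col="Close")
print(acf)
```

The result is indexed by lag, so acf[1] is the lag-1 autocorrelation. With only ten rows the estimates are noisy; the values shown earlier come from the full test DataFrame.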