CSV Database¶
This class implements saving and loading of candle data (timestamp, open, high, low, close, volume) to and from .csv files.
- class src.boatwright.Data.CSVdatabase.CSVdatabase(source, dir=None, debug=False)¶
Read and write candles (OHLCV) data to and from .csv files.
- Parameters:
source (str) – data source name, e.g. “COINBASE”, “ALPACA”, or “YAHOO”
dir (str) – directory path, defaults to location specified in config.json “DATA_DIR”
debug (bool) – boolean toggle for printing debugging information
Note
Candle database structure:
- SOURCE
  - SYMBOL
    - YEAR
      - MONTH
        - data.csv
        - DAY
          - data.csv
          - HOUR
            - data.csv
- calc_prerequisite_start(start, prerequisite_data_length, granularity, granularity_unit)¶
calculate the date such that the prerequisite number of bars is loaded before the start date
- Parameters:
start (datetime) – start date
prerequisite_data_length (int) – number of bars to be loaded before start date
granularity (int) – data granularity
granularity_unit (str) – “MINUTE”, “HOUR” or “DAY”
- Returns:
datetime
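The calculation can be sketched as stepping back `prerequisite_data_length` bars of `granularity * granularity_unit` from the start date (an illustrative standalone version, not the library's actual implementation):

```python
from datetime import datetime, timedelta

# Assumed unit sizes in seconds (illustrative helper, not the library's API).
UNIT_SECONDS = {"MINUTE": 60, "HOUR": 3600, "DAY": 86400}

def calc_prerequisite_start(start, prerequisite_data_length, granularity, granularity_unit):
    # One bar spans granularity * granularity_unit; step back that many bars.
    step = timedelta(seconds=granularity * UNIT_SECONDS[granularity_unit])
    return start - prerequisite_data_length * step

# 200 one-minute bars before 2024-01-02 00:00 -> 2024-01-01 20:40
print(calc_prerequisite_start(datetime(2024, 1, 2), 200, 1, "MINUTE"))
```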
- get_filepath(symbol, date, level='hour')¶
for the given symbol/date, returns the filepath at the specified level
- Parameters:
symbol (str) – e.g. “AAPL” or “BTC”
date (datetime) – date to generate the filepath for
level (str) – “year”, “month”, “day”, or “hour”
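A sketch of the filepath generation, following the SOURCE/SYMBOL/YEAR/MONTH/DAY/HOUR layout described above (illustrative only; in the real class the data directory and source are held on the instance rather than passed in):

```python
from datetime import datetime
from pathlib import Path

def get_filepath(data_dir, source, symbol, date, level="hour"):
    # Add one path component per level, then point at the data.csv inside it.
    parts = [source, symbol, f"{date.year:04d}"]
    if level in ("month", "day", "hour"):
        parts.append(f"{date.month:02d}")
    if level in ("day", "hour"):
        parts.append(f"{date.day:02d}")
    if level == "hour":
        parts.append(f"{date.hour:02d}")
    return Path(data_dir, *parts, "data.csv")

print(get_filepath("data", "COINBASE", "BTC", datetime(2024, 3, 5, 14), level="day").as_posix())
# -> data/COINBASE/BTC/2024/03/05/data.csv
```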
- load(symbol, start, end, prerequisite_data_length=0, granularity=1, granularity_unit='MINUTE', verbose=False)¶
load data from the database
- Parameters:
symbol (str) – e.g. “AAPL” or “BTC”
start (datetime) – start date of data to collect
end (datetime) – end date of data to collect
prerequisite_data_length (int) – number of extra bars to load before start
granularity (int) – data granularity
granularity_unit (str) – “DAY”, “HOUR” or “MINUTE”
verbose (bool) – boolean toggle for printing progress to terminal
- Returns:
pd.DataFrame
with columns [timestamp, datetime, open, high, low, close, volume]
Note
granularity=5, granularity_unit=”MINUTE” yields 5 minute bars
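The note above can be illustrated by computing the bar duration implied by a granularity/unit pair (an assumed helper for illustration, not part of the library):

```python
# Assumed unit sizes in seconds (illustrative, not the library's API).
UNIT_SECONDS = {"MINUTE": 60, "HOUR": 3600, "DAY": 86400}

def bar_seconds(granularity, granularity_unit):
    # granularity=5, granularity_unit="MINUTE" -> one bar spans 300 seconds
    return granularity * UNIT_SECONDS[granularity_unit]

print(bar_seconds(5, "MINUTE"))  # 300
print(bar_seconds(1, "DAY"))     # 86400
```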
- make_date_chunks(data, granularity_unit)¶
takes a dataframe, and returns a list of dataframes, where each is a chunk of data with respect to the granularity_unit
- Parameters:
data (DataFrame) – data to be chunked
granularity_unit (str) – “MINUTE”, “HOUR”, or “DAY”
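The chunking idea can be sketched with plain (datetime, value) tuples in place of a DataFrame; the mapping from granularity_unit to the calendar period one file covers is an assumption based on the database structure above, not the library's internals:

```python
from datetime import datetime
from itertools import groupby

# Assumed mapping: one chunk (and one data.csv) per calendar period.
CHUNK_KEYS = {
    "MINUTE": lambda dt: (dt.year, dt.month, dt.day, dt.hour),
    "HOUR": lambda dt: (dt.year, dt.month, dt.day),
    "DAY": lambda dt: (dt.year, dt.month),
}

def make_date_chunks(rows, granularity_unit):
    # rows: (datetime, ...) tuples sorted by time; group consecutive rows
    # that fall in the same calendar period.
    key = CHUNK_KEYS[granularity_unit]
    return [list(g) for _, g in groupby(rows, key=lambda row: key(row[0]))]

rows = [
    (datetime(2024, 1, 1, 23, 59), 100.0),
    (datetime(2024, 1, 2, 0, 0), 101.0),
]
print(len(make_date_chunks(rows, "MINUTE")))  # splits at the hour boundary -> 2
```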
- save(symbol, data, granularity_unit, verbose=False)¶
save data to .csv
- Parameters:
symbol (str) – e.g. “BTC” or “AAPL”
data (DataFrame) – timestamp, datetime, open, high, low, close, volume data
granularity_unit (str) – either “MINUTE”, “HOUR”, or “DAY”
verbose (bool) – boolean toggle to print progress to terminal
- write(symbol, data, level)¶
writes data to the appropriate file. Assumes all rows belong to the same chunk
- Parameters:
symbol (str) – e.g. “AAPL” or “BTC”
data (DataFrame) – data, assumed to be one chunk of data to be appended to a single file
level (str) – file path level, either “month”, “day”, “hour”
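Appending one chunk to a single file can be sketched as follows (an illustrative standalone helper, not the library's internals): create parent directories as needed, and write the column header only when the file is new.

```python
import csv
from pathlib import Path

COLUMNS = ["timestamp", "datetime", "open", "high", "low", "close", "volume"]

def write_chunk(filepath, rows):
    # rows: one chunk of candle rows, all destined for the same file.
    path = Path(filepath)
    path.parent.mkdir(parents=True, exist_ok=True)
    new_file = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(COLUMNS)  # header only on first write
        writer.writerows(rows)
```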