Published 2017-12-22 - Written by Philip Sørensen
import datetime
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data/BTC USD Historical Data.csv')

There are some missing values that we are not interested in, so they are removed. Furthermore, this dataset is in reverse chronological order (latest prices at the top), so it is flipped.
df = df.dropna()
df = df.iloc[::-1]

Let's convert the date string into a datetime.
def ConvertDate(row):
    return datetime.datetime.strptime(row['Date'], '%b %d, %Y')
df['date'] = df.apply(lambda row: ConvertDate(row), axis=1)

Let's also convert the price into a float.
def ConvertPrice(row):
    # Prices use a comma as thousands separator, which float() cannot parse
    return float(str(row['Price']).replace(",", ""))
df['price'] = df.apply(lambda row: ConvertPrice(row), axis=1)

Now let's find the peak price and only look at data around it. I have found 3 years to be a decent value for the time leading up to and after the peak.
DAYS = 3 * 365
# idxmax returns the index label of the highest price, which is what df.at expects
datePeak = df.at[df['price'].idxmax(), 'date']
dateStart = datePeak - datetime.timedelta(days=DAYS)
dateEnd = datePeak + datetime.timedelta(days=DAYS)
# between() includes both endpoints by default
df = df[df['date'].between(dateStart, dateEnd)]

Since multiple bubbles will be shown on the same graph, they need a common Y-axis. To get one, the first price in the window is taken as the initial price, and every other price is expressed relative to it.
priceInitial = df['price'].iloc[0]
def AdjustPriceInitial(row, priceInitial):
    return row['price'] / priceInitial
df['price'] = df.apply(lambda row: AdjustPriceInitial(row, priceInitial), axis=1)

To also get a common X-axis, I convert each date into its offset from the peak date.
def AdjustDateToPeak(row, datePeak):
    return row['date'] - datePeak
df['date'] = df.apply(lambda row: AdjustDateToPeak(row, datePeak), axis=1)

All of the above code can be put into a function, and then adjusted according to the dataset being handled.
def GetBitcoinDf(DAYS = 3 * 365):
    df = pd.read_csv('data/BTC USD Historical Data.csv')
    df = df.dropna()
    df = df.iloc[::-1]

    def ConvertDate(row):
        return datetime.datetime.strptime(row['Date'], '%b %d, %Y')
    df['date'] = df.apply(lambda row: ConvertDate(row), axis=1)

    def ConvertPrice(row):
        return float(str(row['Price']).replace(",", ""))
    df['price'] = df.apply(lambda row: ConvertPrice(row), axis=1)

    # idxmax returns the index label of the highest price, which is what df.at expects
    datePeak = df.at[df['price'].idxmax(), 'date']
    dateStart = datePeak - datetime.timedelta(days=DAYS)
    dateEnd = datePeak + datetime.timedelta(days=DAYS)
    # between() includes both endpoints by default
    df = df[df['date'].between(dateStart, dateEnd)]

    priceInitial = df['price'].iloc[0]
    def AdjustPriceInitial(row, priceInitial):
        return row['price'] / priceInitial
    df['price'] = df.apply(lambda row: AdjustPriceInitial(row, priceInitial), axis=1)

    def AdjustDateToPeak(row, datePeak):
        return row['date'] - datePeak
    df['date'] = df.apply(lambda row: AdjustDateToPeak(row, datePeak), axis=1)

    return df[['date', 'price']]
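Since the function is meant to be adjusted per dataset, one way to avoid copying it is to pass the dataset-specific parts in as parameters. The following is only a sketch under assumptions: `GetBubbleDf`, its parameter names, and the tiny `sample` frame are hypothetical and not part of the original code. It also swaps the row-wise `apply` calls for vectorized column operations and sorts by date instead of assuming the file is newest first.

```python
import pandas as pd


def GetBubbleDf(raw, dateCol='Date', priceCol='Price',
                dateFormat='%b %d, %Y', days=3 * 365):
    """Hypothetical generalisation of GetBitcoinDf for an already loaded frame."""
    df = raw.dropna().copy()

    # Parse whole columns at once instead of applying a function row by row.
    df['date'] = pd.to_datetime(df[dateCol], format=dateFormat)
    df['price'] = (df[priceCol].astype(str)
                   .str.replace(',', '', regex=False)
                   .astype(float))

    # Sort oldest first rather than relying on the order in the file.
    df = df.sort_values('date').reset_index(drop=True)

    # idxmax returns the index label of the highest price.
    datePeak = df.loc[df['price'].idxmax(), 'date']
    window = pd.Timedelta(days=days)
    df = df[df['date'].between(datePeak - window, datePeak + window)]

    # First price in the window becomes 1, the peak date becomes offset 0.
    df = df.assign(price=df['price'] / df['price'].iloc[0],
                   date=df['date'] - datePeak)
    return df[['date', 'price']].reset_index(drop=True)


# Made-up sample in the same layout as the Bitcoin CSV (newest row first).
sample = pd.DataFrame({
    'Date': ['Jan 05, 2018', 'Jan 04, 2018', 'Jan 03, 2018', 'Jan 02, 2018'],
    'Price': ['1,500.0', '3,000.0', '2,000.0', '1,000.0'],
})
bubble = GetBubbleDf(sample, days=1)
```

With this shape, another dataset would only need a different column name or date format passed in, assuming its file can be loaded into a frame with one date column and one price column.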
bitcoinDf = GetBitcoinDf()
plt.figure(figsize=(20, 20))
plt.plot(bitcoinDf['date'], bitcoinDf['price'], label='Bitcoin')
plt.legend()
plt.show()

For the data I have gathered (Bitcoin updated 2017-12-21) I get the following graph.
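Note that the 'date' column now holds offsets from the peak rather than calendar dates. If those offsets prove awkward as axis values, one option is to convert them to whole days before plotting. This is a minimal sketch, not part of the original code: the helper `AddDaysFromPeak`, the 'days' column, and the `offsets` frame are hypothetical names used for illustration.

```python
import pandas as pd


def AddDaysFromPeak(df):
    """Return a copy with a 'days' column: whole days relative to the peak."""
    out = df.copy()
    # .dt.days extracts the day component of each Timedelta as an integer.
    out['days'] = out['date'].dt.days
    return out


# Made-up offsets in the same layout as the frame returned by GetBitcoinDf.
offsets = pd.DataFrame({
    'date': pd.to_timedelta([-2, -1, 0, 1], unit='D'),
    'price': [1.0, 2.5, 4.0, 3.0],
})
withDays = AddDaysFromPeak(offsets)
```

The 'days' column could then be passed to `plt.plot` in place of 'date', so that 0 on the X-axis marks the peak for every bubble.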