Saturday, July 10, 2021

Are we really at the dawn of a new era: Predicting Bitcoin using time series.

I would like to begin with an exciting example of how data can be both insightful and misleading at the same time. During World War II, returning planes were analyzed for bullet marks to identify their vulnerable areas. After data had been collected from a sufficient number of planes and plotted, a clear pattern emerged: most of the bullet marks appeared on the wings and body of the plane, so it was decided to increase the armor over the wings and body. But the data was sent for a final review to Abraham Wald, a Hungarian-Jewish statistician, who pointed out a critical flaw: what if the planes that were hit anywhere other than the body or wings never made it back to base? Following this, armor was instead increased on the cockpit, engine and tail, which had very few bullet marks but were likely the truly vulnerable areas.

Read full story at https://www.trevorbragdon.com/blog/when-data-gives-the-wrong-solution.

Well! Let’s get back to Bitcoin.

With the sudden surges and crashes of Bitcoin, and a whole lot of new cryptocurrencies joining the league all of a sudden, it seems that we are standing at the dawn of a new era. Given that the next big revolution of the 21st century is nothing but blockchain technology, investors have every reason to rest assured that, like the ever-expanding universe the physicists describe, their investments in crypto are also likely to expand. However, markets are not that straightforward, and they hold surprises that are not easily perceived by the investors who have recently flooded the cryptocurrency market. So let’s analyze what we can learn from the historical data of Bitcoin. Are we standing at the dawn or at the dusk of this so-called cryptocurrency era?

To understand the relationship between cryptocurrencies and blockchain, read this full story by PwC: https://www.pwc.com/us/en/industries/financial-services/fintech/bitcoin-blockchain-cryptocurrency.html.

I will start with a short Wikipedia introduction to time series, the statistical approach on which our analysis is based, and then we will predict prices using moving average and exponential smoothing models.

A time series, as quoted by Wikipedia, is “a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.”

In our analysis we take the weekly closing prices of Bitcoin from 2017 onwards. (Data courtesy Investing.com)

Figure 1: Weekly Bitcoin closing prices (Courtesy Investing.com)

Above is the weekly chart of Bitcoin closing prices from January 2017 to 30th June 2021. We can see the sudden rise in prices from the last quarter of 2020 and then an abrupt fall in 2021. Apart from this, we can also see many rises and falls in price that seem absolutely random and unpredictable. Let us understand what we can make of this random walk of Bitcoin. To do this, we will decompose the time series. In layman’s terms, we will statistically break the data into trend, seasonality and random error (simply put, the part of the randomness in the data that cannot be explained).

Figure 2: Additive Decomposition of Time Series

The first chart from the top is the original line chart of weekly Bitcoin prices, the second is the trend captured in the weekly prices, the third is the seasonality in the data, and the last shows the data points that were captured by neither trend nor seasonality. Of the four charts above, we are primarily concerned with the seasonality in weekly prices, which can be clearly seen in the third chart. By seasonality we mean a clear pattern of rises and falls in price at fixed intervals.
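For readers who want to see what an additive decomposition does under the hood, here is a minimal standard-library sketch. It is not the code behind Figure 2; the function name `additive_decompose`, the toy series and the period of 4 are assumptions made purely for illustration (for weekly data one would presumably try a period of 52). The idea is: estimate the trend with a centred moving average, average the detrended values at each position in the cycle to get the seasonal component, and call whatever is left the residual.

```python
from statistics import mean

def additive_decompose(values, period):
    """Classical additive split: value = trend + seasonal + residual.

    Needs at least two full periods of data. Trend and residual are None
    at the edges, where a full centred window is not available.
    """
    n = len(values)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        if period % 2:
            # Odd period: plain centred average over `period` points.
            trend[i] = mean(window)
        else:
            # Even period: centred average with half weight on both ends.
            trend[i] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period

    # Average the detrended values for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i, t in enumerate(trend):
        if t is not None:
            buckets[i % period].append(values[i] - t)
    raw = [mean(b) for b in buckets]
    centre = mean(raw)
    cycle = [s - centre for s in raw]  # shift so the cycle averages to zero

    seasonal = [cycle[i % period] for i in range(n)]
    residual = [None if t is None else values[i] - t - seasonal[i]
                for i, t in enumerate(trend)]
    return trend, seasonal, residual

# Toy series (made up): a straight-line trend plus a repeating 4-step pattern.
pattern = [3, -1, -2, 0]
toy = [10 + 2 * i + pattern[i % 4] for i in range(16)]
trend, seasonal, residual = additive_decompose(toy, period=4)
print(trend[2], seasonal[:4], residual[2])
```

On this toy series the sketch recovers the straight line as the trend and the repeating pattern as the seasonality, leaving a residual of zero. Real price data will of course leave a much noisier residual.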

Now that we know that Bitcoin prices follow some sort of seasonal pattern, let us check this further using a simple boxplot from 2017 to mid-2020, i.e. right up to the point before prices started to surge northward.

Figure 3: Monthly Boxplot of Weekly closing prices of Bitcoin

And what we get above is nothing but a few colorful boxes with whiskers on top and bottom... right, but they are insightful. There are 12 boxes, one for each month. By the way, each box represents the middle 50% of the data points for that month starting from 2017, so the longer the box and its whisker (that T-shaped thing attached to the box), the wider the range of the data. So, out of the 12 boxes, we see that December, January and February have the widest range of data. You know, the price of Bitcoin was $19,698.10 at the close of November, rose to $28,949.40 by the end of December and to $45,164 by the end of February. Bizarre! Why didn’t I see that before?

The story doesn’t end here. If one looks closely, May had some outliers below the lower whisker of the box, i.e. data points where Bitcoin traded at values far lower than the middle values. And needless to say, the price of Bitcoin on 1st May was $57,807.10 and on 31st May it was $37,298.60.
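To make the boxes, whiskers and outliers a little more concrete, here is a small standard-library sketch of the numbers a monthly boxplot is built from. The function name `monthly_box_stats` and the prices are made up for illustration, and the 1.5 x IQR rule for outliers is the common plotting default, which I am assuming here rather than quoting from the chart above.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles

def monthly_box_stats(observations):
    """Group (date, price) pairs by calendar month and return box statistics."""
    by_month = defaultdict(list)
    for day, price in observations:
        by_month[day.month].append(price)

    stats = {}
    for month, prices in sorted(by_month.items()):
        # The box spans q1..q3, i.e. the middle 50% of the month's data.
        q1, median, q3 = quantiles(prices, n=4, method="inclusive")
        iqr = q3 - q1
        low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        stats[month] = {
            "q1": q1,
            "median": median,
            "q3": q3,
            # Points beyond the fences are drawn as individual outliers.
            "outliers": [p for p in prices if p < low_fence or p > high_fence],
        }
    return stats

# Made-up (date, close) pairs for illustration only; not real Bitcoin prices.
toy = [
    (date(2019, 1, 6), 50), (date(2019, 1, 13), 52), (date(2020, 1, 5), 54),
    (date(2019, 5, 5), 100), (date(2019, 5, 12), 102), (date(2019, 5, 19), 104),
    (date(2019, 5, 26), 106), (date(2020, 5, 3), 108), (date(2020, 5, 10), 40),
]
stats = monthly_box_stats(toy)
print(stats[5])
```

In the toy data, the single low close in May falls below the lower fence and is reported as an outlier, which is the same kind of point that shows up below the May box.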

But this is all history now, and it would be meaningful to dwell on these historical points only if we could draw further insights from the data. So, the first and simplest technique that can be used is moving average analysis. I will not take much time to explain it and will simply show a chart that is self-explanatory. To understand more about moving averages, check my blog https://manishbansal3003.blogspot.com/2021/06/wow-it-is-almost-so-easy-to-predict.html.

Figure 4: 2 & 9 Week Moving Average against Spot Prices
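For anyone who wants to reproduce this kind of chart, the two moving averages could be computed along the following lines. This is only a sketch: the helper name `moving_average` and the closing prices are made up for illustration, and I am assuming a plain trailing (simple) moving average.

```python
def moving_average(values, window):
    """Trailing simple moving average; None until a full window is available."""
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        averages.append(running / window if i >= window - 1 else None)
    return averages

# Made-up weekly closes for illustration only; not real Bitcoin prices.
closes = [10, 12, 14, 13, 15, 18, 17, 19, 22, 21, 24]
ma2 = moving_average(closes, 2)  # short window: reacts quickly
ma9 = moving_average(closes, 9)  # long window: smoother, but lags more
print(ma2)
print(ma9)
```

The shorter the window, the closer the average hugs the latest prices; the longer the window, the smoother the line and the later it reacts.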

We can clearly see how closely the 9-week moving average follows spot prices. But moving averages are often blamed for being late bloomers, and we need something that can give us an early signal. For that we will use the Holt-Winters approach of Triple Exponential Smoothing. I will not go into the mathematics behind the model, as it would clearly be out of scope for this article, where we are simply trying to foresee the near-term future of Bitcoin prices. But, just for the sake of understanding: I used the Holt model from statsmodels.tsa.api in Python, ran multiple iterations over the values of α, β and γ, and kept the model that came closest to the test data. Below is the chart showing how the model performed on test data (Bitcoin prices from 2021 onwards) based on the train data (Bitcoin prices before 2021).
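To give a feel for what such a search over α, β and γ involves, here is a dependency-free sketch of additive triple exponential smoothing with a small grid search. To be clear, this is not the statsmodels code used for the charts in this post: the function names, the initialisation from the first two seasons, the parameter grid and the toy series are all my own assumptions for illustration.

```python
from itertools import product

def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing; returns `horizon` forecasts.

    Needs at least two full periods of data for the simple initialisation.
    """
    first, second = series[:period], series[period:2 * period]
    trend = (sum(second) - sum(first)) / period ** 2  # average step per point
    mean_first = sum(first) / period
    mid = (period - 1) / 2
    # Seasonal offsets: first season minus its own straight-line trend.
    seasonal = [first[i] - (mean_first + trend * (i - mid)) for i in range(period)]
    level = mean_first + trend * (period - 1 - mid)  # level at end of season one

    for t in range(period, len(series)):
        s = seasonal[t % period]
        prev_level, prev_trend = level, trend
        level = alpha * (series[t] - s) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonal[t % period] = gamma * (series[t] - prev_level - prev_trend) + (1 - gamma) * s

    n = len(series)
    return [level + (h + 1) * trend + seasonal[(n + h) % period] for h in range(horizon)]

def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def grid_search(train, test, period, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Try every (alpha, beta, gamma) on the grid; keep the lowest test error."""
    best = None
    for alpha, beta, gamma in product(grid, repeat=3):
        forecast = holt_winters_additive(train, period, alpha, beta, gamma, len(test))
        error = mean_abs_error(test, forecast)
        if best is None or error < best[0]:
            best = (error, alpha, beta, gamma)
    return best

# Toy series (made up): straight-line trend plus a repeating 4-step pattern.
pattern = [5, -3, -4, 2]
toy = [100 + 2 * i + pattern[i % 4] for i in range(20)]
train, test = toy[:16], toy[16:]
best_error, best_alpha, best_beta, best_gamma = grid_search(train, test, period=4)
print(best_error, best_alpha, best_beta, best_gamma)
```

On this perfectly regular toy series every parameter combination forecasts almost exactly, so the search is only a demonstration of the mechanics. One design point worth keeping in mind: when the parameters are picked by how well they fit the test data, the test error flatters the model, so a separate validation slice would be a stricter check.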

 

 

Figure 5: Predicting Test Data using Train Data

The above chart shows that, based on the data from Jan 2017 to Dec 2020 (plotted in blue), the model has accurately predicted an unsustainable initial increase in the price of Bitcoin during 2021 (plotted in green). The original data (plotted in orange) follows a pattern similar to the model’s prediction. We will now use the same model to predict a further half-year from 30th June, i.e. from July 2021 to Dec 2021.

Figure 6: Predicted Bitcoin Prices in the 2nd half of 2021 (TES model)

From the above chart we can predict that in the short term the price of Bitcoin may come down further to the $26,000 level, but then it will rise again. With this we can see a fairly good future for Bitcoin prices. However, we must also acknowledge that there are a lot of complexities involved and nobody can be 100% sure what will happen with these complex currencies, especially when price movement is just a matter of tweets. But given the advancement in science and technology, when the world is not far from planning vacations in outer space, who knows, one day our digital wallets may be ruled by these digital currencies.

 
