I would like to begin with a striking example of how data can be insightful and misleading at the same time. During World War II, returning planes were analyzed for bullet marks to identify their vulnerable areas. After data from a sufficient number of planes had been collected and plotted, a clear pattern emerged: most bullet marks appeared on the wings and body of the plane, so it was decided to increase the armor there. Before that happened, the data was sent for a final review to Abraham Wald, a Hungarian-Jewish statistician, who pointed out a critical flaw: what if the planes that were hit anywhere other than the body or wings never made it back to base? Following his review, armor was instead increased at the cockpit, engine and tail, which showed very few bullet marks on the returning planes but were likely the truly vulnerable areas.
Read the full story at https://www.trevorbragdon.com/blog/when-data-gives-the-wrong-solution.
Well! Let’s get back to Bitcoin.
With the sudden rises and falls of Bitcoin, and a whole lot of new cryptocurrencies joining the league, it seems that we are standing at the dawn of a new era. Since blockchain technology is widely touted as the next big revolution of the 21st century, investors feel they have every reason to rest assured that, like the ever-expanding universe the physicists describe, their investments in crypto will also keep expanding. However, markets are not that straightforward, and they hold surprises that are not easily perceived by the investors who have recently flooded the cryptocurrency market. So let’s analyze what we can learn from the historical data of Bitcoin: are we standing at the dawn or at the dusk of this so-called cryptocurrency era?
To understand the relationship between cryptocurrencies and blockchain, read this full story by PwC: https://www.pwc.com/us/en/industries/financial-services/fintech/bitcoin-blockchain-cryptocurrency.html.
I will start with a short Wikipedia introduction to time series, the statistical approach on which our analysis is based, and then we will make predictions using moving average and exponential smoothing models.
A time series, as described by Wikipedia, is “a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.”
In our analysis we use the weekly closing prices of Bitcoin from 2017 onwards. (Data courtesy Investing.com)
Figure 1: Weekly Bitcoin closing prices (Courtesy Investing.com)
Above is the weekly chart of Bitcoin closing prices from January 2017 to 30th June 2021. We can see the sudden rise in prices from the last quarter of 2020 and then an abrupt fall in 2021. Apart from this, we can also see many rises and falls that seem absolutely random and hard to interpret. Let us see what we can make of this random walk of Bitcoin. To do this we will decompose the time series. In layman’s terms, we will statistically break the data into trend, seasonality and random error (simply put, the part of the variation in the data that cannot be explained by the other two). In an additive decomposition, each observed price is the sum of these three components.
Figure 2: Additive Decomposition of Time Series
The first chart from the top is the original line chart of weekly Bitcoin prices, the second is the trend captured in the weekly prices, the third is the seasonality in the data, and the last shows the residuals, i.e. the data points that were captured by neither trend nor seasonality. Of the four charts above, we are primarily concerned with the seasonality in weekly prices, which can be clearly seen in the third chart. By seasonality we mean a clear pattern of rises and falls in prices at fixed intervals.
Now that we know the prices of Bitcoin follow some sort of seasonal pattern, let us check this further using a simple boxplot of the data from 2017 to mid-2020, i.e. right before prices started to surge.
Figure 3: Monthly Boxplot of Weekly closing prices of Bitcoin
And what we get above is nothing but a few colorful boxes with whiskers on top and bottom... right, but they are insightful. There are 12 boxes, one for each month. By the way, each box represents the middle 50% of the data points for that month since 2017, so the longer the box and its whiskers (the T-shaped things attached to the box), the wider the range of the data. So, out of the 12 boxes, we see that December, January and February have the widest range of data. You know, the price of Bitcoin was $19,698.10 at the close of November 2020, rose to $28,949.40 by the end of December and reached $45,164 by the end of February 2021. Bizarre! Why didn’t I see that before?
The story doesn’t end here. If one looks closely, May has some outliers below the lower whisker of its box, i.e. data points where Bitcoin traded far below the middle values. And, sure enough, the price of Bitcoin was $57,807.10 on 1st May 2021 and $37,298.60 on 31st May.
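The numbers behind a monthly boxplot can be computed directly, which makes "widest box" and "outlier" concrete. The sketch below uses synthetic weekly prices (not the real data) and assumes the conventional 1.5 x IQR rule for outliers, which is the default in most plotting libraries. The article does not state which rule its chart used.

```python
import numpy as np
import pandas as pd

# Synthetic weekly closes for 2017 to mid-2020 (NOT real Bitcoin data).
rng = np.random.default_rng(7)
weeks = pd.date_range("2017-01-01", "2020-06-28", freq="W")
t = np.arange(len(weeks))
close = pd.Series(
    8000 + 2000 * np.sin(2 * np.pi * t / 52) + rng.normal(0, 600, len(weeks)),
    index=weeks,
    name="close",
)

# One group per calendar month, pooling that month across all years.
by_month = close.groupby(close.index.month)
stats = by_month.describe()[["25%", "50%", "75%"]].copy()
stats["iqr"] = stats["75%"] - stats["25%"]   # height of each box

# Assumed outlier rule: points more than 1.5 * IQR beyond the box edges.
q1 = by_month.transform(lambda s: s.quantile(0.25))
q3 = by_month.transform(lambda s: s.quantile(0.75))
fence = 1.5 * (q3 - q1)
outliers = close[(close < q1 - fence) | (close > q3 + fence)]

print(stats.sort_values("iqr", ascending=False).head(3))  # widest boxes
print(outliers)
```

The months at the top of the sorted table are the ones with the tallest boxes, and `outliers` lists the individual weeks that would be drawn as separate points beyond the whiskers.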
But this is all history now, and dwelling on these historical points is meaningful only if we can draw further insights from the data. The first and simplest technique we can use is moving average analysis. I will not take much time to explain it and will simply show a chart that is self-explanatory. To understand more about moving averages, check my blog: https://manishbansal3003.blogspot.com/2021/06/wow-it-is-almost-so-easy-to-predict.html.
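For readers who want to reproduce this kind of chart, 2-week and 9-week moving averages can be computed with pandas `rolling`. The series here is synthetic (not the real prices), and the column names `ma_2w` and `ma_9w` are made up for this sketch.

```python
import numpy as np
import pandas as pd

# Synthetic weekly closes built as a positive random walk (NOT real Bitcoin data).
rng = np.random.default_rng(1)
weeks = pd.date_range("2017-01-01", periods=234, freq="W")
close = pd.Series(
    10000 * np.exp(rng.normal(0, 0.05, len(weeks)).cumsum()),
    index=weeks,
    name="close",
)

frame = close.to_frame()
# Each value is the mean of the current week and the preceding weeks in the window.
frame["ma_2w"] = close.rolling(window=2).mean()
frame["ma_9w"] = close.rolling(window=9).mean()

# Average absolute distance between each moving average and the spot price.
gap = frame[["ma_2w", "ma_9w"]].sub(frame["close"], axis=0).abs().mean()

print(frame.tail())
print(gap)
```

The first `window - 1` values of each average are empty because a full window is not yet available. A shorter window stays closer to the spot price, while a longer one is smoother but reacts later.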
Figure 4: 2 & 9 Week Moving Average against Spot Prices
We can clearly see how closely the 9-week moving average follows spot prices. But moving averages are often blamed for being late bloomers, and we need something that can give us an earlier signal. For that we will use the Holt-Winters approach of triple exponential smoothing. I will not go into the mathematics behind the model, as it would be out of scope for this article, where we are simply trying to foresee the near-term future of Bitcoin prices. But, just for the sake of understanding: I used the Holt model from statsmodels.tsa.api in Python, ran multiple iterations over the values of α, β and γ, and kept the model that came closest to the test data. Below is the chart showing how the model performed on the test data (Bitcoin prices from 2021 onwards) when fitted on the train data (Bitcoin prices before 2021).
Figure 5: Predicting Test Data using Train Data
The above chart shows that, based on the data from Jan 2017 to Dec 2020 (plotted in blue), the model predicted an unsustainable initial increase in the price of Bitcoin during 2021 (plotted in green). The original data (plotted in orange) follows a pattern similar to the model’s prediction. We will now use this same model to predict a further half-year from 30th June, i.e. from July 2021 to Dec 2021.
Figure 6: Predicted Bitcoin Prices in the 2nd half of 2021 (TES model)
From the above chart, the model suggests that in the short term the price of Bitcoin may come down further to around $26,000, but will then rise again. On this reading, the outlook for Bitcoin prices looks fairly good. However, we must also acknowledge that there are a lot of complexities involved, and nobody can be 100% sure what will happen with these complex currencies, especially when price movements can be just a matter of tweets. But given the advances in science and technology, at a time when the world is not far from planning vacations in outer space, who knows, one day our digital wallets may be ruled by these digital currencies.