ARIMA (AutoRegressive Integrated Moving Average), part 1

Wow. I’m excited!

This is my first attempt to truly predict price changes on bitcoin. Let’s see how it goes.

Following an article, I produced my very first chart:

Untitled-1

The second example from the article resulted in:

Untitled-2

It’s time to try the ARIMA package. The third code example produced another plot:

Untitled-3

Finally, at least for the shampoo dataset, let’s do some predictions:

Untitled-4

It’s time to fetch some real bitcoin data! First I need to export the last 100 transactions into a CSV file and try to plot some forecasts. Should be easy. An SQL query comes in handy:

select time, price from gdax_matches where product_id = 'BTC-USD' order by time desc limit 100 into outfile 'btc-usd-matches-100.csv' fields terminated by ',' enclosed by '' lines terminated by '\n';
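The export above has no header row and, because of `order by time desc`, lists the newest match first. As a rough sketch of how such a file could be read without editing it by hand, the snippet below uses only Python's standard `csv` module. The function name `load_matches` and the sample rows are made up for illustration, and the timestamp is kept as a plain string because its exact format depends on the column type.

```python
import csv
import io


def load_matches(handle):
    """Read (time, price) rows from a headerless CSV and return them oldest-first."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        time_text, price_text = record[0], record[1]
        rows.append((time_text, float(price_text)))
    # The SQL query sorts newest-first; a time series model expects oldest-first.
    rows.reverse()
    return rows


# Hypothetical sample in the same shape as the export (values are invented).
sample = io.StringIO(
    "2018-01-03 10:00:02,15000.50\n"
    "2018-01-03 10:00:01,15000.00\n"
    "2018-01-03 10:00:00,14999.75\n"
)
matches = load_matches(sample)
prices = [price for _, price in matches]
print(prices)
```

With a real file, the same function could be called on `open('btc-usd-matches-100.csv', newline='')` instead of the in-memory sample.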

I manually added a header row and executed the script, which resulted in this beautiful screenshot:

Untitled-1

red – predicted,
blue – original values

Source code:

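from pandas import read_csv
from matplotlib import pyplot
# The ARIMA class now lives in statsmodels.tsa.arima.model
# (the older statsmodels.tsa.arima_model module has been removed).
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

# First column is the timestamp (used as index), second column is the price.
series = read_csv('btc-usd-matches-100.csv', header=0, parse_dates=[0], index_col=0).squeeze('columns')
# The SQL export is ordered newest-first, so reverse it into chronological order.
series = series.iloc[::-1]
X = series.values
# Use the first 66% of the observations for training, the rest for testing.
size = int(len(X) * 0.66)
train, test = X[:size], X[size:]
history = list(train)
predictions = []
# Walk-forward validation: refit on everything seen so far, forecast one step ahead.
for t in range(len(test)):
    model = ARIMA(history, order=(5, 1, 0))
    model_fit = model.fit()
    yhat = model_fit.forecast()[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))
error = mean_squared_error(test, predictions)
print('Test MSE: %.3f' % error)
# Plot the original values (default blue) against the predictions (red).
pyplot.plot(test)
pyplot.plot(predictions, color='red')
pyplot.show()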
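The loop in the script is a walk-forward validation: predict one step, then add the real value to the history and repeat. A minimal sketch of that idea, separated from the model, is below. The persistence forecaster (the next price equals the last one) is my addition and not part of the script; it is only there as a baseline to put the ARIMA MSE in context. All names and the sample prices are invented for illustration.

```python
def walk_forward(values, forecast_next, train_fraction=0.66):
    """Walk-forward validation: forecast one step, then reveal the true value."""
    size = int(len(values) * train_fraction)
    history = list(values[:size])
    test = list(values[size:])
    predictions = []
    for observed in test:
        predictions.append(forecast_next(history))
        history.append(observed)
    return test, predictions


def persistence(history):
    """Naive forecast: the next price equals the last observed price."""
    return history[-1]


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Invented prices, only to show the mechanics.
prices = [100.0, 101.0, 103.0, 102.0, 104.0, 107.0]
test, predictions = walk_forward(prices, persistence)
baseline_mse = mean_squared_error(test, predictions)
print(test, predictions, baseline_mse)
```

If the ARIMA MSE on the same test split is not clearly below this baseline, the red line is probably doing little more than echoing the previous price.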

So that was a test on the last 100 transactions. Let’s try the last 1000 transactions. Here are the results:

Untitled-2

with zoom-in:

Untitled-3

But the shampoo example from the article first calculated autocorrelations. Let’s try that for the last 1000 bitcoin transactions:

Untitled-4
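An autocorrelation plot shows, for each lag k, how strongly the series correlates with a copy of itself shifted by k steps. As a small sketch of the usual sample estimator in plain Python (the function name `autocorrelation` and the toy series are mine, not taken from the article):

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0:
        raise ValueError("constant series has no defined autocorrelation")
    # Covariance between the series and itself shifted by `lag` steps.
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Toy series, only to show the mechanics.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = [autocorrelation(prices, lag) for lag in range(3)]
print(acf)
```

Lags where the value stays well away from zero are the ones that hint at how many past observations the autoregressive part of the model might need.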

Ok – looks like the prediction script needs an alteration: a lag order of 100 instead of 5 as the first value of the ARIMA order, i.e. order=(100,1,0). The new test ran into a memory error. With 50 instead of 100 it is still calculating. I will post the results tomorrow.

Oh, and by the way – there are over 2 million records in the database:

MariaDB [solocryptoprenuer]> select count(1) from gdax_matches;
+----------+
| count(1) |
+----------+
| 2084602  |
+----------+
1 row in set (6.05 sec)

Great! Sounds like a lot of training data.

Let me share some real historical trading data for the BTC-USD pair:

Have fun.

Thanks for reading,
Łukasz.
