ARIMA (AutoRegressive Integrated Moving Average), part 1

Wow. I’m excited!

This is my first attempt to truly predict bitcoin price changes. Let’s see how it goes.

Following an article, I produced my very first chart:


The second example from the article resulted in:


It’s time to try the ARIMA package. The third code example produced another plot:


Finally, at least for the shampoo dataset, let’s do some predictions:


It’s time to fetch some real bitcoin data! First I need to export the last 100 transactions into a CSV file and try to plot some forecasts. Should be easy. An SQL query comes in handy:

select time, price from gdax_matches where product_id = 'BTC-USD' order by time desc limit 100 into outfile 'btc-usd-matches-100.csv' fields terminated by ',' enclosed by '' lines terminated by '\n';
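
The export above has no header row and, because of `order by time desc`, lists the newest trade first. Here is a minimal sketch of reading such a file into chronological order using only the standard library. The function name `load_trades` and the sample rows (including their timestamp format) are made up for illustration; they are not taken from the real export.

```python
import csv
import io

def load_trades(handle):
    """Read headerless `time,price` rows and return them oldest-first."""
    rows = [(row[0], float(row[1])) for row in csv.reader(handle) if row]
    # The query sorts newest-first (`order by time desc`), so reverse the rows.
    rows.reverse()
    return rows

# Hypothetical sample rows; the real ones come from btc-usd-matches-100.csv.
sample = io.StringIO(
    "2018-01-03 00:00:02,15010.5\n"
    "2018-01-03 00:00:01,15000.0\n"
)
trades = load_trades(sample)
print(trades[0])
```

For the real file, `load_trades(open('btc-usd-matches-100.csv', newline=''))` would be the equivalent call, as long as no header row has been added to it.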

After manually adding a header row and executing the script, I got this beautiful screenshot:


red – predicted,
blue – original values

Source code:

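from pandas import read_csv
from matplotlib import pyplot
# statsmodels 0.13+ ships ARIMA here; the older statsmodels.tsa.arima_model path was removed
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

# column 0 is the trade time (used as index), column 1 is the price
# read_csv(squeeze=True) is gone in pandas 2.0, so squeeze the single column explicitly
series = read_csv('btc-usd-matches-100.csv', header=0, parse_dates=[0], index_col=0).squeeze(axis='columns')
X = series.values
# first 66% of the rows for training, the rest for testing
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
for t in range(len(test)):
    # refit on everything seen so far and forecast one step ahead
    model = ARIMA(history, order=(5,1,0))
    model_fit = model.fit()
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    obs = test[t]
    # add the real value to the history before the next forecast
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))
error = mean_squared_error(test, predictions)
print('Test MSE: %.3f' % error)
# plot: blue = original values, red = predicted
pyplot.plot(test, color='blue')
pyplot.plot(predictions, color='red')
pyplot.show()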
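
The script follows a walk-forward pattern: fit on the history, forecast one step, then compare the forecast with the observed value. To judge whether the resulting MSE is any good, it helps to run the same loop with a naive forecaster that simply repeats the last seen price. Below is a minimal sketch of that idea in plain Python. The names `walk_forward` and `persistence` and the sample prices are my own illustration, not part of the article or of any library.

```python
def walk_forward(values, forecast, train_fraction=0.66):
    """One-step-ahead walk-forward validation.

    `forecast` is any callable that maps the history list to the next value.
    Returns (predictions, test, mse).
    """
    size = int(len(values) * train_fraction)
    history = list(values[:size])
    test = list(values[size:])
    predictions = []
    for obs in test:
        predictions.append(forecast(history))
        history.append(obs)  # reveal the true value only after predicting
    mse = sum((p - o) ** 2 for p, o in zip(predictions, test)) / len(test)
    return predictions, test, mse

def persistence(history):
    """Naive baseline: the next price equals the last seen price."""
    return history[-1]

# Hypothetical prices, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0, 104.0, 107.0]
predictions, test, mse = walk_forward(prices, persistence)
print(predictions, test, mse)
```

If an ARIMA model cannot beat the persistence MSE on the same test split, the red line that appears to follow the blue one may be doing little more than echoing the previous value.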

So that was a test on the last 100 transactions. Let’s try the last 1000 transactions. Here are the results:


with zoom-in:


But the shampoo example from the article first calculated autocorrelations. Let’s try that for the last 1000 bitcoin transactions:


OK, it looks like the prediction script needs an alteration: 100 instead of 5 as the first value of order in the ARIMA model. The new test ran into a memory error. With 50 instead of 100 it is still calculating. I will post the results tomorrow.
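
For reference, the autocorrelations behind such a plot can be computed by hand, and one common rule of thumb is to treat a lag as significant when its autocorrelation falls outside roughly ±1.96/√N. The sketch below assumes that rule; it is not necessarily how the article or I picked the number above, and the function names and sample series are hypothetical.

```python
import math

def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return num / denom

def significant_lags(values, max_lag):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(values))
    return [k for k in range(1, max_lag + 1)
            if abs(autocorrelation(values, k)) > bound]

# Hypothetical alternating series, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(significant_lags(alternating, 3))
```

Keep in mind that the script refits the model at every step of the test loop, so a large lag order multiplies the cost of every single forecast.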

Oh, and by the way, there are over 2 million records in the database:

MariaDB [solocryptoprenuer]> select count(1) from gdax_matches;
+----------+
| count(1) |
+----------+
| 2084602  |
+----------+
1 row in set (6.05 sec)

Great! Sounds like a lot of training data.

Let me share some real historical trading data for the BTC-USD pair:

Have fun.

Thanks for reading,
