In the next two weeks, IB's Collegiate Olympiad starts. The following describes the system I'm entering. I previously mentioned that it uses TWS's ActiveX API to connect to IB through Matlab and listed some other info not covered below. Links to all the Matlab files are at the bottom of this post.
System Process Flow
1. Load Data: modules for Yahoo and IB
2. Preprocess
3. Prediction Engine
4. Position Sizing
5. Execution
+ Backtest
Process Flow Description
1. Historical data, including the most recent period, is downloaded from Interactive Brokers.
2. OHLC numbers are converted into periodic returns and then put in the proper order, newest to oldest.
3. Support vector regression with a Gaussian kernel, using parameters (C, γ) chosen by sliding window validation, is used to predict the next period's return for each security/contract. These predictions are normalized and weighted by a confidence value. (code outline below)
4. Position sizing is manual for now.
5. The basket of orders is sent to IB through the TWS ActiveX API.
+ Backtesting is relatively efficient because most validation folds are redundant
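As a rough illustration of step 2, here is a minimal sketch, assuming close-to-close simple returns and a price matrix stored oldest first. The variable names and prices are made up for the example and are not taken from preprocess.m, which may compute returns differently.

```matlab
% Hypothetical closing prices, oldest first.
% Rows are periods, columns are contracts.
closes = [100 50; 102 49; 101 51; 104 52];

% Simple periodic returns: r(t) = P(t) / P(t-1) - 1.
rets = closes(2:end, :) ./ closes(1:end-1, :) - 1;

% Reorder so that row 1 is the newest period, as in step 2.
rets = flipud(rets);
```

Four price rows yield three return rows, and after flipud the first row holds the most recent period's return for each contract.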
Prediction Engine Code Outline
Initialize parameters
Pre-allocate array memory
Data error checking and preprocessing
For each contract (i.e. security)
    For each parameter permutation (C, γ)
        For each validation fold
            Train the SVM on the training sample
            Make test prediction and compare to known test sample
            Save all processing-time and prediction data
        End
        Calculate validation performance of (C, γ)
    End
    Chart results for human inspection
    Predict the next period's returns
End
Compute confidence from validation accuracy (correlation)
Choose out- & under-performers based on prediction z-scores*confidence
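The outline above can be sketched in a few lines for a single contract. This is only an illustration under loud assumptions: a trailing-mean predictor stands in for the SVM so the example runs without LIBSVM, its lookback length stands in for the (C, γ) grid, and every name and number here is hypothetical rather than taken from predictionengine.m.

```matlab
% --- Sliding window validation for one contract (hypothetical data) ---
rets = [0.01 -0.02 0.015 0.00 -0.01 0.02 0.005 -0.015 0.01 0.00 ...
        0.012 -0.004]';              % returns, oldest first
trainLen = 6;                         % periods in each training window
params   = [2 3 6];                   % stand-in for the (C, gamma) grid
nFolds   = numel(rets) - trainLen;    % one test point per fold

pred   = zeros(nFolds, numel(params));
actual = zeros(nFolds, 1);
for p = 1:numel(params)
    for f = 1:nFolds
        trainIdx = f : f + trainLen - 1;   % window slides forward by one
        testIdx  = f + trainLen;           % the period right after it
        window   = rets(trainIdx);
        % Stand-in "model": mean of the last params(p) training returns.
        pred(f, p) = mean(window(end - params(p) + 1 : end));
        actual(f)  = rets(testIdx);
    end
end

% Validation performance per parameter: correlation with known outcomes.
score = zeros(1, numel(params));
for p = 1:numel(params)
    c = corrcoef(pred(:, p), actual);
    score(p) = c(1, 2);
end
[bestScore, bestIdx] = max(score);    % bestScore could serve as confidence

% --- Cross-sectional ranking (hypothetical values for four contracts) ---
nextPred   = [0.004 -0.002 0.010 0.001];  % next-period predictions
confidence = [0.30 0.10 0.05 0.20];       % e.g. validation correlations

z      = (nextPred - mean(nextPred)) / std(nextPred);  % z-scores
signal = z .* confidence;                 % z-score * confidence
[~, order] = sort(signal, 'descend');
longPick  = order(1);     % strongest expected out-performer
shortPick = order(end);   % strongest expected under-performer
```

In the ranking step, contract 3 has the largest prediction but a low confidence. It still ranks first here, although its lead over contract 1 is much narrower than its raw z-score would suggest, which is the point of weighting by confidence.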
Final Words
The schematic sketch turned out to be too wordy, so I used this format instead. All of the information above is mirrored in the system's code and is intended to be used as a reader's guide. Without it, I doubt the code would be comprehensible. Some parts are simply very complex and may be hard to conceptualize for anyone other than the original author, e.g. the 6-D array "storTestPred". If you are especially interested in a certain part, such as the sliding window validation or the confidence values, please leave a comment or send me an email and I will explain in more detail, maybe posting a video if that would be clearer.
If you want to do a test run with Yahoo data, download all the files to Matlab's current directory and then execute sys2.m and predictionengine.m. Make sure you have no variables lying around by restarting Matlab or typing >> clear. Both are scripts, so you can run them at the Matlab prompt like this: >> sys2 [press enter, then wait a few seconds], >> predictionengine [press enter, then wait a minute or two]. Some numbers and charts will pop up at the end; you need to understand the code in order to understand these results.
I'm not worried about the system's strategy being "arbed away" by sharing because of its generality and flexibility. Also, it's probably challenging to understand the code if you didn't spend many hours writing it. And generally, I don't subscribe to the secretiveness of the trading subculture. I hope bits are useful. Much can be pared away to create a general system framework. The validation and data pulling components should be especially useful for that. Actually, I may gut some of the internals and make a template that's a little more fun and easy to play with later. Feel free to comment on anything.
Files: loaddata.m, makeportfolio.m, predictionengine.m, preprocess.m, svmpredict.mexw32, svmtrain.mexw32, sys2.m, and libsvm-mat-2.87-1.zip, or LIBSVM from the authors' website, using the first one on the list of "Interfaces to LIBSVM". I slightly modified my copy of LIBSVM to suppress some useless output that was slowing down validation, so I don't know if the authors' version will work exactly the same, but it should. Here's everything in one zip file if you are OK with potentially virus-infested zip files (personally I don't trust them).