Application of Machine Learning to Financial Trading


January 2, 2015
Some slides borrowed from Andrew Moore's lectures and Yaser Abu-Mostafa's lectures.

About Us
- Our goal: to use advanced mathematical and statistical concepts to create situational trading algorithms generating uncorrelated alpha.
- Our background: a mathematician with some market experience started AlgoAnalytics in October 2009.
- Global equivalent: systematic (non-discretionary) managed futures advisors.
- CEO: Aniruddha Pant, PhD (Berkeley, USA). Financial engineering, quantitative trading, derivative trading, hedging, analytics/machine learning, control theory.
- CFO: Girish Patil, BE, PGDBA. Fundamental equity research covering Indian, US and Middle East markets. Experience in technical trading of markets.
- Team: 6 additional quantitative analysts.
- Contact: Aniruddha Pant, +91-9822873624, apant@algoanalytics.com, www.algoanalytics.com

Outline
- What is machine learning
  - Binary classification
- What are we trying to classify
  - Why is this problem unique
- Machine Learning Techniques
  - Different Techniques
  - Support Vector Machines (SVM)
  - Ensemble Learning
  - Unsupervised Learning
  - Overfitting
- Approach
- Newer techniques
  - MKL
  - Deep Learning
- Money Management
- What we do?
- AlgoAnalytics Portfolio

DEFINING & UNDERSTANDING THE PROBLEM

What is machine learning?
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." (Tom M. Mitchell)

Daily Returns of NIFTY Index since January 2003
- Daily return average: 0.081%; standard deviation: 0.016
- Ratio of Std/Mean = 19.27
- Kurtosis: 13
- 2% of the moves are bigger than 3 sigma
- 3 moves bigger than 6 sigma in about 2,800 days
- 6-sigma moves about 350 times more likely than under a Gaussian
- Non-stationary distribution
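The fat-tail statistics above can be computed from any return series. The following is a minimal sketch, not the authors' code: the return series is synthetic (a hypothetical mixture of calm and volatile days), and kurtosis is taken as the raw fourth-moment ratio, for which a Gaussian gives 3. The slide does not say which kurtosis convention it uses.

```python
import math
import random
import statistics


def tail_stats(returns, n_sigma=3.0):
    """Return (kurtosis, observed tail fraction, Gaussian tail fraction)."""
    mean = statistics.fmean(returns)
    std = statistics.pstdev(returns)
    # Raw (Pearson) kurtosis: fourth central moment over variance squared.
    # A Gaussian gives 3; larger values indicate fat tails.
    m4 = statistics.fmean([(r - mean) ** 4 for r in returns])
    kurtosis = m4 / std ** 4
    # Fraction of days whose move exceeds n_sigma standard deviations.
    observed = sum(abs(r - mean) > n_sigma * std for r in returns) / len(returns)
    # Two-sided Gaussian probability of exceeding n_sigma.
    gaussian = math.erfc(n_sigma / math.sqrt(2))
    return kurtosis, observed, gaussian


# Hypothetical data: 5% of days are drawn from a much more volatile regime.
rng = random.Random(0)
returns = [
    rng.gauss(0.0008, 0.04 if rng.random() < 0.05 else 0.012)
    for _ in range(2800)
]

kurt, observed, gaussian = tail_stats(returns)
print(f"kurtosis {kurt:.1f}, moves beyond 3 sigma {observed:.2%}, "
      f"Gaussian expectation {gaussian:.2%}")
```

For a Gaussian, only about 0.27% of moves exceed 3 sigma, which is the baseline against which the 2% figure on the slide reads as a fat tail.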

Autocorrelation of Daily Returns
- Mean absolute daily move: 1.1%
- 52% accuracy leads to losses/break-even
- 56% accuracy leads to phenomenal profit
- A 4% improvement over break-even accuracy leads to 8.8% profit every 100 days, which is huge!
- Working very close to randomness
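The 8.8% figure follows from simple arithmetic: over 100 days, 4 extra correct calls replace 4 wrong ones, a swing of 8 mean absolute moves of 1.1% each. A small sketch of that calculation, assuming every day's move equals the mean absolute move and that 52% is the break-even accuracy (the function name and parameters are illustrative):

```python
def profit_per_period(accuracy, breakeven=0.52, mean_abs_move=1.1, days=100):
    """Approximate net profit (in percent) over `days` trading days.

    Each point of accuracy above break-even turns one losing day into a
    winning day, which is a swing of two mean absolute moves.
    """
    edge = accuracy - breakeven
    return 2 * edge * mean_abs_move * days


print(round(profit_per_period(0.56), 2))  # 8.8
print(round(profit_per_period(0.52), 2))  # 0.0
```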

Random Trading Systems: Pitfalls of working with close to random systems
- Daily signals generated randomly 100 times
- Only constraint: number of positive moves same as in the original dataset
- Best random system accuracy: 53.1%
- Worst random system accuracy: 47.3%
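The experiment above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the up/down series is synthetic (2,800 hypothetical days with roughly 53% up moves), and each random system is a shuffle of that series, which keeps the number of positive calls equal to the number of positive moves.

```python
import random


def random_system_accuracies(moves, n_systems=100, seed=0):
    """Accuracy of n_systems random signal series against the true moves."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_systems):
        signals = moves[:]      # same number of up and down calls as the data
        rng.shuffle(signals)    # but placed on random days
        hits = sum(s == m for s, m in zip(signals, moves))
        accuracies.append(hits / len(moves))
    return accuracies


# Hypothetical up/down history; True means the index closed up that day.
data_rng = random.Random(42)
moves = [data_rng.random() < 0.53 for _ in range(2800)]

accs = random_system_accuracies(moves)
print(f"best {max(accs):.1%}, worst {min(accs):.1%}")
```

The spread between the best and worst random system is the noise band: a backtested accuracy has to clear it by a wide margin before it can be taken as evidence of skill.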

MACHINE LEARNING TECHNIQUES

Supervised vs. Unsupervised Learning
- Supervised Learning
  - Goal: to learn a classification/regression model
  - TASK: well defined (the target function)
  - EXPERIENCE: training data with teacher provided
  - PERFORMANCE: error/accuracy on the task
  - Primarily, supervised learning is used in the case of financial data
- Unsupervised Learning
  - Goal: to find structure in the data
  - TASK: vaguely defined
  - No TEACHER
  - No PERFORMANCE (but there are some evaluation metrics)

Supervised Learning Techniques
- Decision Trees
  - Flow-chart like structure
  - Valuable with small width datasets
  - Maps observations of an item to a conclusion about the item's target value
- Random Forests
  - Extension of single classification trees
  - Many classification trees grown into a FOREST
  - High accuracy and efficient on large databases
- Artificial Neural Networks
  - Analogous to biological neural networks
  - Used to find complex data patterns
  - Interconnected artificial neurons used for computation
- Logistic Regression
  - Probabilistic statistical classification model
  - Binary predictor
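As an illustration of the last item, a probabilistic binary predictor, here is a minimal logistic regression fitted by batch gradient descent in plain Python. The two features and the six training rows are hypothetical toy values, and the learning rate and epoch count are arbitrary choices. It is a sketch of the technique, not the model used in the presentation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: two features per day, label 1 = up day, 0 = down day.
X = [[-2.0, -1.0], [-1.5, -0.5], [-1.0, -1.5],
     [1.0, 1.5], [1.5, 0.5], [2.0, 1.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
print(round(predict_proba(w, b, [1.0, 1.0]), 3))
```

The output is a probability rather than a hard label, so a trading rule can act only when the predicted probability is far enough from 0.5.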

Supervised Learning Techniques (continued)
- SVM
  - Used for classification and regression analysis
  - Constructs a hyperplane in high-dimensional space with maximum margin
  - Most widely used and popular method
- Multiple Kernel Learning
  - Extension of the kernel trick used to handle non-linear classification
  - Combines information from multiple sources
- Deep Learning
  - Attempts to model high-level abstractions in data
  - Model architecture composed of multiple non-linear transformations
  - Uses many layers of non-linear processing units for feature extraction and transformation
- Bayesian Networks
  - Probabilistic model
  - Based on Bayes' rule
  - Assumption that input attributes are independent (strictly, this is the naive Bayes special case)
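To make the maximum-margin idea concrete, the sketch below trains a linear SVM by batch subgradient descent on the regularised hinge loss. It is a simplified stand-in, assuming a linear kernel, no bias term, and a hypothetical six-point toy dataset. Production SVMs use dedicated solvers and kernels, and nothing here is taken from the authors' implementation.

```python
def fit_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * |w|^2 + mean hinge loss by batch subgradient descent.

    Labels must be +1 or -1. No bias term is fitted, so the separating
    hyperplane passes through the origin.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the regulariser
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Classify x by the side of the hyperplane it falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Hypothetical, linearly separable toy data.
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0],
     [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]

w = fit_linear_svm(X, y)
print([predict(w, xi) for xi in X])
```

Only points with margin below 1 contribute to the update, which is why the fitted hyperplane ends up determined by the points closest to the boundary.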

WHAT WE DO: FINANCIAL MARKETS

Try to predict many things which look like this
- Daily return average: 0.081%; standard deviation: 0.016
- Ratio of Std/Mean = 19.27
- Kurtosis: 13
- 2% of the moves are bigger than 3 sigma
- 3 moves bigger than 6 sigma in about 2,800 days
- 6-sigma moves about 350 times more likely than under a Gaussian
- Non-stationary distribution

AA Portfolio
- Intra-Day Low Frequency
  - Daily predictions using machine learning techniques
  - Predictions based on economic factors affecting the underlying security
- Market Neutral
  - Pair Trading: long-short pairs of Nifty stocks and indices
  - Market neutrality achieved by making the pair beta neutral
  - Based on the idea of statistical arbitrage
- Multi-Day Directional Strategy
  - Momentum Strategy: identifying momentum in stocks/indices
  - Mean-Reversion Strategy: assumption that each security returns to its historical mean
- Options
  - Alpha comes from underlying direction
  - Butterfly spread: long ITM strike, short 2 ATM strikes, long OTM strike
  - No naked short options
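The butterfly structure can be checked with a short payoff calculation. This sketch assumes the spread is built from call options with equally spaced strikes and ignores the premium paid. The slide does not say whether calls or puts are used, and the strike levels below are hypothetical.

```python
def call_payoff(spot, strike):
    """Payoff of one long call at expiry."""
    return max(spot - strike, 0.0)


def butterfly_payoff(spot, itm, atm, otm):
    """Long 1 ITM call, short 2 ATM calls, long 1 OTM call (premium ignored)."""
    return (call_payoff(spot, itm)
            - 2 * call_payoff(spot, atm)
            + call_payoff(spot, otm))


# Hypothetical strikes around an index level of 8000.
strikes = dict(itm=7900.0, atm=8000.0, otm=8100.0)
for spot in (7800.0, 7950.0, 8000.0, 8050.0, 8200.0):
    print(spot, butterfly_payoff(spot, **strikes))
```

The payoff peaks at the middle strike and never falls below zero, because the two short calls are always covered by the long wings. That is the sense in which the structure carries no naked short options.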

Portfolio Performance
[Figure: Equity Curve, AA Portfolio vs. NiftyBees, Jan-10 to Jan-14]
AA Equity Backtesting Performance
Backtesting period: 4 Jan 2010 to 31 Oct 2014
- AA Equity: annualized returns 15.86%; drawdown 4.80%; max DD period 4 months; leverage factor 1; max loss in Rs. L 48; Sharpe 3.75; Calmar ratio 3.30
- NiftyBees: annualized returns 10.63%; drawdown 27.50%; max DD period 38 months; Sharpe 0.62; Calmar ratio 0.39
*Nifty BeES, an ETF tracking the S&P CNX Nifty index, is used as the benchmark.
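The Calmar ratios in the table are consistent with the usual definition, annualized return divided by maximum drawdown. The sketch below reproduces them from the reported figures and shows one common way to compute maximum drawdown from an equity curve. The sample equity series is hypothetical, and the slide does not state how the authors computed drawdown.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline as a fraction of the running peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar_ratio(annualized_return, max_dd):
    """Annualized return divided by maximum drawdown (same units for both)."""
    return annualized_return / max_dd


print(round(calmar_ratio(15.86, 4.80), 2))   # 3.3  (AA Equity)
print(round(calmar_ratio(10.63, 27.50), 2))  # 0.39 (NiftyBees)

# Hypothetical equity curve: falls from 1.2 to 0.9, a 25% drawdown.
print(round(max_drawdown([1.0, 1.2, 0.9, 1.1, 1.5, 1.2]), 2))  # 0.25
```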

WHAT WE DO: OTHER DOMAINS

Some of our previous work and future possibilities
- BFSI
  - Trading Strategy and Analysis
  - Bank Credit Classification
  - Portfolio Analysis
  - Financial Market Forecasting
  - Predict customer interest in Caravan Insurance Policy
  - Predictive Customer Relationship Analytics (CRA)
  - Risk management and prediction
  - Future Work
    - Detect money laundering
    - Customer segmentation and Branding
- Healthcare
  - Recognizing potential Pulmonary Embolism candidates from CAT scan data
  - Hepatitis B and Hepatitis C patients using non-biopsy test data
  - Cancer cell classification
  - Future Work
    - Patient care aid
    - Predict premature birth based on peptide biomarkers
    - Risk of death in surgery
    - Hospital admission: predict readmission for same illness

Some of our previous work and future possibilities (continued)
- Human Resources Management
  - Manpower Asset Allocation
  - Recruitment Model
  - Talent Forecasting
  - Worker's Compensation Policy
  - Future Work
    - Turnover modeling for businesses
    - Targeted retention
- Telecom
  - Accurately predict as many current 3G customers as possible
  - Identify 2G customers likely to convert to 3G customers
  - Future Work
    - Forecast traffic patterns and peak period routing
    - Identify at-risk customers; convert them to loyal customers
- Other
  - Electricity Load Forecasting
  - Airline Passenger Forecasting
  - Sentiment Analysis using Twitter data
  - Cross-selling: predicting potential customers
  - Future Work
    - Predicting player performance in sports
    - Efficient building design
    - Power grid management

Work in progress
- MRI Analytics
  - Efficient evidence-based healthcare system
  - Image Processing + Machine learning + Radiologist = decision support systems
- Recommender Systems
  - Recommend items sold online to potential customers
  - Machine learning: predicting that an item is worth recommending
- Automated detection of diabetic retinopathy and macular edema
  - Efficient evidence-based healthcare system
  - Image Processing + Machine learning + eye specialist
- Predictive Maintenance in Refrigeration Systems
  - Fault detection in refrigeration systems
  - Energy optimization