1- Trading Strategy Backtesting Background
Timeline : June 2017 to June 2019.
Stocks Considered : S&P100 Firms
Holding Period : 1 Day
Alternative Data Used : News data from a popular newswire, in XML format.
Objective : Buy the most comparatively salient stock (as per paper 1) at the close of day t and sell at the close of day t+1.
2- Use Python to Clean & Read News Text Files
Each database is unique and requires meticulous cleaning and pre-processing. For example, as the from-to images below show, each line of a news article is saved as an independent XML block. I used Python to compile the text into one block.
FROM
TO
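The compile step can be sketched roughly as follows. This is a minimal illustration, not the actual cleaning script: the tag names (`article`, `headline`, `p`) and the sample text are assumptions, since the newswire's real schema is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one article whose body is split into one <p> block per line.
# The tag names (article, headline, p) are assumptions, not the newswire's schema.
SAMPLE = """<article id="a1">
  <headline>Acme beats estimates</headline>
  <p>Acme Corp reported quarterly earnings</p>
  <p>  above analyst expectations.</p>
  <p></p>
  <p>Shares rose in after-hours trading.</p>
</article>"""


def compile_article_text(xml_string, line_tag="p"):
    """Join the text of every per-line block into a single string."""
    root = ET.fromstring(xml_string)
    lines = []
    for block in root.iter(line_tag):
        # itertext() also collects text nested inside child tags of the block.
        text = " ".join("".join(block.itertext()).split())
        if text:  # skip empty blocks
            lines.append(text)
    return " ".join(lines)


article_text = compile_article_text(SAMPLE)
print(article_text)
```

Whitespace is normalised inside each block before joining, so stray indentation and empty blocks in the feed do not leak into the compiled text.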
3- Create Suitable Sentiment Measures
Here I use NLTK to create a simple sentiment measure for each news article, as per my first paper on comparative sentiments. NLTK is an open-source natural language processing library whose sentiment tools can score text as Positive, Negative or Neutral.
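One way to get such a per-article score is sketched below. It assumes NLTK's VADER analyzer (`SentimentIntensityAnalyzer`), which is one of several options NLTK offers and may not be the exact tool used here. The helper names and the 0.05 cut-off on the compound score are illustrative assumptions.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label.

    The +/-0.05 cut-off is a commonly used convention, assumed here.
    """
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


def score_article(text):
    """Return (compound score, label) for one article using NLTK's VADER analyzer."""
    # Imported lazily so the labelling helper above works without NLTK installed.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # lexicon needed by VADER
    scores = SentimentIntensityAnalyzer().polarity_scores(text)
    return scores["compound"], label_from_compound(scores["compound"])
```

Calling `score_article(article_text)` on each compiled article would give one score and one label per article, which can then be aggregated per firm and per day.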
4- Backtest Trading Strategy
For my trading strategy, I buy the stocks with the highest CPS at the close and liquidate the position at the close of the next day.
The graph on the right shows the progression of $100 invested at the start of the backtesting period.
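The close-to-close rule can be sketched in plain Python as below. This is a simplified illustration under stated assumptions: a single top-CPS stock is held each day, transaction costs are ignored, and the function name, tickers, prices and scores are all made up.

```python
def backtest_top_cps(dates, closes, cps, start_value=100.0):
    """Buy the highest-CPS stock at the close of day t, sell at the close of t+1.

    dates  : ordered list of trading days
    closes : {ticker: [close price per day]}, aligned with dates
    cps    : {date: {ticker: CPS}}; days without a signal stay in cash
    Returns the equity curve as a list of (date, value) pairs.
    """
    value = start_value
    curve = [(dates[0], value)]
    for t in range(len(dates) - 1):
        signals = cps.get(dates[t], {})
        if signals:
            pick = max(signals, key=signals.get)  # highest CPS on day t
            value *= closes[pick][t + 1] / closes[pick][t]  # close-to-close return
        curve.append((dates[t + 1], value))
    return curve


# Made-up prices and scores, for illustration only.
dates = ["d1", "d2", "d3"]
closes = {"AAA": [10.0, 11.0, 11.0], "BBB": [20.0, 20.0, 22.0]}
cps = {"d1": {"AAA": 0.9, "BBB": 0.1}, "d2": {"AAA": 0.2, "BBB": 0.7}}
curve = backtest_top_cps(dates, closes, cps)
```

The signal on day t only uses information available at that day's close, and the return is taken from the next close, which keeps the sketch free of look-ahead.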
Based on the graph, the total return over the 2 years is about 89.95%, with a CAGR of 34.53%.
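For reference, both statistics follow from the start and end values of the equity curve. The sketch below uses hypothetical numbers rather than the backtest's own. The CAGR depends on the exact length of the period in years, so the day-count convention matters.

```python
def total_return(start_value, end_value):
    """Simple return over the whole period, as a fraction."""
    return end_value / start_value - 1.0


def cagr(start_value, end_value, years):
    """Compound annual growth rate: the constant yearly rate linking start to end."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical example: $100 grows to $121 over exactly 2 years.
tr = total_return(100.0, 121.0)
growth = cagr(100.0, 121.0, 2.0)
print(f"total return: {tr:.2%}, CAGR: {growth:.2%}")
```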
The table below shows the trading strategy statistics.