VOL 40 .... No. 32

WEDNESDAY, JULY 19, 2023

Automated News Analysis Series: Introduction

Categories: Programming


As many of you are probably aware, I’ve been involved to varying degrees with a project on computational finance.  Originally called the DarwInvest project, it was intended to develop a Genetic Programming algorithm for making intelligent trades on price history data.  The project ended up being encompassed by the development of my extensive GP library, Darwin, my proudest achievement in elegant code design.  Well, I’m back at it, and this time I’m trying to tackle the news!

I intend to take TKD in a slightly different direction from here on out (not that it had any direction to begin with), and focus mainly on my computer programming projects.  For that reason, I am changing the category WEB STUFF to COMPUTERS to make it more general.

This post serves primarily as an introduction for my current endeavor in algorithmic trading: automated news analysis.  But first, a primer on my opinions of automated algorithmic trading.

It has long been the holy grail of A.I. developers everywhere to develop some magic computer system that predicts the stock market and turns infinite profits with minimal effort.  Obviously, that type of endeavor seems incredibly frivolous.  That being said, my interest in A.I. has naturally led me in the same direction.

According to my research, there is a lot of industry interest in algorithmic trading.  I’ve recently been interviewing with several financial firms for a summer internship, and they all seem to have some type of foot in the algorithmic trading game.  My prediction for the future is that more and more trading decisions are going to be made by computers, as more sophisticated and successful algorithms are developed.

The advantages of using computers to make trades are obvious: computers can consistently analyze vastly more data than any single human, or team of humans, could ever hope to.  With direct pipelines to their data sources, they can also act on incoming information immediately, in situations where humans would hesitate.

The drawback, of course, is that computers lack the large body of implicit knowledge humans have about the interactions between systems.  Humans are generally better able to draw larger-scale conclusions about the implications of a given piece of information.

Still, I feel the growing sophistication of A.I. and the resources being poured into its development will eventually lead to algorithms that can compete with, and even beat, humans at trading.

We are entering an era of algorithmic arms races between huge financial firms.  Trading decisions by individuals will become less and less successful as the computers of large firms find clever ways of exploiting them.  Eventually it will come down to technical specifications (who has more computing power, better bandwidth, etc.) and especially to algorithmic ingenuity.

Some have speculated that the widespread advent of algorithmic trading would put the market in essential stasis, where there would be little profit potential for firms, encouraging a withdrawal of resources from investment firms.  I wager that this will not be the case, in much the same way a restaurant will never become unpopular by being too crowded.  If firms pull out, that very action restores profitability, which encourages new firms to enter.

With that being said, there will always be people who refuse to trust computers to make trades.  The idea of trusting huge amounts of money to computers is certainly a scary notion, and not one that should be taken lightly.  Also, advances by individual firms in technical infrastructure and algorithmic sophistication will inevitably become the primary source of competition, and I feel that will allow the markets to stay as dynamic as they are now.

My current project is focused on analyzing relevant news articles for their impact on historical stock prices.  More generally, it is a project on qualitative classification of full-text documents, using a particularly noisy source of data that may well have no underlying predictability at all.
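To make the classification idea concrete, here is a minimal sketch of one way it could be framed. It is not the code I am actually running, and every name in it (`tokenize`, `label_from_prices`, `NaiveBayes`) is made up for illustration. The assumption is that each article gets labeled "up" or "down" by comparing the stock's closing price before and after publication, and that a simple bag-of-words naive Bayes classifier is then trained on those labels.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_from_prices(close_before, close_after):
    """Label an article by the price move around its publication (an assumed scheme)."""
    return "up" if close_after > close_before else "down"


class NaiveBayes:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training articles
        self.vocab = set()           # every word seen during training

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior for the label plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Training would amount to calling `train(article_text, label_from_prices(before, after))` once per article, then calling `predict` on unseen articles. Whether anything this simple can pull signal out of such noisy data is exactly the open question.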

I will post updates as I make progress.  Right now, I am running a script to crawl the web for relevant news articles from the past 10 years.  I will let it run for several days in an attempt to build a sufficiently large corpus, and then report on the results.
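For anyone curious what a crawl like this looks like in general, here is a bare-bones sketch using only the Python standard library. It is an illustration rather than my actual script: the names `LinkParser`, `extract_links`, `fetch`, and `crawl` are hypothetical, and it assumes a plain breadth-first walk from a single seed URL. A real crawler would also need rate limiting, robots.txt handling, and filtering for relevant articles.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return all links found in the given HTML, as absolute URLs."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch(url):
    """Download a page and decode it as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seed_url, max_pages=100, fetch_page=fetch):
    """Breadth-first crawl from seed_url; returns a dict of url -> page HTML."""
    seen = {seed_url}
    queue = deque([seed_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except (OSError, ValueError):
            continue  # skip pages that fail to download
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

The `fetch_page` parameter exists so the crawl logic can be exercised against canned pages without touching the network.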

Stay tuned, because I’m excited!

