Twitter – Data Mining the Stock Market

Every day, more than 140 million users of the microblogging service Twitter generate a collective 340 million short text messages. That is a lot of data with a lot of patterns hidden in it. But can this data predict whether stocks go up or down?

I think it is a fact that the media, both online and offline, are able to influence the financial market, for good and for bad. It is also said that investors are subject not only to the sentiment of related news articles but also to public opinion.

The challenge, according to the researchers who mined the Twitter data, is how to quantify such sentiment information in order to predict the movement of the stock market.

In other words, the most important problem in modern finance is finding efficient ways to summarize and visualize stock market data, so that individuals (e.g. traders) get useful information about stock market behaviour for their investment decisions.

Data mining the stock market

Data mining is the science and technology of exploring data in order to discover previously unknown patterns. It is part of the so-called Knowledge Discovery in Databases (KDD) process. In today’s computer-driven world these databases contain massive quantities of information on whatever topic you can think of, in our case Twitter. The hard part is extracting the required information from this mass of data.

The research examines the following data mining techniques:

  1. Application of decision trees in the stock market
  2. Application of neural networks in the stock market
  3. Application of clustering in the stock market
  4. Application of association rules in the stock market
  5. Application of factor analysis in the stock market
  6. Application of time series in the stock market

Decision trees are powerful and popular tools for decision making. In data mining, a decision tree is a predictive model that can represent both classifiers and regression models. When a lot of information needs to be taken into account, these tools are excellent for making financial or number-based decisions.
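
To make the idea concrete, here is a minimal sketch of the smallest possible decision tree: a single split, often called a decision stump. It is not the model used in the research. The sentiment scores, the labels, and the function names `fit_stump` and `predict` are all made up for illustration, and the stump assumes that a higher score means "up".

```python
# Hypothetical toy data: (daily sentiment score, next-day direction).
# 1 = market went up, 0 = market went down. These numbers are invented.
samples = [
    (-0.8, 0), (-0.5, 0), (-0.1, 0), (0.1, 0),
    (0.2, 1), (0.3, 1), (0.4, 1), (0.9, 1),
]


def fit_stump(data):
    """Find the threshold on one feature that misclassifies the fewest samples."""
    values = sorted({x for x, _ in data})
    # Candidate thresholds are the midpoints between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(data) + 1
    for t in candidates:
        # Rule under test: predict "up" when the score is above the threshold.
        errors = sum((x > t) != bool(y) for x, y in data)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors


def predict(threshold, x):
    """Apply the learned split to a new sentiment score."""
    return 1 if x > threshold else 0


threshold, errors = fit_stump(samples)
print(threshold, errors)
```

A full decision tree repeats this kind of split recursively on each resulting subset, usually with a purity measure such as entropy or Gini impurity instead of a raw error count.
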
A neural network is a computational technique that uses methods similar to those employed in the human brain, hence "neural". Their main advantage is that they can approximate any non-linear function to an arbitrary degree of accuracy, given a suitable number of hidden units. The results for this method show that trading strategies guided by the models generate a higher risk-adjusted profit than a buy-and-hold strategy.
Clustering is a tool that groups similar items together without predefined classes. In particular, this method is useful for finding, inside a given stock market index, groups of companies that share a similar temporal behaviour. For that purpose, a clustering approach to the problem is a good strategy.
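
As a rough illustration of grouping companies by similar temporal behaviour, the sketch below puts tickers in the same cluster when their return series are strongly correlated. This is a simple correlation-threshold grouping that I picked for brevity, not necessarily what the research used. The tickers, the returns, and the `min_corr` cut-off of 0.8 are invented.

```python
from math import sqrt

# Hypothetical daily returns for four made-up tickers.
returns = {
    "AAA": [0.01, -0.02, 0.03, -0.01, 0.02],
    "BBB": [0.02, -0.03, 0.04, -0.02, 0.03],
    "CCC": [-0.01, 0.02, -0.03, 0.01, -0.02],
    "DDD": [-0.02, 0.03, -0.02, 0.02, -0.03],
}


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cluster(series, min_corr=0.8):
    """Single-linkage grouping: a ticker joins (and merges) every cluster
    that contains at least one strongly correlated member."""
    clusters = []
    for name in series:
        matching = [
            c for c in clusters
            if any(pearson(series[name], series[m]) >= min_corr for m in c)
        ]
        merged = [name]
        for c in matching:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(merged)
    return [sorted(c) for c in clusters]


groups = cluster(returns)
print(groups)
```

With this toy data, AAA and BBB move together while CCC and DDD move in the opposite direction, so two groups come out.
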
Association rule mining is a popular research method for discovering interesting relations between variables in large databases. Optimized association rules are permitted to contain uninstantiated attributes, and the problem is to determine instantiations such that either the support or the confidence of the rule is maximized.
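
Support and confidence are easy to compute by hand. The sketch below uses a handful of made-up trading days, each described by the set of events observed that day. The event names and the helper functions `support` and `confidence` are my own illustration and do not come from the research.

```python
# Hypothetical trading days, each recorded as the set of events observed.
days = [
    {"positive_tweets", "index_up"},
    {"positive_tweets", "index_up", "high_volume"},
    {"negative_tweets", "index_down"},
    {"positive_tweets", "index_down"},
    {"negative_tweets", "index_down", "high_volume"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never fires, so there is nothing to measure
    return support(antecedent | consequent, transactions) / base


rule_support = support({"positive_tweets", "index_up"}, days)
rule_confidence = confidence({"positive_tweets"}, {"index_up"}, days)
print(rule_support, rule_confidence)
```

Here the rule "positive tweets, therefore index up" holds on 2 of the 5 days (support 0.4) and on 2 of the 3 days that had positive tweets (confidence about 0.67).
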
Factor analysis is mostly useful in situations where a large number of variables are believed to be determined by relatively few common causes of variation.
Time series analysis is a forecasting activity and plays an important role in our daily life. Traditional time series methods require a lot of historical data, along with assumptions such as normality. Time series data is characterized as large in size, high in dimensionality, and continuously updated. For that reason, time series data is always considered as “a whole” instead of as individual numeric fields.
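
For a feel of what a very basic forecast looks like, here is a sketch of simple exponential smoothing, one of the most elementary time series techniques. I chose it for illustration only; the research may well rely on other models. The prices and the smoothing factor `alpha` are invented.

```python
def exponential_smoothing(series, alpha=0.5):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    Each new observation pulls the running level towards itself by a
    fraction alpha; the final level serves as the forecast.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical closing prices for four consecutive days.
prices = [100.0, 102.0, 101.0, 105.0]
forecast = exponential_smoothing(prices, alpha=0.5)
print(forecast)  # 103.0
```

A higher `alpha` makes the forecast react faster to recent prices, while a lower one smooths out short-lived moves.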

We not only want to predict the stock market using this information; a timeframe should also be applied to it. A trader should be able to benefit from very subtle, short-lived patterns, and to take into account the impact of market players on market regularities.


There is a critical need for automated approaches that make effective and efficient use of financial data, to support companies and individual traders in strategic planning and investment decisions.

One question remains:

If we could predict the financial market, would there still be a financial market? When all traders know what the market will do, what benefit would there still be? And, maybe even more important, would trading still be fun and exciting? Share your opinion, or maybe your experience, in the comments below. I am very curious!