Semantic News Analysis and Prediction

dc.contributor.advisor: Ngu, Anne Hee Hiong
dc.contributor.author: Orell, Seth R.
dc.contributor.committeeMember: Gao, Byron
dc.contributor.committeeMember: Podorozhny, Rodion
dc.date.accessioned: 2011-09-14T19:41:19Z
dc.date.available: 2011-09-14T19:41:19Z
dc.date.issued: 2011-08
dc.description.abstract: Active stock trading firms need quick analysis of financial news items. News affects markets. Predicting how a news article may move a stock’s price can give a trader an edge over competitors, and doing so requires automatically understanding a news item’s semantics. Years of research on semantic Web Services have yielded a variety of techniques to discern or provide meaning beyond the basic WSDL syntax. I believe that this research into Web Service semantics has relevance in other fields, specifically the content analysis of news as it applies to markets. The purpose of the present study is to determine whether specific academic models of Web-based semantic analysis can be used to provide market price predictions. The study’s design allows for an objective measure of accuracy by comparing predictions against actual market changes. In the study, I explore the application of current “Top-Down” Web service semantic analyzers to distill the various approaches into abstract concepts. I take a common approach of textual content matching and apply it with and without synonym analysis (a form of spreading activation), with promising results. Using the securities in the Russell 1000 Index (chosen for market liquidity and activity), I collected corresponding news articles from Reuters for 8 months. For each article, I pulled one-minute snapshots of market data for the article’s publishing date and corresponding security. I then divided the news items into two groups: an in-sample learning set and an out-of-sample input set. The in-sample set of news provided “predictions” of price movement, which I could compare against what each input item actually did in the market. Simple semantic analysis produced encouraging results, with a rate of return (profit) better than random for shorter hold durations (one to five minutes). A synonym-based strategy showed a stronger return for longer hold periods (thirty to forty-five minutes). Both strategies performed better than a random matching approach, which lost money for every hold duration. These results show potential for similar and broader market analysis using established academic models of semantic Web analysis.
dc.description.department: Computer Science
dc.format: Text
dc.format.extent: 63 pages
dc.format.medium: 1 file (.pdf)
dc.identifier.citation: Orell, S. R. (2011). Semantic news analysis and prediction (Unpublished thesis). Texas State University-San Marcos, San Marcos, Texas.
dc.identifier.uri: https://hdl.handle.net/10877/2533
dc.language.iso: en
dc.subject: semantic web
dc.subject: stock market
dc.subject: lucene
dc.subject: web service composition
dc.title: Semantic News Analysis and Prediction
dc.type: Thesis
thesis.degree.department: Computer Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Texas State University-San Marcos
thesis.degree.level: Masters
thesis.degree.name: Master of Science
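
The evaluation outlined in the abstract (match an out-of-sample article to in-sample articles by text, borrow the matched article's observed price move as the prediction, then score it over a hold period) might be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the synonym groups, the function names, the cosine similarity over word counts, and the sample headlines and returns are all assumptions made for the example, and collapsing synonyms to one token is a simplified stand-in for the synonym analysis the abstract mentions.

```python
import math
from collections import Counter

# Hypothetical synonym groups, for illustration only.
SYNONYM_GROUPS = [{"surge", "jump", "soar"}, {"drop", "fall", "slump"}]
# Map every word in a group to one representative (the alphabetically first).
CANONICAL = {word: min(group) for group in SYNONYM_GROUPS for word in group}


def tokenize(text, use_synonyms=False):
    """Lowercased bag of words; optionally collapse synonyms to one token."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    words = [w for w in words if w]
    if use_synonyms:
        words = [CANONICAL.get(w, w) for w in words]
    return Counter(words)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hold_return(prices, publish_idx, hold_minutes):
    """Fractional return from buying at publication and holding N minutes.

    `prices` is assumed to be a list of one-minute price snapshots.
    """
    entry = prices[publish_idx]
    exit_price = prices[publish_idx + hold_minutes]
    return (exit_price - entry) / entry


def predict_direction(article, in_sample, use_synonyms=False):
    """Return +1, -1, or 0: the sign of the best-matching in-sample return."""
    query = tokenize(article, use_synonyms)
    _, best_return = max(
        in_sample,
        key=lambda item: cosine(query, tokenize(item[0], use_synonyms)),
    )
    return (best_return > 0) - (best_return < 0)


# Made-up in-sample items: (headline, observed return over the hold period).
in_sample = [
    ("Shares surge after earnings beat", 0.012),
    ("Shares drop on weak guidance", -0.008),
]
direction = predict_direction("Stock prices jump on earnings", in_sample, use_synonyms=True)
realized = hold_return([100.0, 101.0, 102.0, 99.0], publish_idx=0, hold_minutes=2)
print(direction, realized)
```

Comparing the predicted direction with the sign of the realized return, across many out-of-sample articles and several values of `hold_minutes`, gives the kind of accuracy and rate-of-return comparison the abstract describes.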

Files

Original bundle

Name: ORELL-THESIS.pdf
Size: 1.38 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.12 KB
Format: Plain Text

Name: 1_license.txt
Size: 1.71 KB
Format: Plain Text