Examples of Software and Demos for CS 175, Winter 2017
The links below are
just a small sample of the many Web pages that discuss
research demos, applications, and sample code for text analysis using AI
and machine learning techniques. They are intended as a starting
point to help you think about ideas for projects.
: a very
comprehensive list of resources related to sentiment analysis, particularly for
Twitter. Also includes
labeled Twitter data
NLTK Sentiment Analysis Package
including the vader package for sentence-level analysis (e.g., with negation detection).
Stanford deep learning demo
for sentiment analysis
(Web page with code, data, papers)
Sentiment analysis of short informal texts
using Python to build a Naive Bayes text
classifier for sentiment analysis
Blog tutorial on
building a classifier in NLTK to classify tweets as positive or negative
GATE demos and software
for information extraction from the Web,
from the University of Washington
The Aristo Project
from the Allen Institute for AI, for knowledge acquisition and question answering.
Includes links to several research papers.
The Quepy System
for converting natural language questions into queries in a database query language.
Semantic parsing for
(from Stanford NLP group)
Embedding Text in Vector Spaces
networks for word embeddings
(with Python code)
BookNLP, extracting information from books: Github repository
(Java code) by David Bamman.
Making predictions from search query data, slide presentation
from Choi and Varian
mining for analyzing the biomedical literature, introduction by
Cohen and Hunter, 2008
Implementing a spelling corrector. Blog post and Python code
by Peter Norvig, Director of Research at Google.
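
To give a flavor of what a small text-analysis project can look like, here is a minimal sketch of a spelling corrector based on single-character edits, in the spirit of the last item above. It is not the code from that blog post. The word counts are a hypothetical toy dictionary made up for illustration (a real corrector would count words in a large text corpus), and the function names edits1 and correct are likewise just assumptions chosen for this sketch.

```python
from collections import Counter
import string

# Hypothetical toy word counts, for illustration only. A real corrector
# would estimate these counts from a large corpus of text.
WORD_COUNTS = Counter({"the": 50, "text": 8, "spelling": 5, "test": 4, "corrector": 3})


def edits1(word):
    """Return the set of all strings that are one edit away from `word`.

    An edit is a single-character deletion, adjacent transposition,
    replacement, or insertion.
    """
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts=WORD_COUNTS):
    """Return the most frequent known word within one edit of `word`.

    Known words are returned unchanged (lowercased); if no known word is
    within one edit, the lowercased input is returned as-is.
    """
    word = word.lower()
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word
    return max(candidates, key=lambda w: counts[w])
```

For example, correct("speling") finds "spelling" through a single insertion. When several known words are one edit away, as with "tezt" (both "text" and "test"), the sketch picks the one with the higher count. This is only a crude stand-in for a proper language model.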
Software Packages for NLP and Text Analysis
(general-purpose toolkits with multiple components and algorithms -
see also NLTK 3.0 and scikit-learn as discussed in class)
: Python library
for topic modeling, document indexing, and similarity retrieval
: General Python library for analyzing text data (built on top
of NLTK and pattern).
: A Web mining
module in Python (doesn't support Python 3)
MALLET: useful toolkit
(in Java) from the University of Massachusetts, Amherst
(particularly for classification and topic modeling)
(also: a tutorial article on how to use MALLET for topic modeling)
Stanford NLP Software
packages for NLP
from the University of Illinois (UIUC)
List of text
from the Digital Research Tools Website