Software for CS 175, Fall 2022

The software packages linked below (all in Python) will likely be useful to you both for the initial assignments and for class projects. For the class projects you are welcome to use other software packages as well, although the packages below cover a very wide range of library functions and utilities for text analysis and machine learning and should be enough to support most, if not all, aspects of your project.

Anaconda Python Distribution
We recommend that you download and install the free Anaconda Python distribution with Python 3.6 or above. Anaconda includes Python, the Natural Language Toolkit (NLTK), and scikit-learn, in addition to a wide range of other packages that are useful for data analysis (such as matplotlib, numpy, and scipy). If you install Anaconda you should have many of the packages you will need for both the assignments and your class project. Anaconda is available for Mac, Linux, and Windows.
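As a quick check after installing, a short script along the following lines can report which of the packages named above are visible to your Python interpreter. This is only a sketch: the list COURSE_PACKAGES and the helper missing_packages are illustrative names, not part of Anaconda, and the list uses import names (scikit-learn is imported as sklearn).

```python
import importlib.util

# Import names for the packages mentioned above (an illustrative list,
# not an official one). Note that scikit-learn is imported as "sklearn".
COURSE_PACKAGES = ["nltk", "sklearn", "matplotlib", "numpy", "scipy"]


def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(COURSE_PACKAGES)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

If a package is reported as not found, it can usually be added with conda install followed by the package name.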

Python (3.6 or above)
You should have Python 3.6 or above installed on your computer for this course; if you installed Anaconda (see above) with the Python 3 option, you should already have it. The online Python Tutorial materials are a very useful general reference. If you are not familiar with Python you will need to spend time learning it, e.g., via an online tutorial such as the Beginner's Guide to Python or an introductory text such as Python Programming: An Introduction to Computer Science.
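If you are unsure which version you have, something like the following can confirm that the interpreter meets the 3.6 minimum. It is a minimal sketch; MIN_VERSION and python_is_recent_enough are names chosen here for illustration.

```python
import sys

# Minimum version stated for this course.
MIN_VERSION = (3, 6)


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Compare the (major, minor) part of a version against the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_recent_enough():
        print("Python", current, "meets the minimum.")
    else:
        print("Python", current, "is too old; install 3.6 or above.")
```

Running python --version at the command line gives the same information without a script.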

PyTorch and Related NLP Tools
PyTorch is a powerful machine learning framework in Python that you should also download and install for this course. There are also a number of additional (optional) NLP packages that are built on top of PyTorch and that may be useful for your projects.

Python Virtual Environments
When installing Python packages you may find it useful to use conda to create virtual environments that are specific to this course and/or your project, e.g.,
> conda create --name cs175
> conda activate cs175
> conda install --name cs175 pytorch
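
Once the environment is activated, a short script such as the one below can confirm that Python is running inside it and that PyTorch is visible. Treat it as a sketch under two assumptions: that conda sets the CONDA_DEFAULT_ENV variable on activation (its standard behavior), and that the environment is named cs175 as in the commands above. The function names are illustrative.

```python
import importlib.util
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV") or None


def describe_environment(expected="cs175", environ=None):
    """Summarize whether the expected environment is the active one."""
    name = active_conda_env(environ)
    if name is None:
        return "No conda environment appears to be active."
    if name != expected:
        return f"Active environment is '{name}', not '{expected}'."
    return f"Environment '{name}' is active."


if __name__ == "__main__":
    print(describe_environment())
    print("Interpreter location:", sys.prefix)
    # PyTorch is imported as "torch"; find_spec checks without importing it.
    print("PyTorch installed:", importlib.util.find_spec("torch") is not None)
```

If the script reports a different environment, activate cs175 in the same terminal and run it again.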

Collaboration with Project Team Members
If you have not used GitHub before to develop code collaboratively, this project class is an ideal opportunity to learn. There is a lot of online tutorial material on how to get started. It's probably helpful if at least one person on the team has used it before.