ISTA 421 / INFO 521
Introduction to Machine Learning
[Fall 2017]

Setting up Python

Programming project assignments will be developed to run in Python 3.6.x. As of Fall 2017, the latest version of Python 3 is 3.6.2. For those who have primarily used Python 2 in the past, this page discusses transitioning from Python 2 to 3.

At the level we are using Python in this course, the differences between Python 2.7 and 3 are small. If you wish to use Python 2.7, you may, but you are solely responsible for making any needed changes to the released project code, and you must clearly indicate that you are using Python 2.7; we will otherwise assume it is Python 3, and if the code does not run we will not grade it.
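A quick way to confirm which interpreter you are running is to ask Python itself. The following is only a minimal sketch: the function name `version_ok` and the `REQUIRED` constant are illustrative and not part of any released project code, and the check assumes that "Python 3.6.x or later within Python 3" is acceptable.

```python
import sys

# Assumption: course code targets Python 3.6.x, as stated above.
REQUIRED = (3, 6)


def version_ok(version_info=None, required=REQUIRED):
    """Return True if (major, minor) is Python 3 and at least `required`."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    # Same major version, and a minor version at or above the requirement.
    return major == required[0] and minor >= required[1]


if __name__ == "__main__":
    found = "%d.%d.%d" % tuple(sys.version_info[:3])
    if version_ok():
        print("OK: running Python " + found)
    else:
        print("Warning: running Python " + found + ", expected 3.6 or later")
```

The script deliberately avoids Python 3-only syntax so that it also runs (and reports the mismatch) under Python 2.7.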

In the following, $ indicates a Unix-style command-line prompt.

Brief Python and Unix tutorial

Required modules: NumPy, SciPy, matplotlib

Installation options:

There are two general approaches for setting up a Python environment.

  1. Use a pre-packaged, sandboxed distribution:
    The following are cross-platform (Mac, Linux, Windows). Installing one of these puts an entire Python installation, with a large set of pre-packaged modules (including numpy, scipy, and matplotlib), into a stand-alone sandbox that is completely independent of any other Python environment on your machine. These also include a package manager and an IDE:
    • Enthought Canopy: The Canopy Express release is free. Once installed, use the Package Manager to update to the latest versions of the Enthought release of the included Python modules.
    • Anaconda: I have heard good things about Anaconda. It appears to work similarly to Canopy as a sandboxed distribution and includes all of the modules you will need, but I don't have any personal experience using it.
    • List of alternative Python distributions (many of these include IDEs)

  2. Install Python, modules, and IDE separately:

    1. Install Python (or use the existing Python install in your OS; ensure it is Python 3.6).
      If you want to keep your install of Python and any modules you work with for the class separate from your OS's native Python, I recommend either using a sandboxed distribution like Enthought Canopy (above) or using Python virtualenv (here's a good place to start learning about virtualenv).
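As one illustration of the isolated-environment idea, Python 3 ships a standard-library module, `venv`, that plays a role similar to virtualenv. Note that this swaps in `venv` for the third-party virtualenv tool mentioned above; the function name `create_course_env` and the directory name `ista421-env` are hypothetical.

```python
import os
import venv


def create_course_env(env_dir, with_pip=True):
    """Create an isolated Python environment at env_dir.

    Returns the path to the environment's pyvenv.cfg file, which
    exists once the environment has been created.
    """
    # with_pip=True bootstraps pip into the new environment so that
    # numpy, scipy, and matplotlib can be installed there.
    venv.create(env_dir, with_pip=with_pip)
    return os.path.join(env_dir, "pyvenv.cfg")


# Example use (hypothetical directory name):
#     create_course_env("ista421-env")
# Then, on Unix-like systems, activate it from the shell with:
#     $ source ista421-env/bin/activate
```

Anything installed with pip while the environment is active stays inside that directory, leaving the OS's native Python untouched.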

      Platform-specific notes:
      • Linux: Setting up Python 3 on Ubuntu
      • Mac: A version of Python comes installed with macOS (Python 2.7 on the releases current in Fall 2017), but I recommend doing a separate installation so as not to interfere with the system installation.
        • Ensure Xcode is installed. To check whether Xcode is installed properly, type gcc at the command line. If you get gcc: command not found, then Xcode is not properly installed. Depending on your version of macOS, you may be able to use the following command to install directly from the command line:
          $ xcode-select --install
        I then recommend using one of the following package managers:
        • (A) I now use Homebrew for Mac package management
          • (i) Install Homebrew
          • (ii) $ brew install python3 # grabs latest python 3
          • (iii) $ pip3 install --upgrade pip # ensures pip for Python 3 is up to date
        • (B) In the distant past I used MacPorts, with the following recipe for setting up Python 2.7; it likely needs only a little adjustment for Python 3. Try starting here:
          • (i) Install MacPorts (After downloading and installing, run
            $ sudo port selfupdate)
          • (ii) $ sudo port install python27 py27-pip
          • (iii) $ sudo port select --set python python27
      • Windows:
        • Using Python 3 on Windows
        • Note on installing numpy, scipy and matplotlib: After installing Python 3, some Windows users reported (Fall 2017) trouble installing scipy. The following was found to work, as long as you install all three packages, starting with the version of numpy listed below:
          • First, download the .whl file matching your platform (win32 or win_amd64) for each of the following three packages from here: Unofficial Windows Binaries for Python Extension Packages
            • from numpy choose either numpy-1.13.1+mkl-cp36-cp36m-win32.whl or numpy-1.13.1+mkl-cp36-cp36m-win_amd64.whl
            • from scipy choose either scipy-0.19.1-cp36-cp36m-win32.whl or scipy-0.19.1-cp36-cp36m-win_amd64.whl
            • from matplotlib choose either matplotlib-2.0.2-cp36-cp36m-win32.whl or matplotlib-2.0.2-cp36-cp36m-win_amd64.whl
          • Second, from a command-line shell (e.g., PowerShell, Windows bash, etc.), change directory (cd) to the place where you downloaded the .whl files (e.g., Downloads)
          • Third, from the command-line, pip install numpy first, then scipy, then matplotlib, using this command for each:
            $ pip install <.whl-filename>
            ... where <.whl-filename> is the name of the .whl file you downloaded.
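When choosing between the win32 and win_amd64 files above, what matters is the bitness of the Python interpreter you installed, which can differ from that of Windows itself (a 32-bit Python can run on 64-bit Windows). The sketch below reports the two filename tags to look for; the function name `wheel_tags` is illustrative, and the mapping covers only the two Windows platform tags listed above.

```python
import struct
import sys


def wheel_tags(version_info=None, pointer_bits=None):
    """Return (python_tag, platform_tag) as used in the wheel filenames.

    For example, Python 3.6 on a 64-bit interpreter gives
    ("cp36", "win_amd64").
    """
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in this interpreter: 32 or 64 bits.
        pointer_bits = struct.calcsize("P") * 8
    python_tag = "cp%d%d" % (version_info[0], version_info[1])
    # Only the two Windows tags from the list above are distinguished.
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


if __name__ == "__main__":
    print("Look for wheel files containing: %s and %s" % wheel_tags())
```

Run it with the same Python that pip will install into, and pick the .whl files whose names contain both tags.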

    2. Install NumPy, SciPy, and matplotlib. I highly recommend using pip, which is now typically installed as part of standard Python installations.
      • pip is a command-line package manager for Python modules; it installs them into your Python's site-packages directory.
      • Once pip is installed, you can use
        $ pip install numpy
        and similarly for scipy and matplotlib.
      Or you can install NumPy, SciPy, and matplotlib independently (preferably using a package manager, such as Homebrew on the Mac).
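After installing, it is worth confirming that the modules can actually be imported by the Python you intend to use. The following is a minimal sketch of such a check; the function name `check_modules` is illustrative, and reading `__version__` assumes the module exposes that attribute (numpy, scipy, and matplotlib do).

```python
import importlib


def check_modules(names=("numpy", "scipy", "matplotlib")):
    """Map each module name to its version string, or None if the
    module cannot be imported by this interpreter."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Fall back to "unknown" if the module has no __version__.
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_modules().items():
        print(name, "MISSING" if version is None else version)
```

A module reported as MISSING usually means pip installed it into a different Python than the one running the script.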

    3. Python IDE
      • These IDEs require independent installation of Python and modules (as opposed to using the packaged IDE provided in a sandboxed Python distribution, like Enthought Canopy, above).
      • There are many options for editors/IDEs. IDEs, of course, give more Python-aware editing (ranging from syntax to semantic highlighting, tool-tips, multi-file and module structure awareness, etc.). NOTE: For many of these IDEs you will need to make the IDE aware of where your Python installation is located. This is typically done by using the IDE's preferences facility to identify the system path to the Python executable and site-packages directory (both are usually determined by a single path); see your IDE's docs for how to do this. All of the following are free and multiplatform.
        • Any basic text editor will work! You'll just edit, save edits then run from the command-line.
        • You could use Emacs (the simple course Python Tutorial includes some notes about Emacs; Aquamacs is the best Mac flavor of Emacs) or vi. Both options provide varying levels of Python awareness and plugins.
        • PyCharm Community Edition is excellent
        • Sublime Text is also excellent (but will bug you periodically to buy it).
        • Eclipse with the PyDev plugin works well for many people. There is also the LiClipse bundle, which I have not tried but which looks intriguing.
      • Another list of Python IDEs
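For the step of pointing an IDE at your Python installation, the interpreter can report the relevant paths itself. This is a small sketch using only the standard library; the function name `ide_paths` is illustrative, and which of these paths a given IDE asks for depends on the IDE.

```python
import sys
import sysconfig


def ide_paths():
    """Return the paths an IDE typically needs for this interpreter."""
    return {
        # Full path to the running Python executable.
        "executable": sys.executable,
        # Directory where pip installs pure-Python packages.
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for label, path in ide_paths().items():
        print("%s: %s" % (label, path))
```

Run it with the same Python you installed numpy, scipy, and matplotlib into, then copy the executable path into the IDE's interpreter setting.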