{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CSE 152: Introduction to Computer Vision, Spring 2018 – Assignment 5\n", "### Instructor: Ben Ochoa\n", "### Assignment Published On: Wednesday, May 23, 2018\n", "### Due On: Saturday, June 9, 2018, 11:59 PM" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instructions\n", "* Review the academic integrity and collaboration policies on the course website.\n", "* This assignment must be completed individually.\n", "* This assignment contains only programming problems.\n", "* All solutions must be written in this notebook.\n", "* Programming aspects of this assignment must be completed using Python in this notebook.\n", "* If you want to modify the skeleton code, you can do so. It is provided only as a framework for the solution.\n", "* You may use Python packages for basic linear algebra (you can use numpy or scipy for basic operations), but you may not use packages that directly solve the problem.\n", "* If you are unsure about using a specific package or function, then ask the instructor and teaching assistants for clarification.\n", "* You must submit this notebook exported as a PDF. All answers and results must be present in the .pdf file. Points will not be given for answers and results only present in the .ipynb file.\n", "* You must submit both files (.pdf and .ipynb) on Gradescope. You must mark each problem on Gradescope in the PDF.\n", "* It is highly recommended that you begin working on this assignment early." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "In this assignment, we will have a look at some simple techniques for object recognition. In particular,\n", "we will try to recognize faces. The face data that we will use is derived from the Yale Face\n", "Database. The database consists of 5760 images of\n", "10 individuals, each under 9 poses and 64 different lighting conditions.
The availability of such standardized\n", "databases is important for scientific research as they are useful for benchmarking different\n", "algorithms.\n", " \n", "
Figure 1: The Yale Face Database B.
\n", "
• (5 points) Write a function eigenTrain(trainset,k) that takes as input an N × d matrix\n", "trainset of vectorized images from subset 0, where N = 70 is the number of training images\n", "and d = 2500 is the number of pixels in each training image. Perform PCA on the data and\n", "compute the top k = 20 eigenvectors. Return the k × d matrix of eigenvectors W, and a\n", "d-dimensional vector mu encoding the mean of the training images.
\n", "\n", "
• (2 points) Rearrange each of the top 20 eigenvectors you obtained in the previous step into a\n", "2D image of size 50 × 50. Display these images by appending them together into a 500 × 100\n", "image (a 10 × 2 grid of images).
\n", "\n", "
• (2 points) Explain the objective of performing PCA on the training images. What does this\n", "achieve?
\n", "
• (2 points) Select one image per person from subset 0 (e.g., the 10 images person01 01.png,\n", "person02 01.png, ... , person10 01.png). Show what each of these images would look like\n", "when using only the top k eigenvectors to reconstruct them, for k = 1, 2, ..., 10. This\n", "reconstruction procedure should project each image into a k-dimensional space, project that\n", "k-dimensional representation back into a 2500-dimensional space, and finally reshape that 2500-dimensional vector into\n", "a 50 × 50 image.
\n", "
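The PCA and reconstruction steps above can be sketched as follows. This is a minimal illustration on random data standing in for the face images, not a reference solution: the helper names `pca_sketch` and `reconstruct` are assumptions made for this example, and the SVD of the centered data is used here in place of an explicit eigendecomposition of the covariance matrix.

```python
import numpy as np

def pca_sketch(X, k):
    # X: (N, d) matrix of vectorized images. Returns W (k, d) and mu (d,).
    mu = X.mean(axis=0)
    Xc = X - mu
    # Rows of Vt are eigenvectors of the covariance matrix, ordered by
    # decreasing singular value; the d x d covariance is never formed.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k], mu

def reconstruct(x, W, mu, k):
    # Project x onto the first k eigenvectors, then map back to pixel space.
    coeffs = W[:k] @ (x - mu)
    return mu + W[:k].T @ coeffs

# Synthetic stand-in for the 70 training images of 50 x 50 pixels (assumption)
rng = np.random.default_rng(0)
X = rng.random((70, 2500))

W, mu = pca_sketch(X, 20)
img = reconstruct(X[0], W, mu, 10).reshape(50, 50)
```

Each row of `W` can be reshaped to 50 × 50 in the same way to display it as an image, and the reconstruction error of a training image cannot increase as `k` grows because the subspaces are nested.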
\n", "
• (10 points) Write a function called eigenTest(trainset,trainlabels,testset,W,mu,k)\n", "that takes as input:\n", "\n", "
• The same N × d matrix trainset of vectorized images from subset 0
\n", "
• An N dimensional vector trainlabels that encodes the class label of each training image\n", "(e.g., 1 for person01, 2 for person02, etc.)
\n", "
• An M × d matrix testset of M vectorized images from one of the test subsets (1-4)
\n", "
• The output of PCA, i.e., W and mu, and the number of eigenvectors to use, k
\n", "
\n", "Project each image from trainset and testset onto the space spanned by the first k eigenvectors.\n", "For each test image, find the nearest neighbor (1-NN) in the training set using an L2\n", "distance in this lower-dimensional space and predict the class label as the class of the nearest\n", "training image. Your function should return an M-dimensional vector testlabels encoding\n", "the predicted class label for each test example. Evaluate eigenTest on each test subset 1-4\n", "separately for values k = 1...20 (so it should be evaluated 4 × 20 times). Plot the error rate\n", "(fraction of incorrectly predicted class labels) of each subset as a function of k in the same plot,\n", "and use the Python legend function to add a _legend_ to your plot.\n", "
• (2 points) Repeat the experiment from the previous step, but throw out the first 4 eigenvectors.\n", "That is, use k eigenvectors starting with the 5th eigenvector. Produce a plot similar to the one\n", "in the previous step. How do you explain the difference in recognition performance from the\n", "previous part?
\n", "
• (2 points) Explain any trends you observe in the variation of error rates as you move from\n", "subsets 1 to 4 and as you increase the number of eigenvectors. Use images from each subset\n", "to reinforce your claims.
\n", "
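The projection, nearest-neighbor, and error-rate steps described above can be sketched as follows. This is a toy illustration under stated assumptions: the helper names `project`, `nearest_neighbor_labels`, and `error_rate` are hypothetical, and an identity basis with zero mean is used so that the numbers are easy to check by hand. The `skip` parameter is one possible way to discard the leading eigenvectors (for example `skip=4` to start at the 5th).

```python
import numpy as np

def project(X, W, mu, k, skip=0):
    # Coordinates of each row of X on eigenvectors skip, ..., skip + k - 1
    return (X - mu) @ W[skip:skip + k].T

def nearest_neighbor_labels(train_feats, train_labels, test_feats):
    # Squared L2 distance between every test and every training vector
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dist2 = np.sum(diffs ** 2, axis=2)        # shape (M, N)
    return train_labels[np.argmin(dist2, axis=1)]

def error_rate(predicted, truth):
    # Fraction of incorrectly predicted class labels
    return float(np.mean(predicted != truth))

# Toy example: identity basis and zero mean, so features equal the inputs
W = np.eye(2)
mu = np.zeros(2)
train = np.array([[0.0, 0.0], [10.0, 10.0]])
train_labels = np.array([1, 2])
test = np.array([[1.0, -1.0], [9.0, 12.0], [6.0, 6.0]])

pred = nearest_neighbor_labels(project(train, W, mu, 2), train_labels,
                               project(test, W, mu, 2))
err = error_rate(pred, np.array([1, 2, 1]))
```

Here the third test point lies closer to the second training point, so it is labeled 2 and counts as one error out of three.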
• (10 points) Write a function called fisherTrain(trainset,trainlabels,c) that\n", "takes as input the same N × d matrix trainset of vectorized images from subset 0, the corresponding\n", "class labels trainlabels, and the number of classes c = 10. Your function should\n", "do the following:\n", "
• Compute the mean \$mu\$ of the training data, and use PCA to compute the first N − c\n", "principal components. Let this be \$W_{PCA}\$.
\n", "
• Use \$W_{PCA}\$ to project the training data into a space of dimension (N − c).
\n", "
• Compute the between-class scatter matrix \$S_B\$ and the within-class scatter matrix \$S_W\$ on\n", "the (N − c) dimensional space from the previous step.
\n", "
• Compute \$W_{FLD}\$ by solving for the generalized eigenvectors corresponding to the (c−1) largest generalized\n", "eigenvalues for the problem \$S_Bw_i = λ_iS_Ww_i\$. You can use built-in functions to solve\n", "for the generalized eigenvalues of \$S_B\$ and \$S_W\$.
\n", "
• The Fisher bases will be \$W = W_{FLD}W_{PCA}\$, where W is (c − 1) × d dimensional, \$W_{FLD}\$\n", "is (c − 1) × (N − c) dimensional, and \$W_{PCA}\$ is (N − c) × d dimensional.
\n", "
• (5 points) As in the Eigenfaces exercise, rearrange the top 9 Fisher bases you obtained in the\n", "previous part into images of size 50 × 50 and stack them into one big 450 × 50 image.
\n", "
• (5 points) As in the eigenfaces exercise, perform recognition on the testset with Fisherfaces.\n", "As before, use a nearest neighbor classifier (1-NN), and evaluate results separately for each\n", "test subset 1-4 for values k = 1...9. Plot the error rate of each subset as a function of k in\n", "the same plot, and use the legend function in Python to add a _legend_ to your\n", "plot. Explain any trends you observe in the variation of error rates with different subsets and\n", "different values of k, and compare performance to the Eigenface method. paper-link
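

The Fisherfaces training steps listed above (PCA to N − c dimensions, scatter matrices, generalized eigenproblem, combined basis) can be sketched as follows. This is a minimal sketch on synthetic data rather than a reference solution. The helper names `scatter_matrices` and `generalized_eig_top` are assumptions for this example, and the generalized eigenproblem is solved by whitening with the Cholesky factor of the within-class scatter using only numpy, which is a substitute for a library generalized eigensolver and assumes that scatter matrix is positive definite.

```python
import numpy as np

def scatter_matrices(Y, labels):
    # Y: (N, m) PCA-projected data. Returns between- and within-class scatter.
    m = Y.shape[1]
    mean_all = Y.mean(axis=0)
    S_B = np.zeros((m, m))
    S_W = np.zeros((m, m))
    for cls in np.unique(labels):
        Yc = Y[labels == cls]
        mean_c = Yc.mean(axis=0)
        diff = (mean_c - mean_all)[:, None]
        S_B += len(Yc) * (diff @ diff.T)
        centered = Yc - mean_c
        S_W += centered.T @ centered
    return S_B, S_W

def generalized_eig_top(S_B, S_W, n_dirs):
    # Solve S_B w = lambda S_W w by whitening: with S_W = L L^T, the matrix
    # A = L^-1 S_B L^-T is symmetric and w = L^-T v for each eigenvector v.
    L = np.linalg.cholesky(S_W)
    A = np.linalg.solve(L, np.linalg.solve(L, S_B).T)
    A = (A + A.T) / 2                      # remove round-off asymmetry
    evals, V = np.linalg.eigh(A)
    order = np.argsort(evals)[::-1][:n_dirs]
    W_fld = np.linalg.solve(L.T, V[:, order]).T
    return W_fld, evals[order]

# Synthetic stand-in: 10 classes, 7 samples each, 2500 pixels (assumption)
rng = np.random.default_rng(0)
N, d, c = 70, 2500, 10
labels = np.repeat(np.arange(1, c + 1), N // c)
X = rng.normal(size=(N, d)) + 3.0 * rng.normal(size=(c, d))[labels - 1]

mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
W_pca = Vt[:N - c]                         # (N - c) x d
Y = (X - mu) @ W_pca.T                     # training data in N - c dimensions
S_B, S_W = scatter_matrices(Y, labels)
W_fld, evals = generalized_eig_top(S_B, S_W, c - 1)
W = W_fld @ W_pca                          # (c - 1) x d Fisher bases
```

Recognition can then reuse the same 1-NN procedure as for Eigenfaces, with the first k rows of `W` as the projection basis.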
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def fisherTrain(trainset, trainlabel, c):\n", " #Your Implementation Here" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false } }, "nbformat": 4, "nbformat_minor": 2 }