DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
UNIVERSITY OF CALIFORNIA, SAN DIEGO
CSE 190a: Projects in Vision & Learning
List of Suggested Projects
- Fun with Google Satellite Images
- Build a detector for structures that do not appear on the map, e.g., swimming pools, race tracks, and baseball diamonds.
- Use data from http://maps.google.com or http://earth.google.com/.
- training data available from wikimapia.org
- birds-eye photos available at live.com
- locate solar panels on houses
- tie in with Zillow and census data, assist in urban planning
- Recognizing Cars
- Build on Louka Dlagnekov's MS thesis, "Video-based Car Surveillance: License Plate, Make, and Model Recognition"
- improve OCR algorithm
- detection of car manufacturer logos/insignias
- bumper sticker detection
- improve license plate recognition (LPR) performance using super-resolution
- classify car colors into familiar color categories (silver, red, etc.)
- estimation of 3D structure from 2 different poses
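As a starting point for the color-category idea above, here is a minimal sketch that assigns a pixel (or a region's mean color) to the nearest named reference color. The palette values and the function names `classify_color` and `mean_color` are illustrative assumptions, not part of the thesis; a real system would likely need illumination normalization and a perceptual color space.

```python
# Hypothetical reference palette; the RGB values are illustrative guesses.
PALETTE = {
    "black": (20, 20, 20),
    "white": (240, 240, 240),
    "silver": (170, 170, 175),
    "red": (180, 30, 30),
    "blue": (30, 60, 160),
    "green": (30, 120, 50),
}


def mean_color(pixels):
    """Average a list of (r, g, b) tuples, e.g. pixels from a car body region."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def classify_color(rgb):
    """Return the palette name nearest to rgb (squared Euclidean distance)."""
    def dist2(ref):
        return sum((a - b) ** 2 for a, b in zip(rgb, ref))
    return min(PALETTE, key=lambda name: dist2(PALETTE[name]))
```

For example, `classify_color(mean_color(region_pixels))` would label a segmented car body, where `region_pixels` is whatever pixel list your segmentation step produces.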
- Analysis of Underwater Images and Video of Coral Reefs
- Touch-based interactive image manipulation
- Specify image transformations (zoom, rotate, scale, perspective distortion) with multi-finger contact and gestures.
- See demo from Jeff Han at NYU.
- Collaborate with HCI group in CogSci dept.
- 3D Object Scanner (on a Budget)
- Milkscanner V1.5
- requires 1 webcam, 1 tupperware bowl, 3 cups of milk, and 1 custom LEGO rig
- Assistive Technologies for the Visually Impaired
- see the research projects at Smith-Kettlewell Eye Research Institute
- read street signs, bus stop signs, detect crosswalks
- if an unknown object is detected, upload the image to the internet and build software that lets volunteers on the web identify it (perhaps with a LabelMe-like interface)
- develop Remote Sighted Guide (RSG) infrastructure for assisting blind people in grocery shopping, leveraging ideas from peekaboom and espgame.
- Automatic Mosaic Construction
- Build photomosaics without hand-clicking points.
- Learn about Block (and Bundle) Adjustment.
- Implement algorithms by Brown & Lowe and Shum & Szeliski.
- Diagnosis of Skin Diseases
- Handwritten Digit Recognition
- Virtual Simultaneous Replay
- SimulCam is used in events such as downhill skiing to overlay two competitors' separate runs so that they appear to race at the same time.
- As a project, implement a basic version of this, e.g., using 2-3 Pan-Tilt-Zoom (PTZ) cameras on campus somewhere capturing two people running a race.
- Animal Monitoring
- Mobile Vision Platforms
- Develop mobile computer vision applications on an ARM9 core [www]
- Implement vision algorithms on an ER-1 robot [www]
- Super-Resolution
- Combine multiple low-resolution images of the same scene into a composite high-resolution image.
- References: Capel, Zisserman, Irani, Peleg, Milanfar.
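To make the idea of combining low-resolution frames concrete, here is a toy "shift-and-add" sketch. It assumes the sub-pixel offset of each frame is already known as an integer position on the high-resolution grid and that frames are point-sampled with no blur; the methods in the references above estimate registration and model blur, which this sketch does not attempt. The function name and argument layout are hypothetical.

```python
def shift_and_add(frames, offsets, scale):
    """Place low-res frames onto a grid `scale` times larger and average overlaps.

    frames  : list of 2D lists, all the same size (h x w)
    offsets : list of (dy, dx) per frame, in high-res pixels, 0 <= dy, dx < scale
    Returns a 2D list; cells that no frame covered are None.
    """
    h, w = len(frames[0]), len(frames[0][0])
    big_h, big_w = h * scale, w * scale
    total = [[0.0] * big_w for _ in range(big_h)]
    count = [[0] * big_w for _ in range(big_h)]
    for frame, (dy, dx) in zip(frames, offsets):
        for y in range(h):
            for x in range(w):
                yy, xx = y * scale + dy, x * scale + dx
                total[yy][xx] += frame[y][x]
                count[yy][xx] += 1
    return [
        [total[y][x] / count[y][x] if count[y][x] else None for x in range(big_w)]
        for y in range(big_h)
    ]
```

With fewer frames than `scale * scale` distinct offsets, some cells stay `None` and would need interpolation.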
- Handwritten Equation Recognition
- Sea of Cell Phone Cameras
- In a typical classroom, dozens of students have cell phones with cameras. How can images from all of them be used together to solve some interesting problem?
- Object Recognition
- Human Surveillance
- In static photographs: help a news agency count the number of people at a demonstration.
- In video: count number of people walking through a corridor.
- CAVIAR database for human activity recognition.
- Online Photo Sites
- See for example this project from Berkeley that uses face images and attractiveness ratings from the "Hot or Not" website.
- Riya.com does face and text recognition in photo albums. It could be interesting to tie this in with Google or Yahoo image search to find celebrities, political figures, et al.
- Book/CD/DVD Cover Recognition
- Delicious Library manages media collections via barcode scanning with an iSight camera, and grabs cover art from Amazon. What about recognizing the cover art directly, instead of using the bar code?
- Parking Lot Monitor
- Point one or more webcams at a parking lot.
- Detect vacant parking spaces.
- 3D Photography on Campus
- Use stereo/multiview/video to reconstruct 3D models of sculptures around campus.
- Examples: Bear, Sun God.
- Relevant courses at the University of Washington ("udub") and Caltech. More links here.
- Use data from Mars rover
- Fish Recognition
- Facial Image Registration
- Pedestrian Detection
- TA Grading Assistant
- perform OCR on students' handwritten names and ID numbers
- use constraint that names and numbers must be drawn from class roster known ahead of time
- help TAs of lower division CSE classes with a real-world paperwork management problem involving scanned quizzes!
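The roster constraint above can be sketched with Python's standard `difflib` module: a noisy OCR string is snapped to the closest name on the known roster, or rejected if nothing is close enough. The function name `snap_to_roster`, the sample roster, and the `cutoff` value are assumptions for illustration; a real system would likely combine name and ID-number evidence.

```python
import difflib


def snap_to_roster(ocr_text, roster, cutoff=0.6):
    """Return the roster entry most similar to ocr_text, or None if none is close.

    Similarity is difflib's ratio in [0, 1]; cutoff is a tunable guess.
    """
    matches = difflib.get_close_matches(ocr_text, roster, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical roster for illustration only.
ROSTER = ["Alice Nguyen", "Bob Martinez", "Carol Smith"]
```

For instance, an OCR result such as `"A1ice Nguyem"` would be mapped to `"Alice Nguyen"`, while an unrelated string returns `None` and can be flagged for manual review.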
- Back-of-Head Recognition
- How well can you recognize someone who is facing away from you?
- Consider a photograph of a conference room taken from the back: can you recognize individuals from their hair pattern/color, ears, neck, etc.? Can the person's slouch/posture also play a role?
- You can think of this as the extreme case of face recognition with "out of plane" rotation invariance.
- As part of the study, you'd build a database and do a study of human ability to perform this task.
- USPS Mailbox Detector
- Detect blue mailboxes in photos like these: http://www.payphone-project.com/mailboxes/.
- Challenging due to variability in scale and illumination.
- Use an object detector such as an AdaBoost cascade with color and brightness features.
- 3D Laser Scanning
- Vision-Based Key Duplication
- input: a reasonably high-resolution photograph of a house key
- output: instructions for a milling machine to create a duplicate
- Robot Programming Contests
- Sudoku Solver
- grab photo, correct for perspective transformation, detect puzzle grid, perform OCR, solve puzzle (!?)
- iPhone Sudoku Grab
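The final "solve puzzle" stage of the pipeline above is the easy part. Below is a minimal backtracking sketch that assumes the OCR stage has already produced a 9x9 grid with 0 for empty cells. The helper names and the sample puzzle are illustrative; the vision stages (perspective correction, grid detection, OCR) are not shown.

```python
def parse(text):
    """Turn nine whitespace-separated rows of digits/dots into a 9x9 int grid."""
    return [[int(ch) if ch.isdigit() else 0 for ch in line] for line in text.split()]


def candidates(grid, r, c):
    """Digits not yet used in row r, column c, or the enclosing 3x3 box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Fill grid in place by depth-first search; return True if solved."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d
                    if solve(grid):
                        return True
                grid[r][c] = 0  # undo and backtrack
                return False
    return True  # no empty cell left


# Sample puzzle for illustration ('.' marks an empty cell).
PUZZLE = """
53..7....
6..195...
.98....6.
8...6...3
4..8.3..1
7...2...6
.6....28.
...419..5
....8..79
"""
```

Calling `solve(parse(PUZZLE))` fills the grid in place. If OCR misreads a digit, the solver may return `False`, which is a useful signal to re-check the recognition step.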
- Interactive Laser Art
- A laser system that follows lines drawn on paper
- YouTube video link
- scoreLight project website
- 4D Ping-Pong Game Acquisition
- multi-view reconstruction of a table tennis game
- set up a single camera to record a table tennis game, and try out different algorithms to detect/track the ball in a single view
- several other views could be set up and the video synchronized to try to estimate 3D position
- re-render the game using the positions of the ball and the table
- Fun with Nintendo Wiimotes
- Several cool projects involving 2D/3D tracking using the Nintendo Wii Remote Control are described here.
- The Wiimote has a built-in infra-red camera and firmware for tracking multiple IR LEDs and/or reflective tape.
- Fingerprint Singular Point Detection
- Locate cores and deltas in fingerprint images
- Dataset: SPD 2010
- Automating the Hitchcock Zoom Effect
- also known as the Dolly zoom
- implement face/object tracker, maintain scale while camera translates forward or backward
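The "maintain scale" step can be sketched under a simple pinhole-camera assumption, where the subject's image size is proportional to focal length divided by distance. Keeping the size constant then means scaling the focal length (or a digital zoom factor) in proportion to the subject distance. The function names below are hypothetical, and the tracker that supplies the measured width is not shown.

```python
def focal_length_for_constant_size(f_ref, d_ref, d_new):
    """Focal length that keeps the subject the same image size after a dolly move.

    Pinhole model: image_size = f * object_size / distance,
    so f_new / d_new must equal f_ref / d_ref.
    """
    return f_ref * d_new / d_ref


def zoom_correction(target_width_px, tracked_width_px):
    """Multiplicative zoom needed to restore the tracked subject to its target width."""
    return target_width_px / tracked_width_px
```

For example, if the camera starts 2 m from the subject at 50 mm and dollies back to 4 m, the sketch gives 100 mm. When distance is unknown, `zoom_correction` can be driven directly by the tracker's bounding-box width.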
- Hands-free, gaze/head-pose based UI for touchscreen devices
- for iPad 2, iPhone, Android phone/tablet with front facing camera
- track user's face and head pose, determine intersection of normal vector from face with plane of tablet surface
- use a facial gesture (blink, nose wrinkle) as a virtual "tap"
- Relevant project: WATSON: Real-time Head Tracking and Gesture Recognition
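The "intersection of normal vector from face with plane of tablet surface" step is a standard ray-plane intersection. Here is a minimal sketch, assuming the head tracker already provides the face position and facing direction in the tablet's coordinate frame; the function name and the coordinate convention (tablet surface at z = 0) are assumptions for illustration.

```python
def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Intersect the ray origin + t * direction (t >= 0) with a plane.

    Returns the 3D hit point as a tuple, or None if the ray is parallel
    to the plane or points away from it.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # ray runs parallel to the tablet surface
    to_plane = [p - o for p, o in zip(plane_point, origin)]
    t = dot(to_plane, plane_normal) / denom
    if t < 0:
        return None  # user is facing away from the tablet
    return tuple(o + t * d for o, d in zip(origin, direction))
```

With the tablet surface as the plane z = 0, a face at (0.1, 0.2, 0.5) looking along (0, 0, -1) hits the screen at (0.1, 0.2, 0.0); mapping that point to pixel coordinates depends on the device's physical screen size.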
- Fun with Whiteboard Image Capture
- Related work from Z. Zhang at MSR: [www]
- Capture image of whiteboard with cell phone camera
- Auto-detect whiteboard edges
- Correct for perspective distortion
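Once the four whiteboard corners are detected, correcting perspective amounts to estimating a homography that maps them to a rectangle. The sketch below solves the resulting 8x8 linear system in pure Python; it is one common formulation (fixing the last homography entry to 1), not necessarily the method used in the related work, and all function names are hypothetical. Resampling the image through the homography is not shown.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for k in range(col, n + 1):
                    M[r][k] -= factor * M[col][k]
    return [M[i][n] / M[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """3x3 homography (H[2][2] fixed to 1) mapping four src points to four dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def warp_point(H, x, y):
    """Apply homography H to the point (x, y)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

In use, `src` would be the detected corners in the photo and `dst` the corners of the desired output rectangle, both listed in the same order.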
- Vision Based Guitar Tuner
- Leverage the rolling shutter effect for oscillating strings.
- Apply spectral analysis to waveform images.
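The spectral-analysis step can be sketched as finding the strongest frequency bin of a 1D waveform. With a rolling shutter, each image row is exposed at a slightly different time, so the string's displacement measured per row behaves like a time series; the sketch below assumes that waveform has already been extracted and that the row readout rate is known (passed as the hypothetical `sample_rate` parameter). It uses a plain discrete Fourier transform for clarity rather than speed.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the largest-magnitude DFT bin, ignoring DC.

    samples     : string displacement per image row (extraction not shown)
    sample_rate : rows read out per second (an assumed, camera-specific value)
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k * sample_rate / n
```

The frequency resolution is `sample_rate / n`, so a tuner would likely need interpolation between bins or a longer observation window to resolve small detunings.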
- Traffic Sign Recognition
- An increasingly important problem as more cars get front facing cameras.
- Excellent dataset available here.
Most recently updated on Mar 3, 2012 by Serge Belongie.