Piazza forum


Schedule

Staff

Prerequisites

Project

Additional Reading List

Database Systems: Advanced Topics and Implementation
CSE232B

Description: Description: Description:
                      Description: Description: Description:
                      Description: Description:
                      Z:\cse232b\ArrowTopBlue.gif

Welcome!

For announcements regarding the course, please sign up to our Piazza forum.

Milestone 1 of the project will be due in week 4, Milestone 2 in week 7, milestone 3 in week 10. Stay tuned for exact dates.


Schedule Description: Description: Description:
                      Description: Description: Description:
                      Description: Description:
                      Z:\cse232b\ArrowTopBlue.gif

.

Mon

Tue

Wed

Thu

Fri

Lecture

1:00-1:50 pm
Peterson 103


1:00-1:50 pm
Peterson 103


1:00-1:50 pm
Peterson 103

Instructor Office Hours

9:00 - 10:00 am
 CSE 3238
TA Office Hours





Staff Description: Description: Description:
                      Description: Description: Description:
                      Description: Description:
                      Z:\cse232b\ArrowTopBlue.gif

  • Instructor: Alin Deutsch, deutsch at cs dot ucsd dot edu, office CSE 3238
  • TAs: Nikos Koulouris, nkoulour   at cs dot ucsd dot edu

Prerequisites Description: Description: Description:
                      Description: Description: Description:
                      Description: Description:
                      Z:\cse232b\ArrowTopBlue.gif

  • cse232A or instructor's permission, granted only if the following prerequisites are fulfilled
  • Java (8B or 11 or equivalent)
  • SQL (132A or equivalent)

Class Project

You may team up with (at most) one partner for the class projects. Use the Piazza forum to advertise if you are looking for a partner.

Our class project is the construction of an XQuery processor. We consider a subset/modification of XMLs data model, XQuery, and XQuerys type system as described in this note. The processor receives an XQuery, parses it into an abstract tree representation, optimizes it and finally executes the optimized plan.

  • Milestones 1 and 2 (Nave Evaluation) [due in weeks 4 and 7, respectively]: A straightforward query execution engine receives the simplified XQuery and an input XML file and evaluates the query using a recursive evaluation routine which, given an XQuery expression (path, concatenation, element creation, etc) and a list of input nodes, produces a list of output nodes. For the XQuery parser, we recommend the ANTLR4. Provided with a grammar, ANTLR generates a compiler which automatically constructs abstract syntax trees of  its input expressions.
    Milestone 1 delivers a naive evaluator for XPath, while Milestone 2 extends it to XQuery.

  • Milestone 3 (Efficient Evaluation): Implement a join operator as defined in this note. Implement an algorithm which detects the fact that the FOR and WHERE clause computation can be implemented using the join operator. You may assume that the input XQueries to be optimized are in the simplified syntax given in the note. No need to first normalize your queries to this form.


To access XML files you can use the standard DOM interface. There are a number of XML DOM parser implementations.
The Java distribution includes one (see documentation here). As an alternative, the Xerces-J project from Apache is quite mature and stable.

The W3C specification of DOM is here.


Test cases for project Phase I
. For the data, download Shakespeare's play, Julius Caesar, in XML form (the associated DTD is here). Queries can be found here.


Additional Reading




Grading

The project constitutes 70% of the final grade; final constitutes 25%; class/Piazza participation constitute 5%.

Reading List

Formal XPath Semantics note

A brief informal XQuery tutorial (much briefer and more readable than the W3C standard below)


Textbook material (from the warmly recommended textbook "Web Data Management"):

  • XQuery Advanced Topics slides.


The complete XQuery and XML Schema documentation (the WWW Consortium standards):