Project 3 is due today, and Project 4 is being distributed.
The CAPE and TA evaluations will be on Wednesday next week, November 21. This is the day before Thanksgiving, but do urge all your friends to come to class so they can provide feedback.
See the article "Building a Large-scale E-commerce Site with Apache and mod_perl" by Perrin Harkins, October 2001. This is a case study of the etoys.com site. The company no longer exists, but the site was the third busiest e-commerce site before Christmas 1999 and Christmas 2000, after eBay and Amazon. The article shows how to design a multi-tier architecture, similar to the Windows DNA architecture, and how to do component-based programming using a scripting language similar to PHP.
XML can be used as a notation for a programming language, e.g. VoiceXML.
Recent browsers (IE 5.5 and Mozilla, i.e. Netscape 6.0) handle XHTML,
though not all do so perfectly. Unfortunately, application developers
still have to cater to older browsers.
An XML document is a tree with exactly one root element and no overlapping elements. XML is case-sensitive, and both names and text can use non-Western characters.
Start tags are written <elementname ...> and end tags are written </elementname>. Start tags can have attributes, which have the syntax name="value". There is no XML-defined syntax inside attribute values, so nested elements are preferable for structured data. Also, each attribute name can appear at most once in a given start tag.
Elements are nested, and can appear inside free text. <name/> is an empty element; HTML has no such syntax.
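The well-formedness rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative samples, one per rule discussed in the notes.
samples = {
    "nested elements": "<a><b>text</b></a>",
    "empty element": "<a><b/></a>",
    "overlapping elements": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "case mismatch": "<a></A>",
    "duplicate attribute": '<a x="1" x="2"/>',
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

Only the first two samples should be accepted. The other four each break one rule: proper nesting, a single root, case-sensitive names, and unique attribute names.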
In free text, the special characters < and & must be written as &lt; and &amp;. Any XML parser translates these back before passing the text to any application using the parser.
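As a small illustration of this round trip, the sketch below escapes a string, embeds it in a document, and lets the parser translate it back. It uses Python's standard library; the element name code and the sample text are assumptions. A CDATA section carrying the same text is parsed for comparison.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b && c"      # text the application wants to store
escaped = escape(raw)      # rewrites < as &lt; and & as &amp;
document = f"<code>{escaped}</code>"

# The parser undoes the escaping, so the application sees the raw text again.
parsed_text = ET.fromstring(document).text

# The same text inside a CDATA section needs no escaping at all.
cdata_text = ET.fromstring("<code><![CDATA[if a < b && c]]></code>").text

print(escaped)
print(parsed_text == raw, cdata_text == raw)
```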
A section written <![CDATA[ text ]]> doesn't need escaped characters. VoiceXML uses CDATA to include grammars for voice recognition.
<?xml version="1.0" encoding="ISO_8859-1" standalone="no"?> optional processing instruction
<!DOCTYPE person SYSTEM "http://www.ucsd.edu/person.dtd">
<person born="1912" died="1954" id="p342">
<name>
<first_name>Alan</first_name>
<last_name>Turing</last_name>
</name>
<!-- Did the word computer scientist exist in Turing's day? --> this is a comment
<profession>computer scientist</profession>
<profession>mathematician</profession>
<profession>cryptographer</profession>
</person>
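To show how an application might read such a document, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The XML declaration, the DOCTYPE line, and the comment are left out of the embedded copy to keep the sketch self-contained. Note that attribute values arrive as plain strings, so any conversion is up to the application.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """<person born="1912" died="1954" id="p342">
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>"""

person = ET.fromstring(PERSON_XML)

# Attributes are looked up by name and are always strings.
born = int(person.get("born"))
died = int(person.get("died"))

# Nested elements are reached by a path of element names.
full_name = f"{person.findtext('name/first_name')} {person.findtext('name/last_name')}"

# Repeated elements come back as a list, in document order.
professions = [p.text for p in person.findall("profession")]

print(full_name, died - born, professions)
```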
A tag beginning <? and ending ?> is a processing
instruction. Processing instructions are considered to be markup, but
not elements, so they can appear outside the root element. Script code,
e.g. PHP code between <?php and ?>, is a special case.
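A parser that builds a full document tree reports processing instructions as nodes of their own kind, separate from elements. The sketch below uses Python's standard xml.dom.minidom module; the stylesheet reference and the PHP fragment are hypothetical examples.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<?xml-stylesheet type="text/css" href="style.css"?>'
    '<person><?php echo "hi"; ?><name>Alan</name></person>'
)

# Processing instructions that sit outside the root element.
top_level_pis = [
    n for n in doc.childNodes if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

# The first child of the root element is also a processing instruction.
inner = doc.documentElement.firstChild

for pi in top_level_pis + [inner]:
    # target is the name right after "<?"; data is the rest, uninterpreted.
    print(pi.target, "->", pi.data.strip())
```

The parser does not interpret the data part; it hands it over as an opaque string for the target application to process.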
A DTD specifies application-specific syntax. It cannot specify constraints like "this piece of data is a year after 2000" or even "this piece of data is a number." XML schemas can specify data types, but they are more complex and less widely used.
In an XML document, the DTD to use is given by a document type declaration, which looks like a special tag, for example
<!DOCTYPE person SYSTEM "http://www.ucsd.edu/person.dtd">
In general DTDs can be thousands of lines long, but the basics are simple. For example:
<!ELEMENT person (name, job*)>
<!ELEMENT name (first, middle?, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT paragraph (#PCDATA | name | footnote | date)*>
<!ELEMENT image EMPTY>
#PCDATA means parsed character data. In this type of free text, the special characters < and & must be written as &lt; and &amp;. Any XML parser translates these back before passing the text to any application using the parser. If #PCDATA is one choice among others, as in the paragraph element, the content of the element is said to be mixed.
The number of appearances allowed for a nested element is indicated
by * (zero or more), ? (zero or one), or + (one or more). Parentheses
indicate grouping, a comma means sequence, and | means choice.
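One way to read a content model is as a regular expression over the names of an element's children. The sketch below hand-translates two of the declarations above into Python regular expressions and checks elements against them. It is only an illustration of the notation, not a DTD validator, and the table CONTENT_MODELS and the helper children_match are made-up names.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models. Each child name is followed by a space
# so that adjacent names cannot run together.
CONTENT_MODELS = {
    "person": r"(name )(job )*",           # (name, job*)
    "name": r"(first )(middle )?(last )",  # (first, middle?, last)
}

def children_match(element):
    """Check an element's direct children against its hand-written model."""
    sequence = "".join(child.tag + " " for child in element)
    return re.fullmatch(CONTENT_MODELS[element.tag], sequence) is not None

good_name = ET.fromstring("<name><first>Alan</first><last>Turing</last></name>")
bad_name = ET.fromstring("<name><last>Turing</last><first>Alan</first></name>")
person = ET.fromstring("<person><name/><job>a</job><job>b</job></person>")
no_name = ET.fromstring("<person><job>a</job></person>")

for element in (good_name, bad_name, person, no_name):
    print(element.tag, children_match(element))
```

The check looks only at direct children, so the empty name element inside person is not examined here. A real validating parser applies the model of every element in the document.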