Hypertext Webster Gateway FAQ

this is an old web page

bsy has left UCSD; please update your bookmarks if you really want to access this site

This is the FAQ for questions related to the Hypertext Webster Interface Gateway.
What's the privacy policy for the gateway?
I look at the logs periodically only for abuses, e.g., the use of robots that don't conform to the /robots.txt convention, attempts to break in (e.g., using the old phf abuse), that sort of thing. Server load statistics are also gathered, but only in the aggregate: only the number of hits per day, not even per-IP-address / per-domain analysis. The logs are deleted / truncated occasionally to recover disk space, though not on a particularly regular basis.
What's the load on the gateway server?
I use the server for more than just the Hypertext Webster Gateway, so hit counts alone are a little misleading. But since they were easy to produce, that's what I have. You can see the weekly hit rate graph -- the data consists of hits placed into 10-minute buckets, shoved through a low pass filter, then folded to average multiple weeks' worth of data together (not that many yet). The graph starts sometime during Saturday.
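The bucket / filter / fold steps can be sketched roughly as follows. This is a minimal illustration, not the script that produced the graph: the function names, the choice of an exponential moving average as the low pass filter, and the smoothing factor `alpha` are all assumptions.

```python
from collections import Counter

BUCKET_SECONDS = 10 * 60                                  # 10-minute buckets
BUCKETS_PER_WEEK = 7 * 24 * 60 * 60 // BUCKET_SECONDS     # 1008 buckets per week


def bucket_hits(timestamps, start):
    """Count hits per 10-minute bucket, measured from `start` (epoch seconds)."""
    counts = Counter(
        int((t - start) // BUCKET_SECONDS) for t in timestamps if t >= start
    )
    n = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n)]


def low_pass(series, alpha=0.2):
    """Exponential moving average; `alpha` is an assumed smoothing factor."""
    out, prev = [], None
    for x in series:
        prev = x if prev is None else alpha * x + (1 - alpha) * prev
        out.append(prev)
    return out


def fold_weeks(series):
    """Average each bucket-of-the-week over all the weeks present in `series`."""
    sums = [0.0] * BUCKETS_PER_WEEK
    seen = [0] * BUCKETS_PER_WEEK
    for i, x in enumerate(series):
        sums[i % BUCKETS_PER_WEEK] += x
        seen[i % BUCKETS_PER_WEEK] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, seen)]
```

Choosing `start` to fall on a Saturday would make the folded series begin on Saturday, as the graph does; `fold_weeks(low_pass(bucket_hits(timestamps, start)))` then gives one averaged week.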

I've also produced pie-chart plots of hits (gathered from about a week's worth of accesses) categorized into hits by domain and hits by country. The country code chart may be a little misleading, since the .com, .net, etc. addresses are not counted. A chart without those accesses is also available.

What is the full definition of word X?
If the definition returned by the Hypertext Webster Gateway isn't sufficient, you can consult other on-line dictionaries. The returned page should have links to other dictionaries at the bottom of the page. Other, more general resources are various on-line encyclopedias as well as web-wide search engines.

Unfortunately, I cannot give you any personal help. I am merely an assistant professor in computer science, and my job is to do research in computer security (as well as teach). Your local librarian at your local (corporate/public) library is likely to be a more appropriate resource for simple word/vocabulary questions.

What is the third English word that ends in "gry" (aside from "angry" and "hungry")?
I've heard that it's "puggry" or "puggaree", which is a Hindi word that's been transliterated into English. Given that I am -not- your local librarian nor an English professor, I have no idea whether this is correct or not; it certainly is not what I'd call a common word. Ask me a computer security question instead (reasonable consulting rates :).

A better answer comes from George Fields:

As I understand it, this whole brouhaha started with a bad puzzle -

"Angry and hungry are common english words that end in 'gry'. What is the third word that ends in 'gry'? It is a common word that you use every day and if you were listening closely, you've already heard it".

The answer is hungry - the third word of the puzzle that ends in gry.

See also the GRY FAQ at the Internet Public Library

What is the Webster's Hypertext Interface Gateway? Where do they run? Who keeps them going?
There is one webster gateway that is presently in operation: http://smac.ucsd.edu/cgi-bin/http_webster. I wrote an earlier version of this interface gateway while a graduate student at CMU using an NCSA httpd that I modified before CGI existed; the current gateway is an updated version that I run under Apache. I run the UCSD server; the CMU server currently only forwards requests to the UCSD server.
Could you change the appearance of the interface? Change the color of the links? Add snazzy graphics?
The appearance of the interface pages is determined by your browser. I do not wish to dictate how things should look for everybody because a few people decided that a particular color scheme or font combination looks good. You get to decide that for yourself by changing your browser's defaults. For most graphical browsers, you can try out the "Options" or "Preferences" menu. You should check out the documentation for your browser.
Your site is being recognized by our group with award PLOUGH -- would you please check out the graphics at URL XYZZY and include one of them on your page, so that the graphic is a link to our site?
Thank you for your (form) letter, but I would like to keep the interface simple -- so that it loads faster for low-bandwidth (e.g., dialup) users -- and free from random advertising. I am glad that you find the interface interesting, and am happy that you find it deserving of the PLOUGH award.
May I set up a link to your dictionary interface? How about including a local query box in a locally generated form?
Please do; many people have already done so (most without asking!). I'd appreciate it if you included a link to the opening screen (empty query: http://smac.ucsd.edu/cgi-bin/http_webster) for the usage instructions, to minimize the number of `how to' emails that I might otherwise receive. Note that the gateway's interface is subject to change.

If the primary server at http://smac.ucsd.edu/cgi-bin/http_webster is down, you can try the backup server at http://philby.ucsd.edu/cgi-bin/http_webster. The philby server is more experimental and the interface there is more likely to be down, broken, or out of date.
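For pages or scripts that link to the gateway, the primary-then-backup order could be handled along these lines. This is only a sketch: the two URLs are the ones given above, while the helper names `http_get` and `fetch_with_fallback`, the timeout, and the text decoding are assumptions made for illustration.

```python
from urllib.request import urlopen

# The two URLs come from this FAQ; everything else here is illustrative.
SERVERS = [
    "http://smac.ucsd.edu/cgi-bin/http_webster",    # primary
    "http://philby.ucsd.edu/cgi-bin/http_webster",  # backup, more experimental
]


def http_get(url, timeout=10):
    """Fetch a URL and return the body as text (decoding is an assumption)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("latin-1")


def fetch_with_fallback(servers=SERVERS, fetch=http_get):
    """Try each server in order; return (url, body) from the first that answers."""
    errors = []
    for url in servers:
        try:
            return url, fetch(url)
        except OSError as exc:   # urllib's URLError and socket timeouts are OSErrors
            errors.append((url, exc))
    raise RuntimeError(f"all servers failed: {errors}")
```

Passing the fetch function as a parameter keeps the ordering logic testable without touching the network.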

Can I get the sources?
The source code is available as a tar file. Note that it does not contain the database(s), and that I request that you use this only non-commercially; this includes using it to attract web traffic to sell advertisements.
Can I get the database? Can you fix a misspelling / error in the dictionary database?
My gateway does not have any other access to the dictionary database than what the dictionary protocol (which the client program uses) provides. Thus, the hyperlinks to other definitions are provided using heuristics -- which may fail -- to make it easier for the user to find related words. The gateway does not know whether a definition actually exists.
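To show what such a heuristic might look like (this is not the gateway's actual code), the sketch below wraps every word-like token in a link without checking whether the target has a definition. The `link_words` name, the word pattern, and the `/lookup?word=` URL prefix are all hypothetical placeholders, not the gateway's real URL scheme.

```python
import html
import re
from urllib.parse import quote

# Assumed notion of a "word": letters, optionally joined by hyphens.
WORD = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def link_words(definition, base="/lookup?word="):
    """Link every word to a lookup page; no check that a definition exists."""
    out, pos = [], 0
    for m in WORD.finditer(definition):
        out.append(html.escape(definition[pos:m.start()]))  # text between words
        word = m.group(0)
        out.append(f'<a href="{base}{quote(word.lower())}">{word}</a>')
        pos = m.end()
    out.append(html.escape(definition[pos:]))               # trailing text
    return "".join(out)
```

Because nothing is looked up in advance, some links will lead to words the back-end server does not define, which is the failure mode described above.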

Furthermore, because the dictionary servers are located elsewhere, I have no control over their content. If there is an error or words are missing, you should contact the back-end server's administrator. As the interface now says, I only wrote the interface and am not responsible for the contents. Plus, I do not wish to exercise editorial control.

Do NOT write a software web robot client that ignores the /robots.txt convention and mechanically tries to dump the dictionary contents through this web interface. This is unreasonable behavior, since (1) you should ask the people running the dictionary server for permission, (2) you can do it much more easily by talking directly to the back-end dictionary servers without wasting the hypertext interface gateway's cycles, and (3) you will be disproportionately loading the hypertext interface and the back-end server(s) so that other users may have difficulty accessing the dictionary. New: you can get the back-end software and databases from http://www.dict.org/. A direct download is much simpler than trying to dump a database through the web.
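A well-behaved robot checks /robots.txt before fetching anything. Here is a minimal sketch using Python's standard `urllib.robotparser`; the rules in `EXAMPLE_RULES`, the `examplebot` user agent, and the example.org URLs are made up for illustration and are not the contents of this server's robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
EXAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
]


def make_parser(robots_txt_lines):
    """Build a parser from the lines of an already-downloaded robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt_lines)
    return parser


def may_fetch(parser, user_agent, url):
    """Return True only if the rules allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


parser = make_parser(EXAMPLE_RULES)
print(may_fetch(parser, "examplebot", "http://example.org/cgi-bin/http_webster"))
print(may_fetch(parser, "examplebot", "http://example.org/index.html"))
```

Against a live site, a robot would instead call `parser.set_url(...)` with the site's /robots.txt URL followed by `parser.read()`, and skip any URL for which `can_fetch` returns False.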

Which edition of which dictionary is being used? Is it copyrighted data?
See the server credits page. I do not check the data to verify that it is non-copyrighted; the backend server can add a new database tomorrow, and the gateway will automatically pick up definitions from it -- I will not constantly monitor the server's data, nor accept liability for what goes on at the backend.
Where can I find a copy for local use? Do you have a word list that I can use for my own project Y?
The sources are available (see above). Other sources of on-line dictionaries are: Microsoft Bookshelf, NeXT's Webster's dictionary (bundled with NeXTStep, last I heard), Britannica on-line. These often include sound support and pronunciation guides, but do not provide the same kind of hypertext access. For the first two, you should be able to find them at any good software store. There're also a couple of FTPable word lists / thesauri around; some of these may be found by following the word-list links from my security Web page at http://www.cs.ucsd.edu/users/bsy/sec.html.
Have you thought about accepting advertising? Do you want donated resources / money?
I'd prefer to keep it free from advertising; since it's been going pretty well being supported by various individuals in their spare time, running on spare cycles, there's really no need. Furthermore, the data is not being provided by me -- the only `value added' that I provide is the gatewaying to the Web to allow easier use -- since none of the web browsers understand the dictionary protocol directly (like phones allowing you to call your local library). And that's not so much. I don't feel I ought to profit from what started as -- and still isn't much more than -- an afternoon hack. (It's now a few days' worth of hacking.)

If you still want to donate money/resources, please give it to your local educational institution (well, go ahead, send it to UCSD and/or CMU :). I'd appreciate an email message letting me know if you do so. And support research -- without the opportunity to think broadly in a research environment and play with ideas (even though they are not in my own area of research), I would not have had the chance to come up with the simple ideas leading to this webster's interface.

Yes, I have bent my own rules a little by displaying the GIFs for the cryptography and on-line privacy controversy.

I am offended by a definition. I want names and numbers.
I have received some letters similar to this one:
To Whom It May concern, I am very disturbed with the definitions of the word BLACK in the Webster's Dictionary. I understand that you may not be the person who wrote the definitions, but regardless you had something to do with it being put into this system. I would like some names, numbers, and address' of all of those who are responsible for these definitions. All the way up to the person who came up with the definition.
(Authorship intentionally omitted. Not all email about offensive words is about the word black; other words on other topics have been found offensive by various people as well.)

The definition to which this particular email is probably referring is from a 1912 edition of Webster's dictionary. The person who came up with the definition is quite likely to be long dead. This database is in the public domain (the copyright has expired), and is, as the Hypertext Webster Gateway FAQ (this document) says, not served by the gateway but by a server machine elsewhere on the Internet. The gateway can't do anything about it; the server credits / FAQ page has some contact info.

I think most people would (prudently) be hesitant to update such a database, since actually taking editorial control of it would mean accepting responsibility and would make the editor a possible lawsuit target. (This is why you can't sue the phone company for an obscene phone call -- they are just carriers of the phone call.) While I honestly believe that some of the definitions are a bit outdated and perhaps even possibly offensive to some people, I certainly am not willing to touch the database and exercise editorial control. Plus, I don't think anybody involved at the server back-end has the time to do this -- if you want to download a copy of the database and update it yourself and offer its use to the public, it's available, and I encourage you to do so. The beauty of the Internet is that setting up a web site / information source is cheap, and grassroots cooperation is very easy.

Note that the 1912 edition of Webster's contains the word "Chinaman", the definition of which, unlike the entry from the WordNet database, does not indicate that it's considered offensive either.


bsy+www@cs.ucsd.edu, last updated Fri Aug 29 00:28:03 PDT 2003. Copyright 2003 Bennet Yee.

Don't make me hand over my privacy keys!