NCSA httpd tutorials

In an effort to make this documentation a bit more usable, we have written these
tutorials on various aspects of server setup. 

   NCSA httpd directory indexing

   NCSA httpd provides a directory indexing format which is similar to that
   which will be offered by the WWW Common Library. To set up this
   indexing, follow these steps.

   For an example of what these indexes look like, take a look at the demo.

   Activating Fancy indexing

   You first need to tell httpd to use the advanced indexing instead of the simple
   version. The simple version should be used if you prefer its simplicity, or if
   you are serving files off of a remote file server, for which the stat() call
   would be costly. You tell the server which you want to use with either the
   IndexOptions directive, or the older FancyIndexing directive. We recommend:

   IndexOptions FancyIndexing

   Icons

   NCSA httpd comes with a number of icons in the /icons subdirectory which
   are used for directory indexing. The first thing you should do is make sure
   your Server Resource Map has the following line in it:

   Alias /icons/ /usr/local/etc/httpd/icons/

   You should replace /usr/local/etc/httpd/ with whatever you set
   ServerRoot to be. 

   Next, you need to tell the server what icons to provide for different types of
   files. You do this with the AddIcon and AddIconByType directives. We
   recommend something like the following setup:

   AddIconByType (IMG,/icons/image.xbm) image/*
   AddIconByType (SND,/icons/sound.xbm) audio/*
   AddIconByType (TXT,/icons/text.xbm) text/*

   This covers the three main types of files. If you want to add your own icons,
   simply create the appropriately sized xbm, place it in /icons, and choose a
   3-letter ALT identifier for the type.

   httpd also requires three special icons, one for directories, one which is a
   blank icon the same size as the other icons, and one which specifies the parent
   directory of this index. To use the icons in the distribution, use the following
   lines in srm.conf:

   AddIcon /icons/menu.xbm ^^DIRECTORY^^
   AddIcon /icons/blank.xbm ^^BLANKICON^^
   AddIcon /icons/back.xbm ..

   However, not all files fit one of these types. To provide a general icon for any
   unknown files, use the DefaultIcon directive:

   DefaultIcon /icons/unknown.xbm
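
   The icon rules above amount to a first-match lookup by MIME type with a
   fallback. The following Python sketch illustrates that idea only; it is not
   the server's code. The function name pick_icon is hypothetical, and the
   assumption that the patterns behave like shell wildcards is ours.

```python
from fnmatch import fnmatchcase

# (ALT text, icon path, MIME pattern), mirroring the AddIconByType lines above.
ICONS_BY_TYPE = [
    ("IMG", "/icons/image.xbm", "image/*"),
    ("SND", "/icons/sound.xbm", "audio/*"),
    ("TXT", "/icons/text.xbm", "text/*"),
]
DEFAULT_ICON = "/icons/unknown.xbm"  # mirrors the DefaultIcon line


def pick_icon(mime_type):
    """Return (alt, icon) for a MIME type; alt is None when the default is used.

    Assumption: patterns are matched like shell wildcards, first match wins.
    """
    for alt, icon, pattern in ICONS_BY_TYPE:
        if fnmatchcase(mime_type, pattern):
            return alt, icon
    return None, DEFAULT_ICON
```

   Under these assumptions, a GIF file gets the image icon, and a file whose
   type matches none of the patterns falls through to the default icon.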

   Descriptions

   If you want to add descriptions to your files, use the AddDescription
   directive. For instance, to add the description "My pictures" to 
   /usr6/rob/public_html/images, use the following line:

   AddDescription "My pictures" /usr6/rob/public_html/images/* 

   If you want to have the titles of your HTML documents displayed for their
   descriptions, use the IndexOptions directive to activate ScanHTMLTitles:

   IndexOptions FancyIndexing ScanHTMLTitles

   WARNING: You should only use this option if your server has time to spare!!!
   This is a costly operation!

   Ignoring certain items

   Generally, you don't want httpd sending references to certain files when it's
   creating indexes. Such files are emacs autosave and backup files, httpd's
   .htaccess files, and perhaps any file beginning with . (if you have a gopher or
   FTP server running in that directory as well). We recommend you ignore the
   following patterns:

   IndexIgnore */.??* */README* */HEADER*

   This tells httpd to ignore any file beginning with ., and any file starting with
   README or HEADER.
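
   To see which names such patterns would hide, the Python sketch below tests a
   few paths against them. It assumes the patterns behave like shell wildcards
   applied to the full path, which may not match the server's matcher exactly;
   the function name is_ignored is hypothetical.

```python
from fnmatch import fnmatchcase

# The patterns from the IndexIgnore line above.
IGNORE_PATTERNS = ["*/.??*", "*/README*", "*/HEADER*"]


def is_ignored(path):
    """Return True if the path matches any ignore pattern.

    Assumption: shell-style wildcard semantics, case sensitive.
    """
    return any(fnmatchcase(path, pattern) for pattern in IGNORE_PATTERNS)
```

   Under shell-style matching, .??* requires at least two characters after the
   dot, so the parent directory entry (..) is not hidden by the first pattern.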

   Creating READMEs and HEADERs

   When httpd is indexing a directory, it will look for two things and insert them
   into the index: A header, and a README. Generally, the header contains an
   HTML <H1> tag with a title for this index, and a brief description of what's
   in this directory. The README contains things you may want people to read
   about the items being served. 

   httpd will look for both plaintext and HTML versions of HEADERs or
   READMEs. Suppose we add the following lines to srm.conf:

   ReadmeName README
   HeaderName HEADER

   When httpd is indexing a directory, it will first look for HEADER.html. If it
   doesn't find that file, it will look for HEADER. If it finds neither, it will
   generate its own. If it finds one, it will insert it at the beginning of the index.
   Similarly, the server will look for README.html, then README to insert a
   trailer for the document. 
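
   The lookup order described above can be sketched in Python as follows. This
   is only an illustration of the order (HTML version first, then plaintext);
   the function name find_insert is hypothetical.

```python
import os


def find_insert(directory, base_name):
    """Return the path of the file to insert, or None if neither exists.

    base_name is the value given to HeaderName or ReadmeName, for example
    "HEADER". The .html version is preferred over the plaintext one.
    """
    for candidate in (base_name + ".html", base_name):
        path = os.path.join(directory, candidate)
        if os.path.isfile(path):
            return path
    return None
```

   A result of None corresponds to the case where the server finds neither
   file.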


   Setting up CGI in NCSA httpd

   CGI scripts are a way for documents to be generated on the fly. You should
   first read the brief introduction to CGI to learn what it is and why you would
   want to use it.

   There are two main mechanisms to tell NCSA httpd where your scripts are.
   Each has its pluses and its minuses.

   ScriptAlias

   The first approach is based on the Server Resource Map directive ScriptAlias.
   With this directive, you specify to the server that you want to designate a
   directory (or directories) as script-only, that is, any time the server tries to
   retrieve a file from these directories it will execute the file instead of reading
   it.

   The usual setup is to have the following line in srm.conf:

   ScriptAlias /cgi-bin/ cgi-bin/

   This will make any request to the server which begins with /cgi-bin/ be
   fulfilled by executing the corresponding program in ServerRoot/cgi-bin/. 
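
   As an illustration of this mapping, the Python sketch below turns a request
   path into the program path under ServerRoot. The ServerRoot value is the
   example path used earlier in these tutorials, the function name
   resolve_script is hypothetical, and details such as extra path information
   after the script name are not handled.

```python
import posixpath

SERVER_ROOT = "/usr/local/etc/httpd"  # assumed ServerRoot for this example
# URL prefix -> directory, mirroring "ScriptAlias /cgi-bin/ cgi-bin/".
SCRIPT_ALIASES = {"/cgi-bin/": "cgi-bin/"}


def resolve_script(url_path):
    """Return the program path for a script request, or None if no alias matches."""
    for prefix, target in SCRIPT_ALIASES.items():
        if url_path.startswith(prefix):
            rest = url_path[len(prefix):]
            # A relative target is taken to be relative to ServerRoot.
            return posixpath.join(SERVER_ROOT, target, rest)
    return None
```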

   You may have more than one ScriptAlias directive in srm.conf to designate
   different directories as CGI.

   The advantage of this setup is ease of administration, and centralization.
   Many system managers don't want things as dangerous as scripts anywhere in
   the filesystem. The disadvantage is that anyone wishing to create scripts must
   either have their own entry in srm.conf or must have write access to a
   ScriptAliased directory. 

   CGI as files

   NCSA httpd 1.2 allows you to create CGI scripts anywhere, by specifying a
   "magic" MIME type for files which tells the server to execute them instead of
   sending them. To accomplish this, use the AddType directive in either the
   Server Resource Map or in a per-directory access control file.

   For instance, to make all files ending in .cgi scripts, use the following
   directive:

   AddType application/x-httpd-cgi .cgi

   Alternatively, you could add .sh and .pl after .cgi to allow automatic
   execution of shell scripts and Perl scripts. Note that you must have
   Options ExecCGI activated in the directory where you create scripts (you
   might want to read more about directives like Options in the installation
   docs).

   The advantage of this setup is that scripts may be absolutely anywhere. The
   disadvantage is that scripts may be absolutely anywhere (especially places you
   don't want them to be like users' home directories).
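
   For readers who have not seen one, here is a minimal sketch of what a script
   file might contain, written in Python. It assumes a Python interpreter is
   available on the server and that the file is saved with an extension mapped
   to the CGI type (for example hello.cgi, a hypothetical name) and made
   executable. A CGI script writes a Content-type header, a blank line, and
   then the document body to standard output.

```python
#!/usr/bin/env python3
import sys
import time


def build_response():
    """Build the full CGI output: header, blank line, then the HTML body."""
    body = (
        "<TITLE>Hello</TITLE>\n"
        "<H1>Hello</H1>\n"
        "Generated at %s\n" % time.ctime()
    )
    return "Content-type: text/html\n\n" + body


if __name__ == "__main__":
    # The server sends whatever the script writes to standard output.
    sys.stdout.write(build_response())
```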


   NCSA httpd server side includes

   NCSA httpd allows users to create documents which provide simple
   information to clients on the fly. Such information can include the current
   date, the file's last modification date, and the size or last modification of
   other files. In its more advanced usage, it can provide a powerful interface to
   CGI and /bin/sh programs. 

   Issues

   Having the server parse documents is a double edged sword. It can be costly
   for heavily loaded servers to perform parsing of files while sending them.
   Further, it can be considered a security risk to have average users executing
   commands as the server's User. If you disable the exec option, this danger is
   mitigated, but the performance issue remains. You should consider these
   items carefully before activating server-side includes on your server.

   Setting Up Includes

   First, you should decide which directories you want to allow Includes in.
   Most likely this will not include users' home directories or directories you do
   not trust. You should then decide, of the directories you are allowing includes
   in, which directories are safe enough to use exec in. 

   For the directories in which you want to fully enable includes, you need to use
   the Options directive to turn on the option Includes. Similarly for the
   directories you want crippled (no exec) includes, you should use the option 
   IncludesNOEXEC. In any directory you want to disable includes, use the
   Options directive without either option.

   Next, you need to tell the server what filename extension you are using for the
   parsed files. These files, while very similar to HTML, are not HTML and are
   thus not treated the same. Internally, the server uses the magic MIME type 
   text/x-server-parsed-html to identify parsed documents. It will
   then perform a format conversion to change these files into HTML for the
   client. To tell the server which extension you want to use for parsed files, use
   the AddType directive. For instance:

   AddType text/x-server-parsed-html .shtml

   This makes any file ending with .shtml a parsed file. Alternatively, if you
   don't care about the performance hit of having all .html files parsed, you
   could use:

   AddType text/x-server-parsed-html .html

   This would make the server parse all .html files. 

   Converting your old INC SRV documents to the new format

   You should use the program inc2shtml in the support subdirectory of
   the httpd distribution to translate your documents from httpd 1.1 and earlier
   to the new format. Usage is simple: inc2shtml file.html >
   file.shtml. 

   The Include format

   All directives to the server are formatted as SGML comments within the
   document. This is in case the document should ever find itself in the client's
   hands unparsed. Each directive has the following format:

   <!--#command tag1="value1" tag2="value2" -->

   Each command takes different arguments; most accept only one tag at a time.
   Here is a breakdown of the commands and their associated tags:

      config

      The config directive controls various aspects of the file parsing. There
      are three valid tags:

         errmsg controls what message is sent back to the client if an
         error occurs while parsing the document. When an error
         occurs, it is logged in the server's error log. 

         timefmt gives the server a new format to use when providing
         dates. This is a string compatible with the strftime library
         call under most versions of UNIX. 

         sizefmt determines the formatting to be used when
         displaying the size of a file. Valid choices are bytes, for a
         formatted byte count (formatted as 1,234,567), or abbrev for
         an abbreviated version displaying the number of kilobytes or
         megabytes the file occupies.

      include

      include will insert the text of a document into the parsed document.
      Any included file is subject to the usual access control. This command
      accepts two tags:

         virtual gives a virtual path to a document on the server.
         You must access a normal file this way; you cannot access a
         CGI script in this fashion. You can, however, access another
         parsed document. 

         file gives a pathname relative to the current directory. ../
         cannot be used in this pathname, nor can absolute paths be used.
         As above, you can send other parsed documents, but you cannot
         send CGI scripts. 

      echo prints the value of one of the include variables (defined below).
      Any dates are printed subject to the currently configured timefmt.
      The only valid tag to this command is var, whose value is the name
      of the variable you wish to echo.

      fsize prints the size of the specified file. Valid tags are the same as
      with the include command. The resulting format of this command
      is subject to the sizefmt parameter to the config command.

      flastmod prints the last modification date of the specified file,
      subject to the formatting preference given by the timefmt parameter
      to config. Valid tags are the same as with the include command.

      exec executes a given shell command or CGI script. It must be
      activated to be used. Valid tags are:

         cmd will execute the given string using /bin/sh. All of the
         variables defined below are defined, and can be used in the
         command.

         cgi will execute the given virtual path to a CGI script and
         include its output. The server does not perform error checking
         to make sure your script didn't output horrible things like a
         GIF, so be careful. It will, however, interpret any URL
         Location: header and translate it into an HTML anchor.
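
   To make the directive format concrete, the Python sketch below pulls the
   commands and their tags out of a small parsed document. It is a simplified
   illustration of the comment syntax, not the server's parser; the sample
   document, the file name index.shtml, and the function name parse_directives
   are all made up for this example.

```python
import re

# Matches <!--#command tag1="value1" tag2="value2" -->
DIRECTIVE = re.compile(r'<!--#(\w+)((?:\s+\w+="[^"]*")*)\s*-->')
TAG = re.compile(r'(\w+)="([^"]*)"')

SAMPLE = '''<TITLE>Demo</TITLE>
<!--#config timefmt="%A, %d %B %Y" -->
Today is <!--#echo var="DATE_LOCAL" -->.
This file was last changed <!--#flastmod file="index.shtml" -->.
'''


def parse_directives(text):
    """Return a list of (command, [(tag, value), ...]) in document order."""
    result = []
    for match in DIRECTIVE.finditer(text):
        command = match.group(1)
        tags = TAG.findall(match.group(2))
        result.append((command, tags))
    return result
```

   Because the directives are ordinary SGML comments, a client that receives
   the sample unparsed would simply not display them.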

   Variables defined for Parsed Documents

   A number of variables are made available to parsed documents. In addition to
   the CGI variable set, the following variables are made available:

      DOCUMENT_NAME: The current filename.

      DOCUMENT_URI: The virtual path to this document (such as
      /~robm/foo.shtml).

      QUERY_STRING_UNESCAPED: The unescaped version of any
      search query the client sent, with all shell-special characters escaped
      with \.

      DATE_LOCAL: The current date, local time zone. Subject to the 
      timefmt parameter to the config command.

      DATE_GMT: Same as DATE_LOCAL but in Greenwich mean time.

      LAST_MODIFIED: The last modification date of the current
      document. Subject to timefmt like the others.


   Making your setup more secure

   When configuring the access control for your server, you will want to make
   sure you do not give any unauthorized access to anyone. Please follow these
   guidelines to ensure that your server is not compromised. 

      A word of caution on DNS based access control and user
      authentication

      The access control by hostname and Basic user authentication facilities
      provided by httpd are relatively safe, but not bulletproof. The user
      authentication sends passwords across the network in plaintext,
      making them easily readable. The DNS based access control is only as
      safe as DNS, so you should keep that in mind when using it. Bottom
      line: If it absolutely positively cannot be seen by outside people, you
      probably should not use httpd to protect it.

      Disable Server-side includes wherever possible 

      Whenever you can, use the Options directive to disable server-side
      includes. At the very least, you should disable the exec feature. Note
      that because the default value of Options is All, you should include an
      Options directive in every Directory clause in your global ACF and in
      every .htaccess file you write. 

      Use AllowOverride None wherever possible 

      Use this directive to prevent any "untrusted" directories (such as users'
      home directories) from overriding your settings (and thus allowing
      their friends to execute xterms as nobody with a server-side include or
      other such horrors). You also gain a bonus in performance. 

      Protect your users' home directories

      Protect your users' home directories with Directory directives. If
      your users all have their home directories in one physical location
      (such as /home), then this is easy:

      <Directory /home>
      AllowOverride None
      Options Indexes
      </Directory>

      If they are not all in one location such as /home, then you should use
      this wildcard pattern to secure them (assuming your UserDir is set to 
      public_html):

      <Directory /*/public_html*>
      AllowOverride None
      Options Indexes
      </Directory>
       

      In addition, if you wish to give your users the ability to create
      symbolic links to things only they own, use the Option 
      SymLinksIfOwnerMatch.


   Mosaic User Authentication Tutorial

   Introduction

   This tutorial surveys the current methods in NCSA Mosaic for X version 2.0
   and NCSA httpd for restricting access to documents. The tutorial also walks
   through setup and use of these methods. 

   Mosaic 2.0 and NCSA httpd allow access restriction based on several criteria: 

      Username/password-level access authorization. 
      Rejection or acceptance of connections based on Internet address of
      client. 
      A combination of the above two methods. 

   This tutorial is based heavily on work done by Ari Luotonen at CERN and
   Rob McCool at NCSA. In particular, Ari wrote the client-side code currently
   in Mosaic 2.0, and Rob wrote NCSA httpd. 

   Getting Started

   Before you can explore access authorization, you need to install NCSA httpd
   1.0a5 or later on a Unix machine under your control, or get write access to
   one or more directories in a filespace already being served by NCSA httpd. 

   You also need to be running Mosaic for X version 2.0 or later, or another
   browser known to support HTTP/1.0-based authentication. 

   Prepared Examples

   Following are several examples of the range of access authorization
   capabilities available through Mosaic and NCSA httpd. The examples are
   served from a system at NCSA. 

   Simple protection by password. 

      This document is accessible only to user fido with password bones.

      Important Note: There is no correspondence between usernames and
      passwords on specific Unix systems (e.g. in an /etc/passwd file)
      and usernames and passwords in the authentication schemes we're
      discussing for use in the Web. As illustrated in the examples,
      Web-based authentication uses similar but wholly distinct password
      files; a user need never have an actual account on a given Unix system
      in order to be validated for access to files being served from that
      system and protected with HTTP-based authentication. 

   Protection by password; multiple users allowed. 

      This document is accessible to user rover with password bacon and
      user jumpy with password kibbles. 

   Protection by network domain. 

      This document is only accessible to clients running on machines inside
      domain ncsa.uiuc.edu. 

      Note for non-NCSA readers: The .htaccess file used in this case is
      as follows: 

      AuthUserFile /dev/null
      AuthGroupFile /dev/null
      AuthName ExampleAllowFromNCSA
      AuthType Basic

      <Limit GET>
      order deny,allow
      deny from all
      allow from .ncsa.uiuc.edu
      </Limit>

   Protection by network domain -- exclusion. 

      This document is accessible to clients running on machines anywhere 
      but inside domain ncsa.uiuc.edu. 

      Note for NCSA readers: The .htaccess file used in this case is as
      follows: 

      AuthUserFile /dev/null
      AuthGroupFile /dev/null
      AuthName ExampleDenyFromNCSA
      AuthType Basic

      <Limit GET>
      order allow,deny
      allow from all
      deny from .ncsa.uiuc.edu
      </Limit>
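
   The two domain examples above can be modeled with the Python sketch below.
   It is a simplified reading of the order, allow and deny lines, not the
   server's implementation: rule classes are applied in the order named, and a
   later match overrides an earlier one. What happens to a host that matches no
   rule at all is an assumption here (the default parameter); both examples
   contain an "all" rule, so they do not depend on it. The function names are
   hypothetical.

```python
def host_matches(host, spec):
    """Match a host name against 'all', a '.domain' suffix, or an exact name."""
    if spec == "all":
        return True
    if spec.startswith("."):
        return host.endswith(spec)
    return host == spec


def is_allowed(order, allow, deny, host, default=False):
    """Evaluate allow/deny lists in the given order, e.g. "deny,allow".

    Assumption: the class named later overrides the one named earlier, and
    a host that matches nothing gets the (assumed) default verdict.
    """
    specs = {"allow": allow, "deny": deny}
    verdict = default
    for kind in (part.strip() for part in order.split(",")):
        if any(host_matches(host, spec) for spec in specs[kind]):
            verdict = (kind == "allow")
    return verdict
```

   With the first example's rules, a host inside ncsa.uiuc.edu is admitted and
   any other host is refused; with the second example's rules, the outcome is
   reversed.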

   General Information

   There are two levels at which authentication can work: per-server and
   per-directory. This tutorial primarily covers per-directory authentication.
   Per-directory authentication means that users with write access to part of the
   filesystem that is being served can control access to their files as they wish.
   They need not have root access on the system or write access to the server's
   primary config files. 

   Access control for a given directory is controlled by a file named 
   .htaccess that resides in that directory. The server reads this file on each
   access to a document in that directory (or documents in subdirectories). 

   By-Password Authentication: Step By Step

   So let's suppose you want to restrict files in a directory called turkey to
   username pumpkin and password pie. Here's what to do: 

   Create a file called .htaccess in directory turkey that looks like
   this: 

   AuthUserFile /otherdir/.htpasswd
   AuthGroupFile /dev/null
   AuthName ByPassword
   AuthType Basic

   <Limit GET>
   require user pumpkin
   </Limit>

   Note that the password file will be in another directory (/otherdir). 

   Also note that in this case there is no group file, so we specify /dev/null
   (the standard Unix way to say "this file doesn't exist"). 

   AuthName can be anything you want. For now, AuthType should always
   be Basic. 

   Create the password file /otherdir/.htpasswd. 

   The easiest way to do this is to use the htpasswd program distributed with
   NCSA httpd. Do this: 

   htpasswd -c /otherdir/.htpasswd pumpkin

   Type the password -- pie -- twice as instructed. 

   Check the resulting file to get a warm feeling of self-satisfaction; it should
   look like this: 

   pumpkin:y1ia3tjWkhCK2

   That's all. Now try to access a file in directory turkey -- Mosaic should
   demand a username and password, and not give you access to the file if you
   don't enter pumpkin and pie. If you are using a browser that doesn't handle
   authentication, you will not be able to access the document at all. 

   How Secure Is It?

   The password is passed over the network neither encrypted nor as plain
   text; it is encoded in a printable form (base64, which is similar to
   uuencoding). Anyone watching packet traffic on the network will not see the
   password in the clear, but the password will be easily decoded by anyone who
   happens to catch the right network packet. 

   So basically this method of authentication is roughly as safe as telnet
   -style username and password security -- if you trust your machine to be on
   the Internet, open to attempts to telnet in by anyone who wants to try, then
   you have no reason not to trust this method also. 
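
   To show how little protection the encoding gives, the Python sketch below
   builds and then decodes the value of a Basic Authorization header, which is
   the base64 encoding of "username:password". The function names are
   hypothetical; the username and password are the ones from the walkthrough
   above.

```python
import base64


def encode_basic(username, password):
    """Return the Authorization header value a browser would send."""
    raw = ("%s:%s" % (username, password)).encode("ascii")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic(header_value):
    """Recover (username, password) from a captured header value."""
    scheme, token = header_value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    username, password = base64.b64decode(token).decode("ascii").split(":", 1)
    return username, password
```

   Calling decode_basic(encode_basic("pumpkin", "pie")) returns the original
   pair, with no key or secret required.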

   Multiple Usernames/Passwords

   If you want to give access to a directory to more than one username/password
   pair, follow the same steps as for a single username/password with the
   following additions: 

   Add additional users to the directory's .htpasswd file. 

   Use the htpasswd command without the -c flag to add additional users; e.g.: 

   htpasswd /otherdir/.htpasswd peanuts
   htpasswd /otherdir/.htpasswd almonds
   htpasswd /otherdir/.htpasswd walnuts

   Create a group file. 

   Call it /otherdir/.htgroup and have it look something like this: 

   my-users: pumpkin peanuts almonds walnuts

   ... where pumpkin, peanuts, almonds, and walnuts are the
   usernames. 

   Then modify the .htaccess file in the directory to look like this: 

   AuthUserFile /otherdir/.htpasswd
   AuthGroupFile /otherdir/.htgroup
   AuthName ByPassword
   AuthType Basic

   <Limit GET>
   require group my-users
   </Limit>

   Note that AuthGroupFile now points to your group file and that group 
   my-users (rather than individual user pumpkin) is now required for
   access. 

   That's it. Now any user in group my-users can use his/her individual
   username and password to gain access to directory turkey. 
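
   The group file format shown above (a group name, a colon, then a list of
   usernames separated by spaces) can be read with a few lines of Python. This
   sketch only illustrates the format; the function names are hypothetical and
   the server's own parsing may differ in details.

```python
def parse_group_file(text):
    """Return a dict mapping each group name to its list of usernames."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        name, members = line.split(":", 1)
        groups[name.strip()] = members.split()
    return groups


def user_in_group(groups, user, group):
    """Return True if the user is listed in the named group."""
    return user in groups.get(group, [])
```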

   CERN has extensive documents on http-based authentication. The URL is
   http://info.cern.ch/hypertext/WWW/AccessAuthorization/Overview.html. 


   Graphical Information Map Tutorial

   Introduction

   This document is a step-by-step tutorial for designing and serving graphical
   maps of information resources. Through such a map, users can be provided
   with a graphical overview of any set of information resources; by clicking on
   different parts of the overview image, they can transparently access any of the
   information resources (possibly spread out all across the Internet). 

   First Steps

   This tutorial assumes use of NCSA httpd (version 1.0a5 or later). Some other
   servers (e.g. Plexus) can also serve image maps, in server-specific ways; see
   the specific server's docs for more information. 

   Make sure you have a working NCSA httpd server installed and running. 

   Make sure you have write privileges to the server's 
   conf/imagemap.conf config file. 

   Also make sure that the imagemap program is compiled and in the server's 
   htbin directory. 

   This tutorial also assumes use of NCSA Mosaic for X version 2.0. Other
   clients that support inlined GIF images and HTTP/1.0 URL redirection will
   also work. 

   Your First Image Map

   In this section we walk through the steps needed to get an initial image map
   up and running. 

   First, create an image. 

   There are a number of image creation and editing programs that will work
   nicely -- the one I use is called xpaint (you can find it on ftp.x.org in 
   /R5contrib). The important thing is that the image ends up in GIF
   format. 

   A common scheme for an image map is a collection of rectangles and
   circles, each containing a short text description of some piece of
   information or some information server; interconnections are conveyed
   through lines or arcs. Try to keep the individual items in the map spaced
   out far enough so a user will clearly know what he or she is clicking on. 

   Second, create an image map file. 

   Here is what an image map file looks like: 

   default /X11/mosaic/public/none.html

   rect http://cui_www.unige.ch/w3catalog    15,8    135,39
   rect gopher://rs5.loc.gov/11/global       245,86  504,143  
   rect http://nearnet.gnn.com/GNN-ORA.html  117,122 175,158 

   The format is fairly straightforward. The first line specifies the default
   response (the file to be returned if the region of the image in which the user
   clicks doesn't correspond to anything). 

   Subsequent lines specify rectangles in the image that correspond to arbitrary
   URLs -- for the first of these lines, the rectangle specified by 15,8 (x,y of
   the upper-left corner, in pixels) and 135,39 (lower-right corner)
   corresponds to URL http://cui_www.unige.ch/w3catalog. 

   So, what you need to do is find the upper-left and lower-right corners of a
   rectangle for each information resource in your image map. A good tool to
   use for this is xv (also on ftp.x.org in /contrib) -- pop up the Info window and
   draw rectangles over the image with the middle mouse button. 

   It doesn't matter where you put your map file or what you name it. For the
   purposes of this example, let's assume it's called /foo/sample.map. 
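
   The way a click is resolved against such a map file can be sketched in
   Python as follows. This is an illustration of the format described above,
   not the imagemap program itself; the function names are hypothetical, and
   treating the rectangle edges as inclusive and taking the first matching
   rectangle are assumptions.

```python
SAMPLE_MAP = """\
default /X11/mosaic/public/none.html

rect http://cui_www.unige.ch/w3catalog    15,8    135,39
rect gopher://rs5.loc.gov/11/global       245,86  504,143
rect http://nearnet.gnn.com/GNN-ORA.html  117,122 175,158
"""


def parse_map(text):
    """Return (default_url, [(url, x1, y1, x2, y2), ...]) from map file text."""
    default, rects = None, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "default":
            default = fields[1]
        elif fields[0] == "rect":
            x1, y1 = (int(v) for v in fields[2].split(","))
            x2, y2 = (int(v) for v in fields[3].split(","))
            rects.append((fields[1], x1, y1, x2, y2))
    return default, rects


def lookup(text, x, y):
    """Return the URL for a click at pixel (x, y), or the default URL."""
    default, rects = parse_map(text)
    for url, x1, y1, x2, y2 in rects:
        # Assumption: edges are inclusive and the first match wins.
        if x1 <= x <= x2 and y1 <= y <= y2:
            return url
    return default
```

   A click at (20, 10) falls inside the first rectangle, while a click at
   (0, 0) falls outside every rectangle and yields the default document.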

   Third, tell your server about your image map file. 

   You do this by adding a line to the server's conf/imagemap.conf file.
   The line looks like this: 

   sample : /foo/sample.map

   ... where sample is the symbolic name for your image map and 
   /foo/sample.map is the actual name of your map file. 

   Fourth, create an HTML document that contains your map image. 

   An example follows: 

   Click on the information resource you wish to see: <P>

   <A HREF="http://machine/htbin/imagemap/sample">
   <img src="sample.gif" ismap>
   </A> <P>

   Note: 

      machine is the name of the machine on which your HTTP server
      resides. 

      sample is the symbolic name of your image map (from above). 

      sample.gif is the name of your image (assuming, of course, that
      it's in the same directory on your server as the HTML file). 

   Fifth, try it out! Load the HTML file, look at the inlined image, click
   somewhere, and see what happens. 

   Subsequent Image Maps

   You can serve as many image maps from a single server as you want. Just add
   lines to conf/imagemap.conf pointing to each image map file you
   create. 

   Real-World Examples

   Following are examples of distributed image maps on servers in the real
   world; they may or may not work at any point in time. The URL for them is
   provided. 

      Experimental Internet Resources Metamap
      http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Demo/metamap.html

      University of California Museum of Paleontology
      http://ucmp1.berkeley.edu/

      National Institute of Standards and Technology
      http://www.nist.gov/welcome.html 

      server map at NCHPC information server
      http://info.lcs.mit.edu/Info/structure.html 


   WAIS and HTTP Integration

   Introduction

   This document overviews existing methods for using WAIS as a back-end
   search engine for HTTP servers. 

   Information herein is currently experimental and may or may not work for
   you. 

   WAIS and Plexus

   Plexus is a powerful Perl-based HTTP server written and maintained by Tony
   Sanders at BSDI. The URLs you might be interested in are:
   http://www.bsdi.com/server/doc/plexus.html
   http://www.cs.cmu.edu:8001/Web/People/rgs/perl.html

   WAIS and GN

   GN is a multi-protocol server written and maintained by John Franks at
   NWU. It is shipped with support for WAIS as a back-end search engine. 

   The URLs you might be interested in are:
   http://hopf.math.nwu.edu/
   http://hopf.math.nwu.edu:70/0h/docs/waisgn.guide

   WAIS and NCSA httpd 1.0

   Rob McCool has written a CGI script which allows NCSA httpd 1.0 as well
   as other CGI compliant servers to access a WAIS database in the same way
   that is mentioned in this document. The script is in the CGI archive. It
   contains instructions for setting it up under httpd 1.0. 

   The URL you might be interested in is:
   ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd/cgi/wais.tar.Z

   freeWAIS 0.202's URL Type

   freeWAIS 0.202 is shipped with support for type "URL". Use of this type is a
   little tricky. 

   First, Mosaic 2.0 doesn't know how to deal with this type directly, but
   Mosaic 2.1 (when it is released) will. 

   Second, use of this type apparently implies overloading the "headline" of a
   WAIS hit with the URL. This is fine, except then the description that the user
   sees of a given document is the URL, and URLs are, as usual, pretty cryptic
   things to just throw in front of average users. 

   But anyway, here's how it works: 

       waisindex ... -t URL what-to-trim what-to-add ...

   So what does that mean? 

   Well, first, -t URL tells waisindex to use type URL (note use of 
   lowercase -t in this instance). 

   Second, what-to-trim and what-to-add are parameters that tell the
   indexer how to put together the URL that's returned as the result of a query. 

   Suppose your documents are normally stored in /X11/mosaic/public.
   Suppose also that these documents are normally served via a URL that begins
   with http://wintermute.ncsa.uiuc.edu:8080. 

   This means that a file stored as /X11/mosaic/public/foo.html,
   for example, is normally served as 
   http://wintermute.ncsa.uiuc.edu:8080/foo.html. 

   The waisindex command you'd use in this case would be something like
   the following: 

       waisindex -d ~/localwais/sources/www -export
         -t URL /X11/mosaic/public http://wintermute.ncsa.uiuc.edu:8080
         /X11/mosaic/public/*.html

   ... where ~/localwais/sources/www is the name of the WAIS
   index file and /X11/mosaic/public/*.html are the files you are
   indexing. 

   When queries are made on this database, the string 
   /X11/mosaic/public is removed from the beginning of the filename of
   a matching file and the string 
   http://wintermute.ncsa.uiuc.edu:8080 is put in its place. 

   As per our previous example: /X11/mosaic/public/foo.html
   turns into 
   http://wintermute.ncsa.uiuc.edu:8080/foo.html as the
   result of a WAIS hit. 
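
   The substitution just described is a simple prefix swap. The Python sketch
   below reproduces it for the example paths; it illustrates the effect of the
   what-to-trim and what-to-add parameters and is not waisindex code. The
   function name hit_to_url is hypothetical, and leaving non-matching filenames
   unchanged is an assumption.

```python
def hit_to_url(filename, what_to_trim, what_to_add):
    """Replace the leading what_to_trim of filename with what_to_add."""
    if not filename.startswith(what_to_trim):
        return filename  # assumption: names outside the prefix are left alone
    return what_to_add + filename[len(what_to_trim):]


url = hit_to_url(
    "/X11/mosaic/public/foo.html",
    "/X11/mosaic/public",
    "http://wintermute.ncsa.uiuc.edu:8080",
)
```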

   As you can see, this is perfect -- the WAIS server passes back the exact same
   URL that would normally be used to access this file via HTTP. So, everything
   from relative hyperlinks to relative inlined image references in the file will
   work correctly when the file is retrieved.