NCSA httpd tutorials
In an effort to make this documentation a bit more usable, we have written these
tutorials on various aspects of server setup.
NCSA httpd directory indexing
NCSA httpd provides a directory indexing format which is similar to that
which will be offered by the WWW Common Library. To set up this
indexing, follow these steps.
For an example of what these indexes look like, take a look at the demo.
Activating Fancy indexing
You first need to tell httpd to use the advanced indexing instead of the simple
version. The simple version should be used if you prefer its simplicity, or if
you are serving files off of a remote file server, for which the stat() call
would be costly. You tell the server which you want to use with either the
IndexOptions directive, or the older FancyIndexing directive. We recommend:
IndexOptions FancyIndexing
Icons
NCSA httpd comes with a number of icons in the /icons subdirectory which
are used for directory indexing. The first thing you should do is make sure
your Server Resource Map h as the following line in it:
Alias /icons/ /usr/local/etc/httpd/icons/
You should replace /usr/local/etc/httpd/ with whatever you set
ServerRoot to be.
Next, you need to tell the server what icons to provide for different types of
files. You do this with the AddIcon and AddIconByType directives. We
recommend something like the following setup:
AddIconByType (IMG,/icons/image.xbm) image/*
AddIconByType (SND,/icons/sound.xbm) audio/*
AddIconByType (TXT,/icons/text.xbm) text/*
This covers the three main types of files. If you want to add your own icons,
simply create the appropriately sized xbm, place it in /icons, and choose a
3-letter ALT identifier for the type.
httpd also requires three special icons, one for directories, one which is a
blank icon the same size as the other icons, and one which specifies the parent
directory of this index. To use the icons in the distribution, use the following
lines in srm.conf:
AddIcon /icons/menu.xbm ^^DIRECTORY^^
AddIcon /icons/blank.xbm ^^BLANKICON^^
AddIcon /icons/back.xbm ..
However, not all files fit one of these types. To provide a general icon for any
unknown files, use the DefaultIcon directive:
DefaultIcon /icons/unknown.xbm
Descriptions
If you want to add descriptions to your files, use the AddDescription
directive. For instance, to add the description "My pictures" to
/usr6/rob/public_html/images, use the following line:
AddDescription "My pictures" /usr6/rob/public_html/images/*
If you want to have the titles of your HTML documents displayed for their
descriptions, use the IndexOptions directive to activate ScanHTMLTitles:
IndexOptions FancyIndexing ScanHTMLTitles
WARNING: You should only use this option if your server has time to spare!!!
This is a costly operation!
Ignoring certain items
Generally, you don't want httpd sending references to certain files when it's
creating indexes. Such files are emacs autosave and backup files, httpd's
.htaccess files, and perhaps any file beginning with . (if you have a gopher or
FTP server running in that directory as well). We recommend you ignore the
following patterns:
IndexIgnore */.??* */README* */HEADER*
This tells httpd to ignore any file beginning with ., and any file starting with
README or HEADER.
Creating READMEs and HEADERs
When httpd is indexing a directory, it will look for two things and insert them
into the index: A header, and a README. Generally, the header contains an
HTML
tag with a title for this index, and a brief description of what's
in this directory. The README contains things you may want people to read
about the items being served.
httpd will look for both plaintext and HTML versions of HEADERs or
READMEs. If we add the following lines to srm.conf:
ReadmeName README
HeaderName HEADER
When httpd is indexing a directory, it will first look for HEADER.html. If it
doesn't find that file, it will look for HEADER. If it finds neither, it will
generate its own. If it finds one, it will insert it at the beginning of the index.
Similarly, the server will look for README.html, then README to insert a
trailer for the document.
Setting up CGI in NCSA httpd
CGI scripts are a way for documents to be generated on the fly. You should
first read the brief introduction to CGI to learn what it is and why you would
want to use it.
There are two main mechanisms to tell NCSA httpd where your scripts are.
Each has its pluses and its minuses.
ScriptAlias
The first approach is based on the Server Resource Map directive ScriptAlias.
With this directive, you specify to the server that you want to designate a
directory (or directories) as script-only, that is, any time the server tries to
retrieve a file from these directories it will execute the file instead of reading
it.
The usual setup is to have the following line in srm.conf:
ScriptAlias /cgi-bin/ cgi-bin/
This will make any request to the server which begins with /cgi-bin/ be
fulfilled by executing the corresponding program in ServerRoot/cgi-bin/.
You may have more than one ScriptAlias directive in srm.conf to desingnate
different directories as CGI.
The advantage of this setup is ease of administration, and centralization.
Many system managers don't want things as dangerous as scripts anywhere in
the filesystem. The disadvantage is that anyone wishing to create scripts must
either have their own entry in srm.conf or must have write access to a
ScriptAliased directory.
CGI as files
NCSA httpd 1.2 allows you to create CGI scripts anywhere, by specifying a
"magic" MIME type for files which tells the server to execute them instead of
sending them. To accomplish this, use the AddType directive in either the
Server Resource Map or in a per-directory access control file.
For instance, to make all files ending in .cgi scripts, use the following
directive:
AddType application/x-httpd-cgi .cgi
Alternatively, you could add .sh and .pl after .cgi to allow automatic
execution of shell scripts and PERL scripts. Note that you have to have
Options ExecCGI activated in the directory you create scripts. (you might
want to read more about directives like Option in the docs for installation).
The advantage of this setup is that scripts may be absolutely anywhere. The
disadvantage is that scripts may be absolutely anywhere (especially places you
don't want them to be like users' home directories).
NCSA httpd server side includes
NCSA httpd allows users to create documents which provide simple
information to clients on the fly. Such information can include the current
date, the file's last modification date, and the size or last modification of
other files. In its more advanced usage, it can provide a powerful interface to
CGI and /bin/sh programs.
Issues
Having the server parse documents is a double edged sword. It can be costly
for heavily loaded servers to perform parsing of files while sending them.
Further, it can be considered a security risk to have average users executing
commands as the server's User. If you disable the exec option, this danger is
mitigated, but the performance issue remains. You should consider these
items carefully before activating server-side includes on your server.
Setting Up Includes
First, you should decide which directories you want to allow Includes in.
Most likely this will not include users' home directories or directories you do
not trust. You should then decide, of the directories you are allowing includes
in, which directories are safe enough to use exec in.
For the directories in which you want to fully enable includes, you need to use
the Options directive to turn on the option Includes. Similarly for the
directories you want crippled (no exec) includes, you should use the option
IncludesNOEXEC. In any directory you want to disable includes, use the
Options directive without either option.
Next, you need to tell the server what filename extension you are using for the
parsed files. These files, while very similar to HTML, are not HTML and are
thus not treated the same. Internally, the server uses the magic MIME type
text/x-server-parsed-html to identify parsed documents. It will
then perform a format conversion to change these files into HTML for the
client. To tell the server which extension you want to use for parsed files, use
the AddType directive. For instance:
AddType text/x-server-parsed-html .shtml
This makes any file ending with .shtml a parsed file. Alternatively, if you
don't care about the performance hit of having all .html files parsed, you
could use:
AddType text/x-server-parsed-html .html
This would make the server parse all .html files.
Converting your old INC SRV documents to the new format
You should use the program inc2shtml in the support subdirectory of
the httpd distribution to translate your documents from httpd 1.1 and earlier
to the new format. Usage is simple: inc2shtml file.html >
file.shtml.
The Include format
All directives to the server are formatted as SGML comments within the
document. This is in case the document should ever find itself in the client's
hands unparsed. Each directive has the following format:
Each command takes different arguments, most only accept one tag at a time.
Here is a breakdown of the commands and their associated tags:
config
The config directive controls various aspects of the file parsing. There
are two valid tags:
errmsg controls what message is sent back to the client if an
error includes while parsing the document. When an error
occurs, it is logged in the server's error log.
timefmt gives the server a new format to use when providing
dates. This is a string compatible with the strftime library
call under most versions of UNIX.
sizefmt determines the formatting to be used when
displaying the size of a file. Valid choices are bytes, for a
formatted byte count (formatted as 1,234,567), or abbrev for
an abbreviated version displaying the number of kilobytes or
megabytes the file occupies.
include
include will insert the text of a document into the parsed document.
Any included file is subject to the usual access control. This command
accepts two tags:
virtual gives a virtual path to a document on the server.
You must access a normal file this way, you cannot access a
CGI script in this fashion. You can, however, access another
parsed document.
file gives a pathname relative to the current directory. ../
cannot be used in this pathname, nor can absolute paths be used.
As above, you can send other parsed documents, but you cannot
send CGI scripts.
echo prints the value of one of the include variables (defined below).
Any dates are printed subject to the currently configured timefmt.
The only valid tag to this command is var, whose value is the name
of the variable you wish to echo.
fsize prints the size of the specified file. Valid tags are the same as
with the include command. The resulting format of this command
is subject to the sizefmt parameter to the config command.
flastmod prints the last modification date of the specified file,
subject to the formatting preference given by the timefmt parameter
to config. Valid tags are the same as with the include command.
exec executes a given shell command or CGI script. It must be
activated to be used. Valid tags are:
cmd will execute the given string using /bin/sh. All of the
variables defined below are defined, and can be used in the
command.
cgi will execute the given virtual path to a CGI script and
include its output. The server does not perform error checking
to make sure your script didn't output horrible things like a
GIF, so be careful. It will, however, interpret any URL
Location: header and translate it into an HTML anchor.
Variables defined for Parsed Documents
A number of variables are made available to parsed documents. In addition to
the CGI variable set, the following variables are made available:
DOCUMENT_NAME: The current filename.
DOCUMENT_URI: The virtual path to this document (such as
/~robm/foo.shtml).
QUERY_STRING_UNESCAPED: The unescaped version of any
search query the client sent, with all shell-special characters escaped
with \.
DATE_LOCAL: The current date, local time zone. Subject to the
timefmt parameter to the config command.
DATE_GMT: Same as DATE_LOCAL but in Greenwich mean time.
LAST_MODIFIED: The last modification date of the current
document. Subject to timefmt like the others.
Making your setup more secure
When configuring the access control for your server, you will want to make
sure you do not give any unauthorized access to anyone. Please follow these
guidelines to ensure that your server is not compromised.
A word of caution on DNS based access control and user
authentication
The access control by hostname and Basic user authentication facilities
provided by httpd are relatively safe, but not bulletproof. The user
authentication sends passwords across the network in plaintext,
making them easily readable. The DNS based access control is only as
safe as DNS, so you should keep that in mind when using it. Bottom
line: If it absolutely positively cannot be seen by outside people, you
probably should not use httpd to protect it.
Disable Server-side includes wherever possible
Whenever you can, use the Options directive to disable server-side
includes. At the very least, you should disable the exec feature. Note
that because the default value of Options is All, you should include an
Options directive in every Directory cl ause in your global ACF and in
every .htaccess file you write.
Use AllowOverride None whe rever possible
Use this directive to prevent any "untrusted" directories (such as users'
home directories) from overriding your settings (and thus allowing
their friends to execute xterms as nobody with a server-side include or
other such horrors). You also gain a bonus in performance.
Protect your users' home directories
Protect your users' home directories with Directory directives. If
your users all have their home directories in one physical location
(such as /home), then this is easy:
AllowOverride None
Options Indexes
If they are not all in one location such as /home, then you should use
this wildcard pattern to secure them (assuming your UserDir is se t to
public_html):
AllowOverride None
Options Indexes
In addition, if you wish to give your users the ability to create
symbolic links to things only they own, use the Option
SymLinksIfOwnerMatch.
Mosaic User Authentication Tutorial
Introduction
This tutorial surveys the current methods in NCSA Mosaic for X version 2.0
and NCSA httpd for restricting access to documents. The tutorial also walks
through setup and use of these methods.
Mosaic 2.0 and NCSA httpd allow access restriction based on several criteria:
Username/password-level access authorization.
Rejection or acceptance of connections based on Internet address of
client.
A combination of the above two methods.
This tutorial is based heavily on work done by Ari Luotonen at CERN and
Rob McCool at NCSA. In particular, Ari wrote the client-side code currently
in Mosaic 2.0, and Rob wrote NCSA httpd.
Getting Started
Before you can explore access authorization, you need to install NCSA httpd
1.0a5 or later on a Unix machine under your control, or get write access to
one or more directories in a filespace already being served by NCSA httpd.
You also need to be running Mosaic for X version 2.0 or later, or another
browser known to support HTTP/1.0-based authentication.
Prepared Examples
Following are several examples of the range of access authorization
capabilities available through Mosaic and NCSA httpd. The examples are
served from a system at NCSA.
Simple protection by password.
This document is accessible only to user fido with password bones.
Important Note: There is no correspondence between usernames and
passwords on specific Unix systems (e.g. in an /etc/passwd file)
and usernames and passwords in the authentication schemes we're
discussing for use in the Web. As illustrated in the examples,
Web-based authentication uses similar but wholly distinct password
files; a user need never have an actual account on a given Unix system
in order to be validated for access to files being served from that
system and protected with HTTP-based authentication.
Protection by password; multiple users allowed.
This document is accessible to user rover with password bacon and
user jumpy with password kibbles.
Protection by network domain.
This document is only accessible to clients running on machines inside
domain ncsa.uiuc.edu.
Note for non-NCSA readers: The .htaccess file used in this case is
as follows:
AuthUserFile /dev/null
AuthGroupFile /dev/null
AuthName ExampleAllowFromNCSA
AuthType Basic
order deny,allow
deny from all
allow from .ncsa.uiuc.edu
Protection by network domain -- exclusion.
This document is accessible to clients running on machines anywhere
but inside domain ncsa.uiuc.edu.
Note for NCSA readers: The .htaccess file used in this case is as
follows:
AuthUserFile /dev/null
AuthGroupFile /dev/null
AuthName ExampleDenyFromNCSA
AuthType Basic
order allow,deny
allow from all
deny from .ncsa.uiuc.edu
General Information
There are two levels at which authentication can work: per-server and
per-directory. This tutorial primarily covers per-directory authentication.
Per-directory authentication means that users with write access to part of the
filesystem that is being served can control access to their files as they wish.
They need not have root access on the system or write access to the server's
primary config files.
Access control for a given directory is controlled by a file named
.htaccess that resides in that directory. The server reads this file on each
access to a document in that directory (or documents in subdirectories).
By-Password Authentication: Step By Step
So let's suppose you want to restrict files in a directory called turkey to
username pumpkin and password pie. Here's what to do:
Create a file called .htaccess in directory turkey that looks like
this:
AuthUserFile /otherdir/.htpasswd
AuthGroupFile /dev/null
AuthName ByPassword
AuthType Basic
require user pumpkin
Note that the password file will be in another directory (/otherdir).
Also note that in this case there is no group file, so we specify /dev/null
(the standard Unix way to say "this file doesn't exist").
AuthName can be anything you want. AuthType should always currently
be Basic.
Create the password file /otherdir/.htpasswd.
The easiest way to do this is to use the htpasswd program distributed with
NCSA httpd. Do this:
htpasswd -c /otherdir/.htpasswd pumpkin
Type the password -- pie -- twice as instructed.
Check the resulting file to get a warm feeling of self-satisfaction; it should
look like this:
pumpkin:y1ia3tjWkhCK2
That's all. Now try to access a file in directory turkey -- Mosaic should
demand a username and password, and not give you access to the file if you
don't enter pumpkin and pie. If you are using a browser that doesn't handle
authentication, you will not be able to access the document at all.
How Secure Is It?
The password is passed over the network not encrypted but not as plain text
-- it is "uuencoded". Anyone watching packet traffic on the network will not
see the password in the clear, but the password will be easily decoded by
anyone who happens to catch the right network packet.
So basically this method of authentication is roughly as safe as telnet
-style username and password security -- if you trust your machine to be on
the Internet, open to attempts to telnet in by anyone who wants to try, then
you have no reason not to trust this method also.
Multiple Usernames/Passwords
If you want to give access to a directory to more than one username/password
pair, follow the same steps as for a single username/password with the
following additions:
Add additional users to the directory's .htpasswd file.
Use the htpasswd command without the -c flag to additional users; e.g.:
htpasswd /otherdir/.htpasswd peanuts
htpasswd /otherdir/.htpasswd almonds
htpasswd /otherdir/.htpasswd walnuts
Create a group file.
Call it /otherdir/.htgroup and have it look something like this:
my-users: pumpkin peanuts almonds walnuts
... where pumpkin, peanuts, almonds, and walnuts are the
usernames.
Then modify the .htaccess file in the directory to look like this:
AuthUserFile /otherdir/.htpasswd
AuthGroupFile /otherdir/.htgroup
AuthName ByPassword
AuthType Basic
require group my-users
Note that AuthGroupFile now points to your group file and that group
my-users (rather than individual user pumpkin) is now required for
access.
That's it. Now any user in group my-users can use his/her individual
username and password to gain access to directory turkey.
CERN has extensive documents on http-based authentication. The URL is
http://info.cern.ch/hypertext/WWW/AccessAuthorization/Overview.html.
Graphical Information Map Tutorial
Introduction
This document is a step-by-step tutorial for designing and serving graphical
maps of information resources. Through such a map, users can be provided
with a graphical overview of any set of information resources; by clicking on
different parts of the overview image, they can transparently access any of the
information resources (possibly spread out all across the Internet).
First Steps
This tutorial assumes use of NCSA httpd (version 1.0a5 or later). Some other
servers (e.g. Plexus) can also serve image maps, in server-specific ways; see
the specific server's docs for more information.
Make sure you have a working NCSA httpd server installed and running.
Make sure you have write privileges to the server's
conf/imagemap.conf config file.
Also make sure that the imagemap program is compiled and in the server's
htbin directory.
This tutorial also assumes use of NCSA Mosaic for X version 2.0. Other
clients that support inlined GIF images and HTTP/1.0 URL redirection will
also work.
Your First Image Map
In this section we walk through the steps needed to get an initial image map
up and running.
First, create an image.
There are a number of image creation and editing programs that will work
nicely -- the one I use is called xpaint (you can find it on ftp.x.org in
/R5contrib; The important thing is that the image ends up in GIF
format.
A common scheme for an image map is a collection of rectangles and
circles, each containing a short text description of some piece of
information or some information server; interconnections are conveyed
through lines or arcs. Try to keep the individual items in the map spaced
out far enough so a user will clearly know what he or she is clicking on.
Second, create an image map file.
Here is what an image map file looks like:
default /X11/mosaic/public/none.html
rect http://cui_www.unige.ch/w3catalog 15,8 135,39
rect gopher://rs5.loc.gov/11/global 245,86 504,143
rect http://nearnet.gnn.com/GNN-ORA.html 117,122 175,158
The format is fairly straightforward. The first line specifies the default
response (the file to be returned if the region of the image in which the user
clicks doesn't correspond to anything).
Subsequent lines specify rectangles in the image that correspond to arbitrary
URLs -- for the first of these lines, the rectangle specified by 15,8 (x,y of
the upper-left corner, in pixels) and 135,39 (lower-right corner)
corresponds to URL http://cui_www.unige.ch/w3catalog.
So, what you need to do is find the upper-left and lower-right corners of a
rectangle for each information resource in your image map. A good tool to
use for this is xv (also on ftp.x.org in /contrib)-- pop up the Info window and
draw rectangles over the image with the middle mouse button.
It doesn't matter where you put your map file or what you name it. For the
purposes of this example, let's assume it's called /foo/sample.map.
Third, tell your server about your image map file.
You do this by adding a file to the server's conf/imagemap.conf file.
The line looks like this:
sample : /foo/sample.map
... where sample is the symbolic name for your image map and
/foo/sample.map is the actual name of your map file.
Fourth, create an HTML document that contains your map image.
An example follows:
Click on the information resource you wish to see:
Note:
machine is the name of the machine on which your HTTP server
resides.
sample is the symbolic name of your image map (from above).
sample.gif is the name of your image (assuming, of course, that
it's in the same directory on your server as the HTML file).
Fifth, try it out! Load the HTML file, look at the inlined image, click
somewhere, and see what happens.
Subsequent Image Maps
You can serve as many image maps from a single server as you want. Just add
lines to conf/imagemap.conf pointing to each image map file you
create.
Real-World Examples
Following are examples of distributed image maps on servers in the real
world; they may or may not work at any point in time. The URL for them is
provided.
Experimental Internet Resources Metamap
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Demo/metamap.html
University of California Museum of Paleontology
http://ucmp1.berkeley.edu/.
National Institute of Standards and Technology
http://www.nist.gov/welcome.html
server map at NCHPC information server
http://info.lcs.mit.edu/Info/structure.html
WAIS and HTTP Integration
Introduction
This document overviews existing methods for using WAIS as a back-end
search engine for HTTP servers.
Information herein is currently experimental and may or may not work for
you.
WAIS and Plexus
Plexus is a powerful Perl-based HTTP server written and maintained by Tony
Sanders at BSDI. The URL's you might be interested in are:
http://www.bsdi.com/server/doc/plexus.html
http://www.cs.cmu.edu:8001/Web/People/rgs/perl.html
WAIS and GN
GN is a multi-protocol server written and maintained by John Franks at
NWU. It is shipped with support for WAIS as a back-end search engine.
The URL's you might be interested in are:
http://hopf.math.nwu.edu/
http://hopf.math.nwu.edu:70/0h/docs/waisgn.guide
WAIS and NCSA httpd 1.0
Rob McCool has written a CGI script which allows NCSA httpd 1.0 as well
as other CGI compliant servers to access a WAIS database in the same way
that is mentioned in this document. The script is in the CGI archive. It
contains instructions for setting it up under httpd 1.0.
The URL's you might be interested in are:
ftp://ftp.ncsa.uiuc.edu/Web/ncsa_httpd/cgi/wais.tar.Z
freeWAIS 0.202's URL Type
freeWAIS 0.202 is shipped with support for type "URL". Use of this type is a
little tricky.
First, Mosaic 2.0 doesn't know how to deal with this type directly, but
Mosaic 2.1 (when it is released) will.
Second, use of this type apparently implies overloading the "headline" of a
WAIS hit with the URL. This is fine, except then the description that the user
sees of a given document is the URL, and URLs are, as usual, pretty cryptic
things to just throw in front of average users.
But anyway, here's how it works:
waisindex ... -t URL what-to-trim what-to-add ...
So what does that mean?
Well, first, -t URL tells waisindex to use type URL (note use of
lowercase -t in this instance).
Second, what-to-trim and what-to-add are parameters that tell the
indexer how to put together the URL that's returned as the result of a query.
Suppose your documents are normally stored in /X11/mosaic/public.
Suppose also that these documents are normally served via a URL that begins
with http://wintermute.ncsa.uiuc.edu:8080.
This means that a file stored as /X11/mosaic/public/foo.html,
for example, is normally served as
http://wintermute.ncsa.uiuc.edu:8080/foo.html.
The waisindex command you'd use in this case would be something like
the following:
waisindex -d ~/localwais/sources/www -export
-t URL /X11/mosaic/public http://wintermute.ncsa.uiuc.edu:8080
/X11/mosaic/public/*.html
... where ~/localwais/sources/www is the name of the WAIS
index file and /X11/mosaic/public/*.html are the files you are
indexing.
When queries are made on this database, the string
/X11/mosaic/public is removed from the beginning of the filename of
a matching file and the string
http://wintermute.ncsa.uiuc.edu:8080 is put in its place.
As per our previous example: /X11/mosaic/public/foo.html
turns into
http://wintermute.ncsa.uiuc.edu:8080/foo.html as the
result of a WAIS hit.
As you can see, this is perfect -- the WAIS server passes back the exact same
URL that would normally be used to access this file via HTTP. So, everything
from relative hyperlinks to relative inlined image references in the file will
work correctly when the file is retrieved.