Getting started in website development

Simon Waldman


   1: Intro
   2: Steps to being useful ;-)
   3: Information you'll need
      3.1: General SEUL developers' info
      3.2: Specific information for website development
         3.2.1: Layout of the repository
         3.2.2: How the repository relates to the website
         3.2.3: What to check out
         3.2.4: sdoc

Section 1: Intro

This document aims to provide the basic information that people will require to start work on the SEUL website. It is an evolving document - at present it contains what I know I would have found useful. I would welcome ideas of what else should be included.

Section 2: Steps to being useful ;-)

  1. If you haven't already, talk to Arma about what you'd like to help with. Read this page.
  2. To get access to the cvs repository (where the source for the website is stored) email the sysarchs and ask for a logon id on cran (one of the servers). You'll need to include: If you don't see yourself doing a lot of website work, an id on cran may not be necessary - chat to me if you're unsure.
  3. Get yourself a copy of SSH (see the general FAQ), as telnet connections are discouraged for security reasons.
  4. Subscribe to the seul-pub mailing list.
  5. Optionally (but highly recommended), subscribe to the seul-commits list, where all CVS commit log messages are copied. It is a closed list, primarily to ensure the sysarchs know who's mucking with the repo... ;-)
  6. Read the rest of this document!

Section 3: Information you'll need

Section 3.1: General SEUL developers' info

Much of what a website developer needs to know is needed by anybody working on the SEUL project. This is all contained in a separate document, the General SEUL Developer FAQ.

I suggest that you at least skim this document to see if there's anything you're not familiar with.

Section 3.2: Specific information for website development

Section 3.2.1: Layout of the repository

At the time of writing, the CVS repository is laid out as shown in the diagram below. This is not a complete tree, only a rough guide: it contains just the directories that concern us, and does not follow every branch all the way down.

In most cases, web pages to do with a group or topic are contained in a subdirectory named html/ within that group or topic's directory.

Any images used on those web pages should be put in an images/ directory under this. So images for the seul-dev-apps group's page would reside in /dev/apps/html/images/. The exception to this rule is if you anticipate an image being used in more than one place. In this case it should go at the lowest level which will allow all the referring pages to access it without having to go down another branch of the tree. For example, an image that will be used on both the seul-dev-apps and seul-dev-distrib pages would go at /dev/html/images/.
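
The placement rule above amounts to finding the deepest directory shared by all the groups that use the image. Here is a minimal sketch of that reading, in Python rather than the Perl used elsewhere in this document. The function name shared_images_dir is made up for illustration, and it assumes each group's pages live in an html/ subdirectory of the group's directory, as described above.

```python
import posixpath


def shared_images_dir(group_dirs):
    """Return the images/ directory for an image used by the given groups.

    group_dirs holds repository directories such as "/dev/apps".
    The image goes under html/images/ in the deepest directory that
    is an ancestor of (or equal to) every group directory.
    """
    common = posixpath.commonpath(group_dirs)
    return posixpath.join(common, "html", "images") + "/"


# One group: the image stays in that group's own images/ directory.
print(shared_images_dir(["/dev/apps"]))                  # /dev/apps/html/images/
# Two groups under dev/: the image moves up to the shared level.
print(shared_images_dir(["/dev/apps", "/dev/distrib"]))  # /dev/html/images/
```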

In a few cases there are some directories called doc/ instead of html/ which also check out to the website. These directories are primarily (almost exclusively) to be used for individual projects, where there is actual documentation. This ensures that all our project documentation is always available on the web. But remember, doc/ is strictly for project documentation. This means that there is rarely any need for seul-pub people to deal with them.

|--dev/
|     |
|     |---admin/          }
|     |        |---html/  }
|     |---apps/           }
|     |       |----html/  }
|     |---distrib/        }
|     |          |-html/  }  See Note 1
|     |---help/           }
|     |       |----html/  }
|     |---install/        }
|     |          |-html/  }
|     |---ui/             }
|     |     |------html/  }
|     |
|     |---sysarch/        }  See Note 2
|     |          |-html/  }
|     |
|     |---html/           }  See Note 3
|--pub/                   }  See Note 4
|     |---html/ 
|     |---website/        }  See Note 5
|     |          |-html/  }
|     |
|     |---public_html     }  See Note 6
|     |---html/
|--sys/                   }  See Note 7


  1. The dev-* groups' areas - Each group has (or will have) its own 'mini-website' within these areas. I hope eventually to find somebody to take responsibility for each of these, either a member of seul-pub who watches the group or the other way round, in order to keep them up to date.
  2. The sysarchs' area - For stuff the sysarchs want to say, and generally maintained by them as well.
  3. dev/html/ - This is the root of the development website - i.e. that which is intended for members of the SEUL project rather than possible users.
  4. The pub/ area - Apart from the website itself, there are two types of material that seul-pub will produce:
    Documents for public consumption (e.g. whylinux) are worked on in subdirectories under pub/ (not shown on the diagram) while they are in progress. When they are ready, they will be checked out as similarly named subdirectories in the website. Pages on the development website, by contrast, can be edited in situ, as it doesn't matter quite as much if a page looks weird for a few minutes while it's being worked on.
    Documents for use by the seul-pub group are stored in appropriate directories under pub/, which check out directly to the website.
  5. The pub/website/ directory - This is a kind of 'meta-website' - it contains information related to the running of the website, such as this document.
  6. pub/public_html/ - This is the root of the public website. Unusually, everything under this directory maps directly to the website, so pub/public_html/images/ becomes the images/ directory at the top level of the public site.
  7. sys/ - This directory contains all the infrastructure systems for the project, such as the sdoc source. Throughout it are scattered directories that check out to the website, containing documentation on the stuff in there (e.g. details of the sdoc doctypes). Basically, it's a mess. But omega has promised to tidy it up, so there is some hope. ;-)

Section 3.2.2: How the repository relates to the website

Whenever a change is checked in (or committed - two phrases, same meaning) to the repository in one of the directories that contains website stuff, that document is parsed by sdoc and copied across to the website. It should become available from the website within a minute. Obviously this time depends on the load on the servers at the time, but if nothing has happened in ten minutes then one of two things has happened:

Normally a directory will appear on the website in the same location as it was in the repository, except with the /html/ component removed. For example, /dev/apps/html/images/ would appear under /dev/apps/images/ on the website. There are, however, plenty of exceptions.
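
As a rough illustration of the default rule only, the mapping can be sketched as below. This is in Python rather than Perl, and the function name default_site_path is made up. It assumes that only the first html component is dropped, and it knows nothing about the exceptions.

```python
def default_site_path(repo_dir):
    """Apply the default rule: drop the html/ component from the path.

    Assumption: only the first "html" component is removed. Exceptions
    to the default rule are not handled here.
    """
    parts = [p for p in repo_dir.split("/") if p]
    if "html" in parts:
        parts.remove("html")  # list.remove drops the first match only
    return "/" + "".join(p + "/" for p in parts)


print(default_site_path("/dev/apps/html/images/"))  # /dev/apps/images/
```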

This can all get very confusing, but it is not so bad once you know that there is one file which governs where every directory maps to. This file is sys/web/etc/webcolist, and lines from it look like this:


If a directory is in this file, it will appear in the website at the location specified. This is recursive, so any directory underneath dev/apps/html/ will be checked out too, under /dev/apps/. It is worth having a look through this file, as it may make what you've read so far a lot clearer.
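
The recursive behaviour can be pictured as a prefix lookup. The sketch below is Python, not part of the SEUL tools, and it takes the mapping as an already-parsed dictionary so that it assumes nothing about the file's syntax. The name site_location is made up, and choosing the longest matching entry when several apply is my assumption about how overlaps would be resolved.

```python
def site_location(repo_dir, colist):
    """Return where repo_dir appears on the website, or None.

    colist maps repository directories to website locations, for
    example {"dev/apps/html": "/dev/apps/"}. A directory below a
    listed one inherits its location (the recursive rule). If several
    entries match, the longest one wins (an assumption).
    """
    repo_dir = repo_dir.strip("/")
    best = None
    for source, target in colist.items():
        source = source.strip("/")
        if repo_dir == source or repo_dir.startswith(source + "/"):
            if best is None or len(source) > len(best[0]):
                best = (source, target)
    if best is None:
        return None  # not listed, so not checked out to the website
    source, target = best
    rest = repo_dir[len(source):]  # "" or "/sub/dir"
    return target.rstrip("/") + rest + "/"


colist = {"dev/apps/html": "/dev/apps/"}
print(site_location("dev/apps/html/images", colist))  # /dev/apps/images/
```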

Section 3.2.3: What to check out

This is very much up to you. You will need to check out any section you are working on, but you may wish to get more - I have a copy of all the website directories on my hard disk.

There is a problem here: if you want to check out any more than one or two directories you have a choice between getting lots of directories individually, which is a bit laborious, or checking out a higher level directory and getting loads of stuff that is not to do with the website. The solution lies with the webcolist file. Once you have checked this one file out, it is fairly easy to write a perl script to get everything else that you want. For instance, if you wanted to get the whole website:

  1. Change to a directory where you will do your work, such as ~/work/seul/.
  2. Check out the webcolist by the following command:
    cvs checkout sys/web/etc
    This will create the sys/web/etc/ directory below the current directory and put in it webcolist and any other files that are in that directory in the repository.
  3. Write a perl script to check out everything you want. The example here will get every directory in webcolist. This can be quite a lot. If you know regular expressions (the messy bit after the if on line 6) it will not be hard to change this to check out only what you want.
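    #!/usr/bin/perl

    open(COLIST,"sys/web/etc/webcolist") || die "Couldn't open webcolist";
    $command = "cvs checkout ";
    while(<COLIST>) {
      if (/(.+)=/) {    # capture the repository path before the "="
         print "\nAdding $1 to the checkout line: ";
         $command .= "$1 ";
      }
    }
    close(COLIST);
    print "\n$command\n";
    system($command);   # run the assembled cvs checkout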
    You may have to alter the pathname to the perl interpreter on the first line to suit your computer.
  4. Save this, and make it executable by chmod +x filename.
  5. Run it. Be aware that it will create the directory structure from the current directory. This should be done in the aforementioned ~/work/seul/, or wherever you decide to put it.

If anyone wants to improve on this script send me a better one and I'll publish it here. Thanks go to arma for this one.
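
For anyone who wants only part of the site, here is one possible variation on the same idea. It is a sketch in Python rather than Perl, and not a tested replacement for arma's script. It assumes only what the script's regular expression already implies, namely that each useful line of webcolist has the repository directory before an "=". The function names and the prefix parameter are made up.

```python
import re
import subprocess


def checkout_command(colist_lines, prefix=""):
    """Build a cvs checkout command for webcolist entries under prefix.

    Uses the same pattern as the Perl script: whatever comes before
    the "=" on a line is taken to be a repository directory.
    """
    dirs = []
    for line in colist_lines:
        match = re.match(r"(.+)=", line)
        if match and match.group(1).startswith(prefix):
            dirs.append(match.group(1))
    return ["cvs", "checkout"] + dirs


def run_checkout(colist_path="sys/web/etc/webcolist", prefix=""):
    """Read webcolist and check out every matching directory."""
    with open(colist_path) as colist:
        command = checkout_command(colist, prefix)
    if len(command) == 2:
        print("Nothing in webcolist matches", repr(prefix))
        return
    print(" ".join(command))
    subprocess.run(command, check=True)  # creates dirs under the cwd
```

Calling run_checkout(prefix="dev/") from your working directory (for example ~/work/seul/) would then fetch only the entries that start with dev/, assuming the entries in webcolist are written without a leading slash.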

Section 3.2.4: sdoc

sdoc is the system used for SEUL documentation. It is a parser that passes ordinary HTML straight through, but also has a few tags of its own from which it produces HTML formatting. This allows us to separate content from formatting, and to easily achieve a uniform look across the website.

sdoc is currently under fairly major alteration. More information will be posted here when it is finalised. For the moment, look at the skeleton template for an example; most of it is fairly self-explanatory.

$Id: gettingstarted.html,v 2001/02/08 20:09:05 arma Exp $