
9.10 A more complex example - a simple WWW server

As most of you know, the WWW (World Wide Web) works by using a client program which fetches files from remote servers when asked, usually when the user clicks a picture or a piece of text. This example is a program for the server, which will send files to any computer that requests them. The protocol used to send the files is called HTTP (Hyper-Text Transfer Protocol).

Usually WWW involves HTML. HTML (Hyper-Text Markup Language) is a way to write documents with embedded pictures and links to other pages. These links are normally displayed underlined, and if you click one your WWW browser will load whatever document that link leads to.

#!/usr/local/bin/pike
                
/* A very small httpd capable of fetching files only.
 * Written by Fredrik Hübinette as a demonstration of Pike.
 */
                
inherit Stdio.Port;
We inherit Stdio.Port into this program so we can bind a TCP socket to accept incoming connections. A socket is an endpoint for network communication; it lets the operating system keep communications to and from different programs on the same computer apart.

Next are some constants that affect how uHTTPD operates. They are set with the preprocessor directive #define. The preprocessor is the first stage of compilation and performs textual processing on the code before it is compiled. For example, after the first define below, every occurrence of 'BLOCK' is replaced with 16060.

/* Amount of data moved in one operation */
#define BLOCK 16060
                
/* Where do we have the html files ? */
#define BASE "/usr/local/html/"
                
/* File to return when we can't find the file requested */
#define NOFILE "/usr/local/html/nofile.html"
                
/* Port to open */
#define PORT 1905
A port is a destination for a TCP connection. It is simply a number on the local computer. 1905 is not the standard port for HTTP connections (that is port 80), which means that if you want to access this WWW server from a browser you need to specify the port like this: http://my.host.my.domain:1905/
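To see the textual substitution done by #define in isolation, here is a minimal sketch. The function name two_blocks is made up for this illustration and is not part of the server.

```pike
#define BLOCK 16060

// Before compilation the preprocessor rewrites the body below
// to "return 16060*2;". two_blocks is a hypothetical helper.
int two_blocks() { return BLOCK*2; }

int main()
{
    write(sprintf("%d\n", two_blocks()));   // prints 32120
    return 0;
}
```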

Next we declare a class called output_class. Later we will clone one instance of this class for each incoming HTTP connection.

class output_class
{
    inherit Stdio.File : socket;
    inherit Stdio.File : file;
Our new class inherits Stdio.File twice. To be able to separate them they are then named 'socket' and 'file'.

    int offset=0;
Then there is a global variable called offset which is initialized to zero. (Each instance of this class will have its own copy of this variable, so it is not truly global, only global within the instance.) Note that the initialization is done when the class is cloned (or instantiated if you prefer C++ terminology).

Next we define the function write_callback(). Later the program will go into a 'waiting' state until something is received to process, or until there is buffer space available to write output to. When either happens, a callback is called to handle it. write_callback() is the one called whenever there is room in the socket output buffer. In the following lines 'void' means that the function does not return a value.

    void write_callback()
    {
        int written;
        string data;
The following line means: call seek in the inherited program 'file'.
        file::seek(offset);
Move the file pointer to the position we want to read from. The file pointer is simply a location in the file; usually it is where the last read() ended and the next will begin, but seek() can move it wherever we want.

        data=file::read(BLOCK);
Read BLOCK (16060) bytes from the file. If there are fewer than that left to read, only the remaining bytes will be returned.

        if(strlen(data))
        {
If we managed to read something...

            written=socket::write(data);
... we try to write it to the socket.

            if(written >= 0)
            {
                offset+=written;
                return;
            }
Update offset if we managed to write to the socket without errors.

            werror("Error: "+socket::errno()+".\n");
        }
If something went wrong during writing, or there was nothing left to read we destruct this instance of this class.

        destruct(this_object());
    }
That was the end of write_callback()

Next we need a variable to buffer the input received in. We initialize it to an empty string.

    string input="";
And then we define the function that will be called when there is something in the socket input buffer. The first argument 'id' is declared as mixed, which means that it can contain any type of value. The second argument is the contents of the input buffer.

    void read_callback(mixed id,string data)
    {
        string cmd;
                
        input+=data;
Append data to the string input. Then we check whether we have received a complete line yet. If so, we parse it and start outputting the file.

        if(sscanf(input,"%s %s%*[\012\015 \t]",cmd,input)>2)
        {
This sscanf is pretty complicated, but in essence it means: put the first word in 'input' in 'cmd' and the second in 'input' and return 2 if successful, 0 otherwise.

            if(cmd!="GET")
            {
                werror("Only method GET is supported.\n");
                destruct(this_object());
                return;
            }
If the first word isn't GET, print an error message and terminate this instance of the program (and thus the connection).
            sscanf(input,"%*[/]%s",input);
Remove the leading slash.

            input=BASE+combine_path("/",input);
Combine the requested file with the base of the HTML tree; this gives us a full filename beginning with a slash. The HTML tree is the directory on the server in which the HTML files are located. Normally all files in this directory can be accessed by anybody using a WWW browser. So if a user requests 'index.html', that file name is first added to BASE (/usr/local/html/ in this case), and if the resulting file exists it will be returned to the browser.
            if(!file::open(input,"r"))
            {
Try opening the file in read-only mode. If this fails, try opening NOFILE instead. Opening the file will enable us to read it later.

                if(!file::open(NOFILE,"r"))
                {
If this fails too, write an error message and destruct this object.

                    werror("Couldn't find default file.\n");
                    destruct(this_object());
                    return;
                }
            }
Ok, now we set up the socket so we can write the data back.
            socket::set_buffer(65536,"w");
Set the buffer size to 64 kilobytes.

            socket::set_nonblocking(0,write_callback,0);
Make it so that write_callback is called when it is time to write more data to the socket.

            write_callback();
Jump-start the writing.
        }
    }
That was the end of read_callback().
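The path handling in read_callback() can be tried on its own. The sketch below repeats the same two steps (strip leading slashes, then combine with "/" and prepend BASE) in a helper called resolve, a name invented for this illustration. One effect of going through combine_path("/", ...) is that ".." components are resolved against "/" before BASE is prepended.

```pike
#define BASE "/usr/local/html/"

// Hypothetical helper mirroring the two path steps in read_callback().
string resolve(string requested)
{
    sscanf(requested,"%*[/]%s",requested);    // remove leading slashes
    return BASE+combine_path("/",requested);  // normalize, then add BASE
}

int main()
{
    write(resolve("/index.html")+"\n");  // prints /usr/local/html//index.html
    write(resolve("/../../etc/passwd")+"\n");
    return 0;
}
```

The doubled slash in the first result comes from BASE ending in a slash and combine_path() returning a path that begins with one; file systems normally treat it as a single slash.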

This function is called if the connection is closed while we are reading from the socket.

    void selfdestruct() { destruct(this_object()); }

This function is called when the program is instantiated. It is used to set up data the way we want it. Extra arguments to clone() will be sent to this function. In this case it is the object representing the new connection.

    void create(object f)
    {
        socket::assign(f);
We insert the data from the file f into 'socket'.

        socket::set_nonblocking(read_callback,0,selfdestruct);
Then we set up the callback functions and set the file to nonblocking mode. Nonblocking mode means that read() and write() will return immediately rather than wait for I/O to finish. Then we sit back and wait for read_callback to be called.

    }
End of create()

};
End of the new class.

Next we define the function called when someone connects.

void accept_callback()
{
    object tmp_output;
This creates a local variable of type 'object'. An object variable can contain a clone of any program. Pike does not consider clones of different programs different types. This also means that function calls to objects have to be resolved at run time.

    tmp_output=accept();
The function accept clones a Stdio.File and makes this equal to the newly connected socket.

    if(!tmp_output) return;
If it failed we just return.

    output_class(tmp_output);
Otherwise we clone an instance of 'output_class' and let it take care of the connection. Each clone of output_class will have its own set of global variables, which will enable many connections to be active at the same time without data being mixed up. Note that the programs will not actually run simultaneously though.

    destruct(tmp_output);
Destruct the object returned by accept(); output_class has already copied the contents of this object.

}
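That each clone has its own set of variables can be seen in a much smaller program. The class counter and its method advance are made up for this sketch; only the cloning behaviour matters.

```pike
// Hypothetical class with one per-instance variable, like 'offset' above.
class counter
{
    int offset=0;
    void advance(int n) { offset+=n; }
}

int main()
{
    object a=counter();   // two independent clones
    object b=counter();
    a->advance(100);      // changes only a's copy of offset
    write(sprintf("a: %d, b: %d\n", a->offset, b->offset));  // prints a: 100, b: 0
    return 0;
}
```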

Then there is main, the function that gets it all started.
int main(int argc, array(string) argv)
{
    werror("Starting minimal httpd\n");
Write an encouraging message to stderr.
    if(!bind(PORT, accept_callback))
    {
        werror("Failed to open socket (already bound?)\n");
        return 17;
    }

Bind PORT and set it up to call accept_callback as soon as someone connects to it. If the bind() fails we write an error message and return 17 to indicate failure.
    return -17; /* Keep going */
If everything went ok, we return -17. Any negative value returned by main() means that the program WON'T exit; it will hang around waiting for events (like someone connecting) instead.
}
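The effect of a negative return value from main() can be observed without any networking. In this sketch, which is not part of the server, call_out() schedules the made-up function tick to run once a second; because main() returns -1 the program stays alive and keeps calling it until interrupted.

```pike
// Hypothetical periodic function; reschedules itself every second.
void tick()
{
    write("Still running.\n");
    call_out(tick, 1);
}

int main()
{
    call_out(tick, 1);
    return -1;   /* Negative: do not exit, wait for events */
}
```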
That's it, this simple program can be used as the basis for a simple WWW-server. Note that today most WWW servers are very complicated programs, and the above program can never replace a modern WWW server. However, it is very fast if you only want a couple of web pages and have a slow machine available for the server.
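If you have no browser at hand, a small client can be used to try the server. The following is only a sketch under several assumptions: the server is already running on the same machine (127.0.0.1) on port 1905, and a file called index.html exists under BASE. Since the server sends the file contents without any HTTP headers, the client simply prints whatever arrives until the connection is closed.

```pike
int main()
{
    object con=Stdio.File();

    // Address, port and file name are assumptions for this example.
    if(!con->connect("127.0.0.1", 1905))
    {
        werror("Could not connect.\n");
        return 1;
    }

    con->write("GET /index.html\r\n");

    // read() without arguments reads until the server closes the connection.
    string reply=con->read();
    if(reply) write(reply);

    con->close();
    return 0;
}
```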

