Lab #4: Webapp | CSE-2041A: Net-Centric Computing | 2012 fall

CSE-2041A
Net-Centric Computing
York University
Fall 2012

Lab #4: Webapp
Contact Phonebook

posted: 9 October 2012
revised: 16 October 2012

Introduction

Welcome to Lab #4. Let's build webapps! We shall use the old CGI style of doing things. Why? Well, it is simple; and, you see exactly what is going on, with nothing hidden.

Lab #4 has two parts, and spans two lab periods. It has two deliverables (one for each week), and is worth two lab points (one for each part / deliverable).

My CGI
- Getting the basics working.
Contact Phonebook (PB)
- A miniature contact-database “phonebook” app.

The second — the PB project — is a bit more involved. It is a full-fledged, albeit still somewhat simplistic, webapp.

A. My CGI

Let's set up a CGI, a server-side application program.

Since you people know Java, you will write the app program in Java. A “CGI” in Java is not supported! No problem, though. We will just have a miniature CGI “wrapper” program; all it will do is call your Java program.

CGI programs are called and executed by the web server — say, Apache — directly. By convention, a CGI program may be written in any interpreted / scripting programming language; e.g., Bourne Shell (sh), Perl, Python, PHP, and Ruby. There are other standard ways to set up web server-side apps written in Java, such as with Java Servlets. However, this hides some of what is actually going on. That would be a convenient abstraction, providing for more robust, faster app development. But it would be bad for us learning what is happening behind the scenes.

example: ParkeCalc

In class, we have been working through the example of a calculator as a server-side app (ParkeCalc is what I called the one I sketched up). The versions are written in Perl, but illustrate much of what we are doing here.

You can poke at these at the following places:

You can look at the code in /~godfrey/course/2041/see/.

(How is this showing you the source files?! A CGI is usually executed on demand by the server, not delivered as a static document. And one usually would not want the source code visible to the outside world, for security. Check out the source of the .htaccess at /~godfrey/course/2041/see/htaccess. Caution! Do not use this .htaccess for your PB project. Get the one from the PB template sources for that.)

example: HtmlTemplate

Look at the code of HtmlTemplate written in Java:

This is an example of a clean style for writing into output HTML. It also picks up the remote user, if the user has authenticated and that information is available.

You can invoke it in two ways:

What is the difference between the two? (See the .htaccess in the PB template sources.)

MyCalc.cgi

Study how each of the three versions of ParkeCalc works. You are to reverse engineer this to build your own version, MyCalc, in Java. Your MyCalc should operate like ParkeCalcC. Maybe make it slightly fancier in its functionality. Be creative.

How are the parameters in the ParkeCalc app delivered from the client to the server?
What is the URL that the form calls in the generated HTML from ParkeCalcB and ParkeCalcC?
What does your program have to return? Think this through carefully.

Since the server will not call your Java directly as a “CGI”, you need a wrapper CGI. Grabbing a copy of myCalc.cgi from /~godfrey/course/2041/see/ should suffice.

This assumes your Java source is in MyCalc.java in the same directory, and that you have compiled it, so there is a MyCalc.class. Permissions on myCalc.cgi should be at least “-rwx------”.

A useful program as a debugging tool is at

echo.cgi

This echoes the client's request back to the client, showing the headers, query string (GET), payload (POST), and the form parameters parsed out into key-value pairs. Point your HTML form at this for debugging purposes to see exactly what is being sent.

You can look at the source code of echo.cgi in /~godfrey/course/2041/see/.

Debugging CGI

Internal Server Error!

Often while building a CGI program, when you test it by calling it from a client browser (e.g., Firefox), you get the dreaded response of Internal Server Error. This is a status 500 response.

Something broke: with permissions, on invocation of the CGI program, or with the program's response. But what? It is maddening; the message does not tell you much. (Okay, more can be learned by looking in the server's logs. Of course, for that, you need access to the server's logs. In this case, you do not have access. You often do not.)

Debugging webapps — and CGI programs — is notoriously hard for such reasons. In one way, this is only to be expected. Our “application” has parts on at least two machines: the client and the server. These parts have to be coordinated. So, we need to develop a very good understanding of CGI, debugging acumen and good practices, and a good set of tools to cope.

What could result in an Internal Server Error?

permissions: The web server will not honour running your CGI program because permissions are not set up correctly.
ill-formed response: The output of your CGI program is not a properly formatted response.
runtime errors: Your CGI program had a runtime error when invoked.
other gotchas!

permissions

We are working with an Apache server running on Linux. Let us discuss standard permission rules Apache applies to determine whether it will honour invoking a CGI program.

Each directory on the path of directories from your root www/ directory (in your home directory ~) down to the directory that the CGI program is in needs to be accessible to other. That is, each directory must have permissions of at least “d--------x”. For directories, the “x” means that files and sub-directories in the directory can be reached, or accessed. (For directories, an “r” means a directory listing can be done.)

Of course, you likely want to be able to access (“x”), list (“r”), and write (“w”), that is, add and delete files and sub-directories, in your own (user) directory! So, “drwx-----x” would be more sensible. And it rarely makes sense to have permissions for other but not for group; so, “drwx--x--x” is better. Now, if you do not mind people who have access to the file system doing a directory listing of this directory, then “drwxr-xr-x”.

If you allow write for other (e.g., “drwx--x-wx”) on the directory that the CGI program is in, Apache will not invoke the CGI! This is because of security risks. Someone else could write a CGI file into your directory, and then execute it by calling it from a browser. And it would execute under your account!

Remember, each directory down the path has to be accessible (“x”) to other. For example, if your CGI is at the URL path /~you/2041/projects/cgi/mine.cgi — where you is your account name — then each of
- ~you/www/ (the www/ is implicit on the URL path)
- ~you/www/2041/
- ~you/www/2041/projects/
- ~you/www/2041/projects/cgi/
must have permissions, say, “drwxr-xr-x” (or, at least, “d--------x”).

You can add the permission access for other to the directory you are presently in with

% chmod o+x .

(Do a man on chmod to learn more on the ins and outs of chmod.)

The CGI program file needs to be executable by user; so at least “---x------”. (For files, “x” means executable; that is, can be run.) A sensible setting would be “-rwx------”. If you do not mind other people with access to the file system reading your CGI file, then “-rwxr--r--”. (This does not mean that people “visiting” the website by browser can read the file! Apache will not deliver the CGI program file.) And for others on the file system to run your CGI program from the command line: “-rwxr-xr-x”.

If other has write permission (“w”) on the CGI program file, Apache will not invoke it! Again, this is because of security risks. Someone could change the contents of your CGI program and then execute it by calling it from a browser. And the new code would execute under your account!

In your case, your CGI file — e.g., myCalc.cgi — is a very simple Bourne Shell (“sh”) script which, in turn, simply calls java (the java virtual machine, the JVM) to run your Java program. Your Java program needs to have been compiled beforehand then for the CGI to work; e.g.,

% javac MyCalc.java

Do this from the command line. This creates, say, MyCalc.class, the “executable”. (Well, this is the bytecode that java then interprets.) So the script myCalc.cgi is calling java on the class file MyCalc.class. Do not forget to recompile when you change your Java code!

Clearly then, the permissions on, say, your MyCalc.java are immaterial for the CGI program to work “through” the Web. The MyCalc.java file does not even have to be there! That is because myCalc.cgi is calling MyCalc.class (via having java run it).

What about permissions on MyCalc.class? It needs to be readable by owner (you), so that java, as called by myCalc.cgi, has permissions to read it. (Recall Apache executes myCalc.cgi as owner, which then has java run MyCalc.class as owner.)

ill-formed response

What must be the output of your CGI program when invoked? Three things, in order:

- proper HTTP headers, followed by
- an empty line, followed by
- the content / payload (say, a proper HTML document).

The one header field you really need to provide is the Content-type. For example:

Content-type: text/html

<!DOCTYPE html>
<html>
<head>
    <title>My CGI's output payload!</title>
</head>
<body>
    <h1>Hello, world!</h1>
</body>
</html>

If your program's output is not like this, you will get the dreaded 500 status invoking the CGI “through” the Web from a client!

You can also use “Content-Type: text/plain” instead, especially while debugging.
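
A minimal Java sketch of a well-formed response follows. The class name HelloCgi is hypothetical (it is not one of the course files), and the heading is assumed to be plain text with no HTML special characters. The response is built as one string (header, blank line, payload) and printed to standard output, which is what Apache reads.

```java
// HelloCgi is a hypothetical name used only for this sketch.
class HelloCgi {

    // Builds a complete CGI response: header line, blank line, HTML payload.
    static String buildResponse(String heading) {
        StringBuilder out = new StringBuilder();
        out.append("Content-type: text/html\n"); // the one header we must supply
        out.append("\n");                        // the blank line ends the headers
        out.append("<!DOCTYPE html>\n");
        out.append("<html>\n");
        out.append("<head>\n");
        out.append("    <title>My CGI's output payload!</title>\n");
        out.append("</head>\n");
        out.append("<body>\n");
        out.append("    <h1>").append(heading).append("</h1>\n");
        out.append("</body>\n");
        out.append("</html>\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // print, not println: the response already ends with a newline.
        System.out.print(buildResponse("Hello, world!"));
    }
}
```

Running this from the command line should show the header first, then one empty line, then the HTML. If the empty line is missing, the server cannot tell where the headers end.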

Does Apache add any additional headers to the HTTP response? How would you be able to tell?

runtime errors

If your CGI program breaks with a runtime error on invocation, you will see the Internal Server Error. Why? Because the output is not in the right format: there are no headers followed by the payload.

In the case of using Java as the programming language for our “CGI”, a technique often used while debugging is to wrap almost all of main in a try block. Then have a catch that catches anything, and prints the header and then the exception messages as the payload. So the program never really fails outright. That way, you can see at the browser what happened.

Of course, also run your program from the command line to debug it. And remember to run the .cgi from the command line too to see that all is working.
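
The try / catch technique described above might look like the following sketch. The names SafeCgi, respond, and buildPage are hypothetical; buildPage stands in for your real application logic. One design choice is assumed here: the app builds its whole response as a string before anything is printed, so a failure halfway through cannot leave half a set of headers on the output.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.function.Supplier;

// SafeCgi is a hypothetical name; the real work would live in your own class.
class SafeCgi {

    // Runs the app. If anything at all is thrown, returns a plain-text
    // response carrying the stack trace instead of letting the program die.
    static String respond(Supplier<String> app) {
        try {
            return app.get();
        } catch (Throwable t) {
            StringWriter trace = new StringWriter();
            t.printStackTrace(new PrintWriter(trace, true));
            return "Content-type: text/plain\n\n" + trace;
        }
    }

    // Stand-in for the real application logic.
    static String buildPage() {
        return "Content-type: text/html\n\n"
             + "<!DOCTYPE html>\n"
             + "<html><head><title>OK</title></head>"
             + "<body><h1>It works</h1></body></html>\n";
    }

    public static void main(String[] args) {
        System.out.print(respond(SafeCgi::buildPage));
    }
}
```

With this in place, an exception shows up in the browser as readable text rather than as a status 500. Take the stack trace out again before you let others use the app.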

other gotchas

There are numerous other things that can go wrong too, but that are generally quite rare for us.

One that is not: MS Windows and Unix / Linux handle end-of-line differently in text files. End-of-line is actually indicated within a text file as an invisible character (or two). For Unix-like systems, that end-of-line character is LF (line feed), which is ASCII 10, and indicated in strings in many languages like Java and C++ as “\n”. For MS Windows systems, end-of-line is indicated by a sequence of two characters: CR LF. CR is carriage return, ASCII 13, and indicated in strings in many languages as “\r”. Maddeningly, one cannot easily tell which style is being used in a file just looking at it. It looks the same in most contexts either way.

Okay, why do we care? Unfortunately, Unix / Linux cares for shell scripts. And our .cgi wrapper — e.g., myCalc.cgi — is a Bourne Shell script. If you cut-and-pasted, say, myCalc.cgi into, say, NotePad on a Windows system and saved it, it will have the CR-LF end-of-lines. And that will break when Apache — running on Linux — tries to execute it!

Sigh. A very sneaky, invisible error. On Unix / Linux, you can convert any text file from CR-LF end-of-lines (Windows) to LF end-of-lines (Unix) by the command dos2unix. E.g.,

% dos2unix myCalc.cgi

If you are getting a 500 status, do this to your .cgi to eliminate this possible problem.

B. Contact Phonebook

requirements document

The user starts by visiting the URL


		https://www.cse.yorku.ca/~cseXXXXX/pb/pb.cgi

where cseXXXXX is your Red login. At that point, the browser prompts for username and password. Any valid Red user should be able to log in. After a successful authentication, the user is presented with a search page having the following components.

A message that says “Welcome X”, where X is the user's username. If the user has visited the webapp before, then the message should say, “Welcome back X. Your last visit was xxx.” The xxx should be the time of the person's last visit.
A label, a textbox, and a button entitled “Find”. The user can enter a person's last name — or a part thereof — and click “Find” in order to initiate a database search for the telephone number of that person. The search should be case insensitive.

When the “Find” button is clicked, the entry is looked up in the database. If not found, the search page is re-served with an error message at the top that says “No match!”, and with the textbox pre-filled with whatever the user had entered.

If, however, the search was successful, the user is presented with a detail page which is made up of four label-textbox pairs:

Last Name
First Name
Telephone
Comments

The text-boxes are to be pre-populated with values from the corresponding columns of the matching row in the database. (If several matches were found, then only the first one — in last name order — should be shown.)

Note that the text-boxes must be in read-only mode, and that the 4th one must actually be a text area, in order to accommodate large comments. This page must also include a “Back” button to enable the user to return to the search page with its text-box pre-filled.

If it is you who has come to the page, another button should be made available, named “New Contact”, below the Find search bar. (Do not provide this — or make the functionality accessible — to anyone else.) If you click it, it gives you the very same layout as a search result, showing you fields for Last Name, First Name, Telephone, and Comments. However, this time, there is nothing filled in the fields, and they are open for you to enter values. At the bottom should be a “Back” button and a “Save” button. “Save” should add that as an entry into your contact database.

design sketch

Your webapp consists of five files.

pb.cgi
- A shell script that launches the Java app.
PBapp.java
- An app whose main method contains new PB();.
PB.java
- The PhoneBook Webapp.
PBdao.java
- A class that implements the database connectivity.
PBrow.java
- A simple data structure to represent one row in the Contact table.

Start by developing PBrow. It should have five private attributes — the four mentioned in the requirements plus the ID — a getter and setter for each, and a default constructor that initializes the ID to -1, and the remaining attributes to empty strings.
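
As a rough sketch of the shape described above (not the template's actual code), PBrow could look like this. The field names are assumptions; only the five attributes, the getters and setters, and the defaults come from the description.

```java
// A sketch of PBrow. The field names are assumptions made for illustration.
class PBrow {
    private int id;
    private String lastName;
    private String firstName;
    private String telephone;
    private String comments;

    // Default constructor: ID of -1, all other attributes empty strings.
    public PBrow() {
        this.id = -1;
        this.lastName = "";
        this.firstName = "";
        this.telephone = "";
        this.comments = "";
    }

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getTelephone() { return telephone; }
    public void setTelephone(String telephone) { this.telephone = telephone; }

    public String getComments() { return comments; }
    public void setComments(String comments) { this.comments = comments; }
}
```

An ID of -1 is a convenient marker for a row that has not come from (or gone into) the database yet, since the auto-generated IDs are not negative by default.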

Next, develop PBdao whose attributes must include a Statement object. Its constructor must connect to your database, and initialize the statement attribute. It must also have the method


		public PBrow find(String entry)

This takes the entry made by the user in the search page, looks for it in the database — as described in the requirements — and returns the matching row. (The return should be null, if no match is found.)
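
One way find might be put together is sketched below. Several things here are assumptions: the class is called PBdaoSketch and returns a pared-down Row (a stand-in for PBrow) so that the sketch compiles on its own; the constructor is handed an already-open Connection, so the sketch does not have to guess your JDBC URL, user name, or password; and the column names are the ones used in the insert example for the Contact table.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

// PBdaoSketch and Row are hypothetical stand-ins for PBdao and PBrow.
class PBdaoSketch {

    // Pared-down stand-in for PBrow, so the sketch compiles on its own.
    static class Row {
        int id = -1;
        String lastName = "", firstName = "", telephone = "", comments = "";
    }

    private final Statement statement;

    // Assumption: the caller passes in an open connection.
    PBdaoSketch(Connection connection) throws SQLException {
        this.statement = connection.createStatement();
    }

    // Builds the query. Upper-casing both sides makes the match case
    // insensitive; doubling single quotes keeps the entry inside the literal.
    static String buildFindSql(String entry) {
        String safe = entry.trim().toUpperCase(Locale.ROOT).replace("'", "''");
        return "select ID, LastName, FirstName, Phone1, Comments from Contact"
             + " where upper(LastName) like '%" + safe + "%'"
             + " order by LastName";
    }

    // Returns the first match in last-name order, or null if there is none.
    Row find(String entry) throws SQLException {
        try (ResultSet rs = statement.executeQuery(buildFindSql(entry))) {
            if (!rs.next()) {
                return null;
            }
            Row row = new Row();
            row.id = rs.getInt(1);
            row.lastName = rs.getString(2);
            row.firstName = rs.getString(3);
            row.telephone = rs.getString(4);
            row.comments = rs.getString(5); // may be null if the column is null
            return row;
        }
    }
}
```

Note that % and _ typed by the user still act as wildcards inside the like pattern. A PreparedStatement with a ? placeholder would avoid the quoting step altogether; the sketch sticks with Statement because the design above calls for one.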

The PB class acts as a controller that receives the incoming request, extracts the needed HTTP headers and parameters, and sequences the flow. It also includes methods for serving up the two pages.

implementation

Template implementations of some of the needed classes are available at


		http://www.cse.yorku.ca/course/2041/lab4/

Put the htaccess file in place as .htaccess once you want to force authentication (and using HTTPS). Its permissions should be “-rw-r--r--”.

Grab a copy of pb.cgi from /~godfrey/course/2041/see/. Note that pb.cgi is assuming that your PBapp is in a directory pb within this one (that the CGI file is in).

Start your implementation using these files, rather than starting from scratch.

It is recommended that you use Eclipse as an IDE. To use Eclipse, click “Start, Programs, Java.” Once in Eclipse, create a new “Java Project”. Your home directory is mapped as drive Z, so create your workspace there.

steps

I recommend that you proceed in the following steps.

1. Database
- Set up your contact database.
- Get your app code working for the database connection part, fetching a particular contact record by name.
2. Online
- Get your app working “online”, so one can access it from a web client.
- Just handle the initial “Find” functionality at this point. Do not worry about greeting the person or the “New Contact” yet.
- Implement the HTML of the response pages.
3. Greet
- Implement the greeting functionality to welcome, and to welcome back, a user.
4. New Contact
- Now extend your app to include the “New Contact” functionality.

Forget about HTTPS and anything requiring authentication — Steps #3 & #4 — for the first week. Save that for Part II.

1. Database

derby

Create a table named Contact. Use the Derby database system from Lab #2 with the CSE “database”, using your Derby user name and password from Lab #2.

The file create_contact.sql describes the schema for the table. Use ij and that file to make your table:

run 'create_contact.sql';

Note that you do not set schema ... This table belongs to you, and so is under your default schema.

We will use just a subset of the columns today. Use phone1 to hold the telephone number. Note that the ID column is an integer that is auto-generated and, hence, is unique for each entry.

Add a few rows to the table so you can test your webapp. E.g.,

insert into Contact (LastName, FirstName, Phone1, Comments) 
    values ('University', 'York', '416-224-1833', 'Alma mater');

It is recommended that you enter the names — both first and last — in uppercase. This will facilitate search. Do a select to view your inserted rows. Verify that they have different IDs.

code

In the design, PBdao oversees the connection to the Derby database. Any queries that you need to make to the database are implemented as methods here; for example, find.

PBrow is a helper class. It is an object that holds a contact “row” that you retrieved from the database, or set in order to put in the database. It has setters and getters for each of the fields. This just makes the design for the rest of the code simpler.

2. Online

Get your app connected and running. PBapp has the main method. It is launched by pb.cgi, which is just a little CGI launcher script. (You likely will not have to modify PBapp.)

PB handles the main application logic. It is responsible for handling the CGI requests, and generating the proper HTTP responses and payloads.

It will need to fetch the CGI input parameters. The app is to use the method POST, so these will be on standard input for PB. It will decide what mode of operation it is in, depending on what the parameters are. Accordingly, it will deliver the right HTML payload back to the client, following proper HTTP headers.
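
Reading the POST parameters from standard input could be sketched as follows. FormParams is a hypothetical helper, not one of the template classes. It relies on the standard CGI environment variable CONTENT_LENGTH to know how many bytes to read, and it assumes the form is sent with the default encoding (application/x-www-form-urlencoded).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// FormParams is a hypothetical helper, not one of the template classes.
class FormParams {

    // Reads 'length' bytes of request body (fewer if the input ends early).
    static String readBody(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(buf, off, length - off);
            if (n < 0) {
                break; // input ended before 'length' bytes arrived
            }
            off += n;
        }
        return new String(buf, 0, off, StandardCharsets.US_ASCII);
    }

    // Splits "a=1&b=two+words" into decoded key-value pairs.
    static Map<String, String> parse(String body) throws UnsupportedEncodingException {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            String[] kv = pair.split("=", 2);
            String key = URLDecoder.decode(kv[0], "UTF-8");
            String value = kv.length > 1 ? URLDecoder.decode(kv[1], "UTF-8") : "";
            params.put(key, value);
        }
        return params;
    }

    public static void main(String[] args) throws IOException {
        String lengthText = System.getenv("CONTENT_LENGTH");
        int length = (lengthText == null || lengthText.isEmpty())
                ? 0 : Integer.parseInt(lengthText);
        Map<String, String> params = parse(readBody(System.in, length));
        System.out.print("Content-type: text/plain\n\n" + params + "\n");
    }
}
```

An empty map then corresponds to the case where no parameters came in. Comparing this program's output against echo.cgi for the same form is a quick check that the parsing is right.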

At this point, you should handle two modes: welcome and find results. For welcome, you are delivering the basic start page, which has a “Find” search bar. You would deliver this if no parameters came in. For find results, this is delivering the search result of a “Find” request. This page should also provide a “back” button, which would get the client back to welcome.
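
The welcome page might be generated along these lines. This is only a sketch: SearchPage is a hypothetical helper, and the field name "entry" and the form action "pb.cgi" are assumptions. It escapes whatever the user typed before writing it back into the page, which matters because the textbox is pre-filled with user input when a search fails.

```java
// SearchPage is a hypothetical helper; the field name "entry" and the
// action "pb.cgi" are assumptions made for this sketch.
class SearchPage {

    // Escapes the characters that are special in HTML text and attributes.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Builds the welcome page. 'message' may be empty (nothing is shown
    // then); 'entry' pre-fills the textbox.
    static String build(String message, String entry) {
        StringBuilder out = new StringBuilder();
        out.append("Content-type: text/html\n\n");
        out.append("<!DOCTYPE html>\n<html>\n<head>\n");
        out.append("    <title>Phonebook</title>\n");
        out.append("</head>\n<body>\n");
        if (!message.isEmpty()) {
            out.append("    <p>").append(escape(message)).append("</p>\n");
        }
        out.append("    <form method=\"post\" action=\"pb.cgi\">\n");
        out.append("        <label>Last name: ");
        out.append("<input type=\"text\" name=\"entry\" value=\"");
        out.append(escape(entry)).append("\"></label>\n");
        out.append("        <input type=\"submit\" value=\"Find\">\n");
        out.append("    </form>\n</body>\n</html>\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(build("No match!", "O\"Neil"));
    }
}
```

Calling build("", "") gives the plain start page; calling it with a message and the previous entry gives the re-served page after a failed search.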

3. Greet

Now, add the functionality to track the user (client). Since clients have to authenticate to get to your app (and are using HTTPS so the session is confidential), your app can know who it is. All we are doing for this project is welcoming the user, and welcoming back the user, if he or she is coming to the app again.

Who the user is is not passed in as a parameter. Rather, it is part of the environment the web-server sets for your app instance. The environment consists of environment variables with such type of information stored there.

In Java, these can be accessed using the one-parameter method System.getenv(...). This returns a string with the value for that key, if the key is present. So, to fetch the user name, call System.getenv("REMOTE_USER"). (If null is returned, the server did not set the key "REMOTE_USER"; for instance, maybe the server does not know who the user is. This would be the case if your app is accessible by a non-authenticated session.)
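
A small sketch of this lookup, with the null case handled, is below. The class name WhoAmI and the fallback text are made up for illustration.

```java
// WhoAmI is a hypothetical name used only for this sketch.
class WhoAmI {

    // Returns the user name, or a placeholder if the server did not set one.
    static String describe(String remoteUser) {
        if (remoteUser == null || remoteUser.isEmpty()) {
            return "(not authenticated)";
        }
        return remoteUser;
    }

    public static void main(String[] args) {
        String user = System.getenv("REMOTE_USER"); // null if the key is absent
        System.out.print("Content-type: text/plain\n\n");
        System.out.print("REMOTE_USER = " + describe(user) + "\n");
    }
}
```

Run from the command line, this should report that there is no user, since the shell does not normally set REMOTE_USER.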

How will you “remember” whether and when the user used the app last? Set a cookie. This will be stored by the user's client browser, and sent back to the app whenever the client starts a new HTTP session with the app.

Cookies go back and forth as headers in HTTP. In the request from the client, there is a header Cookie: that delivers any cookies to the server that were previously stored from past visits. In the response from the server, there is a header Set-Cookie: that asks the client browser to store a cookie on the server's behalf.

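Your app wants to set a cookie called pb with the timestamp of now. Use new java.util.Date().getTime() to find now; it returns the number of milliseconds since the epoch. Then you want to print as a header Set-Cookie: pb=1319552311; secure. The number is an example of the timestamp. (The example value is in seconds; getTime() gives milliseconds, so your value will be three digits longer unless you divide by 1000.) The secure tells the client to return the cookie only under HTTPS.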

If a client returns back to your app later, in the request, there will be a header Cookie: ...; pb=1319552311; .... This is the list of cookies being delivered back to you. Likely, in your case, there will be just the one key-value in the list (for pb), but you cannot count on this. In Java, you can get this information from the environment, System.getenv("HTTP_COOKIE").

To welcome back a user, search for the pb cookie. If it is there, you can retrieve its timestamp value, which says when the user last used the app (the last time the cookie was set). You will need to convert the timestamp into a human-readable date and time. I recommend java.util.Date.
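
The cookie round trip could be sketched as follows. PbCookie and its method names are hypothetical. The sketch stores the value of getTime() unchanged, that is, in milliseconds, and it treats a missing or unparsable cookie as a first visit rather than failing.

```java
import java.util.Date;

// PbCookie is a hypothetical helper used only for this sketch.
class PbCookie {

    // Header asking the browser to store the timestamp under the name pb.
    static String setCookieHeader(long timestamp) {
        return "Set-Cookie: pb=" + timestamp + "; secure";
    }

    // Looks up one cookie in a "k1=v1; k2=v2" list; null if it is absent.
    static String find(String cookieList, String name) {
        if (cookieList == null) {
            return null;
        }
        for (String pair : cookieList.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals(name)) {
                return kv[1];
            }
        }
        return null;
    }

    // Converts a stored timestamp to a Date; null if missing or not a number.
    static Date parseVisit(String value) {
        try {
            return new Date(Long.parseLong(value));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Date last = parseVisit(find(System.getenv("HTTP_COOKIE"), "pb"));
        long now = new Date().getTime(); // milliseconds since the epoch

        StringBuilder out = new StringBuilder();
        out.append("Content-type: text/plain\n");
        out.append(setCookieHeader(now)).append("\n"); // a header, so before the blank line
        out.append("\n");
        if (last == null) {
            out.append("Welcome\n");
        } else {
            out.append("Welcome back. Your last visit was ").append(last).append(".\n");
        }
        System.out.print(out);
    }
}
```

The Set-Cookie line has to be printed with the other headers, before the blank line; printed after it, it would simply become part of the payload.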

4. New Contact

Next is to extend the functionality of your app. First, go to Lab Report, and check in your code. You are only responsible for parts 1 to 3 for the Lab Report. And save a copy of your code for yourself.

If you get this part done and want to turn in the fuller version later, please do.

Add the functionality of “New Contact” as defined in the Requirements section.

Lab Reports

part I (week #5)

Submit your MyCalc.java.

% submit 2041 lab4a MyCalc.java

part II (week #6)

Leave your pb app active. Submit your java files.

% submit 2041 lab4b PBapp.java PB.java PBdao.java PBrow.java