CSE-2041A
Net-Centric Computing
York University
Fall 2012
Lab #4: Webapp
Contact Phonebook
posted: 9 October 2012
revised: 16 October 2012
Welcome
to Lab #4.
Let's build webapps!
We shall use the old CGI style of doing things.
Why?
Well, it is simple;
and, you see exactly what is going on,
with nothing hidden.
Lab #4 has two parts,
and spans two lab periods.
It has two deliverables
(one for each week),
and is worth two lab points
(one for each part / deliverable).
My CGI
Getting the basics working.
-
Contact Phonebook (PB)
A miniature contact-database
“phonebook” app.
The second
—
the PB project
—
is a bit more involved.
It is a full-fledged,
albeit still somewhat simplistic,
webapp.
Let's
set up a CGI,
a server-side application program.
Since you people know Java,
you will write the app program in Java.
A “CGI” in Java is not supported!
No problem, though.
We will just have a miniature CGI
“wrapper” program;
all it will do is call your Java program.
CGI programs are called and executed by the web server
— say, Apache —
directly.
By convention,
a CGI program is usually written in an interpreted / scripting
programming language;
e.g.,
Bourne Shell (sh ),
Perl,
Python,
PHP,
and
Ruby.
There are other standard ways to set up web server-side apps
written in Java, such as with
Java Servlets.
However,
a servlet container hides some of what is actually going on.
That is a convenient abstraction,
providing for more robust, faster app development.
But it would get in the way of our
learning what is happening behind the scenes.
example: ParkeCalc
In class,
we have been working through the example of a server-side
calculator app
(ParkeCalc is what I called the one I sketched up).
The versions are written in Perl,
but illustrate much of what we are doing here.
You can poke at these at the following places:
You can look at the code in
/~godfrey/course/2041/see/.
(How is this showing you the source files?!
A CGI is usually executed on demand by the server,
not delivered as a static document.
And one usually would not want the source code
visible to the outside world,
for security.
Check out the source of the .htaccess at
/~godfrey/course/2041/see/htaccess.
Caution!
Do not use this .htaccess for your PB project.
Get the one from
PB template sources
for that.)
example: HtmlTemplate
Look
at the code of HtmlTemplate
written in Java:
This is an example of a clean style for writing into output HTML.
It also picks up the remote user,
if the user has authenticated
and that information is available.
You can invoke it in two ways:
-
htmlTemplate.cgi
-
htmlTemplate.cgi
What is the difference between the two?
(See the .htaccess
in the
PB template sources.)
Study
how each of the three versions of ParkeCalc works.
You are to reverse engineer this to build your own version,
MyCalc ,
in Java.
Your
MyCalc
should operate like ParkeCalcC .
Maybe make it slightly fancier in its functionality.
Be creative.
-
How are the parameters in the ParkeCalc app
delivered from the client to the server?
-
What is the URL that the form calls
in the generated HTML
from
ParkeCalcB
and
ParkeCalcC ?
-
What does your program have to return?
Think this through carefully.
Since the server will not call your Java directly as a
“CGI”,
you need a wrapper CGI.
Grabbing a copy of myCalc.cgi from
/~godfrey/course/2041/see/
should suffice.
This assumes your Java source is in MyCalc.java
in the same directory,
and that you have compiled it,
so there is a MyCalc.class .
Permissions on myCalc.cgi
should be at least
“-rwx------ ”.
A useful program as a debugging tool is at
This echoes the client's request back to the client,
showing the
headers,
query string (GET ),
payload (POST ),
and
the form parameters parsed out into key-value pairs.
Point your HTML form at this for debugging purposes
to see exactly what is being sent.
You can look at the source code of
echo.cgi
in
/~godfrey/course/2041/see/.
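To make the "key-value pairs" idea concrete, here is a minimal Java sketch of parsing URL-encoded form data. It is not the code of echo.cgi; the class name FormParams and the sample parameter names (op, name) are made up for illustration, and it assumes the data is UTF-8 encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits "a=1&b=two+words" into decoded key-value pairs.
class FormParams {
    static Map<String, String> parse(String encoded) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (encoded == null || encoded.isEmpty()) {
            return params;                      // nothing was sent
        }
        for (String pair : encoded.split("&")) {
            if (pair.isEmpty()) {
                continue;                       // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    // Turns "+" into a space and "%21" into "!", and so on.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

The same parser works whether the encoded string came from the query string (GET) or from the payload (POST); only where you read it from differs.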
Internal Server Error!
Often
while building a CGI program,
when you test it by calling it from a client browser
(e.g., Firefox),
you get the dreaded response of
Internal Server Error.
This is a status 500 response.
Something broke
with permissions,
on invocation of the CGI program
or with the program's response.
But what?
It is maddening;
the message does not tell you much.
(Okay, more can be learned by looking in the server's logs.
Of course for that,
you need access to the server's logs.
In this case,
you do not have access.
You often do not.)
Debugging webapps
—
and CGI programs
—
is notoriously hard for such reasons.
In one way,
this is only to be expected.
Our “application”
has parts on at least two machines:
the client and the server.
These parts have to be coordinated.
So,
we need to develop
a very good understanding of CGI,
debugging acumen and good practices,
and
a good set of tools
to cope.
What could result in an Internal Server Error?
-
permissions:
The web server will not honour running your CGI program
because permissions are not set up correctly.
-
ill-formed response:
The output of your CGI program is not
a properly formatted response.
-
runtime errors:
Your CGI program had a runtime error when invoked.
-
other gotchas!
permissions
We
are working with an Apache server running on Linux.
Let us discuss standard permission rules Apache applies
to determine whether it will honour invoking a CGI program.
-
Each directory
on the path of directories from your
root www/ directory
(in your home directory ~ )
down to the directory
that the CGI program is in
needs to be accessible to other.
That is, each directory must have permissions
of at least
“d--------x ”.
For directories,
the “x ”
means that files and sub-directories in the directory
can be reached,
or accessed.
(For directories,
an “r ”
means a directory listing can be done.)
Of course,
you likely want to be able to
access (“x ”),
list (“r ”),
and
write
(“w ”)
—
that is, to add and delete files and sub-directories
—
in your own (user) directory!
So,
“drwx-----x ”
would be more sensible.
And it rarely makes sense to have permissions for other
but not for group;
so,
“drwx--x--x ”
is better.
Now,
if you do not mind people who have access to the file system
doing a directory listing of this directory,
then use “drwxr-xr-x ”.
If you allow write for other
(e.g., “drwx--x-wx ”)
on the directory that the CGI program is in,
Apache will not invoke the CGI!
This is because of security risks.
Someone else could write a CGI file into your directory,
and then execute it by calling it from a browser.
And it would execute under your account!
Remember, each directory down the path has to be accessible
(“x ”) to other.
For example,
if your CGI is at the URL path
/~you/2041/projects/cgi/mine.cgi
—
where you is your account name
—
then each of
~you/www/
(the www/ is implicit on the URL path)
~you/www/2041/
~you/www/2041/projects/
~you/www/2041/projects/cgi/
must have permissions, say,
“drwxr-xr-x ”
(or, at least, “d--------x ”).
You can add the permission access for other
to the directory you are presently in with
% chmod o+x .
(Do a man on chmod
to learn more on the ins and outs of chmod .)
-
The CGI program file needs to have executable
for user;
so at least
“---x------ ”.
(For files,
“x ”
means executable;
that is, can be run.)
A sensible setting would be
“-rwx------ ”.
If you do not mind other people with access to the file system
reading your CGI file,
then use “-rwxr--r-- ”.
(This does not mean
that people “visiting”
the website by browser can read the file!
Apache will not deliver the CGI program file.)
And to let others on the file system
run your CGI program from the command line, use
“-rwxr-xr-x ”.
If other has write permission
(“w ”)
on the CGI program file,
Apache will not invoke it!
Again,
this is because of security risks.
Someone could change the contents of your CGI program
and then execute it by calling it from a browser.
And the new code would execute under your account!
In your case,
your CGI file
—
e.g., myCalc.cgi
—
is a very simple Bourne Shell
(“sh ”)
script
which, in turn,
simply calls java (the java virtual machine, the JVM)
to run your Java program.
Your Java program therefore needs to have been compiled
beforehand for the CGI to work; e.g.,
% javac MyCalc.java
Do this from the command line.
This creates, say, MyCalc.class ,
the “executable”.
(Well, this is the bytecode that java then interprets.)
So the script
myCalc.cgi
is calling java
on the class file MyCalc.class .
Do not forget to recompile when you change your Java code!
Clearly then,
the permissions on, say,
your MyCalc.java
are immaterial for the CGI program to work
“through”
the Web.
The MyCalc.java file does not even have to be there!
That is because myCalc.cgi
is calling MyCalc.class
(via having java run it).
What about permissions on MyCalc.class ?
It needs to be readable by owner (you),
so that java as called by myCalc.cgi
has permissions to read it.
(Recall Apache executes
myCalc.cgi as owner,
which then executes MyCalc.class as owner.)
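The permission rules above can be restated as a small Java sketch. It only checks a nine-character mode string (the part of an "ls -l" listing after the leading "d" or "-") against the rules described here; it is not how Apache itself decides, and the class and method names are made up.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical checker for the rules in this section.
class CgiPermissions {
    // Any directory on the path from www/ down: other must have "x".
    static boolean pathDirectoryOk(String perms) {
        Set<PosixFilePermission> p = PosixFilePermissions.fromString(perms);
        return p.contains(PosixFilePermission.OTHERS_EXECUTE);
    }

    // The directory holding the CGI: reachable by other, but not writable by other.
    static boolean cgiDirectoryOk(String perms) {
        Set<PosixFilePermission> p = PosixFilePermissions.fromString(perms);
        return p.contains(PosixFilePermission.OTHERS_EXECUTE)
            && !p.contains(PosixFilePermission.OTHERS_WRITE);
    }

    // The CGI file itself: executable by user, not writable by other.
    static boolean cgiFileOk(String perms) {
        Set<PosixFilePermission> p = PosixFilePermissions.fromString(perms);
        return p.contains(PosixFilePermission.OWNER_EXECUTE)
            && !p.contains(PosixFilePermission.OTHERS_WRITE);
    }
}
```

For example, cgiDirectoryOk("rwx--x--x") holds, while cgiDirectoryOk("rwx--x-wx") does not, because other can write into the directory.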
ill-formed response
What
must be the output of your CGI program when invoked?
Three things, in order:
-
proper HTTP headers,
followed by
-
a blank line,
followed by
-
the content / payload
(say, a proper HTML document).
The one header field you really need to provide is
the Content-type .
For example:
-
Content-type: text/html

<!DOCTYPE html>
<html>
<head>
<title>My CGI's output payload!</title>
</head>
<body>
<h1>Hello, world!</h1>
</body>
</html>
If your program's output is not like this,
you will get the dreaded 500 status
invoking the CGI “through” the Web
from a client!
You can also use
“Content-Type: text/plain ” instead,
especially while debugging.
Does Apache add any additional headers to the HTTP response?
How would you be able to tell?
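As a sketch of the output format described in this section, the Java below builds a complete response: one header line, then the blank line, then the HTML payload. The class and method names are made up; adapt freely.

```java
// Hypothetical helper that assembles a well-formed CGI response.
class CgiResponse {
    static String htmlResponse(String title, String bodyHtml) {
        StringBuilder out = new StringBuilder();
        out.append("Content-type: text/html\n");   // the one header we must send
        out.append("\n");                          // blank line: end of headers
        out.append("<!DOCTYPE html>\n");
        out.append("<html>\n");
        out.append("<head>\n<title>").append(title).append("</title>\n</head>\n");
        out.append("<body>\n").append(bodyHtml).append("\n</body>\n");
        out.append("</html>\n");
        return out.toString();
    }
}
```

Your program would then do something like System.out.print(CgiResponse.htmlResponse("Hello", "<h1>Hello, world!</h1>")). Building the whole response as a string first, rather than printing piece by piece, also makes it easy to inspect from the command line.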
runtime errors
If
your CGI program breaks in a runtime error on invocation,
you will see the Internal Server Error.
Why?
Because the output is not the right format;
no headers followed by the payload.
In the case of using Java as the programming language
for our “CGI”,
a technique often used while debugging
is to wrap nearly all of main
in a try block.
Then have a catch that catches anything,
prints the header and then the exception messages as the payload.
So the program never really fails outright.
That way,
you could see at the browser what happened.
Of course,
also run your program from the command line to debug it.
And remember to run the .cgi from the command line too
to see that all is working.
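The try / catch technique just described might look like the following sketch. The method handleRequest is a stand-in for your real application logic (here it simply fails on purpose), and the class name is made up.

```java
import java.io.PrintStream;

// Hypothetical skeleton: main never fails outright, so the browser sees the error.
class SafeMain {
    // Stand-in for the real work; it throws to demonstrate the catch.
    static void handleRequest(PrintStream out) throws Exception {
        throw new IllegalStateException("demo failure");
    }

    static void run(PrintStream out) {
        try {
            handleRequest(out);
        } catch (Throwable t) {
            // Send a valid header and blank line first, then the error as payload.
            out.print("Content-type: text/plain\n\n");
            out.println("The program failed:");
            t.printStackTrace(out);
        }
    }

    public static void main(String[] args) {
        run(System.out);
    }
}
```

One caveat: if handleRequest has already printed part of its own response before failing, the error text lands in the middle of that output. Building the normal response in a string and printing it only at the end avoids this. Remove or tone down the catch once the app works, since stack traces tell outsiders about your code.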
other gotchas
There are numerous other things
that can go wrong too,
but that are generally quite rare for us.
One that is not:
MS Windows and Unix / Linux handle end-of-line differently
in text files.
End-of-line is actually indicated within a text file
as an invisible character (or two).
For Unix-like systems,
that end-of-line character is LF (line feed),
which is ASCII 10,
and indicated in strings in many languages like Java and C++
as “\n ”.
For MS Windows systems,
end-of-line is indicated by a sequence of two characters:
CR LF .
CR is carriage return, ASCII 13,
and indicated in strings in many languages
as “\r ”.
Maddeningly,
one cannot easily tell which style is being used in a file
just looking at it.
It looks the same in most contexts either way.
Okay, why do we care?
Unfortunately,
Unix / Linux cares for shell scripts.
And our .cgi wrapper
—
e.g., myCalc.cgi
—
is a Bourne Shell script.
If you cut-and-pasted, say,
myCalc.cgi
into, say, NotePad
on a Windows system and saved it,
it will have the CR -LF
end-of-lines.
And that will break when Apache
—
running on Linux
—
tries to execute it!
Sigh.
A very sneaky, invisible error.
On Unix / Linux,
you can convert any text file from
CR -LF end-of-lines (Windows)
to LF end-of-lines (Unix)
by the command dos2unix .
E.g.,
% dos2unix myCalc.cgi
If you are getting a 500 status,
do this to your .cgi
to eliminate this possible problem.
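If you would like to check for the problem rather than guess, a few lines of Java can look for the CR LF byte pair (ASCII 13 followed by ASCII 10). This is only a diagnostic sketch with made-up names; dos2unix remains the fix.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic: does this content use Windows-style line endings?
class LineEndings {
    static boolean hasCrLf(byte[] content) {
        for (int i = 0; i + 1 < content.length; i++) {
            if (content[i] == 13 && content[i + 1] == 10) {   // CR then LF
                return true;
            }
        }
        return false;
    }

    // Convenience wrapper for a file on disk, e.g. "myCalc.cgi".
    static boolean fileHasCrLf(String path) throws IOException {
        return hasCrLf(Files.readAllBytes(Paths.get(path)));
    }
}
```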
requirements document
The
user starts by visiting the URL
https://www.cse.yorku.ca/~cseXXXXX/pb/pb.cgi
where cseXXXXX is your Red login.
At that point,
the browser prompts for username and password.
Any valid Red user should be able to log in.
After a successful authentication,
the user is presented
with a search page having the following components.
-
A message that says
“Welcome X”,
where X is the user's username.
If the user has visited the webapp before,
then the message should say,
“Welcome back X.
Your last visit was xxx.”
The xxx should be the time of the person's last visit.
-
A label, a textbox, and a button entitled “Find”.
The user can enter a person's last name
—
or a part thereof
—
and click
“Find”
in order to initiate a database search
for the telephone number of that person.
The search should be case insensitive.
When the “Find” button is clicked,
the entry is looked up in the database.
If not found,
the search page is re-served with an error message at the top
that says “No match!”,
and with the textbox pre-filled with whatever the user had entered.
If, however, the search was successful,
the user is presented with a detail page
which is made up of four label-textbox pairs:
-
Last Name
-
First Name
-
Telephone
-
Comments
The text-boxes are to be pre-populated with values
from the corresponding columns of the matching row
in the database.
(If several matches were found,
then only the first one
—
in last name order
—
should be shown.)
Note that the text-boxes must be in read-only mode,
and that the 4th one must actually be a text area,
in order to accommodate large comments.
This page must also include a
“Back”
button to enable the user to return
to the search page with its text-box pre-filled.
If it is you who has come to the page,
another button should be made available,
named “New Contact”,
below the Find search bar.
(Do not provide this
—
or make the functionality accessible
—
to anyone else.)
If you click it,
it gives you the very same layout
as a search result,
showing you fields for
Last Name,
First Name,
Telephone, and
Comments.
However, this time,
there is nothing filled in the fields,
and they are open for you to enter values.
At the bottom should be a
“Back” button
and a
“Save” button.
“Save” should add that as an entry
into your contact database.
design sketch
Your
webapp consists of five files.
-
pb.cgi
-
PBapp.java
-
PB.java
-
PBdao.java
-
PBrow.java
Start by developing PBrow .
It should have five private attributes
—
the four mentioned in the requirements plus the ID
—
a getter and setter for each,
and a default constructor that initializes the ID to -1,
and the remaining attributes to empty strings.
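A minimal sketch of such a class follows. The attribute names are my guesses at sensible ones (the template sources may name them differently), and the ID is assumed to be an int.

```java
// Sketch of the row holder described above; field names are assumptions.
class PBrow {
    private int id;
    private String lastName;
    private String firstName;
    private String telephone;
    private String comments;

    // Default constructor: ID of -1 (not yet in the database), empty strings.
    PBrow() {
        id = -1;
        lastName = "";
        firstName = "";
        telephone = "";
        comments = "";
    }

    int getId() { return id; }
    void setId(int id) { this.id = id; }

    String getLastName() { return lastName; }
    void setLastName(String lastName) { this.lastName = lastName; }

    String getFirstName() { return firstName; }
    void setFirstName(String firstName) { this.firstName = firstName; }

    String getTelephone() { return telephone; }
    void setTelephone(String telephone) { this.telephone = telephone; }

    String getComments() { return comments; }
    void setComments(String comments) { this.comments = comments; }
}
```

The methods are package-private here only to keep the sketch in one file; in your own PBrow.java you would declare the class and its methods public.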
Next,
develop PBdao whose attributes
must include a Statement object.
Its constructor must connect to your database,
and initialize the statement attribute.
It must also have the method
public PBrow find(String entry)
This takes the entry made by the user in the search page,
looks for it in the database
—
as described in the requirements
—
and returns the matching row.
(The return should be null,
if no match is found.)
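Here is one way the lookup inside find might be sketched. Note the departures from the design above, made to keep the example self-contained: it uses a PreparedStatement in place of the Statement attribute (so that user input is never pasted into the SQL text), it takes the Connection as a parameter, and it returns a plain String[] where your method would fill in a PBrow. The class name is made up; the table and column names are those of the Contact table used in this lab.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;

// Sketch only: case-insensitive, partial match on last name, first match wins.
class FindSketch {
    static final String FIND_SQL =
        "SELECT LastName, FirstName, Phone1, Comments FROM Contact "
      + "WHERE UPPER(LastName) LIKE ? ORDER BY LastName";

    // "god" becomes "%GOD%": match anywhere in the name, ignoring case.
    static String likePattern(String entry) {
        return "%" + entry.trim().toUpperCase(Locale.ROOT) + "%";
    }

    // Returns {last, first, phone, comments} of the first match, or null.
    static String[] find(Connection conn, String entry) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(FIND_SQL)) {
            ps.setString(1, likePattern(entry));
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;               // no match
                }
                return new String[] {
                    rs.getString(1), rs.getString(2),
                    rs.getString(3), rs.getString(4)
                };
            }
        }
    }
}
```

A "%" or "_" typed by the user still acts as a LIKE wildcard in this sketch; decide for yourself whether that matters for your app.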
The PB class acts as a controller that
receives the incoming request,
extracts the needed HTTP headers and parameters, and
sequences the flow.
It also includes methods for serving up the two pages.
implementation
Template
implementations of some of the needed classes
are available at
-
http://www.cse.yorku.ca/course/2041/lab4/
Put the htaccess file in place
as .htaccess
once you want to force authentication
(and using HTTPS ).
Its permissions should be
“-rw-r--r-- ”.
Grab a copy of pb.cgi from
/~godfrey/course/2041/see/.
Note that pb.cgi is assuming
that your PBapp
is in a directory pb within this one
(that the CGI file is in).
Start your implementation using these files,
rather than starting from scratch.
It is recommended that you use Eclipse as an IDE.
To use Eclipse,
click
“Start, Programs, Java.”
Once in Eclipse,
create a new
“Java Project”.
Your home directory is mapped as drive Z,
so create your workspace there.
steps
I
recommend that you proceed in the following steps.
-
Database
Set up your contact database.
Get your app code working for the database connection part
for fetching a particular contact record by name.
-
Online
Get your app working “online”,
so one can access it from a web client.
Just handle the initial “Find”
functionality at this point.
Do not worry about greeting the person
or the “New Contact” yet.
Implement the HTML of the response pages.
-
Greet
Implement the greeting functionality
to welcome, and to welcome back, a user.
-
New Contact
Now extend your app to include the
“New Contact” functionality.
Forget about HTTPS and anything requiring authentication
—
Steps #3 & #4
—
for the first week.
Save that for Part II.
derby
Create
a table named Contact .
Use the Derby database system from Lab #2
with the CSE “database”,
using your Derby user name and password
from Lab #2.
The file create_contact.sql
describes the schema for the table.
Use ij
and that file to make your table:
run 'create_contact.sql';
Note that you do not set schema ...
This table belongs to you,
and so is under your default schema.
We will use just a subset of the columns today.
Use phone1 to hold the telephone number.
Note that the ID column is an integer
that is auto-generated and, hence, is unique for each entry.
Add a few rows to the table so you can test your webapp.
E.g.,
insert into Contact (LastName, FirstName, Phone1, Comments)
values ('University', 'York', '416-224-1833', 'Alma mater');
It is recommended that you enter the names
—
both first and last
—
in uppercase.
This will facilitate search.
Do a select to view your inserted rows.
Verify that they have different IDs.
code
In
the design,
PBdao oversees the connection to the Derby database.
Any queries that you need to make to the database
are implemented as methods here;
for example, find .
PBrow is a helper class.
It is an object that holds a contact “row”
that you retrieved from the database,
or set in order to put in the database.
It has setters and getters for each of the fields.
This just makes the design for the rest of the code simpler.
Get
your app connected and running.
PBapp has the main method.
It is launched by pb.cgi ,
which is just a little CGI launcher script.
(You likely will not have to modify PBapp .)
PB handles the main application logic.
It is responsible for handling the CGI requests,
and generating the proper HTTP responses and payloads.
It will need to fetch the CGI input parameters.
The app is to use the method POST ,
so these will be on standard input for PB .
It will decide what mode of operation
it is in, depending on what the parameters are.
Accordingly,
it will deliver the right HTML payload back to the client,
following proper HTTP headers.
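Reading the POST parameters from standard input could be sketched as below. Under CGI, the server passes the payload length in the environment variable CONTENT_LENGTH; the sketch reads exactly that many bytes. The class name is made up, and UTF-8 is assumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: fetch the raw POST payload, e.g. "find=God".
class PostBody {
    static String read(InputStream in, String contentLength) throws IOException {
        int length = 0;
        try {
            if (contentLength != null) {
                length = Integer.parseInt(contentLength.trim());
            }
        } catch (NumberFormatException e) {
            length = 0;                        // treat a bad length as "no payload"
        }
        byte[] buf = new byte[Math.max(length, 0)];
        int off = 0;
        while (off < buf.length) {             // read may return fewer bytes than asked
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break;                         // input ended early
            }
            off += n;
        }
        return new String(buf, 0, off, StandardCharsets.UTF_8);
    }
}
```

In the app this would be called as PostBody.read(System.in, System.getenv("CONTENT_LENGTH")). An empty result means no parameters came in.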
At this point,
you should handle two modes:
welcome and find results.
For welcome,
you are delivering the basic start page,
which has a “Find” search bar.
You would deliver this if no parameters came in.
For find results,
this is delivering the search result of a
“Find” request.
This page should also provide a
“back” button,
which would get the client back to welcome.
Now,
add the functionality to track the user (client).
Since clients have to authenticate to get to your app
(and are using HTTPS
so the session is confidential),
your app can know who it is.
All we are doing for this project
is welcoming the user,
and welcoming back the user,
if he or she is coming to the app again.
Who the user is is not passed in as a parameter.
Rather,
it is part of the environment
the web-server sets for your app instance.
The environment
consists of environment variables
that hold this kind of information.
In Java,
these can be accessed using the one-parameter
method System.getenv(...) .
This returns a string with the value for that key,
if the key is present.
So, to fetch the user name, call
System.getenv("REMOTE_USER") .
(If null is returned,
the server did not set the key "REMOTE_USER" ;
for instance, maybe the server does not know who the user is.
This would be the case if your app is accessible
by a non-authenticated session.)
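A small sketch of turning the user name into the greeting from the requirements follows. The fallback name "guest" for a null user and the class name are my own assumptions; the lastVisit argument is an already formatted date string, or null on a first visit.

```java
// Hypothetical helper producing the welcome line for the search page.
class Greeting {
    static String welcome(String remoteUser, String lastVisit) {
        // REMOTE_USER is null when the session is not authenticated.
        String who = (remoteUser == null) ? "guest" : remoteUser;
        if (lastVisit == null) {
            return "Welcome " + who;
        }
        return "Welcome back " + who + ". Your last visit was " + lastVisit + ".";
    }
}
```

In the app the first argument would be System.getenv("REMOTE_USER").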
How will you “remember”
whether and when the user used the app last?
Set a cookie.
This will be stored by the user's client browser,
and sent back to the app whenever the client
starts a new HTTP session with the app.
Cookies go back and forth as headers in HTTP.
In the request from the client,
there is a header Cookie: that
delivers any cookies to the server
that were previously stored from past visits.
In the response from the server,
there is a header Set-Cookie: that
asks the client browser to store a cookie on the server's behalf.
Your app wants to set a cookie
called pb with the timestamp of now.
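Use new java.util.Date().getTime() to find now;
it returns milliseconds since the Unix epoch.
Then you want to print as a header
Set-Cookie: pb=1319552311; secure .
The number is an example of a timestamp
(written here in seconds; a value from getTime() is in milliseconds, so three digits longer).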
The secure tells the client only to return the cookie
under use of HTTPS.
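As a sketch, the cookie header sits with the other headers, before the blank line that ends them. The class and method names below are made up; the timestamp is whatever getTime() gave you.

```java
import java.util.Date;

// Hypothetical helper: the header block for a page that also sets the pb cookie.
class VisitCookie {
    static String setCookieHeader(long timestamp) {
        return "Set-Cookie: pb=" + timestamp + "; secure";
    }

    // All headers, then the blank line; the HTML payload follows after this.
    static String headers(long timestamp) {
        return "Content-type: text/html\n"
             + setCookieHeader(timestamp) + "\n"
             + "\n";
    }

    static long now() {
        return new Date().getTime();   // milliseconds since the epoch
    }
}
```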
If a client returns to your app later,
in the request,
there will be a header
Cookie: ...; pb=1319552311; ... .
This is the list of cookies being delivered back to you.
Likely, in your case, there will be just the one key-value in the list
(for pb ), but you cannot count on this.
In Java,
you can get this information from the environment,
System.getenv("HTTP_COOKIE") .
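Picking one cookie out of that list could be sketched as follows, without counting on pb being the only entry. The class name is made up, and the other cookie names in the usage example are invented.

```java
// Hypothetical helper: find one cookie's value in a "k1=v1; k2=v2" list.
class Cookies {
    static String find(String cookieHeader, String name) {
        if (cookieHeader == null) {
            return null;                       // the client sent no cookies
        }
        for (String part : cookieHeader.split(";")) {
            String pair = part.trim();
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(name)) {
                return pair.substring(eq + 1);
            }
        }
        return null;                           // no cookie by that name
    }
}
```

For instance, Cookies.find("a=1; pb=1319552311; theme=dark", "pb") picks out the pb value, and in the app the first argument would be System.getenv("HTTP_COOKIE").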
To welcome back a user,
search for the pb cookie.
If it is there,
you can retrieve its timestamp value,
which says when the user last used the app
(the last time the cookie was set).
You will need to convert the timestamp
into a human readable date and time.
I recommend java.util.Date .
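One possible sketch of the conversion uses java.util.Date together with java.text.SimpleDateFormat. The date pattern, the explicit time zone parameter, and the class name are my choices, not requirements. The input is assumed to be in milliseconds, as returned by getTime(); if you stored seconds in the cookie, multiply by 1000 first.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: milliseconds since the epoch to a readable string.
class VisitTime {
    static String readable(long millis, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        fmt.setTimeZone(zone);
        return fmt.format(new Date(millis));
    }
}
```

The cookie value arrives as text, so convert it with Long.parseLong first, and be ready for a NumberFormatException if the value is not a number. Passing TimeZone.getDefault() gives the server's local time.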
Next
is to extend the functionality of your app.
First,
go to Lab Report,
and check in your code.
You are only responsible for parts 1 to 3
for the Lab Report.
And save a copy of your code for yourself.
If you get this part done
and want to turn in the fuller version later,
please do.
Add the functionality of “New Contact”
as defined in the Requirements section.
part I (week #5)
Submit
your MyCalc.java.
% submit 2041 lab4a MyCalc.java
part II (week #6)
Leave
your pb app active.
Submit your Java files.
% submit 2041 lab4b PBapp.java PB.java PBdao.java PBrow.java