Linux's history as an enthusiast's playground has always made it a fun place for programmers to work. Combine the fun of Linux with the power of Java and JSP, and quickly build secure multi-tier Web applications using the latest technologies.
All the cool new programming languages, like Ruby, always have compilers/interpreters and tools for Linux, and the old UNIX standbys like Tcl/Tk are still around when you need them. Why, then, is Java not a ubiquitous player in the Linux arena?
Linux and Java really do have a lot to offer each other. Both are rock-solid and scalable server-class software systems, and most college and university graduates with software-related degrees are familiar with them, making for a powerful combination. In this article, I introduce you to Java Web applications through the Java Servlet Specification, the Java programming language itself and Java Server Pages. These three tools can help you get a Web application running in a lot less time than you think.
The Java Servlet Specification defines a Servlet Container, a Web application and the Servlet API, which is the glue that holds these pieces together.
A Servlet Container is analogous to a Web server, but it also knows how to deploy and manage Web applications, and so it often is known as an Application Server. The Servlet Container provides services that support the Servlet API, which is used by the Web application to interact with HTTP requests and responses.
A Java Web application is a self-contained collection of configuration files, static and dynamic resources, compiled classes and support libraries that are all treated as a cohesive unit by the Servlet Container. Java Web applications are somewhat different from standard LAMP-style Web applications, which are more like collections of associated programs or scripts than formally defined, self-contained units. To demonstrate a Java Web application, I have developed a simple “timesheet” application that features some of the standard Java libraries that helped me write it.
Typically, a Web application is packaged in a WAR (Web ARchive) file, which is just a ZIP file with a special directory structure and configuration file layout. The directory structure of the Web application logically and physically separates the different types of files it contains. The WEB-INF directory contains all the configuration files. Within WEB-INF, a lib directory contains all libraries (packaged in JAR, or Java ARchive, files), and a classes directory contains the application's compiled code. Listing 1 shows the file layout of the Web application for reference.
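Because a WAR file is an ordinary ZIP archive, Java's standard java.util.zip classes can read one. The following is a minimal sketch, not part of the example application: the class name WarLister and the four entry names in sampleWar are hypothetical, chosen only to mirror the layout described above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class WarLister {

    /** Returns the entry names found in a WAR (ZIP) stream, in archive order. */
    static List<String> listEntries(InputStream war) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(war)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a tiny in-memory archive with a hypothetical WAR layout. */
    static InputStream sampleWar() {
        String[] paths = {
            "tasks.jsp",                                      // a dynamic resource
            "WEB-INF/web.xml",                                // deployment descriptor
            "WEB-INF/classes/example/GetTasksServlet.class",  // compiled code
            "WEB-INF/lib/example-library.jar"                 // support library
        };
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String path : paths) {
                zip.putNextEntry(new ZipEntry(path)); // empty entry; only the name matters here
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ByteArrayInputStream(bytes.toByteArray());
    }

    public static void main(String[] args) {
        for (String name : listEntries(sampleWar())) {
            System.out.println(name);
        }
    }
}
```

Passing a FileInputStream for a real WAR file to listEntries should print that archive's layout in the same way.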
The WEB-INF directory also contains a special file, web.xml, which is known as the Web application's deployment descriptor. It defines all the behaviors of the Web application, including URI mappings, authentication and authorization. Let's look at the deployment descriptor for this Web application.
You can see that each servlet is defined in a <servlet> element that defines the Java class that contains the code, as well as a name for the servlet (to be used later). After the servlets have been defined, they are then mapped (by name) to incoming URIs using <servlet-mapping> elements. This servlet mapping may seem tedious and verbose, but it can be very powerful for several reasons:
You can map one servlet to multiple URIs.
You can use wild-card mappings (/foo/bar/*).
You may not want to reveal any of the code's structure to remote visitors.
You may have servlets you don't want to map at all.
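To make the mapping behavior more concrete, here is a rough sketch in plain Java of how a container might resolve a request path to a servlet name. It is a simplification under stated assumptions, not the container's actual code: it handles only exact matches and wild-card prefixes (a real container also supports extension mappings and a default servlet), and the /task-list and /reports/* mappings are hypothetical additions to the two mappings this application uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ServletMappingSketch {

    /** URL patterns mapped to servlet names; only /tasks and /save-task come from the article. */
    static Map<String, String> sampleMappings() {
        Map<String, String> mappings = new LinkedHashMap<>();
        mappings.put("/tasks", "get-tasks");
        mappings.put("/task-list", "get-tasks"); // hypothetical: a second URI for the same servlet
        mappings.put("/save-task", "save-task");
        mappings.put("/reports/*", "reports");   // hypothetical: a wild-card mapping
        return mappings;
    }

    /** Returns the servlet name mapped to the path, or null if nothing matches. */
    static String resolve(Map<String, String> mappings, String path) {
        // An exact match wins outright.
        String exact = mappings.get(path);
        if (exact != null) {
            return exact;
        }
        // Otherwise, the longest matching wild-card prefix wins.
        String best = null;
        int bestLength = -1;
        for (Map.Entry<String, String> mapping : mappings.entrySet()) {
            String pattern = mapping.getKey();
            if (!pattern.endsWith("/*")) {
                continue;
            }
            String prefix = pattern.substring(0, pattern.length() - 2);
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && prefix.length() > bestLength) {
                best = mapping.getValue();
                bestLength = prefix.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = sampleMappings();
        for (String path : new String[] {"/tasks", "/task-list", "/reports/2009/july", "/unknown"}) {
            System.out.println(path + " -> " + resolve(mappings, path));
        }
    }
}
```

Note that an unmapped servlet never appears as a value in this table, so no request path can reach it.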
After the servlet mappings come container-managed authentication and authorization. The Servlet Specification requires that Servlet Containers provide mechanisms for authentication and authorization, and the configuration in the Web application is declarative: web.xml simply specifies what resources are protected and who is allowed to access them, using role-based authorization constraints. The setup is quite straightforward, and the Web application becomes simpler by not having to implement that capability inside the application. In this application, I've chosen to use HTTP BASIC authentication to simplify the application. DIGEST, FORM and (SSL) CLIENT-CERT are other options allowed by the Servlet Specification.
Now that you have a sense of how the Web application is packaged and deployed, let's turn our attention to the real action in the Web application: the code.
Java is both a programming language and a runtime environment, much like Perl and PHP. With those languages, the compiler generally is invoked when the script is executed, whereas Java source always is compiled to bytecode beforehand. The Java programming language itself is object-oriented, procedural, block-structured and entirely familiar to anyone who has written in a C-like language. It has a number of explicitly defined primitive data types as well as reference types. All the Java code you write, including servlet code, lives within the definition of a class.
Let's take a look at the source code for GetTasksServlet (Listing 3). It implements the “get-tasks” servlet, which is mapped to the URL /tasks.
The first line of the file declares the “package” in which the class is defined. Packages help keep code organized and affect the scope and visibility of variables, methods and classes. The next set of lines are “imports” that tell the compiler which classes this class references. Those classes beginning with java. are standard Java classes, while those beginning with javax.servlet are provided by the Java Servlet Specification. Then, we define a class called GetTasksServlet that extends an existing class called HttpServlet, the basis for all HTTP-oriented servlets. The HttpServlet class defines a number of doXXX methods, where XXX is one of the HTTP methods, such as GET (doGet), POST (doPost), PUT (doPut) and so on. I have overridden the doGet method in order to respond to HTTP GET requests from clients.
The doGet method accepts two parameters: the request and the response, which provide hooks into the resources provided by the Servlet Container and to the information provided by the client for a particular HTTP request. I use two utility methods (defined later in the class) to obtain a list of clients and a list of tasks, and store them in the request object's “attributes”, a location where data can be placed in order to pass them between stages of request processing. You'll see how to access this information next when I cover JSP files for generating content. Finally, I invoke the “request dispatcher's” forward method, which tells the container to forward the request to another resource: tasks.jsp.
Java Server Pages (JSP) is a technology for generating dynamic content, such as Web pages. JSPs are analogous to PHP pages, where static text can be mixed with Java code, and the result is sent to the client. Technically speaking, JSPs are translated on the fly by a special servlet (provided by the Servlet Container) into their own servlets and compiled into bytecode, and then run just like “normal” servlets. Listing 4 shows the code for tasks.jsp, the page referenced in GetTasksServlet's doGet() method above.
The page begins with a page declaration that includes some metadata about the page, including the output character encoding, and then some “taglib” tags that tell the JSP compiler I want to use some “tag libraries”. Tag libraries are helper libraries that allow JSP scripts to wield powerful tools using very simple syntax. After the DOCTYPE, there is a <fmt:setBundle> element, and in the <title> of the page, there is a <fmt:message> element. These two tags, defined by the “fmt” tag library, work together to provide internationalization capabilities to this page. The <fmt:setBundle> tag defines the string resource bundle to be used by the page, and the <fmt:message> tag uses that bundle to pull localized text from the appropriate file to display in the page. The result is, when I visit this page with my Web browser set to the en_US locale, I get text in English, but if I switch the locale to fr_BE and reload the page, the page will switch into French without any further programming.
The standard Java API actually provides all this capability out of the box, but the JSTL (JavaServer Pages Standard Tag Library) “fmt” tag library gives us access to Java's internationalization APIs without having to write any Java code. By providing a Java property file (a text file with simple key=value syntax) for each locale I want to support, I get text localization practically for free. Further down in the JSP file, you can see the use of another “fmt” tag, <fmt:formatDate>. This tag formats a date object using the user's locale and a simple name for the format (“short” in this case). This results in MM/dd/yy in the US and dd/MM/yy in Belgium.
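For a sense of what the standard API is doing underneath the “fmt” tags, here is a minimal sketch using java.util.ResourceBundle and java.text.DateFormat directly. The class name LocaleSketch, the page.title key and the translations are all hypothetical, and the property text is inlined as strings rather than read from per-locale files so that the sketch is self-contained. The exact digits and padding of the formatted dates can vary slightly between Java versions, but the month-first versus day-first ordering follows the locale.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class LocaleSketch {

    /** Parses key=value text, the same syntax a Java property file uses. */
    static ResourceBundle bundleFrom(String properties) {
        try {
            return new PropertyResourceBundle(new StringReader(properties));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats a date using the given locale's short date style. */
    static String shortDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.SHORT, locale).format(date);
    }

    public static void main(String[] args) {
        // Hypothetical key and translations, one bundle per supported locale.
        ResourceBundle english = bundleFrom("page.title=Timesheet\n");
        ResourceBundle french = bundleFrom("page.title=Feuille de temps\n");

        Date date = new GregorianCalendar(2024, Calendar.DECEMBER, 25).getTime();
        Locale belgium = Locale.forLanguageTag("fr-BE");

        System.out.println(english.getString("page.title") + ": " + shortDate(date, Locale.US));
        System.out.println(french.getString("page.title") + ": " + shortDate(date, belgium));
    }
}
```

In a real application, ResourceBundle.getBundle would locate the right property file for the user's locale, which is the work the <fmt:setBundle> and <fmt:message> tags take care of in the JSP.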
The next JSTL tag is <c:forEach>. This tag encloses a body, which is evaluated multiple times: once for each element it finds in its “items” attribute. The ${items} syntax means that the attribute's value is not a simple literal, but an expression that should be evaluated. The object “items” is found in the request object's “attributes” (remember, I put it there in the servlet code) and is used here as the data for the loop. Within the body of <c:forEach>, the “item” object is defined and can be used by any JSTL tags.
The next tag, <c:out>, outputs a value in a Web-safe manner. If the value contains any < characters, they will be escaped to avoid nasty XSS attacks. The value of ${clientMap[item.clientId].name} is again an expression that tells <c:out> to take the client ID from the item object, use that to look up a value in the “clientMap”, and then get its name. The objects “item” and “clientMap” are both retrieved from the request attributes, and the <c:out> tag handles the expression evaluation and output escaping for us.
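The work <c:out> does here can be approximated in plain Java. The sketch below is an illustration under assumptions, not the tag's real implementation: the Client and Task classes are hypothetical stand-ins for the application's objects, escapeXml is a hand-written helper, and unlike the expression language, this code does not tolerate a missing map entry.

```java
import java.util.HashMap;
import java.util.Map;

final class OutSketch {

    /** Hypothetical stand-in for the application's client object. */
    static final class Client {
        private final String name;
        Client(String name) { this.name = name; }
        String getName() { return name; }
    }

    /** Hypothetical stand-in for the application's task object. */
    static final class Task {
        private final int clientId;
        Task(int clientId) { this.clientId = clientId; }
        int getClientId() { return clientId; }
    }

    /** Replaces the five characters that are special in HTML and XML with entities. */
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&#034;"); break;
                case '\'': out.append("&#039;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Plain-Java counterpart of ${clientMap[item.clientId].name}, escaped for output. */
    static String render(Map<Integer, Client> clientMap, Task item) {
        Client client = clientMap.get(item.getClientId()); // clientMap[item.clientId]
        return escapeXml(client.getName());                // .name, then escaping
    }

    public static void main(String[] args) {
        Map<Integer, Client> clientMap = new HashMap<>();
        clientMap.put(7, new Client("<b>Acme</b> & Sons"));
        System.out.println(render(clientMap, new Task(7)));
    }
}
```

The markup in the client's name comes out as harmless text rather than as live HTML, which is the point of escaping.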
This page includes a form that allows us to enter new tasks. One of the most important attributes of the <form> is the “action”, which, of course, tells the form where the data should be sent. I use the <c:url> tag here to generate a URL for us. It may seem silly to use a tag when I simply could have used /timesheet/save-task as the value of the action attribute, but there are some subtle issues in play here, which must be taken into account. First, a Web application can be deployed into any “context path”, which means that the path to the servlet might actually be /my-timesheet/save-task. The <c:url> tag knows where the Web application has been deployed (courtesy of the request object, defined by the Servlet API) and can provide the appropriate path prefix to the URL. Second, <c:url> can encode the URL with a session identifier, which is essential to providing a good user experience for many Web applications. The <c:url> tag is smart enough to omit the session identifier from the URL if the client is using cookies to communicate the session identity to the server, but to include it in the URL as a fallback when cookies are unavailable. Sessions are another handy feature defined by the Servlet Specification, provided by the Servlet Container and accessible via the Servlet API.
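The two jobs described above, prefixing the context path and conditionally appending the session identifier, can be sketched as a small helper. This is a simplified illustration, not the tag's implementation: the class and method names are hypothetical, and in a real container the Servlet API makes the cookie decision for you. The ;jsessionid= path parameter is the form the Servlet Specification defines for URL-based session tracking.

```java
final class UrlSketch {

    /**
     * Builds a link as described in the text: the context path comes first,
     * and the session ID is appended only when cookies are not carrying it.
     */
    static String url(String contextPath, String path, String sessionId, boolean sessionInCookie) {
        StringBuilder url = new StringBuilder(contextPath).append(path);
        if (sessionId != null && !sessionInCookie) {
            url.append(";jsessionid=").append(sessionId);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Same servlet path, two deployments, with and without cookies (sample session ID).
        System.out.println(url("/timesheet", "/save-task", "ABC123", true));
        System.out.println(url("/my-timesheet", "/save-task", "ABC123", false));
    }
}
```

Hard-coding /timesheet/save-task in the page would get both cases wrong: it breaks under a different context path and loses the session for clients without cookies.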
Now that I've covered the display of the timesheet and the form that can be used to submit a new task, let's take a look at the code that accepts this form submission: SaveTaskServlet.java (Listing 5). It implements the “save-task” servlet, which is mapped to the URL /save-task.
The SaveTaskServlet overrides the HttpServlet's doPost method so we can handle HTTP POST messages. It gathers the data from the request, made available through the request object's getParameter method, then creates a Task object and calls a helper method (defined later in the class) called “save”. After saving the new task, the user is redirected to the “tasks” servlet to view the updated list of tasks. Did you notice that the line of code performing the redirect calls response.encodeRedirectURL and prepends the context path to the target URI? This is precisely the tedium that is avoided in JSP files by using the <c:url> tag.
SaveTaskServlet also defines a “save” method that interacts with the database. While none of this code is servlet-oriented, it's instructive to see the power of some of Java's standard APIs. In this case, it's the JDBC API that gives us access to relational databases (Listing 6).
First, this method obtains a connection from a database connection pool and then determines if the Task is being created from scratch or updated (although our UI doesn't offer an “update” method yet, this class has been designed to allow updates). In each case, a parameterized SQL statement is prepared and then filled with data passed in from the calling code. Then, the statement is executed to write to the database, and a new object is passed back to the caller. In the case of a new task, the database-generated primary key is fetched from the statement after execution in order to pass it back to the caller.
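The flow just described can be sketched with nothing but the standard JDBC API. Treat this as an approximation under assumptions rather than the code from the listing: the TaskStore and Task classes, the task table and its column names are all hypothetical, and the connection pool is represented by a javax.sql.DataSource passed in by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

final class TaskStore {

    /** Hypothetical task object; id is null until the database assigns one. */
    static final class Task {
        final Integer id;
        final int clientId;
        final String description;

        Task(Integer id, int clientId, String description) {
            this.id = id;
            this.clientId = clientId;
            this.description = description;
        }
    }

    // Assumed table and column names, for illustration only.
    static final String INSERT_SQL = "INSERT INTO task (client_id, description) VALUES (?, ?)";
    static final String UPDATE_SQL = "UPDATE task SET client_id = ?, description = ? WHERE id = ?";

    /** A task without an id is new; a task with an id is updated in place. */
    static String sqlFor(Task task) {
        return task.id == null ? INSERT_SQL : UPDATE_SQL;
    }

    /** Writes the task and returns a new object carrying its primary key. */
    static Task save(DataSource pool, Task task) throws SQLException {
        boolean isNew = task.id == null;
        int keyMode = isNew ? Statement.RETURN_GENERATED_KEYS : Statement.NO_GENERATED_KEYS;
        try (Connection conn = pool.getConnection();
             PreparedStatement ps = conn.prepareStatement(sqlFor(task), keyMode)) {
            // Parameters are bound, never concatenated into the SQL text.
            ps.setInt(1, task.clientId);
            ps.setString(2, task.description);
            if (!isNew) {
                ps.setInt(3, task.id);
            }
            ps.executeUpdate();
            if (!isNew) {
                return new Task(task.id, task.clientId, task.description);
            }
            // For a new row, fetch the database-generated primary key.
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("No generated key returned");
                }
                return new Task(keys.getInt(1), task.clientId, task.description);
            }
        }
    }

    public static void main(String[] args) {
        // No database is needed to see which statement each case would prepare.
        System.out.println(sqlFor(new Task(null, 1, "Write article")));
        System.out.println(sqlFor(new Task(42, 1, "Revise article")));
    }
}
```

The try-with-resources blocks close the statement and return the connection to the pool even when an exception is thrown.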
Under normal circumstances, methods such as “save” would be split out into a separate class for easier organization, testing and architectural separation, but I've left them in the servlet classes for simplicity.
The example's full source code and prebuilt WAR file are available from the Linux Journal FTP server (see Resources), and I encourage you to download them and play around with them. I've also included quick installation instructions for Java and the Apache Tomcat servlet container, which will be required to run the example application.
Often, Perl and PHP-based Web applications are composed of self-contained scripts that perform one task: loading and displaying tasks, for instance. This kind of thing is entirely possible using nothing but JSPs. There are tag libraries that perform SQL queries, and you even can write Java code directly into a JSP, although I haven't covered it here because it's not necessary with the rich tools provided by the JSTL. On the other hand, there are some philosophical and practical reasons not to stuff everything into a single JSP. Most (Java) programmers subscribe to the “model-view-controller” architecture, where code is separated into logical units that model your problem domain (that would be the Task and Client objects in our example), provide views of your data (that's our JSPs) and control program flow (the servlets). This architectural separation actually leads to quite a few practical benefits, including:
Easier code maintenance: separation promotes code re-use and simplifies automated testing.
Error handling: if the controller is the only component likely to fail (due to bad input, a database connection failure and so on), you don't have to worry about the view component failing during rendering and ruining your output.
Most Java projects are going to be split up in this way, so I wrote my example to illustrate this architecture, and I hope you consider using this architecture in your Java projects too.
Adding Java to your repertoire for building Web applications gives you access to the built-in services guaranteed by the Servlet Specification as well as a plethora of high-quality third-party libraries. Servlet containers provide many services useful to your Web applications through simple configuration and/or APIs. Java Server Pages can be used to build complex Web pages quickly while keeping business logic out of them. The Servlets you write to implement your business logic have full access to many APIs for just about anything you can think of. The power of Java Web applications and the stability and scalability of Linux can be combined into a platform on which many high-quality on-line services are built, including mine. I hope I've given you a taste of how easy it is to create a robust and useful Java Web application using the tools provided by the Java Servlet Specification, and that you consider using Java for your next Web application.