The Filter API can be used to implement an access control system that allows or disallows access based on the requested URL. The filtering system passes every web browser request through the RequestFilters before sending it to the intended web server. A RequestFilter can disallow access to a URL by throwing a Java FilterException. The filtering system catches this exception and discontinues processing of the request. To inform the web browser that the request was rejected, the filtering system sends the browser an HTTP error response that includes the reason for the rejection. An example of a RequestFilter that rejects requests is shown in Figure 5.3. In this example the call to reject(url) is assumed to look up the requested URL in a database and return true if the URL should be rejected. Regular expressions can also be used to allow or reject URLs that match a specific pattern.
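The rejection mechanism described above can be sketched roughly as follows. This is only an illustration and not the code of Figure 5.3: the RequestFilter interface, its filterRequest method, and the FilterException constructor shown here are hypothetical stand-ins, since the actual Filter API signatures are not given in this section. The database lookup is replaced by an in-memory set, and the pattern list shows how regular expressions might be combined with exact matches.

```java
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Filter API's exception type.
class FilterException extends Exception {
    FilterException(String reason) {
        super(reason);
    }
}

// Hypothetical stand-in for the Filter API's request filter interface.
interface RequestFilter {
    void filterRequest(String url) throws FilterException;
}

class AccessControlFilter implements RequestFilter {
    // Assumption: an in-memory set replaces the database lookup.
    private final Set<String> blockedUrls;
    private final List<Pattern> blockedPatterns;

    AccessControlFilter(Set<String> blockedUrls, List<Pattern> blockedPatterns) {
        this.blockedUrls = blockedUrls;
        this.blockedPatterns = blockedPatterns;
    }

    // Returns true if the URL is listed exactly or fully matches a pattern.
    boolean reject(String url) {
        if (blockedUrls.contains(url)) {
            return true;
        }
        for (Pattern pattern : blockedPatterns) {
            if (pattern.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    // Throwing the exception signals the filtering system to stop
    // processing the request; the message carries the reason.
    @Override
    public void filterRequest(String url) throws FilterException {
        if (reject(url)) {
            throw new FilterException("Access to " + url + " is not allowed");
        }
    }
}
```

In this sketch the exception message plays the role of the rejection reason that the filtering system would include in its HTTP error response.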
The access control system based on RequestFilters can be improved with a ContentFilter. Since almost all URL requests originate from hypertext links in HTML documents, a ContentFilter can be used to process all HTML documents and remove any hypertext links that refer to URLs that will be rejected. The benefit of removing such links is that entire pieces of a web page that link to a rejected URL can be removed along with them. For example, a web page may contain an image that, when clicked, causes the web browser to visit a web site. If the URL for the web site is rejected, the image can also be removed. This effectively removes all traces of the rejected URL from the HTML document.
An example of how a ContentFilter can remove HTML tags is shown in Figure 5.4. In this example an IMG tag is enclosed in a hypertext anchor (A) tag. When a user clicks on this image, the web browser displays the web page specified in the HREF attribute. However, when a ContentFilter is used to filter this HTML, it sees that the HREF in the A tag links to a rejected URL and removes the A tag and all of its contents, up to and including the closing /A tag.
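A minimal sketch of this kind of link removal is given below. It is not the code behind Figure 5.4: the ContentFilter interface and its filterContent method are hypothetical, the rejection test is passed in as a predicate, and a regular expression is used in place of a real HTML parser, so nested or malformed anchors are not handled.

```java
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Filter API's content filter interface.
interface ContentFilter {
    String filterContent(String html);
}

class LinkRemovingFilter implements ContentFilter {
    // Matches an A tag with an HREF attribute, its contents, and the
    // closing /A tag. Group 1 captures the HREF value.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']?([^\"'\\s>]+)[\"']?[^>]*>.*?</a\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Assumption: the same rejection test used by the RequestFilter.
    private final Predicate<String> rejected;

    LinkRemovingFilter(Predicate<String> rejected) {
        this.rejected = rejected;
    }

    // Deletes every anchor whose HREF is rejected, including everything
    // between the opening and closing tags; other anchors are kept as-is.
    @Override
    public String filterContent(String html) {
        return ANCHOR.matcher(html).replaceAll(match ->
                rejected.test(match.group(1))
                        ? ""
                        : Matcher.quoteReplacement(match.group()));
    }
}
```

Because the whole match is deleted, an IMG tag enclosed in a rejected anchor disappears together with the link, which is the behaviour described for the example above.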