HIP 3.04 behind a reverse proxy

As mentioned in this post, I’ve rejigged our HIP server so that it has an Apache instance running in front of it that’s acting as a reverse proxy.
Just in case anyone is interested in going down the same route, here’s how I did it with Apache and mod_perl…

Why?
Well, as previously mentioned, I wanted to make the dynamic content being served by our Apache server (running on port 4128) appear to be coming from the same server as the HIP installation (running on port 80).  Both web servers run on the same physical hardware.

Also, as anyone who’s on the iPac mailing list will know, the issue of security keeps popping up every now and then.  To the best of my knowledge (and I’d love to be corrected on this!), SirsiDynix don’t release any security patches for the JBoss/Jetty web server that powers HIP.
By having either a dedicated proxy (such as Squid) or a web server running as a reverse proxy in front of your HIP server, you can help protect the HIP server from malformed URLs that could be attempts to compromise it.  Anyone who’s ever watched the JBoss/Jetty console errors will probably have seen it choking on ugly-looking URLs.
Another possible use is to move your HIP to a different physical server, but have the proxy/reverse proxy make it appear (to the outside world at least) that your HIP server is still in the original place. That must surely be part of every IT magician’s act (or magITian?).
You may also be able to set up SSL on the Apache server to provide a secure method of delivering your OPAC pages.
If you hunt around on Google, you’re bound to find a lot more information.

How?
The simplest method is to make Apache the server listening on port 80, move HIP to another port, and then configure Apache to act as a reverse proxy for requests to the OPAC.

The first thing you need to do is to move HIP to a different port (e.g. port 81).  The easiest way I’ve found of doing this is to modify the port number in the ./appserver/jboss/bin/run.bat file – HIP 3.04 has the following:
rem Setup HIP Instance sepecific properties
set HIP_OPTS=-Djetty.port=80 -Djetty.admin.port=222
set HIP_INST=-c default

…so, just change that to your new port number:
rem Setup HIP Instance sepecific properties
set HIP_OPTS=-Djetty.port=81 -Djetty.admin.port=222
set HIP_INST=-c default

Next, install Apache 2 with mod_perl 2.0 and configure it (using the httpd.conf file) to listen on the default port 80:
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, instead of the default. See also the <VirtualHost>
# directive.
#
# Change this to Listen on specific IP addresses as shown below to
# prevent Apache from glomming onto all bound IP addresses (0.0.0.0)
#
#Listen 12.34.56.78:80
Listen 80

Next, uncomment the following proxy modules:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_connect_module modules/mod_proxy_connect.so
LoadModule proxy_http_module modules/mod_proxy_http.so

If you haven’t already done so, you’ll need to tell Apache to use mod_perl:
LoadModule perl_module modules/mod_perl.so
Then add the proxy directives:
ProxyRequests Off
ProxyPass        /ipac20   http://127.0.0.1:81/ipac20
ProxyPassReverse /ipac20   http://127.0.0.1:81/ipac20
ProxyPass        /hipres   http://127.0.0.1:81/hipres
ProxyPassReverse /hipres   http://127.0.0.1:81/hipres

…you can change 127.0.0.1 if you need to.
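ProxyRequests Off is very important for security reasons: it stops Apache from acting as an open forward proxy for arbitrary sites. You also need to include both a ProxyPass and a ProxyPassReverse for each path, so that any Location headers returned by HIP are adjusted to point at Apache rather than at the HIP port.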
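To make the ProxyPassReverse behaviour a little more concrete, here is a small standalone Python sketch of how a Location header coming back from HIP would be mapped onto the public address. It is not part of the Apache configuration and is only a simplified model of what mod_proxy does. The host name opac.example.org and the function name rewrite_location are made up for illustration.

```python
# Simplified model of ProxyPassReverse for the /ipac20 mapping.
# BACKEND_PREFIX matches the ProxyPassReverse line in the post;
# PUBLIC_PREFIX uses an invented host name (assumption).
BACKEND_PREFIX = "http://127.0.0.1:81/ipac20"
PUBLIC_PREFIX = "http://opac.example.org/ipac20"


def rewrite_location(location: str) -> str:
    """Return the Location header as the browser should see it."""
    if location.startswith(BACKEND_PREFIX):
        # Swap the backend prefix for the public one, keep the rest.
        return PUBLIC_PREFIX + location[len(BACKEND_PREFIX):]
    # Redirects to anywhere else are passed through untouched.
    return location


if __name__ == "__main__":
    print(rewrite_location("http://127.0.0.1:81/ipac20/ipac.jsp?profile=xyz"))
```

Without the ProxyPassReverse line, a redirect like the one in the example would send the browser straight to port 81 and bypass Apache.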
If you want Apache to be able to redirect a request for the root document to a default profile, then you’ll need to uncomment the rewrite module:
LoadModule rewrite_module modules/mod_rewrite.so
…and add something like this:
RewriteEngine on
RewriteRule   ^/$   /ipac20/ipac.jsp?profile=xyz   [R]

…where xyz is the name of the relevant HIP profile.
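As a rough illustration of what that rule does, the following standalone Python sketch mimics it: only a request for exactly / is redirected, and every other path is left alone. The function name root_redirect is hypothetical, and the 302 status reflects the default for the [R] flag.

```python
import re
from typing import Optional, Tuple

# Same pattern as the RewriteRule: the path must be exactly "/".
ROOT_ONLY = re.compile(r"^/$")


def root_redirect(path: str, profile: str = "xyz") -> Optional[Tuple[int, str]]:
    """Return (status, target) for a root request, or None for any other path."""
    if ROOT_ONLY.match(path):
        return (302, f"/ipac20/ipac.jsp?profile={profile}")
    return None


if __name__ == "__main__":
    print(root_redirect("/"))
    print(root_redirect("/ipac20/ipac.jsp"))
```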
At this point, you should be able to request a HIP page from Apache and it will proxy the relevant page from the HIP server.  The only catch is that HIP generates an awful lot of absolute URLs, so you get links in the page that reference port 81.
To correct this, you need to get Apache to tweak the HTML coming from HIP and nuke the port number.  There are probably quite a few ways of doing this, but here’s how you do it with mod_perl…
One of the wonderful things about Apache and mod_perl is that you can do cool stuff like writing filters that parse the HTML just prior to Apache sending it to the user.
So, create a new Perl module in the directory where the other Apache 2 modules are stored (e.g. /usr/site/lib/Apache2) and name it ModProxyHIP.pm:
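package Apache2::ModProxyHIP;

use strict;
use warnings;

use base qw(Apache2::Filter);
use Apache2::Const -compile => qw(OK);

# Number of bytes to read from the filter on each pass
use constant BUFF_LEN => 1024;

sub handler : FilterRequestHandler
{
  my $f = shift;

  # Collect the data passed to this invocation of the filter
  my $output = '';
  while ($f->read(my $buffer, BUFF_LEN))
  {
    $output .= $buffer;
  }

  # Strip the backend host and port from plain and URL-encoded links
  # (the dots are escaped so that they only match a literal ".")
  $output =~ s|http://127\.0\.0\.1:81||g;
  $output =~ s|http%3A%2F%2F127\.0\.0\.1%3A81||g;

  $f->print( $output );
  return Apache2::Const::OK;
}

1;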

…you might need to tweak those two references to 127.0.0.1 accordingly.
(I should point out that the above module is very simplistic and you might want to look at this page for ideas about improving it)
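One possible improvement, sketched here in standalone Python purely to show the logic: as far as I understand it, a mod_perl output filter can be invoked several times for a single response, so a backend URL that happens to be split across two invocations would slip through a simple search and replace. Holding back a short tail of each chunk until the next one arrives avoids that. The function name strip_backend_urls and the chunking in the example are assumptions for illustration, and the idea would still need porting to the Perl filter (for example by keeping the tail in the filter's context).

```python
from typing import Iterable, Iterator, Sequence

# The two strings removed by the Perl filter; adjust to match your backend.
NEEDLES = ("http://127.0.0.1:81", "http%3A%2F%2F127.0.0.1%3A81")


def strip_backend_urls(chunks: Iterable[str],
                       needles: Sequence[str] = NEEDLES) -> Iterator[str]:
    """Remove each needle from a chunked stream, even across chunk boundaries."""
    # A needle that is only partly received can be at most this long.
    keep = max(len(n) for n in needles) - 1
    pending = ""
    for chunk in chunks:
        pending += chunk
        for needle in needles:
            pending = pending.replace(needle, "")
        # Emit everything except a tail that might be the start of a needle.
        cut = len(pending) - keep
        if cut > 0:
            yield pending[:cut]
            pending = pending[cut:]
    # End of stream: nothing more can arrive, so flush the held-back tail.
    yield pending


if __name__ == "__main__":
    html = '<a href="http://127.0.0.1:81/ipac20/ipac.jsp">x</a>'
    parts = [html[:20], html[20:]]  # the split lands inside the URL
    print("".join(strip_backend_urls(parts)))
```

The trade-off is a small delay in sending the last few bytes of each chunk, in exchange for not missing URLs that straddle a boundary.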
Next, add the following to the Apache 2 httpd.conf file:
<Location /ipac20>
  PerlOutputFilterHandler Apache2::ModProxyHIP
</Location>

That basically tells Apache to use your new module to filter the output from all requests that start with /ipac20.
If you’ve got Apache and HIP running on the same server, then you might want to get rid of the previous /hipres lines:
ProxyPass        /hipres   http://127.0.0.1:81/hipres
ProxyPassReverse /hipres   http://127.0.0.1:81/hipres

…and replace them with something like:
Alias /hipres/ "/appserver/jboss/server/default/deploy/hipres.war/"
…which means Apache will deliver the static files directly.
There you have it: Apache 2 acting as a reverse proxy for your HIP server!
Just to summarise, here’s what happens:

  • user requests a web page from your OPAC by sending the URL to port 80
  • Apache server accepts the request and proxies the request back to your HIP server
  • HIP server generates the output and passes it back to Apache
  • Using the mod_perl filter, Apache parses the output to remove references to your HIP server’s port number
  • Apache returns the final HTML to the web browser

If you let Apache deliver content for the /hipres directory, then you also get the added benefit of HIP not having to waste time delivering the static content (i.e. stylesheets and images).
Any malformed URLs should get handled by Apache (which you can more easily keep patched and secure) – e.g.:
https://library.hud.ac.uk/I_LOVE_HIP!!!!
The usual caveats apply:

  • this worked fine for us with HIP 3.04 (UK release), but may cause your HIP install to implode
  • SirsiDynix Support will probably laugh at you if you have any problems
  • try it on a test HIP server first!!!

Once I’d got the config sorted out on a test server, it took less than 60 seconds to drop the new configuration changes onto our live server and restart the Apache and JBoss/Jetty services.

9 thoughts on “HIP 3.04 behind a reverse proxy”

  1. Hey Dave, I am trying to set up mod_proxy to HIP too, and started from your directions… but I think I found a much, much simpler way to do it.
    I _think_ that if you just put:
    ProxyPreserveHost On
    in your apache conf, you no longer need that Perl filter at all. Everything Just Works. It appears to. I almost didn’t believe it could be this easy, but I’m using it, it seems to be working…
    So if anyone wants to give that a try, avoid the Perl filter and just use ProxyPreserveHost On. It’s worth a try, and if anyone does and finds it is successful, let me know!

  2. Having trouble getting a PerlOutputFilterHandler to work, it’s generating errors on the perl side. I think maybe I’m missing something I need to install for perl, the Apache2 module maybe? Anyone have any ideas?

  3. This is very interesting, and I have given it a try. However, I think you still need the Perl filter even if you use ProxyPreserveHost On, because there are still a lot of links to the JBoss port. We will give it a try and let you all know. We have tried this in the past, and we were 50% successful, but with this Perl module, we will see if we can improve it. Thanks a lot for sharing.

  4. Just a thought here.. what did you do to the admin port ? 222 ..
    in my test lab, I have proxied it too.

  5. Hi Simon
    We left the admin port unproxied and inaccessible to the open web.
