OpenWrt Forum Archive

Topic: Captive Portal and Web Server Cache Problems

The content of this topic has been archived on 20 Apr 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

We have a program called the Public WiFi Project, through which we help small downtown associations deploy free public internet.  One of the features we're trying to implement is causing some problems.  I'm a functional developer - I work with the concepts, and a gentleman I work with codes it.  I wanted to explain a problem we're having and see if there are any suggestions.

We've developed a localized Captive Portal (one that doesn't require a remote server).  Since our implementations are intended for small downtowns, which really aren't technically savvy, the system has to be self-contained.  Our Captive Portal is a very simple implementation, and its purpose is to create an autonomous Captive Portal solution (one that depends only on the router).  The intent is for the community to promote its services (or the location's services) using this Captive Portal, not necessarily to perform any true authentication.  Let me start by saying it works 99% of the time, and we're really happy with its simplicity, but it has a problem with caching.

Here's the way it works. 

Using iptables, all port 80 traffic is directed to a specified web server (all other ports are blocked) - we installed a second web server on the router for this.  For testing purposes, though, we could specify any web server (for example, one on a remote machine).  The web server the packets are directed to is configured to always respond with the same page.  The page it responds with is on the router itself (but it can be anywhere, as long as it's in the "allow list" or "walled garden").
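
For reference, the redirect boils down to something like this (the interface, chain name, and port here are illustrative, not our exact config; the rules blocking the other ports are omitted):

    # Create a chain for uncleared clients and send LAN HTTP traffic through it
    iptables -t nat -N portal
    iptables -t nat -A PREROUTING -i br-lan -p tcp --dport 80 -j portal
    # Default: redirect port 80 to the second web server listening on 8080
    iptables -t nat -A portal -p tcp --dport 80 -j REDIRECT --to-ports 8080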

The page is nothing more than simple HTML with a link to a shell script on the router.  The theory is that the client gets this page (maybe a welcome message specific to that location) and then clicks the link to proceed - the same basic function as any Captive Portal.  When the shell script is executed, that person's IP address is added to a second table (the cleared list) that is excluded from the block.  From that point on, their traffic is managed and routed as it normally would have been.
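
The clearing step is only a few lines of shell.  As a sketch of the idea (assuming a CGI-capable web server; the chain name matches the sketch above, and REMOTE_ADDR is the standard CGI variable carrying the client's IP):

    #!/bin/sh
    # Insert the client ahead of the REDIRECT rule; ACCEPT in the nat
    # table stops the redirect, so this client browses normally from now on
    iptables -t nat -I portal -s "$REMOTE_ADDR" -j ACCEPT
    # Send back a confirmation page
    echo "Content-Type: text/html"
    echo ""
    echo "<html><body>You're connected - enjoy the free WiFi.</body></html>"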

I'm not expecting that any of this is new, but the way the web server responds is what keeps this from working 100% of the time.  With some pages, the client's cache interferes with the operation.  Here's an example:

When a client first opens their browser, assuming their home page is "www.google.com", iptables reroutes the request to the router's second web server, which responds with local.html.  That HTML ends up being stored in the client's cache against www.google.com's entry (I believe it's tied to the resolved IP).  The client clicks the link to the shell script, and they are "unlocked" from this basic captive portal.  The problem arises when the client requests "www.google.com" again: the computer pulls local.html out of cache every time.  The only way around it is to press Ctrl+F5 or clear the cache.  To the consumer, it looks like they have to re-authenticate in the captive portal (in other words, click the link to the shell script again), but that's not the case - it's just a cache issue.

We have tried every HTML mechanism for expiring the cache - no-cache, Expires with content="0", and so on - in local.html.  Nothing seemed to have any impact.  We've exhausted the HTML developer forums.
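
For the record, these are the kinds of meta tags we tried (this list is representative, not exhaustive):

    <meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate">
    <meta http-equiv="Pragma" content="no-cache">
    <meta http-equiv="Expires" content="0">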

We finally decided to run a simple test: what if the web server we sent packets to wasn't the one on the router, but a more feature-rich web server on one of our Linux boxes - would we have the same problems?  We configured that, and we did not.  The page expired just as our tags indicated, and the cached page was no longer served for www.google.com.  This tells me it has something to do with the web server.

We then made sure all the headers from both web servers (the second web server on the router and the one on the Linux server) were identical, sending the same hard-coded expiration and last-modified headers, and that didn't solve the cache problem.
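
In other words, both servers were sending responses along these lines (the values shown are illustrative, not our exact headers):

    HTTP/1.1 200 OK
    Cache-Control: no-cache, no-store, must-revalidate
    Pragma: no-cache
    Expires: Thu, 01 Jan 1970 00:00:00 GMT
    Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
    Content-Type: text/html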

I also know this issue doesn't happen with all web sites (for example, Yahoo and CNN don't have these caching problems).  Additionally, there are some computers on which we can't duplicate the problem (some local setting prevents it).

So my questions are: 

Has anyone seen this problem before and solved it?
Is there something missing from the basic web server packages on OpenWrt that would normally prevent this caching?
Is there a developer who has time to help with this problem?  (We do have a budget for development.)

I have looked at many of the existing captive portal solutions (WiFiDog, ChilliSpot, etc.), but they are all much more than we wanted (more overhead).  I believe what we're trying to do here is very basic and should work; it's just this one little cache issue that's affecting it (otherwise, every other aspect of this Captive Portal solution is perfect).

I would appreciate your thoughts. 

Jim

Did you try an HTTP "moved temporarily" (302) redirect?  It's not 100% bulletproof, but it's the most reliable approach I've seen.
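
The trick is that browsers don't cache a 302 by default, so nothing gets stored against www.google.com.  If your web server supports CGI, it can be as simple as this sketch (the portal address is just an example):

    #!/bin/sh
    # Answer every request with a temporary redirect to the portal page
    echo "Status: 302 Moved Temporarily"
    echo "Location: http://192.168.1.1/local.html"
    echo ""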

Did you investigate clock issues?  Most routers don't have a battery-backed clock, so the date on the router will usually be 01/01/1970 after every boot.

Thanks for those thoughts.  Regarding the clock issue, we run "rdate" at startup and then via cron.  It usually puts us within a second or two of our servers, so from what we can tell, we're OK on the time issue.
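
Roughly what we run (the time server name here is just a placeholder):

    # in /etc/rc.local, once at boot:
    rdate -s time.example.org
    # and an hourly crontab entry to stay in sync:
    0 * * * * rdate -s time.example.org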

Regarding the "moved temporarily" HTTP redirect, we didn't think this simple web server could do that, but we're looking into it right now.  It's a good idea - I'll post the results of this attempt.

Thanks again.

Jim

The discussion might have continued from here.