Commentary

David Berlind
 

Is There A Non-Persistent Middle Ground For Offline Browsing?

Redmonk analyst Stephen O'Grady and I have been batting the challenges of persistence in Web browsers back and forth over the last week. Obviously exasperated with something, he tweeted (via Twitter) how he'd like to see a browser have the ability to recall recently visited Web pages from a local cache of some sort. I immediately replied, thinking what he was really looking for was off-line persistence of a Web application and, after Twitter failed us in the ability to carry the dialog, we took to the blogs (him, me) to continue the conversation.

Offline persistence of anything in the browser -- pages, applications, data, etc. -- is an incredibly tricky problem. So much so that we'll be having an open conversation about it at Mashup Camp in less than two weeks (there's still room to join us).

O'Grady makes it clear that he's not looking for the whole enchilada -- the ability to interact with Web pages while offline as though he were online. He's after something that's not quite so ambitious -- just get him the old page(s) back (with whatever data he may have last filled into them) regardless of the browser's state of connectivity. Presumably, that information is either in the browser's cache already or something can be done to make sure it's there and easily retrieved. I agree. If this is easily done, then someone please do it. It would go a long way toward improving the overall Web experience for a lot of us.


But is it easily done? Or, better put, is it trivial enough to warrant the effort, or is it so non-trivial that shooting for the moon -- offering the same capability as a part of a Google Gears-style off-line persistence of Web apps -- makes more sense? O'Grady blogged my question and his response:

[DB:] "thanks in part to Gears and similar technologies from Adobe, Apache, Oracle, and Sun, we know more is possible. So, why accept less?" - [SO:] "because it's likely to arrive far more quickly."

I'm not so sure, and I said as much directly on his blog in the following comment:

Regarding your answer "because it's likely to arrive far more quickly", is that really true? In this case, the science of caching form data in a way that's recoverable during some future session is nearly as difficult as solving the persistence problem in general. A modicum of structure would be required and, as you know, the majority of form-based HTML pages lack structure.

With such a lack of structure to these pages, the only approach might be something like a plug-in that's the equivalent of an autosave (like what Gmail does) but that saves to the local hard drive, where the data can be retrieved later. But I suspect that even that is almost as complicated as the persistence problem.
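
A minimal sketch of what such an autosave might look like, in TypeScript. Everything here is an assumption made for illustration: the `FieldLike` and `StoreLike` shapes, the function names, and the `draft:` key prefix are hypothetical and do not come from Gears or any particular plug-in. In a browser, an input element and `window.localStorage` would satisfy the two shapes structurally.

```typescript
// Hypothetical shapes: just enough of a form field and of a key/value
// store for this sketch.
interface FieldLike {
  name: string;
  value: string;
}

interface StoreLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Copy the current value of every named field into a plain object.
// Fields without a name are skipped because nothing identifies them later.
function snapshotFields(fields: FieldLike[]): { [name: string]: string } {
  const snapshot: { [name: string]: string } = {};
  fields.forEach((field) => {
    if (field.name !== "") {
      snapshot[field.name] = field.value;
    }
  });
  return snapshot;
}

// Persist the snapshot under a per-page key (for example, the page's URL).
function autosave(store: StoreLike, pageKey: string, fields: FieldLike[]): void {
  store.setItem("draft:" + pageKey, JSON.stringify(snapshotFields(fields)));
}

// Read the last saved snapshot back, or null if nothing was saved.
function loadDraft(store: StoreLike, pageKey: string): { [name: string]: string } | null {
  const raw = store.getItem("draft:" + pageKey);
  return raw === null ? null : JSON.parse(raw);
}
```

Even this small version leans on structure the page may not have: it only works for fields that carry a stable `name`, which is exactly the kind of assumption that breaks down on loosely structured forms.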

For example, go to any page with a form on it (even the one to fill out a comment on your blog), fill some of the form out, and then choose File > Save and save the file to your local hard drive. Now, open that file with File > Open and you'll notice that whatever you filled into the form is no longer there.
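
One likely reason, sketched here as a toy model in TypeScript rather than as a description of any particular browser's save routine: what the user types is held as runtime state, separate from the `value` attribute in the page's markup, and a save that writes out markup only sees the attribute. `ToyInput` and `saveAsMarkup` are hypothetical names invented for the illustration.

```typescript
// Toy model of a text input (not a real DOM type).
interface ToyInput {
  name: string;
  valueAttribute: string; // what the page's HTML says
  currentValue: string;   // what the user has typed since the page loaded
}

// Assumption for this sketch: "save" writes markup built from attributes only.
function saveAsMarkup(inputs: ToyInput[]): string {
  return inputs
    .map((input) => `<input name="${input.name}" value="${input.valueAttribute}">`)
    .join("\n");
}

// The user has typed a name into a field that the page shipped empty.
const commentForm: ToyInput[] = [
  { name: "author", valueAttribute: "", currentValue: "Stephen" },
];

// The saved markup still carries the empty attribute; the typed text is gone.
const savedMarkup = saveAsMarkup(commentForm);
```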

State in combination with lack of structure (even though the form seems structured) is most definitely an issue. The more I think it through, the more I realize how it's really a thorny problem.

Process-per-tab browsers (like Chrome) might be a part of the answer in that each tab could run in its own shell and those shells in turn could be capable of running some code against the currently loaded page in a way that doesn't interfere with the HTTP server's understanding of the page's state. I'm thinking "screen scraping" tech that independently (of the web server) creates its own last known state for every tab.

One question if this were working: would ordinary users expect the cached page to be able to inject the recovered information into the "real pages" when they're available? You're a power user. The idea that you might be able to pull the cached page back and copy & paste some information somewhere so it doesn't get lost isn't exactly a great user experience. Most people would get to that recovered page and ask "Now what?" There would be an expectation that the recovered page could inject the data back into the real page (when it's available), in which case a significant amount of flexible structure and intelligence would have to be incorporated into the solution: an architecture that's remarkably close to being a persistence mechanism.
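
To make the injection problem concrete, here is a hedged TypeScript sketch of the simplest possible approach: match recovered values to the live page's fields by name. The `FieldLike` and `RestoreReport` shapes and the `injectDraft` function are hypothetical, and the sketch only covers text-like fields.

```typescript
// Hypothetical shape: just enough of a form field for this sketch.
interface FieldLike {
  name: string;
  value: string;
}

interface RestoreReport {
  restored: string[]; // saved names that were written into a live field
  orphaned: string[]; // saved names with no matching field on the live page
}

// Write recovered values into the live page's fields, matching by name.
function injectDraft(
  draft: { [name: string]: string },
  liveFields: FieldLike[]
): RestoreReport {
  const report: RestoreReport = { restored: [], orphaned: [] };
  Object.keys(draft).forEach((name) => {
    const saved = draft[name];
    if (saved === undefined) {
      return;
    }
    const matches = liveFields.filter((field) => field.name === name);
    if (matches.length === 0) {
      // The live page has changed, or never named its fields consistently.
      report.orphaned.push(name);
      return;
    }
    // Use the first match; duplicate names are out of scope for this sketch.
    matches[0].value = saved;
    report.restored.push(name);
  });
  return report;
}
```

Anything that lands in `orphaned` is data the user still has to rescue by hand, and handling those cases well (renamed fields, multi-step forms, fields generated by script) is where the extra structure and intelligence would be needed.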

To be honest, I'm pretty sure most of what I'm saying is technically accurate (I mistakenly left out my concerns about security with such an architecture). But, by the same token, things under the hoods of browsers may have progressed to a point where I need to be taken out to the woodshed for a technical spanking. I've only managed to keep half an eye on the situation over the last year and know things have improved dramatically.

Or perhaps the answer is a bit simpler. For example, why, when I hit my browser's back button, am I sometimes returned to a Web form that still has all my user-entered data in it while at other times, hitting the back button takes me to a completely blank form (as though I never entered anything)? I most often notice this in e-commerce situations. Then again, when the browser's back button is involved, the implication is that the original form was submitted (thereby updating the page's state).

If you have thoughts on this, please do share them with us.

