It seems to be time to move on. Edwin of Collaxa sold his company to Oracle. I've left BEA and moved to Google. I've wanted for a long time to work in a world where software is delivered as a service and to hundreds of millions of people. I'm going to change the subject of this blog to more general ruminations of various sorts.
During the last year, I was writing primarily about what Web Services and mobile communications would mean to browsing. I postulated that web services meant that information, rather than content, could flow from the Internet to the client to be formatted on the client side, and that because of the intermittent nature of mobile communications and/or just plain latency and availability issues, this would be useful regardless, as long as the data were cached on the client. I'm not going to write about this anymore. I was the victim of some rather unpleasant spam against the site, so I've closed the entries, deleted the ones where cleaning up the spam was more trouble than it was worth, and included, below, a summary of most of the old entries.
Authored 7/25/2003
More Later
Aaron Swartz points out to me that Tim Bray has argued that people don't want rich clients.
In general, I agree with Tim. Steve Jobs used to argue that a PC needed to be as easy to use as a radio or telephone (back when they were easy to use), and when the web came along I was surprised that Steve didn't notice that the future was here. Indeed, I said so at Microsoft in 1996/7, when saying such a thing there was hugely unpopular. Personally I like the web. It is easy, doesn't require arcane knowledge to use, … So, in terms of user interface, I think it is generally the right solution. I'm always surprised when I hear people disparage the web as a poor or limited or dumbed-down UI, as Macromedia does and as Laszlo and Altio and Nexaweb and others have sometimes done.
Limitations:
But there is no doubt that there are two things it isn’t particularly good at:
Offline Access. IE’s attempt to make sites go on working offline is a joke. Offline requires a new paradigm.
Push. If information needs to flow to the user more or less as it occurs, then the browser paradigm isn't always the best model, at least without help.
All by themselves, I'd argue that the gain outweighs the pain. But add in the occasionally or poorly connected world of mobile devices, and the more limited set of UI gestures that really work on those devices, and I think the game gets more interesting.
More later.
Authored 7/30/2003
Web service browser
Judging from some of the comments I’ve received, I need to be more careful with my terminology. Having heard this since 6th grade, I’m not surprised. This entry will try to define some basic terminology and then discuss how we would recognize a Web Services Browser if we saw one.
OK. When I say "Rich Client" from now on, I don't mean a VB look and feel, let alone a Flash look and feel. Macromedia and the inimitable Kevin Lynch do that really well (a lot better than Microsoft) and I'll just stipulate that this can be the right thing. What I'm more interested in, though, is something that has the authoring simplicity and model of pages. If you can build HTML, you should be able to build this. You should be able to use your editor of choice. Indeed, as much as possible, this should be HTML. I mean 3 things:
1) The user can quickly and easily navigate through information in ways that the site providing the information didn’t expect
2) The user can run offline as well as online
3) If information changes and the user is connected, then the changes can be “pushed” into the UI if so desired
When I talk to customers about why on earth they are still using Java Applets or VB to build user interfaces, frequently the reasons are that they can't do these three things otherwise.
Similarly, when I say “Web Services” I don’t mean that all of the SOAP crap is required. I mean that a service exposes some way to send/receive XML messages to/from it. REST is fine if it works (more on REST in another entry).
So, to summarize, a Web Services Browser is a browser that can access information published as XML messages by services, lets the user interact in a rich and graceful way with this information or these services, and runs well, in terms of interaction, whether the user is online or offline.
Let me give an example. Suppose I'm managing a project and I want to be able to review the project status while on the road. I want to be able to sort by task priority, or by employee, or by how late the task is. I want to be able to update priorities, add comments, add tasks, and so on. If I am online, say at Starbucks or the airport or the hotel, then I want to be as up to date as possible and I want all the changes I've made offline to percolate back to the project. If I'm online and information changes in the service, I want to see the changes immediately flow into the page I'm viewing if they're relevant to it, so that if, for example, I've been viewing a summary dashboard-style page and some tasks get updated, I can see at a glance that they have changed. If I switch machines to my mobile PDA or just connect in through someone else's computer or an Internet kiosk, I still want to see/update the information, of course.
Through all of this, I want the authoring model to be more or less what I already know. That's the vision.
Authored 8/7/2003
Detour
Reading all the comments on my last posting, I’m going to give up and detour for an entry about “Rich UI” before getting into the heart of how one might build a web services driven browser and why. It is clear that my comments on Rich UI were both overly facile and unclear.
Sean Neville of Macromedia has posted a really excellent comment to the last posting and Kevin Lynch has an excellent white paper which I encourage readers to look at.
One of Sean's key points is that Macromedia focuses more on the richness of data and on the richness of the interaction with the user than on the widgets per se. Kevin Lynch has a great demo of the forthcoming Central which makes the same point. When you see it, you are overwhelmed by how gracefully the media fit into the application and appear to be an integral part of it even as they are being dynamically fetched from the server. I'm excited about the work Macromedia is doing here. I think it is great work and can substantially enrich the web experience. I was also impressed with the Laszlo presentations (which at least when I saw them sat on top of Macromedia's Flash engine) and with Altio, which had its own Java rendering engine. In short, I'm not against "Rich UI". Why would I be? I got into this field years and years ago when I fell in love with the Xerox PARC work and set out with partners like Eric Michelman and Andrew Layman to build the first project manager with a graphical user interface. Later Eric and I split off and teamed up with Brad Silverberg to build Reflex, one of the world's first databases with a graphical user interface. None of this work, of course, used media itself as a type with the dazzling richness and aplomb that the new Macromedia Central demos do. But they shared the excitement and vision.
But what excites me about Central isn't so much the media as other aspects of it. It has agents which intelligently look in the background for changes and let me know. It can run offline as well as online. It can have context, which is so important for many mobile applications. It supports collaboration and direct access of information from the Internet. Its user interface makes use of the ability to ask for and send information to the Internet dynamically to enable a much more interactive feel. So, it is reasonable to ask, in what possible way isn't this already a complete realization of the vision I've been discussing?
In most ways it is. And of course, in many ways it is a superset. But there is one gotcha that causes me to think that this isn't, strictly speaking, a browser. Browsing is about building and traversing pages whose user interface palette is an extraordinarily limited one. In a funny way, browsing is almost an anti-GUI revolution. It isn't, of course. Look at the rich use of layout, flow, media, and so on. But it is in one key sense. The set of user actions is tiny. Fill in fields. Click on URL's or buttons. Lay things out using flow and/or tables. Most of the other enhancements in HTML, from one I helped build (DHTML) to framesets to animation, have been largely unused. Why is this? Why has the web kept its set of user gestures so small? I believe it is because most web applications are not intended for frequent/daily use. Instead of the old model in which you use 2 or 3 applications every day, you tend to interact with hundreds or more every month or so. In this world, ease of use and discoverability are paramount, ease of authoring is key, and simplicity is all. Of course there are many applications that scream out for a product like Central. In general, however, I think they are applications that one expects to use over and over during the day, in the ways many people used to use Office before email and the browser rendered it largely irrelevant.
This is not, to me, the ideal model for Amazon.com or BMW or my 401K application in my company portal or my broker. These are not applications I use every day, let alone frequently during a day. They are applications I run when I have a specific task in mind, and I expect them to fit the usability rules of the browser. Mostly they behave just as I wish, unless I go offline or I want to sort or filter in ways that they didn't expect/predict.
In short, I think there is room for both. I think Macromedia is heading in exactly the right direction and have told Kevin Lynch so. At the same time, I think that there is a need for a plain old browser that can interact with the server at the information level and I think there is also a need for Central. We don’t use only one size and type of screwdriver. Why should this world be different?
Authored 8/14/2003
Web Services Browser continued
This is the first entry to actually describe how a web services browser might work, at least as I imagine it. Remember, in this dream, a web services browser is primarily interacting with information, not user interface. It does have “pages” but it also has a local cache of information garnered from the web services to which these pages are bound and the two are distinct. Related, but distinct. In the next few entries, I’ll discuss how an online/offline calendar might work since it is something I feel is sorely lacking from the web.
Let's make the web services part of the Calendar really simple. There is a service to get activities for a date range. In addition, there are operations to add an activity, to modify an activity, and to delete an activity. Each activity can have notes, one or more assigned people, phone number, title, priority, start time and duration, external visibility, reminder attributes, category, and repeating characteristics if any. In short, mostly what is in Yahoo's calendar, if only they surfaced it as a web service!
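To make that data model concrete, here is a minimal sketch of what one activity document in the local XML store might look like. The element names and structure are my own guesses for illustration, not a fixed schema:

    <activity id="a-1021">
      <title>Review project budget</title>
      <date>2003-08-21</date>
      <start>09:00</start>
      <duration>PT1H</duration>
      <priority>2</priority>
      <phone>212-555-0147</phone>
      <assigned>
        <person>Eric</person>
        <person>Andrew</person>
      </assigned>
      <visibility>external</visibility>
      <reminder minutesBefore="15"/>
      <category>work</category>
      <repeats>none</repeats>
      <notes>Bring the revised task list.</notes>
    </activity>

The web service operations would then read, write, delete, and query documents of roughly this shape for a date range.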
This information can be thought of as a local XML store with documents for each activity and web service operations used to read, write, delete, and query it. The “pages” that are designed to run offline together are bundled into a “Client Web”. These pages are standard web pages with three key extensions:
1) All HTML elements on the page can have a “selector” attribute that can select the requisite information from this local XML store and can repeat if the selector returns multiple items.
2) Any element can contain a macro that references the "current" selected XML element in {}'s. So, for example, a TABLE might have a TR with selector="/activity" to list all activities, and within it a TD might contain {/date}. This would mean that there would be a row in the table for each activity and that within each row, the TD should display the date for that particular activity.
3) URL's and Forms can bind to "Actions" in a "local controller" script which then does whatever is necessary on the client and returns the next page to display. The script may just invoke a web service and return the same page. It may simply update some of the "XML store" and return the same page. It may simply switch pages. It is, in essence, a super-lightweight web server, except that it doesn't need to round-trip to the Internet if the pages being returned are all in the Client Web. There is a "begin" action in the controller that is run before the first invoked page is run (usually the entry point), and there is an "end" action that is invoked if the user leaves the "Client Web". We call this the element context in each case.
There are a couple of other key points:
1) The macro language doesn’t only have access to the XML Store. There is a “magic” XML variable, $Form, which represents the values in the most recently submitted form. Thus the attribute
selector="/activities[date >= $Form/startdate]" will select all activities with a date greater than or equal to the start date filled in on the form. Similarly, when a URL is selected, the current XML path that was its context is available in $ElementContext
2) If the controller returns the page currently being displayed, it simply refreshes itself in a flicker-free manner. In short, the user experience is just as it would be in the online case, but with less obvious flicker. A sketch putting all these pieces together follows.
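Here, then, is a hedged sketch of what a fragment of such a Client Web page might look like, with a table bound to the local XML store and a form routed to an action in the local controller. The exact spelling of the selector attribute, the {} macros, and the action URLs is purely illustrative:

    <!-- one row per activity on or after the date submitted in the filter form -->
    <table>
      <tr selector="/activities/activity[date >= $Form/startdate]">
        <td>{/title}</td>
        <td>{/date}</td>
        <td><a href="action:Delete">Delete</a></td>
      </tr>
    </table>

    <!-- submitting this form invokes the FilterByDate action in the local controller,
         which re-queries the XML store and returns the same page -->
    <form action="action:FilterByDate">
      Start date: <input type="text" name="startdate"/>
      <input type="submit" value="Filter"/>
    </form>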
Now, how does the Calendar work? It has a default page with the current month set up in the begin Action, using the Selectors to pull in the items that make sense for each day in that month. Each day has URL's that point to Actions for Add, Delete, and Modify, and each action in the controller script will have access to $ElementContext to see which day was in question. They will then invoke the requisite Web Service Operations to add/delete/modify.
The Delete Action is the simplest. It loads a Delete Confirmation Page from the Client Web (details to follow in subsequent entries), sets it to filter on the Activity that is to be deleted, and then returns this page. This page in turn has a URL pointing to the DeleteConfirmed Action which uses the web service to delete the Activity. The Update will basically do the same with an Update page and an UpdateConfirmed Action pointed to by the Submit in the Form. The insert will work the same way. In subsequent entries, I'll imagine how the logic in the Controller might look and discuss some choices for "parameterizing" the pages.
So, to summarize: a set of related pages designed to work with the XML information in question, called a Client Web; a set of information retrieved using Web Services; and a Controller to coordinate the actions that occur between pages and to invoke suitable web services when necessary. How does all this work offline? Also to be covered in subsequent entries. Lastly, let me freely admit that this is a dream in progress, open to all, and sure to be wrong in some of its details.
Authored 9/26/2003
Delayed continuation
I promised that I'd walk through how to build a calendar. But based on comments/feedback, this entry will discuss various ways in which a client can browse information gleaned from web services on the net and why I picked the model I did.
There are basically three ways that information can turn into User Interface.
Given my desire to build a browser that interacts with the information being gleaned from web services one has to choose which of these three models to use. I chose the second in this blog. It is equally reasonable to choose the others and AboveAll seems to be building an interesting flavor of version one.
Once this is picked, then one has to ask where the rendering engine is and how it talks to the Web Services. There are, roughly speaking, four ways I can think of:
1) The pages directly invoke web services as the browser renders them.
2) A cache on the client talks to the web services, and the pages render from that cache.
3) A cache on the client synchronizes with a cache on the server, and the pages render from the client cache.
4) The pages are rendered on the server and the finished user interface (HTML) is shipped up to the client.
Today, the strategy in the industry clearly is the last one. We render pages on the server and ship rendered user interface (HTML) up to the client. The industry does this because it makes it much easier to deploy changes to the logic. Change the web server and hey presto, every client sees the changes. People tend to forget that a huge benefit of the browser based model was simply to remove the deployment hell and risk that deploying EXE’s had meant. So, any change to this model should be considered judiciously. Nevertheless, I argue that the time has come for a richer browser model. It should still be essentially a no-deploy provision-less model, but the actual rendering needs to be able to take place on the client. Why?
Because of the increasingly mobile nature of computing. If everyone sat at a desk, every desk connected to a broadband link, I might argue otherwise, although it is interesting to note that even as we all talk about Grid computing, we centralize the easiest thing to distribute out, namely the rendering. But we don't all sit at desks. More laptops than desktop PC's sell these days, and 802.11b has liberated us all from the net tap. In addition, mobile devices are coming into their own as GPRS becomes ubiquitous. I just roamed all over the world with my Blackberry 6210 and had effortless, synchronized, instant access to my calendar, contacts, and mail at all times. (I'm about to upgrade to the 7210 which is a killer device.) In this world, the connection is intermittent and the bandwidth is unreliable. In Paris, for example, my GPRS went up and down like a yo-yo, and in the US it is deeply unreliable. Just try using GPRS on the Amtrak train from NYC to Boston! Many of us fly all the time and want to have access to our information on the plane, the car, the bus, and so on. Since many of us are mainly mobile, this is a severe problem. Mobile computing, therefore, is the driver here.
It is worth noting that done correctly this will improve the customer experience even in the fully connected world. For example, my broker’s online web pages don’t let me filter, view and sort my assets in the way I would like. If I have my browser talking to my broker’s web services rather than to its rendered pages, then my pages can enable my view of the information, not just the one that my broker limits me to. Obviously the same is true for viewing news, mail, and so on. Similarly, I have to wait each time I want to view the next 23 messages in my mail browser (although I can configure it). In short, we may end up viewing the current model as a tyranny of the content provider in which we could view information only in the way they had in mind.
I am arguing that we need a model other than the fourth to meet the mobile user's requirements and that it will ultimately benefit everyone. Which one? The first one (directly invoking web services from pages as the browser renders them) doesn't really fit the bill because it still requires connectivity at rendering time and provides no mechanism to sort/filter the information. That leaves two models: have a cache that talks to web services, or have a cache that talks to another cache. I'm seriously torn here and I suspect that, in the long run, both will be required. It is much easier and lighter weight to just have the cache on the client synchronize with a cache on the server. Then the only thing required for communication to the Internet from the PC/Device is some sort of souped-up SyncML. This may not be intuitive to the reader. It stems from my assumption (perhaps incorrect) that we can package the page itself as XML so that any change to the page can itself be delivered into the cache using the same SyncML protocol. On the other hand, this still limits the freedom of the mobile client to integrate information across web services, since someone has to write a server somewhere which runs this cache that synchronizes. But in either case, one ends up assuming that for each page there really is a cache of information on the client, associated with the page but separate from the specific XML of the page itself, and that in the background this cache can synchronize with information on the Internet.
I’m going to wave my hands for the time being about how this data gets synchronized between the client and the Internet and just assume that some sort of Synchronization Protocol exists.
But assuming this opens up some specific challenges. How does the page reference this information? If the user is clicking through links and is offline, how does this work? How do user interface gestures that are supposed to invoke processes on the Internet work when one is not connected (for example giving someone a raise, approving an expense report, moving a scheduled item, selling a stock, booking a hotel room, assigning a bug report to an engineer, and so on)?
Proposed answers to these questions shortly (a lot sooner than the last delay). I promise.
Authored 9/30/2003
When connectivity isn’t certain
Everyone keeps asking me why this isn't Mozilla or Luxor or one of 100 other browser efforts currently underway and, honestly, maybe some or all of them can do or will do what I'm about to discuss, namely run offline and deal gracefully with being intermittently connected. This is a critical issue that I sometimes think is widely ignored. I remember being on a panel at JavaOne with Bill Joy and hearing him say that we needed to design assuming perfect connectivity to the net! If you design assuming that connectivity isn't perfect, then you reverse assumptions. Instead of assuming that you can always get the next page from the Internet, you assume that you often cannot, or not quickly enough to be reasonable or pleasant. Instead of assuming that you can rely on the server running on the Internet to interpret each click on a URL or button, you assume that you often cannot. Yesterday a very bright BEA employee asked me why the clicks didn't just turn into GET's or PUT's, and another very bright person asked why I couldn't just use REST. In both cases, the issue is that I want a great user experience even when not connected, or when connected so slowly that waiting would be irritating. So this entry discusses what you do if you can't rely on Internet connectivity.
Well, if you cannot rely on the Internet under these circumstances, what do you do? The answer is fairly simple. You pre-fetch into a cache that which you'll need to do the work. What will you need? Well, you'll need a set of pages designed to work together. For example, if I'm looking at a project, I'll want an overview, details by task, a breakout by employee, late tasks, add and alter task pages, and so on. But what happens when you actually try to do work, such as add a task, and you're not connected? And what does the user see?
To resolve this, I propose that we separate view from data. I propose that a "mobile page" consists of a set of related 'pages' (like cards in WML), an associated set of cached information, and a script/rules based "controller" which handles all user gestures. The controller gets all requests (clicks on Buttons/URL's), does whatever it has to do, using a combination of rules and script to decide, and then returns the 'page' within the "mobile page" to be displayed next. The script and rules in the "controller" can read, write, update, and query the associated cache of information. The cache of information is synchronized, in the background, with the Internet (when connected), and the mobile page describes the URL of the web service to use to synchronize this data with the Internet. The pages themselves are bound to the cache of information. In essence they are templates to be filled in with this information. The mobile page itself is actually considered part of the data, meaning that changes to it on the Internet can also be synchronized out to the client. Throw the page out of the cache and you also throw the associated data out of the cache.
Now every user action isn't dependent on connectivity. Go to the next page: it is within the cached mobile page. Let the user filter or sort tasks: this is done by merging the result of a query on the cache into the target page using "binding", controlled, where necessary, by the script in the controller. Change the priority of a task, or create a new one, or delete one, or request that relevant users be notified about changes, and all these changes are entered into the cache as "queued" requests. Once sent to the Internet as part of synchronization, their status is changed to "sent", and once processing on the Internet has actually handled these requests, their status is changed to "confirmed" or they are simply deleted. But the user actions that created these requests were never blocked. It will be important, of course, that the user interface make clear what is queued, what is sent, and what is confirmed.
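As a rough illustration, one of those queued requests could be cached as a small XML record whose status moves from queued to sent to confirmed as synchronization proceeds. The shape and names here are mine, purely for illustration:

    <request id="r-301" status="queued" created="2003-10-01T14:22:00">
      <!-- status moves: queued -> sent -> confirmed (or the record is deleted) -->
      <operation>UpdateTask</operation>
      <target taskId="t-17"/>
      <change>
        <priority>1</priority>
      </change>
    </request>

The page that shows task t-17 can then bind both to the cached task and to any pending requests against it, which is one way the queued/sent/confirmed state could be surfaced in the UI.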
Some actions may not return meaningful information until synchronization has occurred. For example, ask for all news stories on a stock and they may not be local. In that case you see the request as Queued, Sent, and then finally resolved with a link to the now-cached news stories, and from then on news stories may synchronize to that link in the background.
This model is profoundly different from the normal browser model. It assumes that connectivity may not be present. Accordingly, it makes sure that everything that the customer needs is present on the client. This means that not only does the client handle the formatting of the information into pages, it also holds the intelligence about what happens between pages. Hence the script controller. Why script? Because it is small, lightweight, interpretive, easy to dynamically deploy and modify, and already used in pages, so that a form of it is already part of the browser runtime. People forget how small script engines can be. The original dBase II ran on a 64K machine. I would expect that this particular script would be sandboxed. In my vision, all it could do is read and write the cache associated with the mobile page and decide which page should be displayed next. Since it is script, it provides an install-free, lightweight model for intelligence that runs as a result of user actions, so that the model isn't too constrained. Purely declarative systems always turn out to be too limited in my experience.
Of course, the simple model for all this is straight web services between the client and the Internet, and I have no doubt that this will work for a huge number of cases. The cache will just hold the XML returned from the web service or pushed into it (next entry).
So perhaps all this can be built on top of Luxor or Mozilla or .. If so, great! Having written one browser, I certainly don’t want to write another. I want to work with those who do to enable these capabilities.
Authored 10/1/2003
Comments on comments
James Strachan writes that I don’t need a new browser, merely a client-side proxy to do what I want. Absolutely true based on what I’ve said so far. My next entry (where I’ll talk about push) will challenge this wisdom a bit, but in general, James is right. I need a new smart super light-weight proxy.
Edwin Khodabakchian (who I consistently regard as one of the most interesting people working on the web today) says he is going to try and mock this up for me. Awesome!
Doğa Armangil, who I don't know, says experience is showing her that a web browser + web services yields a better user experience than a rich client without them. I hear similar things. A "very large" financial institution discovered that when they replaced their CICS->ShadowDirect->RS 6000 Protocol conversion->ODBC/OLEDB/ADO (MSFT Stack)->ASP way of building pages with native mainframe Web Services->JSP, things got an order of magnitude faster. That's right. Faster. They were stunned. But this doesn't solve the intermittent connectivity issue per se.
Authored 10/20/2003
When a tree isn’t a tree
This entry is dedicated to discussing how to handle cases when the disconnect is total (for example I’m on the airplane). It has been 3 weeks which is too long, but I’m constantly amazed by the expert bloggers who post almost every day. Where do they find the time? Do they have day jobs? I do, two actually, and it turns out empirically that a posting every couple of weeks is going to be as good as it gets. So I’m occasionally connected. The powers that be at Intel have mandated that the politically correct term for software that works on occasionally connected devices is mobilized software. We wouldn’t want to remind people that things sometimes go wrong. OK, I’m mobilized. But what happens when you really aren’t connected? What is available to you and how did it get there? The answer, as I’m about to explain, can be thought of as trees.
Let's start with a simple case that drives me crazy. As many know, I use and love a Blackberry 7210 for most of my work. I literally don't use a PC sometimes for days on end right now. Mail is a joy. The phone works well for me (I carry a backup for the vast majority of the US that ATT Wireless seems unable to reach), my calendar works well. Only my contacts don't synch wirelessly (come on, Blackberry!). But browsing isn't fun at all. I like to keep an eye on Google News. And navigating through it using the 7210 is really painful even when GPRS is humming, and downright impossible on planes or, as it turns out, in Bedford NY when ATT Wireless is your provider. So, I'd like my Blackberry to have Google News in the cache. Now if anything should be easy it is this. It is a nice simple hierarchy with subjects like Sports, Sci/Tech, … and a set of stories. But there are a couple of complications. The stories aren't laid out at all well for a mobile device. Menus seem to go on forever. I don't care about sports. Yes, I know, that's un-American. Nor do I care about entertainment. What would the web services browser do?
As said earlier, it would traverse a cached data model. One can think of the data model as a simple tree. It would have a starting point, a trunk if you will, called Google News. It would have a set of branches for each type of article which I'd mark as Business, Sci/Tech, and so on. Each branch could use a web service to fetch articles of that type. Each would return a set of articles, each of which would contain a one-line summary, a picture (if appropriate), the story, the byline, and a URL to the raw HTML. On my Blackberry I'd subscribe to the URL for Google News. I'd immediately see my choices (e.g. the topics I could delve into) and, as I scrolled the wheel over each, below it a short one-liner for each article. Scroll down to the one-liners and click on one and, hey presto, details and story. Perhaps the article is abbreviated and I have to traverse through the URL to the HTML for the raw story. OK, then I can decide from the precis. Now my user experience will be much better. Today, I tend to turn on my Blackberry as the plane lands and then catch up on my mail on the rental bus. Tomorrow, I'd wait for the plane to take off and then catch up on the latest news. In essence, instead of going to a URL, I'm subscribing to one, but with navigation and presentation capabilities designed for me.
Want to reclaim the space? I'd always have a menu on any story OR on the main Google News area that would let me DELETE, as I can with my mail today. Or I'd let stories age out.
I have the same problem with Yahoo Finance except that here the information is just a single portfolio I keep in Yahoo. I’d like it to trickle in in the background so that when/if I do check it, the experience is immediate and that the list is a list of holdings with the drill-in going to the short summary about the stock in question.
And so on.
All this should be a no-brainer so far. It is read-only. I don't have write privileges. All I need is a way for each URL to specify a tree whose branches are words that make sense to me (like Business, Sci/Tech, World or MSFT, ORCL, BEAS) and then tell me which web service will return relevant information and how to display both a summary and a detail view. (More on this in a later entry.) When GPRS returns or I sit down with my laptop in the airport/hotel/Starbucks, then I'll catch up. But the key point is that I can access things I care about even when I'm not connected. The model has to know which branches I want it to be pre-emptive about fetching. Remember, I don't want sports or entertainment cluttering up my machine. It should even know my priorities, since I can lose connectivity at any time. And if the list of entries is too big, I want what Google News does today, namely to just give me a few and make me get the rest. But I'd like to control how many. So each branch should have prefetch meta-data. It should also say how stale the data can be before it is thrown out. I don't want stories more than 2 days old, for example. So I should be able to mark each branch in terms of how old the data is before I throw it out.
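Concretely, the subscription to Google News might carry per-branch meta-data roughly like the sketch below, marking which branches to prefetch, how many items to pull, and how old an item can get before it is discarded. The element and attribute names, and the duration notation, are only illustrative guesses:

    <subscription name="Google News" url="http://news.google.com/...">
      <branch name="Business"      prefetch="yes" maxItems="15" maxAge="P2D"/>
      <branch name="Sci/Tech"      prefetch="yes" maxItems="15" maxAge="P2D"/>
      <branch name="World"         prefetch="yes" maxItems="10" maxAge="P2D"/>
      <branch name="Sports"        prefetch="no"/>
      <branch name="Entertainment" prefetch="no"/>
    </subscription>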
Now, even before we get to sites where I can do things/change data (which is harder), there are some interesting problems.
Suppose Google adds a new topic to its news. This is like a new branch. It lets me get to new types of stories, for example, China, which seems like it is getting big enough to deserve its own topic. How does the browser know that this new branch is there? Well, if the meta-data about the tree is adequately self-describing, it could refresh the meta-data about the site and discover this. It would need to merge my annotations about aging, pre-fetching, and so on with this new tree. What if it isn't Google, but a set of employees in a company? How does the browser tell the difference between someone having updated an employee versus someone quitting and someone else joining? What if they have the same name? Clearly, you need some unique identity returned from each web service as part of each leaf so that the browser can uniquely identify the leaf returned and can detect that it needs the cache on the PC/PDA/.. to be updated.
Sometimes I would like the browser to do that and sometimes I'd want to control it. This is what I discussed a couple of entries ago: just use meta-data versus layout control. In the latter case, I'd have explicitly marked which branches I wanted to pursue and all others would be off limits. If you look at blogs, for example, often they list in turn the blogs they follow and you can only go through them to the ones they list. Others do it more dynamically based on links/popularity and so on. You decide.
Blogs, of course, are an example of this sort of browser today, but why, oh why, are they a special case? Why doesn’t the browser work this way for any set of information?
How do I manage the amount of information on my PC/Blackberry/Phone? I discussed one model, namely describing how long to hang onto information. But it is important not to be wasteful. Think of Zagat's online. I might have branches to search by cuisine, price-range, stars, and neighborhood. Each would list a set of values which would in turn be branches (twigs?) to a list of restaurants. Each restaurant would have a review, stars, description, price range, phone number, address. Some would let you book them using a different web service (we'll talk about this in the next entry) or get a map. This can add up, at least in NYC or Seattle or SF (I divide my time almost equally between the three). But wait, it gets much worse! The same restaurant might be found (in fact certainly will be) from all lists. We certainly don't want it repeated 5 times. So in fact, all these twigs must point to the same leaf. (Yeah, for you CS types, that's a graph, but I'm a history major and for me a graph is the population of the world growing over the last century.) But even if we're smart enough to cache the restaurant information just one time, it is a lot of space. So what to do? I'm a super impatient guy. I want access all the time with more or less instant response. Well, we can take the big stuff (pictures, maps, ..) and use an LRU cache of fixed size for that, and then the rest really isn't that big. The NYC Zagat's has fewer than 1900 restaurants with about 300 bytes of info for each, or under 600KB. Teeny? No. But on a PC, it is nothing. On a PDA, I could afford to subscribe to 3 or 4 of these. On a phone, I could not. So include some smarts in the cache. I mark a neighborhood and a cuisine or two. Then later I unsubscribe. Perhaps I limit the price-range. Now we're down to 100 restaurants max, or about 30KB, and I could have a bunch of these at any given time on my PDA. In short, when I "go to a URL", I'm subscribing to it and it becomes automatic to subscribe/unsubscribe. Clearly the UI will need to give me some feedback about space.
How does Zagat get paid? Same as today. To use their online site, I have to login. But here they could be smarter. Get a cut of the bookings if I use their web service to book. Give me a micro-bill choice for my PDA wireless provider.
In every case, the thing that has driven this is a tree whose branches are labeled so that I can see/subscribe/annotate them, perhaps with twigs and smaller twigs and so on, and finally leaves. Now this is a bit misleading. Let me give an example. Another area that I want access to offline is our company address book, and I'd like it to include our company org chart. I can never figure out who works for whom. (Perhaps this is deliberate.) If I modeled it and then subscribed to it using this system (same deal in terms of size of information, by the way, and the same filtering techniques work), then as I traverse through people, each person not only has interesting information about themselves, they have links to other people based on their relationships. Here leaves contain twigs, and it is clear that the bough of the tree analogy just broke under the weight of the – no, I won't go there. So, more generally, I can navigate to lists of information using meaningful names, drill into items in the list, and then from those items again navigate to interesting lists.
Lastly, think about CRM or SFA (Customer Relationship Management or Sales Force Automation). As I navigate through customers I branch to projects or contracts we have with them. Then I see the related support people and can drill into them and see the related other projects/contracts/opportunities and so on. Again, the same pruning techniques of limiting information by location, job function, project size/price range, and so on can quickly prune a lot of information down to a reasonable amount. Being connected will always be better; then all information can be reached. But this model lets the user control which information is available all the time.
This entry has only discussed what happens when you’re disconnected and the information you’re browsing is read-only. In the next entry, I’ll discuss what to do if you’re trying to reach a service (like booking a restaurant) or modify information and you’re disconnected and how, it seems to me, the browser should handle it.
Authored 10/28/2003
Time Out
I want to take time out to passionately support Jon Udell's latest blog entry on Avalon. I'm not sure that Royale and Central are any more "open" than XAML, but I am darn sure that all of Microsoft's latest efforts seem to be moving inexorably to a closed client world and even an only-if-you-pay world. While I applaud InfoPath, for example, it will never be the force for change and the ubiquitous client that it could have been. It will be, if not stillborn, at least damaged by the lack of a free ubiquitous runtime. Foolish in my opinion, because if the runtime had been free and ubiquitous I think this product alone would have sold the new Office. While I don't know if XAML's client will be free or not, obviously tying the graphics to Windows says it all in terms of ubiquity if not iniquity. An XML grammar alone does not an open standard make. Thanks, Jon, for saying it so clearly.
On a related note, since Jon cited Edwin: earlier I mentioned that I was reluctant to use BPEL for the controller in the web services browser I've been writing about. Edwin Khodabakchian of Collaxa nailed the issue that causes me to think it will not work. A "controller" for UI is really a misnomer. In a site filled with pages (or pages filled with rich UI interactive gestures) the user is in charge, not the controller. There is no directed flow. Occasionally we build directed flows and call them wizards, but in general sites are not written to move from task to task in some ponderous choreography. BPEL is intended to do just this and, because of it, really isn't suitable for playing the controller role even if it were versatile enough. So, oddly, this entry both agrees with Jon when he disagrees with Edwin and agrees with Edwin when he disagrees with me.
Authored 11/15/2003
Modifying Information Offline
Changing Data Offline: james@doit.org writes that I should refrain from blogging because my blog "is a real slow" one. Perhaps this is true, but I shall persevere. In this entry, I'm going to discuss how I imagine a mobilized or web services browser handles changes and service requests when it isn't connected. This is really where the pedal hits the metal. If you just read data and never ever alter it or invoke related services (such as approving an expense report or booking a restaurant), then perhaps you might not need a new browser. Perhaps just caching pages offline would be sufficient if one added some metadata about what to cache. Jean Paoli has pointed out to me that this would be even more likely if, rather than authoring your site using HTML, you authored it as XML "pages" laid out by included XSLT stylesheets used to render them, because then you could even use the browser to sort/filter the information offline. A very long time ago, when I was still at Microsoft (1997), we built such a demo using XSLT and tricky use of Javascript to let the user do local client-side sorting and filtering. But if you start actually trying to update trip reports, approve requests, reserve rooms, buy stocks, and so on, then you have Forms of some sort, running offline at least some of the time, and code has to handle the inputs to those "Forms", and you have to think through how they are handled.
XAML: First, a digression. I promised I'd dig into this a bit more. At the end of the day, I think that thinking of XAML as an industry standard for UI is premature and does assume that Windows will have complete dominance. It is essentially an extremely rich XML grammar for describing UI and user interactions. It subsumes declaratively the types of things VB can do, the flow patterns in HTML or Word, and the 2-D and time-based layout one sees in PowerPoint or, these days, in Central and Royale from Macromedia. In short, it is a universal UI canvas, described in XML, targeting Avalon, Longhorn's new graphics engine. That is the key point. It isn't an industry standard unless you assume that Avalon's graphics are pervasive, which I think is a stretch. Also, people are talking about this as though it will be here next month. As far as I can determine, Microsoft's next massive OS effort, Longhorn, will ship somewhere between 2005 and 2006. In short, it is probably 3 years away. 3 years from now my daughter will be grown up and in college and who knows what the world will look like. I have no doubt that Microsoft is salivating at the thought that this will subsume HTML (not to mention Flash and PDF) and thus put those pesky W3C folks out of business, but I can't really worry about it. Kevin Lynch of Macromedia should be the pundit on this one. End of digression.
Browser Model so far: As discussed already, this new browser I'm imagining doesn't navigate across pages found on the server addressed by URL's. It navigates across cached data retrieved from Web Services. It separates the presentation, which consists of an XML document made up of a set of XHTML templates and metadata and signed script, from the content, which is XML. You subscribe to a URL which points to the presentation. This causes the XML presentation document to be brought down, the UI to be rendered, and the process of requesting data from the web services to start. As this data is fetched, it will be cached on the client. This fetching of the data normally will run in the background, just as mail and calendar on the Blackberry fetch the latest changes to my mail and calendar in the background. The data the user initially sees will be the cached data. Other more recent or complete information, as it comes in from the Internet, will dynamically "refresh" the running page or, if the page is no longer visible, will refresh the cache. I'm deliberately waving my hands a bit about how the client decides what data to fetch when. I'm giving a keynote talk about this at XML 2003 and I want to save some of my thunder. So far, though, I've described a read-only model, great for being able to access information warehouses and personal data like clinical trial history or training materials, or to find good restaurants in the neighborhood, or to do project reviews, all while offline, but not as good when used for actually updating the clinical trials or entering notes into them or building plans for a team or commenting on the training materials or booking the restaurants.
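To make that a little more concrete, the presentation document that the URL points to (the "Client Web" or "mobile page" of earlier entries) might bundle its pieces roughly as sketched here. The structure and names are guesses at the shape, not a proposal for an actual format, and the service endpoint is a placeholder:

    <clientweb name="projects" entry="overview.xhtml">
      <pages>
        <page src="overview.xhtml"/>
        <page src="task-detail.xhtml"/>
        <page src="add-task.xhtml"/>
      </pages>
      <controller src="controller.js" signed="true"/>
      <data>
        <!-- web service the background synchronization talks to -->
        <service url="http://example.com/projects/ws" cacheAs="/projects"/>
      </data>
    </clientweb>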
It's a fake: It is very important to remember in this model that "reality" usually isn't on the device, be it a PC or a Blackberry or a Good or a Nokia 6800. Because the information on the device is incomplete and may have been partially thrown out (it is a cache), you don't really know which tasks are in a project or which patients are in a trial or which materials have been written for a section. You only know which ones you have cached. The world may have changed since then. Your client-side data (when running offline) may be incomplete. So, if you modify data, you need to remember that you are modifying data that is potentially out of date.
Don't change it. Request the change: Accordingly, I recommend a model in which, in general, data isn't directly modified. Instead, requests to modify it (or requests for a service) are created. For example, if you want to book a restaurant, create a booking request. If you want to remove a patient from a clinical trial, create a request to do so. If you want to approve an expense report, create a request to approve it. Then relate these requests to the item that they would modify (or create) and show, in some iconographical manner, one of 4 statuses (a sketch of such a request follows the list):
1) A request has been made to alter the data but it hasn’t even been sent to the internet.
2) A request has been sent to the Internet, but no reply has come back yet.
3) The request has been approved
4) The request has been denied.
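A minimal sketch of such a change request, tied to the item it would modify and carrying its current status, might look like the following. The element names and status values are my own illustration of the idea, not a defined format:

    <changeRequest id="cr-88" status="sent">
      <!-- status is one of: unsent | sent | approved | denied -->
      <target kind="expenseReport" reportId="er-4412"/>
      <action>Approve</action>
      <comment>OK, but next time attach the hotel folio.</comment>
    </changeRequest>

The UI can then render the report with an icon driven by this status, and the client-side script can decide what to do with the local data when an approval or denial comes back.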
Expense Reports: Let me start with a simple example. While offline, the user sees a list of expense reports to approve. On the plane, he/she digs into them, checks out details, and then marks some for approval and adds a query to others. All these changes show up, but with an iconic reminder on the status/query fields that these fields reflect changes not yet sent to the Internet. The user interface doesn't stall or block because the Internet isn't available. It just queues up the requests to go out so that the user can continue working. The user lands and immediately the wireless LAN or GPRS starts talking to the Internet. By the time the user is at the rental bus, the requests for approval or further detail have been sent and the icons have changed to reflect that the requests have now been sent to the Internet. Some new data has come in with more expense reports to be approved and some explanations. By the time the user gets to his/her hotel, these requests on the Internet have been de-queued and processed, invoking the appropriate back-end web services, and responses have been queued up. By the time the user connects in at the hotel or goes down to the Starbucks for coffee and reconnects there (or, if the device is using GPRS, much sooner), the responses have come in. If the requests are approved, then the icon just goes away since the changed data is now approved. If the requests are denied, then some intelligence will be required on the client, but in the simplest case, the icon shows a denied change request with something like a big red X (this is what the Blackberry does if it can't send mail for some reason, as I learned to my sorrow on Sunday). The user sees this and then looks at the rejection to see why.
Notice that all this requires some intelligence on the part of the web services browser and potentially some intelligence on receipt of the approvals or denials from the internet. In the model I’m imagining, the client side intelligence will be done in script that will be associated either with the user actions (pressing submit after approving or querying) or with the Internet actions (returning approval or rejection). The script will have access to the content and can modify it. For example, on receipt of a rejection, it might roll back the values to their prior ones. Server side intelligence will be handled using your web service server of choice.
Restaurant Reviews and booking: Let's take a slightly more complicated example. I'm flying into Santa Fe and don't know the town. Before I leave NYC for Santa Fe, I point at the mobilized URL for my favorite restaurant review site and check off a price range and cuisine type (and/or stars) that I care about. By the time I get on the plane and disconnect from wireless or GPRS, the review site has fetched all the restaurants and reviews for the categories I've checked off onto my PC or PDA. On the plane, I browse through this, pick a restaurant, and then ask to "book it" since the user interface shows that it can be booked. A booking request is then shown AND the script also modifies my calendar to add a tentative reservation. Both items clearly show that the requests have not yet left my computer. When I land, the requests go through to the Internet and on to the booking web service and to Exchange. It turns out that the restaurant has a free table and I get back an approval with reservation number and time. But the service acting as a middleman on the Internet also updated my "real" calendar to reflect this change. Now I need to replace the tentative reservation in my calendar with the real one created in Exchange by the Internet, and I might as well delete the booking request since my calendar now clearly shows the reservation. Script handles this automatically and I'm OK and a happy camper. But should I have even modified my local calendar? Probably not, since the integration process on the Internet was going to do it anyway and it just makes it hard to synchronize. I should have waited for the change on the calendar to come back to my client.
In practice this tends to work: This all sounds quite tricky, but as someone who has been using a Blackberry for 3 years now, it really isn't. You get very used to eye-balling your mail to see if it has actually been sent yet or not. You soon wish that the changes you make to your calendar had similar information, since you're never sure that your tireless assistant hasn't made some changes to your calendar that conflict with your own, and you want to know whether those changes are approved or not. What it does require is a decision about where changes are made and how the user is made aware of them. If the user is connected, of course, and the web services are fast and the connection is quick, then all this will essentially be transparent. Almost before the user knows it, the changes will have been approved or rejected, and so the tentative nature of some of the data will not be visible. In short, this system works better and provides a better user experience when connected at high speeds. Speed will still sell. But the important thing is that it works really well even when the connection is poor, because all changes respond immediately by adding requests, thus letting the user continue working, browsing, or inspecting other related data. By turning all requests to alter data into data packets containing the request, the user interface can also decide whether to show these overtly (as special outboxes, for example, or a unified outbox), or just to show them implicitly by showing that the altered data isn't yet "final", or even not to alter any local data at all until the requests are approved. For example, an approvals system might only have buttons to create approval/denial requests and not enable the user to directly alter the items being approved (invoices, expenses, transfers) at all.
Authored 12/7/2003
Learning to Rest
Answering comments is turning out to be more interesting than trying to explain my ideas, which is, I suppose, part of the idea behind blogs. In any case, I've had a slew of comments about REST. I admit to not being a REST expert, and the people commenting, Mike Dierkin and Mark Baker in particular, certainly are. They tell me I'm all wet about the REST issues. I'll dig into it. But I admit, I don't get it. I need to be educated. I'm talking to our guy Mark Nottingham, who does understand it all, about this, but I also will be at XML 2003 in Philadelphia on Wednesday if people want to explain to me the error of my ways.
Let me explain why I'm confused. I don't want to depend upon HTTP because I want to be able to use things like IM as a transport, particularly Jabber. So I want some protocol-neutral way to move information back and forth. Maybe I'm wrong here, but I'm particularly wary of HTTP because in many cases I want the "sender" of the message to know reliably that the receiver has it. How does REST handle this? I'm going to try and find out. Secondly, whenever I send a set of messages to a set of services and responses start coming back, I want a standard protocol-level way to correlate the responses with the sent messages. How do I do this without depending on the HTTP cookie and without the poor programmer having to worry about the specific headers and protocols? Again, I need to learn. Third, I DO believe in being able to describe a site so that programming tools can make it easy/transparent to read/write the requests and responses without the programmer thinking about the plumbing and formats. How is this done in the REST world? What's the equivalent of WSDL? Sure, WSDL is horribly complex and ugly. But what is the alternative here? I don't mind, to tell the truth, if the description is URL-centric rather than operation-centric, but I do want to know that the tooling "knows" how to build the appropriate request. I care about programming models. Fourth, as I've said before, I think that the model for the mobilized browser communicating with the Internet is frequently going to be a fairly subtle query/context based synchronization model. For example, given the context of a specific trip and some cached details, update me with the latest details. Given the context of a specific project and some cached tasks and team members, update me with the latest tasks, team members, etc. for that project. How do I do this in the REST model, given that this isn't trying to change the data on the server and that I need to pass in the context, the implicit query, and the data I already know? Fifth, how do I "subscribe" to "Events" and then get "unsolicited messages" when someone publishes something I want to know about? KnowNow has been doing this for years (Hi Rohit), but how do I do this in REST? So, REST folks, feel free to send me mail explaining all this to me. I'm trying to learn.
Authored 12/9/2003
REST
Obviously I've opened a can of worms and I'm getting a ton of helpful advice, pointers on reading, and discussions about who to talk to. The thing is, I have a day job, and it is going to take me a couple of weeks to read all this, talk to the right folks, and educate myself. So I'm going to go offline, do that, and then come back and comment on what I've learned. As a believer in helping programmers, I think some of the comments about them not needing tools to understand the datagrams going back and forth are hasty, but at the same time I think what is meant is that it is harmful for programmers not to EXPLICITLY understand what the wire format is. I completely agree with this. SOA relies on people understanding that what matters is the wire format they receive/return, not their code. Automatic proxies in this sense are dangerous, and honestly, if someone can show me that REST could eliminate the hideous complexity of WSDL and the horror that is SOAP section 5, that alone might make me a convert. Interestingly, if you do believe that programmers should understand the wire format because it is the contract, then one could argue that ASP.NET's use of Forms and the enormous amount of HTML it hides from programmers is a big mistake. But I'm already deep into one digression so I'll leave that thought for others. In either case, there is a basic point here that really resonates with me. If there is always a URL to a resource, then resources can much more easily be aggregated, reused, assembled, displayed, and managed by others because they are so easy to share.