The basic idea is to use e4 as a workspace(s) integration bus. The goal
is to break up monolithic applications into components which perform
single tasks well. This has many benefits:
- choice of best application for task
- ability to use different application components on different
hardware
- ability to split workspace over multiple machines
- ability to have common services for groups of users
Fundamental Concepts
====================
The system is built on some fundamental concepts:
type
objects are typed. the most useful approach, at this stage, seems
to be to reuse the existing MIME type framework, as is done for
Xdnd, the Windows registry, etc.
action
objects are useless without actions. there exist many actions in
the normal desktop metaphor-style environment. edesk uses a
common, standardised set of actions to complete its types.
user
a user is a person. users have unique identities within the edesk
environment.
device
a device is a computer. a device MUST be networked, at least some
of the time, in order to participate in an edesk system. in
addition to networking, it may possess one or more facilities for
input from user, output to user, storage, computation, or special
purpose I/O (e.g. a Zip drive).
some devices may be single or limited purpose, for example,
consider an audio speaker capable of receiving an MP3 stream and
"rendering" it into sound, or an X terminal. others will be more
general-purpose, like a desktop PC.
workspace
a workspace is a physical place within which a single user is
working. users may serially use many workspaces, but at any point
in time, may be present only in one.
a workspace is composed of devices, and the set of devices
supporting a particular workspace may change.
mobile devices may individually support a workspace (when the user
is on the bus), and then later form part of a different workspace
(when the user is at her desk).
a workspace may continue to operate on behalf of a user when the
user is not present.
some devices may support multiple workspaces simultaneously
(consider a large screen in a lunch room).
modality
user interactions may use different forms: graphical, textual,
auditory, hardcopy, keyboard, gestural, etc. the selection of a
particular combination of modalities to support an action on a
typed object can be made dynamically.
you might choose a text-based browser to display some pages, and a
graphical browser for others. you might choose to have your email
read to you sometimes, or textually presented others.
some factors influencing this decision are obviously the devices
available within the current workspace, and the modality of the
triggering component. but additional factors can, and might, be
used to select an appropriate mechanism for a particular
circumstance.
Using these components, the applications forming the familiar desktop
computing environment can be restructured into a distributed
collection of collaborating components, able to span multiple devices
and to more-or-less seamlessly follow the user as she moves between
workspaces.
Instead of (necessarily) building monolithic applications, each
combination of type and action may be supported by a single
component. The selection of this component is the result of a runtime
negotiation between contenders, decided on the basis of type, action,
workspace, current modality, explicit user direction and possibly
other factors.
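The selection step could be sketched as follows. This is only an
illustration, not part of edesk: the names `Contender` and
`select_component` are hypothetical, and treating level 1 as the layer
closest to the user is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contender:
    name: str
    mime_type: str   # type handled, e.g. "text/plain"
    action: str      # e.g. "view"
    workspace: str
    mode: str        # modality, e.g. "gui"
    level: int       # assumption: 1 is closest to the user


def select_component(contenders, mime_type, action, workspace, mode,
                     preferred=None):
    """Pick one contender for a (type, action, workspace, mode) tuple.

    Explicit user direction (`preferred`) wins; otherwise the matching
    contender with the lowest level is chosen.
    """
    wanted = (mime_type, action, workspace, mode)
    matches = [c for c in contenders
               if (c.mime_type, c.action, c.workspace, c.mode) == wanted]
    if not matches:
        return None
    for c in matches:
        if c.name == preferred:
            return c
    return min(matches, key=lambda c: c.level)
```

In a real system the contenders would negotiate over the bus rather
than being compared in one process; the sketch only shows which inputs
the decision depends on.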
This concept is not entirely novel. The basis for GNOME (the GNU
Network Object Model Environment) is similar, using CORBA as the
communicating framework. However, they have focussed on the
development of Microsoft-competitive applications, rather than a model
of a networked object environment.
Similarly, the ToolTalk event bus is part of the Open Group's CDE
(Common Desktop Environment) and provides a means for applications to
communicate. It was never widely adopted by application writers.
Actions
=======
The general "action" event drives the workspace. It has three
fundamental parameters:
Action
This is the type of action which is being requested. There are not
that many really different actions that are needed to run your
normal desktop. The current list is:
do
Every type can have a default action, much like the default click
in a traditional GUI environment. This will normally display
text, fetch an URL, compile C code, etc.
create
Create a new instance of the specified type. This is like the
MIME "compose" action.
reply
Like create, but using another instance of the same type as the
initiator of the creation. This is useful in mail and tickertape
(so far, any others?).
how does this compare to the (common) operation of cloning an
existing file to create a new one?
edit
Modify the specified instance of the specified type.
view
Display an object, in its current form. The use of different
modalities is the key to this action: it can mean show on screen,
read via speech synthesis, print on paper, etc.
convert
Change the supplied object (and its dependents?) into an object of
the specified target format? Could be useful for HTML,
PostScript, C code, graphics files, etc.
interpret
Useful for objects that represent encoded instructions in some
form. Executable binaries, shell, Perl or Python scripts. Even
things like HTML source can be "interpret"ed.
get
For references (only?)
put
??? FTP?
Type
This is the type of the object upon which you wish to perform the
action. It is represented using MIME types.
Object
This is the object upon which the action is to be performed.
Note that it might be necessary to distinguish between objects
contained within the event, and objects located elsewhere, but
referred to by the event. Many tools will need to be able to work
indirectly on objects, especially those on a filesystem.
On the other hand, the tools might not always share a filesystem.
Use of global handles, such as an HTTP URL, would allow an
indirection via a fetcher. Perhaps as a `Location' attribute?
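As a rough sketch, the three parameters plus the tentative `Location'
attribute could be modelled like this. The class name `ActionEvent` and
its field names are hypothetical; the action list is the current list
given above.

```python
from dataclasses import dataclass
from typing import Optional

# The current action list from this section.
ACTIONS = ("do", "create", "reply", "edit", "view",
           "convert", "interpret", "get", "put")


@dataclass
class ActionEvent:
    action: str                      # one of ACTIONS
    mime_type: str                   # MIME type of the target object
    obj: Optional[bytes] = None      # object carried inside the event
    location: Optional[str] = None   # global handle, e.g. an HTTP URL

    def __post_init__(self):
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")

    def is_inline(self):
        """True when the object travels inside the event itself."""
        return self.obj is not None
```

Neither `obj` nor `location` is required here, since an action such as
"create" may not need an existing object.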
Interaction
===========
Caching
there is a general issue with elvin, nicely illustrated by an HTTP
caching service:
when something emits a request to fetch an URL, both the fetcher and
the cache see the event. both could start to respond, but that is
inefficient. the question is: how do we maintain an ecology of
interacting apps with a need to prioritise responses?
the example is nicely illustrative: what if there is no cache
available? what if the cache is there, but doesn't have the requested
item? what if the cache wasn't there initially, but starts later (or
vice versa)?
where the example falls down is that there is only one level of choice
(cache or fetcher). it's possible that there are many layered
options, and that the layering could be configured either by the user
or by service provisioning.
an alternative example would be a set of bots listening to a
tickertape channel. deciding which should respond to a query is a
similar problem.
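One possible shape for the answer is an ordered stack of responders,
tried from the top down. The sketch below runs in a single process, so
it sidesteps the distributed part of the problem; `Cache`, `Fetcher`
and `respond` are hypothetical names and the fetcher does no real
network I/O.

```python
class Cache:
    def __init__(self):
        self.store = {}

    def handle(self, url):
        # None means "cannot answer, pass the request down".
        return self.store.get(url)

    def remember(self, url, body):
        self.store[url] = body


class Fetcher:
    def handle(self, url):
        # Stand-in for a real HTTP fetch.
        return ("fetched:" + url).encode()


def respond(layers, url):
    """Try each layer from top to bottom; the first answer wins."""
    for i, layer in enumerate(layers):
        body = layer.handle(url)
        if body is not None:
            # Let the layers above, which missed, learn the answer.
            for upper in layers[:i]:
                if hasattr(upper, "remember"):
                    upper.remember(url, body)
            return type(layer).__name__, body
    return None, None
```

The cases from the example fall out of the ordering: with no cache in
the stack the fetcher answers, a cache miss falls through to the
fetcher, and a cache added later simply starts empty.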
Participants
Security
Event Formats
=============
taking the fundamental concepts, the interaction patterns and wrapping
these into event formats, we get:
When a tool is started, it needs to query the workspace to see whether
and how it is needed. This should be done for each type that the tool
understands.
edesk.elvin.org : 1000
Event : "query"
User : "user@example.com"
Workspace : "office"
Type : "text/plain"
Actions : "|view|"
Modes : "|gui|"
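A tool's startup queries could be built along these lines. The helper
names are hypothetical, the events are plain dictionaries rather than
real Elvin notifications, and joining several values as "|a|b|" is an
assumption extrapolated from the single-valued example above.

```python
def pipe_list(values):
    """Encode values in the "|a|b|" style seen in the query example."""
    return "|" + "|".join(values) + "|"


def build_queries(user, workspace, capabilities):
    """Build one query per MIME type the tool understands.

    `capabilities` maps a MIME type to a pair (actions, modes).
    """
    return [
        {
            "edesk.elvin.org": 1000,
            "Event": "query",
            "User": user,
            "Workspace": workspace,
            "Type": mime_type,
            "Actions": pipe_list(actions),
            "Modes": pipe_list(modes),
        }
        for mime_type, (actions, modes) in capabilities.items()
    ]
```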
Tools respond to a query with a proposal of their role in the
workspace for that type. Each action and modality requires a separate
response.
The "Level" is the currently assigned level for that tool, if any,
while "Affinity" is the tool's preference: "top", "bottom" or "middle"
with an optional "+" suffix. This is used to automatically layer
tools in a stack from user (top) down. The trailing "+" indicates a
willingness to stack away from the specified position by a few levels
if required.
The "Round" field is used to record the number of attempts to propose
a layering. If the number of rounds exceeds a maximum value, the
negotiation fails. If a central arbitrator is present, it might
monitor the negotiation, and require human input at a specified point
also.
edesk.elvin.org : 1000
Event : "propose"
User : "user@example.com"
Workspace : "office"
Type : "text/plain"
Action : "view"
Mode : "gui"
Level : 1
Affinity : "top"
Round : 1
Name : "less/xterm/edeskd"
Cid : "f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
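One way the Affinity values might be turned into a layering is sketched
below. The ordering rule and the definition of a conflict (two tools
asking for the same position, neither with a "+") are assumptions made
for illustration; the Round counter is not modelled.

```python
RANK = {"top": 0, "middle": 1, "bottom": 2}


def parse_affinity(text):
    """Split e.g. "top+" into (rank, flexible)."""
    flexible = text.endswith("+")
    return RANK[text.rstrip("+")], flexible


def layer(proposals):
    """Map tool name -> affinity string into (levels, conflicts).

    Levels count from 1 at the user end of the stack.
    """
    parsed = {name: parse_affinity(a) for name, a in proposals.items()}

    def sort_key(name):
        rank, flexible = parsed[name]
        # Inflexible tools sit nearest their requested end of the stack.
        nearest = (not flexible) if rank == RANK["bottom"] else flexible
        return (rank, nearest, name)

    ordered = sorted(parsed, key=sort_key)
    levels = {name: i + 1 for i, name in enumerate(ordered)}

    # Group inflexible tools by requested position to find conflicts.
    rigid = {}
    for name in ordered:
        rank, flexible = parsed[name]
        if not flexible:
            rigid.setdefault(rank, []).append(name)
    conflicts = [names for names in rigid.values() if len(names) > 1]
    return levels, conflicts
```

A non-empty conflict list is the point at which an arbitrator, or the
user, would have to step in.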
For each workspace, an edeskd may act as an arbitrator for tools.
When a conflict arises, edeskd might ask the user to resolve it, and
then assign tools to a particular layering.
edesk.elvin.org : 1000
Event : "assign"
User : "user@example.com"
Workspace : "office"
Type : "text/plain"
Action : "view"
Mode : "gui"
Level : 3
Name : "less/xterm/edeskd"
Cid : "f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
The basic request for an action on an object. The object may be
present inline in the "Object" field, or indirectly via the "Location"
field.
edesk.elvin.org : 1000
Event : "request"
User : "user@example.com"
Workspace : "office"
Xid : "f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
Action : "do"
Type : "text/uri-list"
Mode : "gui"
Level : 1
Object : [68 74 74 70 3a 2f 2f 63 6e 6e 2e 63 6f 6d]
Location : "webnfs://fileserver/home/user/file"
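The inline Object in the example above is a byte string. A receiving
tool might resolve its target as sketched here; `resolve_target` is a
hypothetical helper, and preferring an inline Object over a Location
when both are present is an assumption.

```python
def resolve_target(event):
    """Return ("inline", bytes) or ("location", str) for a request."""
    if event.get("Object") is not None:
        return "inline", bytes(event["Object"])
    if event.get("Location"):
        return "location", event["Location"]
    raise ValueError("request carries neither Object nor Location")


request = {
    "Event": "request",
    "Action": "do",
    "Type": "text/uri-list",
    # The same bytes as the Object field in the example event.
    "Object": bytes.fromhex("68 74 74 70 3a 2f 2f 63 6e 6e 2e 63 6f 6d"),
}

kind, payload = resolve_target(request)
print(kind, payload.decode("ascii"))   # inline http://cnn.com
```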
When a tool chooses to respond to a request, it must inform the
workspace that it is doing so. At minimum, it should send one such
notification, indicating that the request is complete. If the request
is time-consuming, it must respond immediately to claim the request,
may send partial progress notifications while the work proceeds, and
must finish with a completion notification.
At this stage, result codes are taken directly from the HTTP
protocol. This will be clarified.
edesk.elvin.org : 1000
Event : "progress"
User : "user@example.com"
Workspace : "office"
Xid : "f1d2d2f924e986ac86fdf7b36c94bcdf32beec15"
Percent-Complete : 35
Result-Code : 0
Result-Text : "Fetching ... 35%"
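The claim / partial / completion sequence for a long-running request
could be generated as below. `progress_events` is a hypothetical
helper; Result-Code 0 while in flight follows the example above, and
the default completion code of 200 is an assumption based on the
remark about HTTP result codes.

```python
def progress_events(xid, user, workspace, steps, final_code=200):
    """Yield claim, partial progress and completion events in order."""

    def event(percent, code, text):
        return {
            "edesk.elvin.org": 1000,
            "Event": "progress",
            "User": user,
            "Workspace": workspace,
            "Xid": xid,
            "Percent-Complete": percent,
            "Result-Code": code,
            "Result-Text": text,
        }

    # Sent immediately, so that other tools know the request is taken.
    yield event(0, 0, "Claimed")
    for percent in steps:
        yield event(percent, 0, f"Working ... {percent}%")
    yield event(100, final_code, "Done")
```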
Some Example Apps
=================
This section explains the architecture (and re-architecture) of some
example applications using the edesk model.
Web Browser
-----------
A web browser often consists of multiple components:
- HTML viewer
- HTTP client (URL fetcher, with cache?)
- bookmark manager
- hypertext history viewer
- cookie jar
The plan is to split the browser into these components, connected by
Elvin. The viewer must accept commands to display a file of HTML, and
generate events when a hyperlink is clicked.
The HTTP fetcher needs to listen for requests to fetch URLs, and
notify their arrival. Some feedback on progress is also important.
The feedback of download progress need not necessarily be in the
"viewer" app, eg, there could be a separate progress-bar-app which
consumes progress events. This would have the benefit that each app
would not have to implement its own.
The bookmark editor should accept notifications of URLs, and should
emit requests to display an URL.
The history viewer is similar to the bookmark editor.
The cookie jar will need to respond to requests from the fetcher so
that the latter can provide the cookie data to the remote HTTPD. This
assumes that the fetcher is the only component with an outgoing
connection.
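The split can be tried out with a toy in-process bus standing in for
Elvin. Everything here is hypothetical: the `Bus` class, the "arrived"
event name, and the fetcher, which returns a placeholder body instead
of doing real HTTP.

```python
from collections import defaultdict


class Bus:
    """Toy publish/subscribe bus keyed on the Event field."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self.subscribers[event_name].append(handler)

    def notify(self, event):
        for handler in list(self.subscribers[event["Event"]]):
            handler(event)


bus = Bus()
displayed = []   # what the viewer has shown
history = []     # what the history viewer has recorded


def fetcher(event):
    # Stand-in fetch: announce arrival with a placeholder body.
    bus.notify({"Event": "arrived",
                "Location": event["Location"],
                "Object": b"<html>...</html>"})


bus.subscribe("request", fetcher)
bus.subscribe("request", lambda e: history.append(e["Location"]))
bus.subscribe("arrived", lambda e: displayed.append(e["Location"]))

# A click in the viewer becomes a request on the bus.
bus.notify({"Event": "request", "Action": "do",
            "Type": "text/uri-list",
            "Location": "http://example.com/"})
```

The viewer, fetcher and history components never call each other
directly; each only reacts to events it has subscribed to.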
Tickertape
----------
Tickertape could consist of:
scroller
the scroller should scroll pixmaps according to various config
parameters, and report mouse/keyboard events relative to those
pixmaps.
this would require events to: add/replace/remove/list/get pixmaps,
set/get scrolling config, report UI events, and some mgmt events
(startup, shutdown, etc).
history window
much like the scroller, but showing the threaded history.
chat message composer
composes a Tickertape-format chat event. much like the current chat
composer.
subscription editor
a subscription editor would provide a means to construct
subscription expressions, perhaps with some graphical help? or a
query by example? or ... a text window ;-)
it would talk to the message renderer.
message renderer
would subscribe to subscriptions from the editor, and generate
pixmap events for the scroller/history.
Progress Meter
--------------
This is a general purpose application. It consists of a number of
"progress bar" widgets, each of which can be identified by name,
colour, etc, based on the owning event stream. Once a task is
completed, some completion action could also be performed.
Imagine having all your downloads, cvs updates, compilations, expense
requests, SATS, general workflows being monitored in a single place,
so you don't have to return to check on their progress.
It could even handle things that weren't so much a percent-complete
type event stream, but just an ongoing stream of action events, like
software releases, web page changes, etc.
In some ways this is a generalisation of buildmon, with bits of
tickertape also.
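A minimal sketch of such a meter, consuming "progress" events and
keeping one bar per Xid. The `ProgressMeter` class and its text
rendering are hypothetical; a real tool would presumably draw widgets
and label the bars by name or colour.

```python
class ProgressMeter:
    """Track the latest Percent-Complete per transaction id (Xid)."""

    def __init__(self):
        self.bars = {}

    def consume(self, event):
        # Ignore anything that is not a progress event.
        if event.get("Event") != "progress":
            return
        self.bars[event["Xid"]] = event["Percent-Complete"]

    def render(self, width=20):
        """Return one text line per task, sorted by Xid."""
        lines = []
        for xid, percent in sorted(self.bars.items()):
            filled = width * percent // 100
            bar = "#" * filled + "." * (width - filled)
            lines.append(f"{xid[:8]} [{bar}] {percent}%")
        return lines
```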
Problems
========
Nesting of workspaces
it'd be nice to have a single URL fetcher or HTTP cache, for
example, that worked for an entire site. how do we scope the
requests so that this can work?
should such user-agnostic tools ignore the Workspace parameter when
looking for requests? probably?
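If the answer is yes, the matching rule for a tool might look like the
sketch below. Representing a site-wide tool by a Workspace of None is
an assumption made for illustration, as is the helper name `wants`.

```python
def wants(tool, request):
    """True if the tool should consider the request.

    A tool whose Workspace is None is treated as user-agnostic and
    matches requests from any workspace.
    """
    if tool["Type"] != request["Type"]:
        return False
    if tool["Action"] != request["Action"]:
        return False
    return (tool.get("Workspace") is None
            or tool["Workspace"] == request["Workspace"])
```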
26 nov 1999 : modified : $Date: 2002/09/07 08:02:04 $