[tahoe-dev] web "control panel", static server selection UI

Mon Jan 24 18:44:18 UTC 2011

I had an idea for addressing #467 (static-server selection) this
morning, one so simple and easy to use that I don't know why I hadn't
thought of it before. (Actually I do: that's in the second part of this
message.)

The goal of #467 is to give users a way to directly control which
servers they use for uploads (I'll ignore downloads for now). Your
upstream bandwidth is limited, and you might be paying for storage, so
you want to spend those limited resources on the servers that are most
likely to give you the shares back later, rather than risk them on
unknown/unreliable servers.

A related way to improve the reliability of uploads is to not appear to
succeed when, in fact, we don't have enough servers to achieve our
reliability goals. The "servers-of-happiness" metric is a numeric
criterion for this which, so far, has proved difficult for most users to
understand.

I'd previously been thinking of a tahoe.cfg syntax for writing down a
list of serverids, which you might copy-and-paste from a central web
page, or something. But this morning's idea was that we should be doing
this graphically, through a "control panel" on the web UI, with
checkboxes next to each server to mean "I'm allowed to use this server"
(aka "allowed") and "disable uploads unless this server is connected"
(aka "required").

[see the attached picture for the concept]

The default setting for all new servers is "allowed=yes" and
"required=no", which behaves just like the existing code. There is an
extra button at the bottom to control what defaults a new server will
get (as well as a tahoe.cfg option to set it before the node even starts
up): if you set it to "allowed=no", then we won't ever use a server that
the user hasn't explicitly enabled.

For static+stable grids, where you can enumerate all the servers that
you expect to use (and they should be up all the time, like the
allmydata.com grid or a personal grid), you can start the node, wait a
couple seconds to connect to everything, then set "required=yes" for all
of them. From that point on, if one or more of the servers drops out,
the control panel will show "uploads disabled: server 'xyz' not
connected", and uploads will fail loudly.
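
To make the allowed/required semantics concrete, here is a rough sketch
of the check the uploader might run before each upload. Everything in it
(the ServerPolicy class, the upload_status function, plain strings as
serverids) is a hypothetical name made up for illustration, not existing
Tahoe code.

```python
from dataclasses import dataclass


@dataclass
class ServerPolicy:
    # Hypothetical per-server settings mirroring the two checkboxes.
    allowed: bool = True    # "I'm allowed to use this server"
    required: bool = False  # "disable uploads unless this server is connected"


def upload_status(policies, connected, default=None):
    """Return (usable_serverids, problems).

    policies:  dict mapping serverid -> ServerPolicy (the checkbox state)
    connected: set of serverids we currently have connections to
    default:   policy applied to servers with no explicit entry
    """
    if default is None:
        default = ServerPolicy()  # allowed=yes, required=no
    missing = sorted(sid for sid, p in policies.items()
                     if p.required and sid not in connected)
    if missing:
        # Any missing required server disables uploads entirely.
        problems = ["uploads disabled: server '%s' not connected" % sid
                    for sid in missing]
        return [], problems
    usable = sorted(sid for sid in connected
                    if policies.get(sid, default).allowed)
    return usable, []
```

Passing default=ServerPolicy(allowed=False) corresponds to the
"allowed=no" default: a newly announced server is ignored until the user
explicitly enables it.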

This same page should be used to control encoding parameters and
servers-of-happiness. Another awesome feature would be a little
simulator: a playground where you can set k/h/N and which servers are up
or down, and then pretend to do an upload, and see how uploaded shares
would have been placed (and whether it would have succeeded or not). I
think this would be a good way to learn what exactly
servers-of-happiness does, by experimentation.
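
As a starting point for such a playground, here is a deliberately
simplified sketch. It assumes a fresh upload where shares are dealt out
round-robin over the servers that are up, and it counts happiness as the
number of distinct servers holding at least one share. The real uploader
orders servers per-file and handles shares that already exist, so treat
simulate_upload and its behavior as an illustration only, not as what
the code does today.

```python
def simulate_upload(k, h, n, servers_up):
    """Pretend to upload n shares to the servers in servers_up.

    Returns (placement, happiness, succeeded) where placement maps
    serverid -> list of share numbers. k is only validated here; it
    matters for download, not for this placement toy.
    """
    if not (1 <= k <= n and 1 <= h <= n):
        raise ValueError("need 1 <= k <= N and 1 <= h <= N")
    up = sorted(servers_up)
    placement = {}
    if up:
        for sharenum in range(n):
            # Simplifying assumption: plain round-robin placement.
            server = up[sharenum % len(up)]
            placement.setdefault(server, []).append(sharenum)
    # Each share lives on exactly one server, so the number of distinct
    # servers used is the happiness value in this simplified model.
    happiness = len(placement)
    return placement, happiness, happiness >= h
```

For example, with k=3, h=7, N=10 and only four servers up, every share
gets placed but the simulated upload is reported as a failure, because
only four distinct servers hold shares.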

== Private Control Panel ==

The reason that I didn't think of this before is that, so far,
BASEDIR/tahoe.cfg has been the only protected control channel we've had
for the Tahoe node, and you can't get nice webby UI displays by editing
a text file. tahoe.cfg is nice and safe: for an attacker to control it,
they need write access to your local filesystem, and that's equivalent
to full control over your client node anyways. Letting arbitrary HTTP
clients have access safely requires more work, some sort of
access-control mechanism.

We refuse to use passwords and cookies to protect web resources because
ambient browser authorities would open us up to confused-deputy/CSRF
attacks. We use secret URLs (built around filecaps) to control file
access, but we don't yet have a generalized access-control mechanism for
web resources which aren't tied to a particular Tahoe filecap/dircap.

Some other resources which would benefit from such access control:

 * upload authority: several folks don't want the root Welcome page to
   allow visitors to upload unlinked files (and consume arbitrary
   storage space): protecting those upload URLs would work even better
   than restricting them to living off of mutable directory writecap
   URLs.

 * Accounting-based storage authority (closely related to upload
   authority): when storage space is tied to client private keys, we'll
   need a way to tell the client which key to use. Ambient storage
   authority is easy (put a private key in BASEDIR/private), but that
   makes it awkward to allow multiple users to share a client node, and
   doesn't fit our user-provides-all-authority style.
   - In addition to file-upload controls, we'd like a private page to
     show you how much space you're using. Imagine a URL like
     /accounts/$PRIVKEY/usage

 * server control: we want a page to show how much space other users are
   consuming on your server, and to allow/refuse/delete those shares. To
   encourage reciprocity ("I'll store your shares because you're storing
   my shares"), this display should correlate inbound usage with our
   outbound usage, so it may need to be tied to the storage-authority
   private key somehow.

So, I need your help! I think we need to design a reasonable web access
control model for non-filecap single-node maybe-Account-based resources.
My current idea is:

 * start with node-wide resources, deferring Account-based resources a
   bit longer

 * the node writes $BASEDIR/private/web.secret with an unguessable
   URL-safe string at startup, known below as $SECRET

 * the webapi hosts a control-panel page at /control/$SECRET/
   - POSTs go to e.g. /control/$SECRET/set-encoding

 * "tahoe webopen --control-panel" reads $SECRET and uses it to
   construct the secret URL, just like it does with "tahoe webopen
   ALIAS:" now.

 * the "/control/$SECRET/servers/" control panel shows a list of known
   servers (everything announced by the introducer, plus everything
   recorded on our allowed/required lists). The lists are stored in flat
   files (one serverid per line, maybe with extra whitespace-separated
   fields) in $BASEDIR/servers-allowed and $BASEDIR/servers-required,
   and each time you change the checkboxes, those files are re-written.

 * the webui root "welcome page" gets an extra line that tells the user
   to run "tahoe webopen --control-panel" to get server controls.
   Otherwise it shows a read-only view of non-private server state.
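
The secret-file and URL-matching parts of the list above could look
roughly like the sketch below. It uses only the Python standard library;
load_or_create_secret and control_subpath are hypothetical helper names,
and the real webapi would hook the check into its own resource tree
rather than parse path strings like this.

```python
import hmac
import os
import secrets


def load_or_create_secret(basedir):
    """Return $SECRET, creating BASEDIR/private/web.secret if missing."""
    private = os.path.join(basedir, "private")
    os.makedirs(private, exist_ok=True)
    path = os.path.join(private, "web.secret")
    if not os.path.isfile(path):
        # O_EXCL plus mode 0600: never overwrite, owner-only access.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "w") as f:
            f.write(secrets.token_urlsafe(32) + "\n")
    with open(path) as f:
        return f.read().strip()


def control_subpath(url_path, secret):
    """Map "/control/$SECRET/rest" to "rest", or None if not authorized."""
    if not secret:
        return None
    parts = url_path.split("/")  # "/control/S/x" -> ["", "control", "S", "x"]
    if len(parts) < 3 or parts[0] != "" or parts[1] != "control":
        return None
    # Constant-time comparison so timing doesn't leak the secret.
    if not hmac.compare_digest(parts[2].encode("utf-8"),
                               secret.encode("utf-8")):
        return None
    return "/".join(parts[3:])
```

A POST to /control/$SECRET/set-encoding would then resolve to the
subpath "set-encoding", while any request with the wrong secret gets
None and can be answered with a plain 404.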

The $SECRET value could be long-lived or short-lived. On one end of the
spectrum, we could create it exactly once, the first time the node boots
and os.path.isfile($BASEDIR/private/web.secret) is False. We could also
regenerate it each time the node boots, or every 10 minutes, or after 10
minutes of inactivity (no access to /control/* URLs). We could also say
that the "tahoe webopen --control-panel" command actually *writes* a new
secret into $BASEDIR/private/web.secret, and the webapi's /control/*
handler reads that file on every access, only allowing it if the "*"
component matches the contents of web.secret (this wouldn't work so well
with "disconnected" bare ~/.tahoe/ setups, where you've copied node.url
from a live $BASEDIR into a CLI --node-directory= stub).
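
To illustrate just one point on that spectrum, here is a sketch of the
"expire after 10 minutes of inactivity" variant, kept in memory for
simplicity. The IdleExpiringSecret class is hypothetical, and the
600-second default and the injectable clock are assumptions made to keep
the example testable, not a proposal for the final design.

```python
import hmac
import secrets
import time


class IdleExpiringSecret:
    """A control-panel secret that dies after a period of inactivity."""

    def __init__(self, idle_timeout=600.0, clock=time.monotonic):
        self._timeout = idle_timeout
        self._clock = clock
        self._secret = None
        self._last_used = None

    def issue(self):
        # What "tahoe webopen --control-panel" would trigger in this model.
        self._secret = secrets.token_urlsafe(32)
        self._last_used = self._clock()
        return self._secret

    def check(self, candidate):
        if self._secret is None:
            return False
        now = self._clock()
        if now - self._last_used > self._timeout:
            self._secret = None  # idled too long: the old URL is dead
            return False
        if not hmac.compare_digest(candidate.encode("utf-8"),
                                   self._secret.encode("utf-8")):
            return False  # wrong guesses do not refresh the idle timer
        self._last_used = now
        return True
```

Once the secret has expired, the only way back in is to call issue()
again, which is the "re-run tahoe webopen" cost discussed below.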

The goal of non-eternal secrets, of course, is to mitigate the
unfortunate tendency of web browsers to broadcast any URLs they can see,
via search-engine toolbars, anti-phishing servers, Referer headers,
JS-visible history/frame.location access, etc. I think using "tahoe
webopen" to start the process is pretty easy, and ties web access to
local filesystem access pretty cleanly, but I'm not sure how to make the
tradeoff between needing to re-run "tahoe webopen" if you've idled too
long (also not being able to bookmark the control panel), and protecting
against authority leaks via the browser.

So... what do you think? What form of (short- or long-term) secret-URL
scheme would you feel comfortable using as protection for things like
which-servers-should-we-use controls?

cheers,
 -Brian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: control-panel-demo.png
Type: image/png
Size: 180905 bytes
Desc: not available
URL: <http://tahoe-lafs.org/pipermail/tahoe-dev/attachments/20110124/68a290f8/attachment.png>