# [tahoe-dev] [tahoe-lafs] #778: "shares of happiness" is the wrong measure; "servers of happiness" is better

tahoe-lafs trac at allmydata.org
Wed Jan 27 01:35:19 UTC 2010

```#778: "shares of happiness" is the wrong measure; "servers of happiness" is
better
---------------------------------------+------------------------------------
Reporter:  zooko                      |           Owner:  zooko
Type:  defect                     |          Status:  assigned
Priority:  critical                   |       Milestone:  eventually
Component:  code-peerselection         |         Version:  1.4.1
---------------------------------------+------------------------------------

Comment(by davidsarah):

I was sure that computing servers-of-happiness must correspond to some
well-understood mathematical problem, but I've now discovered which one --
it's called "maximum bipartite matching".

Since servers are distinct from shares, the relation between servers and
shares corresponds to a [http://en.wikipedia.org/wiki/Bipartite_graph
bipartite graph].

Any bijective function from servers to shares that is included in this
relation is called a ''matching'' of the bipartite graph.

The servers-of-happiness number is the maximum size of any such matching.
Intuitively, that's because a matching gives a set of H servers that have
at least H distinct shares between them.

So, apply a maximum bipartite matching algorithm to the sharemap (to find
''some'' maximum matching), take its size, and you're done.

(Actually, the servers-of-happiness value as we originally defined it is
max(k-1, max_bipartite_matching_size). But the max_bipartite_matching_size
works as a measure of the quality of the server->share mapping even when
it is less than k-1. In that case there won't be enough shares to recover
the file, of course.)

I doubt that the algorithm outlined by Zooko ''always'' computes
max_bipartite_matching_size correctly. That's because it is a greedy
algorithm -- it works by removing edges from the bipartite graph that are
''not'' in some maximum matching, and assumes that it's possible to make a
correct local decision of which edge to remove. There are definitely
greedy algorithms that result in a near-maximum matching, but it won't
always be maximum.

Algorithms to find maximum bipartite matchings are really simple, just a
few lines of code, and have reasonable algorithmic complexity. I will try
to find one and translate it to Python.

--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/778#comment:162>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid
```