[tahoe-dev] Tahoe-LAFS Weekly Dev Chat, 2012-10-02

Zooko Wilcox-OHearn zooko at leastauthority.com
Tue Oct 2 17:18:07 UTC 2012

Tahoe-LAFS Weely Dev Chat, 2012-10-02

in attendance: Zooko (scribe), David-Sarah, Andrew

CAVEAT LECTOR; Some of this was added by Zooko after the chat ended.

Andrew Miller has found three bugs in #1240: 1. There's a bug. 2. A
different code path partially compensates for it sometimes
(David-Sarah calls this "a masking bug"). 3. The tests aren't smart
enough to realize that that the task has been done wrong and masked
rather than done right. Unclear if Andrew will fix #1240 before
David-Sarah stops accepting patches into Tahoe-LAFS v1.10.

Andrew: "Dynamic Merkle Tree" -- a Red-Black Merkle Tree

deterministic balancing, good asymptotic cost even in the worst case,
good small constant

Any data structure can be Merklized by replacing the links with hashes.

Take MDMF and replace the Merkle Tree with a Red-Black Merkle Tree.
Now you have an LDMF! (Except the backend needs to be able to store,
find, and insert data blocks efficiently. Fortunately
LeastAuthority.com's new Cloud Backend can do that! Well actually
maybe it can't because it currently identifies the blocks by their
*block number*. But it is a lot closer than the current disk backend,
which stores the each block under its offset in a single share file.)

Andrew is trying to work out how to do the physical storage and
addressing, apart from the logical -- Merkle-Tree-authenticated
integrity-checking. He's looking at B trees or B+ trees. Zooko thinks
this may be closely related to tickets #1543, #1687.

Zooko is excited because of the history here. Brian and he thoughtof
the design of LDMF, then thought it was too hard and thought of the
design of MDMF, then thought that was too hard and invented SDMF,
which was stupid enough that they could implement it. Much later,
Kevan came along and, not realizing how hard MDMF was, decided to do
it for a summer project (Google Summer of Code) and worked really hard
and well on it for about two years, which is why we now have MDMF. So
why is Zooko excited? Because hopefully Andrew doesn't realize how
hard LDMF is!

Zooko recommended N. Askitis's dissertation on cache-oriented data structures.

Andrew has at least two other ideas, and is talking about putting all
three of them into his dissertation. Zooko thinks that's three
dissertations. Maybe Andrew should get three PhD's.

If you add access control at the sub-file level of granularity, so
that you can grant access to only *part* of an LDMF, then what's the
difference between a directory structure and an LDMF? You could store
a deep directory structure in an LDMF. The question is how do you
identify sub-parts of an LDMF in order to talk about them with someone
else and grant someone else access to them. You don't want to grant
them access to a byte span! Like, here's read access to bytes 1000
through 3000 of this file, because then you could no longer safely
insert and delete things in that file.

Brian once suggested packing the share data of multiple files and
directories together for more efficient download and storage -- a
"Virtual CD" -- #204, #1029.

Once we finish the next RAIC Milestone, David-Sarah will focus on
Tahoe-LAFS v1.10 release. Zooko may not be able to focus on that much
due to ongoing LeastAuthority.com work. LeastAuthority.com is bidding
for another contract.

https://tahoe-lafs.org/trac/tahoe-lafs/ticket/204# "virtual CDs"
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1029# download a subtree
as an archive
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1240# remove
ResponseCache in favour of MDMFSlotReadProxy's cache
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1543# rearrange share
format to make downloads faster
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1687# store copy of
block-hash-chain with each block


Zooko Wilcox-O'Hearn

Founder, CEO, and Customer Support Rep


More information about the tahoe-dev mailing list