[tahoe-dev] GSoC Share Rebalancing and Repair Proposal

Zooko O'Whielacronx zookog at gmail.com
Tue Apr 23 20:18:47 UTC 2013


Dear Mark:

Yay! Good proposal! I'm excited about the prospect of getting some
focused work on improving repair and rebalancing. There are a lot of
different ways that the functionality could be improved.

I'm not sure, but I have the _feeling_ that the work on #1382 that Kevan
Carstensen started may be sort of a critical basis for successful work
on related tickets. See his github branch for the code he's written:
https://github.com/isnotajoke/tahoe-lafs/commits/ticket1382

Kevan: do you think Mark could profitably work on other repair and
rebalancing tickets while leaving your #1382 branch alone? Or do the
two of you, Kevan and Mark, think it might be a good idea to have Mark
take over Kevan's branch and finish it?

https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1382# immutable peer
selection refactoring and enhancements

Mark, would you be able to attend the coming Tahoe-LAFS Weekly Dev
Chat on Thursday at 15:30Z (8:30am Pacific)?

https://tahoe-lafs.org/trac/tahoe-lafs/wiki/WeeklyMeeting

Regards,

Zooko

On Sun, Apr 21, 2013 at 6:17 PM, Mark Berger <mjberger at stanford.edu> wrote:
> Hi everyone, over the last few days I have been working on a proposal for
> GSoC to address share rebalancing and repair. I've copied the proposal below
> (with some of my personal contact information redacted :] ). If you see
> something wrong in my proposal, have any questions, or have any suggestions,
> please let me know.
>
> Thanks!
> Mark Berger
>
>
>
> Organization: Tahoe-LAFS
> =============
>
> Student Info:
> =============
> Mark J. Berger
> Time Zone: Pacific
> Time Zone during GSoC: Eastern
> IRC Handle: Mark_B at irc.freenode.net
> Github: markberger
> Email: mjberger [at] stanford.edu
>
> University Info:
> ================
> University: Stanford University
> Major: Computer Science
> Current Year: Freshman
> Expected Graduation: June 2016
> Degree: BS
>
> About Me:
> =========
>
> I'm a freshman at Stanford University studying computer science. Right now
> I am finishing up my core requirements and will be pursuing the artificial
> intelligence track or the systems track within the major. My interests lie
> in machine learning, large distributed systems, and web applications.
>
> I began programming during an internship at Four Directions Productions in
> 2011, where I learned how to use Python in conjunction with Maya. The
> majority of my college coursework has been in C or C++ on Linux with a
> little Java. This has made me familiar with tools such as GCC, GDB and
> Valgrind.
>
> While I have never contributed to an open source project before, I am
> making an effort to learn about Tahoe-LAFS and become familiar with its
> code base and community. Using a virtual machine, I've successfully
> installed Tahoe on an Ubuntu server and connected to the Public Test Grid.
> I've also subscribed to the mailing list, connected to the IRC channel, and
> successfully pulled the code off of Github. While I know my lack of
> experience in open source is a shortcoming, I am completely dedicated to
> using GSoC's Community Bonding Period to overcome any obstacles before the
> official coding period begins.
>
>
> Project Title: Share Rebalancing and Repair in Tahoe-LAFS
> =========================================================
>
> Abstract:
> =========
>
> The "servers of happiness" algorithm has improved Tahoe's ability to
> maximize redundancy by ensuring that a given subset of all shares is placed
> on distinct nodes. However, this process is not used to upload mutable
> files; they still rely on the old "shares of happiness" algorithm, which
> has well-documented downsides. Additionally, file repair does not
> necessarily redistribute files to new servers when nodes have been added.
> This creates issues in terms of redundancy and long-term server health.
> Implementing proper file rebalancing for all file types during file upload,
> modification, and repair will enhance the reliability of the Tahoe system
> and take full advantage of erasure encoding.
>
>
> Deliverables:
> =============
>
> 1. Mutable files automatically distribute over nodes according to the
> "servers of happiness" algorithm whenever uploaded, modified, or repaired
> (ticket #232).
>
> 2. Repair will redistribute files according to the "servers of happiness"
> algorithm and only renew the appropriate leases (ticket #699).
>
> 3. Documentation changed to correctly reflect the new feature set.
>
> 4. Create a test suite to be used on a network of virtual machines in order
> to test file rebalancing.
>
>
> Time Line:
> ==========
>
> Note: I would like to have a code review session with my mentor on a weekly
> basis at minimum, especially at the beginning of the program. Those sessions
> are left off the time line to avoid redundancy.
>
> May 27th - June 17th (Community Bonding):
> -----------------------------------------
>
> - Remain available via IRC and email
> - Closely follow the development email list
> - Isolate and understand the classes which pertain to the current
>  implementations of the servers of happiness algorithm to determine which
>  parts can be reused.
> - Discuss with my mentor(s) and the community to determine whether code
>  should be refactored to apply to both immutable and mutable files or if
>  the two need to remain distinct for design reasons
> - Discuss with my mentor(s) and the community the best way to go about
>  testing file rebalancing.
>
> Note: June 3rd through the 14th is my final exams period and I will be
> packing so that I can go home to Upstate NY. Since I will be very busy
> during this time, not all of the above may be accomplished in time to start
> coding. My classes do not resume until September 23rd, so I can push my
> time line back a week or two if need be.
>
>
> Jun 17th - 28th
> ---------------
> - Implement "servers of happiness" for mutable files during the initial
>  file upload and file modification
>
> Jul 1st - 12th
> --------------
> - Thoroughly document code
> - Write test scripts for larger networks
> - Test code using virtual machines or predetermined test scheme from CBP
>
> Jul 15th - 19th
> ---------------
> - Clean up test scripts
> - Thoroughly document test scripts
> - Fix minor bugs
> - Continue to consider and test edge cases
>
> Note: "Servers of happiness" for mutable files should be in a mergeable state
>       with tests before the midway point on July 29th.
>
> Jul 22nd - Aug 2nd
> ------------------
>
> - Modify repair code to use the "servers of happiness" algorithm for both
>  immutable and mutable files. This should be accomplished by utilizing the
>  existing code from the initial upload process
>
> - Edit the lease renewal mechanism to ensure that a minimal amount of lease
>  renewal is done during rebalancing
>
> Aug 5th - 16th
> --------------
>
> - Thoroughly document code
> - Extend tests for mutable files to encompass rebalancing during file repair
>
> Aug 19th - 23rd
> ---------------
>
> - Clean up test scripts
> - Thoroughly document test scripts
> - Fix minor bugs
> - Continue to consider and test edge cases
>
> Aug 26th - 30th
> ---------------
>
> - Change documentation to reflect additional features
>
>
> The weeks of September 1st and 8th are left blank for flexibility.
>
>
> Possible projects if the above are accomplished ahead of schedule:
> ==================================================================
>
>  - Detect if disk(s) on a server are in a near-failure state. If the disk(s)
>    are close to failing, notify the administrator, and slowly begin
>    redistributing shares to the other storage nodes (tickets #481 and #864).
>
>  - Let the user specify a maximum storage capacity for a given storage node
>    based on folder size instead of free space left on the machine.
>
>  - Tahoe backend for Google Drive (ticket #1831).
>
>
> _______________________________________________
> tahoe-dev mailing list
> tahoe-dev at tahoe-lafs.org
> https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
>
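For readers unfamiliar with the "servers of happiness" measure that the
quoted proposal's abstract and deliverables refer to, here is a minimal
sketch. It assumes the usual description of the measure: the size of a
maximum matching between shares and the servers holding them, so each
server counts for at most one distinct share. The function name, the
sharemap layout, and the server ids are hypothetical illustrations, not
Tahoe-LAFS's actual code or API.

```python
def servers_of_happiness(sharemap):
    """Return the size of a maximum matching between shares and servers.

    sharemap is a hypothetical layout: share number -> set of server ids
    that hold (or would hold) that share.
    """
    match = {}  # server id -> share number currently matched to it

    def try_assign(share, visited):
        # Standard augmenting-path search for bipartite matching.
        for server in sharemap[share]:
            if server in visited:
                continue
            visited.add(server)
            # Take a free server, or move its current share elsewhere.
            if server not in match or try_assign(match[server], visited):
                match[server] = share
                return True
        return False

    return sum(1 for share in sharemap if try_assign(share, set()))


# Four shares, but three of them are reachable only through servers A and B,
# so only three servers can each be credited with a distinct share.
placement = {0: {"A"}, 1: {"A"}, 2: {"A", "B"}, 3: {"C"}}
print(servers_of_happiness(placement))
```

Under this assumption, an upload or repair would count as "happy" when the
returned value is at least the configured happiness threshold. Counting
shares alone would report four here even though the loss of server A takes
three shares with it, which is the kind of gap that rebalancing during
repair is meant to close.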


