Help understanding an UploadUnhappinessError

Kyle Markley kyle at arbyte.us
Sun Dec 7 22:19:34 UTC 2014


Hello,

I've started running into a problem on my grid (1.10.0): I'm getting an 
UploadUnhappinessError that seems to happen only when uploading very 
large files (over a gigabyte).  My problem is that I don't have enough 
information to understand the cause of the error.

My encoding is 4/4/10 (needed/happy/total).  All 5 storage nodes have 
plenty of space.  This happens when uploading a brand new file, so there 
are no shares already existing on the grid.
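
For a sense of scale, here is a rough sketch of the share arithmetic for this encoding. It assumes each share is about 1/needed of the file and ignores per-share overhead such as hash trees, so real shares are slightly larger. The helper name share_layout is made up for illustration and is not part of Tahoe-LAFS.

```python
# Rough share-size arithmetic for a needed-of-total erasure-coded upload.
# Assumption: share size ~= file_size / needed; per-share overhead is ignored.

def share_layout(file_size, needed, total, servers):
    share_size = -(-file_size // needed)   # ceiling division
    per_server = -(-total // servers)      # shares on the busiest server if spread evenly
    return {
        "share_size": share_size,
        "total_stored": share_size * total,
        "expansion": total / needed,
        "max_shares_per_server": per_server,
    }

GIB = 1024 ** 3
layout = share_layout(GIB, needed=4, total=10, servers=5)
for key, value in layout.items():
    print(key, value)
```

Under these assumptions a 1 GiB file turns into ten shares of roughly 256 MiB each, about 2.5 GiB in total, with two shares per server when they are spread evenly over 5 servers.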

When this error occurs, it has a side-effect of killing any other 
operation occurring on the same node at the same time.  (If I'm doing 
two "tahoe backup" operations, they /both/ die.)  But the problem is 
100% reproducible with a simple "tahoe put" operation running by itself.

On the WUI recent operations page I see:
Contacting Servers [xxx] (second query), 0 shares left..
If I click through I see hashing is 100% and the other progress items 
are at 0%.

In twistd.log I simply see:
2014-12-07 14:09:22-0800 [-] disconnectTimeout, no data for 24 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 23 seconds
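
To check whether these timeouts cluster together, the lines can be tallied with a short script. This is only a sketch: the regular expression is fitted to the five lines quoted above and may not match other twistd.log formats, and summarize_timeouts is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as:
#   2014-12-07 14:09:22-0800 [-] disconnectTimeout, no data for 24 seconds
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})[-+]\d{4} \[[^\]]*\] "
    r"disconnectTimeout, no data for (?P<idle>\d+) seconds"
)

def summarize_timeouts(lines):
    """Return (timeouts per one-second timestamp, list of idle durations)."""
    per_second = Counter()
    idle = []
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m:
            per_second[m.group("stamp")] += 1
            idle.append(int(m.group("idle")))
    return per_second, idle

sample = """\
2014-12-07 14:09:22-0800 [-] disconnectTimeout, no data for 24 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 23 seconds
"""

per_second, idle = summarize_timeouts(sample.splitlines())
print(dict(per_second), idle)
```

On the excerpt above this counts five timeouts within two seconds, the same number as there are storage nodes. That would be consistent with every storage connection going quiet at about the same time, though the log alone does not prove it.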

There is no incident report.

Checking one of the remote storage nodes, there is nothing in the 
twistd.log at all, and no incident report.

Here's the CLI traceback:

Error: 500 Internal Server Error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/foolscap-0.6.4-py2.7.egg/foolscap/call.py", line 677, in _done
    self.request.complete(res)
  File "/usr/local/lib/python2.7/site-packages/foolscap-0.6.4-py2.7.egg/foolscap/call.py", line 60, in complete
    self.deferred.callback(res)
  File "/usr/local/lib/python2.7/site-packages/Twisted-13.0.0-py2.7-openbsd-5.3-amd64.egg/twisted/internet/defer.py", line 380, in callback
    self._startRunCallbacks(result)
  File "/usr/local/lib/python2.7/site-packages/Twisted-13.0.0-py2.7-openbsd-5.3-amd64.egg/twisted/internet/defer.py", line 488, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/lib/python2.7/site-packages/Twisted-13.0.0-py2.7-openbsd-5.3-amd64.egg/twisted/internet/defer.py", line 575, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python2.7/site-packages/allmydata/immutable/upload.py", line 604, in _got_response
    return self._loop()
  File "/usr/local/lib/python2.7/site-packages/allmydata/immutable/upload.py", line 455, in _loop
    return self._failed("%s (%s)" % (failmsg, self._get_progress_message()))
  File "/usr/local/lib/python2.7/site-packages/allmydata/immutable/upload.py", line 617, in _failed
    raise UploadUnhappinessError(msg)
allmydata.interfaces.UploadUnhappinessError: shares could be placed or 
found on only 1 server(s). We were asked to place shares on at least 4 
server(s) such that any 4 of them have enough shares to recover the 
file. (placed all 10 shares, want to place shares on at least 4 servers 
such that any 4 of them have enough shares to recover the file, sent 6 
queries to 5 servers, 2 queries placed some shares, 4 placed none (of 
which 0 placed none due to the server being full and 4 placed none due 
to an error))
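
The happiness count in that message can be illustrated with a small sketch. It assumes the servers-of-happiness measure is the size of a maximum matching between servers and the shares they hold, which is how the Tahoe-LAFS documentation describes it. This is not Tahoe's own code; the function name and the server labels are made up.

```python
def servers_of_happiness(placements):
    """placements maps a server id to the set of share numbers it holds.

    Returns the size of a maximum bipartite matching between servers and
    shares, found with simple augmenting paths (Kuhn's algorithm).
    """
    match = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        for share in placements[server]:
            if share in seen:
                continue
            seen.add(share)
            # Take a free share, or re-route the server that already has it.
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(1 for server in placements if try_assign(server, set()))

# All 10 shares on a single server: happiness 1, below the required 4.
one_server = {"A": set(range(10))}

# The same 10 shares spread over 5 servers, two each: happiness 5.
five_servers = {name: {2 * i, 2 * i + 1} for i, name in enumerate("ABCDE")}

print(servers_of_happiness(one_server), servers_of_happiness(five_servers))
```

Under that assumption, placing all 10 shares succeeds in the sense that the data is stored, but with every share on one server the happiness is 1, so an upload that requires 4 is rejected.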

Incidentally, that "N placed none due to an error" is a very frustrating 
message.  It tells me something is wrong but gives me absolutely no 
indication of what I might do about it.  I wish I had some information 
about the error(s)!

Any ideas?

-- 
Kyle Markley



