Help understanding an UploadUnhappinessError
Kyle Markley
kyle at arbyte.us
Sun Dec 7 22:19:34 UTC 2014
Hello,
I've started running into a problem on my grid (1.10.0) where I'm
getting an UploadUnhappinessError that seems to only happen when
uploading very large files, i.e. over a gigabyte. My problem is that I
don't have enough information to understand the cause of the error.
My encoding is 4/4/10 (shares.needed/shares.happy/shares.total). All 5
storage nodes have plenty of space.
This happens when uploading a brand new file, so there are no shares
already existing on the grid.
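
To illustrate what the "happy" number in an encoding like 4/4/10 measures, here is a minimal sketch. It assumes happiness is the size of a maximum matching between servers and distinct shares, which is how the Tahoe-LAFS documentation describes "servers of happiness". The function name, server names, and share layouts are hypothetical, and this is not the code in allmydata.

```python
def servers_of_happiness(placements):
    """Size of a maximum matching between servers and shares.

    placements maps a server name to the set of share numbers it holds.
    Simplified illustration only (Kuhn's augmenting-path algorithm).
    """
    match = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        # Try to give this server a share of its own, re-homing the
        # current owner of a share if that owner has another option.
        for share in placements[server]:
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(1 for server in placements if try_assign(server, set()))


# Hypothetical layouts for 10 shares on a 5-server grid.
spread = {"s%d" % i: {2 * i, 2 * i + 1} for i in range(5)}
one_server = {"s0": set(range(10))}

print(servers_of_happiness(spread))      # 5
print(servers_of_happiness(one_server))  # 1
```

Under this assumption, two shares on each of five servers scores 5 and satisfies happy=4, while all 10 shares on a single server scores 1 and does not.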
When this error occurs, it has a side-effect of killing any other
operation occurring on the same node at the same time. (If I'm doing
two "tahoe backup" operations, they /both/ die.) But the problem is
100% reproducible with a simple "tahoe put" operation running by itself.
On the WUI recent operations page I see:
Contacting Servers [xxx] (second query), 0 shares left..
If I click through I see hashing is 100% and the other progress items
are at 0%.
In twistd.log I simply see:
2014-12-07 14:09:22-0800 [-] disconnectTimeout, no data for 24 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds
2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 23 seconds
There is no incident report.
Checking one of the remote storage nodes, there is nothing in the
twistd.log at all, and no incident report.
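
When comparing timestamps across nodes, the disconnectTimeout lines can be grouped with a short script. This is only a sketch: the regular expression is an assumption based on the five log lines quoted above, and the function name is hypothetical.

```python
import re

# Pattern inferred from the quoted twistd.log lines (an assumption).
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})[-+]\d{4} \[-\] "
    r"disconnectTimeout, no data for (\d+) seconds"
)


def summarize_timeouts(lines):
    """Return {timestamp: [idle seconds, ...]} for disconnectTimeout lines."""
    result = {}
    for line in lines:
        m = LINE.match(line.strip())
        if m:
            result.setdefault(m.group(1), []).append(int(m.group(2)))
    return result


sample = [
    "2014-12-07 14:09:22-0800 [-] disconnectTimeout, no data for 24 seconds",
    "2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds",
    "2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds",
    "2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 21 seconds",
    "2014-12-07 14:09:23-0800 [-] disconnectTimeout, no data for 23 seconds",
]

summary = summarize_timeouts(sample)
for stamp in sorted(summary):
    print(stamp, summary[stamp])
```

On the quoted lines this groups five timeouts within two seconds, the same count as the number of storage nodes on the grid.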
Here's the CLI traceback:
Error: 500 Internal Server Error
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/foolscap-0.6.4-py2.7.egg/foolscap/call.py", line 677, in _done
    self.request.complete(res)
  File "/usr/local/lib/python2.7/site-packages/foolscap-0.6.4-py2.7.egg/foolscap/call.py", line 60, in complete
    self.deferred.callback(res)
  File "/usr/local/lib/python2.7/site-packages/Twisted-13.0.0-py2.7-openbsd-5.3-amd64.egg/twisted/internet/defer.py", line 380, in callback
    self._startRunCallbacks(result)
  File "/usr/local/lib/python2.7/site-packages/Twisted-13.0.0-py2.7-openbsd-5.3-amd64.egg/twisted/internet/defer.py", line 488, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/lib/python2.7/site-packages/Twisted-13.0.0-py2.7-openbsd-5.3-amd64.egg/twisted/internet/defer.py", line 575, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python2.7/site-packages/allmydata/immutable/upload.py", line 604, in _got_response
    return self._loop()
  File "/usr/local/lib/python2.7/site-packages/allmydata/immutable/upload.py", line 455, in _loop
    return self._failed("%s (%s)" % (failmsg, self._get_progress_message()))
  File "/usr/local/lib/python2.7/site-packages/allmydata/immutable/upload.py", line 617, in _failed
    raise UploadUnhappinessError(msg)
allmydata.interfaces.UploadUnhappinessError: shares could be placed or
found on only 1 server(s). We were asked to place shares on at least 4
server(s) such that any 4 of them have enough shares to recover the
file. (placed all 10 shares, want to place shares on at least 4 servers
such that any 4 of them have enough shares to recover the file, sent 6
queries to 5 servers, 2 queries placed some shares, 4 placed none (of
which 0 placed none due to the server being full and 4 placed none due
to an error))
Incidentally, that "N placed none due to an error" is a very frustrating
message. It tells me something is wrong but gives me absolutely no
indication of what I might do about it. I wish I had some information
about the error(s)!
Any ideas?
--
Kyle Markley