[tahoe-dev] help with XML parsing problem

Lele Gaifax lele at nautilus.homeip.net
Sat Jun 21 17:48:26 UTC 2008


On Fri, 20 Jun 2008 22:10:57 -0700
zooko <zooko at zooko.com> wrote:

> xml.parsers.expat.ExpatError: not well-formed (invalid token): line  
> 4317, column 39
> 
> I know that you have fixed problems like this before in parsing  
> darcs's output, so could you give me a hint as to what the problem
> is?

Most likely it's an encoding problem: maybe that buildslave
operates in, say, latin1 while the others use utf-8, or the other way
around.
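
To illustrate the kind of failure I suspect, here is a minimal sketch in
Python. The XML fragment and the helper names (parses_cleanly,
parse_changes) are made up for the example, and the latin1 fallback is
only an assumption about what the buildslave emits, not a verified fix:

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Hypothetical fragment, much simpler than real darcs output.
SAMPLE = u'<changelog><patch author="Jos\u00e9"><name>fix</name></patch></changelog>'


def parses_cleanly(raw):
    """Return True if minidom accepts the given bytes, False on ExpatError."""
    try:
        xml.dom.minidom.parseString(raw)
        return True
    except ExpatError:
        return False


def parse_changes(raw):
    """Parse bytes as XML, retrying as latin1 if the first attempt fails."""
    try:
        return xml.dom.minidom.parseString(raw)
    except ExpatError:
        # Assumption: the bytes were really latin1, so re-encode them as utf-8.
        text = raw.decode("latin-1")
        return xml.dom.minidom.parseString(text.encode("utf-8"))
```

With no encoding declaration the parser assumes utf-8, so the utf-8
bytes of SAMPLE parse while the latin1 bytes are rejected;
parse_changes() then recovers them by re-encoding.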

Unfortunately I do not have a general recipe: if you open the output of
"darcs changes --xml-output" with an editor, I bet that around that line
you will find either a "foreign" author name with "strange" letters, or
a patch name containing non-ASCII characters.
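
Instead of hunting in an editor, the exception itself can point at the
line. A rough sketch, assuming the darcs output has been captured as
bytes; find_offending_line is a name invented for this example:

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def find_offending_line(raw):
    """Return (lineno, column, line_bytes) for the first parse error, or None."""
    try:
        xml.dom.minidom.parseString(raw)
    except ExpatError as err:
        # err.lineno is 1-based, so subtract one to index the list of lines.
        lines = raw.split(b"\n")
        return err.lineno, err.offset, lines[err.lineno - 1]
    return None
```

Printing the returned line with repr() shows the raw bytes, which makes
a stray latin1 character easy to spot.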

In the context of darcsver it is probably overkill to use
--xml-output and the minidom parser just to count the patches...

Wouldn't it be enough to do the equivalent of

  $ lasttag=$(darcs query tag | head -1)
  $ darcs changes --from-tag "$lasttag" | grep '^[^ ]' | wc -l

possibly with a smarter way of determining the "last (interesting)
tag"?
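
In Python, the pipeline above could look roughly like the following
sketch. The function names are hypothetical, patches_since() assumes
darcs is on the PATH, and the counting rule simply mirrors the grep
above, so it is only as accurate as that heuristic:

```python
import subprocess


def count_patches(changes_text):
    """Count lines starting with a non-space character, like grep '^[^ ]' | wc -l."""
    return sum(1 for line in changes_text.split("\n")
               if line and not line.startswith(" "))


def patches_since(tag):
    """Run "darcs changes --from-tag TAG" and count the patch header lines."""
    out = subprocess.check_output(["darcs", "changes", "--from-tag", tag])
    # Decode leniently: only the start of each line matters here, so a
    # wrongly encoded author name cannot break the count.
    return count_patches(out.decode("utf-8", "replace"))
```

Since no XML parser is involved, the encoding of author and patch names
no longer matters for the count.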

BTW, I see that there's now a handy "--count" option on darcs changes :-)
It's used by the darcs build process; see the latest patch on darcs-users
regarding determine_release_state.pl

hth,
ciao, lele.
-- 
nickname: Lele Gaifax    | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas    | comincerò ad aver paura di chi mi copia.
lele at nautilus.homeip.net |                 -- Fortunato Depero, 1929.


