[mythtv] [mythtv-commits] Ticket #9918: Incorrect character encoding in xml status (patch provided)
Michael T. Dean
mtdean at thirdcontact.com
Thu Jul 14 14:01:12 UTC 2011
On 07/13/2011 09:22 AM, MythTV wrote:
> #9918: Incorrect character encoding in xml status (patch provided)
> -------------------------------------------------+-------------------------
> Reporter: Ian Dall<ian@…> | Owner:
> Type: Bug Report - General | Status: infoneeded_new
> Priority: minor | Milestone: unknown
> Component: MythTV - General | Version: Unspecified
> Severity: medium | Resolution:
> Keywords: xml encoding | Ticket locked: 0
> -------------------------------------------------+-------------------------
>
> Comment (by Ian Dall<ian@…>):
>
> Thanks, setting LANG as above does result in UTF-8 encoded xml which
> parses without error.
>
> The thing is, if LANG isn't *.UTF-8, then the xml is invalid, as any
> encoding except UTF-8 must have a "Text Declaration" (or a "Byte Order
> Mark" if it is UTF-16). [w3c xml 1.0 Section 4.3.3]
>
> So, I guess it is a low priority given there is a work around, but I would
> maintain that either UTF-8 locale should be forced (as my patch does) or
> the text declaration should be added with the encoding which is actually
> used. Something like:
>
> {{{
> QTextCodec *default_enc = QTextCodec::codecForLocale();
This sets a global value which overrides the Qt autodetection--it
affects all code that's executed, not just the code you're adding.
> QDomProcessingInstruction encoding =
>     doc.createProcessingInstruction("xml",
>         "version=\"1.0\" encoding=\"" + default_enc->name() + "\"");
> doc.appendChild(encoding);
> }}}
>
> Unfortunately this doesn't work because QTextCodec::codecForLocale()
> always has a name of "System" in Qt 4.7, so I can't see any clean way to
> do this.
Right. This is also why we don't have a log line that tells us which
encoding Qt is using so we can tell you that you're hitting the Qt
bug... I planned to add this, but as you found, the codecForLocale() is
useless for debugging.
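
For reference, outside of Qt the encoding name can be read straight from the C library on POSIX systems. The following is only a minimal sketch, assuming a POSIX platform that provides nl_langinfo(); the helper names localeCodeset() and xmlDeclaration() are made up for illustration and are not MythTV or Qt functions:

```cpp
#include <clocale>
#include <string>
#include <langinfo.h>   // POSIX only: nl_langinfo(), CODESET

// Hypothetical helper: name of the character encoding for the locale
// taken from the environment (LANG / LC_ALL / LC_CTYPE).
std::string localeCodeset()
{
    // Adopt the environment's locale. Note that this changes the
    // process-wide C locale, so a real program would do it once at startup.
    std::setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}

// Hypothetical helper: XML declaration that names the encoding explicitly.
std::string xmlDeclaration(const std::string &encoding)
{
    return "<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>";
}
```

Something like xmlDeclaration(localeCodeset()) would then give a declaration matching the locale, and the same string could go into a log line. The names nl_langinfo() returns are platform-specific (glibc reports ANSI_X3.4-1968 for the C locale), so they may need mapping to a registered encoding name before they are usable in an XML declaration.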
So, regarding this bug, I would say that it /needs/ to be fixed in Qt,
not here. Qt does autodetection of system character encoding, and if
QDomDocument creates an invalid XML stream unless developers override
that autodetected encoding (and ignore the user-/system-specified
encoding), Qt is broken.
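
To make the "invalid XML stream" point concrete: a document with no encoding declaration and no byte order mark has to be UTF-8, so if its bytes are not well-formed UTF-8 a conforming parser must reject it. A rough sketch of that check in plain C++ follows; isWellFormedUtf8() is a made-up name, not part of Qt or MythTV:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `bytes` is well-formed UTF-8.
bool isWellFormedUtf8(const std::string &bytes)
{
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        std::size_t len;
        unsigned long cp;   // decoded code point
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF.
        static const unsigned long minForLen[] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < minForLen[len]) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        if (cp > 0x10FFFF) return false;
        i += len;
    }
    return true;
}
```

With this, "café" encoded as ISO-8859-1 (a lone 0xE9 byte for the é) fails the check, while the UTF-8 form (0xC3 0xA9) passes. That is presumably the kind of input the reporter's parser choked on when LANG was not a UTF-8 locale.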
I started down the rabbit hole to try to figure out what Qt is doing
wrong so I could report a good bug, but I never got to the root of the
problem--and was spending far too much of my time on the issue. I will
mention that it has an effect on HTTP parsing (thereby affecting
MythNetvision's requests and MythVideo Storage Group processing stuff)
and a lot more.
http://www.gossamer-threads.com/lists/mythtv/dev/439348#439348
(Note that some stuff has been changed in MythTV since I last looked,
and I have a feeling that we just made a previously-not-working case
work while breaking other previously-working cases.)
Mike