<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
On 7.11.2010 14:32, Michael T. Dean wrote:
<blockquote cite="mid:4CD6AA77.7010803@thirdcontact.com" type="cite"> On
11/06/2010 02:45 PM, David Kubicek wrote:
<br>
<blockquote type="cite">On 6.11.2010 18:24, David Kubicek wrote:
<br>
<blockquote type="cite">On 19.6.2010 01:29, Svend Høst wrote:
<br>
<blockquote type="cite">A recording has been made and the
title from the epg contains danish
<br>
letters "æ/ae,ø/oe and å/aa".
<br>
If i try to locate and play the recording by upnp and
browsing the
<br>
upnp tree By Date, the recording is shown properly.
<br>
If i choose By Title no content is streamed to my player, so
the
<br>
recording is there and it is properly recorded since i can
see it
<br>
when choosing by date.
<br>
<br>
From the log :
<br>
2010-06-19 00:30:16.119 HTTPRequest::ProcessSOAPPayload :
<br>
"urn:schemas-upnp-org:service:ContentDirectory:1#Browse" :
<br>
2010-06-19 00:30:16.413 UPnpCDS::HandleBrowse
ObjectID=RecTv/1,
<br>
ContainerId=
<br>
2010-06-19 00:30:16.431 UPNP Browse : Searching for : RecTv
/
<br>
ObjectID : RecTv/1
<br>
2010-06-19 00:30:16.449 UPnpCDSTv::IsBrowseRequestForUs -
Not sure...
<br>
Calling base class.
<br>
2010-06-19 00:30:16.548 HTTPRequest::SendResponse(xml/html)
() :200
<br>
OK -> 172.17.14.200: 1
<br>
2010-06-19 00:30:31.478 HTTPRequest::ProcessSOAPPayload :
<br>
"urn:schemas-upnp-org:service:ContentDirectory:1#Browse" :
<br>
2010-06-19 00:30:31.566 UPnpCDS::HandleBrowse
<br>
ObjectID=RecTv/1/key=Bonder�ven retro, ContainerId=
<br>
2010-06-19 00:30:31.586 UPNP Browse : Searching for : RecTv
/
<br>
ObjectID : RecTv/1/key=Bonder�ven retro
<br>
2010-06-19 00:30:31.605 UPnpCDSTv::IsBrowseRequestForUs -
Not sure...
<br>
Calling base class.
<br>
2010-06-19 00:30:31.699 HTTPRequest::SendResponse(xml/html)
() :200
<br>
OK -> 172.17.14.200: 1
<br>
</blockquote>
I've had the same issue - living in the Czech Republic, many
shows and
<br>
movies have the Czech diacritics (CP: iso-8859-2 / win1250) in
them.
<br>
Until now, I had to use "By Date", which was a bit of a pain
in the
<br>
ass to be honest.
<br>
<br>
I never looked into it before, just a bit of googling, and I
haven't
<br>
found anybody with the same issue, so I thought it was a local
problem
<br>
in my setup. Probably nobody ever reported it - except you,
that is.
<br>
:) About an hour ago, I noticed your email by pure chance and
it
<br>
helped me to see that my situation was the same: the bug with
empty
<br>
folders via UPnP appeared only for shows with Czech characters
-- I
<br>
never noticed that, thought the issue was "random".
<br>
<br>
For example: "*By Title*" browsing displayed a folder "*&#268;erná
zmije
<br>
(18)*", but when I clicked it open on PS3 or in Totem on the
desktop,
<br>
it was *empty*. Locating the show via "*By Date*" played it
without fail.
<br>
<br>
So, I checked the source in my local MythTV 0.23-fixes repo
and
<br>
indeed, there was a bug in handling UTF8 requests. US coders
didn't
<br>
expect the search filters to contain UTF8 characters, so they
used
<br>
.toLatin1() conversions all around the place.
<br>
</blockquote>
</blockquote>
<br>
Actually, they wrote code assuming that QUrl worked as described
in the Qt documentation. In Qt's API, URIs are UTF-8 encoded
characters percent-encoded to ASCII (as they should be, per RFCs
3986 and 3987).
<br>
<br>
<blockquote type="cite">
<blockquote type="cite"> I'm attaching a simple
<br>
fix, switching from .toLatin1() to .toUtf8() fixes the whole
issue.
<br>
Applying to upstream is as simple as "search and replace".
<br>
<br>
Some of the 11 conversions don't actually need .toUtf8(), but
it
<br>
doesn't hurt either. Developers can apply just those parts
that are
<br>
required.
<br>
</blockquote>
New ticket for this issue:
<a class="moz-txt-link-freetext" href="http://svn.mythtv.org/trac/ticket/9188">http://svn.mythtv.org/trac/ticket/9188</a>
<br>
</blockquote>
<br>
Out of curiosity, are you using a properly-configured environment:
<br>
<br>
<a class="moz-txt-link-freetext" href="http://www.gossamer-threads.com/lists/mythtv/dev/439348#439348">http://www.gossamer-threads.com/lists/mythtv/dev/439348#439348</a>
<br>
<br>
Svend, please try the configuration in that post, too. It should
allow everything to work without code changes/requiring a
recompile and redeploy.
<br>
<br>
Please post the output of locale /in the environment that's
running mythbackend/. And please try again, without your patch,
with a properly-configured UTF-8 locale (which is the only way
QUrl works as documented--the bug mentioned in the above)--i.e.
starting mythbackend from a shell where you can verify the
environment. Most distros have start scripts that don't properly
configure the environment.
<br>
<br>
I'm pretty sure the changes in your patch will have bad effects on
some configurations.
<br>
<br>
If existing code works with a UTF-8 locale (and I'm almost
positive it will), please say so on your ticket. Your changes
will definitely need to be tested in multiple different
environments running with multiple different Latin-compatible and
non-Latin-compatible encodings specified.
<br>
<br>
If you're really interested in tracking down the issue I think may
exist in Qt (which I haven't made time to do since I'm way behind
on my list of deliverables for MythTV), I'd be very appreciative.
I can give you some information that should be a good start (and
will require a bit of code-sleuthing in QUrl, QTextCodec, QString,
and some related classes).
<br>
<br>
Thanks,
<br>
Mike
<br>
</blockquote>
<br>
My environment is OK: locale reports "cs_CZ.UTF-8" for all
variables, as it should. Without it, date formatting, sorting and even
keyboard input don't work properly, so this setting is one of the
first orders of business after a Linux installation. BUT it doesn't
matter in this case. :)<br>
<br>
You see, Myth always sends its UPnP XML replies in UTF-8, without
consulting the locale. That's good: it's universal and it's what
UPnP/DLNA clients expect. You're talking about how Myth interfaces
with the system-wide code page (CP) settings, the DB's CP, etc.
That's not the issue here; that mechanism works perfectly and the
patch doesn't touch it. The patch only changes CP handling in UPnP's
HTTP lib when talking to remote clients, not how Myth handles CPs
and conversions "on localhost".<br>
<br>
The thing is, Myth's UPnP code uses UTF-8 for all communication,
whatever CP you have system-wide, so my point is just that it
should expect UTF-8 from clients *<b>by default</b>* too. Of course,
the *<b>proper</b>* way to handle it is to read the encoding
declared in the client's HTTP headers and XML body and use that to
decode the request. But until such a mechanism is in place, expecting
UTF-8 back is far more appropriate than expecting Latin1, which we
never speak and which no UPnP/DLNA client I've tested uses.<br>
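<br>
To make the "proper" approach concrete, here is a minimal sketch in
Python (chosen only for brevity; it is not MythTV or Qt code). The
helper names request_charset and decode_body are hypothetical. It
assumes the client declares its encoding in the Content-Type header
and falls back to UTF-8 when it doesn't:<br>
<pre>
```python
from email.message import Message


def request_charset(content_type, default="utf-8"):
    """Return the charset declared in a Content-Type value, or the default.

    Hypothetical helper for illustration; not part of MythTV.
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header carries no charset at all.
    return msg.get_content_charset() or default


def decode_body(raw, content_type):
    """Decode raw request bytes using the declared charset (UTF-8 if none)."""
    return raw.decode(request_charset(content_type))


# "Cerna" with Czech diacritics, written with escapes to stay ASCII-only.
title = "\u010cern\u00e1"

# A client that declares nothing but sends UTF-8 is decoded correctly.
print(decode_body(title.encode("utf-8"), "text/xml") == title)

# A client that declares ISO-8859-2 is honoured as well.
print(decode_body(title.encode("iso-8859-2"),
                  "text/xml; charset=ISO-8859-2") == title)
```
</pre>
Both checks print True. A real implementation would also have to look
at the encoding declared in the XML prolog and cope with unknown
charset names, which this sketch leaves out.<br>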
<br>
My tests showed that all common clients (WMP11+, PS3 and Totem on
Linux with the UPnP plugin) also send UPnP requests in UTF-8. That's
not because Myth talks UTF-8; like Myth, they use it for all UPnP
communication simply because it's universal. But once you speak
UTF-8, you should default to reading UTF-8 back, not anything else
and certainly not Latin1.<br>
<br>
Until the default handling switches from Latin1 to UTF-8, UPnP won't
work properly for titles containing non-ASCII characters, as both
the Danish and the Czech cases in this thread show.<br>
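<br>
As a rough illustration of why the default matters, here is a small
Python sketch (again not MythTV code, and the sample title is my own)
that takes the bytes a UTF-8 client would send and decodes them both
ways, then forces the title into Latin1 the way a blanket conversion
would:<br>
<pre>
```python
# Illustration only, not MythTV code: the same request bytes decoded two ways.

# "Cerna" with Czech diacritics (C with caron, a with acute),
# spelled with escapes so the source stays pure ASCII.
title = "\u010cern\u00e1"

# What a UTF-8 speaking client puts on the wire: 7 bytes for 5 characters.
wire = title.encode("utf-8")

as_utf8 = wire.decode("utf-8")      # round-trips to the original title
as_latin1 = wire.decode("latin-1")  # never fails, but gives 7 characters,
                                    # with mojibake where the diacritics were

# Going the other way, Latin1 has no slot for C with caron at all,
# so a forced conversion has to substitute a placeholder.
forced = title.encode("latin-1", errors="replace")

print(as_utf8 == title)    # True
print(as_latin1 == title)  # False
print(forced)              # b'?ern\xe1'
```
</pre>
Either way the key no longer matches the recording's title, which
would explain a folder that is listed but comes back empty.<br>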
<br>
<b>In short:</b> you're simply ignoring the CP the client declares
and forcing the conversion into Latin1. As long as we know that
everything - including us - speaks UTF-8, this is wrong. Ideally, we
should honor the CP indicated by the client, but until that is done,
expecting UTF-8 works much, much better.<br>
<br>
<pre class="moz-signature" cols="72">--
David Kubicek</pre>
</body>
</html>