[mythtv-users] Commercial Flagging Idea - Distributed Commercial
Flagging (long)
Jomama
tirebiter at gmail.com
Fri Apr 15 07:13:02 UTC 2005
I will be glad to help out with this very daring pilot program.
On 4/14/05, Christopher David Petersen
<christopher.david.petersen at gmail.com> wrote:
> I'm starting this thread so I don't pollute the discussion of fingerprinting
> for commercial flagging (which is a brilliant, but separate idea).
>
> This thread is meant to discuss the idea of distributed commercial flagging
> (DCF) via existing algorithms to reduce load and increase accuracy.
>
> In brief, here's the basic idea:
>
> 1) Collect commercial flagging information from participating users at a
> central server (hopefully, this isn't a DMCA violation).
> 2) Analyze the data to determine groups of users who have performed
> duplicate work.
> 3) Analyze the data to predict groups of users who will be performing
> duplicate work.
> 4) Distribute the future duplicate work among the users to reduce each
> user's individual load.
>
>
> Here's an example using _Lost_ and _Alias_ (chosen for their short names):
>
> Givens:
> - 23 users with Comcast Analog Basic Cable service in Portland, OR record
> and flag new episodes of _Lost_ and _Alias_ each week.
> - These users use a variety of commercial flagging methods.
> - The machines have a variety of available CPU power.
>
> Scenario:
> - 23 users submit data to the DCF server via a secured and anonymous
> interface. This data includes which shows they flagged and the start and
> end times of each commercial segment. All times are synchronized to the DCF
> server's highly accurate clock (more on this later).
>
> - After submitting each show's data, the DCF server indicates to the client
> whether the client can join a "partnership".
>
> - "Partnerships" are created when the DCF server determines that 2 or more
> users are performing (and will perform) duplicate work (with similar output)
> for 1 or more shows.
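
As a rough illustration of how the server might decide that two users are doing duplicate work "with similar output", here is a minimal Python sketch. Everything in it is an assumption made for illustration and not part of the proposal: the function names `overlap_ratio` and `candidate_partners`, the 0.9 threshold, the `lineup` label, and the representation of flag data as (start, end) pairs in seconds.

```python
from collections import defaultdict
from itertools import combinations


def overlap_ratio(a, b):
    """Overlap of two segment lists as intersection / union of flagged time.

    Each list holds (start, end) pairs in seconds and is assumed to contain
    no overlapping segments of its own.
    """
    def total(segments):
        return sum(end - start for start, end in segments)

    shared = sum(
        max(0, min(e1, e2) - max(s1, s2))
        for s1, e1 in a
        for s2, e2 in b
    )
    union = total(a) + total(b) - shared
    return shared / union if union else 1.0


def candidate_partners(submissions, threshold=0.9):
    """Group users whose flag data for the same lineup and show is similar.

    submissions: iterable of (user, lineup, show, segments) tuples.
    Returns {(lineup, show): set of users}; a user is included when their
    data overlaps at least one other user's data by `threshold` or more.
    The threshold is a hypothetical tuning knob.
    """
    by_key = defaultdict(dict)
    for user, lineup, show, segments in submissions:
        by_key[(lineup, show)][user] = segments

    groups = {}
    for key, users in by_key.items():
        similar = set()
        for (u1, s1), (u2, s2) in combinations(users.items(), 2):
            if overlap_ratio(s1, s2) >= threshold:
                similar.update((u1, u2))
        if len(similar) >= 2:
            groups[key] = similar
    return groups
```

Pairwise comparison is quadratic in the number of users per show, which is probably fine for a pilot with a few dozen participants but would need rethinking at scale.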
>
> - 10 users are invited to join the new partnership for new episodes of
> _Lost_ and _Alias_ on Comcast Analog Basic Cable, Portland, OR. These ten
> users are invited because their machines are of similar power (i.e.
> commercial flagging occurs after a similar delay and in a similar amount of
> time). These users are now "Partners" within the "Partnership".
>
> - At first, none of the partners are "trusted" or have earned any "credits"
> within the partnership. As partners submit more data they earn more credits.
> The exact amount they earn per submission is weighted by how much they are
> trusted (their "fidelity" factor) and the accuracy of the submitted data
> (how similar it is to other data). Once partners have earned enough credits,
> they can "purchase" data from the partnership.
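
One possible reading of "weighted by how much they are trusted and the accuracy of the submitted data" is a simple product. The sketch below is only that, a guess at a formula: the name `credits_earned`, the `base` amount, and the choice to scale both factors to the range 0..1 are all assumptions, since the proposal does not define them.

```python
def credits_earned(base, fidelity, accuracy):
    """Credits for one submission, scaled by trust and by agreement.

    base:     hypothetical credits for a perfect, fully trusted submission
    fidelity: assumed to be in [0, 1]; 0 means a brand-new, untrusted partner
    accuracy: assumed to be in [0, 1]; how closely the data matches others'
    """
    if not (0.0 <= fidelity <= 1.0 and 0.0 <= accuracy <= 1.0):
        raise ValueError("fidelity and accuracy must be between 0 and 1")
    return base * fidelity * accuracy
```

With a pure product, a partner whose fidelity starts at exactly 0 earns nothing until their fidelity has been raised by some other rule, so a real scheme might want a small floor instead.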
>
> - After N weeks, only 7 users have earned enough credits to share data.
>
> - 3 partners are selected to flag next week's episode of _Lost_.
>
> - 4 partners are selected to flag next week's episode of _Alias_.
>
> - Of the 3 partners selected to flag _Lost_, all do so and submit their
> data.
>
> - Of the 4 partners selected to flag _Alias_, only 3 do so and submit their
> data. The 1 user who did not submit data has lessened his "fidelity" factor.
>
> - The 6 partners who submitted data earn credits and increase their
> fidelity factor.
>
> - The 3 partners who *do not* have _Lost_ flag data spend credits to receive
> this data (at a discounted cost, because of their increased fidelity).
>
> - The 3 partners who *do not* have _Alias_ flag data spend credits to
> receive this data (again, at the discounted cost).
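
To make the fidelity bookkeeping in the last few steps concrete, here is a small Python sketch. It assumes fidelity lives in the range 0..1, that it drifts toward 1 when a selected partner delivers and toward 0 when they do not, and that the discount grows linearly with fidelity. None of these rules come from the proposal; `update_fidelity`, `purchase_cost`, `step` and `max_discount` are hypothetical names.

```python
def update_fidelity(fidelity, delivered, step=0.1):
    """Nudge fidelity toward 1.0 after a delivery, toward 0.0 after a miss.

    `step` (assumed, 0..1) controls how quickly trust is gained or lost.
    """
    target = 1.0 if delivered else 0.0
    return fidelity + step * (target - fidelity)


def purchase_cost(base_cost, fidelity, max_discount=0.5):
    """Credits needed to buy flag data for one show.

    A fully trusted partner (fidelity 1.0) pays base_cost * (1 - max_discount);
    an untrusted one pays the full base_cost. The linear rule is an assumption.
    """
    return base_cost * (1.0 - max_discount * fidelity)
```

Under these assumptions the partner who skipped _Alias_ would see their fidelity drop and would pay closer to full price the next time they buy data.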
>
> - The flag data is not perfect: clocks, settings, reception, etc. vary. The
> partners use the "purchased" flag data to limit their own commercial
> flagging to those suspect times within the shows (with perhaps a 1 minute
> margin before and after). The results of these "verification" flag jobs are
> submitted back to the server.
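
The "verification" job described above could compute its scan windows roughly like this. It is a minimal Python sketch, assuming the purchased flag data is a list of (start, end) offsets in seconds from the start of the recording; `verification_windows` and its parameters are made-up names, and the 60-second default mirrors the 1 minute margin suggested in the text.

```python
def verification_windows(segments, show_length, margin=60):
    """Return the time ranges a partner should re-scan locally.

    Each purchased (start, end) segment is padded by `margin` seconds on both
    sides, clamped to the recording, and overlapping windows are merged.
    """
    padded = sorted(
        (max(0, start - margin), min(show_length, end + margin))
        for start, end in segments
    )
    merged = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous window: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For a one-hour recording with a handful of breaks, the merged windows would cover only a fraction of the file, which is where the load reduction would come from.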
>
> Summary:
> So, now 7 users have formed a partnership to share the load of flagging
> _Alias_ and _Lost_.
> 6 of them have significantly reduced their flagging load for these two shows.
> 1 partner needs to regain the trust of the partnership by submitting data in
> a timely manner.
>
> One can easily imagine a greatly expanded model, where a particular user
> could belong to dozens of partnerships. Each partnership could have hundreds
> of users, and dozens of shows. As a result of participating in partnerships,
> the user may only be required to flag a few shows (in their entirety) each
> week.
>
> Benefits:
> - Reduced commercial flagging for individual partners.
> - Increased accuracy of commercial flagging (via consensus).
> - "Leeching" is not allowed.
> - Negative effects of poisoning are reduced through "fidelity" factors and
> credits.
> - New methods of commercial flagging (either local or distributed) can be
> seamlessly incorporated.
> - The available CPU power could be used for new, extremely
> processor-intensive flagging methods.
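
For the "increased accuracy via consensus" point, one simple approach would be to take the median of each break boundary across partners, so that a single bad or poisoned submission cannot drag the result. The Python sketch below assumes every submission reports the same number of breaks, in order, as (start, end) pairs in seconds; `consensus_segments` is a hypothetical name and the median rule is a suggestion, not something the proposal specifies.

```python
from statistics import median


def consensus_segments(submissions):
    """Combine several partners' flag data into one list of segments.

    submissions: list of per-partner lists of (start, end) pairs. All lists
    are assumed to have the same length and ordering; aligning submissions
    that disagree on the number of breaks is left out of this sketch.
    """
    if not submissions:
        raise ValueError("no submissions to combine")
    if len({len(s) for s in submissions}) != 1:
        raise ValueError("submissions disagree on the number of breaks")
    return [
        (
            median(s[i][0] for s in submissions),
            median(s[i][1] for s in submissions),
        )
        for i in range(len(submissions[0]))
    ]
```

With three partners reporting the first break as (600, 780), (602, 781) and (590, 900), the outlying end time of 900 has no effect on the median.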
>
> Drawbacks:
> - Requires central server.
> - Requires many participants.
> - Requires frequent communications with the server (albeit, not much data is
> transferred).
> - Requires changing commercial flagging to acquire partnership data.
> - Requires changing commercial flagging to allow for flagging just parts of
> the show.
> - Requires interface changes to alert users when they are about to "fail in
> their partnership duties" by not recording and flagging a show.
> - The central DCF server stores recording habits of users. It's anonymous,
> but still concerning.
> - Requires similar "content streams". Anecdotal experience (hearing the same
> commercials over the phone with friends) makes me suspect that commercial
> *times* don't vary within the same Service Provider. Analysis of submitted
> data will be the acid test. If the server never finds suitable
> partnerships, then everybody's content streams must be different, and the
> whole project is a failure.
>
> - If the project is successful, content providers will further vary the
> content streams.
>
> Progress:
> - I have built a local database to store the DCF data.
> - I am building a SQL script to populate the DCF database from mythconverg.
> - I will be collecting data (via emailed output of the SQL script) from
> other users.
>
> - I have outlined a solution for time synchronization. Basically, partners
> submit the machine's local time with every transaction.
> - I am defining a secure and anonymous interface for the DCF server.
> - I am defining factors which I believe should affect the "fidelity" of data
> submitted.
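
The time synchronization outline (partners send their machine's local time with every transaction) might reduce to something like the following on the server side. This is a minimal Python sketch under the assumption that network latency is small enough to ignore; `clock_offset` and `to_server_time` are hypothetical names, and the original post does not say how the offset would actually be applied.

```python
from datetime import datetime, timedelta


def clock_offset(client_local_time: datetime, server_receive_time: datetime) -> timedelta:
    """How far the client's clock is behind (positive) or ahead (negative)
    of the server's clock, ignoring the time the request spent in transit."""
    return server_receive_time - client_local_time


def to_server_time(segments, offset: timedelta):
    """Shift client-clock (start, end) datetime pairs onto the server clock."""
    return [(start + offset, end + offset) for start, end in segments]
```

If transit time turns out to matter, the client could also report the round-trip time and the server could subtract half of it, in the spirit of what NTP does.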
>
> Ideas, questions, comments, criticisms are welcome.
>
>
> --
> Christopher David Petersen
> Member of PoORMUG http://poormug.bitbucket.com/
>