Content Sharing Status in the OMP

deproduction's picture

The Open Media Foundation's last deliverable for the Open Media Project Knight Foundation grant is a shared content solution for the OMP partners. Our proposed solution differs from existing solutions in that we want to enable automated or user-driven content-sharing, as opposed to the admin-focused solutions that exist with PEGMedia or the ACM/Telvue solution... For example, we want a community member in San Fran, searching for "Cambodian Comedy" videos to see the matching files in SF, and also see a lower-tier of search results from other OMP partner sites. If they like a show they see which was contributed from Denver's Cambodian Comedians, we want that local SF member to be able to "sponsor" or vote on that show and (based on scheduling rules for their station) have the broadcast file transferred to air on the SF Public Access TV station.

From the beginning, we've explored the benefits of a centralized media repository, vs a distributed solution. In either case, the media module panel today showed how this process will be simpler in Drupal 7, with streamwrappers which allow nonlocal files to be viewed on your website.

Here's a summary of the conversation happening at the Open Media Foundation so far, broken down into the centralized and distributed approaches.

Distributed Approach:
From the beginning, we've liked the idea of a distributed approach that doesn't tax any single provider of storage or bandwidth, and fits the general approach of the Public Access and Community Media environment. The idea here is to host a Centralized Metadata Server that would be the master of all metadata and file-locations for the OMP media content of all stations, and would contain the database that let the SF webserver see the Cambodian Comedy files from Denver and other partner sites, and would facilitate the movement of those files when necessary.
Because a significant end-goal of the Open Media Project is enabling our communities to find the top-rated content out of tens or hundreds of thousands of hours of content, the centralized metadata server would help manage and coordinate voting, comments, and other community feedback across sites, and let stations automatically schedule a "best Community Media of the Week" for example, featuring the top-rated content from across all OMP partners. This will also help reduce duplication of files and metadata, or the confusion of different copies of identical shows, as they are shared between stations (so we don't accidentally recognize each new copy of a video as an independent show).
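To make the cross-site rating idea concrete, here is a minimal sketch of how a central metadata server might pool votes from several stations and rank shows for a weekly "best of" slot. The station names, show IDs, and the `top_rated` function are all hypothetical, and it assumes every copy of a show shares one ID assigned by the central server.

```python
from collections import defaultdict

# Hypothetical vote records: (station, show_id, rating from 1 to 5).
# show_id is assumed to be the single ID the central metadata server
# assigns to a show, shared by every copy of it on partner sites.
VOTES = [
    ("denver", "show-101", 5),
    ("sf", "show-101", 4),
    ("davis", "show-101", 5),
    ("sf", "show-202", 3),
]

def top_rated(votes, limit=10):
    """Pool ratings from all stations and rank shows by mean rating."""
    ratings = defaultdict(list)
    for _station, show_id, rating in votes:
        ratings[show_id].append(rating)
    # Sort by mean rating, breaking ties by number of votes.
    ranked = sorted(
        ratings.items(),
        key=lambda item: (sum(item[1]) / len(item[1]), len(item[1])),
        reverse=True,
    )
    return [(show_id, sum(r) / len(r), len(r)) for show_id, r in ranked[:limit]]

BEST_OF_WEEK = top_rated(VOTES)
```

Because votes are keyed on the shared ID rather than on each station's local copy, a show that airs in three cities is ranked once with all of its votes, instead of three times with a third of them each.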

Centralized Approach:
We have a unique opportunity through partnering with Archive.org to work with an organization who can provide nearly limitless storage and bandwidth without charging the OMP partners. We'll be meeting with Archive.org on Friday this week to discuss the challenges we're facing with our unique metadata needs, and hopefully get this process moving.
After hearing ArthurF's discussion of the media module in D7, I asked him about how we could use streamwrappers to bypass local bandwidth, local transcoding, and local storage of media files using Archive.org. He explained how our media ingest process could upload files directly to Archive.org (never hitting local storage or bandwidth), allowing Archive to transcode the files and manage large, remote uploads, and (if we could coordinate it with Archive) spit out the broadcast-quality version back to the Access Station. This isn't a necessity in Denver or SF, since we have enough local storage, bandwidth, and transcoding capacity to make Archive.org the last step in the process, but it would be possible to make them the first step in the process, and off-load the majority of the work and expenses to Archive. Arthur explained how this would be possible even in D6, but will be easier in D7.

Of course, none of this matters if OMP members don't use Creative Commons. It's our position that the content-sharing system should not support the exchange of content without CC licenses. We hope to have a solution that is at least sharing the content of SF and Denver (the only two partners who have so far committed to enforcing CC) by the end of June.

Comments

You can add Davis Media

darrick's picture

You can add Davis Media Access to your list of partners committed to enforcing Creative Commons for online content and sharing.

As for the two approaches, I was inspired by Tim O'Reilly's talk on Cloud Computing here at DrupalCon (http://sf2010.drupal.org/conference/sessions/open-source-cloud-era). The stream wrappers available in Drupal 7 will allow the OMP to leverage the power of cloud computing. Archive.org is just one service available.

I look at the OMP as being a centralized metadata store for a public access station. The project could add a module to allow stations to subscribe to other stations and thus allow a user to search among all the subscribed stations. This is a matter of providing a hook_search function that searches locally, then obtains search results from the subscribed stations and adds those results to the overall results.
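A rough sketch of that search flow, written in Python rather than as a Drupal hook so it can run on its own. The catalogs, IDs, and function names are made up for illustration, and the partner catalogs are plain in-memory lists standing in for what would really be remote queries to each subscribed station.

```python
# Hypothetical catalogs; in practice each partner list would be the
# response from a remote query to that subscribed station.
LOCAL_CATALOG = [{"id": "sf-17", "title": "Cambodian Comedy Night"}]
PARTNER_CATALOGS = {
    "denver": [{"id": "den-42", "title": "Cambodian Comedy Hour"}],
    "davis": [{"id": "dav-3", "title": "City Council Recap"}],
}

def search_catalog(catalog, query):
    """Return shows whose title contains every word of the query."""
    words = query.lower().split()
    return [s for s in catalog if all(w in s["title"].lower() for w in words)]

def federated_search(query, local_catalog, partner_catalogs):
    """Local hits first, then a lower tier of hits from subscribed stations."""
    results = [dict(hit, tier="local") for hit in search_catalog(local_catalog, query)]
    seen = {hit["id"] for hit in results}
    for station, catalog in partner_catalogs.items():
        for hit in search_catalog(catalog, query):
            if hit["id"] not in seen:  # skip shows already listed
                seen.add(hit["id"])
                results.append(dict(hit, tier="partner", station=station))
    return results

RESULTS = federated_search("Cambodian Comedy", LOCAL_CATALOG, PARTNER_CATALOGS)
```

Keeping the tier on each result lets the theme layer show partner results below local ones, which matches the "lower tier of search results" described in the original post.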

For file storage a station should be able to mix and match storing locally, via archive.org, s3 or whatever location they so choose. The stream wrappers will add that functionality.
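As an illustration of the mix-and-match idea only (this is not Drupal's stream wrapper API), a station could map a URI scheme to whichever storage location it prefers. The scheme names and base URLs below are hypothetical placeholders.

```python
# Hypothetical mapping from a storage scheme to the base URL where
# files for that scheme are actually served.
BACKENDS = {
    "local": "https://station.example.org/files/",
    "archive": "https://archive.example.org/download/",
    "s3": "https://bucket.example.com/",
}

def resolve(uri):
    """Turn a scheme-prefixed URI such as 'archive://item/show.mp4' into a URL."""
    scheme, separator, target = uri.partition("://")
    if not separator or scheme not in BACKENDS:
        raise ValueError("no storage backend registered for %r" % uri)
    return BACKENDS[scheme] + target

SHOW_URL = resolve("archive://omp-denver/show.mp4")
```

The rest of the site only ever stores the scheme-prefixed URI, so moving a show from local disk to a remote service means changing its scheme, not every place that links to it.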

For VOD, stations should be able to mix and match their choice of VOD services, be it local as with DOM, blip.tv, youtube or archive.org. Davis Media Access moved from using the flash filefield in om_show to using an embedded media file exactly for this reason. This way we can serve some shows locally and some via blip.tv or vimeo.

Transcoding is the third aspect which can be moved to the cloud. Currently the choice is either ffmpeg_wrapper, which encodes on the local server, or ffmpeg_wrapper_remote, which can use a single remote server. But Arthur believes the remote wrapper should be abstracted so that centers can choose from a number of transcoding services, just as they can choose from a number of file stores and VOD services.

That said, I don't see why it matters how archive.org manages its metadata. It looks like you just want to use them as a free place to store some files, transcode some files and offer VOD. It is the OMP, not archive.org, which holds the standardized metadata that allows the stations to share their programming.

I believe the main issue as Kevin has stated is ensuring the OMP metadata is consistent across the stations and the md5 of the shared media file remains consistent.
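For the md5 part, a minimal sketch of how a receiving station could confirm that a transferred file still matches the checksum recorded in the shared metadata. The function names are illustrative; the file is read in chunks so that large broadcast files never need to fit in memory.

```python
import hashlib

def file_md5(path, chunk_size=1024 * 1024):
    """Compute the md5 hex digest of a file, reading it one chunk at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_metadata(path, expected_md5):
    """True if the local copy has the checksum stored in the OMP metadata."""
    return file_md5(path) == expected_md5.strip().lower()
```

Any re-encode or re-wrap of a show changes its md5, so a check like this only identifies byte-identical copies.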

Meeting with Archive

deproduction's picture

We had a good meeting today with Archive.org that included Brewster and Tracey from Archive, Forest Mars from MNN, John Hauser from Humboldt, and three representatives from BAVC and OMF. Archive reiterated their commitment to supporting a collection that would let Archive.org serve as a central repository for media content for all OMP partners, and we left with a good road-map for implementing this. OMF is committed to incorporating at least the first steps of this solution into the OMP software by the end of June, 2010, and getting all of Denver Open Media's content uploaded as an automated part of the OMP ingest process beginning in July at the latest.
I'm eager to hear from Austin, Boston, and others when they might be able to follow suit.

Whatever your first issue of concern, media had better be your second, because without change in the media, the chances of progress in your primary area are far less likely. http://denveropenmedia.org

channelAustin's approach to this

stefanwray's picture

First point is that channelAustin is not far enough along in the implementation to be able to know exactly what route is best down the road. In other words, we do not have content ingested into the OMP system here that we could share. So it may be premature for us to discuss our optimal approach to content sharing. Unfortunately, for a variety of reasons, we are still configuring and working out the implementation here. Won't go into the reasons right now, but we are working hard to get to the point when we can begin ingesting content.

Having said that, channelAustin is committed to using Creative Commons. We have all along thought that the Creative Commons approach makes sense. There may be some minor adjustments or modifications as to exactly how we deploy the requirement of Creative Commons, but I think we can stand behind that for now.

Regarding content file sharing, we are unresolved as to how we are to manage the movement of large, broadcast quality, video files over the Internet. In either the Distributed Approach or the Centralized Approach mentioned above, there is an underlying assumption about the bandwidth capacity. There are also questions of file type.

We receive, on average, about 135 GB of uncompressed .mov or .avi files (SD) per day. (If and when we switch to HD file submission, this figure will increase substantially. We have the capacity to handle HD for playback, but the cable companies only play back SD, hence our SD-only file submissions.) These files are converted to MPEG2 Program Stream files (using Content Agent) for playback on a Synergy Broadcast Server. Note that some servers require MPEG2 Transport Stream files and these are not compatible with the Program Stream ones.

channelAustin is on fiber. I just spoke with Brian Blake, channelAustin's IT Director, and he says that, in theory, we could potentially upload this volume of content (135 GB / day) and follow the approach for the centralized server model, but that it would most likely need to be late at night when demand on the network is lower. But like Denver, we currently have terabyte capacity for local storage and probably don't need to move the files to Archive.org for processing. It makes more sense for us to transcode our files here locally, continuing to use Content Agent. Also, using the centralized approach would mean there'd be latency between when a user submits a digital file and when they are able to complete Create Show.
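To put a number on that, here is a back-of-the-envelope calculation of the sustained upload rate 135 GB per day would need. It assumes decimal gigabytes and ignores protocol overhead and retries; the 8-hour overnight window is a hypothetical figure, not something Brian specified.

```python
def required_mbps(gigabytes, hours):
    """Sustained upload rate in megabits/s needed to move `gigabytes` in `hours`."""
    bits = gigabytes * 1e9 * 8   # decimal gigabytes to bits
    seconds = hours * 3600
    return bits / seconds / 1e6

AROUND_THE_CLOCK = required_mbps(135, 24)  # uploading all day: 12.5 Mbit/s
OVERNIGHT_ONLY = required_mbps(135, 8)     # assumed 8-hour window: 37.5 Mbit/s
```

Squeezing the same volume into a late-night window triples the rate needed, and a switch to HD submissions would raise both figures further.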

Using Archive.org to offload older shows that we no longer have a desire to store locally may be something that channelAustin finds more suitable.

There are still some unresolved questions. If channelAustin wants to share files with another station, would we be sharing the uncompressed .mov file, or would we be sharing compressed MPEG2? If we're sharing the uncompressed .mov files, it's easier for anyone else to convert those to the appropriate MPEG2 format for playback. If we share the MPEG2, then we may have to generate both an MPEG2 Program Stream and an MPEG2 Transport Stream with Content Agent -- which seems like a really bad scenario.

My point is that we need to inject file types and file formats into this discussion so we're all on the same page - speaking less abstractly and more specifically about file formats.

At this point actually, at channelAustin, once a .mov file has been processed by Content Agent and converted to an MPEG2 Program Stream file, the original .mov file is soon thereafter deleted. We are not saving those .mov files.

Great Point

kreynen's picture

@Stefan Wray, this is a great point and not something I had thought of.

The format of the original upload is an issue since IA isn't creating an MPEG, but only Ogg and H.264 of the original. In order for this to work without a lot of re-encoding, stations that are moving to HD with MPEG or H.264 would need to create an "official" MPEG2 to upload to archive OR each SD station who wants to playback a show where the original only exists in an HD format would need to re-encode the original.

Obviously the ideal would be to have both available. It seems unlikely that a station moving to HD would want to be generating a lot of unnecessary SD files. It would be great if after the first SD station re-encoded the HD file, that new SD version could be added to archive, but only the collection owner can make that update. Not sure how that SD file would make its way back to the collection owner.

Obviously when it comes to the HD transition, channelAustin is ahead of the curve. Most PEG stations are looking for the MPEG2s that will work without re-encoding on the Princetons, Tightropes, and Leightronix servers they have purchased in the last 3-4 years. At the same time, developing a sharing solution that only works for SD stations seems pretty short sighted.

archive/streamwrappers/non-local hosting

jdcreativity's picture

I think I am really pushing for non local web hosting at our facility, so if the Archive can spit out what we need via the Streamwrappers in D7 I'm excited for it. If I am not mistaken that would enable non local hosting to work even in the longer term which would be really appealing for an operation our size.

To Stefan's point - a file format that would permit some easy editing would also provide great collaboration and remix opportunities for these high quality files that are getting pushed around.

The dream is to publish the file once. If what I need is to do the compression on a desktop into "official" MPEG2 which will permit access for files to flow through the Open Media system onto my local server with the Archive as a backbone, I think it is a small price to pay in time and resources.

Ongoing Status Updates

deproduction's picture

I hope Brian can post an update about the work with Archive.org and I'd also like to hear his thoughts on the central metadata server. That approach was something we determined we lacked the capacity to tackle during the Knight grant (waiting at least until D7), but it is still my favorite approach for OMP partners, allowing our sites to be truly inter-connected content-wise, while ensuring that nobody has to upload 135GB per day (as Stefan indicated). When I told that to Brewster, however, he casually suggested that Archive could serve as the central metadata repository, even without hosting all the files, and offered that they'd be willing to modify the metadata handling for the OMP collection in a way that could accommodate all that data residing at Archive (from vote results/ratings to tracking of duplicates across the OMP network).

Either way, for now, we're moving forward with a stand-alone solution to uploading content to Archive (the centralized approach), keeping the distributed approach in mind for a long-term solution. Speaking of long-term, I've always thought that the codec conversation was a low priority, since it's bound to change every few years. Also, we're all very capable of transcoding when necessary, and the OMP tools can be leveraged to automate that process in a way that would likely be less time-consuming than bicycling Mpeg2s around our networks. But I did speak to Archive about this a while back, and with our collection, they had agreed that if we all determined that an Mpeg2 codec that was compatible with all our playback servers was the ideal, AND if we all uploaded those files, Archive could make those original Mpeg2 files available, along with Ogg and the two flavors of Mp4 that are their current modus operandi.

Content sharing is the piece of this puzzle that is most inspiring to funders, and unfortunately, it's the aspect of our Knight grant that was least successful. The good thing is, it's still moving forward on multiple fronts. Archive has been supportive in our two visits, and remains the only viable option for a centralized solution given their funding model and bandwidth agreements, both of which increase as more data storage and transfer are needed.
