[GSoC proposal] Derivates API for Media ecosystem (D7)


Overview: The goal of this project is to implement a Derivates API for the Media ecosystem (Media, Styles, ...) in Drupal 7. It will provide a flexible, extensible and abstract layer for implementing derivation engines for different types of files.

Description:
With the evolution of HTML5 and its video functionality, we face a format problem. There is currently no de facto standard that satisfies every web browser, mobile device, … If we want to support every major browser, the only solution at the moment is to provide our video in several formats. We cannot do that today, since the Media ecosystem in D7 does not provide any way to distribute one instance of video content in different formats. It would be really useful to be able to do that.

Video is not the only type of content that could be converted. Audio could also be distributed in different formats, qualities, … Some day we could also upload a document in any format and have the system distribute it as pdf, odt, rtf, doc, … We could do this for almost any file type that can exist in different formats and that we are able to convert.

Derivates API would provide a framework offering standard Media and Styles integration for such systems. This is a D7 project.

The following features will be available when the project is completed:

  • unified and standardised wrapper for back end engines,
  • ability to process various types of media files (audio, video, ...) through 3rd party providers,
  • ability to create custom derivation presets,
  • ability to choose automatically extracted thumbnails for any media type.
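
To make the first three features more concrete, here is a rough, language-neutral sketch of what a unified engine wrapper with presets could look like. It is written in Python only for brevity and is not the planned Drupal implementation; the names `Preset`, `DerivationEngine` and `DummyVideoEngine` are hypothetical, and the dummy engine only computes a target URI instead of transcoding anything.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
import posixpath


@dataclass
class Preset:
    """A named derivation preset (hypothetical structure)."""
    name: str
    media_type: str      # e.g. "video" or "audio"
    target_format: str   # e.g. "webm" or "ogg"
    options: dict = field(default_factory=dict)


class DerivationEngine(ABC):
    """Common wrapper that every back-end engine would implement."""

    media_types: tuple = ()

    def supports(self, preset: Preset) -> bool:
        # An engine only handles presets for the media types it declares.
        return preset.media_type in self.media_types

    @abstractmethod
    def derive(self, source_uri: str, preset: Preset) -> str:
        """Create a derivative of source_uri and return the new URI."""


class DummyVideoEngine(DerivationEngine):
    """Stand-in engine: builds the target URI, performs no transcoding."""

    media_types = ("video",)

    def derive(self, source_uri: str, preset: Preset) -> str:
        base, _extension = posixpath.splitext(source_uri)
        return f"{base}.{preset.name}.{preset.target_format}"


webm = Preset(name="html5_webm", media_type="video", target_format="webm")
engine = DummyVideoEngine()
derived = engine.derive("public://clips/intro.mov", webm)
print(derived)
```

A real engine would replace the body of `derive()` with a call to a local tool or a 3rd party provider; the caller would not need to know which.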

Edit:

There were a lot of questions about the connection between the Video module and this project. The Video module is currently not part of the Media ecosystem, and we would like to have similar features in Media as well. Video is focused only on video entities, while Derivates API should provide a standard derivates interface for all media types. I personally think that the Video module could be used as one of the backends, using Derivates API to communicate with the Media ecosystem and to create derivates of video files for it. Some other module could do the same for audio, documents, ... I was also thinking about adding Derivates <-> Video integration to this project, but I was not sure whether everything could be done in one project.

My initial idea was to implement only support for video transcoding: something similar in concept to Image styles, except that it would be used for video. I explained this idea to some community members, and that is where the idea for Derivates API came from. It seemed that something more abstract would be better, since it would not be developed for just one media entity type. I was also told that there are plans to make some significant architectural changes to Media soon, so I think this is a great time to plan Derivates API as well.

Schedule:

  • May 23 - May 29 - In-depth research of Styles and the Media ecosystem, research of the Video module and the possibilities for integrating it with Media via Derivates API, research of Media Mover
  • May 30 - June 15 - API planning: research of possibilities, final definition of features, research of different APIs in Drupal, guidelines for API development (blogs, session recordings, books, ... ?)
  • June 15 - July 11 - Implementation
  • July 11 - Midterm submission
  • July 11 - August 15 - Implementation, finalising, polishing, ...
  • August 15 - 22 - Final report submission, presentation @ DrupalCon London (?).

Who:
My name is Janez Urevc, and I am an undergraduate student at the Faculty of Computer and Information Science, University of Ljubljana. I currently spend most of my time on web development. I am also interested in AI (search algorithms, data mining, ...).

I’ve been passionate about open source software since high school. I found Drupal about 3 years ago and have used it for a few projects. I’ve also contributed a few smaller patches and a small module (Sequenced newsletter) to the Drupal community. I’ve been developing D7 websites since Oct 2010, and I released my first D7 site only one week after D7’s official release. I would like to become even more involved in the Drupal community. That is the main purpose of my application.

I am also an active member and one of the founders of the Drupal community in Slovenia.

My contact info:

  • janez AT janezek DOT org (email, XMPP)
  • slashrsm @ freenode, twitter and skype

Comments

dawehner:

In general this sounds like a great idea.

Personally I would be interested in how you will integrate with external providers to convert the data.
So will you provide some abstract way of moving the data into the cloud?

Additionally, will integration with CDNs be part of the project, and how will it collaborate with the other existing solutions?

... These are just questions I'm wondering about myself.

kreynen:

+1 for the idea and +1 for @dereine's suggestion to support external encoding too. Running media_mover on the same server as Drupal caused some serious performance issues when working with large files. Most Open Media Project sites have traded the CPU and storage hit for a bandwidth hit and moved from media_mover to http://drupal.org/project/internet_archive.

Projects like http://drupal.org/project/video assume that you are starting with a small web quality/size video and just looking for support for other devices. I'd like to see a Derivates API that can create HTML5 friendly video from 1-2GB broadcast quality MPEG2 or H.264.

bojanz:

Offloading the actual processing is vital.
media_mover is a nice API, but doing the actual encoding on your server is not viable in most cases.
The project sounds very interesting. I guess there would need to be one implementation of the API, doing the offloading, and maybe one implementation of the API allowing you to do the converting on your server.
Having implementations of the API from the start allows you to see its weak points more easily, as well as get people excited faster.
I'd be interested to hear aaron's opinion (or that of anyone else from the media_mover team).

So, once again, +1

heshanlk:

Yes, this is a great idea, and we are already in the middle of discussing it. You can see the proposal here. If you can, please provide more details over at http://groups.drupal.org/node/137104. What are you going to do with Media?

Senior Drupal Developer at DrupalConnect

slashrsm:

The idea of this API is to provide an abstract framework for handling file derivates of any kind. It will provide a unified layer between the Media ecosystem and the modules that do the actual transcoding. That means you can have several backend implementations handling conversions for the same file type. Video could be processed locally, on another server in the same network, or even by a video transcoding service, depending on the backend module used.

I believe that the Video module could also be integrated with Media through this API. Once we provide a unified interface via Derivates, Video could be one of the modules offering its services through it.
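
As a rough illustration of several backends serving the same file type, here is a minimal sketch, again in Python just for brevity. Everything in it is an assumption made for illustration: `DerivatesRegistry`, the two placeholder backends and the `remote://` URI are hypothetical names, and neither backend does any real transcoding.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A backend is any callable: (source_uri, target_format) -> derivative URI.
Backend = Callable[[str, str], str]


class DerivatesRegistry:
    """Hypothetical unified layer between Media and transcoding backends."""

    def __init__(self) -> None:
        self._backends: Dict[str, List[Tuple[str, Backend]]] = {}

    def register(self, media_type: str, name: str, backend: Backend) -> None:
        # Several backends may be registered for the same media type.
        self._backends.setdefault(media_type, []).append((name, backend))

    def backends_for(self, media_type: str) -> List[str]:
        return [name for name, _ in self._backends.get(media_type, [])]

    def derive(self, media_type: str, source_uri: str, target_format: str,
               backend_name: Optional[str] = None) -> str:
        candidates = self._backends.get(media_type, [])
        if not candidates:
            raise LookupError(f"no backend registered for {media_type!r}")
        for name, backend in candidates:
            # Without an explicit choice, the first registered backend wins.
            if backend_name is None or name == backend_name:
                return backend(source_uri, target_format)
        raise LookupError(f"unknown backend {backend_name!r}")


def local_backend(source_uri: str, target_format: str) -> str:
    # Placeholder: a real backend would run a local transcoder here.
    return f"{source_uri}.{target_format}"


def remote_service_backend(source_uri: str, target_format: str) -> str:
    # Placeholder: a real backend would submit a job to an external service.
    return f"remote://jobs/{target_format}?source={source_uri}"


registry = DerivatesRegistry()
registry.register("video", "local", local_backend)
registry.register("video", "remote_service", remote_service_backend)

default_result = registry.derive("video", "public://talk.mp4", "webm")
remote_result = registry.derive("video", "public://talk.mp4", "webm",
                                backend_name="remote_service")
print(default_result)
print(remote_result)
```

The calling code stays the same whether the work happens locally or on a transcoding service; only the registered backend changes.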

Does this make sense?

Janez Urevc - software engineer @ Examiner.com - @slashrsm - janezurevc.name

heshanlk:

It would be great. Even the Video module itself has a solid API for transcoding, and we have implemented FFMPEG_Wrapper module, FFMPEG and Zencoder transcoders with the local file system and Amazon S3. You can look at how the Video module uses different stream wrappers (already in), different transcoders (already done in the Video module) and metadata extractors.

The idea of this API is to provide abstract framework for handling file derivates of any kind

What do you mean by this? Could you please say more specifically what the API does?

Senior Drupal Developer at DrupalConnect

Great idea. Stay focused.

effulgentsia:

Adding my +1 to the ones already mentioned.

I like that your proposal is nicely focused on a specific problem space: a unified API, integrated into Drupal, for generating a derivative file from a source file. Don't worry. There's plenty to do in this problem space. Definitely enough to fill a summer. Especially if you consider the requirements correctly identified by dereine and bojanz that the API must support the source file, the derivate file, and the processing server, all being remote (and potentially, all 3 being on different remote servers). This is a great time to work on this, with stream wrappers in Drupal 7 core, and with no D7 version of Media Mover yet. This overlaps a lot with what Media Mover does, so studying D6 Media Mover gives you a good head start, but D7 stream wrappers let you evolve this into both a lighter and richer API. This can then become the foundation on which a full D7 Media Mover port can happen (where the full port would include some additional UI, workflow management, and entity/field integration). Hey, if you have time to help on that after the summer, that would be sweet!

Note that while this will be an awesome addition to the Media ecosystem, it is distinct from the current Styles module. The current Styles module is focused on an API / UI for configuring how a file is displayed within the context of the web page (e.g., configuring that in such and such context, the video markup should be targeted for HTML5, and the video player should use x library, have a green skin, and not autostart the video), whereas this proposal is about generating a suitable file to play. Both are important problems to solve, and when both are solved well, there will be room for polishing the integration UI between the two, but for now, I recommend staying focused on the file generation part of the puzzle. This is how Drupal is built. One Lego piece at a time.

Given that, I recommend removing "control of media entity display mode (according to presets)." from the project goals (don't worry if you see this comment too late to change the official proposal, it's normal for the specifics to evolve as the project starts and progresses). That part is solidly in the domain of Media and Styles. Hey, we'd love your help on improving that too, if you have time. But the more you can stay focused on the main problem space, the happier I think you'll feel about what you accomplish.

I'm certainly looking forward to this. Between a solid system of generating derivative files, work on the Styles UI itself, WYSIWYG integration improvements proposed in http://groups.drupal.org/node/137104, and the other Media related GSoC proposals, Media within Drupal can see a really nice push forward this summer!

slashrsm:

I appreciate your support and the great ideas you shared. I removed that item from the features list.

My basic motivation for this application was a strong desire to get involved in the Drupal community, so I'd love to stay involved in Drupal after the project as well. I really hope that my application will be approved, since it would be a much friendlier setting in which to start contributing something more serious.

Can I contact you in the planning phase of the project for support?

Janez Urevc - software engineer @ Examiner.com - @slashrsm - janezurevc.name

Absolutely

effulgentsia:

Can I contact you in the planning phase of the project for support?

Yes. http://drupal.org/user/78040/contact or just add a comment to this post, since I'm subscribed to it.

arthurf:

I absolutely agree with Alex that you should stay focused on the API itself. I am on board to help with this as much as I can. I would be very glad to refocus all of my Media Mover 7 development effort on a derivatives API; it's a huge win for all of us if it can be put into place.