Media Foundation - Architecture

Architecture

The Media Foundation (MF) architecture is divided into three layers: the Control layer, the Core layer and the Platform layer. The Core layer encapsulates most of the functionality of Media Foundation. It consists of the Media Foundation pipeline, which has three kinds of components: media sources, media sinks and Media Foundation Transforms (MFTs).

A media source is an object that acts as the source of multimedia data, either compressed or uncompressed. It can encapsulate various data sources, such as a file, a network server or a camcorder, with source-specific functionality abstracted behind a common interface. A source object can use a source resolver object, which creates a media source from a URI, file or byte stream. Support for non-standard protocols can be added by creating a source resolver for them. A source object can also use a sequencer object to play a sequence of sources (a playlist) or to coalesce multiple sources into a single logical source.

A media sink is the recipient of processed multimedia data. A media sink can be either a renderer sink, which renders the content on an output device, or an archive sink, which saves the content to persistent storage such as a file. A renderer sink takes uncompressed data as input, whereas an archive sink can take either compressed or uncompressed data, depending on the output type.

Data travelling from media sources to sinks is processed by MFTs, which are components that transform the data from one form into another. MFTs include multiplexers and demultiplexers, codecs, and DSP effects such as reverb.

The Core layer relies on services such as file access, networking and clock synchronization to time the multimedia rendering. These are part of the Platform layer, which provides the services necessary for accessing the source and sink byte streams, presentation clocks, and an object model that lets the Core layer components function asynchronously. The Platform layer is generally implemented as OS services. Pausing, stopping, fast forward, reverse or time-compression can be achieved by controlling the presentation clock.
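The source resolver idea described above (a URI goes in, a media source comes out, and new protocols are supported by registering a handler) can be illustrated with a small sketch. This is not the Media Foundation API: `MediaSource`, `SourceResolver`, `register_scheme`, `create_from_uri` and the `myproto` scheme are all hypothetical names chosen for illustration.

```python
from urllib.parse import urlparse


class MediaSource:
    """Hypothetical stand-in for a media source; not the real MF interface."""

    def __init__(self, kind, location):
        self.kind = kind
        self.location = location


class SourceResolver:
    """Maps a URI scheme to a factory that builds a media source."""

    def __init__(self):
        self._handlers = {}

    def register_scheme(self, scheme, factory):
        # Registering a handler is how support for a new protocol is added.
        self._handlers[scheme.lower()] = factory

    def create_from_uri(self, uri):
        scheme = urlparse(uri).scheme.lower()
        if scheme not in self._handlers:
            raise ValueError(f"no handler registered for scheme {scheme!r}")
        return self._handlers[scheme](uri)


resolver = SourceResolver()
resolver.register_scheme("file", lambda uri: MediaSource("file", uri))
resolver.register_scheme("http", lambda uri: MediaSource("network", uri))
# "myproto" is an invented scheme standing in for a non-standard protocol.
resolver.register_scheme("myproto", lambda uri: MediaSource("custom", uri))

source = resolver.create_from_uri("myproto://camera-1/stream")
print(source.kind)
```

The caller only ever sees the common `MediaSource` type, which mirrors how source-specific details stay hidden behind a common interface.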

However, the media pipeline components are not connected to one another; they are simply presented as discrete components. An application running in the Control layer has to choose which source types, transforms and sinks are needed for the particular video processing task at hand, and set up the "connections" between the components (a topology) to complete the data flow pipeline. For example, to play back a compressed audio/video file, the pipeline consists, in sequence, of a file source object, a demultiplexer for the specific container format to split the audio and video streams, codecs to decompress the audio and video streams, DSP processors for audio and video effects, and finally the Enhanced Video Renderer (EVR). For a video capture application, the camcorder acts as the video and audio source; codec MFTs compress the data and feed it to a multiplexer that coalesces the streams into a container, and finally a file sink or a network sink writes it to a file or streams it over a network.

The application also has to coordinate the flow of data between the pipeline components. The Control layer has to "pull" (request) samples from one pipeline component and pass them on to the next component in order to achieve data flow within the pipeline. This is in contrast to DirectShow's "push" model, where a pipeline component pushes data to the next component. Media Foundation allows content protection by hosting the pipeline within a protected execution environment, called the Protected Media Path.

The Control layer components are required to propagate the data through the pipeline at a rate that keeps the rendering synchronized with the presentation clock. The rate (or time) of rendering is embedded in the multimedia stream as metadata. The source objects extract the metadata and pass it on. Metadata is of two types: coded metadata, which is information about bit rate and presentation timings, and descriptive metadata, such as title and author names. Coded metadata is handed over to the object that controls the pipeline session, and descriptive metadata is exposed for the application to use if it chooses to.
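The "pull" model described above can be sketched as a loop owned by the controlling code: it requests a sample from the source, hands it through each transform in turn, and delivers the result to the sink. The classes below (`ListSource`, `UppercaseTransform`, `CollectingSink`) and the function `run_pipeline` are hypothetical stand-ins, not Media Foundation types, and the "samples" are plain strings rather than media buffers.

```python
class ListSource:
    """Hypothetical source that hands out a sample only when asked."""

    def __init__(self, samples):
        self._samples = iter(samples)

    def request_sample(self):
        # None marks the end of the stream.
        return next(self._samples, None)


class UppercaseTransform:
    """Stand-in for a transform such as a decoder or an effect."""

    def process(self, sample):
        return sample.upper()


class CollectingSink:
    """Stand-in for a sink; it just records what it receives."""

    def __init__(self):
        self.received = []

    def write(self, sample):
        self.received.append(sample)


def run_pipeline(source, transforms, sink):
    """Control-layer loop: pull from the source, transform, deliver to the sink."""
    while True:
        sample = source.request_sample()
        if sample is None:
            break
        for transform in transforms:
            sample = transform.process(sample)
        sink.write(sample)


sink = CollectingSink()
run_pipeline(ListSource(["a", "b", "c"]), [UppercaseTransform()], sink)
print(sink.received)  # ['A', 'B', 'C']
```

Note that none of the components holds a reference to its neighbour. The ordering of source, transforms and sink exists only in the arguments to `run_pipeline`, which plays the role of the topology here.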

Media Foundation provides a Media Session object that can be used to set up topologies and drive the data flow, so that the application does not have to do so explicitly. It exists in the Control layer and exposes a topology loader object. The application specifies the required pipeline topology to the loader, which then creates the necessary connections between the components. The Media Session object manages the job of synchronizing with the presentation clock. It creates the presentation clock object and passes a reference to it to the sink. It then uses the timer events from the clock to propagate data along the pipeline. It also changes the state of the clock to handle pause, stop or resume requests from the application.
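The relationship between the session, the clock and the sink can be sketched as follows. This is a simplified model under stated assumptions, not the real Media Session: `PresentationClock`, `TimestampingSink` and `MediaSession` are hypothetical classes, the clock is advanced manually with `tick()` instead of running on a real timer, and one sample is delivered per tick.

```python
class PresentationClock:
    """Hypothetical clock that notifies listeners on each tick while running."""

    def __init__(self):
        self.state = "stopped"
        self.time = 0
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def start(self):
        self.state = "running"

    def pause(self):
        self.state = "paused"

    def tick(self):
        # A paused or stopped clock does not advance and raises no timer events.
        if self.state != "running":
            return
        self.time += 1
        for callback in self._listeners:
            callback(self.time)


class TimestampingSink:
    """Stand-in sink that records the clock time at which each sample arrives."""

    def __init__(self):
        self.clock = None
        self.rendered = []

    def set_clock(self, clock):
        self.clock = clock

    def write(self, sample):
        self.rendered.append((self.clock.time, sample))


class MediaSession:
    """Creates the clock, hands it to the sink, and moves data on timer events."""

    def __init__(self, samples, sink):
        self._samples = iter(samples)
        self.sink = sink
        self.clock = PresentationClock()
        sink.set_clock(self.clock)
        self.clock.subscribe(self._on_tick)

    def _on_tick(self, _time):
        sample = next(self._samples, None)
        if sample is not None:
            self.sink.write(sample)

    def start(self):
        self.clock.start()

    def pause(self):
        self.clock.pause()


sink = TimestampingSink()
session = MediaSession(["f0", "f1", "f2"], sink)
session.start()
session.clock.tick()  # delivers f0 at time 1
session.pause()
session.clock.tick()  # ignored while paused
session.start()
session.clock.tick()  # delivers f1 at time 2
print(sink.rendered)  # [(1, 'f0'), (2, 'f1')]
```

Pausing is implemented purely as a clock state change: because data only moves in response to timer events, stopping the events is enough to halt the pipeline.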
