Plugins and proxy modules

Introduction

This chapter presents an overview of input and output plugins, video and audio filters, and transitions. They are the most common building blocks of OpenVIP networks and therefore deserve special treatment. Although they could be implemented as ordinary modules (according to the IModule interface), there are special interfaces that simplify the development of these plugins.

A complete list of available plugins can be obtained at any time using the list_plugins.py Python script.

Input and output plugins

It is clear that every correct OpenVIP network must contain so-called input and output modules, i.e. modules which have no input or output connectors, respectively. An input module typically reads data from a file or another data stream; note that OpenVIP itself is flexible enough to cooperate with almost any source. An output module writes the data to a file or another destination.

The previous chapter presented an overview of modules and their methods. We learned that a module has to implement at least four methods: EnumConnectors, SetStreams, QueryRequirements and Process. However, most input plugins have a similar behaviour (e.g. their QueryRequirements method returns an empty request list). It is therefore reasonable to use a simplified interface for input plugins:
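
The full declarations are not repeated here; a minimal C++ sketch consistent with the methods described below might look as follows (the exact signatures in the OpenVIP headers may differ, and the use of AudioAddress in DecodeAudio is an assumption):

class IInput
{
   public:
   // Report the streams (video and/or audio) available in this source.
   virtual void EnumStreams(StreamInfoList &streams) = 0;
};

class IVideoInput : public IInput
{
   public:
   // Decode and return the video frame with the given number.
   virtual upf::Ptr<IVideoFrame> DecodeVideo(long frame) = 0;
};

class IAudioInput : public IInput
{
   public:
   // Decode and return the block of audio identified by the address.
   virtual upf::Ptr<IAudioBuffer> DecodeAudio(const AudioAddress &addr) = 0;
};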

Note that IInput is an abstract interface and that every input plugin has to implement at least one of the IVideoInput or IAudioInput interfaces. The OpenVIP core calls the EnumStreams method to obtain a list of streams that are available in this source. EnumStreams returns a list of StreamInfo objects, i.e. the same type of information that IModule::SetStreams would return.

Depending on the returned stream types, the core then queries the object for specialized interfaces: if EnumStreams returns one or more STREAM_VIDEO entries, the input plugin object must implement the IVideoInput interface and the core will use the method DecodeVideo to ask the plugin for a particular video frame. Similarly, if the plugin reports one or more STREAM_AUDIO streams, the input plugin object must implement the IAudioInput interface and the core will use the method DecodeAudio to ask the plugin to decode a specific block of audio.

One might now ask: How is it possible that I use an input plugin in the same way as a module, if it doesn't implement the IModule interface? Let's have a look at a part of a network description file:

<module id="loader0" class="Input">
    <param name="filename">input.mpg</param>
    <param name="format">FFMpeg</param>
</module>

The Input class is a so-called input proxy module. This means that it is a true module (it implements the IModule interface) and its task is to translate IInput methods into IModule's methods (and vice versa). The right input plugin is selected using the format parameter - in our case it is FFMpeg, which means that the Input class will call the methods of the FFMpegInput class. In general, the communication between the Input class and IInput plugins proceeds as follows:

Input::EnumConnectors calls IInput::EnumStreams to learn about the available streams. For each stream it creates an output connector with a standardized name such as video0, audio0 etc. It then returns the list of these connectors to the core.

The information from IInput::EnumStreams is also used in Input::SetStreams - this method simply copies the information about available streams to a new StreamInfoList list and returns it to the core.

The Input::QueryRequirements method always returns an empty list, since an input plugin has no input connectors.

Finally, there is the Input::Process method. It reads the requests parameter supplied by the core and for each request it calls either IAudioInput::DecodeAudio or IVideoInput::DecodeVideo depending on the type of the request.
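
To make this translation concrete, here is an illustrative sketch of the dispatch loop (the Request field names, the result-collection API and the m_video/m_audio members are assumptions, not the actual OpenVIP source):

// Illustrative only: 'connector', 'frame', 'address', ResultList and the
// m_video/m_audio members are assumed names, not real OpenVIP identifiers.
void Input::Process(const RequestList &requests, ResultList &results)
{
   for (size_t i = 0; i < requests.size(); i++)
   {
      const Request &req = requests[i];
      if (req.connector.substr(0, 5) == "video")        // video0, video1, ...
         results.push_back(m_video->DecodeVideo(req.frame));
      else                                              // audio0, audio1, ...
         results.push_back(m_audio->DecodeAudio(req.address));
   }
}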

One more note about the input plugin autodetection mechanism: Each input plugin should be accompanied by a so-called factory class which implements the IInputFactory interface:
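
Its two methods, as used by the autodetection code described below, might be declared like this (a sketch; the parameter types are assumptions):

class IInputFactory
{
   public:
   // Can the corresponding input plugin read this file?
   virtual bool CanRead(const std::string &filename) = 0;
   // Create an instance of the corresponding input plugin.
   virtual upf::Ptr<IInput> CreateInputObject() = 0;
};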

The factory informs the core whether a specified file can be understood by the corresponding input plugin. The autodetection code instantiates all known IInputFactory implementations and calls their CanRead method until one of the factories returns true. After that, the CreateInputObject method of this factory is called and the returned IInput object is used to load the data from the file.

The situation with output plugins is almost the same. Such a plugin has to implement the IOutput interface and at least one of its specializations:
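
Neither the interface nor its specializations are reproduced here; a hypothetical sketch by analogy with the input side could look like this (the names IVideoOutput, IAudioOutput, EncodeVideo and EncodeAudio are assumptions chosen for illustration):

// Hypothetical sketch by analogy with IInput -- all names below except
// IOutput itself are assumptions, not taken from the OpenVIP headers.
class IOutput
{
   public:
   // Tell the plugin which streams it is going to receive.
   virtual void SetStreams(const StreamInfoList &streams) = 0;
};

class IVideoOutput : public IOutput            // hypothetical name
{
   public:
   virtual void EncodeVideo(long frame, IVideoFrame *data) = 0;
};

class IAudioOutput : public IOutput            // hypothetical name
{
   public:
   virtual void EncodeAudio(const AudioAddress &addr, IAudioBuffer *data) = 0;
};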

There is also the output proxy class called Output, which translates between IModule and IOutput method calls.

The section Input and output plugins contains an overview of all implemented input and output plugins and their parameters. To develop a new input or output plugin, all you have to write is a class which implements the IInput or IOutput interface, respectively, together with at least one of their specializations (and possibly a factory class for an input plugin).

Video filters

IVideoFilter is a simplified interface for video filter plugins:
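
In C++ terms the interface might be sketched as follows (the IDL prototypes of both methods are given later in this section; other details are assumptions):

class IVideoFilter
{
   public:
   // Which input frames are needed to produce output frame 'frame'?
   virtual void QueryRequiredFrames(long frame, FrameNumberList &req_frames) = 0;
   // Produce the output frame from the requested input frames.
   virtual upf::Ptr<IVideoFrame> Process(long frame, const VideoFrameList &inputs) = 0;
};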

As you might already expect, there is a proxy class called VideoFilter which translates between IModule's and IVideoFilter's methods. The right video filter plugin is selected using the parameter videofilter of the VideoFilter module (see the network example in Network format).

A video filter always has one input and one output connector; the VideoFilter::EnumConnectors method therefore returns two connectors, both named video0 (one input and one output).

Most video filters don't change the video stream's parameters such as width, height etc. That's why the VideoFilter::SetStreams method simply copies the input stream parameters. However, there are some video filters (such as Resize), which need to modify the stream's parameters. These filters have to implement the IStreamChangingFilter interface:
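
A sketch of what this interface might look like (the exact signature of ChangeStream is an assumption):

class IStreamChangingFilter
{
   public:
   // Given the input stream parameters, return the output stream parameters.
   virtual StreamInfo ChangeStream(const StreamInfo &input) = 0;
};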

The VideoFilter::SetStreams method then calls the filter's ChangeStream method to get information about the output stream.

To produce a single output video frame a video filter usually needs one or more input video frames. This is the purpose of the IVideoFilter::QueryRequiredFrames method. Its full prototype is

void QueryRequiredFrames(in long frame, out FrameNumberList req_frames);

When a filter is asked to produce the frame with number frame, it enumerates all the frames required to fulfill this request and stores their numbers in the req_frames list.
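
For example, a hypothetical motion-blur filter that averages three consecutive frames might implement the method like this (the clamping at the stream start is illustrative, and the push_back call assumes FrameNumberList behaves like a standard container):

void MotionBlurVFilter::QueryRequiredFrames(long frame, FrameNumberList &req_frames)
{
   // To blur frame N we need frames N-2, N-1 and N; clamp at the
   // beginning of the stream so we never request a negative frame.
   for (long f = frame - 2; f <= frame; f++)
      req_frames.push_back(f < 0 ? 0 : f);
}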

The VideoFilter::QueryRequirements method then simply calls the filter's QueryRequiredFrames and for each frame number from req_frames it creates a Request structure with that frame number and the connector identifier video0.

The video filtering is actually performed using the IVideoFilter::Process call:

IVideoFrame Process(in long frame, in VideoFrameList inputs);

The VideoFilter::Process method merely translates each Request structure into a video frame number and supplies it to the filter's Process method.

A simple video filter example

There is a class called SimpleVideoFilter which simplifies the development of new video filters even further. You may use this class if the following statement holds: to produce the output frame with number N, the filter needs only the input frame with number N. In fact this is true for the overwhelming majority of video filters - only special filters such as motion blur need more frames at once.

The only thing a simple video filter has to do is to implement the DoProcess method:

virtual upf::Ptr<IVideoFrame> DoProcess(IVideoFrame *in);

This method simply gets an input frame, processes it and returns the output frame.

This section presents an example of a simple filter which inverts images. We start with the header and class declaration:

#include <upf/upf.h>
#include "openvip/openvip.h"
#include "openvip/SimpleVideoFilter.h"

using namespace std;
using namespace upf;
using namespace openvip;

class InvertVFilter : public SimpleVideoFilter
{
   protected:
   Ptr<IVideoFrame> DoProcess(IVideoFrame *in);

   UPF_DECLARE_CLASS(InvertVFilter)
};

UPF_IMPLEMENT_CLASS(InvertVFilter)
{
   UPF_INTERFACE(IVideoFilter)
   UPF_PROPERTY("Description", "Simple invert")
}

UPF_DLL_MODULE()
{
   UPF_EXPORTED_CLASS(InvertVFilter)
}

There is a convention in OpenVIP that a video filter class name should end with the VFilter suffix (see also the openvip/doc/devel/coding.txt file). If you use such a filter in a network, it is then sufficient to specify the name without the suffix, e.g. Invert.

The UPF-related statements are always the same; you just use the right class name and provide a short description of your filter using the Description property (be sure to enter this description - it is used by the GUI).

The DoProcess method itself is simple, too:

Ptr<IVideoFrame> InvertVFilter::DoProcess(IVideoFrame *in)
{
   int w = in->GetWidth();
   int h = in->GetHeight();
   Ptr<IVideoFrame> out = in->GetWCopy(FORMAT_RGB24);
   pixel_t *dst = out->GetWData(FORMAT_RGB24);

   for (int i=0; i<3*w*h; i++)
      dst[i]=255-dst[i];

   return out;
}

We just ask for the input frame in RGB24 format and invert the individual colour values (there are 3*w*h of them, w*h for every colour channel).

This filter has the disadvantage that it always asks for an RGB24 image. If the input plugin reads frames in YV12 format, each frame must be converted before being processed. A more efficient version of our filter would first check in which formats the frame is available (using the GetFormats method) and then invert the image in its native format.

Video transitions

A video transition is an operation on two video streams which produces a single output video stream. The output usually looks like a gradual transition from the pictures of the first stream to the pictures of the second stream. Video transition classes implement the IVideoTransition interface:
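
In C++ terms the interface might be sketched as follows (the IDL prototype of Process is given below; the rest is an assumption):

class IVideoTransition
{
   public:
   // Total length of the transition in frames; called before Process.
   virtual void SetLength(long frames) = 0;
   // Blend inputA and inputB into output frame number 'frame'.
   virtual upf::Ptr<IVideoFrame> Process(long frame, IVideoFrame *inputA,
                                         IVideoFrame *inputB) = 0;
};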

Again, there is a proxy class called VideoTransition which makes video transitions behave like modules. We will not delve into the proxy implementation details here as they are almost the same as in the VideoFilter proxy class described above. We just note that the input connectors are always named video0 and video1, the output connector is always named video0, and the video transition plugin is selected using the parameter transition.

To write a new video transition, all you have to do is write a class which implements the SetLength and Process methods.

The SetLength method only informs the plugin about the transition's length in frames; it is always called before the actual computation begins. Here is the full prototype of the Process method:

IVideoFrame Process(in long frame, in IVideoFrame inputA, in IVideoFrame inputB);

This call asks the plugin to render the frame with number frame; the frame values range from 0 to N-1, where N is the value obtained from SetLength. The Process method blends the two images inputA and inputB together and returns the result. E.g. a simple crossfade transition would perform the following operation on every pixel:

*dst = (pixel_t)((1-alpha)*(*src_a) + alpha*(*src_b));

where alpha=frame/(N-1) and src_a, src_b and dst are pointers to the corresponding pixels.
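
Put together, the Process method of a crossfade transition might look roughly like this (a sketch: m_length is an assumed member storing the value passed to SetLength, and reading inputB through a writable copy merely reuses the accessors from the invert example above):

Ptr<IVideoFrame> CrossfadeVTransition::Process(long frame, IVideoFrame *inputA,
                                               IVideoFrame *inputB)
{
   double alpha = (m_length > 1) ? (double)frame / (m_length - 1) : 1.0;

   int w = inputA->GetWidth();
   int h = inputA->GetHeight();

   // Writable RGB24 copies, as in the invert example.
   Ptr<IVideoFrame> out = inputA->GetWCopy(FORMAT_RGB24);
   pixel_t *dst = out->GetWData(FORMAT_RGB24);
   Ptr<IVideoFrame> tmp = inputB->GetWCopy(FORMAT_RGB24);
   pixel_t *src_b = tmp->GetWData(FORMAT_RGB24);

   for (int i = 0; i < 3*w*h; i++)
      dst[i] = (pixel_t)((1-alpha)*dst[i] + alpha*src_b[i]);

   return out;
}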

Audio filters

An audio filter is a class which implements the IAudioFilter interface:
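
Its methods, as they are used in the walkthrough below, might be declared roughly like this (a sketch; the exact parameter types, in particular those of QueryRequiredData, are assumptions):

class IAudioFilter
{
   public:
   virtual int GetPasses() = 0;               // number of passes needed
   virtual void SetLength(long samples) = 0;  // total audio stream length
   virtual long GetCompStepCount() = 0;       // computation steps in the
                                              // current pass (0 = non-terminal)
   // Which input block is needed to serve the given request?
   virtual void QueryRequiredData(const AudioAddress &request,
                                  AudioAddress &required) = 0;
   // Process one block of audio and return the result (if any).
   virtual upf::Ptr<IAudioBuffer> Process(const AudioAddress &addr,
                                          IAudioBuffer *input) = 0;
   virtual void NextPass() = 0;               // switch to the next pass
};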

The audio processing can be performed in one or more passes; you should brush up on multi-pass modules in More on modules before reading the next paragraphs.

The meaning of IAudioFilter interface methods will be explained using the normalization filter example. The normalization filter first scans the whole audio stream and looks for the maximum sample absolute value. It then scales all samples in such a way that they cover the whole 16-bit range.

It is obvious that normalization can be performed in two passes (the first looks for the maximum, the second does the scaling); the NormalizationAFilter::GetPasses method therefore returns 2.

Now for the first pass: we have to report the number of computation steps. Let's say we'll search the audio stream in 64 kB blocks. The SetLength method tells us the total length of the audio stream. Our GetCompStepCount method therefore returns the number of 64 kB blocks that fit in the total length (the last block may be incomplete); we will denote this number N.

The core (more exactly, the audio filter proxy class AudioFilter) now calls the QueryRequiredData method with the computation step address equal to 0, 1, ..., N-1. We reply with an AudioAddress structure identifying the appropriate 64 kB audio block. The core supplies that block to the Process method; we scan it for the maximal sample value.

We are now done with the first pass and have the maximal value. The core calls the audio filter's NextPass method to switch it to the second pass. This is the last pass, and since we don't want our module to be terminal, GetCompStepCount should return 0.

The QueryRequiredData method now gets requests for audio blocks specified using an AudioAddress. To produce a normalized block of audio we need the same block from the input stream; our QueryRequiredData therefore simply copies the AudioAddress structure. The core then supplies that block in the form of an audio buffer to the Process method. We multiply each sample by a constant computed after the first pass and return the processed audio buffer.
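
To make the two passes concrete, here is a condensed sketch of the Process logic (m_pass and m_max are assumed member variables maintained by NextPass and the constructor, and the buffer accessors GetWData/GetSampleCount as well as the signed 16-bit sample range are assumptions):

Ptr<IAudioBuffer> NormalizationAFilter::Process(const AudioAddress &addr,
                                                IAudioBuffer *input)
{
   sample_t *data = input->GetWData();        // assumed accessor
   long n = input->GetSampleCount();          // assumed accessor

   if (m_pass == 0)
   {
      // First pass: remember the maximal absolute sample value.
      for (long i = 0; i < n; i++)
      {
         long v = data[i] < 0 ? -(long)data[i] : (long)data[i];
         if (v > m_max)
            m_max = v;
      }
      return 0;                               // the first pass produces no output
   }

   // Second pass: scale every sample so the stream covers the 16-bit range.
   double k = (m_max > 0) ? 32767.0 / m_max : 1.0;
   for (long i = 0; i < n; i++)
      data[i] = (sample_t)(k * data[i]);
   return input;
}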

One more note on the AudioFilter class: the input and output connectors are both named audio0 and the filter is selected using the parameter audiofilter.

Audio transitions

An audio transition is the analogy of video transition for audio streams. When blending two video sequences together, one usually wants to mix their audio tracks, too. Audio transition classes implement the IAudioTransition interface:
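
A sketch of the interface (SetLength now takes samples; the IDL prototype of Process is given below, the rest is an assumption):

class IAudioTransition
{
   public:
   // Total length of the transition in samples; called before Process.
   virtual void SetLength(long samples) = 0;
   // Mix inputA and inputB at position 'pos' and return the result.
   virtual upf::Ptr<IAudioBuffer> Process(long pos, IAudioBuffer *inputA,
                                          IAudioBuffer *inputB) = 0;
};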

Its methods are almost the same as in IVideoTransition. Instead of frames, SetLength sets the transition's length in samples. The Process method now has the following prototype:

IAudioBuffer Process(in long pos, in IAudioBuffer inputA, in IAudioBuffer inputB);

It has to mix two audio buffers which correspond to the position pos in the input streams; this number ranges from 0 to N-1, where N is the value obtained from SetLength. A simple crossfade audio transition would perform the following operation on every sample:

*dst = (sample_t)((1-alpha)*(*src_a) + alpha*(*src_b));

where alpha is the sample number divided by N-1 and src_a, src_b and dst are pointers to the corresponding samples.

The transition's Process and SetLength methods are again called by the AudioTransition proxy class, which makes audio transitions behave like modules. The input connectors (created by the AudioTransition::EnumConnectors method) are always named audio0 and audio1 and the output connector is always named audio0. The right audio transition plugin is selected using the parameter transition.