This chapter presents an overview of input and output plugins, video
and audio filters, and transitions. They are the most common building
blocks of OpenVIP networks and therefore deserve special treatment.
Although they could be implemented as modules (according to the
IModule interface), there are special
interfaces to simplify the development of these plugins.
A complete list of available plugins can be obtained at any time
using the list_plugins.py Python script.
It is clear that every correct OpenVIP network must contain so-called input and output modules, i.e. modules which have no input or output connectors, respectively. An input module typically reads data from a file or another data stream; note that OpenVIP itself is flexible enough to cooperate with almost any source. An output module writes the data to a file or another destination.
The previous chapter presented an overview of modules and their
methods. We learned that a module has to implement at least four methods:
EnumConnectors,
SetStreams,
QueryRequirements and
Process. However, most input plugins have a
similar behaviour (e.g. their QueryRequirements
method returns an empty request list). It is therefore reasonable to use a
simplified interface for input plugins:

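In C++ terms the interface family looks roughly as follows; the method
names come from the discussion below, while the exact signatures are
assumptions rather than the canonical definition:
// Sketch of the input interfaces; signatures are assumptions.
class IInput
{
public:
    // Report the streams available in this source.
    virtual upf::Ptr<StreamInfoList> EnumStreams() = 0;
};

class IVideoInput : public virtual IInput
{
public:
    // Decode one frame of the given video stream.
    virtual upf::Ptr<IVideoFrame> DecodeVideo(int stream, long frame) = 0;
};

class IAudioInput : public virtual IInput
{
public:
    // Decode the block of audio identified by the address.
    virtual upf::Ptr<IAudioBuffer> DecodeAudio(int stream,
                                               const AudioAddress &addr) = 0;
};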
Note that IInput is an abstract
interface and that every input plugin has to implement at least one of
the IVideoInput and
IAudioInput interfaces. The OpenVIP core calls
the EnumStreams method to obtain a list of
streams that are available in this source.
EnumStreams returns a list of
StreamInfo objects, i.e. the same type of
information that IModule::SetStreams would
return.
Depending on the returned stream types, the core then queries the object
for specialized interfaces: If EnumStreams
returns one or more STREAM_VIDEO entries, the input
plugin object must implement the
IVideoInput interface and the core will use
the method DecodeVideo to ask the plugin for a
particular video frame. Similarly, if the plugin reports one or more
STREAM_AUDIO streams, the input plugin object must
implement the IAudioInput interface and the
core will use the method DecodeAudio to ask the
plugin to decode a specific block of audio.
One might now ask: How is it possible to use an input plugin in
the same way as a module when it doesn't implement the
IModule interface? Let's have a look at a
part of a network description file:
<module id="loader0" class="Input">
<param name="filename">input.mpg</param>
<param name="format">FFMpeg</param>
</module>
The Input class is a so-called input proxy
module. This means that it is a true module (it implements the
IModule interface) and its task is to
translate IInput methods into
IModule's methods (and vice versa). The
right input plugin is selected using the format
parameter - in our case it is FFMpeg, which means that the
Input class will call the methods of the
FFMpegInput class. In general, the communication
between the Input class and
IInput plugins proceeds as follows:
Input::EnumConnectors calls
IInput::EnumStreams to learn about the available
streams. For each stream it creates an output connector with a
standardized name such as video0,
audio0 etc. It then returns the list of these
connectors to the core.
The information from IInput::EnumStreams is
also used in Input::SetStreams - this method
simply copies the information about available streams to a new
StreamInfoList list and returns it to the core.
The Input::QueryRequirements method always
returns an empty list since an input plugin has no input
connectors.
Finally, there is the Input::Process
method. It reads the requests parameter supplied by the core and for each
request it calls either IAudioInput::DecodeAudio
or IVideoInput::DecodeVideo depending on the type
of the request.
One more note about the input plugin autodetection mechanism: Each
input plugin should be accompanied by a so-called factory class which
implements the IInputFactory
interface:

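A rough sketch of the factory interface; the signatures below are
assumptions based on the description that follows:
// Sketch of the factory interface; signatures are assumptions.
class IInputFactory
{
public:
    // Return true if the corresponding plugin understands the file.
    virtual bool CanRead(const std::string &filename) = 0;
    // Instantiate the corresponding input plugin.
    virtual upf::Ptr<IInput> CreateInputObject() = 0;
};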
The factory informs the core whether a specified file can be
understood by the corresponding input plugin. The autodetection code
instantiates all known IInputFactory
implementations and calls their CanRead method
until the first of them returns true. After that, the
CreateInputObject method of this factory is called and the
returned IInput object is used to load the
data from the file.
The situation with output plugins is almost the same. Such a plugin
has to implement the IOutput interface and
at least one of its specializations:

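By analogy with the input side, the interface family presumably looks
like this; the specialization names and all signatures below are
assumptions, not the actual OpenVIP definitions:
// Assumed shape of the output interfaces, mirroring the input side.
class IOutput
{
    // Common output plugin methods (file handling etc.) omitted.
};

class IVideoOutput : public virtual IOutput
{
public:
    // Encode one frame of the given video stream.
    virtual void EncodeVideo(int stream, IVideoFrame *frame) = 0;
};

class IAudioOutput : public virtual IOutput
{
public:
    // Encode one block of the given audio stream.
    virtual void EncodeAudio(int stream, IAudioBuffer *buffer) = 0;
};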
There is also the output proxy class called
Output, which translates between the
IModule's and
IOutput's method calls.
The section Input and output
plugins contains an overview of all implemented input and output
plugins and their parameters. To develop a new input or output plugin, all
you have to write is a class which implements the
IInput or
IOutput interface, respectively, together
with at least one of its specializations (and possibly a factory for an
input plugin).
The IVideoFilter is a simplified
interface for video filter plugins:

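Reconstructed from the prototypes quoted below, the interface looks
roughly like this in C++ (the parameter types are assumptions):
// Sketch of the video filter interface; types are assumptions.
class IVideoFilter
{
public:
    // List the input frames needed to produce the given output frame.
    virtual void QueryRequiredFrames(long frame,
                                     FrameNumberList &req_frames) = 0;
    // Produce one output frame from the supplied input frames.
    virtual upf::Ptr<IVideoFrame> Process(long frame,
                                          VideoFrameList &inputs) = 0;
};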
As you might already expect, there is a proxy class called
VideoFilter which translates between
IModule's and
IVideoFilter's methods. The right video
filter plugin is selected using the parameter videofilter
of the VideoFilter module (see the network example in Network format).
A video filter always has one input and one output connector and
therefore the VideoFilter::EnumConnectors method
returns two connectors named video0.
Most video filters don't change the video stream's parameters such
as width, height etc. That's why the
VideoFilter::SetStreams method simply copies the
input stream parameters. However, there are some video filters (such as
Resize), which need to modify the stream's parameters. These filters have
to implement the IStreamChangingFilter
interface:

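A sketch of this interface, with the signature assumed from the
description below:
// Sketch; the exact ChangeStream signature is an assumption.
class IStreamChangingFilter
{
public:
    // Given the input stream parameters, fill in those of the output.
    virtual void ChangeStream(StreamInfo &info) = 0;
};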
The VideoFilter::SetStreams method then
calls the filter's ChangeStream method to get
information about the output stream.
To produce a single output video frame a video filter usually needs
one or more input video frames. This is the purpose of the
IVideoFilter::QueryRequiredFrames method. Its
full prototype is
void QueryRequiredFrames(in long frame, out FrameNumberList req_frames);
When a filter is asked to produce frame with number
frame, it enumerates all the frames required to fulfill
this request and stores their numbers into the
req_frames list.
The VideoFilter::QueryRequirements method then
simply calls the filter's QueryRequiredFrames and
for each frame number from req_frames it creates a
Request structure with that frame number and
connector identifier video0.
The video filtering is actually performed using the
IVideoFilter::Process call:
IVideoFrame Process(in long frame, in VideoFrameList inputs);
The
VideoFilter::Process method only translates the
Request structure to a video frame number and
supplies it to the filter's Process method.
There is a class called SimpleVideoFilter
which simplifies the development of new video filters even further. You may
use this class if the following statement holds: To produce the output frame
with number N, the filter needs only the input frame with
number N. In fact, this is true for the overwhelming
majority of video filters - only special filters such as motion blur need
more frames at once.
The only thing a simple video filter has to do is to implement the
DoProcess method:
virtual upf::Ptr<IVideoFrame> DoProcess(IVideoFrame *in);
This method simply gets an input frame, processes it and returns the output frame.
This section presents an example of a simple filter which inverts images. We start with the header and class declaration:
#include <upf/upf.h>
#include "openvip/openvip.h"
#include "openvip/SimpleVideoFilter.h"
using namespace std;
using namespace upf;
using namespace openvip;
class InvertVFilter : public SimpleVideoFilter
{
protected:
    // Produce one output frame from one input frame.
    Ptr<IVideoFrame> DoProcess(IVideoFrame *in);

    UPF_DECLARE_CLASS(InvertVFilter)
};

UPF_IMPLEMENT_CLASS(InvertVFilter)
{
    UPF_INTERFACE(IVideoFilter)
    UPF_PROPERTY("Description", "Simple invert")
}

UPF_DLL_MODULE()
{
    UPF_EXPORTED_CLASS(InvertVFilter)
}
There is a convention in OpenVIP that a video filter class name
should end with the VFilter suffix (see also the
openvip/doc/devel/coding.txt file). If you use such a
filter in a network, it is then sufficient to specify the name without the
suffix, e.g. Invert.
The UPF-related statements are always the same; you just use the right class name and provide a short description of your filter using the Description property (be sure to enter this description - it is used by the GUI).
The DoProcess method itself is simple,
too:
Ptr<IVideoFrame> InvertVFilter::DoProcess(IVideoFrame *in)
{
    int w = in->GetWidth();
    int h = in->GetHeight();
    // Get a writable copy of the frame, converted to RGB24 if needed.
    Ptr<IVideoFrame> out = in->GetWCopy(FORMAT_RGB24);
    pixel_t *dst = out->GetWData(FORMAT_RGB24);
    // Invert every byte of the interleaved R, G, B data.
    for (int i = 0; i < 3*w*h; i++)
        dst[i] = 255 - dst[i];
    return out;
}
We just ask for the input frame in RGB24 format and invert the
individual pixels (there are 3*w*h
of them, w*h for every colour
channel).
This filter has the disadvantage that it always asks for an RGB24
image. If the input plugin reads frames in YV12 format, each frame must be
converted before being processed. A more efficient version of our filter
would first check in which formats the frame is available (using the
GetFormats method) and then invert the image in
that format.
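Purely as an illustration, such a format-aware version might look like
the following sketch; the GetFormats return convention and the
GetDataSize accessor are assumptions, not verified OpenVIP API:
// Hypothetical sketch: invert in the frame's native format if we can.
Ptr<IVideoFrame> InvertVFilter::DoProcess(IVideoFrame *in)
{
    int fmt = FORMAT_RGB24;
    if (in->GetFormats() & FORMAT_YV12)  // assumed: bitmask of formats
        fmt = FORMAT_YV12;               // avoid a colour conversion
    Ptr<IVideoFrame> out = in->GetWCopy(fmt);
    pixel_t *dst = out->GetWData(fmt);
    // Inverting every byte also inverts a YV12 image: luma inverts
    // directly and chroma is (approximately) reflected around its
    // midpoint, which matches an RGB inversion.
    for (int i = 0; i < out->GetDataSize(fmt); i++)  // assumed accessor
        dst[i] = 255 - dst[i];
    return out;
}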
A video transition is an operation on two video streams which
produces a single output video stream. The output usually looks like a
gradual transition from the pictures of the first stream to the pictures
of the second stream. Video transition classes implement the
IVideoTransition interface:

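Reconstructed from the prototype quoted below, the interface looks
roughly like this (the exact signatures are assumptions):
// Sketch of the video transition interface; types are assumptions.
class IVideoTransition
{
public:
    // Set the transition's length in frames before processing starts.
    virtual void SetLength(long length) = 0;
    // Blend the two input frames at the given position.
    virtual upf::Ptr<IVideoFrame> Process(long frame,
                                          IVideoFrame *inputA,
                                          IVideoFrame *inputB) = 0;
};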
Again, there is a proxy class called
VideoTransition which makes video transitions
behave like modules. We will not delve into the proxy implementation
details here as it is almost the same as the
VideoFilter proxy class described above. We just
note that the input connectors are always named
video0 and video1, that the output
connector is always video0, and that the video
transition plugin is set using the parameter
transition.
To write a new video transition, all you have to do is write a class
which implements the SetLength and
Process methods.
The SetLength method only informs the
plugin about the transition's length in frames; it is always called before
the actual computation begins. Here is the full prototype of the
Process method:
IVideoFrame Process(in long frame, in IVideoFrame inputA, in IVideoFrame inputB);
This call asks the plugin to render frame with number frame; the
frame values range from 0 to N-1, where
N is the value obtained from
SetLength. The Process
method blends the two images inputA and
inputB together and returns the result. E.g. a simple
crossfade transition would do the following operation on every
pixel:
*dst = (pixel_t)((1-alpha)*(*src_a) + alpha*(*src_b));
where
alpha=frame/(N-1)
and src_a, src_b and
dst are pointers to the corresponding pixels.
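For concreteness, here is a minimal sketch of such a crossfade in C++,
reusing the frame accessors from the invert filter example above; the
class name and the m_length member (stored by SetLength) are
illustrative assumptions:
// Minimal crossfade sketch; m_length is assumed to be set by SetLength.
Ptr<IVideoFrame> CrossfadeVTransition::Process(long frame,
    IVideoFrame *inputA, IVideoFrame *inputB)
{
    double alpha = (double)frame / (m_length - 1);
    int w = inputA->GetWidth();
    int h = inputA->GetHeight();
    // Start from a writable copy of the first input frame.
    Ptr<IVideoFrame> out = inputA->GetWCopy(FORMAT_RGB24);
    pixel_t *dst = out->GetWData(FORMAT_RGB24);
    pixel_t *src_b = inputB->GetWData(FORMAT_RGB24);
    // Blend every byte of the two RGB24 images.
    for (int i = 0; i < 3*w*h; i++)
        dst[i] = (pixel_t)((1 - alpha)*dst[i] + alpha*src_b[i]);
    return out;
}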
An audio filter is a class which implements the
IAudioFilter interface:

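Putting the methods discussed below together, the interface looks
roughly like this; only the method names appear in the text, all
signatures are assumptions:
// Sketch of the audio filter interface; signatures are assumptions.
class IAudioFilter
{
public:
    virtual int GetPasses() = 0;               // number of passes
    virtual void SetLength(long samples) = 0;  // total stream length
    virtual long GetCompStepCount() = 0;       // steps in current pass
    virtual void NextPass() = 0;               // switch to the next pass
    // Translate a request or computation step into an input address.
    virtual void QueryRequiredData(const AudioAddress &addr,
                                   AudioAddress &required) = 0;
    // Process one block of audio.
    virtual upf::Ptr<IAudioBuffer> Process(IAudioBuffer *input) = 0;
};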
The audio processing can be performed in one or more passes; you should brush up on the facts about multi-pass modules from More on modules before reading the next paragraphs.
The meaning of IAudioFilter interface
methods will be explained using the normalization filter example. The
normalization filter first scans the whole audio stream and looks for the
maximum sample absolute value. It then scales all samples in such a way
that they cover the whole 16-bit range.
It is obvious that normalization can be performed in two passes (the
first looks for the maximum, the second does the scaling); the
NormalizationAFilter::GetPasses therefore returns
2.
Now for the first pass: We have to report the number of
computation steps. Let's say we'll search the audio stream in 64 kB
blocks. The SetLength method tells us the total
length of the audio stream. Our GetCompStepCount
method therefore returns the number of 64 kB blocks that fit in the total
length (the last block may be incomplete); we will denote this number
N.
The core (more exactly, the audio filter proxy class
AudioFilter) now calls the
QueryRequiredData method with computation step
address equal to 0, 1, ..., N-1. We reply with an
AudioAddress structure identifying the
appropriate 64 kB audio block. The core supplies that block to the
Process method; we scan it for the maximum sample
value.
We are now done with the first pass and have the maximum value. The
core calls the audio filter's NextPass method to
switch it to the second pass. This is the last pass and since we don't
want our module to be terminal, the
GetCompStepCount method should return 0.
The QueryRequiredData method now gets
requests for audio blocks specified using an
AudioAddress. To produce a normalized block of
audio we need the same block from the input stream; our
QueryRequiredData therefore simply copies the
AudioAddress structure. The core then supplies
that block in the form of an audio buffer to the
Process method. We multiply each sample by a
constant computed after the first pass and return the processed audio
buffer.
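The two passes might be implemented along these lines; the member
names and buffer accessors below are illustrative assumptions, with
only the pass structure taken from the description above:
// Illustrative sketch of the two-pass normalization logic.
Ptr<IAudioBuffer> NormalizationAFilter::Process(IAudioBuffer *input)
{
    long n = input->GetSampleCount();    // assumed accessor
    sample_t *data = input->GetWData();  // assumed accessor
    if (m_pass == 0)
    {
        // First pass: remember the maximum absolute sample value.
        for (long i = 0; i < n; i++)
            m_max = std::max(m_max, (long)std::abs((int)data[i]));
        return 0;  // an analysis pass produces no output
    }
    // Second pass: scale every sample to cover the full 16-bit range.
    double scale = 32767.0 / m_max;
    for (long i = 0; i < n; i++)
        data[i] = (sample_t)(data[i] * scale);
    return input;
}

void NormalizationAFilter::NextPass()
{
    m_pass++;  // switch from the analysis pass to the scaling pass
}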
One more note about the AudioFilter proxy class: The
input and output connectors are both named
audio0 and the filter is selected using the parameter
audiofilter.
An audio transition is the analogy of video transition for audio
streams. When blending two video sequences together, one usually wants to
mix their audio tracks, too. Audio transition classes implement the
IAudioTransition interface:

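By analogy with IVideoTransition, the interface looks roughly like
this (the exact signatures are assumptions):
// Sketch of the audio transition interface; types are assumptions.
class IAudioTransition
{
public:
    // Set the transition's length in samples.
    virtual void SetLength(long length) = 0;
    // Mix the two input buffers at the given position.
    virtual upf::Ptr<IAudioBuffer> Process(long pos,
                                           IAudioBuffer *inputA,
                                           IAudioBuffer *inputB) = 0;
};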
Its methods are almost the same as in
IVideoTransition. Instead of frames, the
SetLength method sets the transition's length in
samples. The Process method now has the following
prototype:
IAudioBuffer Process(in long pos, in IAudioBuffer inputA, in IAudioBuffer inputB);
It has to mix two audio buffers which correspond to the position
pos in the input streams; this number ranges from
0 to N-1, where N is the value
obtained from SetLength. A simple crossfade audio
transition would do the following operation on every sample:
*dst = (sample_t)((1-alpha)*(*src_a) + alpha*(*src_b));
where alpha is the sample number divided by
N-1 and src_a,
src_b and dst are pointers to the
corresponding samples.
The transition's Process and
SetLength methods are again called by the
AudioTransition proxy class, which makes audio
transitions behave like modules. The input connectors (set by the
AudioTransition::EnumConnectors method) are
always named audio0 and
audio1 and the output connector is always
audio0. The right audio transition plugin is selected
using the parameter transition.