iXML in NDI Streams
NDI Tag = '< iXML >'
iXML is useful in many areas of media recording, playback and transfer. NDI Streams are another area that can benefit from iXML.
The objective here is to provide two types of information. The first is a way to properly label and define audio tracks in the same way iXML has been used for years in production sound: audio track labels, audio track function definitions, and the association of audio tracks as a family (which is implied within a single stream, but this topic goes beyond that).
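As a rough illustration of the audio track labelling described above, the following Python sketch builds a track list fragment. It assumes the TRACK_LIST, TRACK_COUNT, TRACK, CHANNEL_INDEX, INTERLEAVE_INDEX, NAME and FUNCTION element names used in iXML audio files, which should be checked against the published iXML specification. The helper build_track_list, the track names and the FUNCTION strings are placeholders invented for this example, and nothing here shows how the fragment would be wrapped or carried inside an NDI Stream.

```python
import xml.etree.ElementTree as ET


def build_track_list(tracks):
    """Build a TRACK_LIST element from (name, function) pairs.

    Hypothetical helper: element names follow iXML audio-file usage as
    assumed in the lead-in, not a confirmed NDI payload layout.
    """
    track_list = ET.Element("TRACK_LIST")
    ET.SubElement(track_list, "TRACK_COUNT").text = str(len(tracks))
    for index, (name, function) in enumerate(tracks, start=1):
        track = ET.SubElement(track_list, "TRACK")
        # Assumption: channels are numbered from 1 in stream order.
        ET.SubElement(track, "CHANNEL_INDEX").text = str(index)
        ET.SubElement(track, "INTERLEAVE_INDEX").text = str(index)
        ET.SubElement(track, "NAME").text = name
        ET.SubElement(track, "FUNCTION").text = function
    return track_list


# Placeholder labels and function strings, for illustration only.
track_list = build_track_list(
    [("Boom", "BOOM"), ("Mix L", "LEFT"), ("Mix R", "RIGHT")]
)
xml_text = ET.tostring(track_list, encoding="unicode")
print(xml_text)
```

A receiver could parse the same fragment with ET.fromstring and read NAME and FUNCTION per TRACK to label its audio channels.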
Another area where iXML can be used in NDI is to provide inter-stream associations, for example to link video streams as multi-cam stacks, and to provide upstream compositing instructions for audio and video, enabling object-based video and audio workflows where a master (primary or parent) NDI Stream can reference external streams for additional contribution to the final user presentation.
A simple example here would be graphics, such as lower 3rds, where the master stream can be sent clean and the graphics are sent with forward-compositing instructions so the end system can construct the appropriate overlay as defined by the sending system. The critical difference here is that the receiving system (which may be driven by an end-user consumer) can make decisions about the object-based reconstruction, such as whether to view the graphics, or perhaps to change camera angles.
Precedents exist for this type of behaviour in the form of closed captions, which are sent alongside the video with instructions on how they are presented, but it is ultimately within the end consumer's domain whether to enable them or not.
In the future, object-based workflows will become more common, and iXML can be used to provide the metadata definition of how to link pieces together.
It is anticipated that both NDI frame/packet-based metadata and NDI free real-time metadata will contain iXML metadata.
One promising application is the use of visual offset and scaling metadata, which will allow smaller subsections of overlays to run in parallel NDI streams and then be superimposed on the full raster at the destination using rules. This would, for example, allow a lower 3rd graphic sent in a 1920 x 100 stream to be superimposed on top of a 1920 x 1080 stream at x offset = 0.00, y offset = 0.90 and scale = 1.0 (the scale represents the portion of the full raster covered by the horizontal extent of the overlay). It would also allow high resolution graphics to be sent and overlaid on top of much lower resolution video, giving the consumer a much higher perceived quality and easier-to-read text.
This type of object-based workflow requires two iXML constructs. The first, in the main stream, creates a link association between the main stream and the graphics stream. This is done using the FILE_SET iXML structure. The second, in the graphics stream, delivers upstream compositing instructions which define the x and y offset and the relative coverage (scaling) alongside the graphics video. This should be inserted as per-frame metadata so it can be applied dynamically and frame-accurately. This metadata will be a visual equivalent to the DEFAULT_MIX metadata which can be used in audio iXML files to push upstream mixing information.
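The arithmetic behind the lower 3rd example can be sketched in Python as follows. This is only an illustration of how a receiver might turn the offset and scale values into a pixel rectangle. The names Placement and overlay_rect are hypothetical, and the sketch assumes that offsets are fractions of the full raster measured from the top-left corner, that scale is the fraction of the full raster width covered by the overlay, and that the overlay keeps its aspect ratio. None of these conventions is defined by the text above beyond the single worked example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical per-frame compositing values (names are illustrative)."""
    x_offset: float  # left edge, as a fraction of the full raster width
    y_offset: float  # top edge, as a fraction of the full raster height
    scale: float     # fraction of the full raster width the overlay covers


def overlay_rect(raster_w, raster_h, overlay_w, overlay_h, p):
    """Return (left, top, width, height) in full-raster pixels."""
    width = round(p.scale * raster_w)
    # Assumption: the overlay keeps its own aspect ratio when scaled.
    height = round(overlay_h * width / overlay_w)
    left = round(p.x_offset * raster_w)
    top = round(p.y_offset * raster_h)
    if left + width > raster_w or top + height > raster_h:
        raise ValueError("overlay extends beyond the full raster")
    return left, top, width, height


# The example from the text: a 1920 x 100 graphic on a 1920 x 1080 raster.
rect = overlay_rect(1920, 1080, 1920, 100, Placement(0.00, 0.90, 1.0))
print(rect)  # (0, 972, 1920, 100)
```

Because the placement is expressed in raster fractions instead of pixels, the same values apply when the graphic is sent at a higher resolution than the video. A 3840 x 200 overlay with the same Placement maps to the same 1920 x 100 region of a 1920 x 1080 raster.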