The central goal of the work plan structure is to maximize the interaction among work packages and make all the efforts converge, first, towards the large-scale pilots and, second, towards innovation transfer and exploitation. We have therefore chosen to organize the overall project along two axes, content creation and broadcast infrastructure, and to use the pilots as the meeting point for all the different efforts.
The work packages involved in this project are the following:
- WP1 Management is responsible for the overall coordination of the project - I2cat (Spain),
- WP2 Requirements, format and creation of immersive experiences is responsible for the content format ideation, translating this into concrete production needs, and content creation - VRT (Belgium),
- WP3 Immersive Broadcast Platform will focus on implementing the software tools needed to create and experience this novel format of content, as well as ensuring that distribution and delivery provide the best possible user experience - PSNC (Poland),
- WP4 Demonstration Pilots will be the testbed of the three previous work packages, most notably in two large-scale pilots, followed by an evaluation of the end-user experience - I2cat (Spain),
- WP5 Innovation, Dissemination and Exploitation will be in constant interaction with industrial stakeholders, gathering feedback and disseminating the work done in the consortium to the larger community, particularly standardisation committees and content creators - iMinds (Belgium).
PSNC is the coordinator of Work Package 3 - Immersive Broadcast Platform. The result of this work package is a full omnidirectional image chain, from capture through processing, encoding and content distribution up to display on users' devices (a minimal illustrative sketch of this chain follows the list of objectives below). The main objectives of this work package are:
- to design a reliable and robust system architecture of the hardware and software platform and facilitate a smooth integration of all the project's technical components,
- to design, set up and deploy an omnidirectional camera system capable of capturing live high-resolution, high-frame-rate video,
- to design and implement a real-time process to effectively encode multiple camera images into full omnidirectional video,
- to design and implement the required functionalities to adapt the existing production tools to omnidirectional inputs and cross-device visualization and interaction,
- to design and implement the communication servers required to distribute omnidirectional content (incl. live streams) to remote users efficiently through existing and next-generation access networks,
- to design and implement the clients and libraries required to display omnidirectional video-based productions across devices (TV, second screen and HMD), maintaining coherence, synchronization and responsiveness in LAN environments,
- to integrate and test the different components in an end-to-end pilot and validate it in lab conditions.
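The sketch below illustrates how these objectives line up as one chain from capture to display. It is only an illustration of the data flow named in the objectives; the stage names and the Pipeline structure are hypothetical and do not correspond to actual ImmersiaTV components.

```python
# Illustrative sketch of the end-to-end omnidirectional chain targeted by WP3.
# Stage names are placeholders, not the project's actual component interfaces.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stage:
    name: str
    description: str


@dataclass
class Pipeline:
    stages: List[Stage] = field(default_factory=list)

    def describe(self) -> str:
        return " -> ".join(s.name for s in self.stages)


immersiatv_chain = Pipeline(stages=[
    Stage("capture", "multi-camera omnidirectional rig, live high-res/high-frame-rate video"),
    Stage("processing", "stitching of per-camera images into a full omnidirectional frame"),
    Stage("encoding", "real-time / ROI-aware compression (e.g. HEVC/H.264)"),
    Stage("distribution", "delivery over existing and next-generation access networks"),
    Stage("display", "synchronized playback on TV, second screen and HMD"),
])

print(immersiatv_chain.describe())
```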
The overall architecture of the ImmersiaTV system is depicted in Pic. 1.

Pic. 1 – ImmersiaTV system architecture
Work in WP3 is divided into eight tasks; five of them, the key tasks, are described below:
T3.2 Capture – This task will focus on the development of a distributed video capture and processing architecture designed from the ground up for omnidirectional video in a TV broadcasting context. It addresses the issues of current systems: high equipment cost, low perceived image resolution and frame rate, insufficient video processing performance and/or quality, and lack of versatility in deployment. In addition to camera heads, the architecture will consist of edge capture, replay and per-camera processing units, and a central video processing unit.
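As a rough illustration of the topology described above (camera heads attached to edge units feeding a central video processing unit), the following sketch models the capture architecture with hypothetical class names and an invented example configuration; it is not the actual T3.2 design.

```python
# Hypothetical model of the distributed capture topology sketched in T3.2:
# camera heads -> edge capture/replay/per-camera processing units -> central unit.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CameraHead:
    head_id: str
    resolution: Tuple[int, int]  # (width, height) per camera head
    fps: int


@dataclass
class EdgeUnit:
    """Edge capture/replay unit handling per-camera processing near the rig."""
    unit_id: str
    heads: List[CameraHead] = field(default_factory=list)

    def frames_per_second(self) -> int:
        # Aggregate raw frame rate this edge unit must handle.
        return sum(h.fps for h in self.heads)


@dataclass
class CentralProcessingUnit:
    """Central unit stitching the streams delivered by all edge units."""
    edge_units: List[EdgeUnit] = field(default_factory=list)

    def total_input_frames_per_second(self) -> int:
        return sum(u.frames_per_second() for u in self.edge_units)


# Invented example: two edge units, each with four 2K@50fps camera heads.
rig = CentralProcessingUnit(edge_units=[
    EdgeUnit("edge-1", [CameraHead(f"cam-{i}", (2048, 1080), 50) for i in range(4)]),
    EdgeUnit("edge-2", [CameraHead(f"cam-{i + 4}", (2048, 1080), 50) for i in range(4)]),
])
print(rig.total_input_frames_per_second())  # 400 raw frames/s to stitch centrally
```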
T3.3 Production Tools – This task addresses the field of immersive content production tools within the ImmersiaTV framework. Both live-event and off-line documentary production scenarios are envisioned. The production tools enable creative content professionals to deliver media experiences in an unprecedented, disruptive way by carefully combining immersive content and traditional storytelling techniques. Manual or semi-automatic production of the content is facilitated by providing means for preparing omnidirectional shootings, and automatic pre-selection and intuitive presentation of captured content for live and non-live scenarios. It also includes the development of an advanced immersive story editor (documentary) and/or director's tool (live event) that enables mixing immersive and non-immersive story elements into an appealing end-user content experience.
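One way to picture how such an editor could mix immersive and non-immersive story elements is a shot timeline that targets different devices over time. The sketch below uses hypothetical names and fields; it is only an illustration of the idea, not the editor developed in this task.

```python
# Hypothetical timeline model mixing omnidirectional (immersive) and framed
# (traditional) shots, as a story editor / director's tool might represent them.
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    source: str                 # captured clip or live feed identifier
    start: float                # seconds on the story timeline
    duration: float
    omnidirectional: bool       # True for 360-degree content, False for a framed shot
    target_devices: List[str]   # e.g. ["tv"], ["hmd", "tablet"]


def devices_active_at(timeline: List[Shot], t: float) -> List[str]:
    """Return the devices that receive content at story time t."""
    devices: List[str] = []
    for shot in timeline:
        if shot.start <= t < shot.start + shot.duration:
            devices.extend(shot.target_devices)
    return sorted(set(devices))


story = [
    Shot("intro_framed", 0.0, 20.0, False, ["tv", "tablet"]),
    Shot("stadium_360", 10.0, 60.0, True, ["hmd"]),
]
print(devices_active_at(story, 15.0))  # ['hmd', 'tablet', 'tv']
```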
T3.4 Encoding & Decoding – The encoding and decoding of the content will be performed in three different iterations, from off-line coding, to real-time coding and decoding, to region-of-interest (ROI) and low-latency coding and decoding.
Iteration 1: Off-line coding will make use of off-the-shelf video compression encoders (e.g. HEVC/H.264). The captured content will go through various processing steps in order to create one or more videos (depending on the number of cameras used for capture), in mono or multiview (depending on the type of camera(s)), where pre-processing has been performed in order to represent them as conventional frame-based video.
Iteration 2: In this approach, the codec developed in the previous iteration will be redesigned to allow for real-time encoding in addition to real-time decoding, taking into account the network conditions (bit rate).
Iteration 3: This approach will extend the previous codec to take into account both the content to be coded and the viewpoint of the user or the device. A region-of-interest (ROI) estimation will analyse the viewpoint of the user, the device and the content, and will produce a priority map determining which parts of the content should be coded and streamed, and when.
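To make the ROI idea concrete, the sketch below derives a priority map over tiles of an equirectangular frame from the user's viewing direction. The tiling, the distance-based weighting and all names are simplifying assumptions for illustration only, not the codec developed in this task.

```python
# Hypothetical ROI priority map: tiles of an equirectangular frame are weighted
# by angular distance between the tile centre and the user's viewing direction,
# so tiles inside the viewport are coded and streamed first and at higher quality.
import math


def tile_priorities(yaw_deg: float, pitch_deg: float,
                    cols: int = 8, rows: int = 4) -> list:
    """Return (row, col, priority) for each tile; priority in [0, 1]."""
    priorities = []
    for r in range(rows):
        for c in range(cols):
            # Tile centre in degrees on the equirectangular grid.
            tile_yaw = (c + 0.5) / cols * 360.0 - 180.0
            tile_pitch = 90.0 - (r + 0.5) / rows * 180.0
            # Angular distance to the viewport centre (wrap yaw to [-180, 180]).
            d_yaw = (tile_yaw - yaw_deg + 180.0) % 360.0 - 180.0
            d_pitch = tile_pitch - pitch_deg
            dist = math.hypot(d_yaw, d_pitch)
            priorities.append((r, c, max(0.0, 1.0 - dist / 180.0)))
    return priorities


# Example: user looking slightly right of centre and a bit upwards.
for row, col, p in sorted(tile_priorities(30.0, 10.0),
                          key=lambda t: t[2], reverse=True)[:3]:
    print(f"tile ({row},{col}) priority {p:.2f}")
```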
T3.5 Delivery and Reception – In this task, delivery and reception from the origin server to the end-user device screen will be addressed, discussed, designed and implemented. This task covers all network transmission from the produced stream to any user device. It will encompass the selection of the appropriate base technologies according to the current and future networks in all phases of the distribution, providing reliable, cost-effective solutions applicable in today's advanced and emerging networks. Two main areas of work are identified in this task: (i) transmission from the content providers' servers to the audience's homes, and (ii) transmission from the media centre to the end devices, mostly over WiFi (tablet, smartphone, HMD) or Ethernet (PC, smartTV, Set-Top-Box).
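Assuming an adaptive, DASH-like delivery on both segments, the reception side typically selects the highest quality level that fits the measured throughput of the segment in use (access network into the home, or home WiFi/Ethernet to a device). The sketch below shows this selection logic with an invented representation ladder; it is an illustration of the principle, not the solution designed in this task.

```python
# Hypothetical adaptive bitrate selection for the two delivery segments in T3.5:
# (i) provider servers -> audience home, (ii) home media centre -> end devices.
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Representation:
    name: str
    bitrate_kbps: int  # encoded bitrate of this quality level


def pick_representation(ladder: List[Representation],
                        measured_kbps: float,
                        safety_margin: float = 0.8) -> Optional[Representation]:
    """Highest representation whose bitrate fits within the measured throughput."""
    budget = measured_kbps * safety_margin
    candidates = [r for r in ladder if r.bitrate_kbps <= budget]
    return max(candidates, key=lambda r: r.bitrate_kbps) if candidates else None


ladder = [
    Representation("360p", 1_500),
    Representation("1080p", 6_000),
    Representation("4K-omni", 25_000),
]

# Segment (i): access network into the home; segment (ii): WiFi to an HMD.
print(pick_representation(ladder, measured_kbps=40_000).name)  # 4K-omni
print(pick_representation(ladder, measured_kbps=9_000).name)   # 1080p
```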
T3.6 Interaction and Display – This task will consist mostly of the delivery of the end-user application. Each iteration will focus its main efforts on a specific task. The first iteration will implement interactive display mechanisms adapted to immersive displays and second screens (head movements, tablet moved around, finger gestures). It will also deliver end-user receivers for each device that can synchronize with each other on the basis of a multimedia server orchestrating the different video streams. The second iteration will introduce a distributed synchronization mechanism, potentially removing the need for a central media server under good delivery conditions, where fast and stable network communication is available and little buffering is needed. The third iteration will refine the solutions developed in the first two iterations and implement additional functionality needed, such as better memory management allowing the exploratory mode, which requires interaction with a larger number of video streams.
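As a rough illustration of the server-orchestrated synchronization in the first iteration, each receiver can report its local playback position, compare it with the master timeline position broadcast by the media server, and jump or wait accordingly. The message fields, tolerance value and names below are assumptions for illustration, not the actual ImmersiaTV protocol.

```python
# Hypothetical inter-device synchronization: each receiver compares the master
# presentation position reported by an orchestrating media server with its own
# local playback position and adjusts (jump forward or wait) to stay in sync.
from dataclasses import dataclass


@dataclass
class SyncReport:
    device_id: str
    local_position_ms: int  # where this device currently is in the stream


def correction_ms(report: SyncReport, master_position_ms: int,
                  tolerance_ms: int = 40) -> int:
    """Positive = device must jump forward, negative = device must wait."""
    drift = master_position_ms - report.local_position_ms
    return drift if abs(drift) > tolerance_ms else 0


master = 120_000  # master timeline position broadcast by the media server
for report in (SyncReport("tv", 119_990),
               SyncReport("hmd", 120_180),
               SyncReport("tablet", 119_700)):
    print(report.device_id, correction_ms(report, master))
# tv 0 (within tolerance), hmd -180 (wait), tablet 300 (jump forward)
```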