August'24: Kamaelia is in maintenance mode and will receive periodic updates, about twice a year, primarily targeted around Python 3 and ecosystem compatibility. PRs are always welcome. Latest Release: 1.14.32 (2024/3/24)
Status: Blocked - Performance bottlenecks - code
can't run fast enough
Current Developers: Matt
Current "inflight" dev location:
/Sketches/MH/RTP/
Start Date: ??
Major Milestone date: n/a
Expected End Date: 22nd December 2006
End Date: tbd
Date this page last updated: 27th November
2006
Estimated effort so far: 9 days
A tool to mix MPEG transport streams received over multicast in RTP
format, and rebroadcast the result as a new multicast RTP stream.
Internal work on developing live multicast streaming services needs a
way to take data from one stream and mix it into another. The streams
are multicast RTP packets containing MPEG Transport Stream data. The
tool, when deployed, should be able to run 24/7, combining a subset of
data from 2 or more streams to generate a new one. This would be used,
for example, to mix existing EPG data into an existing stream containing
audio and video.
Benefits:
Task Sponsor: BB (BBC internal)
Task Owner: Matt (MH)
Developers:
User
Interested Third Parties
Requirements (non exhaustive):
Receive multicast RTP containing MPEG Transport Stream containing H264 @ ~1Mbit/s (MUST)
Simultaneously receive a 2nd multicast RTP containing MPEG Transport Stream containing EIT data and MPEG2 video @ ~5Mbit/s (MUST)
Combine (demultiplex and remultiplex) EIT data from 2nd stream with
video from the 1st to form a new stream (MUST)
Transmit the new stream as multicast RTP (MUST)
Adjust stream timestamps (MPEG Transport Stream level, and possibly MPEG Program Elementary Stream level) if needed (WOULD LIKE)
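The requirements above can be illustrated with a minimal sketch. It is not the code in /Sketches/MH/RTP/, and every function name here is hypothetical. It assumes a fixed 12-byte RTP header as in RFC 3550, the static payload type 33 for MPEG-2 transport streams, and that EIT sections arrive on the standard DVB PID 0x0012. The PIDs carrying the H264 video are passed in by the caller because the source does not state them.

```python
import struct

TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47
EIT_PID = 0x0012      # standard DVB PID for EIT sections
RTP_PT_MP2T = 33      # static RTP payload type for MPEG-2 TS


def parse_rtp(datagram):
    """Split one RTP packet into (payload_type, seq, timestamp, ssrc, payload)."""
    if len(datagram) < 12:
        raise ValueError("datagram too short for an RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", datagram[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (b0 & 0x0F)          # skip any CSRC entries
    if b0 & 0x10:                          # skip a header extension if present
        if len(datagram) < offset + 4:
            raise ValueError("truncated RTP header extension")
        (ext_words,) = struct.unpack("!H", datagram[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(datagram)
    if b0 & 0x20:                          # last byte gives the padding length
        end -= datagram[-1]
    return b1 & 0x7F, seq, timestamp, ssrc, datagram[offset:end]


def build_rtp(payload, seq, timestamp, ssrc, payload_type=RTP_PT_MP2T):
    """Wrap a payload in a minimal RTP header (no CSRC, extension or padding)."""
    header = struct.pack("!BBHII", 0x80, payload_type & 0x7F,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def ts_packets(payload):
    """Yield each whole 188-byte TS packet that starts with the sync byte."""
    for i in range(0, len(payload) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = payload[i:i + TS_PACKET_SIZE]
        if pkt[0] == TS_SYNC_BYTE:
            yield pkt


def ts_pid(pkt):
    """Return the 13-bit PID of a TS packet."""
    return ((pkt[1] & 0x1F) << 8) | pkt[2]


def remux(video_payload, epg_payload, video_pids):
    """Keep the chosen PIDs from the 1st stream and the EIT PID from the 2nd."""
    out = [p for p in ts_packets(video_payload) if ts_pid(p) in video_pids]
    out += [p for p in ts_packets(epg_payload) if ts_pid(p) == EIT_PID]
    return b"".join(out)
```

A real remultiplexer would also need to rewrite continuity counters and program tables, and to pace its output; the sketch only shows the packet selection step.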
Relevant Influencing factors:
Components to parse and create RTP packets
Command line tool, as described
Webpages describing:
Code
RTP handling
Internet components (upgrades/modifications)
SDP handling
DVB/MPEG Transport stream processing
case-insensitivity problem fixed for /trunk/
...removing filename clash problems for case insensitive filesystems like that on win32/osx
New/modified components for mainline codebase (RTP, DVB)
Improved throughput of multicast component and Selector in general
Develop code
Need to determine, experimentally, if timestamp resynchronisation
algorithms will be needed.
If resynchronisation algorithms are needed.
Technically remultiplexing severely jitters the timestamps on the
transport stream packets.
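One way to check experimentally how badly the timestamps are jittered is to compare the Program Clock Reference (PCR) carried in the transport stream with packet arrival times. The sketch below is a measurement aid only, not a resynchronisation algorithm, and its function names are hypothetical. It assumes the standard MPEG-2 TS adaptation field layout: a 33-bit PCR base at 90 kHz, 6 reserved bits, and a 9-bit extension, giving a 27 MHz clock.

```python
PCR_HZ = 27_000_000
PCR_MODULO = (1 << 33) * 300   # the PCR wraps when its 33-bit base overflows


def read_pcr(pkt):
    """Return the PCR in 27 MHz ticks, or None if this packet carries none."""
    if len(pkt) != 188 or pkt[0] != 0x47:
        raise ValueError("not a 188-byte TS packet")
    if not pkt[3] & 0x20:        # no adaptation field
        return None
    if pkt[4] < 7:               # field too short for flags + 6 PCR bytes
        return None
    if not pkt[5] & 0x10:        # PCR_flag not set
        return None
    raw = int.from_bytes(pkt[6:12], "big")   # 33-bit base, 6 reserved, 9-bit ext
    base = raw >> 15
    ext = raw & 0x1FF
    return base * 300 + ext


def pcr_drift(samples):
    """Given (arrival_seconds, pcr_ticks) pairs, return for each sample how far
    its arrival time has drifted from the PCR clock, relative to the first."""
    t0, p0 = samples[0]
    return [(t - t0) - ((p - p0) % PCR_MODULO) / PCR_HZ for t, p in samples]
```

For example, `pcr_drift([(0.0, 0), (1.0, 27_000_000), (2.5, 54_000_000)])` reports that the third packet arrived half a second later than its PCR implies. The modulo handles a single wraparound between the first sample and any later one, which is enough for short test runs.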
CPU load is higher than anticipated - handling a single 1-2Mbit/s
stream takes 50%+ CPU usage on the Mac Mini currently being used for
testing. A faster "Core Duo" Mac Mini has been tried, but the system
struggles to keep up with the 4Mbit/s MPEG2 stream (i.e. usage teeters
close to 90%/100%).
Multicast I/O improvements
Selector component has been improved (local copy in the working dir) to
increase responsiveness. Specifically, instead of requests to select on
file handles queueing up at its inbox until the current select() call
timeout fires, a separate filehandle is used to wake it immediately if
there are pending requests.
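The wake-up mechanism described above is commonly known as the self-pipe trick. The sketch below is not the modified Selector component itself: the class and method names are hypothetical, and it assumes a POSIX system where select() accepts pipe file descriptors.

```python
import os
import select
import threading


class WakeableSelector:
    """Minimal select() loop that another thread can interrupt at once."""

    def __init__(self):
        # Writing a byte to _wake_w makes _wake_r readable, which ends select().
        self._wake_r, self._wake_w = os.pipe()
        os.set_blocking(self._wake_r, False)
        self._lock = threading.Lock()
        self._pending = []       # requests not yet picked up by poll()
        self._readers = set()    # file descriptors currently being watched

    def request_read(self, fd):
        """Ask for fd to be watched; safe to call from any thread."""
        with self._lock:
            self._pending.append(fd)
        os.write(self._wake_w, b"\0")    # wake select() immediately

    def poll(self, timeout):
        """Return (woken, ready_fds) after one select() call."""
        with self._lock:
            self._readers.update(self._pending)
            self._pending.clear()
        ready, _, _ = select.select([self._wake_r, *self._readers], [], [], timeout)
        woken = self._wake_r in ready
        if woken:
            try:
                os.read(self._wake_r, 4096)   # drain queued wake-up bytes
            except BlockingIOError:
                pass
        return woken, [fd for fd in ready if fd != self._wake_r]
```

When poll() reports that it was woken, the caller simply calls poll() again, at which point the newly requested descriptors are included in the select() set rather than waiting for the previous timeout to expire.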
The Multicast components have been optimised (local copy in the working
dir) to sleep when inactive, using the Selector component to wake
them.
Threaded component bottlenecks
I've also tried writing
Why? Interactions between a thread and the main thread are
bottlenecked:
Each component is taking between 10% and 20% CPU. Moving the Selector
or Multicast components themselves into separate threads doesn't reduce
the amount of CPU being spent in the main thread. The Mac Mini could
probably cope if it were possible to spread some of the workload across
the 2nd CPU without incurring a penalty in the main thread.
Proposal: Axon modifications
I believe there may be mileage in experimenting with modifying Axon such
that threads can perform all tasks themselves, using (hopefully fine
grained) locking to ensure thread safety. This would eliminate the need
for threaded components to have a microprocess running in main thread
handling all its requests. This would substantially reduce the overhead
incurred when making a component threaded. It would potentially also
have the benefit that two components running in threads independent of
the main thread would not be bottlenecked by needing the main thread
to handle message passing on their behalf.
This would probably qualify as a separate project task, lasting a few
weeks.
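As a rough illustration of the idea, and not of Axon's actual API, the sketch below shows two threads exchanging messages through a lock-protected inbox with no main-thread microprocess acting as intermediary. The names LockedInbox, deliver, collect and run_pipeline are all hypothetical.

```python
import threading
from collections import deque


class LockedInbox:
    """A message queue that any thread may deliver to or collect from."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()   # one lock per inbox

    def deliver(self, msg):
        with self._cond:
            self._items.append(msg)
            self._cond.notify()              # wake one waiting collector

    def collect(self, timeout=None):
        """Block until a message arrives; return None if the timeout expires."""
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._items) > 0, timeout):
                return None
            return self._items.popleft()


def run_pipeline(values):
    """Pass values through a doubling stage that runs in its own thread."""
    inbox, outbox = LockedInbox(), LockedInbox()
    done = object()                          # sentinel marking end of stream

    def source():
        for v in values:
            inbox.deliver(v)
        inbox.deliver(done)

    def doubler():
        while True:
            msg = inbox.collect()
            if msg is done:
                outbox.deliver(done)
                return
            outbox.deliver(msg * 2)

    threads = [threading.Thread(target=source), threading.Thread(target=doubler)]
    for t in threads:
        t.start()
    results = []
    while True:
        msg = outbox.collect()
        if msg is done:
            break
        results.append(msg)
    for t in threads:
        t.join()
    return results
```

Each inbox has its own lock, so a delivery between two threaded components only contends with users of that one inbox. Whether this reduces CPU load in practice would need measuring, since Python threads still share the interpreter lock.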
Other routes to try first
Michael suggests that such a radical approach may well not be necessary.
Instead the following perhaps should be tried first: