Timeline object can contain many tracks, nested stacks,
clips, gaps, and transitions. This document is meant to clarify how these
objects nest within each other, and how they work together to represent an
Simple Cut List¶
Let’s start with a simple cut list of a few clips. This is stored as a
Timeline with a single
Track which contains several
Figure 1 - Simple Cut List
Timeline can hold multiple tracks, it always has a top-level
object to hold its
Track children. In this case, that
Stack has just one
Track, named “Track-001”.
Within “Track-001”, there are four
Clip objects, named “Clip-001”, “Clip-002”,
“Clip-003”, and “Clip-004”. Each
Clip has a corresponding media reference,
“Media-001”, “Media-002”, etc.
At the bottom level, we see that each media reference has a target_url and
an available_range. The target_url tells us where to find the media (e.g.
a file path or network URL, etc.) The available_range specifies the range
of media that is available in the file that it points to. An available_range
TimeRange object which specifies a start_time and duration. The
start_time and duration are each
RationalTime objects, which store a
value and rate. Thus we can use
RationalTime(7,24) to mean frame 7 at 24
frames per second. In the diagram we write this as just 7 for brevity.
In this case most of our media references have an available_range that starts at 0 for some number of frames. One of the media references starts at 100. Assuming the media is 24 frames per second, this means that the media file contains media that starts at 4 seconds and 4 frames (timecode 00:00:04:04).
In many cases you might not know the available_range because the media is
missing, or points to a file path or URL which might be expensive to query.
If that’s the case, then the available_range of a media_reference will be
Above the media references, we see that each
Clip has a source_range, which
specifies a trimmed segment of media. In cases where we only want a portion
of the available media, the source_range will start at a higher start_time,
and/or have a shorter duration. The colored segments of “Media-001”, “Media-002”
and “Media-003” show the portion that each clip’s source_range references.
In the case of “Clip-004”, the source_range is
None, so it exposes its entire
media reference. In the OTIO API, you can query the trimmed_range() of a
clip to get the range of media used regardless of whether it has a
source_range, available_range or both - but it needs at least one of the
Also note that a clip’s source_range could refer to a segment outside the available_range of its media reference. That is fine, and comes up in practice often (e.g. I only rendered the first half of my shot). OTIO itself does no snapping or verification of this, but downstream applications may handle this in a variety of ways.
Track in this example contains all four clips in order. You
can ask the
Stack for its trimmed_range() or duration() and it
will sum up the trimmed lengths of its children. In later examples, we
will see cases where a
Stack is trimmed by setting a source_range,
but in this example they are not trimmed.
Transition is a visual effect, like a cross dissolve or wipe, that blends
two adjacent items on the same track. The most common case is a fade or
cross-dissolve between two clips, but OTIO supports transitions between
Composable items (
Gaps, or nested
Figure 2 - Transitions
In Figure 2, there is a
Transition between “Clip-002” and “Clip-003”. The in_offset
and out_offset of the
Transition specify how much media from the adjacent
clips is used by the transition.
Notice that the
Transition itself does not make “Track-001” any shorter or
longer. If a playback tool is not able to render a transition, it may
simply ignore transitions and the overall length of the timeline will
not be affected.
In Figure 2, the
Transition’s in_offset of 2 frames means that frames 1 and 2 of “Media-003” are used in the cross dissolve.
The out_offset of 3 frames means that frames 8, 9, 10 of “Media-002” are used.
Notice that “Media-002”’s available_range is 2 frames too short to
satisfy the desired length of the cross-dissolve. OTIO does not
prevent you from doing this, as it may be important for some use cases. OTIO
also does not specify what a playback tool might display in this case.
Transition’s in_offset and out_offset are not allowed to extend beyond
the duration of the adjacent clips. If a clip has transitions at both
ends, the two transitions are not allowed to overlap. Also, you cannot
place two transitions next to each other in a track; there must be a
composable item between them.
A fade to or from black will often be represented as a transition
to or from a
Gap, which can be 0 duration. If multiple tracks are present
note that a
Gap is meant to be transparent, so you may need to consider
Clip with a
GeneratorReference if you require solid black or any
other solid color.
A more typical timeline will include multiple video tracks. In Figure 3,
Stack now contains “Track-001”, “Track-002”, and “Track-003” which
Gap children. Figure 3 also shows a flattened copy of the
timeline to illustrate how multitrack composition works.
Figure 3 - Multiple Tracks
Gap in “Track-001” is 4 frames long, and the track below, “Track-002”, has
frames 102-105 of “Clip-003” aligned with the
Gap above, so those frames
show through in the resulting flattened
Note that the
Gap at the front of “Track-002” is used just to offset “Clip-003”.
This is a common way to shift clips around on a track, but you may also
Track’s source_range to do this, as illustrated in “Track-003”.
“Clip-005” is completely obscured by “Clip-003” above it, so “Clip-005” does not appear in the flattened timeline at all.
You might also notice that “Track-001” is longer than the other two. If you
wanted “Track-002” to be the same length, then you would need to append a
at the end. If you wanted “Track-003” to be the same length, then you could
extend the duration of its source_range to the desired length. In both
cases, the trimmed_range() will be the same.
The children of a
Track can be any
Composable object, which includes
Transitions. In Figure 4 we see an
example of a
Stack nested within a
Figure 4 - Nested Compositions
In this example, the top-level
Stack contains only one
contains four children, “Clip-001”, “Nested Stack”, “Gap”, and “Clip-004”. By
Stack) we can refer to a
Composition as though it was just another
Clip in the outer
If a source_range is specified, then only a trimmed segment of the inner
Composition is included. In this case that is frames 2 through 7 of
“Nested Stack”. If no source_range is specified, then the full
available_range of the nested composition is computed and included in
the outer composition.
“Nested Stack” contains two tracks, with some clips, gaps, and a track-level source_range on the lower track. This illustrates how the content of “Nested Stack” is composed upwards into “Track-001” so that a trimmed portion of “Clip-005” and “Clip-003” appear in the flattened composition.
Notice how the
Gap in “Track-001” cannot see anything inside the nested
composition (”Clip-003”, etc.) because those are not peers to “Track-001”,
they are nested within “Nested Stack” and do not spill over into adjacent
Gaps. In other words, “Nested Stack” behaves just like a
Clip that happens
to have complex contents rather than a simple media reference.