Joey Hagedorn, Joshua Hailpern, and Karrie G. Karahalios. VCode and VData: Illustrating a New Framework for Supporting the Video Annotation Workflow. Extended Abstracts of AVI 2008. pdf
Joshua Hailpern. VCode and VData: Illustrating a new Framework for Supporting the Video Annotation Workflow. Google engEDU: Tech Talks June 21, 2008. Mountain View, CA. Video
Quite often, there are multiple streams of video that coders must annotate. These can be different camera angles, or even screen-capture video from a computer. VCode presents one main video at full size, and a dock with the other streams playing in real time. When a docked stream is clicked, it repositions itself into the main video window, while the video that previously had focus scales down to the dock, thus equating visual importance with relative size and visual weight.
When annotating a video, we often have to capture different types of data. VCode supports multiple types of event annotation: ranged, momentary, and notes/phonetic transcription. A ranged event extends over a period of time (marking an action's start and duration). Momentary marks have no duration, and thus represent one specific moment in time. Comments can be attached to any mark, allowing additional observations, levels/rankings, or phonetic transcription (through an onscreen phonetic keyboard). Any mark with a comment is drawn with an inverted outline to signify that a comment is attached.
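The two mark types and their optional comments can be pictured as one small record type. The sketch below is a hypothetical data model, not VCode's actual file format; all names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Mark:
    """One annotation on a track (names are illustrative, not VCode's)."""
    track: str                      # dependent variable, e.g. "smile"
    onset_ms: int                   # position of the mark in the video
    duration_ms: int = 0            # 0 => momentary; > 0 => ranged
    comment: Optional[str] = None   # observation, rank, or transcription

    @property
    def is_ranged(self) -> bool:
        # A ranged event is simply a mark with a nonzero duration.
        return self.duration_ms > 0

smile = Mark("smile", onset_ms=12_500, duration_ms=1_800, comment="broad grin")
blink = Mark("blink", onset_ms=13_040)
```

Under this model, a momentary mark is just the degenerate case of a ranged one, which matches the description above: both carry a time, and either may carry a comment.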
The heart of VCode is the timeline. Events are graphically represented here by diamonds. The spatial-linear presentation allows users not only to see the chronological location of their events, but also the relative position of one mark to another. Each "track" or "dependent variable" is a different color, allowing for quick and easy assessment of the annotations made to date. To avoid scrolling and maximize screen real estate, momentary-event tracks overlap, optimizing screen usage while still providing enough area to isolate each track.
In addition to the annotator's events, secondary data can be displayed on the timeline, for example a waveform of the audio from the video. If data (from a computer, or even another coder's annotations) is logged in a separate log file, it can be displayed graphically as a bar, line, or scatter plot. As a result, video annotators can draw on the best information available when deciding when to code.
Continuous playback is not always the preferred method of analyzing a video. Often multiple modes of playback are utilized: continuous or standard playback, continuous interval playback (play for N seconds, then stop), and skip interval playback (jump N seconds, then stop). This allows the video to be divided into smaller segments for annotation of events that are more difficult to pinpoint (i.e. when a smile starts or ends). Though conceptually simple, manipulating video with standard VCR-style controls is often described as annoying, and because of the hand-eye coordination required, repeatability and reliability may suffer.
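The three playback modes reduce to a simple schedule of (start, stop) segments over the video. The generator below is a minimal sketch of that logic under the descriptions above; the mode names and signature are assumptions, not VCode's API.

```python
def playback_segments(total_ms, mode, interval_ms=0):
    """Yield (start, stop) playback segments for one pass over a video.

    mode: "continuous"          -> one segment covering the whole video
          "continuous_interval" -> play interval_ms, stop, then resume
          "skip_interval"       -> stop, then jump ahead interval_ms
    (Hypothetical reimplementation of the modes described above.)
    """
    if mode == "continuous":
        yield (0, total_ms)
        return
    pos = 0
    while pos < total_ms:
        if mode == "continuous_interval":
            stop = min(pos + interval_ms, total_ms)
            yield (pos, stop)       # play this chunk, then pause
            pos = stop
        elif mode == "skip_interval":
            yield (pos, pos)        # pause on this frame...
            pos += interval_ms      # ...then jump N seconds ahead
        else:
            raise ValueError(f"unknown mode: {mode}")
```

For a 10-second clip with a 4-second interval, continuous interval playback yields the chunks (0, 4000), (4000, 8000), (8000, 10000), which is the segmentation the paragraph describes.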
To ensure consistent configuration between coders and sessions, all administrative features are consolidated in a single window. The expected workflow is that a researcher sets up a single coding document with all the variables to be used on all the videos. This template is then duplicated, with media and log files inserted for each trial. Under this model, large video files need only exist in one location on disk, rather than being embedded in the VCode file. Through the Admin Window, the name, color, and hotkey of each track can be set in a list presentation. Tracks can be enabled as ranged events through a checkbox in the same interface. The Administration Window is also where a researcher specifies the videos and data files to be coded, as well as secondary data for contextual annotation. These elements are specified and synchronized through a drag-and-drop interface, all of which is hidden from the coder to prevent configuration corruption.
Critical aspects of the video coding workflow (training, reliability, and accuracy) revolve around demonstrating agreement between coders. VData (Figure 3) is a separate application specifically targeted at aiding researchers in training and in agreement analysis of coded data produced in VCode. VData calculates inter-coder agreement simply by dragging and dropping VCode files onto it. Users can set a tolerance variable to accommodate variability in mark placement between coders.
It is not uncommon for multiple tracks or variables to measure slight variations on a theme (e.g. smiling vs. large smile vs. grin), so VData implements a track-merging feature that allows marks on two distinct tracks to be treated indistinguishably. For a holistic view, researchers can select which tracks are included in a total agreement calculation. In other words, if analysis determines that a track is not reliable, or that a given track will not be used in the future, it can easily be excluded from the total agreement calculation.
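Merging tracks amounts to replacing several source tracks with the union of their marks under a single label. A minimal sketch, under an assumed dict-of-lists representation (this is illustrative only, not VData's API):

```python
def merge_tracks(marks_by_track, sources, merged_name):
    """Merge marks from several tracks into one, treated indistinguishably.

    marks_by_track: dict mapping track name -> list of mark times (ms).
    Returns a new dict with the source tracks replaced by merged_name.
    (Hypothetical helper; names are not VData's.)
    """
    merged = sorted(t for name in sources for t in marks_by_track.get(name, []))
    out = {k: v for k, v in marks_by_track.items() if k not in sources}
    out[merged_name] = merged
    return out
```

After merging, agreement is computed over the combined track, so a mark coded as "grin" by one coder and "smiling" by the other counts as agreement rather than two disagreements.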
As of version 1.2, VData supports Cohen's kappa calculations for further agreement analysis. This is intended only for tracks annotated with a form of interval playback mode.
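Interval playback fits Cohen's kappa naturally, because both coders assign one code to each of the same fixed intervals. The function below computes the standard statistic, kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is chance agreement; it is a generic sketch, not VData's implementation.

```python
def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders' labels over the same fixed intervals.

    codes_a[i] and codes_b[i] are the codes each coder assigned to
    interval i (e.g. one code per interval-playback segment).
    """
    assert len(codes_a) == len(codes_b) and codes_a
    n = len(codes_a)
    # Observed agreement: fraction of intervals coded identically.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of each coder's marginal label rates.
    labels = set(codes_a) | set(codes_b)
    p_e = sum((codes_a.count(l) / n) * (codes_b.count(l) / n) for l in labels)
    if p_e == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_o - p_e) / (1 - p_e)
```

Unlike raw percent agreement, kappa discounts agreement expected by chance: two coders who code "smile"/"no smile" on the sequences s,n,s,n and s,n,n,n agree on 75% of intervals but score kappa = 0.5.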
We have optimized coder training and reliability analysis by providing a graphical mechanism to directly compare the annotations of two coders. VData can create a VCode session containing specific tracks from two individual coders for side-by-side comparison. This visual, side-by-side representation of the data makes it easy to recognize systematic errors in context and to detect differences between two coders' markings, reducing the time needed to locate discrepancies and discuss why they might have occurred. Records of the agreement analyses performed with VData can be kept via text export; exporting at each stage of the process provides additional transparency and maintains the traceability of results produced by the system.
1.8 GHz or faster processor (Intel or PPC)
Video in .mov format
1 GB of RAM
We would like to thank the NSF (grant 0643502) for supporting this project.