Metadata hits a home run at Createasphere 2010
When correctly implemented, metadata can be very useful. For example, when you load a CD into iTunes, the album art, song titles, genre and other information magically show up thanks to metadata and Gracenote. Metadata is also one of the reasons that creating playlists is much less painstaking than making old-school mix tapes. On a digital platform, you can find media, compare it, reorder it and share it, all before the kids need to go to school.
It may be a foregone conclusion that metadata is good on a personal level, but how about on a business level? I took a deep dive into metadata at the recent Createasphere conference in New York. A presenter from Major League Baseball (MLB) discussed media asset management and how the sports organization is taking metadata to the next level.
The metadata problem
In previous years, MLB would have guys at ballparks tagging meaningful plays by hand, which resulted in all sorts of human error and wide variation in content. For example, an MLB tagger might call Pedro Martinez “Peety” or spell Martinez with an s. Hand tags were also hard to read and required double entry.
MLB needed a better way to manage tags, especially when the organization wanted to launch MLB Network. A lot of data was already available online (e.g. rosters, players and batting orders could be populated immediately), so much of the work could be streamlined out of the gate. MLB’s solution, which took several years, was to create an intelligent logging system that would allow people to tag assets in real time. The assets could then feed out to MLB.com, baseball clubs, fantasy games and other business partners.
Creating the metadata standards and interface required many months of planning. The system needed to be fast, with primary and secondary level buttons covering 95% of all available options. Metadata became completely hierarchical, and converting free-text fields to buttons helped MLB reach the goals for its tagging system.
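To make the idea of hierarchical, button-driven tags concrete, here is a minimal sketch in Python. It is not MLB's system: the category names, the `TAG_TREE` structure and the `resolve_tag` function are all assumptions made up for illustration. It only shows how a fixed two-level vocabulary leaves no room for the free-text variation described earlier.

```python
# Hypothetical two-level tag vocabulary; the categories are illustrative,
# not MLB's actual taxonomy.
TAG_TREE = {
    "Pitch": ["Strikeout", "Walk", "Wild pitch"],
    "Hit": ["Single", "Double", "Triple", "Home run"],
    "Defense": ["Diving catch", "Double play", "Error"],
}


def resolve_tag(primary: str, secondary: str) -> str:
    """Return a hierarchical tag path, rejecting anything outside the vocabulary."""
    if primary not in TAG_TREE:
        raise ValueError(f"unknown primary tag: {primary!r}")
    if secondary not in TAG_TREE[primary]:
        raise ValueError(f"{secondary!r} is not a {primary!r} option")
    return f"{primary}/{secondary}"


print(resolve_tag("Hit", "Home run"))  # Hit/Home run
```

Because every tag has to come from the tree, a typo or a nickname fails loudly at entry time instead of ending up in the archive.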
How the metadata tagging system works
MLB has been tagging the video of every useful moment over the course of this season’s 2,500 games. You would think that capturing 12,000 terabytes of video would be difficult, but the speaker said that digitizing is the easy part. “Assets are only as good as your metadata. Aggregating as much disparate metadata in a central system was much harder,” the presenter said.
An MLB play-by-play logger is assigned to each game. The logger flags as much info as possible as it happens. At the end of the logging cycle, the logger plays back the flags and adds more detail. The work is then reviewed by logging supervisors, who assess the richness of the metadata.
The mockup below shows the logger’s touch screen interface. Loggers tap on a tag button on the left and select players from the at-bat team in the middle and the field team on the right. The button interface reduces between-keyboard-and-chair (BKAC) errors and dramatically accelerates data entry, which enables loggers to keep up in near real time.
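As a rough illustration of why button selection cuts down on errors, the Python sketch below builds a log entry from roster lookups instead of typed names. Everything here is assumed for the example: the player IDs, the `LogEntry` fields and the `log_play` function are hypothetical, and the real interface was only described, not specified, in the talk.

```python
from dataclasses import dataclass

# Hypothetical rosters keyed by made-up player IDs. Each button on the
# touch screen would map to one ID, so the name is never typed.
BATTING_ROSTER = {"bat-01": "Derek Jeter"}
FIELDING_ROSTER = {"fld-01": "Pedro Martinez"}


@dataclass(frozen=True)
class LogEntry:
    timecode: str  # position in the video, e.g. "01:23:45"
    tag: str
    batter: str
    fielder: str


def log_play(timecode: str, tag: str, batter_id: str, fielder_id: str) -> LogEntry:
    """Build an entry from button selections; names come from the roster."""
    return LogEntry(
        timecode, tag, BATTING_ROSTER[batter_id], FIELDING_ROSTER[fielder_id]
    )


entry = log_play("01:23:45", "Strikeout", "bat-01", "fld-01")
print(entry.fielder)  # Pedro Martinez
```

A logger working this way cannot record “Peety” by accident, because the only names available are the ones the roster supplies.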
Choosing the best metadata
Loggers work on two levels: a basic level dubbed “Just the Facts, Ma’am” and a more discrimination-intensive “Someone Might Want This” level. Just the Facts is where the basic stats, such as at-bats, strikeouts, walks and home runs, go. Tagging these actions requires a time code and some descriptive keyword buttons. Monkeys could do it, but they would need to be sophisticated monkeys.
Loggers need to be more than library scientists or baseball experts. They have to have a solid sense of the game and know what matters to end users and network programmers. What makes Someone Might Want This so powerful is that it handles subjective observations. Loggers earn their keep by capturing details such as the following:
- Diving catch or great play
- Company logo in the background
- Shot quality, noteworthiness
- Bloopers – funny or terrible plays
- Celebrities in stands
- Reactions category for players
- Replay notes
- Play made by umpire (e.g. close call)
- Crowd reactions
All this metadata is a huge benefit for story producers. For example, creating super-tease highlights (such as finding every example of a broken bat) used to take 200 man-hours per week. But thanks to having assets properly tagged, highlights now take less than three minutes. MLB also pointed to a story on Derek Jeter’s 600th home run as another example. It turns out that the milestone coincided with Jeter’s birthday, which gave producers an interesting story angle. None of this would be possible without searchable metadata. Sure, it requires brute force right now, but even that may change soon.
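For readers wondering how tagging turns a 200-hour hunt into a quick lookup, here is a small Python sketch of a tag index. The clip IDs, the tags and the function names are invented for illustration, and nothing here reflects how MLB's search is actually built. It simply shows the general technique of mapping each tag to the clips that carry it.

```python
from collections import defaultdict

# Hypothetical clip IDs and tags, made up for illustration.
CLIPS = {
    "clip-001": {"Broken bat", "Single"},
    "clip-002": {"Home run", "Crowd reaction"},
    "clip-003": {"Broken bat", "Blooper"},
}


def build_index(clips):
    """Map each tag to the set of clip IDs carrying it."""
    index = defaultdict(set)
    for clip_id, tags in clips.items():
        for tag in tags:
            index[tag].add(clip_id)
    return index


def find_clips(index, *tags):
    """Return the sorted clip IDs that carry every requested tag."""
    if not tags:
        return []
    matches = set.intersection(*(index.get(tag, set()) for tag in tags))
    return sorted(matches)


index = build_index(CLIPS)
print(find_clips(index, "Broken bat"))  # ['clip-001', 'clip-003']
print(find_clips(index, "Broken bat", "Blooper"))  # ['clip-003']
```

Once the index exists, a producer's question becomes a set lookup, and combining tags narrows the results without anyone rewatching footage.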