Synchronizing Gameplay and Animation with Music


I’ve developed an aerial shooter called Substream that extensively uses gameplay events and environmental animations that are synced to the game’s soundtrack. This article covers some of the things I’ve discovered; what works, what I can get away with, and how far to push it…

Timing is everything

Humans learn that when we see and hear things happening concurrently we can try to draw an association. So when it comes to syncing up music and game events, timing is extremely important. But triggering visuals at the beat isn’t always best. If an animation’s prominence builds over a few frames, triggering it before the beat can provide a better sense of synchronicity, so that the two events (animation and sound) peak together.

I synced up my events by hand, and I’ve learnt to be careful and get it right on the first pass. If I misplace the position of a note by even a quarter beat it creates an interesting kind of problem; I can usually sense that an animation amongst a string of notes was mistimed quite clearly, but not spot where. The illusion that music and animations are one is easily broken, even when you can’t point to it.

Substream’s AudioMarker Syncronization Tool

Pitch is nothing

When making movements that appear to be linked to the pitch of the music, you can lie a lot. As long as things generally go one way when the pitch goes up, and usually the other way when it falls, it’s all good.

Rock Band and Guitar Hero are a great examples of this. You have five buttons on the guitar but many notes in the music. If a note is roughly higher than the previous one the player will usually be asked to press a button that’s somewhere higher on the fretboard. Non-guitar players will quickly get a feel for which end makes higher notes and which makes lower notes and fully accept this system, even though it’s used so liberally by the game.

Rock Band

You can alter perceptions

Although animations need to be timed well, it’s possible to get away with missing notes completely. And this is where things start to get spooky.

If ten notes play and nine have an animation, the one without can appear quieter, or not seem to count as part of the melody. Alternatively, adding an animation usually linked to a snare at a time when no snare is playing can lead you to believe it was there, very quiet – you’re not sure if you heard it. A bigger than average animation can make a note appear louder.

No, it’s not Synaesthesia

When a colourful musically animated game comes along marketing departments sometimes like to bring out this fancy word. I used it once or twice, but after doing some research I decided to stop.

Synaesthesia is a condition of mind where the experience of one sense is associated with a completely different sense or idea. This can mean that hearing a musical note comes with a certain colour, but it can also mean that numbers have a physical location in space or that certain words form a taste. This is something people experience their whole lives, but it’s rarely considered a negative by them. Although someone with synaesthesia might try to communicate their experiences to others through a game design, you cannot induce Synaesthesia in someone.

A musical game might convince you that a certain object in a scene is creating a sound in the music; but that’s different, and it’s not as special as I’d like to tell you it is. Video games trigger sounds when objects animate all the time. Boom.

Input detection needs to be quick

If you’re expecting the player to be able to perform an action or trigger something on a beat this can be difficult to balance. If the player presses the button just before a note there’s no problem because you can have your code remember this for a few milliseconds and then perform the action when the beat comes past. But if the player presses it late then you have a decision to make about performing the action late, waiting until the next beat, or not allowing the action. If you’re running at 30 frames per second and you get a late input that you don’t process until the next frame, this will make all of these worse. If you can, check the state of the input device right before you make this decision in the code and do all of that as late in the frame update as possible.

Sometimes you can just throw stuff at the screen

We see shapes moving through TV static because our brains hunt for patterns with meaning. Enough particles will seem like they must somehow be related to an electric guitar.

Some people will never see it

Sad news. Some people just don’t get it. Maybe all their attention is on the gameplay. Or perhaps they are not musically minded. But a few people have stepped away from my game at conventions and been completely surprised when I’ve told them it was all synchronised to the soundtrack. At least they still think the game looks and sounds nice as separate factors. But unfortunately that special link will always pass a small percentage of people by completely.

First published on 29th September 2014.