Basically, all 6 image tracks here (one for each side of the cube) use the 3D Source Alpha compositing mode. I repositioned the images to form a cube, then parented the other 5 tracks to the back-side image's track. This lets me mess around with the "cube" as a single object (by adjusting the parent track's motion) rather than painfully readjusting the 6 images separately. I may use this in a future YTPMV. I dunno, we'll see. ;P
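
For anyone curious why the parenting trick works, here's a rough Python sketch of the idea. It is not how the editor does it internally, just an illustration: the face layout, the `rotate_y` and `apply_parent` helpers, and the numbers are all made up for the example. Each face keeps its own local position, and one parent transform (rotate, then move) is applied to all of them at once.

```python
import math

# Hypothetical local centers of the six faces of a cube with half-size 1.
FACES = {
    "back":   (0.0, 0.0, 1.0),
    "front":  (0.0, 0.0, -1.0),
    "left":   (-1.0, 0.0, 0.0),
    "right":  (1.0, 0.0, 0.0),
    "top":    (0.0, 1.0, 0.0),
    "bottom": (0.0, -1.0, 0.0),
}


def rotate_y(point, degrees):
    """Rotate a 3D point around the Y axis by the given angle."""
    x, y, z = point
    a = math.radians(degrees)
    return (
        x * math.cos(a) + z * math.sin(a),
        y,
        -x * math.sin(a) + z * math.cos(a),
    )


def apply_parent(faces, angle_deg, offset):
    """Apply a single parent transform (rotate, then move) to every face."""
    ox, oy, oz = offset
    result = {}
    for name, local in faces.items():
        x, y, z = rotate_y(local, angle_deg)
        # Round to hide tiny floating-point noise from the trig functions.
        result[name] = (round(x + ox, 6), round(y + oy, 6), round(z + oz, 6))
    return result


# One change to the "parent" moves the whole cube; the faces stay glued together.
moved = apply_parent(FACES, 90, (5.0, 0.0, 0.0))
for name, position in moved.items():
    print(name, position)
```

The point is that the faces' local positions never change. Only the one parent transform does, which is why adjusting the parent is so much less painful than nudging 6 images by hand.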