Image and Video Generation: A Deep Learning Approach

Video generation consists of producing a video sequence in which an object from a source image is animated according to some external information (a conditioning label or the motion of a driving video). In this talk Prof. Nicu Sebe will present some of the recent achievements by him and his team addressing two specific aspects: 1) generating facial expressions, e.g. smiles that differ from each other (spontaneous, tense, etc.), using diversity as the driving force, and 2) generating videos without using any annotation or prior information about the specific object to animate.

Once trained on a set of videos depicting objects of the same category (e.g. faces, human bodies), their method can be applied to any object of this class. To achieve this, they decouple appearance and motion information using a self-supervised formulation. To support complex motions, they use a representation consisting of a set of learned keypoints along with their local affine transformations. A generator network models occlusions arising during target motions and combines the appearance extracted from the source image and the motion derived from the driving video. Their solutions score best on diverse benchmarks and on a variety of object categories.
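The keypoint-plus-local-affine motion representation mentioned above can be illustrated with a minimal sketch. This is not the team's actual implementation; it is a toy numpy example, under the assumption that motion near each learned keypoint is approximated to first order, so a driving-frame coordinate is mapped back to the source frame via the keypoint positions and their local affine Jacobians. All names (`local_affine_flow`, `kp_src`, `jac_drv`, etc.) are illustrative.

```python
import numpy as np

def local_affine_flow(grid, kp_src, kp_drv, jac_src, jac_drv):
    """Backward flow near one keypoint, first-order approximation:
    z_src ~ kp_src + J_src @ inv(J_drv) @ (z_drv - kp_drv).
    grid: (N, 2) driving-frame coordinates; returns (N, 2) source coords."""
    A = jac_src @ np.linalg.inv(jac_drv)
    return kp_src + (grid - kp_drv) @ A.T

# Toy example: one keypoint displaced by (0.1, -0.2), identity local affines,
# so the flow reduces to a pure translation of the sampling grid.
h, w = 4, 4
ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
grid = np.stack([xs, ys], axis=-1).reshape(-1, 2)

kp_src = np.array([0.0, 0.0])
kp_drv = np.array([0.1, -0.2])
J = np.eye(2)

flow = local_affine_flow(grid, kp_src, kp_drv, J, J)
# With identity Jacobians, every coordinate shifts by kp_src - kp_drv.
print(np.allclose(flow, grid - kp_drv + kp_src))
```

In a full pipeline, one such flow field is predicted per learned keypoint, a dense-motion network combines them, and the generator warps the source-image features along the combined flow while an occlusion mask tells it where warping is unreliable and content must be inpainted instead.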

Tuesday, April 20, 2021, 5:00 p.m. – 6:00 p.m., virtual event

More details at:

This is an event of I-AIDA, the International Artificial Intelligence Doctoral Academy.
