Prof. Zahra Montazeri, University of Manchester
Modeling and Rendering Cloth

Zahra is a lecturer (Assistant Professor) at the University of Manchester in the Department of Computer Science. Her field of research is physics-based computer graphics with a focus on photorealistic rendering and appearance modeling for complex materials such as cloth, hair, and fur. She worked as a research consultant at Disney Research and collaborates with Weta Digital on joint research projects. She received a Star Wars credit for The Mandalorian, and her research paper was used in the production of Avatar: The Way of Water. Before joining academia, she worked in Research and Development at Industrial Light & Magic (ILM). She holds a PhD in Computer Science from the University of California, Irvine (UCI), advised by Shuang Zhao, and worked under the supervision of Professor Henrik Wann Jensen. She received her M.Sc. in Computer Science from UCI in 2017 and her B.Sc. in Computer Engineering from the Sharif University of Technology, Iran, in 2015.

Prof. Markus Hadwiger, KAUST
Visualizing Large-Scale Scientific Data: Volume Data and Flow Fields

This talk will give an overview of selected research of the High-Performance Visualization research group at the KAUST Visual Computing Center (VCC). Interactive visualization is crucial to exploring, analyzing, and understanding large-scale scientific data, such as the data acquired in medicine or neurobiology using computed tomography or electron microscopy, and data resulting from large-scale simulations such as fluid flow in the Earth’s atmosphere and oceans. In data-driven sciences, the extreme size as well as complexity of data presents a tremendous challenge to interactive visualization and analysis. We will give an overview of two major research directions that we have been working on: (1) Custom data structures and algorithms for large-scale visualization, also taking the characteristics of GPU architectures into account; and (2) Mathematical techniques from differential geometry and mathematical physics for the visualization of large flow fields and important flow features such as vortices.

Prof. Minhyuk Sung, KAIST
Towards Shape Editability and Multi-View Consistency in 3D Generation

Generative AI has seen widespread growth thanks to the exponential increase in the scale of data. Its expansion into 3D generation is particularly noteworthy, as it has been developed to distill knowledge from 2D generative models, given the scarcity of large-scale 3D datasets. While recent 3D generation techniques have showcased impressive results, they have also exposed the limitations of 2D-prior-based 3D generation, including a lack of shape editability, limited connection to text due to the absence of data describing geometry, and inconsistency across multiple views. In this talk, I will introduce our recent work, which focuses on achieving part-level editing with a 3D diffusion model and a novel text-to-3D dataset. Additionally, I will discuss our synchronization technique for achieving view consistency when conducting 2D diffusion jointly. I will conclude by sharing my expectations for the future of 3D generation and the challenges we aim to address in our future research.

Silvia Sellán, University of Toronto
Uncertain Surface Reconstruction

We propose a method to introduce uncertainty to the surface reconstruction problem. Specifically, we introduce a statistical extension of the classic Poisson Surface Reconstruction algorithm for recovering shapes from 3D point clouds. Instead of outputting an implicit function, we represent the reconstructed shape as a modified Gaussian Process, which allows us to conduct statistical queries (e.g., the likelihood of a point in space being on the surface or inside a solid). We show that this perspective improves PSR's integration into the online scanning process, broadens its application realm, and opens the door to other lines of research such as applying task-specific priors.

Prof. Matthias Niessner, Technical University of Munich
Neural Surface Reconstruction: To Learn or Not to Learn?

In recent years, we have seen tremendous progress on leveraging neural networks – in particular, for reconstructing 3D surfaces, neural networks have become a necessary tool in order to achieve state-of-the-art performance. While we often see networks used to learn generalizable discriminative features (e.g., establishing feature matches for structure-from-motion), MLP-based representations have recently gained significant popularity as data structures substituting explicit representations such as voxel grids. Importantly, these representations can also be leveraged as parametric models, for instance, to regularize underconstrained reconstruction problems. Overall, the question is: when do we need to use networks for learning, and when should they be used for storing data and surface fitting? In this talk, I will give an overview of our latest works in this area, starting with neural networks as feed-forward predictors and generative models for surface reconstruction. I will discuss how to learn parametric models that can model dynamic scene reconstructions, e.g., for humans or arbitrary deformable surfaces. Finally, I will discuss when to use networks as model representations -- when to simply leverage them for fitting problems, and when generalization and feature learning are required to achieve cutting-edge reconstruction results.

Prof. Ariel Shamir, Reichman University
Generative algorithms for non-photorealistic (NPR) visual content

Computer vision and graphics algorithms for both analysis and synthesis have developed considerably in recent years due to advancements in neural networks and deep learning methods. Nevertheless, these algorithms concentrate mainly on photorealistic inputs and outputs. In this talk, I will present several efforts to advance the state-of-the-art on non-photorealistic (NPR) visual content such as animations, cartoons and even art paintings. The main challenges stem from the differences of these domains in subject, appearance, variance, and abstraction. I will show how learning correct representations as well as domain adaptation techniques enable tracking, segmentation, landmark detection in NPR domains, and allow synthesis of abstract visual depictions.

Prof. Mark Pauly, EPFL
Computational Inverse Design of Deployable Structures

Research at the EPFL Geometric Computing Laboratory (GCM) aims at empowering creators. We develop efficient simulation and optimization algorithms to build computational design methodologies for advanced material systems and digital fabrication technology. Mathematical reasoning, geometric abstractions, and powerful numerical methods are key ingredients in our work. In this talk I will show how these tools can be used to solve challenging inverse problems for deployable structures that can transition between multiple geometric states. Several design studies will highlight how the interplay of geometry, computation, and digital fabrication technologies facilitates the discovery of new material systems with superior functional performance. Such systems offer a wide variety of potential applications, for example in industrial and consumer products, soft robotics, or architecture.

Prof. Alexei A. Efros, UC Berkeley
Self-Supervised Visual Learning and Synthesis

Computer vision has made impressive gains through the use of deep learning models, trained with large-scale labeled data. However, labels require expertise and curation and are expensive to collect. Can one discover useful visual representations without the use of explicitly curated labels? In this talk, I will present several case studies exploring the paradigm of self-supervised learning -- using raw data as its own supervision. Several ways of defining objective functions in high-dimensional spaces will be discussed, including the use of Generative Adversarial Networks (GANs) to learn the objective function directly from the data. Applications of self-supervised learning will be presented, including colorization, on/off-screen source separation, paired and unpaired image-to-image translation (aka pix2pix and cycleGAN), and curiosity-based exploration.

Dr. Fabian Fuchs, Oxford University
Learning Invariant Representations with Neural Networks from Visual Input

My research topic is learning invariant representations. Simply put: whereas most of deep learning is concerned with finding the important information in an input, I focus on ignoring harmful or irrelevant parts of information. This can be important to counteract biases or to better leverage structure in the data. In this talk, I will cover two ideas: (1) Creating invariances with respect to a specific feature using adversarial training. (2) Leveraging permutation-invariant neural network architectures for addressing set-based problems. The application domains I am embedding this research in are intuitive physics as well as pedestrian tracking and trajectory prediction with a specific focus on relational reasoning.

Dr. Srinath Sridhar, Stanford University
Category-Level Object Perception with Normalized Object Coordinate Space (NOCS)

In this talk, I will introduce Normalized Object Coordinate Space (NOCS) to capture category-level information about object properties such as 6D pose and shape. NOCS is a canonical space that normalizes for position, orientation, and size of instances in a certain category. It can be used to represent intra-category variation of specific object properties. I will then describe a new representation called NOCS map that can be used to learn to predict 6D pose or 3D shape from a single RGB image. NOCS maps have several advantages compared to representations such as voxel grids or point clouds. For instance, they jointly encode shape and pose, have direct 2D-3D correspondences, and allow us to use well-studied 2D CNN machinery. I will show results from both 6D pose estimation [1] and 3D reconstruction [2]. Finally, I will discuss opportunities to extend NOCS maps to different object perception tasks.
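The idea of normalizing for position, orientation, and size can be illustrated with a small sketch. This is not the speaker's implementation: the function name `normalize_to_unit_cube`, the optional `rotation` argument (assumed to be a known alignment to the canonical orientation), and the convention of scaling the bounding-box diagonal to length 1 are all assumptions made for illustration.

```python
import numpy as np

def normalize_to_unit_cube(points, rotation=None):
    """Map an (N, 3) point set into a canonical [0, 1]^3 cube.

    rotation: optional 3x3 matrix (assumed known) that aligns the
    instance with the canonical orientation of its category.
    """
    pts = np.asarray(points, dtype=float)
    if rotation is not None:
        # Apply the rotation to every point (row vectors, hence the transpose).
        pts = pts @ np.asarray(rotation, dtype=float).T
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    diagonal = np.linalg.norm(hi - lo)
    if diagonal == 0.0:
        raise ValueError("degenerate point set: all points coincide")
    center = (lo + hi) / 2.0
    # Remove position (center), remove size (diagonal), then shift to [0, 1].
    return (pts - center) / diagonal + 0.5

# Two hypothetical instances that differ only in position and size.
small = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 4.0, 0.0], [0.0, 4.0, 4.0]])
large = small * 3.0 + np.array([10.0, -5.0, 2.0])

canon_small = normalize_to_unit_cube(small)
canon_large = normalize_to_unit_cube(large)
```

Under these assumptions both instances land on the same canonical coordinates, which is what makes a shared space useful for comparing instances within a category.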

Prof. Leonidas J. Guibas, Stanford University
Progress in Object-Centric Machine Learning

Deep knowledge of the world is necessary if we are to augment real scenes with virtual entities, or to have autonomous and intelligent agents and artifacts that can assist us in everyday activities -- or even carry out tasks entirely independently. One way to factorize the complexity of the world is to associate information and knowledge with stable entities, animate or inanimate, such as a person or a vehicle, etc. -- things we can generically call objects. In this talk I'll survey a number of recent efforts whose aim is to create and annotate reference representations for objects based on 3D models, with the aim of delivering information to new observations, as needed. The information may relate to object geometry, appearance, articulation, materials, physical properties, affordances, or functionality. One challenge of the 3D world is that 3D data typically come as point clouds or meshes, which do not have the regular grid structure of image or video data. This makes it challenging to apply the highly successful convolutional deep architectures (CNNs) to 3D data, as CNNs heavily depend on neighborhood regularity for weight sharing and other optimizations. The talk will illustrate deep architectures capable of processing irregular 3D geometry for tasks such as object extraction from scenes, geometric primitive fitting, inferring object function from observations, and learning to differentiate objects through language. Tools towards these goals include canonical spaces for objects and representations of their compositional structure, as well as multi-objective training and learned communication patterns in architectures.

Dr. Aaron Hertzmann, Adobe Research
Can Computers Create Art?

I will discuss whether computers, using Artificial Intelligence (AI), could create art. I will cover the history of automation in art, examining the hype and reality of AI tools for art together with predictions about how they will be used. I will also discuss different scenarios for how an algorithm could be considered the author of an artwork, which comes down to questions of why we create and appreciate artwork.

Dr. Richard Zhang, Adobe Research
Modeling Perceptual Similarity and Shift-Invariance in Deep Networks

While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called perceptual losses? What elements are critical for their success? We introduce a new dataset of human perceptual similarity judgments and systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. Our results suggest that perceptual similarity is an emergent property shared across deep visual representations. Despite their strong transfer performance, deep convolutional representations surprisingly lack a basic low-level property -- shift-invariance, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the Nyquist sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components. We observe increased accuracy in ImageNet classification, across several commonly used architectures. Furthermore, we observe better generalization, in terms of stability and robustness to input corruptions. Our results demonstrate that this classical signal processing technique has been undeservedly overlooked in modern deep networks.
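The aliasing argument can be seen on a one-dimensional toy signal. The sketch below is illustrative only and is not the method from the talk: the function names, the `[1, 2, 1] / 4` low-pass kernel, the circular boundary handling, and the alternating test signal (a deliberately extreme case) are all assumptions.

```python
import numpy as np

def subsample(x, stride=2):
    """Plain strided downsampling: keep every `stride`-th sample."""
    return x[::stride]

def blur_then_subsample(x, stride=2):
    """Low-pass filter with an assumed [1, 2, 1] / 4 kernel, then subsample.

    np.roll gives circular boundary handling, which keeps the toy
    example free of edge effects.
    """
    blurred = (np.roll(x, 1) + 2.0 * x + np.roll(x, -1)) / 4.0
    return blurred[::stride]

# Highest-frequency signal the grid can hold, and the same signal shifted by one sample.
signal = np.tile([0.0, 1.0], 8)
shifted = np.roll(signal, 1)

# Largest change in the downsampled output caused by the one-sample shift.
naive_gap = np.abs(subsample(signal) - subsample(shifted)).max()
antialiased_gap = np.abs(blur_then_subsample(signal) - blur_then_subsample(shifted)).max()
```

Here the plain strided output flips from all zeros to all ones under a one-sample shift, while the filtered output stays constant at 0.5. Real networks see far less extreme inputs, so the effect in practice is a matter of degree.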

Dr. Bryan Russell, Adobe Research
Deep Deformation Networks for 3D Geometry

In my talk, I will describe recent work on learning to represent and generate 3D shapes. I will start by describing AtlasNet, an approach that represents a 3D shape as a collection of parametric surface elements and naturally infers a surface representation of the shape. I’ll then describe an extension, 3D-CODED, for matching deformable shapes to obtain 3D correspondences. Finally, I’ll describe an approach for representing shapes as the deformation and combination of learnable elementary 3D structures.

Thu Nguyen-Phuoc, University of Bath
HoloGAN: Unsupervised learning of 3D representations from natural images

We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and learns to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. In particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.

Prof. Taku Komura, University of Edinburgh
Learning Neural Character Controllers from Motion Capture Data

I will present two data-driven frameworks based on neural networks for interactive character control. The first approach is called a Phase-Functioned Neural Network (PFNN). In this network structure, the weights are computed via a cyclic function which uses the phase as an input. Along with the phase, our system takes as input user controls, the previous state of the character, and the geometry of the scene, and automatically produces high-quality motions that achieve the desired user control. The entire network is trained in an end-to-end fashion on a large dataset composed of locomotion such as walking, running, jumping, and climbing movements fitted into virtual environments. Our system can therefore automatically produce motions where the character adapts to different geometric environments such as walking and running over rough terrain, climbing over large rocks, jumping over obstacles, and crouching under low ceilings. Our network architecture produces higher quality results than time-series autoregressive models such as LSTMs as it deals explicitly with the latent variable of motion relating to the phase. Once trained, our system is also extremely fast and compact, requiring only milliseconds of execution time and a few megabytes of memory, even when trained on gigabytes of motion data. Our work is most appropriate for controlling characters in interactive scenes such as computer games and virtual reality systems. The second approach is called Mode-Adaptive Neural Networks. This is an extension of the PFNN and has the capability to control quadruped characters, where the locomotion is multimodal. The system is composed of the motion prediction network and the gating network. At each frame, the motion prediction network computes the character state in the current frame given the state in the previous frame and the user-provided control signals. The gating network dynamically updates the weights of the motion prediction network by selecting and blending what we call the expert weights, each of which specializes in a particular movement. Due to the increased flexibility, the system can learn consistent expert weights across a wide range of non-periodic/periodic actions, from unstructured motion capture data, in an end-to-end fashion. In addition, the users are released from performing complex labelling of phases in different gaits. We show that this architecture is suitable for encoding the multi-modality of quadruped locomotion and synthesizing responsive motion in real time.
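The blending of expert weights described above can be sketched roughly as follows. This is a simplified illustration rather than the architecture from the talk: it uses a single linear layer, takes the gating scores as a given input instead of computing them with a gating network, and assumes a softmax to turn scores into blending coefficients. All names (`blend_expert_weights`, `predict_next_state`, `gating_logits`) are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def blend_expert_weights(expert_weights, gating_logits):
    """Blend K expert weight matrices into one.

    expert_weights: array of shape (K, out_dim, in_dim)
    gating_logits:  array of shape (K,), a stand-in for the gating network output
    """
    alpha = softmax(np.asarray(gating_logits, dtype=float))
    # Weighted sum over the expert axis gives an (out_dim, in_dim) matrix.
    blended = np.tensordot(alpha, expert_weights, axes=1)
    return blended, alpha

def predict_next_state(prev_state, control, expert_weights, gating_logits):
    """One linear prediction step using the blended weights (no bias or nonlinearity)."""
    blended, _ = blend_expert_weights(expert_weights, gating_logits)
    return blended @ np.concatenate([prev_state, control])

# Toy setup: 3 experts, 4-dimensional state, 2-dimensional control signal.
rng = np.random.default_rng(0)
experts = rng.normal(size=(3, 4, 6))
state, control = rng.normal(size=4), rng.normal(size=2)

uniform_blend, uniform_alpha = blend_expert_weights(experts, np.zeros(3))
next_state = predict_next_state(state, control, experts, np.array([0.2, 1.5, -0.3]))
```

With equal gating scores the blend is the plain average of the experts, and as one score dominates, the blend approaches that single expert. Because the blending coefficients change every frame, the effective weights of the prediction step change with them.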

Dr. Xi Wang, Technical University of Berlin
I Can See What You Think: The Mental Image Revealed by Gaze Tracking

Humans involuntarily move their eyes when retrieving an image from memory. This motion is often similar to actually observing the image. We suggest exploiting this behavior as a new modality in human-computer interaction, using the motion of the eyes as a descriptor of the image. Interaction requires the user's eyes to be tracked but no voluntary physical activity. We perform a controlled experiment and develop matching techniques using machine learning to investigate whether images can be discriminated based on the gaze patterns recorded while users merely think about an image. Our results indicate that image retrieval is possible with an accuracy significantly above chance. We also show that this result generalizes to images not used during training of the classifier and extends to uncontrolled settings in a realistic scenario.

Thu Nguyen-Phuoc, University of Bath
RenderNet: A Deep Convolutional Network for Differentiable Rendering from 3D Shapes

The traditional computer graphics rendering pipeline is designed for procedurally generating quality 2D images from 3D shapes with high performance. The non-differentiability due to discrete operations such as visibility computation makes it hard to explicitly correlate rendering parameters and the resulting image, posing a significant challenge for inverse rendering tasks. Recent work on differentiable rendering achieves differentiability either by designing surrogate gradients for non-differentiable operations or via an approximate but differentiable renderer. These methods, however, are still limited when it comes to handling occlusion, and restricted to particular rendering effects. We present RenderNet, a differentiable rendering convolutional network with a novel projection unit that can render 2D images from 3D shapes. Spatial occlusion and shading calculation are automatically encoded in the network. Our experiments show that RenderNet can successfully learn to implement different shaders, and can be used in inverse rendering tasks to estimate shape, pose, lighting and texture from a single image.

Prof. Nils Thuerey, Technical University of Munich
Physics-based Deep Learning for Fluids

In this talk I will focus on the possibilities that arise from recent advances in the area of deep learning for accelerating and improving physics simulations. I will focus on fluids, which encompass a large class of materials we encounter in our everyday lives. In addition to being ubiquitous, the underlying physical model, the Navier-Stokes equations, represents a challenging, non-linear advection-diffusion PDE that poses interesting challenges for deep learning methods. I will explain and discuss several research projects from our lab that focus on temporal predictions of physical functions, temporally coherent adversarial training, and predictions of steady-state turbulence solutions. Among other things, it turns out to be useful to make the learning process aware of the underlying physical principles. Here, especially the transport component of the Navier-Stokes equations plays a crucial role. I will also give an outlook on open challenges in the area of deep learning for physical problems. Most importantly, trained models could serve as priors for a variety of inverse and control problems.

Prof. Manolis Savva, Simon Fraser University
Understanding 3D Environments through Embodiment

In this talk, I will describe a series of past, current and future projects that aim to understand 3D environments through the perspective of a human or other embodied agent acting in the scene. By learning from observations of people acting in the real world, we can obtain an agent-centric representation of the structure and semantics of 3D environments that is useful for both analysis and synthesis tasks. First, I will demonstrate how we can use this representation to analyze 3D environments and predict how likely they are to support specific human actions. Then I will show how we can use the same representation to generate 3D environments and human poses depicting common actions. Finally, I will describe ongoing work on building an embodied simulation framework to establish a common platform for research in embodied agents acting within realistic 3D environments. This platform allows us to leverage computer graphics to generate 3D environments with controlled variation, enabling systematic learning through simulation for problems in computer vision, robotics, NLP, and AI.

Prof. Peter Wonka, KAUST

Prof. Leonidas Guibas, Stanford University
Learning Motion Patterns in 3D

Human activities invariably involve movement and interactions with other objects, animate or inanimate. Reasoning about motions raises many interesting challenges, especially when we have multiple moving objects with distinct motions, yet all synergistic towards a common high-level goal. Motion estimation, representation, and segmentation from visual data are all non-trivial problems, especially since occlusions are very common in human-object interactions involving contacts. In this talk we survey a number of recent efforts in high-level motion analysis and inference from visual data, such as RGB and RGB-D videos or point clouds over time. We start by reviewing certain recent deep neural architectures for processing point cloud data. We then look at the simple problem of inferring 3D flow in dynamic point clouds, revisiting ICP from a learning perspective. We proceed to derive descriptors for human-object interactions that aim to capture certain key aspects of the geometry and dynamics of the interaction, but without being too closely tied to any particular motion representation. We finally discuss how to infer the motion patterns of multi-step human activities in desktop or table-top settings, for example, setting a table for dinner. We exhibit a recurrent neural architecture that can learn from 2D videos the patterns of such activities and generate synthetic interactions that follow both physical laws and human conventions. This machinery allows us to transport interactions spatially, to new settings, as well as transport interactions temporally, to produce continuations or completions of partially observed activities. This latter functionality facilitates the creation of assistive agents that can help people by inferring intent and providing them with either informational or physical help in smart environments.

Prof. Matthias Niessner, Technical University of Munich
Reconstructing and Understanding 3D Indoor Environments

In recent years, commodity 3D sensors have become easily and widely available. These advances in sensing technology have spawned significant interest in using captured 3D data for mapping and semantic understanding of 3D environments. In this talk, I will give an overview of our latest research in the context of 3D reconstruction of indoor environments. I will further talk about the use of 3D data in the context of modern machine learning techniques. Specifically, I will highlight the importance of training data, and how we can efficiently obtain labeled and self-supervised ground truth training datasets from captured 3D content. Finally, I will show a selection of state-of-the-art deep learning approaches, including discriminative semantic labeling of 3D scenes and generative reconstruction techniques.

Prof. Adrian Hilton, University of Surrey
4D Vision for Human Performance Capture & Animation

4D vision is an emerging area within Computer Vision addressing the capture and analysis of real-world dynamic scenes from video. This talk will review progress in 4D vision over the past decade through to current challenges. Recent advances have enabled 4D capture of natural dynamic scenes such as people, together with the use of captured human performance for video-realistic animation. The technology is currently being deployed in entertainment content production for both film and immersive content for VR.

Dr. Florent Lafarge, INRIA
Partitioning Images Into Polygons: Two Methods and a Few Applications

The over-segmentation of images into atomic regions has become a standard and powerful tool in Vision and Graphics. The very popular superpixel methods, which operate at the pixel level, cannot directly capture the geometric information disseminated into the images. In this talk, we propose an alternative representation to superpixels. By operating at the level of geometric shapes, typically line-segments, one can generate geometric partitions of images as layouts of polygons. Such layouts are compact, scalable, and come with geometric guarantees. We present two different methods to generate such geometric partitions. The first method builds a Voronoi diagram that conforms to preliminarily detected geometric shapes, whereas the second one exploits a kinetic framework to locally propagate the geometric shapes. Through some applications in urban reconstruction, we show that such partitions are particularly well suited to analysing images with strong geometric signatures such as man-made objects.

Prof. Jan Beneš, Charles University, Prague
On Realism of Architectural Procedural Models

The goal of procedural modeling is to generate realistic content. The realism of this content is typically assessed by qualitatively evaluating a small number of results, or, less frequently, by conducting a user study. However, there is a lack of systematic treatment and understanding of what is considered realistic, both in procedural modeling and for images in general. We conduct a user study that primarily investigates the realism of procedurally generated buildings. Specifically, we investigate the role of fine and coarse details, and investigate which other factors contribute to the perception of realism. We find that realism is carried on different scales, and identify other factors that contribute to the realism of procedural and non-procedural buildings.

Prof. Jesus Perez Rodriguez, Universidad Rey Juan Carlos
Design and Fabrication of Deformable Structures

Digital fabrication allows for an extremely fast transition from virtual prototypes to their physical realization. In the case of deformable objects, one would like to design these prototypes with a clear idea in mind about how they should behave once they are printed. It is not easy to predict what combination of material and geometric properties will produce a specific global deformation behavior. We seek to create tools that simplify as much as possible the way a user specifies the desired behavior and automate the rest of the design process. In this talk, we take a brief look at the diversity of recent works, identify the fundamental aspects of these methods, and present computational solutions for the design, simulation, and fabrication of two interesting kinds of deformable structures: i) We first explore Flexible Rod Meshes. These are lightweight and cost-efficient physical shapes that can be fabricated in one piece from a single base material, and yet produce deformable objects with really complex behaviors. We present a tool that takes as input a deformable surface together with a set of poses and boundary conditions and automatically computes a rod mesh ready to be printed. ii) We then study Kirchhoff-Plateau Surfaces. These are planar networks of thin elastic rods embedded in pre-tensioned membranes that deploy into complex, three-dimensional shapes, composed of minimal surface patches. We propose a tool to interactively explore this intriguing and expressive design space, using a combination of topology and geometry editing, forward simulation, sensitivity analysis and highly efficient inverse design. In the last part of the talk, we’ll briefly take a look at some new trends and promising challenges in the field.

Dr. Dan Koschier, RWTH Aachen University
Boundary and Interface Handling in Physically Based Animation

Physics simulations for solids and fluids are today essential for the production of realistic animations and special effects in feature films, computer games and surgical simulators. The underlying mathematical models often require handling of boundaries and geometric interfaces. Phenomena modelled using interfaces include but are not limited to collision handling, two-way coupling in solid-fluid interaction, multiphase flows, and cutting and fracture of solid objects. Despite the wide range of existing approaches, an accurate and robust treatment is still difficult. In this talk I will present recent approaches that aim towards robust and accurate treatment of interfaces and boundaries in physically based animation. In the first part I will give an overview of the research that resulted from my work as a PhD student. The main part will be devoted to a recent approach for the simulation of cutting of deformable solids. A finite element discretization will be introduced that is able to capture discontinuities in the underlying partial differential equation's solution due to physical cuts. Without requiring any topological changes in the discretization mesh, basis enrichments are employed that augment the approximation basis by discontinuous functions. One key aspect of the method is the construction of specialized quadrature rules for numerical computation of integrals over piecewise polynomial but discontinuous functions arising due to dissected finite elements. On the basis of several examples and comparisons the robustness and visual realism of the method will be demonstrated. Finally, the talk will be concluded by a discussion of limitations and future work.

Dr. Bailin Deng, Cardiff University
From Digital to Physical: Computational Methods for Design and Fabrication

In the past few decades, advances in digital design tools have made it possible to design highly complex 3D shapes. However, physical realization of these shapes remains a challenging task. Recently, the emergence of affordable fabrication tools such as 3D printers and laser cutters allows us to turn a digital design into a physical object. But effective use of these tools requires the design shape to satisfy specific requirements related to the fabrication technologies, which are not considered by traditional 3D design tools. We argue that these fabrication requirements can be incorporated into the design process as geometric constraints, such that the resulting designs can be realized using specific technologies and materials. We present a few fabrication-aware design tools for different applications, from freeform architectural design to cost-effective fabrication of large objects.

Dr. Bryan Russell, Adobe Research
Video Recognition at Adobe

In this talk I will describe ongoing efforts in video recognition at Adobe. Video presents additional challenges over recognition in still images. Example challenges include the sheer volume of data, lack of annotated data across time, and presence of action categories where motion and appearance cues are critical. I’ll describe work that addresses the issue of visual representation in video and its intersection with natural language.

Prof. Nils Thuerey, TU Munich
Data-driven Fluid Simulation

Physics simulations for virtual smoke, explosions or water are by now crucial tools for special effects. Despite their widespread use, it’s still very difficult to get these simulations under control, or to make them fast enough for practical use. In this talk I will present recent research projects that aim at solving or alleviating these issues. A central part of this talk will be devoted to methods for interpolating fluid simulations. I will describe a method that uses 5D optical flow to register two space-time volumes of simulations. Once the registration is computed, in-between versions can be generated very efficiently. In contrast to previous work, this approach uses a volumetric representation, which is beneficial for smooth and robust registrations without user intervention. I will show several examples of smoke and liquid animations generated with this interpolation method, and discuss limitations of the approach. The talk will be concluded by giving an outlook on open questions in the area.

Prof. Peter Wonka, King Abdullah University of Science and Technology
Integer Programming for Layout Problems

In this talk, I will give an introduction to Integer Programming (IP) and show how we used IP in recent research projects. The projects range from problem formulations in visualization to urban modeling.

Dr. Manfred Lau, Lancaster University
Tactile Mesh Saliency

I will discuss the problem of "tactile mesh saliency", where tactile salient points are those on a virtual mesh that a human is more likely to grasp, press, or touch if the mesh were a real-world object. While the concept of visual saliency has been previously explored in the areas of mesh and image processing, tactile saliency detection has not been explored. I will describe the solution to this problem that we have developed. We collect crowdsourced data of relative rankings and develop a new formulation to combine deep learning and learning-to-rank methods to compute a tactile saliency measure. The solution is demonstrated on a variety of 3D meshes and various applications including material suggestion for rendering and fabrication. Time permitting, I will also discuss other problems that I have recently worked on that use a similar learning framework.

Prof. Tao Ju, Washington University in St. Louis
Topology-aware Modeling from Curves

Many applications of surface models, such as mesh processing, simulation, and manufacturing, are sensitive to the topological properties of the models. To create a surface with the desirable topology, a common strategy is to first reconstruct the surface from the input data using a topology-oblivious algorithm and then fix any topological errors in a post-process. We advocate a different strategy that reconstructs the surface with topology constraints in mind. The talk reviews several recent works in this direction that revolve around reconstructing surfaces from a network of spatial curves. We will consider a variety of topological constraints, such as manifoldness, connected components, and genus.

Prof. Michiel van de Panne, University of British Columbia
Learning Agile Movement Skills with Physics-based Simulation

Interactive physics-based simulations are now capable of reproducing a growing number of motion skills, often with a focus on generating agile-and-robust locomotion. In this talk, I review recent progress in simulation-based models of human and animal motion as used for computer animation, where they seek to replace simpler kinematic models based on motion-capture. We will discuss the roles of optimization, machine learning, and simplified models in these approaches, as well as what insights might be shared between robotics and our simulation-based work in animation. A wide variety of animated results will be shown to illustrate the capabilities of current methods. I'll also identify several research directions where we still need to see significant progress.

Dr. Ligang Liu, University of Science and Technology of China
Fabrication Oriented Geometric Design and Optimization

3D printers have become popular in recent years and enable fabrication of custom objects for home users. The promise of moving creations from a virtual space into reality is truly tantalizing, and its applications go far beyond basic manufacturing and rapid prototyping. However, many obstacles remain for 3D printing to be practical and commonplace. In this talk, I will review our recent works on geometric modeling and processing for 3D printing applications.

Dr. Wenping Wang, University of Hong Kong
Computing Medial Axis Transform of 3D Objects

As a complete shape description, the medial axis of a geometric shape possesses a number of favorable properties--it encodes symmetry, local thickness and structural components of the shape it represents. Hence, the medial axis has been studied extensively in shape modeling and analysis since its introduction by Blum in the 1960s. However, the practical application of the medial axis is hindered by its notorious instability and lack of compact representation; that is, a primitive medial axis without proper processing is often represented as a dense discrete mesh with many spurious branches. In this talk I shall present some recent studies on computing stable and compact representations of the medial axes of 3D shapes. Techniques from mesh simplification will be employed to compute a medial axis without spurious branches and represented by a small number of mesh vertices, while meeting specified approximation accuracy.

Jean Hergel, INRIA Nancy
3D Fabrication of 2D Mechanisms

The success of physics sandbox applications and physics-based puzzle games is a strong indication that casual users and hobbyists enjoy designing mechanisms, for educational or entertainment purposes. In these applications, a variety of mechanisms are designed by assembling two-dimensional shapes, creating gears, cranks, cams, and racks. We propose to start from such casual designs of mechanisms and turn them into a 3D model that can be printed on widely available, inexpensive filament-based 3D printers.

Dr. Thomas Auzinger, IST Austria
Exact Anti-Aliasing and Approximate Shape Optimization

This talk covers two topics from computer graphics: In the first part, it is shown how to perform exact anti-aliasing in the context of rasterization by utilizing closed-form solutions of the corresponding filter convolutions. This provides a ground truth solution to edge anti-aliasing in the context of 3D-to-2D rasterization, which is made possible by an analytic visibility method. Parallel algorithms are presented for these methods and an efficient GPGPU implementation is outlined. The second part of the talk presents a reduced-order approach to shape optimization. The task of optimizing the physical behavior of fabricable models is formulated in terms of offset surfaces. This allows the associated non-linear optimization problem to be efficiently encoded in a reduced-order basis - Manifold Harmonics in our case - which significantly reduces the computational effort to find a solution.

Prof. Michael Bronstein, University of Lugano
Deep Learning on Geometric Data

The past decade in computer vision research has witnessed the re-emergence of 'deep learning' and in particular, convolutional neural network techniques, which make it possible to learn task-specific features from examples and have achieved a breakthrough in performance in a wide range of applications. However, in the geometry processing and computer graphics communities, these methods are practically unknown. One of the reasons stems from the fact that 3D shapes (typically modeled as Riemannian manifolds) are not shift-invariant spaces, hence the very notion of convolution is rather elusive. In this talk, I will show some recent works from our group trying to bridge this gap. Specifically, I will show the construction of intrinsic convolutional neural networks on meshes and point clouds, with applications such as finding dense correspondence between deformable shapes and shape retrieval.

Prof. Ron Kimmel, Technion
A Spectral Perspective on Shapes

The differential structure of surfaces captured by the Laplace-Beltrami Operator (LBO) can be used to construct a space for analyzing visual and geometric information. The decomposition of the LBO at one end, and the heat operator at the other end, provide us with efficient tools for dealing with images and shapes. Denoising, matching, segmenting, filtering, and exaggerating are just a few of the problems for which the LBO provides a convenient operating environment. We will review the optimality of a truncated basis provided by the LBO, and a selection of relevant metrics by which such optimal bases are constructed. A specific example is the scale-invariant metric for surfaces, which we argue to be a natural choice for the study of articulated shapes and forms.

Prof. Mario Botsch, Bielefeld University
A Spectral Perspective on Shapes

Finite element simulations of deformable objects are typically based on spatial discretizations using either tetrahedral or hexahedral elements. This allows for simple and efficient computations, but in turn requires complicated remeshing in case of topological changes or adaptive simulations. In this talk I will show how the use of arbitrary polyhedral elements in FEM simulations avoids the need for remeshing and thereby simplifies adaptive refinement, interactive cutting, and fracturing of the simulation domain.

Dr. Nils Thuerey, TU München
Fluid Effects - From Capture to Simulation

Physics simulations are widely recognized to be crucial tools for complex special effects in feature films. In addition, real-time simulations are by now often central game-play elements in modern computer games. Despite this, we are still very far from being able to accurately simulate the complexity of nature around us, and the common numerical methods are often difficult to fine-tune and control. In this talk I will focus on fluid effects, and I will explain a different take on dealing with them in virtual environments: instead of trying to calculate everything from scratch based on a physical model, I will outline a method to capture the motion of a fluid based on a sequence of density inputs. A simulation tightly coupled with optical flow is used in an optimization step to calculate the actual flow velocities. Interestingly, an accurate flow simulation turns out to be a crucial prior to constrain the optical flow reconstruction to physical motions. I will then demonstrate how the extracted velocities can be used to re-run modified setups and generate higher-resolution versions of the original flow. The talk will be concluded by giving an outlook on requirements of the visual effects industry, open challenges for capturing flows, and areas of application outside of computer graphics.

Prof. Helmut Pottmann, TU Wien
Computational Differential Geometry & Fabrication-Aware Design

This talk will present an overview of my recent research, which revolves around discrete and computational differential geometry with applications in architecture, computational design and manufacturing. From the mathematical perspective, we are working on extensions of classical differential geometry to data and objects which frequently arise in applications, but do not satisfy the classical differentiability assumptions. On the practical side, our work aims at geometric modeling tools which include important aspects of function and fabrication already in the design phase. This interplay of theory and applications will be illustrated by means of selected recent projects on the computational design of architectural freeform structures under manufacturing and structural constraints. In particular, we will address smooth skins from simple and repetitive elements, self-supporting structures, form-finding with polyhedral meshes, optimized support structures, shading systems and the exploration of the available design space.

Dr. Siddhartha Chaudhuri, Princeton University
Content Creation with Semantic Attributes

Visual media surrounds us, and there is growing interest in new applications such as 3D printing and collaborative virtual worlds. As more and more people engage in producing visual content, there is a demand for interfaces that help novice users carry out creative design. Such an interface should allow people to easily and intuitively express high-level design goals, such as 'create a fast airplane' or 'create a cute toy', while allowing the final product to be customized according to each person's preferences. Current interfaces require the design goal to be reached through careful planning and execution of a series of low-level drawing and editing commands -- which requires previsualization, dexterity and time -- or serendipitously through largely unstructured exploration. The gap between how a person thinks about what she wants to create, and how she can interact with a computer to get there, is a barrier for the novice. In this talk, I will present recent work on capturing design intent in high-level, linguistic terms. For example, the designer may want to make a virtual creature more 'scary', or a web page more 'artistic'. Such requirements are natural for humans, yet cannot be directly expressed in current interfaces. Our work combines crowdsourcing, machine learning and probabilistic shape analysis to create a design interface that directly supports such expression. The approach is data-driven: large repositories of existing designs are used to learn shared structure and semantics, and repurposed for synthesizing new designs. I will conclude with a discussion of directions, opportunities and challenges for new tools for high-level design that exploit the inter-relationship of semantics, function and form to aid the creative process.

Prof. Maneesh Agrawala, University of California, Berkeley
Storytelling Tools

Storytelling is essential for communicating ideas. When they are well told, stories help us make sense of information, appreciate cultural or societal differences, and imagine living in entirely different worlds. Audio/visual stories in the form of radio programs, books-on-tape, podcasts, television, movies and animations, are especially powerful because they provide a rich multisensory experience. Technological advances have made it easy to capture stories using the microphones and cameras that are readily available in our mobile devices. But the raw media rarely tells a compelling story. The best storytellers carefully compose, filter, edit and highlight the raw media to produce an engaging piece. Yet, the software tools they use to create and manipulate the raw audio/video media (e.g. Pro Tools, Photoshop, Premiere, Final Cut Pro, Maya etc.) force storytellers to work at a tediously low level: selecting, filtering and layering pixels or cutting and transitioning between audio/video frames. While these tools provide flexible and precise control over the look and sound of the final result, they are notoriously difficult to learn and accessible primarily to experts. In this talk I'll present a number of recent projects that aim to significantly reduce the effort required to edit and produce high-quality audio/visual stories.

Prof. Leonidas J. Guibas, Stanford University
The Functoriality of Data: Understanding Geometric Data Sets Jointly

The information contained across many data sets is often highly correlated. Such connections and correlations can arise because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy. This is particularly true for data sets of a geometric character, such as GPS traces, images, videos, 3D scans, 3D models, etc. We argue that when extracting knowledge from the data in a given data set, we can do significantly better if we exploit the wider context provided by all the relationships between this data set and a 'society' or 'social network' of other related data sets. We discuss mathematical and algorithmic issues on how to represent and compute relationships or mappings between data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps, small and large, between inter-related data. The network can act as a regularizer, allowing us to benefit from the 'wisdom of the collection' in performing operations on individual data sets or in map inference between them.

Dr. Nicolas Mellado, INRIA Bordeaux Sud-Ouest
Growing Least Squares for Surface Analysis and Editing

We present a generalization of Algebraic Point Set Surfaces for the analysis of point clouds in scale-space, called Growing Least Squares. We will show some work in progress using this technique to decompose and interactively edit multi-scale features on large point clouds. We will also show how algebraic sphere fitting can be used to develop enhanced shading effects in the real-time ray-tracer used in Modo.

Dr. Gael Guennebaud, INRIA Bordeaux Sud-Ouest
Non-Oriented Gradient Fields for Surface Reconstruction

We present recent advances in Moving Least Squares surfaces. In particular, we will show how to extend the concept of Algebraic Point Set Surfaces to point clouds with non-oriented input normals. Indeed, when line-of-sight information is lacking, computing a consistent orientation is as difficult as the surface reconstruction problem itself. We will also show applications of this new technique to image abstraction and stylization.

Dr. Pierre Alliez, Inria Sophia-Antipolis – Mediterranee
Anisotropic Surface Remeshing

Polygon surface meshes are preferred over triangle meshes in a number of applications related to geometric modeling and reverse engineering. Among those, anisotropic meshes are preferred over isotropic ones when seeking faithful surface approximation for a low number of elements. In this talk I will present an approach for anisotropic polygonal surface remeshing. Our algorithm takes as input a surface triangle mesh. An anisotropic rectangular metric is derived from a user-specified normal-based tolerance error and the requirement to favor rectangle-shaped polygons. Our algorithm uses a greedy refinement and relaxation procedure that adds, deletes and relocates generators so as to match two criteria related to partitioning and conformity. I will discuss several directions to generalize this metric and to consolidate it in order to optimize the complexity / distortion trade-off for effective shape approximation.
This is joint work with Bertrand Pellenard and Jean-Marie Morvan.

Dr. Florent Lafarge, Inria Sophia-Antipolis – Mediterranee
Surface Reconstruction through Point Set Structuring

We present a method for reconstructing surfaces from point sets. The main novelty lies in a structure-preserving approach where the input point set is first consolidated by structuring and resampling the planar components, before reconstructing the surface from both the consolidated components and the unstructured points. The final surface is obtained through solving a graph-cut problem formulated on the 3D Delaunay triangulation of the structured point set where the tetrahedra are labeled as inside or outside cells. Structuring facilitates the surface reconstruction as the point set is substantially reduced and the points are enriched with structural meaning related to adjacency between primitives. Our approach departs from the common dichotomy between smooth/piecewise-smooth and primitive-based representations by combining canonical parts from detected primitives and free-form parts of the inferred shape. Our experiments on a variety of inputs illustrate the potential of our approach in terms of robustness, flexibility and efficiency.
This is joint work with Pierre Alliez.

Dr. Sylvain Lefebvre, INRIA Nancy Grand-Est / Loria
Synthesizing Structured Content from Example

Synthesizing structured content from example is challenging, since blindly introducing randomness in the process may break precise alignments between features. In this talk I will describe two techniques for synthesizing new structured images from example, which do not require a high-level description of the content. The first approach targets synthesis of textures used in architectural scenes, such as facades, control panels, doors and windows. The second approach targets structured patterns synthesized along curves. Our approaches provide convincing results at little cost, and allow for compact storage of the results. This is especially important for applications with downloadable content.

Prof. Shi-Min Hu, Tsinghua University
High-quality Structure Information From Low-quality Scan Acquisition

3D scanning technology has developed very quickly: shapes with rich detail can now be captured well, and the resulting point clouds need to be converted to mesh models. Previous work mainly focuses on high-quality, small-scale input, e.g., the Bunny and Dragon models. Recent trends in point cloud processing are large-scale data (e.g., façades and factories) and the processing of lower-quality data from consumer-level devices (e.g., Microsoft Kinect). This talk will introduce two works on recovering high-quality structure information from low-quality scan acquisition: adaptive partitioning of urban facades and structure recovery by part assembly.