Invited Talks

 


  • Looking at People: David A. Forsyth, University of Illinois at Urbana-Champaign
  • Vision-Based Graphics: Luiz Velho, IMPA, RJ
  • Google Maps Street View: Overview & Computer Vision Challenges: German K. M. Cheung, Google, Inc., Mountain View, CA

    Looking at People

    David A. Forsyth

    There is a great need for programs that can describe what people are doing from video. This is difficult to do, because it is hard to identify and track people in video sequences, because we have no canonical vocabulary for describing what people are doing, and because phenomena such as aspect and individual variation greatly affect the appearance of what people are doing. Recent work in kinematic tracking has produced methods that can report the kinematic configuration of the body fairly accurately and fully automatically.

    The problem of vocabulary is more difficult. I will discuss a generative activity model that allows activities to be assembled from a set of distinct spatial and temporal components. The models themselves are learned from labelled motion capture data and are assembled in a way that makes it possible to learn very complex finite automata without estimating large numbers of parameters. The advantage of such a model is that one can search videos for examples of activities specified with a simple query language, without possessing any example of the activity sought. In this case, aspect is dealt with by explicit 3D reasoning.

    An alternative strategy for dealing with aspect and individual variation is to build discriminative methods applied to appearance features. The difficulty here is that activities look different when seen from different directions. I will describe recent methods that make it possible to transfer models: that is, to learn a model of an activity from one view, then recognize it in a completely different view.

    Biography:
    David Forsyth is currently a full professor at U. Illinois at Urbana-Champaign, to which he recently moved from U.C. Berkeley, where he was also a full professor. He has published two books and over 140 papers on computer vision, computer graphics and machine learning. He has served as program co-chair for IEEE Computer Vision and Pattern Recognition (CVPR) in 2000 and 2011, general co-chair for CVPR 2006, and program co-chair for the European Conference on Computer Vision 2008, and is a regular member of the program committee of all major international conferences on computer vision. He has served five years on the SIGGRAPH program committee. He has received best paper awards at the International Conference on Computer Vision and at the European Conference on Computer Vision, and received an IEEE Technical Achievement Award in 2005 for his research. He became an IEEE Fellow in 2009. His recent textbook, "Computer Vision: A Modern Approach" (joint with J. Ponce and published by Prentice Hall), is now widely adopted as a course text (adoptions include MIT, U. Wisconsin-Madison, UIUC, Georgia Tech and U.C. Berkeley) and appears in four languages.

    Vision-Based Graphics

    Luiz Velho

    In this talk I will focus on the non-trivial relations between the areas of Computer Vision and Graphics. This multidisciplinary trend is very rich and leads to powerful approaches for solving challenging problems in interactive media and other applications. Some of the methodologies involved are known as "Image-Based Modeling and Rendering". Exciting results are already appearing at conferences in these areas, such as ACM SIGGRAPH, ICCV, and others.

    Biography:
    Luiz Velho is a Full Researcher / Professor at IMPA - Instituto de Matematica Pura e Aplicada of CNPq, and the leading scientist of VISGRAF Laboratory. He received a BE in Industrial Design from ESDI / UERJ in 1979, an MS in Computer Graphics from the MIT / Media Lab in 1985, and a Ph.D. in Computer Science in 1994 from the University of Toronto under the Graphics and Vision groups. His experience in computer graphics spans the fields of modeling, rendering, imaging and animation. In 1982 he was a visiting researcher at the National Film Board of Canada. From 1985 to 1987 he was a Systems Engineer at the Fantastic Animation Machine in New York, where he developed the company's 3D visualization system. From 1987 to 1991 he was a Principal Engineer at Globo TV Network in Brazil, where he created special effects and visual simulation systems. In 1994 he was a visiting professor at the Courant Institute of Mathematical Sciences of New York University. He was also a visiting scientist at the HP Laboratories in 1995 and at Microsoft Research China in 2002. He has published extensively in conferences and journals in the field. He is the author of several books and has taught many courses on graphics-related topics. He is a member of the editorial board of various technical publications, and was the guest editor of the Special Issue on Computer Graphics of JBCS and of Computers & Graphics. He has also served on numerous conference program committees. His awards include the "Ordem Nacional do Merito Cientifico", an Honors Prize in the II Compaq Award for Computer Science, and Prizes for Best Technical Videos and Best Papers at SIBGRAPI. In 1996 he was the Program Chair of the IX SIBGRAPI. In 1999 he was distinguished as the first researcher in South America to be on the SIGGRAPH Papers Committee, on which he also served in 2000, 2002 and 2003. He was a member of the Eurographics IPC in 2008. He received the prestigious grant award "Cientista do Nosso Estado" from FAPERJ in 2004, 2007 and 2009. He has been a Keynote Speaker at several conferences, including SGP 2005, CNMAC 2006, the SBPC Congress 2006, SIBGRAPI 2007, ISMM 2007, and WVC 2010.

    Google Maps Street View: Overview & Computer Vision Challenges

    German K. M. Cheung

    Unveiled in May 2007, the Street View feature of Google Maps is the result of a substantial engineering effort by a team including software engineers, mechanical engineers, UI designers, computer vision scientists, operations experts, and scores of others. The initial vision for Street View was provided by Google co-founder Larry Page, who personally collected street scene videos from his moving car in order to bootstrap research in this area. Turning this initial vision into a product required developing major new pieces of technology, including robust data collection platforms (vans, cars, tricycles, snowmobiles, etc.), systems for computing accurate pose from imperfect sensors, various software components to stitch, blend, color correct and warp collected imagery, a number of systems to address privacy issues, and a lot more. In the first part of this talk I will give an overview and brief history of the Street View project, and highlight some of the unique computer vision challenges the engineering team is addressing. In the second part, I will focus on our privacy pipeline and our efforts to detect and blur identifiable faces and license plates in the Street View images.

    Biography:
    German Cheung is currently a software engineer at Google Inc., which he joined in 2006. He worked on the Google Picasa project before becoming part of the Street View team, with emphasis on the computer vision part of the privacy pipeline for the captured images. Prior to joining Google, he was a senior research engineer at Neven Vision, focusing on 3D face modeling, detection and recognition. Dr. Cheung received a B.Eng. (EEE) degree from The University of Hong Kong, an M.Phil. (also in EEE) from Hong Kong University of Science and Technology, and a Ph.D. in Robotics from the Robotics Institute of Carnegie Mellon University in 2003. He was a postdoc at Carnegie Mellon University, managing and conducting research in both the Virtualized Reality™ and the ASIMO Labs, before leaving in 2004. Dr. Cheung has published articles in the fields of signal processing, vision, graphics and robotics, and has taught a short course at SIGGRAPH. His major interests are object detection/recognition, multiple-view 3D reconstruction, vision-based motion capture and vision for graphics applications.
