Joint Workshop on
Efficient Deep Learning in Computer Vision

June 15th, 2020
Seattle, Washington
in conjunction with CVPR 2020

Computer Vision has a long history of academic research, and recent advances in deep learning have provided significant improvements in the ability to understand visual content. As a result of these research advances on problems such as object classification, object detection, and image segmentation, there has been a rapid increase in the adoption of Computer Vision in industry; however, mainstream Computer Vision research has given little consideration to speed or computation time, and even less to constraints such as power/energy, memory footprint, and model size. The workshop has three main goals related to efficiency in Computer Vision:

First, the workshop aims to create a venue for considering the new generation of problems that arise as Computer Vision meets the constraints of mobile and AR/VR systems, and to bring together researchers, educators, and practitioners who are interested in the techniques and applications of compact, efficient neural network representations. The workshop discussions will establish close connections between researchers in the machine learning and computer vision communities and engineers in industry, benefiting both academic researchers and industrial practitioners.

Second, the workshop aims at reproducibility and comparability of methods for compact and efficient neural network representations and on-device machine learning. A set of benchmarking tasks (image classification, visual question answering) will therefore be provided together with defined data sets, so that the performance of neural network compression methods can be compared on the same networks. Submissions are encouraged (but not required) to use these tasks and data sets in their work, and contributors are encouraged to make their code available.

Third, the workshop aims to discuss the next steps in developing efficient feature representations along three dimensions: energy efficiency, label efficiency, and sample efficiency. Although DNNs are brain-inspired and can achieve or even surpass human-level performance on a variety of challenging computer vision tasks, they still trail human abilities in many respects, such as energy efficiency and the ability to perform low-shot learning (learning novel concepts from very few examples). Therefore, the next generation of feature representation and learning techniques should aim to tackle recognition tasks with significantly reduced computational complexity, using as little training data as people need, and to generalize to a range of tasks beyond the one the model was trained on.

Important Dates

Paper Submission Deadline: March 8, 2020 (PST)
Notification to authors: April 1, 2020 (PST)
Camera ready deadline: April 10, 2020 (PST)
Workshop: June 15, 2020 (Full Day)

Topics

  • Efficient Neural Networks and Architecture Search
    • Compact and efficient neural network architectures for mobile and AR/VR devices
    • Hardware-aware (latency, energy) neural architecture search targeted at mobile and AR/VR devices
    • Efficient architecture search algorithms for different vision tasks (detection, segmentation, etc.)
    • Optimization for latency, accuracy, and memory usage, as motivated by embedded devices
  • Neural Network Compression
    • Model compression (sparsification, binarization, quantization, pruning, thresholding, coding, etc.) for efficient inference with deep networks and other ML models (a minimal illustrative sketch follows this list)
    • Scalable compression techniques that can cope with large amounts of data and/or large neural networks (e.g., not requiring access to complete datasets for hyperparameter tuning and/or retraining)
    • Hashing (Binary) Codes Learning
  • Low-bit Quantization Networks and Hardware Accelerators
    • Investigations into the processor architectures (CPU vs GPU vs DSP) that best support mobile applications
    • Hardware accelerators to support Computer Vision on mobile and AR/VR platforms
    • Low-precision training/inference & acceleration of deep neural networks on mobile devices
  • Datasets and Benchmarks
    • Open datasets and test environments for benchmarking inference with efficient DNN representations
    • Metrics for evaluating the performance of efficient DNN representations
    • Methods for comparing efficient DNN inference across platforms and tasks
  • Label-, Sample-, and Feature-Efficient Learning
    • Label-Efficient Feature Representation Learning Methods, e.g., Unsupervised Learning, Domain Adaptation, Weakly Supervised Learning, and Self-Supervised Learning Approaches
    • Sample-Efficient Feature Learning Methods, e.g., Meta-Learning
    • Low-Shot Learning Techniques
    • New Applications, e.g., the Medical Domain
  • Mobile and AR/VR Applications
    • Novel mobile and AR/VR applications using Computer Vision such as image processing (e.g. style transfer, body tracking, face tracking) and augmented reality
    • Learning efficient deep neural networks under memory and computation constraints for on-device applications
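The workshop does not prescribe any particular toolchain, but purely as an illustration of the kind of model compression listed above, the following minimal sketch applies PyTorch's off-the-shelf post-training dynamic quantization to a toy model; the model and layer sizes are arbitrary placeholders, not a workshop baseline or benchmark network.

    # Illustrative sketch only: post-training dynamic quantization in PyTorch.
    # The toy model below is an arbitrary stand-in, not a workshop benchmark.
    import torch
    import torch.nn as nn

    # A small placeholder classifier; any Linear-heavy model works similarly.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(32 * 32 * 3, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    ).eval()

    # Quantize the weights of all Linear layers to 8-bit integers for inference.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    # Storage used by the original float32 weights (the quantized copy keeps
    # the Linear weights as packed int8 tensors instead).
    fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    print(f"float32 parameter storage: {fp32_bytes / 1e6:.2f} MB")

    # The quantized model runs the same forward pass using int8 weight kernels.
    x = torch.randn(1, 3, 32, 32)
    print("output shape:", quantized(x).shape)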

Keynote Speakers

Title: TBD

Biography: Philip Torr did his PhD (DPhil) at the Robotics Research Group of the University of Oxford under Professor David Murray of the Active Vision Group. He worked for another three years at Oxford as a research fellow and still maintains close contact as a visiting fellow there. He left Oxford to work for six years as a research scientist for Microsoft Research, first in Redmond, USA, in the Vision Technology Group, then in Cambridge, UK, founding the vision side of the Machine Learning and Perception group. He then became a Professor in Computer Vision and Machine Learning at Oxford Brookes University, where he brought in over one million pounds in grants as PI. In 2013 he returned to Oxford as a full professor, where he established the Torr Vision Group and has brought in over five million pounds of funding. Philip Torr has won several awards, including the Marr Prize (the highest honour in vision) in 1998. He is a Royal Society Wolfson Research Merit Award holder. More recently he received the best science paper award at BMVC 2010 and at ECCV 2010. He was involved in the algorithm design for Boujou, released by 2D3. Boujou has won a clutch of industry awards, including the Computer Graphics World Innovation Award, the IABM Peter Wayne Award, the CATS Award for Innovation, and a technical Emmy. He then worked closely with this Oxford-based company as well as other companies such as Sony on the Wonderbook project. He is a director of the new Oxford-based spin-out OxSight.
Title: TBD

Biography: Nic Lane received his Ph.D. degree in 2011 from Dartmouth College, Hanover, New Hampshire. He is an associate professor in the Computer Science Department at the University of Oxford, United Kingdom. He is an experimentalist and likes to build prototype next-generation wearable and embedded sensing devices based on well-founded computational models. His work has received multiple best paper awards, including one from the ACM/IEEE Conference on Information Processing in Sensor Networks 2017 and two from ACM UbiComp, in 2012 and 2015 respectively. (Biography based on a document published on 5 September 2018.)
Title: TBD

Biography: Diana Marculescu received a degree in computer science from the Politehnica University of Bucharest, Romania, in 1991, and a Ph.D. in computer engineering from the University of Southern California in 1998. From 2014 to 2018 she served as Associate Department Head for Academic Affairs in Electrical and Computer Engineering, and from 2015 to 2019 she was the founding director of the College of Engineering Center for Faculty Success. In 2019, the Cockrell School of Engineering at the University of Texas at Austin named her chair of Electrical and Computer Engineering.
Title: TBD

Biography: Song Han is an assistant professor at MIT EECS. Dr. Han received his Ph.D. in Electrical Engineering from Stanford University, advised by Prof. Bill Dally. Dr. Han's research focuses on efficient deep learning computing. He proposed "Deep Compression" and the "Efficient Inference Engine," which have had a significant impact on industry. His work received best paper awards at ICLR'16 and FPGA'17. He is the co-founder and chief scientist of DeePhi Tech (a leading provider of efficient deep learning solutions), which was acquired by Xilinx. His pruning, compression, and acceleration techniques have been integrated into products.
Title: TBD

Biography: Bill Dally develops efficient hardware for demanding information processing problems and sustainable energy systems. His current projects include domain-specific accelerators for deep learning, bioinformatics, and SAT solving; redesigning memory systems for the data center; developing efficient methods for video perception; and developing efficient sustainable energy systems. His research involves demonstrating novel concepts with working systems. Previous systems include the MARS Hardware Accelerator, the Torus Routing Chip, the J-Machine, M-Machine, the Reliable Router, the Imagine signal and image processor, the Merrimac supercomputer, and the ELM embedded processor. His work on stream processing led to GPU computing. His group has pioneered techniques including fast capability-based addressing, processor coupling, virtual channel flow control, wormhole routing, link-level retry, message-driven processing, deadlock-free routing, pruning neural networks, and quantizing neural networks.
Title: TBD

Biography: Chelsea Finn completed her Ph.D. in computer science at UC Berkeley and her B.S. in electrical engineering and computer science at MIT. She is a research scientist at Google Brain, a post-doc at the Berkeley AI Research Lab (BAIR), and an acting assistant professor at Stanford, joining the Stanford Computer Science faculty full time in Fall 2019. She is interested in how algorithms can enable machines to acquire more general notions of intelligence through learning and interaction, allowing them to autonomously learn a variety of complex sensorimotor skills in real-world settings. This includes learning deep representations of complex skills from raw sensory inputs, enabling machines to learn through interaction without human supervision, and allowing systems to build upon what they have learned previously to acquire new capabilities from small amounts of experience.

Program (Tentative)

(Location: TBD)
Time Event
8:50 - 9:00 Welcome by organizers
9:00 - 9:30 Invited talk: Prof. Philip Torr (Oxford University)
9:30 - 10:00 Invited talk: Prof. Nic Lane (Oxford University)
10:00 - 10:30 Coffee break
10:30 - 11:00 Oral Session 1 (3 presentations: 10 min each)
11:00 - 11:30 Keynote talk: Prof. Song Han (MIT)
11:30 - 12:00 Invited talk: Prof. Diana Marculescu (CMU)
12:00 - 12:30 Oral Session 2 (3 presentations: 10 min each)
12:30 - 13:30 Lunch break
13:30 - 14:00 Keynote talk: Prof. Bill Dally (Stanford)
14:00 - 14:30 Invited talk: Prof. Chelsea Finn (Stanford)
14:30 - 15:00 Oral Session 3 (3 presentations: 10 min each)
15:00 - 16:00 Poster session for submitted papers
16:00 - 17:30 Panel presentations and discussion on efficient deep learning algorithms. Moderator: Luc Van Gool
17:30 - 17:45 Closing remarks and awards for best paper and best poster

Accepted Papers (TBD)

Awards

EDLCV 2020 will announce one Best Paper Award and one Best Paper Honorable Mention Award, fully sponsored by the Inception Institute of Artificial Intelligence:

※ EDLCV 2020 Best Paper Award ($1,500)

※ EDLCV 2020 Best Paper Honorable Mention Award ($500)

Submission

All submissions will be handled electronically via the workshop’s CMT Website. Click the following link to go to the submission site: https://cmt3.research.microsoft.com/EDLCV2020/

Papers should describe original, unpublished work on the topics listed above. Each paper will receive double-blind review, moderated by the workshop chairs. Authors should take the following into account:

  • All papers must be written and presented in English.
  • All papers must be submitted in PDF format. The workshop paper format guidelines are the same as for the main conference papers.
  • The maximum paper length is 8 pages (excluding references). Note that shorter submissions are also welcome.
  • Accepted papers will be published in the CVF open access archive as well as in IEEE Xplore.

Organizers

Main Contacts

If you have questions, please contact:

  • Dr. Li Liu: li.liu@oulu.fi

  • Dr. Peter Vajda: vajdap@fb.com

  • Dr. Werner Bailer: werner.bailer@joanneum.at
