Sandwiched Image Compression: Wrapping Neural Networks Around a Standard Codec

We propose sandwiching standard image and video codecs between pre- and post-processing neural networks. The networks are jointly trained through a differentiable codec proxy to minimize a given rate-distortion loss. This sandwich architecture not only improves the standard codec's performance on its intended content, it can also adapt the codec to other types of image/video content and to other distortion measures. Essentially, the sandwich learns to transmit "neural code images" that optimize overall rate-distortion performance even when the overall problem is well outside the scope of the codec's design. Through a variety of examples, we apply the sandwich architecture to sources with different numbers of channels, higher resolution, higher dynamic range, and perceptual distortion measures. The results demonstrate substantial improvements (up to 9 dB gains or up to 30% bitrate reductions) compared to alternative adaptations. We derive VQ equivalents for the sandwich, establish optimality properties, and design differentiable codec proxies approximating current standard codecs. We further analyze model complexity, visual quality under perceptual metrics, and sandwich configurations that open up interesting possibilities for image/video compression and streaming.
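As a rough illustration of the training objective described above, the following toy sketch chains a pre-processor, a codec proxy, and a post-processor and evaluates a rate-distortion loss D + lambda * R. Everything in it is an assumption made for illustration and is not taken from the papers: the "networks" are single scalar gains, the proxy models quantization as additive uniform noise during training, the rate estimate is a simple log-magnitude formula, and all function and parameter names (`preprocess`, `codec_proxy`, `postprocess`, `rd_loss`, `pre_gain`, `post_gain`, `step`, `lam`) are hypothetical.

```python
import math
import random


def preprocess(x, pre_gain):
    """Hypothetical stand-in for the pre-processing network: one scalar gain."""
    return [pre_gain * v for v in x]


def codec_proxy(code, step, rng=None):
    """Toy stand-in for a standard codec (an assumption, not the paper's proxy).

    If rng is given (training mode), quantization is modeled as additive
    uniform noise in [-step/2, step/2], which keeps the mapping smooth.
    If rng is None (inference mode), hard rounding to multiples of step is used.
    Returns (reconstruction, estimated total bits).
    """
    if rng is not None:
        recon = [v + rng.uniform(-0.5, 0.5) * step for v in code]
    else:
        recon = [round(v / step) * step for v in code]
    # Assumed smooth rate estimate: log2(1 + |v| / step) bits per sample.
    bits = sum(math.log2(1.0 + abs(v) / step) for v in code)
    return recon, bits


def postprocess(recon, post_gain):
    """Hypothetical stand-in for the post-processing network: one scalar gain."""
    return [post_gain * v for v in recon]


def rd_loss(x, pre_gain, post_gain, step, lam, rng=None):
    """Return (loss, distortion, rate) with loss = MSE + lam * bits per sample."""
    code = preprocess(x, pre_gain)
    recon, bits = codec_proxy(code, step, rng)
    x_hat = postprocess(recon, post_gain)
    distortion = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    rate = bits / len(x)
    return distortion + lam * rate, distortion, rate


if __name__ == "__main__":
    signal = [0.2, 1.7, -3.4]
    # Inference-mode comparison of two hand-picked gain settings.
    print(rd_loss(signal, 1.0, 1.0, step=1.0, lam=0.1))
    print(rd_loss(signal, 4.0, 0.25, step=1.0, lam=0.1))
    # Training-mode evaluation with the noise-based quantization model.
    print(rd_loss(signal, 4.0, 0.25, step=1.0, lam=0.1, rng=random.Random(0)))
```

In this toy, raising the pre-processor gain lowers distortion but raises the estimated rate, which is the trade-off the loss weight `lam` arbitrates. A real sandwich would replace the scalar gains with neural networks and minimize the same kind of loss by gradient descent through the proxy.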

Publications

Sandwiched Image Compression: Increasing the Resolution and Dynamic Range of Standard Codecs (Best Paper Finalist)

2022 Picture Coding Symposium (PCS), 2022.
Keywords: deep learning, image compression, nonlinear transform coding, high dynamic range, super-resolution, interactive perception

Sandwiched Image Compression: Wrapping Neural Networks Around a Standard Codec

2021 IEEE International Conference on Image Processing (ICIP), 2021.
Keywords: deep learning, image compression, interactive perception

Cited By

  • Machine Learning for Multimedia Communications. Sensors. Nikolaos Thomos, Thomas Maugey, and Laura Toni.
  • Differentiable Bit-rate Estimation for Neural-based Video Codec Enhancement. 2022 Picture Coding Symposium (PCS). Amir Said, Manish Kumar Singh, and Reza Pourreza.
  • Learnt Deep Hyperparameter Selection in Adversarial Training for Compressed Video Enhancement With a Perceptual Critic. 2023 IEEE International Conference on Image Processing (ICIP). Darren Ramsook and Anil Kokaram.
  • Sandwiched Video Compression: Efficiently Extending the Reach of Standard Codecs With Neural Wrappers. arXiv:2303.11473. Berivan Isik, Onur Guleryuz, Danhang Tang, Jonathan Taylor, and Philip Chou.
  • VVC+M: Plug and Play Scalable Image Coding for Humans and Machines. arXiv:2305.10453. Alon Harell, Yalda Foroutan, and Ivan V. Bajic.
  • Visual Coding for Machines. Simon Fraser University. Bardia Azizian.