A journal of IEEE and the CAA, publishing high-quality papers in English on original theoretical and experimental research and development in all areas of automation
Volume 9, Issue 5, May 2022

IEEE/CAA Journal of Automatica Sinica

  • JCR Impact Factor: 11.8, Top 4% (SCI Q1)
  • CiteScore: 17.6, Top 3% (Q1)
  • Google Scholar h5-index: 77, Top 5
Citation: P. W. Liang, J. J. Jiang, X. M. Liu, and J. Y. Ma, “BaMBNet: A blur-aware multi-branch network for dual-pixel defocus deblurring,” IEEE/CAA J. Autom. Sinica, vol. 9, no. 5, pp. 878–892, May 2022. doi: 10.1109/JAS.2022.105563

BaMBNet: A Blur-Aware Multi-Branch Network for Dual-Pixel Defocus Deblurring

doi: 10.1109/JAS.2022.105563
Funds: The research was supported in part by the National Natural Science Foundation of China (61971165, 61922027, 61773295), in part by the Fundamental Research Funds for the Central Universities (FRFCU5710050119), in part by the Natural Science Foundation of Heilongjiang Province (YQ2020F004), and in part by the Chinese Association for Artificial Intelligence (CAAI)-Huawei MindSpore Open Fund
  • Reducing the defocus blur that arises from the finite aperture size and short exposure time is an essential problem in computational photography. It is very challenging because the blur kernel is spatially varying and difficult to estimate with traditional methods. Owing to their breakthroughs in low-level vision tasks, convolutional neural networks (CNNs) have been introduced to the defocus deblurring problem and have achieved significant progress. However, previous methods apply the same learned kernel to different regions of a defocus-blurred image, which makes it difficult to handle nonuniformly blurred images. To this end, this study designs a novel blur-aware multi-branch network (BaMBNet), in which different regions are treated differently. In particular, we estimate the blur amounts of different regions via the internal geometric constraint of the dual-pixel (DP) data, which measures the defocus disparity between the left and right views. Based on the assumption that image regions with different blur amounts pose different deblurring difficulties, we leverage networks with different capacities to treat different image regions. Moreover, we introduce a meta-learning defocus mask generation algorithm to assign each pixel to a proper branch. In this way, we can expect to maintain the information of the clear regions well while recovering the missing details of the blurred regions. Both quantitative and qualitative experiments demonstrate that our BaMBNet outperforms the state-of-the-art (SOTA) methods. On the dual-pixel defocus deblurring (DPD)-blur dataset, the proposed BaMBNet achieves a 1.20 dB gain over the previous SOTA method in terms of peak signal-to-noise ratio (PSNR) and reduces learnable parameters by 85%. The code and dataset are available at https://github.com/junjun-jiang/BaMBNet.
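    To make the branch-routing idea above concrete, the following is a minimal PyTorch sketch of blur-aware multi-branch fusion. The module names, branch widths, three-branch split, and soft-mask weighting are illustrative assumptions rather than the authors' implementation; the linked repository contains the actual code.

        import torch
        import torch.nn as nn

        class Branch(nn.Module):
            """One restoration branch; `width` controls its capacity."""
            def __init__(self, in_ch=6, width=16, out_ch=3):
                super().__init__()
                self.body = nn.Sequential(
                    nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
                    nn.Conv2d(width, out_ch, 3, padding=1),
                )

            def forward(self, x):
                return self.body(x)

        class MultiBranchDeblur(nn.Module):
            """Route pixels to branches of different capacity via defocus masks."""
            def __init__(self, widths=(16, 32, 64)):
                super().__init__()
                self.branches = nn.ModuleList(Branch(width=w) for w in widths)

            def forward(self, dp_pair, masks):
                # dp_pair: (B, 6, H, W) concatenated left/right dual-pixel views
                # masks:   (B, K, H, W) per-pixel branch assignment (sums to 1 over K),
                #          e.g., derived from an estimated COC map
                outs = torch.stack([b(dp_pair) for b in self.branches], dim=1)  # (B, K, 3, H, W)
                return (outs * masks.unsqueeze(2)).sum(dim=1)                   # (B, 3, H, W)

        # Toy usage: a random DP pair and soft masks over three branches
        net = MultiBranchDeblur()
        dp = torch.randn(1, 6, 64, 64)
        masks = torch.softmax(torch.randn(1, 3, 64, 64), dim=1)
        sharp = net(dp, masks)  # (1, 3, 64, 64)

    Soft masks keep every branch differentiable during training; at inference they could be hardened so that nearly sharp pixels take the cheapest branch, which is consistent with the parameter savings the abstract reports.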

     




    Highlights

    • Instead of handling the entire image with a single network, BaMBNet assigns regions with different blur amounts to multiple branches with different capacities, which maintains the information of the clear regions while recovering the missing details of the blurred regions
    • We devise an unsupervised learning-based method to estimate the blur amounts of a DP image, i.e., the circle-of-confusion (COC) map
    • We apply a novel assignment strategy to the estimated COC map to generate the defocus masks, which can effectively and efficiently guide the optimization of the multi-branch network (a toy illustration follows this list)
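    As a toy illustration of the last highlight (in the paper the assignment is learned by the meta-learning algorithm; the hard thresholds and bin edges below are hypothetical), an estimated COC map can be binned into one defocus mask per branch:

        import torch

        def coc_to_masks(coc, bounds=(0.5, 1.5)):
            # coc: (B, 1, H, W) estimated per-pixel blur amount (signed COC radius)
            # bounds: hypothetical bin edges separating small/medium/large blur
            mag = coc.abs()
            m0 = (mag <= bounds[0]).float()                        # nearly sharp
            m1 = ((mag > bounds[0]) & (mag <= bounds[1])).float()  # moderate blur
            m2 = (mag > bounds[1]).float()                         # heavy blur
            return torch.cat([m0, m1, m2], dim=1)  # (B, 3, H, W), one-hot over branches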
