IEEE/CAA Journal of Automatica Sinica
Citation: L. F. Tang, Y. X. Deng, Y. Ma, J. Huang, and J. Y. Ma, "SuperFusion: A versatile image registration and fusion network with semantic awareness," IEEE/CAA J. Autom. Sinica, vol. 9, no. 12, pp. 2121–2137, Dec. 2022. doi: 10.1109/JAS.2022.106082
[1] X. Zhang, "Deep learning-based multi-focus image fusion: A survey and a comparative study," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 9, pp. 4819–4838, 2022.
[2] J. Ma, Y. Ma, and C. Li, "Infrared and visible image fusion methods and applications: A survey," Inf. Fusion, vol. 45, pp. 153–178, 2019. doi: 10.1016/j.inffus.2018.02.004
[3] X. Zhang, P. Ye, and G. Xiao, "VIFB: A visible and infrared image fusion benchmark," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, 2020, pp. 104–105.
[4] H. Zhang, H. Xu, X. Tian, J. Jiang, and J. Ma, "Image fusion meets deep learning: A survey and perspective," Inf. Fusion, vol. 76, pp. 323–336, 2021. doi: 10.1016/j.inffus.2021.06.008
[5] L. Tang, H. Zhang, H. Xu, and J. Ma, "Deep learning-based image fusion: A survey," J. Image Graph., 2022.
[6] Q. Ha, K. Watanabe, T. Karasawa, Y. Ushiku, and T. Harada, "MFNet: Towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes," in Proc. IEEE Int. Conf. Intell. Rob. Syst., 2017, pp. 5108–5115.
[7] X. Zhang, P. Ye, H. Leung, K. Gong, and G. Xiao, "Object fusion tracking based on visible and infrared images: A comprehensive review," Inf. Fusion, vol. 63, pp. 166–187, 2020. doi: 10.1016/j.inffus.2020.05.002
[8] M. Yin, J. Pang, Y. Wei, and P. Duan, "Image fusion algorithm based on nonsubsampled dual-tree complex contourlet transform and compressive sensing pulse coupled neural network," J. Comput. Aided Des. Comput. Graph., no. 3, pp. 411–419, 2016.
[9] J. Chen, X. Li, L. Luo, X. Mei, and J. Ma, "Infrared and visible image fusion based on target-enhanced multiscale transform decomposition," Inf. Sci., vol. 508, pp. 64–78, 2020. doi: 10.1016/j.ins.2019.08.066
[10] H. Li, X.-J. Wu, and J. Kittler, "MDLatLRR: A novel decomposition method for infrared and visible image fusion," IEEE Trans. Image Process., vol. 29, pp. 4733–4746, 2020. doi: 10.1109/TIP.2020.2975984
[11] Y. Liu, X. Chen, R. K. Ward, and Z. J. Wang, "Image fusion with convolutional sparse representation," IEEE Signal Process. Lett., vol. 23, no. 12, pp. 1882–1886, 2016. doi: 10.1109/LSP.2016.2618776
[12] Z. Fu, X. Wang, J. Xu, N. Zhou, and Y. Zhao, "Infrared and visible images fusion based on RPCA and NSCT," Infrared Phys. Technol., vol. 77, pp. 114–123, 2016. doi: 10.1016/j.infrared.2016.05.012
[13] J. Ma, Z. Zhou, B. Wang, and H. Zong, "Infrared and visible image fusion based on visual saliency map and weighted least square optimization," Infrared Phys. Technol., vol. 82, pp. 8–17, 2017. doi: 10.1016/j.infrared.2017.02.005
[14] J. Ma, C. Chen, C. Li, and J. Huang, "Infrared and visible image fusion via gradient transfer and total variation minimization," Inf. Fusion, vol. 31, pp. 100–109, 2016. doi: 10.1016/j.inffus.2016.02.001
[15] W. Zhao, H. Lu, and D. Wang, "Multisensor image fusion and enhancement in spectral total variation domain," IEEE Trans. Multimed., vol. 20, no. 4, pp. 866–879, 2017.
[16] H. Li and X.-J. Wu, "DenseFuse: A fusion approach to infrared and visible images," IEEE Trans. Image Process., vol. 28, no. 5, pp. 2614–2623, 2019. doi: 10.1109/TIP.2018.2887342
[17] H. Xu, H. Zhang, and J. Ma, "Classification saliency-based rule for visible and infrared image fusion," IEEE Trans. Comput. Imaging, vol. 7, pp. 824–836, 2021. doi: 10.1109/TCI.2021.3100986
[18] J. Liu, X. Fan, J. Jiang, R. Liu, and Z. Luo, "Learning a deep multi-scale feature ensemble and an edge-attention guidance for image fusion," IEEE Trans. Circuits Syst. Video Technol., vol. 32, no. 1, pp. 105–119, 2022. doi: 10.1109/TCSVT.2021.3056725
[19] F. Zhao, W. Zhao, L. Yao, and Y. Liu, "Self-supervised feature adaption for infrared and visible image fusion," Inf. Fusion, vol. 76, pp. 189–203, 2021. doi: 10.1016/j.inffus.2021.06.002
[20] J. Ma, L. Tang, M. Xu, H. Zhang, and G. Xiao, "STDFusionNet: An infrared and visible image fusion network based on salient target detection," IEEE Trans. Instrum. Meas., vol. 70, pp. 1–13, 2021.
[21] Y. Liu, Y. Shi, F. Mu, J. Cheng, C. Li, and X. Chen, "Multimodal MRI volumetric data fusion with convolutional neural networks," IEEE Trans. Instrum. Meas., vol. 71, pp. 1–15, 2022.
[22] L. Tang, J. Yuan, and J. Ma, "Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network," Inf. Fusion, vol. 82, pp. 28–42, 2022. doi: 10.1016/j.inffus.2021.12.004
[23] J. Ma, W. Yu, P. Liang, C. Li, and J. Jiang, "FusionGAN: A generative adversarial network for infrared and visible image fusion," Inf. Fusion, vol. 48, pp. 11–26, 2019. doi: 10.1016/j.inffus.2018.09.004
[24] J. Ma, H. Zhang, Z. Shao, P. Liang, and H. Xu, "GANMcC: A generative adversarial network with multiclassification constraints for infrared and visible image fusion," IEEE Trans. Instrum. Meas., vol. 70, pp. 1–14, 2021.
[25] J. Liu, X. Fan, Z. Huang, G. Wu, R. Liu, W. Zhong, and Z. Luo, "Target-aware dual adversarial learning and a multi-scenario multi-modality benchmark to fuse infrared and visible for object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2022, pp. 5802–5811.
[26] Y. Yang, J. Liu, S. Huang, W. Wan, W. Wen, and J. Guan, "Infrared and visible image fusion via texture conditional generative adversarial network," IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 12, pp. 4771–4783, 2021. doi: 10.1109/TCSVT.2021.3054584
[27] J. Ma, L. Tang, F. Fan, J. Huang, X. Mei, and Y. Ma, "SwinFusion: Cross-domain long-range learning for general image fusion via Swin transformer," IEEE/CAA J. Autom. Sinica, vol. 9, no. 7, pp. 1200–1217, 2022. doi: 10.1109/JAS.2022.105686
[28] W. Tang, F. He, and Y. Liu, "YDTR: Infrared and visible image fusion via Y-shape dynamic transformer," IEEE Trans. Multimed., 2022.
[29] J. Li, J. Zhu, C. Li, X. Chen, and B. Yang, "CGTF: Convolution-guided transformer for infrared and visible image fusion," IEEE Trans. Instrum. Meas., vol. 71, p. 5012314, 2022.
[30] D. Wang, J. Liu, X. Fan, and R. Liu, "Unsupervised misaligned infrared and visible image fusion via cross-modality image generation and registration," in Proc. Int. Joint Conf. Artif. Intell., 2022.
[31] H. Li, X.-J. Wu, and T. Durrani, "NestFuse: An infrared and visible image fusion architecture based on nest connection and spatial/channel attention models," IEEE Trans. Instrum. Meas., vol. 69, no. 12, pp. 9645–9656, 2020. doi: 10.1109/TIM.2020.3005230
[32] H. Li, X.-J. Wu, and J. Kittler, "RFN-Nest: An end-to-end residual fusion network for infrared and visible images," Inf. Fusion, vol. 73, pp. 72–86, 2021. doi: 10.1016/j.inffus.2021.02.023
[33] H. Xu, X. Wang, and J. Ma, "DRF: Disentangled representation for visible and infrared image fusion," IEEE Trans. Instrum. Meas., vol. 70, p. 5006713, 2021.
[34] M. Xu, L. Tang, H. Zhang, and J. Ma, "Infrared and visible image fusion via parallel scene and texture learning," Pattern Recognit., vol. 132, p. 108929, 2022. doi: 10.1016/j.patcog.2022.108929
[35] Z. Zhao, S. Xu, C. Zhang, J. Liu, J. Zhang, and P. Li, "DIDFuse: Deep image decomposition for infrared and visible image fusion," in Proc. Int. Joint Conf. Artif. Intell., 2020, pp. 970–976.
[36] F. Zhao and W. Zhao, "Learning specific and general realm feature representations for image fusion," IEEE Trans. Multimed., vol. 23, pp. 2745–2756, 2020.
[37] L. Tang, J. Yuan, H. Zhang, X. Jiang, and J. Ma, "PIAFusion: A progressive infrared and visible image fusion network based on illumination aware," Inf. Fusion, vol. 83, pp. 79–92, 2022.
[38] Y. Long, H. Jia, Y. Zhong, Y. Jiang, and Y. Jia, "RXDNFuse: A aggregated residual dense network for infrared and visible image fusion," Inf. Fusion, vol. 69, pp. 128–141, 2021. doi: 10.1016/j.inffus.2020.11.009
[39] R. Liu, Z. Liu, J. Liu, and X. Fan, "Searching a hierarchically aggregated fusion architecture for fast multi-modality image fusion," in Proc. ACM Int. Conf. Multimed., 2021, pp. 1600–1608.
[40] H. Zhang, H. Xu, Y. Xiao, X. Guo, and J. Ma, "Rethinking the image fusion: A fast unified image fusion network based on proportional maintenance of gradient and intensity," in Proc. AAAI Conf. Artif. Intell., 2020, pp. 12797–12804.
[41] H. Zhang and J. Ma, "SDNet: A versatile squeeze-and-decomposition network for real-time image fusion," Int. J. Comput. Vis., vol. 129, no. 10, pp. 2761–2785, 2021. doi: 10.1007/s11263-021-01501-8
[42] H. Xu, J. Ma, Z. Le, J. Jiang, and X. Guo, "FusionDN: A unified densely connected network for image fusion," in Proc. AAAI Conf. Artif. Intell., 2020, pp. 12484–12491.
[43] H. Xu, J. Ma, J. Jiang, X. Guo, and H. Ling, "U2Fusion: A unified unsupervised image fusion network," IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 1, pp. 502–518, 2022. doi: 10.1109/TPAMI.2020.3012548
[44] Y. Zhang, Y. Liu, P. Sun, H. Yan, X. Zhao, and L. Zhang, "IFCNN: A general image fusion framework based on convolutional neural network," Inf. Fusion, vol. 54, pp. 99–118, 2020. doi: 10.1016/j.inffus.2019.07.011
[45] J. Liu, R. Dian, S. Li, and H. Liu, "SGFusion: A saliency guided deep-learning framework for pixel-level image fusion," Inf. Fusion, 2022.
[46] Y. Fu, T. Xu, X. Wu, and J. Kittler, "PPT fusion: Pyramid patch transformer for a case study in image fusion," arXiv, 2021.
[47] V. VS, J. M. J. Valanarasu, P. Oza, and V. M. Patel, "Image fusion transformer," arXiv, 2021.
[48] H. Zhao and R. Nie, "DNDT: Infrared and visible image fusion via DenseNet and dual-transformer," in Proc. Int. Conf. Inf. Technol. Biomed. Eng., 2021, pp. 71–75.
[49] J. Ma, P. Liang, W. Yu, C. Chen, X. Guo, J. Wu, and J. Jiang, "Infrared and visible image fusion via detail preserving adversarial learning," Inf. Fusion, vol. 54, pp. 85–98, 2020. doi: 10.1016/j.inffus.2019.07.005
[50] J. Ma, H. Xu, J. Jiang, X. Mei, and X.-P. Zhang, "DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion," IEEE Trans. Image Process., vol. 29, pp. 4980–4995, 2020. doi: 10.1109/TIP.2020.2977573
[51] J. Li, H. Huo, C. Li, R. Wang, and Q. Feng, "AttentionFGAN: Infrared and visible image fusion using attention-based generative adversarial networks," IEEE Trans. Multimed., vol. 23, pp. 1383–1396, 2020.
[52] H. Zhou, W. Wu, Y. Zhang, J. Ma, and H. Ling, "Semantic-supervised infrared and visible image fusion via a dual-discriminator generative adversarial network," IEEE Trans. Multimed., 2021.
[53] H. Zhang, J. Yuan, X. Tian, and J. Ma, "GAN-FM: Infrared and visible image fusion using GAN with full-scale skip connection and dual Markovian discriminators," IEEE Trans. Comput. Imaging, vol. 7, pp. 1134–1147, 2021. doi: 10.1109/TCI.2021.3119954
[54] Y. Liu, Y. Shi, F. Mu, J. Cheng, and X. Chen, "Glioma segmentation-oriented multi-modal MR image fusion with adversarial learning," IEEE/CAA J. Autom. Sinica, vol. 9, no. 8, pp. 1528–1531, 2022. doi: 10.1109/JAS.2022.105770
[55] Y. Liu, F. Mu, Y. Shi, and X. Chen, "SF-Net: A multi-task model for brain tumor segmentation in multimodal MRI via image fusion," IEEE Signal Process. Lett., vol. 29, pp. 1799–1803, 2022. doi: 10.1109/LSP.2022.3198594
[56] H. Xu, J. Ma, J. Yuan, Z. Le, and W. Liu, "RFNet: Unsupervised network for mutually reinforcing multi-modal image registration and fusion," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2022, pp. 19679–19688.
[57] J. Zhang and A. Rangarajan, "Affine image registration using a new information metric," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2004, pp. 1–8.
[58] D. Rueckert, L. Sonoda, C. Hayes, D. Hill, M. Leach, and D. Hawkes, "Nonrigid registration using free-form deformations: Application to breast MR images," IEEE Trans. Med. Imaging, vol. 18, no. 8, pp. 712–721, 1999. doi: 10.1109/42.796284
[59] J.-C. Yoo and T. H. Han, "Fast normalized cross-correlation," Circuits Syst. Signal Process., vol. 28, no. 6, pp. 819–843, 2009. doi: 10.1007/s00034-009-9130-7
[60] F. Maes, D. Vandermeulen, and P. Suetens, "Medical image registration using mutual information," Proc. IEEE, vol. 91, no. 10, pp. 1699–1722, 2003. doi: 10.1109/JPROC.2003.817864
[61] C. Aguilera, F. Barrera, F. Lumbreras, A. D. Sappa, and R. Toledo, "Multispectral image feature points," Sensors, vol. 12, no. 9, pp. 12661–12672, 2012. doi: 10.3390/s120912661
[62] S. Kim, D. Min, B. Ham, M. N. Do, and K. Sohn, "DASC: Robust dense descriptor for multi-modal and multi-spectral correspondence estimation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 9, pp. 1712–1729, 2016.
[63] J. Li, Q. Hu, and M. Ai, "RIFT: Multi-modal image matching based on radiation-variation insensitive feature transform," IEEE Trans. Image Process., vol. 29, pp. 3296–3310, 2019.
[64] G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, and A. V. Dalca, "VoxelMorph: A learning framework for deformable medical image registration," IEEE Trans. Med. Imaging, vol. 38, no. 8, pp. 1788–1800, 2019.
[65] M. Arar, Y. Ginger, D. Danon, A. H. Bermano, and D. Cohen-Or, "Unsupervised multi-modal image registration via geometry preserving image-to-image translation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2020, pp. 13410–13419.
[66] S. Zhou, W. Tan, and B. Yan, "Promoting single-modal optical flow network for diverse cross-modal flow estimation," in Proc. AAAI Conf. Artif. Intell., 2022, pp. 3562–3570.
[67] S. Wang, D. Quan, X. Liang, M. Ning, Y. Guo, and L. Jiao, "A deep learning framework for remote sensing image registration," ISPRS J. Photogramm. Remote Sens., vol. 145, pp. 148–164, 2018. doi: 10.1016/j.isprsjprs.2017.12.012
[68] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv, 2014.
[69] S. Bell, C. L. Zitnick, K. Bala, and R. Girshick, "Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2874–2883.
[70] X. Hu, C.-W. Fu, L. Zhu, J. Qin, and P.-A. Heng, "Direction-aware spatial context features for shadow detection and removal," IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 11, pp. 2795–2808, 2019.
[71] T. Wang, X. Yang, K. Xu, S. Chen, Q. Zhang, and R. W. Lau, "Spatial attentive single-image deraining with a high quality real rain dataset," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 12270–12279.
[72] S. Meister, J. Hur, and S. Roth, "UnFlow: Unsupervised learning of optical flow with a bidirectional census loss," in Proc. AAAI Conf. Artif. Intell., 2018, pp. 7251–7259.
[73] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004. doi: 10.1109/TIP.2003.819861
[74] M. Berman, A. R. Triki, and M. B. Blaschko, "The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 4413–4421.
[75] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., "PyTorch: An imperative style, high-performance deep learning library," in Proc. Adv. Neural Inf. Process. Syst., 2019, pp. 8026–8037.
[76] P. Truong, M. Danelljan, and R. Timofte, "GLU-Net: Global-local universal network for dense flow and correspondences," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2020, pp. 6258–6268.