Infant Cry Language Analysis and Recognition: An Experimental Approach
-
Recently, much research has been directed toward natural language processing. However, the baby's cry, which is the primary means of communication for infants, has not yet been extensively explored, because it is not a language that can be easily understood. Because cry signals carry information about a baby's wellbeing and can be understood, to an extent, by experienced parents and experts, recognition and analysis of an infant's cry is not only possible but also has profound medical and societal applications. In this paper, we obtain and analyze audio features of infant cry signals in the time and frequency domains. Based on these features, we classify given cry signals into specific cry meanings for cry language recognition. The features extracted from the audio feature space include linear predictive coding (LPC) coefficients, linear predictive cepstral coefficients (LPCC), Bark frequency cepstral coefficients (BFCC), and Mel frequency cepstral coefficients (MFCC). A compressed sensing technique was used for classification, and practical data were used to design and verify the proposed approaches. Experiments show that the proposed infant cry recognition approaches offer accurate and promising results.
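As a minimal illustration of two of the features named above, the following sketch computes LPC coefficients for one frame with the autocorrelation method (Levinson-Durbin recursion) and converts them to LPCC with the standard cepstral recursion. It is not the implementation used in the paper: the function names, the model order, and the synthetic test frame are assumptions chosen for illustration, and pre-emphasis and windowing are omitted for brevity.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one signal frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """LPC via Levinson-Durbin (illustrative helper, not from the paper).

    Returns (a, err), where the predictor is
    x[n] ~ a[0]*x[n-1] + ... + a[order-1]*x[n-order]
    and err is the final prediction error energy.
    """
    r = autocorrelation(frame, order)
    if r[0] <= 0.0:
        raise ValueError("frame has no energy")
    a = [0.0] * (order + 1)          # a[0] is unused; a[k] multiplies x[n-k]
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
        if err <= 0.0:               # perfectly predictable frame
            break
    return a[1:], err


def lpc_to_lpcc(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1/(1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)         # c[0] (gain term) is omitted
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


if __name__ == "__main__":
    # Synthetic decaying frame (assumption): behaves like a first-order model.
    frame = [0.9 ** n for n in range(200)]
    coeffs, error = lpc(frame, order=2)
    print("LPC :", coeffs)
    print("LPCC:", lpc_to_lpcc(coeffs, n_ceps=3))
```

For the synthetic frame, the first LPC coefficient should come out close to 0.9 and the second close to zero, since the frame follows a first-order decay. In practice each cry recording would be split into short frames and the per-frame coefficients collected into feature vectors.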
-
Corresponding author:
Lichuan Liu, e-mail: liu@niu.edu
-
Table Ⅰ. CRY SIGNAL INFORMATION

| No. | Cause | Sex | Age | Race | File |
|---|---|---|---|---|---|
| 1 | Diaper | F | 2w | Asian | T07 |
| 2 | Attention | F | | Asian | T10A |
| 3 | Attention | | | | T34 |
| 4 | Attention | | | | T105 |
| 5 | Hungry | M | 1w | Asian | T11 |
| 6 | Attention | | | | T33 |
| 7 | Hungry | | | | T35 |
| 8 | Hungry | M | 1w | Asian | T19 |
| 9 | Sleepy | M | 3m | Asian | T20 |
| 10 | Disturbed | | | | T32 |
| 11 | Sleepy | F | | Asian | T21 |
| 12 | Sleepy | | | | T23 |
| 13 | Diaper | F | 3d | Asian | T22 |
| 14 | Inject | M | 1w | Asian | T24 |
| 15 | Sputum induction | M | 2w | | T110 |
| 16 | Sleepy | M | 1w | Asian | T25 |
| 17 | Hungry | M | 2w | | T113 |
| 18 | Hungry | F | 3d | Asian | T26 |
| 19 | Attention | F | 1w | | T104 |
| 20 | Hungry | F | 1w | | T122 |
| 21 | Attention | F | 8d | Asian | T27 |
| 22 | Uncomfortable | M | 2w | Asian | T28 |
| 23 | Blood test | M | 3w | Asian | T109 |
| 24 | Diaper | F | 2w | Asian | T29 |
| 25 | Attention | M | 9d | Asian | T30 |
| 26 | Attention | | | | T31 |
| 27 | Diaper | M | 2d | Asian | T36 |
| 28 | Diaper | M | 6d | | T116 |
| 29 | Diaper | F | 9d | Asian | T37 |
| 30 | Hungry | F | 2w | | T117 |
| 31 | Other | M | 1w | Asian | T106 |
| 32 | Hungry | M | 2w | | T121 |
| 33 | Hungry | M | 2w | Asian | T107 |
| 34 | Blood test | F | 2w | Asian | T108 |
| 35 | Uncomfortable | F | 1m | Asian | T111 |
| 36 | Hungry | F | | | T124 |
| 37 | Hungry | M | 5d | Asian | T112 |
| 38 | Hungry | M | 11d | Asian | T114 |
| 39 | Hungry | M | 2w | | T115 |
| 40 | Hungry | M | 2w | | T120 |
| 41 | Sleepy | F | 8d | Asian | T118 |
| 42 | Sleepy | F | | | T119 |
| 43 | Hungry | M | 1w | Caucasian | T123 |
| 44 | Uncomfortable | M | 14w | Asian | T125 |
| 45 | Sleepy | M | | | T126 |
| 46 | Hungry | M | | | T127 |
| 47 | Sleepy | M | | | T128 |
| 48 | Uncomfortable | M | | | T129 |

Table Ⅱ. INFANT CRY RECOGNITION CORRECT RATE WITH COMPRESSED SENSING TECHNIQUE AND DIFFERENT FEATURES FOR CRY SIGNALS (SLEEPY AND HUNGRY)

Column headings give the data ratio of constructing the matrix.

| Features | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|
| BFCC | 0.6991 | 0.6915 | 0.7067 | 0.6842 | 0.7105 | 0.6842 |
| LPC | 0.5133 | 0.4681 | 0.4933 | 0.4737 | 0.4211 | 0.5789 |
| LPCC | 0.6018 | 0.6064 | 0.6267 | 0.5965 | 0.5789 | 0.4737 |
| MFCC | 0.6814 | 0.6596 | 0.6767 | 0.7018 | 0.7105 | 0.6842 |

Table Ⅲ. INFANT CRY RECOGNITION CORRECT RATE WITH COMPRESSED SENSING TECHNIQUE AND DIFFERENT FEATURES

Column headings give the data ratio of constructing the matrix.

| Features | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|
| BFCC | 0.5701 | 0.5393 | 0.5742 | 0.5754 | 0.5111 | 0.6842 |
| LPC | 0.6131 | 0.6225 | 0.5854 | 0.5688 | 0.5419 | 0.4667 |
| LPCC | 0.5009 | 0.4989 | 0.4986 | 0.4944 | 0.5028 | 0.4889 |
| MFCC | 0.5907 | 0.5910 | 0.5938 | 0.5502 | 0.5140 | 0.5333 |

Table Ⅳ. INFANT CRY RECOGNITION CORRECT RATE BY USING DIFFERENT FEATURES AND RECOGNITION TECHNIQUES

| Recognition technique | LPC | LPCC | MFCC | BFCC |
|---|---|---|---|---|
| Nearest neighborhood (NN) | 0.6384 | 0.4795 | 0.6389 | 0.6522 |
| Artificial neural network (ANN) | 0.5455 | 0.5188 | 0.6045 | 0.7647 |
| Compressed sensing (CS) | 0.5789 | 0.6267 | 0.7105 | 0.7064 |
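The compressed sensing (CS) results above depend on a matrix constructed from a share of the data. The classifier itself is not described in this excerpt, so the following is only a sketch of one common way compressed sensing ideas are used for classification, namely sparse-representation classification: training feature vectors form the columns of a dictionary, a test vector is coded sparsely over that dictionary, and the class whose atoms reconstruct it with the smallest residual is chosen. The greedy matching pursuit solver, the function names, and the toy three-dimensional vectors are all assumptions for illustration and are not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(atoms, y, n_iter):
    """Greedy sparse coding: y ~ sum_j coef[j] * atoms[j], with unit-norm atoms."""
    coef = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        corr = [dot(atom, residual) for atom in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[j]) < 1e-12:     # nothing left to explain
            break
        coef[j] += corr[j]
        residual = [r - corr[j] * a for r, a in zip(residual, atoms[j])]
    return coef


def classify(train_vectors, train_labels, test_vector, n_iter=10):
    """Return the label whose training atoms best reconstruct the test vector."""
    atoms = [normalize(v) for v in train_vectors]
    y = normalize(test_vector)
    coef = matching_pursuit(atoms, y, n_iter)
    best_label, best_err = None, float("inf")
    for label in sorted(set(train_labels)):
        # Reconstruct y using only the coefficients that belong to this class.
        approx = [0.0] * len(y)
        for c, atom, lab in zip(coef, atoms, train_labels):
            if lab == label:
                approx = [s + c * a for s, a in zip(approx, atom)]
        err = math.sqrt(sum((yi - ai) ** 2 for yi, ai in zip(y, approx)))
        if err < best_err:
            best_label, best_err = label, err
    return best_label


if __name__ == "__main__":
    # Toy feature vectors (hypothetical, not data from the paper).
    train = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
    labels = ["hungry", "hungry", "sleepy", "sleepy"]
    print(classify(train, labels, [1.0, 0.05, 0.05]))
    print(classify(train, labels, [0.05, 1.0, 0.05]))
```

In a setting like the one tabulated above, the training vectors would be per-recording LPC, LPCC, MFCC, or BFCC features, and the share of recordings placed in the dictionary would play the role of the data ratio. Matching pursuit is used here only because it needs no linear algebra library; an l1-minimization or orthogonal matching pursuit solver could be substituted.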



