A journal of IEEE and CAA that publishes high-quality papers in English on original theoretical/experimental research and development in all areas of automation.

Citation: Lichuan Liu, Wei Li, Xianwen Wu and Benjamin X. Zhou, "Infant Cry Language Analysis and Recognition: An Experimental Approach," IEEE/CAA J. Autom. Sinica, vol. 6, no. 3, pp. 778-788, May 2019. doi: 10.1109/JAS.2019.1911435

Infant Cry Language Analysis and Recognition: An Experimental Approach

  • A great deal of recent research has been directed towards natural language processing. However, the baby's cry, which serves as the primary means of communication for infants, has not yet been extensively explored, because it is not a language that can be easily understood. Since cry signals carry information about a baby's wellbeing and can be understood to an extent by experienced parents and experts, recognition and analysis of an infant's cry is not only possible but also has profound medical and societal applications. In this paper, we obtain and analyze audio features of infant cry signals in the time and frequency domains. Based on the related features, we can classify given cry signals into specific cry meanings for cry language recognition. Features extracted from the audio feature space include linear predictive coding (LPC), linear predictive cepstral coefficients (LPCC), Bark frequency cepstral coefficients (BFCC), and Mel frequency cepstral coefficients (MFCC). A compressed sensing technique was used for classification, and practical data were used to design and verify the proposed approaches. Experiments show that the proposed infant cry recognition approaches offer accurate and promising results.
    • Author Bio:

      Lichuan Liu (M'06-SM'11) received the B.S. and M.S. degrees in electrical engineering in 1995 and 1998, respectively, from the University of Electronic Science and Technology of China, and the Ph.D. degree in electrical engineering from the New Jersey Institute of Technology, Newark, NJ, in 2006. She joined Northern Illinois University in 2007 and is currently an Associate Professor of Electrical Engineering and the Director of the Digital Signal Processing Laboratory. Her current research includes digital signal processing, real-time signal processing, wireless communication and networking. She has over 70 publications, including 30 journal papers and one book chapter, and has been awarded three patents. She has led and participated in many research grants from agencies such as NSF, NASA and NIH. (email: liu@niu.edu)

      Wei Li (M'99) received the Ph.D. degree in electrical and computer engineering from the University of Victoria, Canada, in 2004. He is currently an Assistant Professor at Northern Illinois University, USA. His research interests include computer networks, smart grid, internet of things, applications of machine learning and artificial intelligence in e-health, computer vision and natural language processing. (email: weili@niu.edu)

      Xianwen Wu (M'06) received the B.S. degree in electrical engineering from North University of China in July 2005, the M.S. degree in biomedical engineering from Southeast University in June 2010, and the M.S. degree in electrical engineering from Northern Illinois University in August 2013. He received the Ph.D. degree in electrical engineering from the University of Arkansas in December 2016 and then joined Qualcomm Inc. as a System Engineer. His research focuses on communication theory, wireless sensor networks, and signal processing. (email: z1648342@niu.edu)

      Benjamin X. Zhou is currently pursuing a B.S. in biology at the College of New Jersey as part of the 7-year B.S./M.D. program with NJMS. He currently conducts research at the Perelman School of Medicine at the University of Pennsylvania. His current research interests include sepsis and its effects on the immune system using animal models. (email: zhoub1@tcnj.edu)

    • Corresponding author:

      Lichuan Liu, e-mail: liu@niu.edu

    • Funds: This work was supported by the Gerber Foundation and the Northern Illinois University Research Foundation.
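
The abstract names LPC and LPCC among the extracted features without giving the computation on this page. As a rough illustration only, the sketch below shows one common textbook route: autocorrelation of a frame, the Levinson-Durbin recursion for the predictor coefficients, and the standard recursion from LPC to cepstral coefficients. The function names, the sign convention (x[n] approximated by sum_k a[k] x[n-k]) and the example autocorrelation values are assumptions made for this sketch; the paper's own frame length, model order and windowing are not stated here.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for predictor
    coefficients a[1..order], where x[n] is approximated by
    sum_k a[k] * x[n - k]. Returns (coefficients, residual energy)."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:          # silent or degenerate frame: stop early
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error           # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error


def lpc_to_lpcc(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c[1..n_ceps]
    for the all-pole model 1 / (1 - sum_k a[k] z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# Hypothetical autocorrelation of a first-order process with pole 0.5
# (illustrative values, not data from the paper).
lpc, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
lpcc = lpc_to_lpcc(lpc, n_ceps=3)
```

For the hypothetical input above the recursion recovers the single pole (first coefficient 0.5, second 0), and the cepstral coefficients follow the expected 0.5^n / n pattern, which is a quick sanity check on the sign convention.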

  • Figure 1.  Block diagram for infant cry recognition.

    Figure 2.  Diaper: signal waveform (upper) and spectrogram (lower).

    Figure 3.  Attention: signal waveform (upper) and spectrogram (lower).

    Figure 4.  Hungry: waveform (upper) and spectrogram (lower).

    Figure 5.  Sleepy: waveform (upper) and spectrogram (lower).

    Figure 6.  Uncomfortable: waveform (upper) and spectrogram (lower).

    Figure 7.  Baby cry signal, short time energy and detected cry unit for cry file T19.wav.

    Figure 8.  Baby cry signal, short time zero-crossing and detected cry for T19.wav.

    Figure 9.  BFCC features for attention, diaper, hungry and discomfort cry signals.

    Figure 10.  Different features for hungry and sleepy cry.

    Figure 11.  Features for attention, diaper, hungry and discomfort cry.
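
Figures 7 and 8 refer to short-time energy and short-time zero-crossing as the quantities used to detect cry units. The sketch below is a minimal illustration of those two measures and a simple energy threshold, under stated assumptions: the function names, frame length, hop size, threshold and the toy signal are all hypothetical and are not taken from the paper.

```python
def frames(signal, frame_len, hop):
    """Split a list of samples into (possibly overlapping) frames."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for prev, cur in zip(frame, frame[1:])
                    if (prev >= 0) != (cur >= 0))
    return crossings / (len(frame) - 1)


def detect_active_frames(signal, frame_len, hop, energy_threshold):
    """Flag frames whose energy exceeds a fixed (assumed) threshold."""
    return [short_time_energy(f) > energy_threshold
            for f in frames(signal, frame_len, hop)]


# Toy signal: silence, a burst of alternating samples, silence.
toy = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8
active = detect_active_frames(toy, frame_len=4, hop=4, energy_threshold=1.0)
rates = [zero_crossing_rate(f) for f in frames(toy, 4, 4)]
```

On the toy signal only the two middle frames are flagged as active, and those same frames have the highest zero-crossing rate. A practical detector would likely combine both measures and adapt the threshold to the recording level.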

    Table Ⅰ.  CRY SIGNAL INFORMATION

    No. | Cause            | Sex | Age | Race      | File
    ----+------------------+-----+-----+-----------+-----
    1   | Diaper           | F   | 2w  | Asian     | T07
    2   | Attention        | F   |     | Asian     | T10A
    3   | Attention        |     |     |           | T34
    4   | Attention        |     |     |           | T105
    5   | Hungry           | M   | 1w  | Asian     | T11
    6   | Attention        |     |     |           | T33
    7   | Hungry           |     |     |           | T35
    8   | Hungry           | M   | 1w  | Asian     | T19
    9   | Sleepy           | M   | 3m  | Asian     | T20
    10  | Disturbed        |     |     |           | T32
    11  | Sleepy           | F   |     | Asian     | T21
    12  | Sleepy           |     |     |           | T23
    13  | Diaper           | F   | 3d  | Asian     | T22
    14  | Inject           | M   | 1w  | Asian     | T24
    15  | Sputum induction | M   | 2w  |           | T110
    16  | Sleepy           | M   | 1w  | Asian     | T25
    17  | Hungry           | M   | 2w  |           | T113
    18  | Hungry           | F   | 3d  | Asian     | T26
    19  | Attention        | F   | 1w  |           | T104
    20  | Hungry           | F   | 1w  |           | T122
    21  | Attention        | F   | 8d  | Asian     | T27
    22  | Uncomfortable    | M   | 2w  | Asian     | T28
    23  | Blood test       | M   | 3w  | Asian     | T109
    24  | Diaper           | F   | 2w  | Asian     | T29
    25  | Attention        | M   | 9d  | Asian     | T30
    26  | Attention        |     |     |           | T31
    27  | Diaper           | M   | 2d  | Asian     | T36
    28  | Diaper           | M   | 6d  |           | T116
    29  | Diaper           | F   | 9d  | Asian     | T37
    30  | Hungry           | F   | 2w  |           | T117
    31  | Other            | M   | 1w  | Asian     | T106
    32  | Hungry           | M   | 2w  |           | T121
    33  | Hungry           | M   | 2w  | Asian     | T107
    34  | Blood test       | F   | 2w  | Asian     | T108
    35  | Uncomfortable    | F   | 1m  | Asian     | T111
    36  | Hungry           | F   |     |           | T124
    37  | Hungry           | M   | 5d  | Asian     | T112
    38  | Hungry           | M   | 11d | Asian     | T114
    39  | Hungry           | M   | 2w  |           | T115
    40  | Hungry           | M   | 2w  |           | T120
    41  | Sleepy           | F   | 8d  | Asian     | T118
    42  | Sleepy           | F   |     |           | T119
    43  | Hungry           | M   | 1w  | Caucasian | T123
    44  | Uncomfortable    | M   | 14w | Asian     | T125
    45  | Sleepy           | M   |     |           | T126
    46  | Hungry           | M   |     |           | T127
    47  | Sleepy           | M   |     |           | T128
    48  | Uncomfortable    | M   |     |           | T129

    Table Ⅱ.  INFANT CRY RECOGNITION CORRECT RATE WITH COMPRESSED SENSING TECHNIQUE AND DIFFERENT FEATURES FOR CRY SIGNALS (SLEEPY AND HUNGRY)

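    Columns give the data ratio used to construct the matrix.

    Features | 0.4    | 0.5    | 0.6    | 0.7    | 0.8    | 0.9
    ---------+--------+--------+--------+--------+--------+-------
    BFCC     | 0.6991 | 0.6915 | 0.7067 | 0.6842 | 0.7105 | 0.6842
    LPC      | 0.5133 | 0.4681 | 0.4933 | 0.4737 | 0.4211 | 0.5789
    LPCC     | 0.6018 | 0.6064 | 0.6267 | 0.5965 | 0.5789 | 0.4737
    MFCC     | 0.6814 | 0.6596 | 0.6767 | 0.7018 | 0.7105 | 0.6842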

    Table Ⅲ.  INFANT CRY RECOGNITION CORRECT RATE WITH COMPRESSED SENSING TECHNIQUE AND DIFFERENT FEATURES

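    Columns give the data ratio used to construct the matrix.

    Features | 0.4    | 0.5    | 0.6    | 0.7    | 0.8    | 0.9
    ---------+--------+--------+--------+--------+--------+-------
    BFCC     | 0.5701 | 0.5393 | 0.5742 | 0.5754 | 0.5111 | 0.6842
    LPC      | 0.6131 | 0.6225 | 0.5854 | 0.5688 | 0.5419 | 0.4667
    LPCC     | 0.5009 | 0.4989 | 0.4986 | 0.4944 | 0.5028 | 0.4889
    MFCC     | 0.5907 | 0.5910 | 0.5938 | 0.5502 | 0.5140 | 0.5333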

    Table Ⅳ.  INFANT CRY RECOGNITION CORRECT RATE BY USING DIFFERENT FEATURES AND RECOGNITION TECHNIQUES

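    Technique / Features            | LPC    | LPCC   | MFCC   | BFCC
    --------------------------------+--------+--------+--------+-------
    Nearest neighborhood (NN)       | 0.6384 | 0.4795 | 0.6389 | 0.6522
    Artificial neural network (ANN) | 0.5455 | 0.5188 | 0.6045 | 0.7647
    Compressed sensing (CS)         | 0.5789 | 0.6267 | 0.7105 | 0.7064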
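
Tables II to IV report a "correct rate" for several classifiers, including a nearest-neighbor baseline. As a hedged illustration of what such a rate measures, the sketch below labels each test feature vector with the label of its closest reference vector and reports the fraction labelled correctly. The function names and the two-dimensional toy vectors are hypothetical; they are not the paper's features or data, and the sketch does not reproduce the reported numbers. Reading the tables' "data ratio" as the share of vectors kept as references is also only an interpretation, since this page does not define it.

```python
import math


def nearest_neighbor_label(query, ref_vectors, ref_labels):
    """Return the label of the reference vector closest to `query`
    in Euclidean distance."""
    best = min(range(len(ref_vectors)),
               key=lambda i: math.dist(query, ref_vectors[i]))
    return ref_labels[best]


def correct_rate(ref_vectors, ref_labels, test_vectors, test_labels):
    """Fraction of test vectors whose predicted label matches the truth."""
    hits = sum(
        nearest_neighbor_label(v, ref_vectors, ref_labels) == y
        for v, y in zip(test_vectors, test_labels)
    )
    return hits / len(test_vectors)


# Hypothetical 2-D feature vectors standing in for cepstral features.
ref_vectors = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
ref_labels = ["hungry", "hungry", "sleepy", "sleepy"]
test_vectors = [[1.0, 0.0], [5.0, 6.0], [3.0, 3.0]]
test_labels = ["hungry", "sleepy", "hungry"]

rate = correct_rate(ref_vectors, ref_labels, test_vectors, test_labels)
```

In this toy case the third test vector lies closer to a "sleepy" reference than to any "hungry" one, so two of three vectors are labelled correctly. Rates in the 0.5 to 0.7 range on a two-class task, as in Table II, therefore correspond to a modest margin over chance.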
