Effect of “Spatially Separated Speech in Noise Training” on speech perception in noise in children with bimodal fitting

The objective of this study was to evaluate the effect of “Spatially separated speech in noise” auditory training on speech perception in noise among bimodal fitting users. The assumption was that this rehabilitation can enhance spatial hearing and hence speech perception in noise. This was an interventional study with a pre/post design. Speech recognition ability was assessed with a set of speech tests. After the rehabilitation stages had been completed in the intervention group, the speech tests were repeated, and the effect of auditory training on speech abilities was assessed by comparing the pre- and post-intervention data. Twenty-four children aged 8–12 years who had undergone cochlear implantation and continuously used bimodal fitting were studied in two groups, control and intervention. The results showed a significant difference between the groups in different speech tests after the intervention, indicating that the intervention group had improved more than the control group. It can be concluded that “Spatially separated speech in noise” auditory training can improve speech perception in noise in bimodal fitting users. In general, this rehabilitation method is useful for enhancing speech perception in noise.


Background
Children with a unilateral cochlear implant have poorer localization and spatial hearing abilities than their normal-hearing counterparts; however, bimodal fitting can provide them with the advantage of binaural hearing [1]. In this configuration, one ear is stimulated with the electrical signals of the cochlear implant, and the contralateral ear receives acoustic signals amplified by a hearing aid. Bimodal fitting (BF) refers to the combination of these two inputs [2]. In BF, the hearing aid amplifies low-frequency speech signals, which are combined with the cochlear implant signals. Consequently, the segregation of competing sounds improves. This signal separation through binaural processing mechanisms is one of the several advantages of bimodal hearing [1].
However, studies have shown that these abilities do not reach normal levels in BF users. Studies of BF users undergoing conventional auditory training have shown that, despite the addition of spectral cues, better signal reception, and improved speech perception in quiet and in noise, they still lag well behind their normal-hearing peers, especially in speech perception in noise [3]; however, this gap can be narrowed with appropriate rehabilitation techniques. Speech in Noise Auditory Training (SPIN AT) is a rehabilitation technique intended to improve speech perception in noise. Speech comprehension under adverse conditions probably involves a quite complex interplay of bottom-up and top-down processes. Auditory training is generally based on theories of bottom-up and top-down auditory processing: bottom-up methods emphasize training the discrimination of small acoustic differences in signals, with the aim of improving the efficiency of peripheral auditory processing, whereas top-down methods aim to improve the efficiency of central processing and to enhance the ability to use lower-level auditory processing selectively according to the listener's needs [4].
The majority of studies into the effect of rehabilitation on speech perception have concentrated on the listener's ability to recognize acoustic differences between stimuli, because these auditory training methods target bottom-up processing. Despite the importance of bottom-up processing, top-down central processing is more important for speech perception in noise; it is therefore recommended that top-down processing be considered when seeking to enhance speech perception in noise in cochlear implant users. Training programs that focus on spatial and binaural hearing lead to a greater enhancement of speech perception in noise and of lateralization than conventional auditory training methods. Therefore, both bottom-up and top-down aspects were taken into consideration in the current study. Tyler et al. applied auditory training in noise to people with bilateral cochlear implants and focused on binaural and spatial hearing in their training program. To enhance speech perception in noise, the signal and the noise were spatially separated in this method, with the aim of developing an auditory training technique focused on binaural hearing [5]. The present study also used spatial auditory training to provide optimal conditions for the enhancement of speech perception in noise. The aim was to enhance the participants' speech perception in noise through training and learning based on binaural hearing, using speech in noise rehabilitation in which the speech source is spatially separated from the noise source. It was assumed that this rehabilitation technique, SPIN AT, which considers both bottom-up and top-down processes, can improve spatial hearing and the ability to distinguish signal from noise, thereby enhancing speech perception in noise.
This study therefore investigated the influence of SPIN AT on speech perception in noise among BF children aged between 8 and 12 years. This age range was chosen because these are school years and are therefore considered an important period. In other words, the purpose of the study was to evaluate whether any improvement was achieved after rehabilitation.

Methods
This interventional study, with a pre/post design, was performed in 2018 in a cochlear implant center in Tehran. Twenty-four pre-lingually deaf children (8-12 years old) who had received a cochlear implant and used bimodal fitting continuously participated in the research based on the following criteria: healthy middle ear, cochlear implant surgery before the age of 2.5 years, normal intelligence, ability to perform the tests, good general health (assessed from their file at the implant center), mean aided hearing threshold better than 55 dB HL in the binaural condition (hearing aid + cochlear implant) at the speech spectrum frequencies (0.5, 1, 2, and 4 kHz), speech discrimination score of more than 70%, no auditory neuropathy disorder, at least 6 months' use of a hearing aid in the ear contralateral to the cochlear implant, and use of the hearing aid for at least 6 h per day.
All children participating in the project received a routine aural rehabilitation program after cochlear implantation. This rehabilitation program includes one hundred 45-min training sessions performed three times a week for cochlear implant children.
Accordingly, the 24 most eligible children were selected and randomly allocated to two equal-sized groups, intervention and control.
The current investigation was conducted in three separate sections: pre-rehabilitation assessments, rehabilitation sessions, and post-rehabilitation assessments.

Pre-rehabilitation assessments
The pre-rehabilitation investigations were conducted over 2-3 sessions in the following order:
1. Otoscopy and tympanometry to rule out middle ear infection.
2. Adjusting hearing aids and ear molds: NAL-NL1 is a prescriptive fitting formula that can provide an optimal frequency response for BF cases [6], so the hearing aids were fitted according to this formula. Most of the participants were already using this formula before being enrolled in the study, and some had their fitting changed during the initial assessment.
3. Mapping the cochlear implant speech processor: since the objective of this research was to investigate the participants' speech perception ability in daily life situations, all of them were evaluated with their cochlear implant speech processor set to the "normal everyday program", which is typically used for everyday listening situations.
4. Loudness balance: to balance the loudness level between the cochlear implant and the hearing aid, a loudspeaker was placed in front of the patient (at 0° azimuth) and the patient was asked to indicate the direction of the delivered speech stimuli (65 dB SPL). The volume of the hearing aid was then adjusted so that the sound was heard from the midline of the head. The cochlear implant maps were left unchanged because all of them were set at optimal auditory levels [7].
5. Pure tone audiometry (PTA): free-field pure tone audiometry was performed in the binaural condition (hearing aid + cochlear implant) with a loudspeaker located at 0° azimuth (1.5 m from the child), and the mean threshold was determined at 500, 1000, 2000, and 4000 Hz.
6. Speech tests: the Speech Recognition Threshold (SRT), Speech Discrimination Score (SDS) in quiet, Consonant Vowel (CV) in noise [8], Word in Noise (WIN) [9], and Bamford-Kowal-Bench Speech-in-Noise Test (BKB-SIN) [10] were used to investigate speech perception performance. These tests were conducted using standardized lists (Persian versions) recorded by the same female speaker and played via a loudspeaker placed at 0° azimuth, 1.5 m in front of the participant. The subjects were tested under optimal auditory conditions (with both the cochlear implant and the hearing aid turned on) in an acoustic chamber.
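The four-frequency average used for the free-field thresholds, and for the 55 dB HL inclusion criterion, can be sketched as follows. This is only an illustration: the function name and the threshold values are hypothetical and are not data from the study.

```python
def pure_tone_average(thresholds_db_hl, frequencies_hz=(500, 1000, 2000, 4000)):
    """Mean aided threshold (dB HL) over the speech-spectrum frequencies."""
    return sum(thresholds_db_hl[f] for f in frequencies_hz) / len(frequencies_hz)


# Hypothetical aided thresholds for one child (hearing aid + cochlear implant).
aided = {500: 35, 1000: 40, 2000: 45, 4000: 50}

pta = pure_tone_average(aided)
# Inclusion criterion from the Methods: mean aided threshold better than 55 dB HL.
meets_criterion = pta < 55
print(pta, meets_criterion)
```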

Speech tests
Speech Recognition Threshold (SRT)
This test estimates the intensity level at which a person can repeat 50% of spondaic words and is usually reported in dB HL or dB SPL. In other words, the SRT is the intensity level at which a person can correctly repeat at least 2 of 4 presented two-syllable words. The lower the intensity level, the better the auditory ability [11]. In this study, the test was performed using a standard Persian list of two-syllable words [12].

Speech Discrimination Score (SDS)
The most common suprathreshold measure in quiet is the SDS, or word recognition score (WRS), which is generally measured as the percentage correct at a given intensity level relative to either the SRT or the average of the PTA thresholds [11]. The SDS or WRS is the number of words that a person correctly recognizes from a list of 25 single-syllable words, expressed as a percentage. Speech is presented at the person's most comfortable listening level (MCL), which is usually 30 dB above the speech reception threshold. The higher the score, the better the performance.

Consonant Vowel (CV) in noise
In this test, nonsense syllables are presented at different signal-to-noise ratios (SNRs) [13]. Simultaneously with the CV signals, white noise is delivered through the loudspeaker at SNRs of −12, −6, 0, +6, and +12; the person repeats the syllable heard, and a percentage score is calculated separately for each SNR, with a higher score indicating better performance [8,13]. In the current study, the standard Persian list of the test was used [8]. Three SNRs were selected for the test (−6, 0, and +6). A list of 25 items was presented at each SNR, and the results were averaged.
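The scoring described above (a percentage per SNR, then an average over the three SNRs) amounts to the following arithmetic. The function name and the counts of correct responses are hypothetical, chosen only to illustrate the calculation.

```python
def cv_in_noise_scores(correct_by_snr, items_per_list=25):
    """Return the percent-correct score at each SNR and their mean."""
    per_snr = {snr: 100.0 * correct / items_per_list
               for snr, correct in correct_by_snr.items()}
    mean_score = sum(per_snr.values()) / len(per_snr)
    return per_snr, mean_score


# Hypothetical number of correctly repeated syllables at each SNR (dB).
per_snr, mean_score = cv_in_noise_scores({-6: 10, 0: 15, 6: 20})
print(per_snr, mean_score)
```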

Word in Noise (WIN)
This test is used as a tool to quantify listeners' ability to understand monosyllabic words in background noise using four-talker babble. In this test, monosyllable words recorded in different signal-to-noise ratios (SNRs) are presented to create a psychometric function from which a 50% point can be calculated using the Spearman-Kärber equation [14]. In this study, this was done by the Persian standard version of the test which is standardized for children [9].
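As a rough sketch of how the 50% point follows from the Spearman-Kärber equation: with words presented in descending SNR steps, the threshold is the starting SNR plus half a step, minus the step size times the total number of correct words divided by the number of words per step. The default values below (24 dB starting SNR, 4 dB steps, 5 words per step) are illustrative assumptions; the Persian children's version [9] may use different values.

```python
def spearman_karber_50(correct_per_level, start_snr_db=24.0, step_db=4.0,
                       words_per_level=5):
    """Estimate the SNR (dB) at which 50% of words are recognized.

    correct_per_level: number of correct words at each SNR, ordered from
    the highest (easiest) SNR to the lowest. The method assumes 100%
    recognition above the first level and 0% below the last.
    """
    total_correct = sum(correct_per_level)
    return start_snr_db + step_db / 2 - step_db * total_correct / words_per_level


# Hypothetical listener: all words correct at the three easiest SNRs only.
print(spearman_karber_50([5, 5, 5, 0, 0, 0, 0]))
```

A lower 50% point means that the listener tolerates more noise, so lower values indicate better performance.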

Bamford-Kowal-Bench Speech-in-Noise Test (BKB-SIN)
The BKB-SIN is a speech-in-noise test in which BKB (Bamford-Kowal-Bench) sentences are presented in four-talker babble noise. The BKB-SIN can be used to estimate SNR loss in children and adults for whom the QuickSIN test is too difficult. It is scored as follows: to obtain the SNR-50 (the signal-to-noise ratio for 50% correct), the total number of correct items for each list is subtracted from 23.5 [15]. A lower score indicates better performance. In this study, the standard Persian version of the test was used [10].
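The scoring rule stated above is a single subtraction. The sketch below uses a hypothetical function name and a hypothetical total of 20 correct items.

```python
def bkb_sin_snr50(total_correct):
    """SNR-50 (dB) for one BKB-SIN list: 23.5 minus the total correct."""
    return 23.5 - total_correct


# Hypothetical list on which 20 items were repeated correctly.
print(bkb_sin_snr50(20))
```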

Rehabilitation sessions (Speech in Noise Auditory Training (SPIN AT))
The training sessions were held twice per week for 5 weeks. Training was given by the researcher (always the same person) in a quiet room, at a distance of 1.5 m from the child. The content varied across sessions to maintain the variety and motivation needed to carry out the training.
The intervention consisted of a formal training program made up of fully meaningful and structured exercises. The difficulty of the exercises increased gradually over the program. The intervals between exercises, the number of repetitions of each exercise, and the overall intensity of stimulus presentation were under the therapist's control. To investigate the effect of this rehabilitation method on speech comprehension abilities, the bimodal fitting users were randomly divided into two groups: an intervention group, which received the rehabilitation sessions, and a control group, which did not take part in these sessions but could, if desired, receive them after the end of the study.

Stimulation mechanism and loudspeaker layout
Stimuli were presented via three loudspeakers placed at different angles in a semicircular arrangement in the horizontal plane. The target stimulus was delivered by a loudspeaker at 0° azimuth, while the noise was delivered by two side loudspeakers located at +30°/−30°, +60°/−60°, or +90°/−90° azimuth [16]. The loudspeakers were 1.5 m away from the child. Three SNRs were presented at each loudspeaker location. The target signal was presented at 65 dB HL, and the SNR was 0, +5, or +10.
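The layout can be made concrete with a short sketch that lists the nine angle and SNR combinations, with the noise level implied by each SNR and the position of each loudspeaker relative to the child. Several points are assumptions made for illustration: SNR is taken as target level minus noise level in dB, the listing order (widest separation and highest SNR first) is not taken from the study, and the text does not say whether the noise level refers to each side loudspeaker or to both combined.

```python
import math

TARGET_LEVEL_DB = 65   # target level as stated in the text
RADIUS_M = 1.5         # loudspeaker distance from the child


def speaker_xy(azimuth_deg, radius_m=RADIUS_M):
    """Position (x to the right, y straight ahead) of a loudspeaker in metres."""
    a = math.radians(azimuth_deg)
    return (radius_m * math.sin(a), radius_m * math.cos(a))


def noise_level_db(snr_db, target_db=TARGET_LEVEL_DB):
    """Noise level implied by an SNR, assuming SNR = target - noise (dB)."""
    return target_db - snr_db


# Assumed ordering: wide separation and high SNR first.
conditions = [
    {"noise_azimuths_deg": (-angle, angle),
     "snr_db": snr,
     "noise_level_db": noise_level_db(snr)}
    for angle in (90, 60, 30)
    for snr in (10, 5, 0)
]

for c in conditions:
    print(c)
print(speaker_xy(30))  # right-hand noise loudspeaker at +30 degrees
```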

Stimuli
The stimuli were speech signals presented via the front loudspeaker in easy-to-difficult order: two-syllabic and monosyllabic words, consonant-vowels (nonsense monosyllables), and sentences. The stimuli were selected from standardized Persian lists. The noise was initially speech noise (300-3000 Hz) and was then changed to four-talker babble noise at different SNRs [17]. Noise was presented only from the side loudspeakers. The shift from narrow-band noise to babble noise took place once the child had successfully passed all stages of auditory training in the presence of narrow-band noise. The criterion for passing was recognizing and repeating different words at different SNRs and angles.

Training difficulty levels
The level of auditory training difficulty was changed by: (1) changing the speech signals (two-syllabic words, monosyllabic words, nonsense monosyllables, and sentences); (2) changing the SNR (0, +5, and +10); (3) relocating the noise loudspeakers (+90° and −90°, +60° and −60°, and +30° and −30°); and (4) shifting from narrow-band noise to babble noise. All of the training levels are described in Table 1.
In each condition, the word or sentence was presented from the loudspeaker in front of the patient at the most comfortable hearing level and at a constant SNR. The child was asked to identify and repeat it accurately, and then the next item was presented. At least 30 stimuli were presented in each condition. Each session lasted about 1.5 h.

Post-rehabilitation assessments
Immediately after the end of the rehabilitation sessions, speech perception in quiet and in noise was assessed in both the intervention and control groups using the SRT, SDS, WIN, CV in noise, and BKB tests, and the results were compared. To minimize practice effects from the pretest, the lists of two-syllabic words, monosyllabic words, nonsense monosyllables (CV), and sentences used at this stage differed from those used in rehabilitation and in the pre-rehabilitation assessments. The effect of "Spatially Separated Speech in Noise training" on the speech perception of children with BF was measured by comparing pre-intervention and post-intervention results. To determine the reliability of the results in the intervention group, all tests were repeated 1 month after the end of the rehabilitation program and the results were compared. To compare the mean scores across the three measurements (before training, immediately after training, and 1 month after training), a repeated measures statistical test was used.

Table 1 The level of auditory training difficulty: all these levels were presented first with narrow-band noise, and then all steps were repeated with babble noise (columns: Level, Signal description, Noise speaker angle)

Statistical analysis
Statistical analysis was performed using SPSS (version 19, SPSS Inc., Chicago, IL, USA). Means, standard deviations, and percentages were used to describe the data. Because of the small sample size and the non-normal distribution of the data, non-parametric tests were used for statistical analysis. The Mann-Whitney test was used to compare the two groups. Analysis of covariance (ANCOVA) was used to compare the two groups after the training stage. The Wilcoxon test was used to compare the test results before and after training in the intervention group.
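For readers unfamiliar with the Wilcoxon signed-rank test, the following is a minimal sketch of an exact two-sided version for paired pre/post scores. It is not the SPSS procedure used in the study, which may rely on a normal approximation; the function name and the scores are hypothetical.

```python
from itertools import product


def wilcoxon_signed_rank_exact(pre, post):
    """Return (W+, exact two-sided p) for paired scores.

    Zero differences are dropped and tied absolute differences receive
    the average of the ranks they span.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2
    observed = abs(w_plus - centre)

    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)


# Hypothetical pre/post scores (%) for 12 children, not study data.
pre = [40, 44, 48, 52, 36, 60, 56, 44, 48, 52, 40, 64]
post = [52, 48, 60, 56, 48, 64, 68, 52, 60, 56, 44, 72]
print(wilcoxon_signed_rank_exact(pre, post))
```

With 12 pairs the enumeration covers only 4096 sign patterns, so the exact test is inexpensive at this sample size.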

Results
Based on the inclusion criteria, 24 [...] years in the control group. The Mann-Whitney test did not show a significant difference between the groups in this regard. All participants in both groups had Cochlear Nucleus implants with the System 5 sound processor. The mean PTA thresholds at 0.5, 1, 2, and 4 kHz in the sound field under different conditions (with the hearing aid alone, with the cochlear implant alone, and with both together) are presented in Table 2. There was no significant difference between the two groups under any of these conditions.
These assessments were carried out to determine whether the two groups differed before one of them received auditory training, and no significant difference was found. Participants' demographic and audiometric characteristics are summarized in Table 2.

Effect of rehabilitation on speech perception test results
The Wilcoxon test was used to compare the test results before and after the intervention in the intervention group. The results are summarized in Table 3. All tests except SRT showed a statistically significant difference between pre- and post-intervention. Table 4 presents the mean and standard deviation of SRT in quiet, SDS in quiet, CV in noise, WIN, and BKB before and after the "spatially separated speech and noise" training sessions.
The ANCOVA was used to compare the second time results between two groups and investigate the effects of the intervention on the results of the speech perception test. In the control group, the second test was performed about 5 weeks after the first test.
According to the findings, a significant difference existed between the groups in the mean threshold score of SRT after the intervention, whereas there was no significant difference between the groups in the mean score of SDS in quiet after the intervention. Moreover, investigation into the effect of the intervention on WIN, CV in noise, and BKB showed a significant difference between the groups in the mean score of these tests after the intervention.

The reliability of the results in the intervention group
The repeated measures test showed that the mean SRT in quiet did not differ significantly across the three measurements (P > 0.05). The mean SDS score in quiet, by contrast, differed significantly across the three measurements (P < 0.05). The Tukey test was used for pairwise comparison of the means. There was a statistically significant difference between the first score (before the intervention) and the second score (immediately after the intervention) (P < 0.05), whereas there was no statistically significant difference between the second score (immediately after the intervention) and the third score (1 month after the intervention) (P > 0.05). The same pattern was found for the CV in noise, WIN, and BKB tests: there was a statistically significant difference across the three measurements and between the first and second scores, but no statistically significant difference between the second and third scores. The results of the reliability assessment are shown in Figs. 1, 2, 3, 4, and 5.

Discussion
[...] [18][19][20][21]. So in the present study, the participants were selected from the age group of 8-12 years.
To evaluate the effect of the rehabilitation program on speech perception, different tests were investigated. When the post-intervention (second-time) results were compared between the two groups (Table 4), the SRT results showed a significant difference. This may reflect a positive effect of auditory training on speech perception, in that the speech recognition threshold of the participants in the program improved compared with the controls. The current study also found a significant difference between the groups in the mean scores of CV in noise and WIN after the intervention (Table 4). The improvement after the auditory training sessions can thus reflect a positive effect of the intervention on the perception of syllables and words in noise and, according to the SRT outcome, in quiet.
This study showed no significant difference between the groups in the mean score of SDS in quiet after the intervention, which can be attributed to inadequate training sessions for this ability (Table 4).
Several studies have sought more effective auditory training methods. Fu and Galvin [22] developed an auditory training method in noise using a phonetic contrast package or a keyword in a sentence package. They reported the effectiveness of bottom-up and top-down auditory training methods in enhancing speech perception in quiet. It seems that sentence training is far more effective [22]. Stacey and Summerfield found that stimuli such as words and sentences are more effective in the auditory training process than stimuli such as phonemes and syllables. They also showed that meaningful words are more effective than nonsense stimuli [23]. In the current study, the mean score of the sentence perception in noise test (BKB) showed a significant difference between the groups after the intervention (Table 4). These results are consistent with the findings of Fu and Galvin [22] and Stacey and Summerfield [23]. All of the studies mentioned above were conducted under optimal hearing conditions in quiet and thus cannot be generalized to hearing in noise, whereas in the current study rehabilitation was investigated in noise, under conditions close to real [...]
Oba et al. [25] found that auditory training and rehabilitation programs can enhance speech perception even in experienced cochlear implant users. In the current study, the positive effects of auditory training could still be observed in some experienced cochlear implant patients.
To eliminate the learning effect, the stimulus lists used to measure speech perception ability differed from the stimuli used in the auditory training program. Nevertheless, the improvement in speech perception after the rehabilitation sessions indicated that the training was effective not only for the familiar words and stimuli used in the training sessions, but also for unfamiliar words and sentences. These results are consistent with the findings of Burk et al. [27]. They investigated the effect of auditory training on word recognition in background noise in young people with normal hearing and in elderly people with hearing impairment, and showed that word recognition in noise improved in both groups after the training. This improvement was observed not only for familiar words or words used in the training sessions, but also for unfamiliar words, indicating that auditory training is effective for the recognition of unfamiliar words [27].
Fig. 2 The results of the repeated measures test showed that the mean score of SDS in quiet had a statistically significant difference in three measurements (P < 0.05). The Tukey test showed that there was a statistically significant difference between the first score and the second score (P < 0.05), while there was no statistically significant difference between the second score and the third score (P > 0.05)
Fig. 3 The results of the repeated measures test showed that the mean score of "Word in noise" had a statistically significant difference in three measurements (P < 0.05). The Tukey test showed that there was a statistically significant difference between the first score and the second score (P < 0.05), while there was no statistically significant difference between the second score and the third score (P > 0.05)
Fig. 4 The results of the repeated measures test showed that the mean score of "CV in noise" had a statistically significant difference in three measurements (P < 0.05). The Tukey test showed that there was a statistically significant difference between the first score and the second score (P < 0.05), while there was no statistically significant difference between the second score and the third score (P > 0.05)
After the training sessions, the participants' attention appeared to have improved, but there was no evidence to support this impression. According to Amitay et al., some of the improvements associated with auditory training may be related to a general enhancement of focus and attention rather than to enhanced auditory processing [28]. It is also believed that auditory training improves the working memory capacity of listeners [25]. Working memory is particularly important for linguistic and developmental skills. Meaningful integration of auditory information requires general processing resources, such as attention, working memory, executive function, and processing speed [29]. As a result, some of the improvement in outcomes after rehabilitation may also be due to increased attention and enhanced memory. It is possible that in the present study the rehabilitation sessions also enhanced working memory and improved the children's concentration, and that part of the improvement in the speech in noise perception tests was due to increased attention and memory. This might also have contributed to the improvement in SRT, but it remains speculative.
In the present study, the speech comprehension tests performed 1 month after the end of rehabilitation yielded results that were not significantly different from those obtained soon after the end of rehabilitation (Figs. 1, 2, 3, 4, and 5). This finding could indicate the stability of the rehabilitation effect.
A 6-month follow-up evaluation of both groups after the end of the program would have been preferable, but it was not feasible, which is a limitation of this study. Such a follow-up is suggested for further studies.

Conclusion
It can be concluded that "Spatially separated speech in noise" auditory training can improve speech perception in noise in bimodal fitting users. This result also generalizes to speech tests that differ from the auditory training materials; in other words, the results of the rehabilitation sessions can be extended to other complex stimuli. In general, this rehabilitation method is useful for enhancing speech perception in noise.