Development of a test for assessment of the lipreading ability for children in the Arabic-speaking countries

Lipreading is an important skill that varies considerably among normal-hearing (NH) and hearing-impaired (HI) children. It is well known that NH children use audition as the primary sensory modality for speech perception, whereas HI children rely primarily on lipreading cues. Moreover, speech perception is a multisensory process that involves attention to auditory signals as well as visual articulatory movements, and the integration of auditory and visual signals occurs naturally and automatically in normal individuals of all ages. Most research indicates that lipreading is a natural and important skill needed for language acquisition in HI children. Lipreading also helps HI children to perceive speech, acquire spoken language, and acquire phonology. In the Arabic language, tools for assessing the lipreading ability of HI children are lacking, so this study was conducted to develop a test suitable for assessing the lipreading ability of HI children in Arabic-speaking countries. The constructed lipreading test was administered to 160 Arabic-speaking Egyptian children, including 100 typically developing NH children and 60 HI children. Participants’ responses were statistically analyzed to assess validity and reliability and to compare the lipreading ability of the NH and HI children. Percentile ranks were established to provide an estimate of lipreading ability in children. Statistically significant differences were found between the NH and HI children in all subtotal and total scores of the Arabic lipreading test, and the test showed good validity and reliability. The Arabic lipreading test is a valid and reliable test that can be applied to assess lipreading ability among Arabic-speaking children with HI.


Background
Over recent years, studies have provided evidence that speech perception is multimodal. It does not involve the auditory modality alone; it also involves the processing of phonetic components across different channels even when the auditory information is intact, as demonstrated by the McGurk effect. McGurk and MacDonald [1] emphasized that the perception of multimodal speech is mandatory rather than optional and provided strong evidence that hearing people use visual speech perception cues (lipreading) if they are available. Lipreading is considered an important part of speech processing that requires the extraction of visual speech information from the visible movements of the lower face, especially the jaws, lips, tongue, and teeth [2], as well as from the movement of the extra-oral facial areas (e.g., cheeks, nose, eyes) [3].
Lipreading is a natural skill in hearing people that starts to develop during the early infancy of normal-hearing individuals. Several studies found that once infants reach the babbling stage, at around 4 to 6 months of age, they become more interested in speech production and begin directing their attention to the audiovisual speech cues located in a talker's mouth [4]. Moreover, current neuroimaging data support the premise that speech perception is multimodal and that information from different modalities is integrated early in speech processing [5]. Numerous functional brain imaging studies have demonstrated that visual information about speech enhances and facilitates auditory recognition of speech in both normal-hearing and hearing-impaired populations [6]. Bernstein and Liebenthal [7] reported that the neural circuitry of lipreading includes supra-modal processing regions, especially the superior temporal sulcus, as well as the posterior inferior occipital temporal regions, including regions specialized for the processing of faces and biological motion. Better lipreading skill is associated with greater activation of the left superior temporal sulcus in hearing people [8].
Lipreading ability, like most human skills, shows great individual variability in both hearing and hearing-impaired populations [9,10]. The factors behind this variability include intelligence, age, degree of hearing loss, gender, visual memory, and other factors that relate to some aspect of visual or cognitive processing. The contribution of these factors to the development of lipreading skills has made it difficult to develop a reliable and appropriate measurement or test to assess lipreading in children. Moreover, the assessment of lipreading in children, especially those with hearing impairment, faces a problem concerning the linguistic content. Green and Holmes [11] found that both hearing-impaired and hearing groups performed significantly better receptively at the word level than at the sentence or phrase level, and that nouns were easier to identify than adjectives and verbs. This finding was consistent across word, phrase, and sentence levels. Also, little is known about the relation between the degree of hearing loss and lipreading ability [9]. However, evidence indicates that the lipreading skills of hearing-impaired children may provide fundamental information for phonological processing in speech perception [12][13][14][15].
Several lipreading tests have been developed using different materials, manners of presentation, manners of response, and scoring. Most of these tests targeted the English-speaking population, and most of them were developed in the United States of America (USA), such as the Craig lipreading inventory [16] and the Test of Child Speech Reading "ToCS" [17]. To our knowledge, there is no standardized valid lipreading test for the Arab population, specifically in the Middle East countries. Arabic is the liturgical language of 1.8 billion Muslims across the Middle East, North Africa, and the Horn of Africa. Arabic and its different dialects (the Egyptian, Maghrebi, Sudanese, Arabian Peninsula, Mesopotamian, and Levantine dialects) are spoken by around 422 million speakers (native and non-native) in the Arab world as well as in the Arab diaspora [18]. Modern Standard Arabic is the literary form of Arabic, and it is one of the six most spoken languages in the world [19]. It is the official language in 22 countries and is widely taught in schools and universities and used in media and governments across the Arabic countries. Modern Standard Arabic includes approximately 28 consonant and 6 vowel phonemes [20,21].
The need to measure speechreading skills accurately in Arabic-speaking children as part of the assessment of speech perception was the motive of this study. Thus, this study aimed to develop an Arabic test that assesses lipreading ability using the Modern Standard Arabic language. The test was standardized on normal-hearing Egyptian Arabic-speaking children in the age range between 3 and less than 8 years old, so that it provides normative data on the lipreading development of normal-hearing children. The study also aimed to develop an objective tool to measure the lipreading performance of hearing-impaired children. This is because lipreading is an important facilitator of progress in speech reading, auditory, and linguistic abilities, which are the targets of the therapy program for HI children.

Participants
This study was applied to 160 Arabic-speaking Egyptian children aged between 3 and < 8 years, who were divided into 2 groups. Group I included 100 normal-hearing and typically developing Arabic-speaking children who were recruited from different nurseries and schools of Cairo governorate in Egypt and represented different social classes. Children selected for group I were divided into 2 age groups: NH between 3 and < 5 years and NH between 5 and < 8 years, with 25 children in each subgroup.
Group II included 60 HI children who had pre-lingual bilateral sensorineural hearing loss (SNHL) of moderate (40-70 dB), severe (70-95 dB), or profound (95 dB or more) degree. Children in group II were selected by convenience sampling from the outpatient clinics of the phoniatrics units of Ain-Shams University hospitals (Al-Demerdash hospital and Ain-Shams University Specialized Hospital). Children in group II were fitted with hearing aids and/or cochlear implants, had been receiving regular speech therapy for a minimum of 1 year, and had a language level sufficient to go through the tested items.
All children in the NH and HI groups were checked to have normal or corrected-to-normal vision. Children had average IQ and normal mental age as assessed by the screening test of the Stanford-Binet intelligence scale, Arabic version (5th edition) [22]. The Modified Preschool Language Scale-4th edition (the Arabic edition) [23] was used to determine their language age.

Test design
The Arabic lipreading test "ALRT" is a live-voice lipreading test designed to be suitable for use with hearing and hearing-impaired Arabic-speaking children aged between 3 and < 8 years old. This test requires a picture-pointing response in which the child chooses the target word/sentence from 4 pictures (the target and 3 distractors). The target words/sentences were articulated in the Modern Standard Arabic language to suit most Arabic-speaking children. They were presented as bright-colored pictures familiar to the selected children. This test consists of 2 subtests that measure lipreading skills at 2 psycholinguistic levels: words and sentences.
The selected words and sentences of the ALRT were checked to be within the vocabulary knowledge of both hearing and hearing-impaired children. The words were selected from the first words acquired by children from the age of 1 year and the high-frequency words commonly used early by Arabic-speaking hearing children. A pilot study was then conducted with 15 normal-hearing and 12 hearing-impaired children selected randomly in the same age range as the test (3-< 8 years) to ensure that the selected items were familiar to both groups and that there was no difficulty in understanding the test instructions. It was found that most children had difficulty in recognizing one word and relating it to its picture, so this target word was replaced by another one.

Word subtest
There were 68 items in the word subtest (34 items in each form). For each item, the child was allowed to look at the examiner while s/he silently articulated the target word, and the child responded by pointing to the corresponding picture among 4 pictures (the target and 3 distractors) presented to him/her. The words in this subtest were selected to represent the 34 Arabic phonemes (28 consonants and 6 vowels). The words that represented consonants were selected to have these consonants in the initial position of the word, while the words that represented vowels were selected to be CVC or CVSS words with the vowel placed in the medial position of the word. According to the viseme classification of Modern Standard Arabic by Damien et al. [24] (Fig. 1), the word subtest was designed to have 2 levels of difficulty: form A and form B. Form B is considered to be the more difficult one because of the distractor words used. Distractors in form A were chosen to begin with a consonant different in place of articulation (different viseme) from that of the target word. For example, the word (/ʔaesaed/, lion) in form A, which begins with the viseme /ʔ/, was put among 3 distractors, namely (/korsi:/, chair), (/bɑtˤtˤɑh/, duck), and (/ri:ʃaeh/, feather), which start with the different visemes /k/, /b/, and /r/, respectively. In form B, 2 of the distractors were chosen to begin with the same consonant as the target word (same viseme) but with a different following vowel, and the other distractor begins with a consonant that is different in place of articulation from the target word (different viseme). For example, in form B, the target word (/ʔaesaed/, lion) was put among 3 distractor words; 2 of them are the word (/ʔodˤɑh/, room) and the word (/ʔebraeh/, needle), which start with the same viseme /ʔ/ as the target word, and the other distractor word was (/fɑrɑ:ʃɑ:h/, butterfly), which starts with the /f/ sound, a viseme that is different from that of the target word.
Both forms A and B were applied to all the children and their scores were included in the total test score.
A practice item was used at the beginning to ensure that the child had understood the procedure (Additional file 1: Appendix 1). The list of words of the items in the word subtest (form A and B) can be found in Additional file 1: Appendix 2.

Sentence subtest
The sentence subtest of the ALRT includes 5 items. The response task in the sentence subtest is the same as in the word subtest, but the stimulus is a subject-verb sentence describing one picture from among 3 pictures (the target sentence and 2 distractors) presented to the child. The subject of these sentences was chosen to be either (/waeled/, a boy) or (/bent/, a girl). In each item, one of the distractor sentences had the same subject as the target sentence but a different verb, and the other distractor sentence had the same verb but a different subject (Additional file 1: Appendix 3).

Test application
In this study, the test was applied to all children by one examiner. The examiner was a 26-year-old female Egyptian phoniatrician who was a native Arabic speaker. The examiner had clear, normal articulation and a normal speech rate. To ensure that other examiners can administer the test properly, test instructions were set: the examiner should not have any speech/articulatory problems and should have normal clear articulation without any obstacles that hinder the ability to visualize his/her lips (e.g., big mustaches or beard). The examiner should speak at a normal speech rate (neither too quickly nor too slowly).
In a preparatory session before the application of the ALRT, the examiner showed the child prints of all the target words and sentences (placed among others) and asked the child to point at them when named. This was to check that the child had these words in his/her passive vocabulary and to avoid bias from unfamiliarity with the meaning of a word or sentence. The child had to identify all the screened pictures, at least receptively, to be allowed to proceed to the ALRT. A list of these screening words can be found in Additional file 1: Appendix 4.
Following this, the examiner and the child were seated facing each other, about 1 m apart and at nearly the same level, in a quiet, well-illuminated room. Instructions were given to the child as follows: "Can you see these pictures? (while pointing at one row of pictures), I am going to say one word/sentence silently, and you are going to point to the word/sentence I silently said".
Thus, the child was supposed to use his/her lipreading ability to recognize the target word/sentence the examiner had uttered without sound. The examiner's mouth was free from obstacles and the utterance was produced at a normal speed. The target words or sentences could be repeated up to three times if the child so required during the procedure; after that, the child scored 0 on any item s/he did not recognize.

Test timing and scoring
The average time for application of the ALRT was estimated to be 15-20 min, depending on the child's age and cooperation. The total test score was 73 grades: 68 grades for the word subtest and 5 grades for the sentence subtest. The scores were calculated by giving the child 1 grade for each correct answer in the word (forms A and B) and the sentence subtests. Incorrect answers received a score of "0" in both word and sentence subtests. Percentile ranks (25th, 50th, 75th) of the subtest and total scores were calculated for the two age groups of the NH children. Accordingly, the lipreading ability of the child is estimated as follows: poor (scores < 25th percentile), fair (scores ≥ 25th-< 50th percentiles), good (scores ≥ 50th-< 75th percentiles), and excellent (scores ≥ 75th percentile).
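The scoring rule and the percentile bands described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the cut-off values used in the usage note are invented placeholders, not the normative values of the test (those are reported in Table 3).

```python
from bisect import bisect_right

WORD_ITEMS = 68      # 34 items in form A + 34 items in form B
SENTENCE_ITEMS = 5   # items in the sentence subtest


def total_score(word_correct, sentence_correct):
    """One grade per correct answer; the maximum is 68 + 5 = 73."""
    if not 0 <= word_correct <= WORD_ITEMS:
        raise ValueError("word score out of range")
    if not 0 <= sentence_correct <= SENTENCE_ITEMS:
        raise ValueError("sentence score out of range")
    return word_correct + sentence_correct


def classify(score, p25, p50, p75):
    """Map a score to poor/fair/good/excellent using percentile cut-offs.

    p25, p50 and p75 are the normative cut-offs for the child's age group
    (placeholders here; the real values come from the normative table).
    """
    labels = ["poor", "fair", "good", "excellent"]
    # bisect_right counts how many cut-offs are <= score, so a score equal
    # to a cut-off falls into the higher band, matching the ">=" rule.
    return labels[bisect_right([p25, p50, p75], score)]
```

With hypothetical cut-offs of 40, 50, and 60, a total score of 39 would be rated poor, 40 fair, 55 good, and 60 excellent.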

Test reliability
The reliability of the total score and of the score of each subtest of the ALRT was estimated by retesting 30 children randomly selected from the participating children. Each child was tested twice, with a 2-week interval between the 2 assessments.

Statistical measures and analysis
The data were statistically analyzed with the Statistical Package for the Social Sciences (SPSS) for Windows, version 24. Mean and standard deviation values were used to describe the data. The paired t test was used to compare the mean total score and the scores of each subtest of the ALRT between the 2 age groups of the NH children and to compare the test scores between the NH and the HI children (significance at p < 0.05). It was also used to compare the scores of forms A and B to determine the difference between the 2 forms (significance at p < 0.05). The Pearson correlation coefficient was used to correlate the chronological age of the participating NH and HI children with the total test score, the scores of the word subtest in forms A and B, and the score of the sentence subtest (significance at p < 0.01). Cronbach's α was used to analyze the test-retest reliability and the internal consistency between all the ALRT subtests. Reliability was considered high if Cronbach's α > 0.75. The validity of the ALRT was analyzed using construct validity. Principal component analysis was used to identify the principal components that contributed to the validity of the test. These components were constructed from possibly correlated variables.

Results
Table 1 shows the demographic data of the participating children. Data of the ALRT, including means and standard deviations for the total score, the score of the word subtest (forms A and B), and the sentence subtest of the NH and HI children, are reported in Table 2. Percentile ranks and the estimation of the Arabic lipreading test's scores among the NH children are shown in Table 3. When the results of forms A and B were compared, there was a statistically significant difference (paired t test; significance at p < 0.05) between the scores of forms A and B of the word subtest (Table 4). This indicates that the two forms differ in their difficulty levels, form A being the easier.
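For readers who want to see what the paired comparison of forms A and B involves, the following is a minimal sketch of the paired t statistic using only the Python standard library. It is not the SPSS procedure used in the study; the function name and the scores in the usage note are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired samples.

    Each position in `first` and `second` must belong to the same child,
    e.g. the child's form A score and form B score.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1
```

For example, with the invented scores `[30, 28, 31, 25, 29]` for form A and `[24, 25, 27, 20, 26]` for form B, `paired_t` returns a positive t with 4 degrees of freedom, meaning form A scores were higher on average. The p value would then be read from a t distribution; a statistics package such as SciPy (`scipy.stats.ttest_rel`) reports it directly.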

Results of test application
By comparing the subtotal and total test scores of the two groups, a statistically significant difference (paired t test; significance at p < 0.05) was found between the NH and HI children. This revealed that the NH children were better lipreaders than the HI children (Table 5). There was a significant positive correlation (Pearson correlation coefficient; significance at p < 0.01) between the chronological age of the participating NH and HI children and their total test scores, scores of forms A and B, and scores of the sentence subtest (Table 6).
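The age-score relationship reported above is summarized by the Pearson correlation coefficient. As a hedged illustration of how that coefficient is obtained, the sketch below computes it from first principles; the function name and the data in the usage note are hypothetical and do not reproduce the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / sqrt(sxx * syy)
```

Called with ages in years and the corresponding total scores, a value close to +1 indicates that older children tend to score higher, while a value near 0 indicates no linear relationship.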

Test reliability and validity
The high Cronbach's α value for test-retest (0.878; threshold > 0.70) indicates the high reliability of the ALRT in assessing lipreading ability. The high internal consistency between all the ALRT subtests is shown by the high value of the Cronbach's α reliability analysis (0.878; threshold > 0.70) (Tables 7, 8, and 9). Two tests were used to check the suitability of the ALRT data for factor analysis: the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity. The KMO index, which compares the sizes of the observed correlation coefficients to the sizes of the partial correlation coefficients for the sum of analysis variables, was 79.2%. This was considered adequate because it clearly exceeds 70%. Also, the null hypothesis of Bartlett's test of sphericity was rejected at a significance level of p < 0.0005 (approx. chi-square = 464.409). Consequently, the correlation coefficients are not all zero, so the second condition for factor analysis is satisfied. As both conditions for conducting factor analysis are satisfied, this indicates the high validity of the ALRT.
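Cronbach's α, on which the reliability figures above rest, follows the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the summed score). The sketch below is a plain Python illustration of that formula, not the SPSS routine used in the study; the function name and the layout of the input (one row of component scores per child) are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of scores.

    `rows` holds one list per child, each with the same k component scores,
    e.g. [form A, form B, sentence subtest] or [test, retest].
    """
    if len(rows) < 2:
        raise ValueError("need at least 2 children")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every row needs the same k >= 2 components")
    # Sample variance of each component across children
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each child's summed score
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("summed scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When the components rise and fall together across children, the variance of the summed score is large relative to the component variances and α approaches 1; unrelated components push α towards 0.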

Discussion
The main aim of this study was to develop an Arabic lipreading test (ALRT) to be used for the assessment of lipreading ability in hearing-impaired Arabic-speaking children. The normative data provided by this test indicate that lipreading skill is directly related to the chronological age of the participating NH and HI children, at least within the test's age range. This indicates that lipreading ability improves with age, like almost all cognitive abilities, as reported by Evans [25]. This is also supported by Tye-Murray et al. [26], who showed that children's lipreading ability is not fixed but rather improves between 7 and 14 years of age, and that lipreading is a skill that continues to develop as long as language skills are developing [27]. However, another study by Woodhouse [28] found that although older hearing and hearing-impaired children tended to have better speech-reading skills, being young did not preclude good speech-reading skills, suggesting that strength in the skill may not necessarily be acquired through maturation.
Moreover, although lipreading cues are known to be the primary sensory modality for speech perception in HI children, this study indicated that the NH children are better lipreaders than the HI children in the same age range. This may be attributed to the delay in language development in HI children in comparison to their normal-hearing peers. Although those children are all fitted with proper hearing devices, the shorter duration of hearing experience after being fitted with their devices inevitably leads to lower language acquisition levels than in normal children. It is clear that HI children's lipreading ability is tied to their language ability. In fact, the tie may be circular: while language ability affects lipreading, the latter is a contributing factor in language acquisition.
The ALRT was designed to obtain normative data about the development of lipreading ability among normal-hearing Arabic-speaking children aged 3 years to < 8 years. While developing an instrument that aims to measure lipreading skills, many challenges related to the factors affecting lipreading skills were considered. Among these challenges were children's linguistic and language abilities. It was important to ensure that children's responses were exclusively due to lipreading and not deduced from any context. For this purpose, the ALRT is built in a manner that suits the Arabic culture and uses simple images for the target words/sentences familiar to any Arabic-speaking child. Moreover, children with typical language development who had a reasonable size and type of vocabulary knowledge were included. This was to ensure that the participating children were aware of the meaning of the words/sentences used in this test, thus avoiding any bias in the test results. Moreover, before applying the test, all children were screened with a picture list that includes prints of all the target words/sentences of the test. This was to check that the child had these words in his/her passive vocabulary and to avoid bias from unfamiliarity with the meaning of a word or sentence. This screening list was added to the test material, and the child should identify all the screened pictures (at least receptively) to be allowed to proceed to the whole test.
Another challenge faced while developing this test was the method of presenting the test. The literature mentions two methods of presenting lipreading tests: face-to-face live presentation or previously recorded video. Each of these methods has its advantages and disadvantages. The face-to-face "live" method has the advantage of the great fidelity of three-dimensional life situations. It allows the children to focus their attention more on the speaker's lip movements than on the animated moving pictures used in video-to-picture tests, although it is vulnerable to variations related to the examiner's features and the surrounding environment. Nevertheless, moving pictures cannot approach the real-life situation, and even at an optimal level of approximation, the situation is never as good as a well-conducted natural situation [29]. In this study, the face-to-face "live" method was selected, and lipreading ability was tested by only one examiner. Thus, to avoid compromising the generalizability and replicability of the test results if the test is administered by other examiners, instructions for administering the test were set. That is to say, the test's examiners should have correct pronunciation with clear articulation without any obstacles that hinder the ability to visualize his/her lips (e.g., big mustaches or beards). The test examiner should speak at a normal speech rate (neither too quickly nor too slowly).
Lipreading ability can be measured at many different psycholinguistic levels, such as the word, phrase, sentence, or connected speech. These different levels lead to variability within and between the assessed individuals [30]. The ALRT is a comprehensive test of lipreading ability in children that was designed to assess the comprehension of words/sentences through lipreading. Thus, the items of the word subtest were selected to represent all the Arabic consonant and vowel phonemes in the target words. The target of this test is not identifying the phonemes but rather recognizing the target words/sentences using the visual cues of the lipread phonemes. The identification of words requires that the perceiver has a sufficiently detailed lexicon to distinguish a word, whether by phonetic or semantic features. While the elements of an utterance need to be perceived efficiently, this may not be sufficient to ensure an understanding of the utterance as a whole. On the other hand, the identification of a phrase or sentence requires good working memory [31]. The ALRT requires a non-verbal picture-pointing response. This makes the application of the test simpler and easier, specifically when it is applied to HI children who are non-verbal and/or have lower linguistic abilities. The ALRT requires 73 non-verbal picture-pointing responses from the examined child, including both the word and sentence subtests. This large number of responses reduces the influence of chance responses, so that a higher number of correct responses indicates better lipreading ability. The word subtest consists of two different levels to avoid ceiling effects. Accordingly, a higher number of correct responses in form B (the difficult form) could also indicate better lipreading ability. It may also indicate the effectiveness of a lipreading program if the test results before and after the training are compared.
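The point about chance responses can be made concrete with a small calculation. The sketch below assumes purely random pointing with 4 pictures per word item and 3 pictures per sentence item, as the subtest descriptions state; the function names are hypothetical, and the figures are a back-of-the-envelope illustration rather than results reported by the study.

```python
from fractions import Fraction
from math import comb

WORD_ITEMS, WORD_CHOICES = 68, 4          # target + 3 distractors
SENTENCE_ITEMS, SENTENCE_CHOICES = 5, 3   # target + 2 distractors


def chance_expected_total():
    """Expected total score if every response were a random guess."""
    return (Fraction(WORD_ITEMS, WORD_CHOICES)
            + Fraction(SENTENCE_ITEMS, SENTENCE_CHOICES))


def prob_at_least(k, n, p):
    """Binomial probability of k or more correct out of n items by guessing."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

Under these assumptions, `chance_expected_total()` is 56/3, or about 18.7 of 73 grades, and `prob_at_least(34, 68, 0.25)` shows that getting half of the word items right by guessing alone is very unlikely.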
To our knowledge, the ALRT is the first standardized Arabic lipreading test in the Middle East that suits Arabic-speaking children. However, Bassiouny et al. [32] mentioned a trial Arabic lipreading test used to assess lipreading abilities among children with cochlear implants. Nevertheless, that test was just a trial that has not been standardized or published to date. Besides that, it was just an Arabic translation of the Craig lipreading test without any modification of the words or the sentences to suit the Arabic language and culture. It involved words that were rather difficult and somewhat unfamiliar for the ages and expected vocabulary size of the tested children, and its sentences were also rather long and somewhat complicated. This was a motive to present a short, easily applicable lipreading test with pictures familiar to most Arabic-speaking young children.

Conclusions
The ALRT is a reliable and valid tool that assesses the lipreading ability among children. Lipreading ability appears to be directly related to age and the NH children are better lipreaders than the HI children.