Description of the Normative Study

Participants.   The ENNI sample consisted of two subgroups within every age group: a wide range of typically developing children and children previously identified as having a specific language impairment. Because we want the instrument to be useful for language assessment, we considered it essential to include children with language impairments in the normative sample. If groups with special needs are excluded from the normative sample, data from children in the excluded groups are difficult to interpret: any score they receive that was also received by a child in the normative sample, even one significantly below the mean, is by definition a score obtained by a normally developing child (Ukrainetz McFadden, 1996). Because the norms will be particularly useful for professionals interested in language impairment, special care was taken to include a representative sample of children previously identified as having a specific language impairment. The term “specific language impairment” (SLI) refers to problems in language that are not due to cognitive disorders, general developmental delay, or another identified condition. This definition rules out children who receive services for language impairments but have other conditions, and thus the participants are not representative of the full range of children receiving language services in Edmonton. However, as a first step, we decided to focus on the SLI population to make the best use of our resources. Based on a diverse sample of children from Iowa, the prevalence of specific language impairment has been estimated at 7.4% of the child population (Tomblin, Records, Buckwalter, Zhang, Smith, & O'Brien, 1998). To obtain as representative a subsample as possible, children with specific language impairments were oversampled; the subsample data were then weighted when calculating norms so that these children would not be overrepresented.

Sample size was 50 children with typically developing language per age group (one-year intervals), with equal numbers of boys and girls. The goal for children with language impairments was 15 per age group; due to difficulty in obtaining participants with language impairments, the obtained sample varies by age group from 10 to 17 children per age group. Gender was left to vary in this group; as expected, there were more boys than girls (48 of 77 – 62%) in the group with language impairments. Stories were collected from children ages 4 through 9, for a total of 377 children. Sample information is summarized in Table 1.
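The weighting of the oversampled subgroup can be illustrated with a short sketch. The text does not say which weighting scheme was used, so the post-stratification approach below is an assumption: each child's weight is the assumed population share of the child's group (7.4% for SLI) divided by the group's share of the sample. The function names are hypothetical, and the scores passed to `weighted_mean` would be any per-child measure.

```python
# Illustrative sketch only: the ENNI weighting scheme is not described in the
# text, so simple post-stratification weights are ASSUMED here.

def group_weights(n_td, n_sli, sli_prevalence=0.074):
    """Return per-child weights (w_td, w_sli) such that the weighted share of
    children with SLI equals the assumed population prevalence."""
    n_total = n_td + n_sli
    # weight = population share of the group / sample share of the group
    w_td = (1.0 - sli_prevalence) / (n_td / n_total)
    w_sli = sli_prevalence / (n_sli / n_total)
    return w_td, w_sli


def weighted_mean(td_scores, sli_scores, sli_prevalence=0.074):
    """Weighted mean of a score across both language groups."""
    w_td, w_sli = group_weights(len(td_scores), len(sli_scores), sli_prevalence)
    weighted_total = w_td * sum(td_scores) + w_sli * sum(sli_scores)
    weight_sum = w_td * len(td_scores) + w_sli * len(sli_scores)
    return weighted_total / weight_sum


# Age 4 group sizes from Table 1: 50 TD children and 12 children with SLI.
w_td, w_sli = group_weights(50, 12)
weighted_sli_share = (w_sli * 12) / (w_td * 50 + w_sli * 12)
```

With these weights, the 12 four-year-olds with SLI count for 7.4% of the weighted age group rather than the 19% they make up in the raw sample.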

Schools were randomly selected from areas all across Edmonton to ensure a sample that was representative of the Edmonton population. Children in the school-age range were chosen from children attending Kindergarten through Grade 4 in Edmonton public and separate schools. The younger children were chosen from those attending preschools, daycare centres, and Kindergarten programs in Edmonton. The subsample of children with language impairments was obtained with the cooperation of three sites: a public school serving children with language/learning disabilities; a rehabilitation hospital, which has several programs for children with language impairments; and Capital Health Authority, which serves preschool and school-aged children throughout the city. In all, 34 elementary schools and 13 daycares, preschools and independent Kindergarten programs were visited to collect the data. Data collection was conducted throughout the school year, with care taken to collect data from the full age range throughout the year so that no age group was sampled at a different point in the school year than another.

Table 1.  Number, Age, and Socioeconomic Status Information on the Normative Sample

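| Age Group | Language Group | Total N | N Boys | Mean Age | Age SD | Age Range | Mean SES | SES SD | SES Range |
|---|---|---|---|---|---|---|---|---|---|
| 4 | TD | 50 | 25 | 4.60 | .24 | 4.04-4.97 | 47.38 | 13.58 | 23.70-82.91 |
| 4 | SLI | 12 | 9 | 4.66 | .23 | 4.18-4.97 | 47.17 | 10.80 | 34.45-70.27 |
| 5 | TD | 50 | 25 | 5.51 | .26 | 5.01-5.98 | 46.49 | 12.03 | 24.11-73.38 |
| 5 | SLI | 14 | 8 | 5.41 | .26 | 5.07-5.85 | 46.52 | 12.00 | 25.53-63.64 |
| 6 | TD | 50 | 25 | 6.56 | .29 | 6.04-6.95 | 48.31 | 14.75 | 25.53-101.53 |
| 6 | SLI | 11 | 6 | 6.64 | .26 | 6.13-6.95 | 40.26 | 13.97 | 26.36-60.73 |
| 7 | TD | 50 | 25 | 7.54 | .28 | 7.01-7.98 | 45.13 | 13.65 | 24.11-101.32 |
| 7 | SLI | 13 | 10 | 7.56 | .23 | 7.15-7.92 | 42.42 | 13.30 | 23.70-65.43 |
| 8 | TD | 50 | 25 | 8.58 | .28 | 8.01-8.99 | 45.04 | 11.55 | 23.70-75.87 |
| 8 | SLI | 17 | 10 | 8.70 | .26 | 8.11-8.96 | 42.42 | 7.40 | 32.78-60.73 |
| 9 | TD | 50 | 25 | 9.49 | .28 | 9.02-9.99 | 48.79 | 12.04 | 25.56-80.32 |
| 9 | SLI | 10 | 5 | 9.50 | .21 | 9.10-9.82 | 48.71 | 9.66 | 27.60-60.73 |

Note.  TD = typically developing; SLI = specific language impairment.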

Demographic information was collected on the families of participating children to permit description of the socioeconomic status (using the Blishen scales; Blishen, Carroll, & Moore, 1987) and ethnic composition of the sample. The purpose of collecting this information was to ensure that the sample was representative of the Edmonton population. Socioeconomic information is reported for each age group in Table 1 above. Information on the ethnic backgrounds of the families is reported in Table 2, along with comparison data for Edmonton and Canada.


Table 2.  Ethnic composition of the sample

| Statistics Canada Category¹ | ENNI Sample | Edmonton² | Canada² |
|---|---|---|---|
| Aboriginal | 7.36% | 4.15% | 2.80% |
| Latin American | 2.15% | 1.04% | 0.62% |
| Filipino | 3.07% | 1.64% | 0.82% |
| Chinese | 4.29% | 6.24% | 3.02% |
| Arab and West Asian | 1.23% | 1.24% | 0.86% |
| Southeast Asian | 1.53% | 1.38% | 0.61% |
| Black | 2.76% | 1.70% | 2.01% |
| Korean | 0.31% | 0.29% | 0.23% |
| Japanese | 0.61% | 0.22% | 0.24% |
| Other | 76.69% | 81.93% | 88.79% |
| Total | 100.00% | 99.84% | 99.99% |

¹ The categories are those used on the Canadian census for 2001 for visible minorities.

² Data for Edmonton and Canada are from the 2001 Canadian census.

Materials

Six original picture sets with animal characters were used to elicit stories, two each at three levels of complexity. The stories were controlled in pairs and systematically varied across levels for length, amount of story information, and number and gender of characters. These picture stories were designed to provide a range of narrative complexity. Table 3 provides a summary of the characteristics of the story sets.

To develop the sets, scripts for the stories to be portrayed by the pictures were written by Dubé (2000; Dubé & Schneider, 2001) for her doctoral research investigating the language skills of Deaf children. A panel of narrative experts was asked to comment on the scripts with regard to their narrative structure as well as their appropriateness for children; the stories were revised based on comments from the panel. Black-and-white line drawings were then produced from her scripts by a professional cartoonist. The pictures were then given to the panel of narrative experts as well as to a panel of teachers of Deaf children. Both panels approved the pictures as appropriate for research with children.

Table 3.  Characteristics of the Story Sets

| Story | Number of Episodes | Setting | Number of Characters | Character Description | No. of Pages |
|---|---|---|---|---|---|
| A1 | 1 | Swimming pool | 2 | young female elephant; young male giraffe | 5 |
| A2 | 2 | same | 3 | same as A1 plus adult male elephant lifeguard | 8 |
| A3 | 3 | same | 4 | same as A2 plus adult female elephant | 13 |
| B1 | 1 | Park | 2 | young male rabbit; young female dog | 5 |
| B2 | 2 | same | 3 | same as B1 plus adult female rabbit doctor | 8 |
| B3 | 3 | same | 4 | same as B2 plus adult male rabbit balloon-seller | 13 |

The pictures for each story were placed in page protectors in a binder. Each story was in its own binder.

Comprehension Questions

In addition to assessing Story Grammar knowledge through children’s story productions, three sets of questions were developed to investigate children’s understanding of the Set A stories. The Guided Questions Set consists of Literal and Inferential questions which assess children’s knowledge of the story from the beginning to the end. The questions were derived from the category components of the Story Grammar model. Literal questions could be answered by observing details shown in the pictures; Inferential questions asked about elements not in the pictures.

The second set of questions, Problem-Resolution Questions, asked children to select two of the central components of the story, the Problem and the Resolution. To answer these questions correctly, children must integrate information from the whole story. Asking these questions allows an examiner to determine if a child can demonstrate knowledge of the central story elements.

Importance Judgements, the third set of questions, require children to judge which two parts of the story they considered to be the most important. These questions require children to integrate the story as a whole and reflect on it to make appropriate judgements.

Compared to the storytelling task, which requires children to formulate entire stories while keeping the listener’s needs in mind, the question-answering task provides information about children’s knowledge of the stories under reduced task demands. Table 4 summarizes the questioning tasks, specifying the story elements and relationships evaluated along with the specific ‘wh’ question forms used to evaluate each story element. Specific questions, administration instructions and scoring criteria are on the Analyses pages.

Table 4.  Description of the Three Questioning Tasks

| Question set | Question type | Story elements evaluated | ‘Wh’ question form |
|---|---|---|---|
| Guided | Literal (events in the pictures) | 1) Setting | Who? / Where? |
| | | 2) Initiating Event | What – happen? |
| | | 3) Attempt | What – do? |
| | | 4) Consequence | What – happen? |
| | | 5) Reaction | How? |
| | Inferential (events not in the pictures) | 1) Internal Response | What – thinking? |
| | | 2) Explanations of story characters’ reactions | Why? |
| Problem-Resolution | Integrative, Inferential | 1) Main problem to be solved | What – problem? |
| | | 2) Outcome of story | How? |
| Importance Judgements | Integrative, Inferential | 1) Information considered most important in the story | What – important? |
| | | 2) Information considered the second most important in the story | What – important? |

Procedure

Three research assistants were employed to collect the storytelling data. In addition, the third author (a registered speech-language pathologist) administered the questioning task and the standardized testing.

Each child was seen individually in the child's school, preschool, or daycare, in two sessions. The child was first given a training story, which was similar to the simple stories in the two story sets in terms of length (5 pictures, 1 episode) and number of characters (2). The purpose of the training story was to familiarize the child with the procedure and to allow the examiner to give more explicit prompts if the child was having difficulty with the task, such as providing the story beginning (e.g., “Once upon a time … there was a …”). For the Set A and Set B stories, the examiner was restricted to less explicit assistance such as general encouragement, repetition of the child’s previous utterance, or, if the child did not say anything, a request to tell what was happening in the story.

After the training story, the child then viewed the pictures for each story in turn and was asked to tell the story to the examiner. When presenting the stories, the examiner held the binder in such a way that she could not see the pictures as the child told the story, which meant that the child needed to use language rather than pointing or gesturing if the examiner was to understand the story. The instructions emphasized that the examiner would not be able to see the pictures, so the child would have to tell a really good story so the examiner could understand it.

The examiner first went through all the pages so that the child could preview the story, after which the examiner turned the pages again as the child told the story. The examiner turned the page when the child appeared to be finished telling the story for a particular picture. Administration of the story sets was counterbalanced, with half of the children telling stories from Set A first and the other half telling stories from Set B first. Stories were audiorecorded using JVC minidisk recorders.

In the second session, children participated in the comprehension task involving the pictures in the first set of stories (Hayward, 2003; Hayward & Schneider, 2001, 2003). After that, standardized tests were administered.

The Clinical Evaluation of Language Fundamentals (CELF) was used to collect language information on all participants: the CELF-Preschool (Wiig, Secord, & Semel, 1992) for children under 6 years of age, and the CELF-III (Semel, Wiig, & Secord, 1995) for children aged 6 and over. The full CELF was administered to 29.3% of children with typical language development (TLD); the other TLD children were given 2 subtests of the CELF that are considered “screening” subtests, as well as the Listening to Paragraphs subtest. All children with language impairments were given the full CELF.

Children's story retellings were audiotaped and later transcribed in full using the CHAT transcription system from the CHILDES database (MacWhinney, 2000; MacWhinney & Snow, 1990). The CHILDES database is a collection of transcripts, contributed by many researchers, of primarily children’s language samples in a number of languages. When the investigators have completed all analyses, the transcripts will be donated to the database so that other researchers can use them. CHILDES also provides a system for analysing transcripts using the CLAN program, which was used for the analyses of storytelling. The transcripts were divided into communication units (C-units), each of which consisted either of one independent clause plus any dependent clauses associated with it or of a sentence fragment.

Transcripts were also converted to the SALT format and provided to the Systematic Analysis of Language Transcripts (SALT) program (Miller, 2006) for use with the Profile feature. Profile allows a user to compare an individual transcript to a number of reference databases in the system. An individual transcript of ENNI stories can be compared to the ENNI reference database. The database can be downloaded at no charge from http://www.saltsoftware.com/salt/downloads/referencedatabases.cfm#. However, the SALT program is needed to compare the file to the database.

Reliability measures

Transcription reliability.  Transcripts (CHAT format) were checked against the recordings by the primary investigator before being analysed. A research assistant transcribed 5% of the stories for reliability purposes; word-by-word reliability was calculated to be 97%.

Story grammar scoring reliability.  To determine interscorer agreement, two scorers blind to participants’ group membership scored 20% of the A1 and A3 stories. Cohen's kappa was computed for agreement on each story using the procedure described by Bakeman & Gottman (1986). This measure takes into account differences between scorers on individual scoring categories; it adjusts for the frequencies of the different categories and thus corrects for agreements expected by chance (Bakeman & Gottman, 1986). The kappa was .92 for A1 and also .92 for A3. These kappas are significant at p < .001 and indicate excellent interscorer agreement (Landis & Koch, 1977).
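As an illustration of how Cohen's kappa corrects observed agreement for chance, the following minimal sketch computes kappa for two scorers who each assign one category per item. It is not the study's scoring code; the function name and the example scores (1 = story grammar unit credited, 0 = not credited) are hypothetical.

```python
from collections import Counter


def cohens_kappa(scorer_a, scorer_b):
    """Cohen's kappa for two scorers' category labels on the same items."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("Both scorers must rate the same non-empty item set.")
    n = len(scorer_a)

    # Observed proportion of items on which the two scorers agree.
    observed = sum(a == b for a, b in zip(scorer_a, scorer_b)) / n

    # Agreement expected by chance, from each scorer's category frequencies.
    counts_a = Counter(scorer_a)
    counts_b = Counter(scorer_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both scorers used a single, identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for ten story grammar units from one transcript.
scores_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
scores_b = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(scores_a, scores_b)
```

In this made-up example the scorers agree on 9 of 10 units (.90), but chance agreement is .54, so kappa is about .78, lower than the raw agreement rate.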

Subsequently, a study by Beswick (2008) investigated the reliability of the Story Grammar measure when used by clinicians. Four community speech-language pathologists (S-LPs) were recruited to score a set of transcripts from the ENNI normative sample. The transcripts were randomly selected from the full range of ages in the normative sample; the set included 4 children with language impairment and 14 children with typical language development. The S-LPs received no training in how to score the ENNI; they were given the scoring information that is available on the ENNI website. Their scorings were compared to the original scorings, and intra-class correlations (ICCs) were computed for A1 and A3. Interscorer reliability was determined to be .92 for A1 and .96 for A3. Thus Story Grammar scoring appears to be reliable without training.

First Mentions scoring reliability.  To check First Mentions reliability, two scorers scored 20% of the transcripts (entire story sets from 20% of the children, randomly chosen). A Cohen's kappa of .85 was obtained, indicating excellent reliability (Landis & Koch, 1977).

Validity

Story grammar.  To estimate concurrent validity of the Story Grammar measure, scores obtained from the ENNI data were correlated with CELF scores. Correlations of CELF scores with ENNI Story Grammar scores are reported in Table 5.

Table 5.  Correlations between ENNI Story Grammar scores and CELF scores.

All correlations (Pearson's r) are significant at p < .01 (two-tailed).

| CELF-P | Receptive Language* | Expressive Language* | Total Score* | Linguistic Concepts** | Recalling Sentences in Context** |
|---|---|---|---|---|---|
| Simple story A1 | .48 | .46 | .49 | .38 | .26 |
| Complex story A3 | .63 | .71 | .70 | .54 | .49 |

| CELF-3 | Receptive Language* | Expressive Language* | Total Score* | Concepts and Directions** | Recalling Sentences** |
|---|---|---|---|---|---|
| Simple story | .34 | .43 | .38 | .31 | .33 |
| Complex story | .46 | .45 | .45 | .39 | .37 |

| All children | Receptive Language* | Expressive Language* | Total Score* | Subtest 1*** | Subtest 2*** |
|---|---|---|---|---|---|
| Simple story | .37 | .44 | .41 | .32 | .30 |
| Complex story | .51 | .53 | .52 | .43 | .40 |

*Composite scores – available for a portion of the Typical Development group and the entire Specific Language Impairment group.

**These subtests were given to all children.

***Subtest 1 was Linguistic Concepts (4 and 5 year olds) or Concepts and Directions; Subtest 2 was Recalling Sentences in Context (4 and 5 year olds) or Recalling Sentences.
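For readers who want to see how a concurrent-validity coefficient of this kind is obtained, the sketch below computes Pearson's r between two lists of paired scores. The function name and the score values are hypothetical and are not taken from the ENNI data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need two equally long lists with at least 2 pairs.")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("Correlation is undefined when a variable is constant.")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired scores: Story Grammar totals and CELF composite scores.
story_grammar = [5, 7, 8, 10, 12, 9]
celf_total = [80, 85, 95, 100, 110, 92]
r = pearson_r(story_grammar, celf_total)
```

Each pair must come from the same child, which is why the composite-score correlations can only use children who received the full CELF.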

First Mentions validity.  Scores from the First Mentions scoring were correlated with CELF scores to estimate concurrent validity. Correlations between CELF scores and First Mention scores are reported in Table 6.

Table 6. Correlations of ENNI First Mentions Scores with Clinical Evaluation of Language Fundamentals (CELF) Scores

| Test | N | Receptive Language | Expressive Language | Total Language |
|---|---|---|---|---|
| CELF-Preschool Composite Scores | 60 | .71 | .66 | .74 |
| CELF-III Composite Scores | 105 | .51 | .50 | .53 |
| All CELF Scores (Preschool + III) | 165 | .58 | .55 | .58 |

| Subtests | N | Subtest 1* | Subtest 2** |
|---|---|---|---|
| CELF-P subtests | 126 | .59 | .47 |
| CELF-III subtests | 251 | .48 | .52 |
| All CELF subtests (Preschool + III) | 377 | .51 | .50 |

Note.  All correlations were significant at p < .000001.

* Subtest 1:  Ages 4-5, CELF-P Linguistic Concepts; Ages 6-9, CELF-III Concepts and Directions

**Subtest 2:  Ages 4-5, CELF-P Recalling Sentences in Context; Ages 6-9, CELF-III Recalling Sentences