Transcribing the Stories


Transcription greatly facilitates the analysis of the stories and is necessary for using the stories as a language sample. (If you are only interested in Story Grammar and/or First Mentions, it is possible to score without transcription, by listening to the audiotape.)

The easiest way to transcribe is directly into a computer as you listen to the audiotape. This makes transcription much easier, and facilitates making changes to the transcript. If you have videotaped the session as well, you can then check it against the videotape and easily add nonverbal context that might be useful, such as pointing at pictures. Make sure that your transcription is as accurate as possible. Be sure to check it while listening to the tape once you have transcribed it in full.

There are programs available that facilitate transcription and analysis of language transcripts. Instructions for transcribing with two of these programs are included in this manual. The Systematic Analysis of Language Transcripts or SALT (Miller, 2002) is available for purchase from the Waisman Center at the University of Wisconsin; a trial version is also available at that site. SALT has easy-to-use menu-based analyses of typical language sample measures such as MLU, numbers of words, different words, utterances, and mazes. The second program is actually a set of "tools" associated with the CHILDES database (MacWhinney, 2000; MacWhinney & Snow, 1990). The database contains transcripts of children’s language collected by a number of researchers and donated to the database; the files can be used for free by anyone with proper referencing of the original researchers. Along with the database are methods for transcribing and analysing one’s own files. CHAT is the set of conventions that must be used for transcribing, while CLAN is a set of commands for analysing the data. CHAT and CLAN are also free. CHILDES can be accessed at

The use of language analysis programs such as CHAT or SALT makes both transcription and analysis of your samples much easier. In particular, the analyses that involve counting (e.g., Number of Words, Number of Different Words, Mean Length of Communication Units) are tedious and even impractical without the use of a language analysis program. One analysis, "D", is possible only through the use of CLAN.

Determining utterance boundaries

Transcripts in the ENNI normative sample are divided into C-units. C-units are basically sentences (clausal units) as opposed to "utterances", the basis for transcription with very young children. We are interested in clausal units for syntactic analysis with older children.

When deciding where the boundaries of C-units lie, it is necessary to identify independent (or main) clauses versus dependent clauses. Certain conjunctions ("co-ordinating conjunctions") such as and, but, and then can begin independent clauses; clauses that begin with because, although, etc. ("subordinating conjunctions") are considered to be dependent clauses and thus are transcribed in the same sentence as the independent clause they go with. A clause is dependent if it would not be considered a complete sentence with the conjunction. Example:

She saw the airplane.

And she wanted to play with it, although it wasn’t hers.

Note that intonation does not determine utterance boundaries. The above two sentences could have been said as one intonation unit, that is, without sentence-final intonation until "hers". However, it would still be two utterances on syntactic grounds. (Children sometimes tell entire stories without sentence-final intonation until the end.)

Measures of Sentence Length: MLCU versus MLTU

There are two measures of sentence length commonly used with school-aged children: Mean Length of Communication Unit (MLCU) and Mean Length of T-Unit (MLTU). T-units (terminable units) are sentences that can stand alone. That means that one T-unit is one independent clause plus any dependent clauses that are contained in or attached to it. The following do not count as T-units: utterances such as 'Yes,' 'Okay,' 'Please,' 'the more the merrier,' partial sentences in response to a question-- generally, verbless utterances.

Communication units (C-units) differ from T-Units in that they include partial sentences as well as full sentences. Using C-units allows us to include all of the child’s story-related utterances rather than excluding utterances that were not complete sentences. The focus is still on sentences that can stand alone, but partial sentences are not excluded. That is why C-units were chosen as the unit for the ENNI sample.
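As a rough illustration of the measure, MLCU can be thought of as the total number of words divided by the number of C-units. The sketch below assumes that length is counted in words, that each C-unit is already on its own line, and that mazes have been removed; the function name mean_length_cu is hypothetical and is not part of CLAN or SALT.

```python
def mean_length_cu(c_units):
    """Return the mean length of C-unit, counted in words.

    Assumes each string is one already-segmented C-unit with mazes removed.
    """
    lengths = [len(u.split()) for u in c_units if u.strip()]
    if not lengths:
        raise ValueError("no C-units to average")
    return sum(lengths) / len(lengths)


sample = [
    "she saw his plane.",               # 4 words
    "and she wanted to play with it.",  # 7 words
    "but he didn't like that.",         # 5 words
]
print(round(mean_length_cu(sample), 2))  # 16 words / 3 C-units
```

A language analysis program remains the practical route for real samples; the sketch only makes the arithmetic concrete.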

What is a clause?

Clauses can be independent or dependent.

  • Independent (also called main): A clause that can stand alone as a complete sentence.
  • Dependent (also called subordinate): Not a sentence that can stand on its own, but a phrase containing a verb. [See the next pages for a (nonexhaustive) list of dependent clause types.]

Sentences can be complex or not complex (simple or conjoined).

Complex sentence: An independent clause plus one or more dependent clauses (transcribed as one C-Unit).

Simple sentence: A single independent clause (also one C-Unit).

Conjoined (coordinated) sentences are 2 independent clauses -- thus they are counted as 2 C-units, even if they could normally be written as one sentence. They are transcribed on separate lines. Example:

She saw his plane.

And she wanted to play with it.

But he didn’t like that.

Note: Only certain conjunctions ("coordinating conjunctions") such as and, but, and then can join independent clauses; clauses that begin with because, although, when, etc. ("subordinating conjunctions") are considered to be dependent clauses. A clause is dependent if it would not by itself be considered a complete sentence with the conjunction. Examples:

He was mad because she dropped his plane.

When she saw the lifeguard, she went over to him.

"He was mad" and "she went over to him" could stand alone, but "because she dropped his plane" and "When she saw the lifeguard" could not, and thus are dependent clauses.

Consider direct quotations that contain clauses to be complex sentences, which means that they are single utterances. For example, a sentence such as She said, "Wait!" should be considered a complex sentence, with wait the dependent clause. Thus it is one utterance. However, if there are subsequent quoted sentences, as in, She said, "Wait! Don't do that! You'll get hurt!", the subsequent sentence(s) (Don’t do that and You'll get hurt in this example) is/are independent and thus should be transcribed on separate lines.

Identifying Clauses

Complex sentences: Main clause plus any of the following types of dependent clauses. Note that the normative tables for subordination index (SI) only require that you count the number of clauses; the types are described below to help you to identify C-units for transcription and to count the number of clauses if you want to calculate SI. (See the section on syntactic analyses for a description of SI.)

Types of Dependent Clauses

Subordinate clauses:

- adverbial clause

He was angry because she had dropped it.

Although it was an accident she ran away.

When she tried to fly it, the plane fell in the water.

- relative clause (wherever it occurs)

The elephant, who was very large, could move very fast.

The lifeguard used the net that was by the pool.

- sentences in which full sentences serve as noun phrases

I think it's right.

She said (that) I can do it.

- sentences with wh-clauses (wh- words as well as how, if, and like)

She doesn't know where she's going.

That's how it happened.

She wanted to see if she could fly it.

Do it like he does (it).

- Direct quotations

Sentences containing a direct quotation that consists of at least one clause.

She said, "Don’t do that."

NOT: She said, "Hello." or She said, "Not that."

DO count "Thank you" and "You’re welcome" as clauses.

- appositive clause

The answer, whether or not we like it, is complex.

Note: An appositive is only a clause if it contains a verb. Otherwise, it is an appositive phrase and not a clause, as in "The answer, short and sweet...".

Non-finite Clauses:

- infinitive clause

He really wanted to get the plane.

He got the ball for her to have.

- unmarked infinitive clause

She let the ball fall in the water.

She made the plane fall in.

Let’s go.

- Wh- infinitive clauses (count as one dependent clause, not two)

Tell me when to start.

I know how to do that.

- gerund clause

She made a big mistake by dropping the plane.

She made a mistake trying to fly the plane. [there are two dependent clauses in this sentence]

Trying to fly the plane turned out to be a mistake. [contains 3 dependent clauses]

[note: 'she was almost hit by the falling plane’ would not contain a gerund clause, since 'falling' functions as an adjective here, but 'she was almost hit by the plane falling' would contain a gerund clause]

Units that are NOT counted as dependent clauses:

Conjoined phrases, either nouns or verbs (also called coordinated phrases)

She had coffee, toast and cereal for breakfast. (1 independent clause only)

She sat down and ate her breakfast. (The second subject was deleted, thus 1 independent clause. This can be confusing because there are two verbs, but no dependent clause. However, the key is that the conjunction is a coordinating (not subordinating) conjunction and both verbs are inflected, that is, they are marked for tense. A subject such as she could be inserted before ate and this would turn it into two independent sentences; this indicates that without a second subject it is a single independent clause. Compare she sat down to eat her breakfast: she cannot be inserted before the second verb, which is in infinitive form; this can be used as an indication that to eat her breakfast is a dependent clause.)

Prepositional phrases, however long or numerous they may be

She was tired from the top of her head to the tips of her toes. (1 independent clause only)

C-units may contain more than one dependent clause. Count each dependent clause.

Main clause + 2 dependent (subordinate) clauses

She was sorry 1) when she realized 2) that the plane had fallen in.
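The clause counts described above feed into the subordination index. The sketch below assumes the commonly used definition of SI (total clauses, independent plus dependent, divided by the number of C-units) and assumes each C-unit contains exactly one independent clause; the section on syntactic analyses remains the authority on how SI is scored for the norms. The function name is hypothetical.

```python
def subordination_index(dependent_counts):
    """Compute SI from the number of dependent clauses in each C-unit.

    Assumption: every C-unit contributes exactly one independent clause.
    """
    if not dependent_counts:
        raise ValueError("need at least one C-unit")
    total_clauses = sum(1 + d for d in dependent_counts)
    return total_clauses / len(dependent_counts)


# "She saw his plane."                                   -> 0 dependent clauses
# "He was mad because she dropped his plane."            -> 1 dependent clause
# "She was sorry when she realized that the plane had
#  fallen in."                                           -> 2 dependent clauses
print(subordination_index([0, 1, 2]))  # 6 clauses / 3 C-units = 2.0
```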


Transcribing the Stories Using Language Analysis Programs

Transcribing with CHAT

The norms sample data were transcribed using the CHILDES system (MacWhinney, 2000). CHILDES includes tools for transcription and analysis of transcripts. Researchers have contributed their own transcripts to the system, which are available through the Internet at no charge. There is also no charge for the programs to transcribe and analyse data.

The transcription portion of the system is called CHAT. Selected aspects of CHAT are provided below. For information about the complete CHAT format or any aspect of the CHILDES system, consult the CHILDES website.


Transcriptions can be typed using any word processing program, as long as they are saved in an ASCII text format. If you use WordPerfect, you must save the file as ASCII (DOS) Generic Word Processor (one of the many options at the bottom of the SAVE screen). In Word, it would be saved as a DOS text file. You can also use the CLAN program to enter transcripts directly; they will be saved in ASCII format when you save the file. Each individual story goes into a separate file for Story Grammar and Referring Expressions analyses. For language sample analyses, the stories are combined into one file for each child.

Note that some lines need to have a tab stop. You do not need to worry about the tab settings -- there just has to be a tab code in the file at the designated spot.

Each file will have certain information in it. Each file must begin with the following two lines:

@Begin

@PARTICIPANTS: CHI target_child, EXA examiner

The PARTICIPANTS line tells the program who the transcript lines belong to; the variants below are also acceptable:

@PARTICIPANTS: CHI Susan target_child, EXA Sarah examiner

@PARTICIPANTS: SUS Susan target_child, SAR Sarah examiner (in this case, transcript lines would need to begin with SUS and SAR rather than CHI and EXA).

In addition, other lines can be entered if desired, such as the following examples:

@Birth of CHI: 1-JAN-1994 [child’s birthdate – must be in exact format shown]

@Date: 3-FEB-2000 [date on which child was seen]

@Age of CHI: 4;8.3

@Sex of CHI: M

@Transcriber: Mary

You can create a file called HEADER with this information in it and use that to start every file.

Every transcript line of the child’s speech must start with *CHI: (or whatever identifier was specified in the @PARTICIPANTS line) followed by a tab (not spaces) and the utterance, as follows.

*CHI: once an elephant and a giraffe were at the pool.

*CHI: and the elephant had a toy plane.

If the examiner says something that might affect your interpretation of the child’s story, such as asking a question, transcribe it beginning with *EXA: followed by a tab and then the utterance.

The end of the file must contain the following line:

@End


When transcribing what the participants said, use upper case letters only for proper nouns and the word "I". Do not use upper case for the first words of sentences.
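The file layout described above can be sketched as a small helper that assembles the text of a transcript. This is an illustration only: build_chat is a hypothetical name, @Begin and @End are the standard CHAT opening and closing lines, and placing a tab after the @PARTICIPANTS keyword is an assumption made here to mirror the tab required on speaker lines.

```python
def build_chat(participants, utterances, headers=()):
    """Assemble the text of a minimal CHAT-style transcript.

    participants: e.g. "CHI target_child, EXA examiner"
    utterances:   list of (speaker_code, text) pairs, in order
    headers:      optional extra header lines, e.g. "@Transcriber: Mary"
    """
    # Assumption: the header keyword is followed by a tab, like speaker lines.
    lines = ["@Begin", "@PARTICIPANTS:\t" + participants]
    lines.extend(headers)
    for speaker, text in utterances:
        # Asterisk, speaker code, colon, then a TAB (not spaces).
        lines.append("*" + speaker + ":\t" + text)
    lines.append("@End")
    return "\n".join(lines) + "\n"


transcript = build_chat(
    "CHI target_child, EXA examiner",
    [
        ("CHI", "once an elephant and a giraffe were at the pool."),
        ("CHI", "and the elephant had a toy plane."),
    ],
    headers=["@Transcriber: Mary"],
)
print(transcript)
```

Writing the returned string to a plain-text file would give a starting point that can then be run through the Check command in CLAN.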

CHAT transcription conventions:

Unintelligible speech: xx or xxx

I want xx. [one unintelligible word]

I want xxx. [more extensive unintelligibility; sentence will not be counted in MLU]

Pause: # for short pause, ## for medium, ### for long pause

The rhino was ## mad.


False start: <words> [/]

<what> [/] what are you doing? <am I> [/] am I wrong?

Use for exact repetitions; if something is changed, see ‘revision’. Also use for interjections such as ‘um’, ‘uh’, and so on.

Revision: <words> [//]

<the giraffe> [//] the elephant

Use when only limited change is made, such as change of noun.

Completely new tangent: <words> [/-]

<he ran> [/-] the elephant felt bad.

Untranscribed material: www

This can be used to indicate that there was an interruption, rather than transcribing the interruption; e.g., if an interruption occurs, you could transcribe it as follows:

*CHI: www.

%exp: makes comment about announcement on PA.

Guessed transcription: [?]

She was heavy [?].

Indicating guesses can be helpful when checking a transcription; the person checking is alerted to an uncertain transcription and may be able to correct it.

Alternative transcription (when you are not sure whether the child said one thing or another): You can put angled brackets around the phrase and put the alternate phrase in straight brackets as in the following example.

He wanted <one or two> [=? one too].

Contractions: Show the contraction as two words with the vowel in parentheses, as in the following examples:

Negative contractions: does n(o)t, did n(o)t, do n(o)t, can n(o)t, has n(o)t, had n(o)t, could n(o)t, would n(o)t

Verb contractions: I (a)m, I (ha)ve, he (i)s, she (ha)s, they (a)re, would (ha)ve

[The only exception: Do not modify ‘won’t’ – it’s not really wo + not!]
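If you type transcripts with ordinary spelling first, the contraction convention can be applied mechanically for most forms. The following sketch is hypothetical (mark_contraction is not a CLAN command). It covers only the forms listed above, leaves "won't" alone, and deliberately does not touch 's, because that ending can stand for "is" or "has" and needs a human decision.

```python
# Expansions for the negative forms listed in this manual; "won't" is excluded.
NEGATIVES = {
    "doesn't": "does n(o)t", "didn't": "did n(o)t", "don't": "do n(o)t",
    "can't": "can n(o)t", "hasn't": "has n(o)t", "hadn't": "had n(o)t",
    "couldn't": "could n(o)t", "wouldn't": "would n(o)t",
}
# Unambiguous verb endings; 's is omitted because it may be "is" or "has".
ENDINGS = {"'m": " (a)m", "'ve": " (ha)ve", "'re": " (a)re"}


def mark_contraction(word):
    """Return the CHAT-style form of one word, or the word unchanged."""
    word = word.replace("\u2019", "'")  # normalise typographic apostrophes
    lower = word.lower()
    if lower in NEGATIVES:
        return NEGATIVES[lower]
    for ending, expansion in ENDINGS.items():
        if lower.endswith(ending):
            return word[: -len(ending)] + expansion
    return word


print(" ".join(mark_contraction(w) for w in "I'm sure he doesn't mind".split()))
# I (a)m sure he does n(o)t mind
```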

If an utterance continues past one line, the second line must be indented using a tab (NOT spaces). (Only put one tab in -- adjust the tab set for the file if necessary.) If you’re using a word processing program to enter the transcript, it’s probably a good idea to add these tabs only after you’ve finished the transcript completely (meaning that you have checked it completely at least once). If you use CLAN to type the transcript, it will automatically put a tab in when it "wraps" the line.


*CHI: Ellie the elephant and Gerry the giraffe were by the pool one day with a toy airplane.

Utterances can only end with one of the following: a period, a question mark, or an exclamation mark. These can ONLY be used at the end of the line. Thus do not use them in abbreviations. Spell out "Mister," "Missus," "Doctor," or other titles.
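The line-format rules above (speaker code followed by a tab, a terminator only at the end of the utterance) lend themselves to a quick pre-check before running CLAN's own Check command. The sketch below is a hypothetical helper and far less thorough than CLAN; it assumes three-letter upper-case speaker codes and ignores anything inside square brackets, since codes such as [?] and [=! yelling] legitimately contain punctuation.

```python
import re

TERMINATORS = (".", "?", "!")
SPEAKER_LINE = re.compile(r"^\*[A-Z]{3}:\t\S")  # assumes 3-letter codes


def check_utterance(line):
    """Return a list of problems found in one speaker line."""
    problems = []
    if not SPEAKER_LINE.match(line):
        problems.append("line must start with *XXX: followed by a tab")
    if not line.rstrip().endswith(TERMINATORS):
        problems.append("utterance must end with . ? or !")
    # Drop bracketed codes, then look for a terminator followed by more text.
    body = re.sub(r"\[[^\]]*\]", "", line.split("\t", 1)[-1])
    if re.search(r"[.?!]\s+\S", body):
        problems.append("terminator used before the end of the line")
    return problems


print(check_utterance("*CHI:\tMr. elephant was mad."))
```

An empty list means the line passed these three checks only; it does not guarantee that CLAN will accept the file.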

The following conventions are ways to indicate the manner of speaking. Their use is optional.

  • Lengthening of a word: : A little: tired.
  • Lengthening of a syllable: : bana:na
  • Pause between syllables: :: bana::na

Interruptions of speech flow:

Trailing off of an utterance: +...

*CHI: And then +...

*EXA: Is that the end?

Self-interruption: +//.

*CHI: And then he +//.

*CHI: What’s that noise?

Interruption: +/.

*CHI: Once there was a +/.

*EXA: Wait til I put the microphone on you.

NOTE: Whenever possible, put one ‘sentence’ on a single line. We want to capture false starts and hesitations, which means we want them in angled brackets on the line with the final version. For example, if a child says:

Once there was a - oh wait, what is that animal? A giraffe.

Transcribe as:

*CHI: Once there was <a> [/] [= oh wait, what is that animal] a giraffe.

Transcribing it this way rather than on 3 lines (as is also possible with the above conventions) allows us to capture that there was a 5-word story-related utterance with a false start and a comment (i.e., non-story utterance, ‘what is that animal’). If it were transcribed as 3 utterances, the child would get credit for one three-word and one two-word utterance, which would shorten the child’s mean length of communication unit.
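To make the word count in this example concrete, the sketch below strips the material that does not count toward utterance length (retraced words in angled brackets, bracketed comments, and pause symbols) and counts what is left. It is a simplified, hypothetical stand-in for what CLAN does and handles only the conventions described in this section.

```python
import re


def story_words(utterance):
    """Return the words of one CHAT line that count toward its length."""
    # Remove retraced material: <words> followed by [/], [//] or [/-].
    text = re.sub(r"<[^>]*>\s*\[/(?:/|-)?\]", " ", utterance)
    # Remove any remaining bracketed codes and comments, e.g. [= points].
    text = re.sub(r"\[[^\]]*\]", " ", text)
    # Remove the speaker code at the start of the line.
    text = re.sub(r"^\*[A-Z]{3}:\s*", "", text.strip())
    tokens = [w.strip(".?!") for w in text.split()]
    # Drop empty tokens and pause markers (#, ##, ###).
    return [w for w in tokens if w and set(w) != {"#"}]


line = "*CHI:\tOnce there was <a> [/] [= oh wait, what is that animal] a giraffe."
print(story_words(line))       # ['Once', 'there', 'was', 'a', 'giraffe']
print(len(story_words(line)))  # 5
```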

Extra information (optional):

          laughing, yelling, crying, etc.: [=! ]

          e.g. He said stop that [=! yelling]!

          e.g. It was so funny [=! laughing].

Comments: [= ] There is the giraffe [= points].

Stressed word: [!] He was real [!] mad.

Do not use quotation marks. Quotations are simply transcribed, as in the following:

he said could you get our plane out of the water?

If you plan to use the stories to obtain a score for Number of Different Words, it is important to transcribe words as they were transcribed in the normative data. Enter the following words as single words:





Enter the following as separate words:

any time

all right

every day

Spell OK as ‘okay’ rather than ‘ok’. If part of a word is omitted, such as be in because, either transcribe the entire word, or if you wish to indicate the omitted portion, put it in parentheses: (be)cause, (a)gain, goin(g). If the child uses semiauxiliary forms such as gonna, wanna, and so forth, transcribe the full form (e.g., going to, want to), and if you wish, you can include the actual form after it as in the following example.

He is going to [= gonna] jump in because they want to [= wanna] get the ball.

For "fillers" such as ‘uh’, ‘um’, etc.: Spell only as ‘uh’ or ‘um’, not ‘uhm’, ‘ah’, or other variants. Put in angled brackets as for false starts.

If you are planning to use the norms for Number of Words and Number of Different Words or to calculate D for comparison to the norms, you will need to mark word boundaries for grammatical morphemes. Because we were interested only in counting word types rather than analysing bound morphemes, we simply placed hyphens between the stem and grammatical morpheme. We did not mark derivational morphemes, because it is not clear that children recognize the relationships among words with and without such morphemes; for instance, do children consider beauty and beautiful to have the same root word? Instead, we allowed the CLAN program to count them as two different words.

The grammatical morphemes plural -s, possessive -'s, present progressive -ing, and past tense -ed were preceded by hyphens in the transcripts as in the examples below. Note that spelling was altered as well in the case of doubled or dropped letters, so that the program would recognize the stem (it was also necessary to spell the bound morpheme correctly in order to avoid an error message when using the CHECK command in CLAN).

Word            Marked for Word Boundary as

"bounces"       bounce-s
"bouncing"      bounce-ing
"bounced"       bounce-ed
"drops"         drop-s
"dropping"      drop-ing
"dropped"       drop-ed
"falled"        fall-ed*
"felled"        fell-ed*
"pockets"       pocket-s
"berries"       berry-s
"centses"       cents-s*
"giraffe's"     giraffe-'s

*Note: if you were counting only correct forms, as in 14 morpheme analysis, you would not want to mark these forms as you would correct past tense forms. However, 14 morpheme counts were not made in the ENNI analyses because of the age of the children; the only purpose for marking morphemes was to obtain an accurate count of word types.

Only grammatical morphemes were marked. Derivational morphemes were not marked, as it is not possible to determine in most cases whether children recognize that multimorphemic words are actually made up of stems plus affixes or whether they are simply different words to the child.
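The purpose of the hyphens can be illustrated with a small counting sketch: once the grammatical morpheme is separated from the stem, inflected forms such as bounce-s and bounce-ed count as the same word type. The code below is an approximation written for illustration, not a reproduction of how CLAN computes its counts; the function names are hypothetical, and the input is assumed to be a list of words with mazes already removed.

```python
MARKED_MORPHEMES = ("s", "'s", "ing", "ed")  # the morphemes marked in the norms


def stem(token):
    """Drop a hyphen-marked grammatical morpheme, if there is one."""
    base, hyphen, ending = token.rpartition("-")
    if hyphen and ending in MARKED_MORPHEMES:
        return base
    return token


def word_counts(tokens):
    """Return (number of words, number of different words) by stem."""
    stems = [stem(t.lower()) for t in tokens]
    return len(stems), len(set(stems))


tokens = ["the", "giraffe", "bounce-ed", "the", "ball",
          "and", "the", "elephant", "bounce-s", "it"]
print(word_counts(tokens))  # (10, 7)
```

Without the hyphens, bounced and bounces would be counted as two different words, which is the inflation the marking is meant to avoid.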

Completing the Transcription

Be sure to remember to save the file. All filenames must end in .CHA (final version of the file). When you are done transcribing, check the transcript. The Check command is Esc-L, or select Check File on the Mode menu.

Note that some of the above conventions may be dispensed with depending on your purposes for transcribing. If you plan to use the norms for word counts, you will need to transcribe words and mazes and break up utterances as was done in the normative sample. If you are interested only in Story Grammar Units and/or First Mentions, it is not important to include mazes or mark word boundaries.

Transcribing with SALT

The computer program Systematic Analysis of Language Transcripts (SALT, Miller & Chapman, 1998) can be used to assist in the transcription and analysis of narrative transcripts. SALT will do basic calculations such as number of utterances, number of words, number of different words, and mean length of utterance. It is also possible to add codes to SALT transcript files for such analyses as story grammar units, referring expressions, and so forth; SALT will then count the occurrences of each code and provide totals. SALT is generally easier to use than CHAT, but it is not free. It is available from the Waisman Center at the University of Wisconsin, where a free trial version is also available.

Transcription of tapes using SALT

You can transcribe using the SALT program. It is also possible to transcribe using a word-processing program such as WordPerfect or Microsoft Word, as long as the files are saved in ASCII (DOS or text-only) format. If you use SALT, the program will automatically insert two spaces when an utterance is too long for a single line and goes to a new line. If you use another program, you must be sure that two spaces are inserted whenever an utterance continues to a new line.

Each task must be in a separate file, so that you can use SALT to analyze each one separately. Be sure to copy each file onto a diskette for a backup. If you collect a written story, transcribe it as well, so that you can analyze it with SALT.

Transcription conventions:

For SALT, the following information typically is found at the beginning of the file:

  • ‘dollar sign line’ identifying the speakers: $child, examiner
  • child's name: +Amanda
  • context ([narr]ative or [con]versation): +Context: narr
  • child's age in years;months: +CA: 8;8

If the file is transcribed in SALT, a page appears on which information such as the above can be filled in; the information is then inserted at the beginning of the new file in the correct format.

Each transcript line must start with a letter to indicate who is speaking for each utterance (e.g., C for child, E for Examiner), followed by a single space. Each transcript line must end with a period, question mark, or exclamation point. The punctuation must be the last character in the utterance (no quotation marks or anything else after it).

Contextual information, such as pointing, is indicated by curly brackets.

e.g.: He sees this {points at airplane}.

Parentheses indicate "mazes": false starts, repetitions, 'fillers,' repair terms, and pauses.

False starts: e.g., (He wanted to) she saw his airplane.

Repetitions (if they are essentially false starts): (she) she (wan) wanted to play with it. (Do not put in parentheses if repetition is for emphasis, e.g., she really really wanted to play.)

Fillers: (uh) he saw (um like) a ball.

Repair terms: (He saw) (I mean) he wanted it.

Discourse markers: (Well,) he didn’t (like) mean it, (y’know).

Pauses: He was at (:) the swimming pool. [The parentheses are not necessary around pauses.]

Combinations: (um she) (she) he saw (like :) a cookie stand.

X is used for an unintelligible syllable.

e.g.: She wanted to XX airplane.

Emphasized word in caps.

e.g.: She REALLY wanted the airplane.

You must mark any contractions by inserting a /, e.g., I/’m, he/’s, do/n’t. (Do not divide won’t into wo/n’t.)
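The SALT conventions above can be summarised in a short parsing sketch that separates the speaker letter, the counted words, and the mazes of one transcript line. It is a hypothetical illustration and not part of the SALT program; it assumes mazes are never nested and treats a bare colon as a pause rather than a word.

```python
import re


def parse_salt_line(line):
    """Split one SALT line into (speaker, counted words, mazes)."""
    speaker, _, utterance = line.partition(" ")
    mazes = re.findall(r"\(([^)]*)\)", utterance)  # text inside parentheses
    text = re.sub(r"\([^)]*\)", " ", utterance)    # drop mazes
    text = re.sub(r"\{[^}]*\}", " ", text)         # drop contextual notes
    words = [w.strip(".?!,") for w in text.split()]
    words = [w for w in words if w and w != ":"]   # drop pause markers
    return speaker, words, mazes


speaker, words, mazes = parse_salt_line(
    "C so (the uh) the giraffe jump/ed (on) in the pool."
)
print(speaker)  # C
print(words)    # ['so', 'the', 'giraffe', 'jump/ed', 'in', 'the', 'pool']
print(mazes)    # ['the uh', 'on']
```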

Word Counts Using SALT

If you plan to use SALT to obtain a count of words or number of different words for comparison to the norms, you will need to mark word boundaries. In SALT, morphemes are marked the same way that contractions are marked: with a /. As with CLAN transcription, the spelling needs to be altered to show the entire root and stem. For example:

Word            Marked for Word Boundary as

"bounces"       bounce/3s
"bouncing"      bounce/ing
"bounced"       bounce/ed
"drops"         drop/3s
"dropping"      drop/ing
"dropped"       drop/ed
"falled"        fall/ed*
"felled"        fell/ed*
"pockets"       pocket/s
"berries"       berry/s
"centses"       cents/s*
"giraffe's"     giraffe/'s

*Note: if you were counting only correct forms, as in 14 morpheme analysis, you would not want to mark these forms as you would correct past tense forms. However, 14 morpheme counts were not made in the ENNI analyses because of the age of the children; the only purpose for marking morphemes was to obtain an accurate count of stem words.

Only grammatical morphemes were marked. Derivational morphemes were not marked, as it is not possible to determine in most cases whether children recognize that multimorphemic words are actually made up of stems plus affixes or whether they are simply different words to the child.

Transcription examples

Example of transcription using CHAT:


@Begin

@PARTICIPANTS: CHI Bill target_child, EXA examiner

@Age of CHI: 4;7

*CHI: one time the elephant and the giraffe went to <the> [/] the pool.

*CHI: and <uh> [/] the elephant was bounce-ing <uh> [/] this ball. [this is a separate utterance even if previous utterance did not have sentence-final intonation]

*CHI: and <uh> [/] the ball went in the pool.

*CHI: so <the uh> [/] the giraffe jump-ed <on> [//] in the pool.

*CHI: and he swam to the ball.

*CHI: he got the ball.

*CHI: and he gave it back to the elephant.

*CHI: he got the ball and gave it back to her.

*CHI: and the elephant felt happy (be)cause now she had her ball back.

@End


Same transcript using SALT:

$ Child, Examiner

+Name: Bill

+CA: 4;7

C one time the elephant and the giraffe went to (the) the pool.

C and (uh) the elephant was bounce/ing (uh) this ball.

C and (uh) the ball went in the pool.

C so (the uh) the giraffe jump/ed (on) in the pool.

C and he swam to the ball.

C he got the ball.

C and he gave it back to the elephant.

C he got the ball and gave it back to her.

C and the elephant felt happy because now she had her ball back.