CORPORA
Title: TalkBank
URL: http://www.talkbank.org/data/
Platform: Corpora require the (free) CLAN tool (Mac or PC)
Cost: Free [TalkBank is an interdisciplinary research project funded by a 5-year
grant from the National Science Foundation to Carnegie Mellon University and
the University of Pennsylvania.]
Animal; Aphasia; Classroom discourse; Conversations; Gesture; IViE (Intonational
Variation in English); Linguistic Exploration (Tseltal, Tzotzil, Chol, Mambila);
Clinical Interviews; LIDES (code-switching; language interaction); Switchboard
(sampling of the larger LDC Switchboard corpus).
Title: The Dialogue Diversity Corpus
URL: http://www-rcf.usc.edu/~billmann/diversity/DDivers-site.htm
Platform: mainly html
Cost: Free
Provides access to hundreds of dialogues, comprising many thousands of lines of interaction.
Title: Project Gutenberg
URL: http://gutenberg.net/
Platform: mainly txt files
Cost: Free
Thousands of (non-copyright) books available in electronic form. Most of these
books were published prior to the 1920s. Three main categories:
Light Literature, such as Alice in Wonderland, Through the Looking-Glass, Peter
Pan, Aesop's Fables, etc.
Heavy Literature, such as the Bible or other religious documents, Shakespeare,
Moby Dick, Paradise Lost, etc.
References, such as Roget's Thesaurus, almanacs, and a set of encyclopedias,
dictionaries, etc.
Title: Alex Catalogue of Electronic Texts
URL: http://www.infomotions.com/alex/
Platform: txt or pdf files
Cost: Free [The Alex Catalogue of Electronic Texts is a collection of public
domain documents from American and English literature as well as Western philosophy.]
You can download individual texts, or "all American texts", "all
British texts", or "all philosophy texts".