School of Library and Information Studies

LIS 598 Information Modelling in XML

COURSE OUTLINE:  

Instructor:

Peter Binkley
Peter.Binkley@ualberta.ca

Prerequisite: LIS 501

Course Goal: To provide an introduction to both theoretical and practical aspects of XML and its major applications in library and information services.

Course Objectives:

Upon completion of this course, students will:  

  1. understand the concept of XML and its associated technologies such as XSLT, and their importance in information modeling.  
  2. know how to structure, present and transform information in XML and associated technologies.  
  3. understand major applications of XML in LIS.  

Possible Assignments:  

  1. Create and validate XML document(s) such as research papers and metadata records.  
  2. Transform XML document(s) from one format to another.  
  3. Short quiz (10-15 marks) to test students' understanding of concepts.  

LIS 598: SPECIAL TOPICS: Information Modeling in XML

Outline

The purpose of this course is to provide hands-on experience with XML, some common data and metadata formats that use it in the library world, and some common tools for manipulating and exploiting it. Class work will consist of alternating lectures and hands-on sessions, in which particular technologies will be applied to library-related problems.

Students will come away with:

  • An understanding of the origins and purposes of XML and related technologies
  • An introduction to the principal XML data formats in use in the library world: MARC-XML, MODS, Dublin Core
  • An introduction to some of the XML text formats: TEI, DocBook, ODF
  • Practical experience with the principal XML manipulation tools: XPath, XSLT, XQuery
  • An understanding of how XML is used in digital library environments
  • An understanding of the Semantic Web and the current state of progress towards incorporating library resources into the emerging “Linked Data” environment

Programming experience is not required; the hands-on sessions will consist of modifying and extending examples.

Preparation

Students are strongly encouraged to bring a laptop to class, and to pre-install the free open-source software that will be used in the hands-on sessions. Versions are available for Windows, Mac and Linux.

  • XML Copy Editor: http://sourceforge.net/projects/xml-copy-editor/files/
    • Windows users should install the latest version of the xmlcopyeditor-windows package (at the bottom of the downloads page)
    • Mac users will have to use the Windows version, running under the Darwine Windows emulator, or as an alternative they may use Smultron: http://sourceforge.net/projects/smultron/. If you have difficulty please contact the instructor.
    • Linux users can probably install it using their Linux distribution’s package installer: look for the package “xmlcopyeditor”
  • Apache Ant: http://ant.apache.org/
    • Download the binary version and install it according to its installation instructions; simplified instructions for Windows are at: http://www.sitepoint.com/article/apache-ant-demystified/
    • Mac users may prefer to install Ant using Darwin-Ports: http://apache-ant.darwinports.com/ (installation instructions start below the horizontal rule)
    • Linux users can use their distribution’s package installer to install the package “ant”
  • Saxon B: download the zip file from http://sourceforge.net/projects/saxon/files/saxon/9.1.0.7/saxonb9-1-0-7j.zip/download . We will install it into Ant in class.

Students who do not have a suitable laptop or who have difficulty installing the recommended software should contact the instructor during the week before the course. Some loaner laptops are available, and an office hour can be arranged to sort out any installation problems.

Recommended Preparatory Readings

Resources

Online tutorials at w3cschool:

Evaluation

25% Class participation

75% Final assignment, a choice of:

  • Hands-on project: develop an entry in the JISC competition (http://www.sero.co.uk/jisc-mosaic-competition.html) using any of the technologies covered in the course. Your entry should work, but it does not have to be sufficiently innovative to be a real entry. You should feel free to submit it to the competition if you wish; the competition results will not affect the evaluation of your project in this course. Note that the competition calls for a “browser-accessible” project, implying that it is accessible on a web server; for the purposes of this assignment you should use Apache Ant to fetch and process the XML data and generate the HTML output.
  • Research project: What will it take to move bibliographic metadata into the emerging linked-data/semantic web environment, and make it useful? Evaluate and respond to Martha Yee’s experiment and the uncertainties she expresses about the way forward:
    • Martha M. Yee, "Can Bibliographic Data Be Put Directly Onto the Semantic Web?" Information Technology and Libraries 28 (2), 2009, pp. 55-80. Postprint available free at: http://repositories.cdlib.org/postprints/3369
  • Student-devised projects of comparable complexity that are relevant to the student’s interests or other courses will be considered on application to the instructor.

Segments

  • XML intro
  • Areas of application
  • Kinds of tools
  • Our tools: XML Copy Editor
  • XML styles: fields vs text – MARC, MODS, DC, vs TEI, DocBook, ODF
  • Elements, attributes, namespaces
  • Exploiting the structure: XPath
  • Our tools: Ant
  • Validation: DTD, Schema
  • Transformation: XSLT
  • Our tools: Saxon B
  • Querying: XQuery
  • Relationships: XLink
  • Linked Data – RDF
  • Future of Bibliographic Metadata: FRBR, RDA