ETD Guide/Technical Issues/Berlin DTD workshop

Scope of the workshop

The idea behind this workshop was to bring together experts and developers working on SGML- or XML-based Document Type Definitions for electronic theses and dissertations. Within the NDLTD initiative, several document type definitions (DTDs) for dissertations have been developed. To arrive at a standardized or generalized structure for searching and archiving, it is necessary to survey all of these developments and research efforts and to work out the correspondences among them. If SGML and XML are the preferred future document formats for theses and dissertations, the DTD will play a major role within the whole initiative. The following questions were discussed:

  • How to get to a generalized DTD?
  • Is just one DTD necessary, or is it possible to have more than one DTD, e.g., for different subjects and sciences?
  • Where exactly are the current differences between the existing document type definitions?
  • Which approach shall we follow to unify our DTDs?
  • How to come to an agreed Dublin-Core metadata element set?
  • How much markup should be done by the author, and how much by automated conversion?
  • Evaluating several tools and ways to come to an SGML/XML-based document
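
To make the Dublin Core question more concrete, the following is a minimal sketch of what a thesis record using a handful of Dublin Core elements might look like when generated with Python's standard library. The wrapper element name `metadata`, the helper `build_dc_record`, the choice of elements, and all sample values are assumptions made for illustration; they are not the element set agreed at the workshop.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML record from (element_name, value) pairs.

    The wrapper element <metadata> is a hypothetical choice for this sketch.
    """
    record = ET.Element("metadata")
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Illustrative values only; not taken from any real thesis.
fields = [
    ("title", "A Sample Thesis"),
    ("creator", "Doe, Jane"),
    ("date", "2000"),
    ("type", "Text"),
    ("language", "en"),
]

record = build_dc_record(fields)
dc_xml = ET.tostring(record, encoding="unicode")
print(dc_xml)
```

Agreeing on a common element set would amount to agreeing on which of these elements every participating institution supplies and how their values are encoded.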

Outcome

This expert workshop on using XML to publish theses and dissertations electronically at universities was held in May 2000 at Humboldt-University Berlin, Germany. It focused on the ways in which XML can be used in university libraries to deliver, process, and archive high-quality scholarly electronic publications. The recognition that a wide range of approaches to SGML/XML-based publishing exists worldwide led to the conclusion that interoperability at an international level is indispensable. Experts from the USA, Finland, Norway, Sweden, France, Portugal, the United Kingdom, and Germany discussed commonalities and differences among their models, approaches, and document type definitions (DTDs).

An Objective of the Workshop

The objective of the workshop was also to exchange experiences in document conversion and in supporting authors during that process, to agree on a worldwide Dublin Core metadata set, and to share tools. Support for authors was recognized as an essential part of the electronic publishing workflow. Attempts to convince authors to leave their native word processing systems for systems that support structured writing and/or export to SGML or XML usually fail, for several reasons. First, authors are reluctant to switch from the system they are used to. This is exacerbated by the fact that publications usually have to be written within a very short time frame, leaving little or no time for authors to learn a new system, even if they were so inclined. Until XML authoring and editing tools become as simple to use as a word processor, this approach will still be viewed as an added burden on the student. Second, current systems that support SGML/XML are usually far more expensive than common word processors.

Prearchival Format

Many universities follow another strategy and accept documents in a pre-archival format. This requires conversion of the documents by the university library or media center. The advantage is that authors are able to prepare their documents within their native word processing systems, using style sheets especially designed for that conversion, e.g., in Microsoft Word. Some institutions that place a high value on formatting are reluctant to approve this method, because the SGML or XML publication may differ from the word processing document that the student deposited. Even though this method seems to exact a high resource cost, it is a step toward the ideal, in which authors write in XML directly using their native word processors. We are not far from this ideal.
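
The style-sheet-based conversion described above can be illustrated with a minimal sketch: each paragraph carries the name of the style the author applied, and the converter maps that name to an XML element. The style names, element names, root element `dissertation`, and the function `paragraphs_to_xml` are hypothetical and chosen only for illustration; they do not reproduce any particular university's style sheet or DTD, and a real converter would first have to extract the styled paragraphs from the word processing file.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from word-processor style names to XML element names.
STYLE_TO_ELEMENT = {
    "Title": "title",
    "Heading 1": "heading",
    "Normal": "p",
}


def paragraphs_to_xml(paragraphs):
    """Convert (style_name, text) pairs into a small XML tree."""
    root = ET.Element("dissertation")
    for style, text in paragraphs:
        # Paragraphs with an unknown style fall back to a plain <p>.
        tag = STYLE_TO_ELEMENT.get(style, "p")
        child = ET.SubElement(root, tag)
        child.text = text
    return root


# Illustrative input, standing in for paragraphs read from a styled document.
sample = [
    ("Title", "A Sample Thesis"),
    ("Heading 1", "Introduction"),
    ("Normal", "First paragraph."),
]

xml_text = ET.tostring(paragraphs_to_xml(sample), encoding="unicode")
print(xml_text)
```

The quality of the result depends on authors applying the prescribed styles consistently, which is why such style sheets are usually distributed together with instructions for authors.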

ETD Projects with XML-based Approach

The following electronic thesis and dissertation (ETD) projects have an XML-based approach already in place or are presently in a pilot phase:

  • Swedish University of Agricultural Sciences (SLU) Libraries, SWEDEN
  • Virginia Polytechnic Institute and State University, University Libraries, USA
  • Sentiers, Université Lumière Lyon 2 (Cybertheses.org project), FRANCE
  • University of Montreal, CANADA
  • Universidad de Chile (Santiago de Chile), CHILE
  • Humboldt-University Berlin, "Dissertation Online", GERMANY
  • University of Oslo, Center for Information Technology Services, NORWAY
  • University of Iowa, Graduate College, USA
  • University of Michigan at Ann Arbor, Library, USA
  • Helsinki University of Technology, Library, FINLAND

For further information, please visit the workshop website at: http://dochost.rz.huberlin.de/epdiss/dtd-workshop/index.html

In connection with the workshop, a collection of tools used for SGML/XML publishing at different universities was assembled. This collection can be found at: http://dochost.rz.huberlin.de/epdiss/dtd-workshop/cdrom/index.htm

