Using the TEI to encode manuscripts of Australian languages
Access status:
Open Access
Type
Presentation
Author/s
Thieberger, Nick
Abstract
This paper will discuss the value of using the Text Encoding Initiative’s schema (TEI) for a set of manuscript vocabularies of Australian Indigenous languages collected by Daisy Bates in the early 1900s. I will first outline the method used to type the text from manuscript images and then contrast the effort required to render the vocabularies as encoded text with the simpler method of placing page images online (as PARADISEC did, for example, with the Capell papers). I will assess the tools available for markup of the corpus and show that the encoded version affords more research outputs than does the simple rendering of an image, and has benefits for the broader community, including speakers of the languages recorded. In addition, the project provides the National Library of Australia with an enriched description of the Bates vocabulary collection. Once the data structures have been tested by users there is the potential to crowdsource annotation of as yet untranscribed handwritten sections of the work.
Date
2013-01-01
Licence
This material is copyright. Other than for the purposes of and subject to the conditions prescribed under the Copyright Act, no part of it may in any form or by any means (electronic, mechanical, microcopying, photocopying, recording or otherwise) be altered, reproduced, stored in a retrieval system or transmitted without prior written permission from the University of Sydney Library and/or the appropriate author.
Department, Discipline or Centre
University of Melbourne