Field | Value | Language
dc.contributor.author | Schembri, Adam |
dc.contributor.author | Johnston, Trevor |
dc.contributor.author | Fenlon, Jordan |
dc.contributor.author | Cormier, Kearsy |
dc.contributor.author | Rentelis, Ramas |
dc.date.accessioned | 2012-02-07 |
dc.date.available | 2012-02-07 |
dc.date.issued | 2011-01-01 |
dc.identifier.uri | http://hdl.handle.net/2123/8109 |
dc.description.abstract | Digital video archives of Auslan (Australian Sign Language) and BSL (British Sign Language) are slowly being transformed into machine-readable linguistic corpora. Each archive (Auslan 2004-2008, BSL 2008-2011) consists of data collected from deaf native and near-native signers. The datasets are being annotated using ELAN software. The majority of the video data will be made accessible online (with some limits to access for sensitive data). In this presentation, we report on the ongoing studies of lexical frequency in these two signed languages: 63,436 sign tokens produced in 360 clips by 109 participants in the currently annotated Auslan dataset, and 25,000 sign tokens from the conversation data in the BSL dataset (500 signs each from 50 participants). Preliminary results indicate that approximately 65% and 60% of the Auslan and BSL data, respectively, consist of signs from the core lexicon (i.e., those signs which are highly conventionalised in form and meaning across contexts; see Johnston, 2011; Johnston & Schembri, 1999, 2010). The next two largest categories are pointing signs (12% and 23% respectively) and signs from outside the core lexicon (i.e., gestures and sequences of enactment or 'constructed action') (6.5% and 9% respectively). The remaining tokens consist of fingerspelled signs (5% in both datasets), depicting constructions (i.e., depicting verbs of location, motion and/or handling; 11% and 3% respectively), and sign names (0.2% and 0.3% respectively). We discuss some of the challenges in creating a lemmatised corpus of a sign language, including difficulties in differentiating core from non-core signs and sign from gesture, as well as how our work informs both sign language documentation and description specifically and linguistic theory more generally. References: Johnston, T. (2011). Lexical frequency in sign languages. Journal of Deaf Studies and Deaf Education. Johnston, T., & Schembri, A. (1999). On defining lexeme in a signed language. Sign Language and Linguistics, 2(2), 115-185. Johnston, T., & Schembri, A. (2010). Variation, lexicalization and grammaticalization in signed languages. Langage et Société, 131, 19-35. | en_AU
dc.description.sponsorship | PARADISEC (Pacific And Regional Archive for Digital Sources in Endangered Cultures), Australian Partnership for Sustainable Repositories, Ethnographic E-Research Project and Sydney Object Repositories for Research and Teaching. | en_AU
dc.language.iso | en | en_AU
dc.relation.ispartof | Sustainable data from digital research: Humanities perspectives on digital scholarship. | en_AU
dc.title | Challenges in lemmatising signed language digital video corpora: the measure of lexical frequency in Australian and British signed languages | en_AU
dc.type | Conference paper | en_AU


There are no previous versions of the item available.