Investigating Connected Speech from Tohono O'odham Digitized Legacy Data
Access status:
Open Access
Type
Conference paper
Author/s
Fitzgerald, Colleen
Abstract
Archival and legacy resources provide rich material for linguistic investigation, provided such materials are accessible. The growing number of digital tools offers prospects for investigating new research questions. Working with such resources and considering usability, particularly with community members, leads to new insights from old data. In this paper, I discuss how this has happened for the Tohono O'odham language through the digitization of two legacy resources: an out-of-print dictionary (Mathiot 1973) and fieldnotes from the late Kenneth Hale. The paper focuses on the new linguistic investigations made possible by digitizing these materials. Tohono O'odham is a Native American language primarily spoken in the southwestern United States. The number of speakers, estimated at 8,000 to 10,000, has been declining. While considerable linguistic research has been done on the Tohono O'odham language, including preliminary descriptions (Dolores 1913, Mason 1950, Hale 1959, Saxton 1963, 1982, Mathiot 1973, Zepeda 1988), most research has focused on syntax (e.g., Hale 1975, Fitzgerald 2003), phonology (e.g., Hill and Zepeda 1992, Fitzgerald 1997, 2002), and morphology (e.g., Zepeda 1984, 1987, Hill and Zepeda 1991, 1998). Relatively little has been done on connected speech, an area that is particularly important for an endangered language because of its benefits for revitalization and second language instruction. In fact, both the Mathiot dictionary and the Hale field notebooks contain connected speech data that has not yet been analyzed. Mathiot noted deletions, assimilations and other attested surface forms, with annotations indicating the citation form. The Hale notebooks date from 1961 and document two Tohono O'odham speakers (of different dialects) and a third speaker of the Pima variety, with accompanying recordings. While Hale's transcriptions are in citation form, he marks pauses and high-level phrase groupings and thus gives indications of prosodic phrasing.
His transcriptions are also contextualized, whereas Mathiot's connected speech transcriptions are given out of their narrative context. The digitized O'odham materials have considerable potential for data-mining, as well as practical uses for revitalization and maintenance. I address the challenges each set of materials presents, including how best to represent aspects of connected discourse in tools such as FLEx, which elements are helpful for revitalization and language teaching, and how to package information for second language learners. Exploring these implications is useful for other revitalization and research teams: it offers ideas about what type of documentation will have future use, and it highlights the challenges of balancing connected speech and citation forms in standardized formats like dictionaries.
Date
2011-01-01
Source title
Sustainable data from digital research: Humanities perspectives on digital scholarship.