Title: The relationship between lexicon and syntax in texts written in Catalan by school children and adolescents
Author: Llauradó Singla, Anna
Keywords: Lexicon
Spelling errors
Issue Date: 18-Oct-2012
Publisher: Universitat de Barcelona
Abstract: [cat] Within the process of language development, the school years play an important role. Schoolchildren's linguistic knowledge can hardly be characterized without taking into account their performance in the written modality. To characterize the language development of Catalan schoolchildren between 5 and 16 years of age, we compiled the CESCA corpus (Català Escolar Escrit a Catalunya). CESCA includes written vocabularies and texts produced by 2,436 schoolchildren. The participants have different home languages and represent the multilingualism of the school population. Following a corpus-based approach, we examined different domains of development: the lexicon, syntax (and the relation between these two domains), and spelling. We also examined the influence of multilingualism on lexical development. First, we found that the lexicon grows markedly during primary school in both size and quality, progressively including morphologically complex words, a higher proportion of adjectives, and more advanced terms or multiword constructions. Unlike similar studies in other languages, we did not find that the lexical density of texts grows with age. Mother tongue is a relevant variable for lexical growth. However, multilingualism was not necessarily detrimental to later lexical development. Using a language at school that differs from the home language is more of a disadvantage for monolingual children than for children who use Catalan at school and speak more than one language outside school. Second, the analysis of the texts with regard to patterns of growth in syntactic complexity at two different sites, the noun phrase and the clause level, shows that the acquisition of syntactic complexity is a protracted process and, in the case of clause complexity, a late-emerging one.
We found significant but scattered correlations between lexical and syntactic uses in the written texts. Finally, we looked at how schoolchildren use their linguistic knowledge to represent different linguistic segments orthographically. We found that children make fewer errors when they can rely on phonographic and morphological analysis of words than when they need orthographic or lexical knowledge of the word. This thesis contributes to the field of later language development, uses a broad sample of children and adolescents that is representative of the school population, and examines a combination of linguistic variables, some well established and others less well known, that had not previously been used in the analysis of Catalan.
[eng] In the lifelong process of language development, the school years play a major role. The linguistic knowledge of schoolchildren can hardly be characterized without taking into account their performance in the written modality. Writing becomes the necessary platform for the remarkable changes that occur at the lexical, morphosyntactic and discursive levels, all of which are key to the successful attainment of literacy. In order to characterize the pathways of language development of Catalan schoolchildren ranging from 5 to 16 years of age, we compiled the CesCa (Català Escolar Escrit a Catalunya) corpus. CesCa includes written vocabularies from 5 different semantic fields and texts of 6 different types produced by 2,436 school children and adolescents attending 32 state and semi-state schools in Catalonia. The participants were divided into 5 groups according to their home languages. Only two groups spoke Catalan at home, either as their only language or bilingually alongside Spanish. The sample thus notably represents the multilingualism of the present school population and renders an updated picture of authentic (written) language productions by that population. All the written productions have been digitized and prepared for computational processing in the studies presented in the thesis. Using a corpus-based approach, we have examined different domains of development: the lexicon, syntax (and the relation between these two domains), and spelling as a problem-solving space in which different levels of language are involved. We have also examined the influence of multilingualism on lexical development. First, the domain of lexical development concerns the acquisition, through (linguistic) experience and interaction with the environment, of new lexical items and constructions that become better interconnected and that better represent the child's knowledge base.
We have found the lexicon to grow markedly throughout grade school in size as well as in quality, coming to include longer, morphologically complex words, a higher proportion of adjectives (a later-developing category) and more advanced terms or multiword constructions. Unlike similar research in other languages, we have not found the lexical density of texts to grow with age. Home language emerged as a relevant variable for lexical outcome. However, multilingualism was not necessarily detrimental to later lexical development. In fact, bilingual and multilingual children outperformed children who use only Spanish for out-of-school purposes. Thus, being instructed in a language different from one's home language is more of a handicap for monolingual children than for children who speak more than one language out of school (in addition to using Catalan at school). Both the vocabularies and the texts yield evidence that different semantic fields and types of texts trigger different types of lexicon. The different semantic fields triggered different grammatical categories; some primed more frequent words and others less frequent, more sophisticated terms. However, it is through the analysis of the text-embedded lexicon that we can best assess how, with age, children learn to fine-tune their lexical uses to the type of text they are producing. Next, the domain of syntactic development concerns the acquisition of more complex, low-frequency structures deployed for an increasingly varied range of purposes. We have analyzed the texts with regard to the pattern(s) of growth of syntactic complexity at two different sites: the noun phrase and the clause level. Compared to lexical development, the acquisition of syntactic complexity is a more protracted and, in the case of clause complexity, late process. Only 10th graders produced significantly more complex syntactic architectures, and explanations, the most school-like type of text, emerged as the preferred site for this increased complexity.
We have found significant but scarce correlations between the lexical and syntactic uses in the written texts. Finally, the domain of spelling concerns the way linguistic information is mapped onto orthographic segments in a particular language. We have examined the developmental pattern of spelling from 1st through 5th grade, with particular regard to the different types of knowledge necessary for rendering orthographic spelling in Catalan. We have found children to make fewer mistakes when they can turn to phonographic and morphological analysis of the words than when they need to use orthographic or lexical knowledge. Morphologically based spellings increase substantially between 1st and 2nd grade, pointing to a possible effect of the salient, rich morphology of Catalan. This thesis contributes to the field of later language development by covering a sample of children and adolescents that is wide-ranging in age and linguistic background, and by applying a combination of well-established language variables and others that are less well known to Catalan, a Romance language that has so far been little researched.
