TY - BOOK
AU - Corpas Pastor,Gloria
AU - Colson,Jean-Pierre
TI - Computational Phraseology
T2 - IVITRA Research in Linguistics and Literature Series
SN - 9789027261397
AV - P326.5.P45
U1 - 415
PY - 2020///
CY - Amsterdam/Philadelphia
PB - John Benjamins Publishing Company
KW - Phraseology
KW - Electronic books
N1 - Intro -- Computational Phraseology -- Editorial page -- Title page -- Copyright page -- Table of contents -- Foreword -- The chapters -- Profiling phraseology in different languages -- Measures for phraseology discovery -- All we need is corpora -- References -- Introduction -- References -- Monocollocable words: A type of language combinatory periphery -- 0. Opening -- 1. By way of introduction -- 2. Substance and definition of monocollocable words -- 3. Are there monocollocable words on the language periphery only? -- 4. Distribution of monocollocable words -- 5. Language combinations and language periphery -- 6. Identification of MWS in corpus -- 7. Outlook and applications -- References -- Translation asymmetries of multiword expressions in machine translation: An analysis of the TED-MWE corpus -- 1. Introduction -- 2. Related work -- 3. The TED-MWE corpus -- 4. The annotation guidelines -- 5. The annotation methodology -- Individual annotation -- Inter-annotation validation -- Evaluation -- 6. The results of the annotation process -- 7. Translation asymmetries and mistranslations in the TED-MWE corpus -- 7. Conclusions and future work -- References -- Correspondence information -- German constructional phrasemes and their Russian counterparts: A corpus-based study -- 1. Introduction -- 2. German deictic elements hin and her: Semantics and combinatorial potential -- 3. Construction vor sich her: Underlying pattern, semantics and Russian counterparts -- 3.1 The constructional phraseme [vor sich her + v] and its underlying pattern -- 3.2 Semantic derivation, construction polysemy and lexicographic description -- 4.
The construction vor sich hin: Semantics, co-occurrence types and Russian counterparts -- 4.1 Types of verbs co-occurring with vor sich hin -- 4.2 Vor sich hin: Semantic features -- 5. Conclusion -- Funding -- References -- Corpora -- Computational phraseology and translation studies: From theoretical hypotheses to practical tools -- 1. Introduction -- 2. Phraseology and translation studies -- 3. Problems posed by phraseology to human translation -- 4. Problems posed by phraseology to machine translation -- 5. Theoretical hypotheses -- 6. Towards new practical tools -- 7. Conclusion -- References -- Computational extraction of formulaic sequences from corpora: Two case studies of a new extraction algorithm -- 1. Introduction -- 1.1 Counting co-occurrences -- 1.2 N-Gram sizes/configurations and the problem of redundancy -- 1.3 Recent approaches -- 2. The MERGE algorithm -- 3. Case study 1: MERGE vs. AFL -- 3.1 Materials -- 3.2 Results -- 3.3 Interim conclusions -- 4. Case study 2: Exploring MERGE in the context of L1 acquisition -- 4.1 Materials and methods -- 4.2 Results -- 4.3 Discussion -- 5. Conclusion -- References -- Appendix. Summary statistics for the linear model on the acquisition data -- Computational phraseology discovery in corpora with the MWETOOLKIT -- 1. Introduction -- 2. Computational phraseology discovery -- 2.1 General architecture -- 2.2 Freely available tools -- 3. The mwetoolkit -- 4. Phraseology discovery with the mwetoolkit -- 4.1 Candidate search patterns -- 4.2 Association scores -- 4.3 Other scores -- 5. Conclusions and open issues -- References -- Multiword expressions in comparable corpora -- 1. Comparable corpora: A brief survey -- 2. Aranea comparable corpora -- 2.1 Methodology -- 2.2 Available corpora -- 2.3 Access to CC -- 3. Multi-word expressions in comparable corpora -- 3.1 Competition between monolingual and comparable corpora -- 3.2 Data mining in comparable corpora -- 4.
Conclusion -- References -- Collecting collocations from general and specialised corpora: A comparative analysis -- 1. Introduction -- 2. Lexical combinations in terminology and lexicography -- 3. A comparative analysis -- 3.1 Corpora -- 3.2 Lexical items selected -- 3.3 Automated extraction of collocations -- 4. Observations on the lists of candidate collocations -- 4.1 Overlap of candidate collocates -- 4.2 Rank of candidates -- 4.3 How collocates reveal specific meanings of items -- 5. Concluding remarks: Summary and guidelines for terminologists -- Acknowledgements -- References -- Appendix -- Résumé -- Funding information -- What matters more: The size of the corpora or their quality?: The case of automatic translation of multiword expressions using comparable corpora -- 1. Rationale -- 2. Our methodology for translating multiword expressions -- 3. Data and experiments -- 3.1 Comparable corpora -- 3.2 Data -- 3.3 Vector representations -- 3.4 Gold standard -- 4. Comparable corpora and translation of mwes: Size vs. quality -- 5. Conclusion -- References -- Statistical significance for measures of collocation strength (WP3) -- 1. Introduction -- 2. The chi-squared test (X2) -- 3. The log-likelihood test (G2) -- 4. Fisher's exact test -- 5. The z-score -- 6. The t-test -- 7. Pointwise mutual information -- 8. Computer simulations to estimate statistical significance -- 9. The poisson distribution -- 10. Confidence limits of the mean and standard deviation -- 11. Experimental comparison of measures -- 12. Conclusion -- References -- Verbal collocations and pronominalisation -- 1. Introduction -- 2. Parsing and collocation detection -- 3. Anaphora resolution -- 4. Verbal collocations and pronominalisation -- 5. Experimental results -- 5.1 Evaluation methodology -- 5.2 Evaluation results -- 6. Conclusion -- References -- Empirical variability of Italian multiword expressions as a useful feature for their categorisation -- 1. Introduction -- 2.
Anomalous behaviours of Italian Multiword Expressions -- 3. A quantitative approach to MWEs -- 3.1 Reasons to go beyond statistics -- 3.2 Reasons for an empirical, quantitative approach to MWEs -- 4. Methodology -- 4.1 Syntactic variations -- 4.2 Lexical variations -- 4.3 Inflectional variations -- 5. Analysis and results -- 6. Conclusion -- References -- Too big to fail but big enough to pay for their mistakes: A collostructional analysis of the patterns [too ADJ to V] and [ADJ enough to V] -- 1. Introduction -- 2. Background -- 2.1 Descriptive background -- 2.2 Methodological background -- 3. Case studies -- 3.1 Data: Source, extraction, cleaning -- 3.2 Case study: Simple collexeme analysis (SCA) -- 3.3 Case study: Distinctive collexeme analysis (DCA) -- 3.4 Case study: Co-varying collexeme analysis (CCA) -- 3.5 Case study: Distinctive co-varying collexeme analysis (DCCA) -- 4. Summary -- References -- Multi-word patterns and networks: How corpus-driven approaches have changed our description of language use -- 1. Introduction -- 2. The rocky road of qualitative interpretation -- 3. Kinds of lexical fixedness -- 3.1 From multiword expressions to patterns -- 3.2 MWPS as autonomous units -- 3.3 Extended context patterns (ECPS) -- 4. Corpus-linguistic methodology and interpretation -- 4.1 Corpus searches -- 4.2 Collocation profiles -- 4.3 KWIC bundles and slot-filler analysis -- 5. A new type of corpus-driven, pattern-based MW dictionaries -- 6. Conclusion -- References -- Internet sources -- Abbreviations -- How context determines meaning -- 1. Patterns and valency -- 2. The verb is the pivot of the clause -- 3. Collocations and lexical sets -- 4. Core meaning -- 5. Phrasal verbs -- 6. Exploiting established phraseology -- 6.1 Phraseology that is both literal and figurative -- 7. Exploiting a proverb -- 8. Other IDIOMS with 'blow' -- 9. Conclusion -- References -- Detecting semantic difference: A new model based on knowledge and collocational association -- 1.
Introduction -- 2. Related work -- 3. Methodology -- 3.1 Association-based score -- 3.2 Google N-Grams -- 3.3 Word embedding-based score -- 3.4 ConceptNet score -- 4. Experiments -- 4.1 Data -- 4.2 Experimental setup -- 4.3 Evaluation metrics -- 5. Results and discussion -- 6. Conclusion -- References -- Index
N2 - In recent years, an increasing number of studies have dealt with the computational treatment of multiword expressions: their identification, extraction, and translation, and the role they play in Natural Language Processing applications. This book aims to address the need for better understanding in this comparatively new field of Computational Phraseology.
UR - https://ebookcentral.proquest.com/lib/orpp/detail.action?docID=6177126
ER -