Computational Phraseology.

Material type: Text
Series: IVITRA Research in Linguistics and Literature Series
Publisher: Amsterdam/Philadelphia : John Benjamins Publishing Company, 2020
Copyright date: ©2020
Edition: 1st ed
Description: 1 online resource (341 pages)
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9789027261397
Additional physical formats: Print version: Computational Phraseology
DDC classification:
  • 415
LOC classification:
  • P326.5.P45
Contents:
  • Intro -- Computational Phraseology -- Editorial page -- Title page -- Copyright page -- Table of contents
  • Foreword -- The chapters -- Profiling phraseology in different languages -- Measures for phraseology discovery -- All we need is corpora -- References
  • Introduction -- References
  • Monocollocable words: A type of language combinatory periphery -- 0. Opening -- 1. By way of introduction -- 2. Substance and definition of monocollocable words -- 3. Are there monocollocable words on the language periphery only? -- 4. Distribution of monocollocable words -- 5. Language combinations and language periphery -- 6. Identification of MWS in corpus -- 7. Outlook and applications -- References
  • Translation asymmetries of multiword expressions in machine translation: An analysis of the TED-MWE corpus -- 1. Introduction -- 2. Related work -- 3. The TED-MWE corpus -- 4. The annotation guidelines -- 5. The annotation methodology -- Individual annotation -- Inter-annotation validation -- Evaluation -- 6. The results of the annotation process -- 7. Translation asymmetries and mistranslations in the TED-MWE corpus -- 7. Conclusions and future work -- References -- Correspondence information
  • German constructional phrasemes and their Russian counterparts: A corpus-based study -- 1. Introduction -- 2. German deictic elements hin and her: Semantics and combinatorial potential -- 3. Construction vor sich her: Underlying pattern, semantics and Russian counterparts -- 3.1 The constructional phraseme [vor sich her + v] and its underlying pattern -- 3.2 Semantic derivation, construction polysemy and lexicographic description -- 4. The construction vor sich hin: Semantics, co-occurrence types and Russian counterparts -- 4.1 Types of verbs co-occurring with vor sich hin -- 4.2 Vor sich hin: Semantic features -- 5. Conclusion -- Funding -- References -- Corpora
  • Computational phraseology and translation studies: From theoretical hypotheses to practical tools -- 1. Introduction -- 2. Phraseology and translation studies -- 3. Problems posed by phraseology to human translation -- 4. Problems posed by phraseology to machine translation -- 5. Theoretical hypotheses -- 6. Towards new practical tools -- 7. Conclusion -- References
  • Computational extraction of formulaic sequences from corpora: Two case studies of a new extraction algorithm -- 1. Introduction -- 1.1 Counting co-occurrences -- 1.2 N-Gram sizes/configurations and the problem of redundancy -- 1.3 Recent approaches -- 2. The MERGE algorithm -- 3. Case study 1: MERGE vs. AFL -- 3.1 Materials -- 3.2 Results -- 3.3 Interim conclusions -- 4. Case study 2: Exploring MERGE in the context of L1 acquisition -- 4.1 Materials and methods -- 4.2 Results -- 4.3 Discussion -- 5. Conclusion -- References -- Appendix. Summary statistics for the linear model on the acquisition data
  • Computational phraseology discovery in corpora with the MWETOOLKIT -- 1. Introduction -- 2. Computational phraseology discovery -- 2.1 General architecture -- 2.2 Freely available tools -- 3. The mwetoolkit -- 4. Phraseology discovery with the mwetoolkit -- 4.1 Candidate search patterns -- 4.2 Association scores -- 4.3 Other scores -- 5. Conclusions and open issues -- References
  • Multiword expressions in comparable corpora -- 1. Comparable corpora: A brief survey -- 2. Aranea comparable corpora -- 2.1 Methodology -- 2.2 Available corpora -- 2.3 Access to CC -- 3. Multi-word expressions in comparable corpora -- 3.1 Competition between monolingual and comparable corpora -- 3.2 Data mining in comparable corpora -- 4. Conclusion -- References
  • Collecting collocations from general and specialised corpora: A comparative analysis -- 1. Introduction -- 2. Lexical combinations in terminology and lexicography -- 3. A comparative analysis -- 3.1 Corpora -- 3.2 Lexical items selected -- 3.3 Automated extraction of collocations -- 4. Observations on the lists of candidate collocations -- 4.1 Overlap of candidate collocates -- 4.2 Rank of candidates -- 4.3 How collocates reveal specific meanings of items -- 5. Concluding remarks: Summary and guidelines for terminologists -- Acknowledgements -- References -- Appendix -- Résumé -- Funding information
  • What matters more: The size of the corpora or their quality?: The case of automatic translation of multiword expressions using comparable corpora -- 1. Rationale -- 2. Our methodology for translating multiword expressions -- 3. Data and experiments -- 3.1 Comparable corpora -- 3.2 Data -- 3.3 Vector representations -- 3.4 Gold standard -- 4. Comparable corpora and translation of MWEs: Size vs. quality -- 5. Conclusion -- References
  • Statistical significance for measures of collocation strength (WP3) -- 1. Introduction -- 2. The chi-squared test (X2) -- 3. The log-likelihood test (G2) -- 4. Fisher's exact test -- 5. The z-score -- 6. The t-test -- 7. Pointwise mutual information -- 8. Computer simulations to estimate statistical significance -- 9. The Poisson distribution -- 10. Confidence limits of the mean and standard deviation -- 11. Experimental comparison of measures -- 12. Conclusion -- References
  • Verbal collocations and pronominalisation -- 1. Introduction -- 2. Parsing and collocation detection -- 3. Anaphora resolution -- 4. Verbal collocations and pronominalisation -- 5. Experimental results -- 5.1 Evaluation methodology -- 5.2 Evaluation results -- 6. Conclusion -- References
  • Empirical variability of Italian multiword expressions as a useful feature for their categorisation -- 1. Introduction -- 2. Anomalous behaviours of Italian Multiword Expressions -- 3. A quantitative approach to MWEs -- 3.1 Reasons to go beyond statistics -- 3.2 Reasons for an empirical, quantitative approach to MWEs -- 4. Methodology -- 4.1 Syntactic variations -- 4.2 Lexical variations -- 4.3 Inflectional variations -- 5. Analysis and results -- 6. Conclusion -- References
  • Too big to fail but big enough to pay for their mistakes: A collostructional analysis of the patterns [too ADJ to V] and [ADJ enough to V] -- 1. Introduction -- 2. Background -- 2.1 Descriptive background -- 2.2 Methodological background -- 3. Case studies -- 3.1 Data: Source, extraction, cleaning -- 3.2 Case study: Simple collexeme analysis (SCA) -- 3.3 Case study: Distinctive collexeme analysis (DCA) -- 3.4 Case study: Co-varying collexeme analysis (CCA) -- 3.5 Case study: Distinctive co-varying collexeme analysis (DCCA) -- 4. Summary -- References
  • Multi-word patterns and networks: How corpus-driven approaches have changed our description of language use -- 1. Introduction -- 2. The rocky road of qualitative interpretation -- 3. Kinds of lexical fixedness -- 3.1 From multiword expressions to patterns -- 3.2 MWPS as autonomous units -- 3.3 Extended context patterns (ECPS) -- 4. Corpus-linguistic methodology and interpretation -- 4.1 Corpus searches -- 4.2 Collocation profiles -- 4.3 KWIC bundles and slot-filler analysis -- 5. A new type of corpus-driven, pattern-based MW dictionaries -- 6. Conclusion -- References -- Internet sources -- Abbreviations
  • How context determines meaning -- 1. Patterns and valency -- 2. The verb is the pivot of the clause -- 3. Collocations and lexical sets -- 4. Core meaning -- 5. Phrasal verbs -- 6. Exploiting established phraseology -- 6.1 Phraseology that is both literal and figurative -- 7. Exploiting a proverb -- 8. Other IDIOMS with 'blow' -- 9. Conclusion -- References
  • Detecting semantic difference: A new model based on knowledge and collocational association -- 1. Introduction -- 2. Related work -- 3. Methodology -- 3.1 Association-based score -- 3.2 Google N-Grams -- 3.3 Word embedding-based score -- 3.4 ConceptNet score -- 4. Experiments -- 4.1 Data -- 4.2 Experimental setup -- 4.3 Evaluation metrics -- 5. Results and discussion -- 6. Conclusion -- References
  • Index
Summary: In recent years, an increasing number of studies have dealt with the computational treatment of multiword expressions: identification, extraction, translation, and the role they play in Natural Language Processing applications. This book aims to address the need for better understanding in this comparatively new field of Computational Phraseology.
No physical items for this record

Description based on publisher supplied metadata and other sources.

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
