
Author Biography and the Dissertation's Chinese and English Abstracts

   Dissertation title: Research on Optimized Learning and Decoding Methods for Tree-to-Tree Statistical Machine Translation

   Author biography: Xiao Tong, male, born in 19__; began doctoral study under Professor ____ in September 20__ and was awarded the doctorate in July 20__. Traditional statistical machine translation models translation at the word or n-gram level. Although such approaches are fault-tolerant and simple to implement, they do not consider the syntactic information of the source and target languages, and therefore handle many important translation problems (e.g., long-distance dependencies) poorly. Syntax-based statistical machine translation emerged in response to these problems. In particular, tree-to-tree translation models (models that use both the source- and target-language syntax trees) can exploit bilingual syntactic information simultaneously for reordering, source-structure analysis, and target-structure generation, and therefore have greater potential for improving translation quality than other syntax-based models. Within the framework of syntax-based statistical machine translation, this dissertation studies several key problems in the optimized learning and decoding of tree-to-tree translation models and proposes solutions. The main content covers the following four aspects:

   This dissertation proposes an unsupervised tree-to-tree structure alignment model. First, the structure alignment problem is cast as the derivation of translation rules; the structure-alignment probability is then simplified into a combination of several factors, and the model parameters are learned without supervision using the EM algorithm. Building on the resulting alignment model, the dissertation further uses the posterior probabilities of tree-structure alignments to construct a tree-to-tree alignment matrix, from which translation rules are extracted. Compared with traditional rule extraction based on a single alignment, extraction from the tree-to-tree alignment matrix significantly increases rule coverage and thereby improves the system's translation quality.

   This dissertation proposes a beam-width-limited model training method. Unlike traditional parameter training, which ignores the search problem, the proposed method brings width-limited beam search into the training process. By defining different loss functions, training is modeled from two angles: beam search and translation-quality evaluation (e.g., BLEU). Model parameters are then trained automatically from bilingual data via iterative learning. Because the method accounts for search and evaluation factors during training, the resulting model is better suited to (tree-to-tree) decoding and improves translation accuracy on the test set.

   For the tree-to-tree decoding problem, this dissertation proposes a mixed-granularity decoding method and an ensemble-learning-based decoding optimization. The former defines translation grammars (or models) of different granularities to describe the translation process at different levels, then mixes grammars of several granularities during tree-to-tree decoding. On one hand, coarse-grained grammars ensure that decoding explores a sufficiently large search space, reducing search errors; on the other hand, fine-grained grammars evaluate translation hypotheses more accurately, improving the accuracy of model scoring. The basic idea of ensemble-based decoding is to use a single decoder to generate multiple sets of candidate translations and then re-decode over all of these candidates to obtain a better translation. Experimental results show that both optimized decoding methods significantly improve the translation quality of tree-to-tree systems.

   This dissertation also proposes a target-tree-structure evaluation model based on tree-substitution grammars. First, the syntactic trees corresponding to the tree-to-tree system's outputs are modeled, and tree-substitution grammars are used to assess the quality of the target-language tree structures. Learned on the target-language side of the machine-translation training data, the proposed evaluation model accurately assesses the syntactic quality of translation outputs and thereby improves translation performance. In addition, the dissertation studies how to integrate the target-tree evaluation model into the decoder and proposes three integration methods.

   Based on these techniques, we developed the open-source statistical machine translation system NiuTrans (nlplab.com/NiuPlan/NiuTrans.html) and achieved first- and second-place results in several domestic and international machine translation evaluations, including NTCIR and CWMT. Keywords: statistical machine translation; tree-to-tree translation; syntactic alignment; parameter training; decoding

   On Training and Decoding Approaches to Tree-to-tree

   Statistical Machine Translation

   XIAO Tong

   ABSTRACT

   Machine translation is one of man's oldest dreams and has received growing interest over a long period of time. Recently, statistical approaches have been successfully applied to machine translation. More and more studies have focused on learning translation systems from large collections of bilingual sentence pairs and automatically translating new sentences with the resulting systems. In statistical machine translation, traditional approaches model translation at either the word or the n-gram (phrase) level. While these approaches are robust and easy to implement, they ignore the underlying (syntactic) structure of sentences and thus have limited capability in dealing with long-distance dependencies and in generating grammatically correct output. To address these problems, the syntax-based approach has been recognized as one of the most desirable solutions. Among the various syntax-based models, tree-to-tree models (i.e., translating from a given source-language parse tree into a target-language parse tree) are no doubt the most promising direction due to their obvious advantages over phrase-based and other syntax-based counterparts, such as better use of bilingual syntax in modeling the reordering problem, better analysis of the source tree, and syntactic generation of the target-language structure. In this article, we investigate approaches to tree-to-tree translation. In particular, we focus on developing better model-learning and decoding methods for tree-to-tree systems. Our contributions are summarized as follows:

   We present an unsupervised sub-tree alignment model. In this work, we first model the sub-tree alignment problem as derivations of tree-to-tree transfer rules, and decompose the model into a product of several factors under reasonable assumptions. The model parameters are then learned on the bilingual tree-pairs using the EM algorithm. Moreover, as a by-product, the proposed model can produce a sub-tree alignment matrix rather than 1-best/k-best alignments. As the sub-tree alignment matrix encodes an exponentially large number of possible alignments, we can extract additional translation rules from it. As a result, we increase the coverage of the extracted rule set and thus improve translation quality.
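The contrast between 1-best and matrix-based rule extraction can be sketched in a few lines of Python. This is an illustrative toy, not the dissertation's implementation; the node labels, posterior values, and threshold are all invented for the example:

```python
# Toy sketch of rule extraction from a sub-tree alignment posterior
# matrix vs. a single 1-best alignment. All labels and probabilities
# below are invented for illustration.

def rules_from_matrix(posterior, threshold=0.3):
    """Keep every (source, target) sub-tree pair whose alignment
    posterior clears the threshold, so plausible secondary links
    still contribute rules."""
    return {(s, t) for (s, t), p in posterior.items() if p >= threshold}

def rules_from_1best(posterior):
    """Baseline: keep only the single best target sub-tree per
    source sub-tree."""
    best = {}
    for (s, t), p in posterior.items():
        if s not in best or p > best[s][1]:
            best[s] = (t, p)
    return {(s, t) for s, (t, _) in best.items()}

posterior = {
    ("NP_src", "NP_tgt"): 0.7,
    ("NP_src", "DNP_tgt"): 0.4,   # plausible secondary link
    ("VP_src", "VP_tgt"): 0.9,
    ("VP_src", "IP_tgt"): 0.1,    # noise, below threshold
}
matrix_rules = rules_from_matrix(posterior)
onebest_rules = rules_from_1best(posterior)
```

Here the matrix-based extraction keeps the secondary NP link that the 1-best alignment discards, so its rule set is a strict superset of the baseline's, mirroring the coverage gain described above.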

   We present a beam-width-limited approach to training tree-to-tree models. Unlike traditional approaches, we do not ignore the search problem at the training stage; instead, we directly parameterize the beam-search problem by incorporating various loss functions into the model. In particular, we consider both the width-limited beam search and the measure of translation quality (e.g., BLEU) in training, and design two loss functions to model these two factors. Furthermore, we propose a simple and effective method to learn our model from a bilingual corpus in an iterative manner. Our experimental studies show that the proposed approach is very helpful in improving a state-of-the-art tree-to-tree system because it reduces the mismatch between training and decoding.
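A minimal sketch of the underlying idea, assuming a simple linear model (the feature names, weights, and toy BLEU values are invented, and the dissertation's actual two loss functions are not reproduced here): keep only the hypotheses a width-limited beam would retain, and charge the model for the metric shortfall of its best surviving hypothesis.

```python
# Toy sketch of a beam-width-limited training loss: the loss is the
# BLEU gap between the best hypothesis overall (the oracle) and the
# model-best hypothesis that survives a width-limited beam.

def model_score(hyp, weights):
    """Linear model score over the hypothesis's feature values."""
    return sum(weights[f] * v for f, v in hyp["features"].items())

def beam_loss(hypotheses, weights, beam_width):
    # Keep only what a width-limited beam search would retain ...
    beam = sorted(hypotheses, key=lambda h: model_score(h, weights),
                  reverse=True)[:beam_width]
    # ... and penalize the BLEU gap of the model-best survivor
    # relative to the oracle translation.
    model_best = max(beam, key=lambda h: model_score(h, weights))
    oracle = max(hypotheses, key=lambda h: h["bleu"])
    return oracle["bleu"] - model_best["bleu"]

hyps = [
    {"features": {"lm": 2.0, "tm": 1.0}, "bleu": 0.30},
    {"features": {"lm": 1.0, "tm": 3.0}, "bleu": 0.45},
    {"features": {"lm": 0.5, "tm": 0.5}, "bleu": 0.50},  # oracle, but pruned
]
weights = {"lm": 1.0, "tm": 1.0}
loss = beam_loss(hyps, weights, beam_width=2)
```

Minimizing such a loss over iterations pushes the weights toward keeping high-BLEU hypotheses inside the beam, which is the training/decoding mismatch reduction mentioned above.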

   We present two improved approaches to tree-to-tree decoding. The first is a coarse-to-fine approach. Unlike previous approaches, we do not resort to a single grammar, but instead decode with several grammars that make different use of syntax, ranging from coarse-grained to fine-grained. Because coarse-grained grammars allow a "large" search during decoding, the decoder suffers less from search errors; on the other hand, fine-grained grammars assign a more accurate model score to each translation hypothesis and thus reduce model errors. The second decoding approach is based on ensemble learning. In this approach, we first generate a number of different outputs using a single translation model (or decoder), and then "select" a better translation from the pool of these translation outputs. Experimental results show that the proposed approach significantly outperforms the baseline that relies on a single MT output.
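The "select a better translation from the pool" step can be illustrated with a simple consensus heuristic. Unigram agreement is a crude stand-in for the real system-combination method, which the abstract does not specify, and the sentences are invented:

```python
# Toy sketch of ensemble selection: pool candidate translations from
# several runs of one decoder and pick the one that agrees most with
# the rest of the pool (a stand-in for real system combination).

def agreement(candidate, others):
    """Average unigram overlap of `candidate` with the other candidates."""
    words = set(candidate.split())
    return sum(len(words & set(o.split())) / max(len(words), 1)
               for o in others)

def select_consensus(pools):
    candidates = [c for pool in pools for c in pool]
    return max(candidates,
               key=lambda c: agreement(c, [o for o in candidates if o != c]))

pools = [
    ["the cat sat", "a cat sat"],   # n-best list from run 1
    ["the cat sat on mat"],         # run 2
    ["the dog sat"],                # run 3
]
choice = select_consensus(pools)
```

The candidate shared (in large part) by the most runs wins, which is the intuition behind re-decoding over the pooled outputs.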

   We propose a tree-substitution-grammar-based evaluation model of target-tree structure (a syntax-based language model) for tree-to-tree translation. First, we model the target tree structure using tree-substitution grammars (TSGs), and then measure the goodness of the tree structures generated during decoding using various parsing models. The proposed model can be learned on auto-parsed data. Experimental results show that it benefits a state-of-the-art tree-to-tree translation system, achieving promising BLEU improvements. In addition, we present three methods for integrating the proposed evaluation model into decoding; all of them lead to a further improvement in the translation accuracy of the tree-to-tree system.
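As a hedged illustration of the TSG-based scoring idea (the fragment inventory and probabilities below are toy values, not learned from real data): a derivation is decomposed into elementary-tree fragments, and its score is the sum of their log probabilities, with a fixed penalty for fragments unseen in training.

```python
import math

# Toy sketch of a TSG-based target-tree score: decompose a derivation
# into elementary-tree fragments and sum log p(fragment); fragments
# never seen in the (auto-parsed) training trees get a fixed penalty.

def tsg_log_score(fragments, probs, unseen_logp=-10.0):
    """Sum of log fragment probabilities, penalizing unseen fragments."""
    return sum(math.log(probs[f]) if f in probs else unseen_logp
               for f in fragments)

# Invented fragment inventory with toy probabilities.
probs = {
    "(S (NP) (VP))": 0.6,
    "(NP (DT the) (NN cat))": 0.2,
    "(VP (VBD sat))": 0.3,
}
grammatical = ["(S (NP) (VP))", "(NP (DT the) (NN cat))", "(VP (VBD sat))"]
scrambled = ["(S (NP) (VP))", "(NP (NN cat) (DT the))", "(VP (VBD sat))"]
```

A tree built from attested fragments outscores one containing an unattested (here, scrambled) fragment, which is how such a model can steer the decoder toward well-formed target syntax.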

   The above techniques have been employed in an open-source machine translation system, NiuTrans (nlplab.com/NiuPlan/NiuTrans.html), which has been released to the community for research purposes. The achievements herein also helped us attain top performance (first- and second-place results) in recent translation evaluation tasks such as NTCIR and CWMT.

   Key words: Statistical Machine Translation; Tree-to-tree Translation; Syntactic Alignment; Parameter Estimation; Decoding

