Improvement of Translation Accuracy for the Outlines of Japanese Statutes by Splitting Parenthesized Expressions

Authors

  • Kouhei Okada, Nagoya University
  • Yasuhiro Ogawa
  • Makoto Nakamura
  • Tomohiro Ohno
  • Katsuhiko Toyama

Abstract

To share Japanese legal information globally, we translate the outlines of Japanese statutes. These outlines are the official summaries of Japanese statutes and are useful for quickly understanding their contents. In a previous statistical machine translation system for the outlines, we found that the training corpus, which consisted of both statutes and their outlines, included many long sentences that reduced translation quality. To solve this problem, we shorten the sentences by focusing on parenthesized expressions. In this paper, we propose a translation method that splits off parenthesized expressions from the sentences. Experimental results show the effectiveness of our method.
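The abstract does not spell out how parenthesized expressions are split off, so the following is only a minimal sketch of one way the idea could work, not the authors' implementation. It assumes that each outermost parenthesized expression (ASCII or full-width parentheses) is replaced by a placeholder, so that the shortened main sentence and the extracted expressions could be translated separately. The function name `split_parenthesized` and the placeholder format `<P0>`, `<P1>`, ... are hypothetical.

```python
# Hypothetical sketch: split outermost parenthesized expressions off a sentence.
# The placeholder scheme and function name are illustrative assumptions.

OPEN = "(（"    # ASCII and full-width opening parentheses
CLOSE = ")）"   # ASCII and full-width closing parentheses


def split_parenthesized(sentence, placeholder="<P{}>"):
    """Return (main_sentence, extracted_expressions).

    Each outermost parenthesized expression is removed from the sentence
    and replaced by a numbered placeholder. Nested parentheses stay inside
    the extracted expression. If the parentheses are unbalanced, the
    sentence is returned unchanged with an empty list.
    """
    main = []        # characters of the shortened main sentence
    extracted = []   # contents of the outermost parenthesized expressions
    buf = []         # characters of the expression currently being read
    depth = 0        # current nesting depth

    for ch in sentence:
        if ch in OPEN:
            if depth > 0:
                buf.append(ch)   # nested opener belongs to the expression
            depth += 1
        elif ch in CLOSE and depth > 0:
            depth -= 1
            if depth == 0:
                extracted.append("".join(buf))
                main.append(placeholder.format(len(extracted) - 1))
                buf = []
            else:
                buf.append(ch)   # nested closer belongs to the expression
        elif depth > 0:
            buf.append(ch)
        else:
            main.append(ch)

    if depth != 0:
        # Unbalanced input: leave the sentence as it is.
        return sentence, []
    return "".join(main), extracted


if __name__ == "__main__":
    main_part, parts = split_parenthesized(
        "主務大臣（以下「大臣」という。）は、計画を承認する。"
    )
    print(main_part)
    print(parts)
```

In a pipeline of this kind, the main sentence and each extracted expression would be translated on their own, and the translated expressions would then be put back at the placeholder positions. How the placeholders are handled by the translation system is left open here.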

Published

2016-03-01

Section

Data organization and legal informatics