A Right to Access Implies A Right to Know: An Open Online Platform for Research on the Readability of Law
Keywords:
readability, legislation, legal informatics, corpus linguistics, machine learning
Abstract
The widespread availability of legal materials online has opened the law to a new and greatly expanded readership. These new readers need the law to be readable by them when they encounter it. However, the available empirical research supports a conclusion that legislation is difficult to read if not incomprehensible to most citizens. We review approaches that have been used to measure the readability of text including readability metrics, cloze testing and application of machine learning. We report the creation and testing of an open online platform for readability research. This platform is made available to researchers interested in undertaking research on the readability of legal materials. To demonstrate the capabilities of the platform, we report its initial application to a corpus of legislation. Linguistic characteristics are extracted using the platform and then used as input features for machine learning using the Weka package. Wide divergences are found between sentences in a corpus of legislation and those in a corpus of graded reading material or in the Brown corpus (a balanced corpus of English written genres). Readability metrics are found to be of little value in classifying sentences by grade reading level (noting that such metrics were not designed to be used with isolated sentences).
References
Eloise Abrahams (2003), Efficacy of plain language drafting in labour legislation. Master's thesis (Human Resource Management), Cape Peninsula University of Technology, South Africa.
Sandra Aluisio, Lucia Specia, Caroline Gasperin, and Carolina Scarton (2010), Readability assessment for text simplification. In Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications, Association for Computational Linguistics, pp. 1-9.
F.A.R. Bennion (1983), Statute law. Oyez.
Robert W. Benson (1984), End of legalese: The game is over. NYU Review of Law & Social Change, Vol. 13, p. 519.
Steven Bird, Edward Loper, and Ewan Klein (2009), Natural Language Processing with Python. O'Reilly Media Inc.
J.R. Bormuth (1967), Cloze readability procedure. University of California Los Angeles.
Kevyn Collins-Thompson and James P Callan (2004), A language modeling approach to predicting reading difficulty. In HLT-NAACL, pp. 193-200.
O. De Clercq, V. Hoste, B. Desmet, P. Van Oosten, M. De Cock, and L. Macken (2013), Using the crowd for readability prediction. Natural Language Engineering, pp. 1-33.
Felice Dell'Orletta, Simonetta Montemagni, and Giulia Venturi (2011), Read-it: Assessing readability of Italian texts with a view to text simplification. In Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies, Association for Computational Linguistics, pp. 73-83.
W.H. DuBay (2004), The principles of readability. Impact Information, pp. 1-76.
Lijun Feng, Martin Jansche, Matt Huenerfauth, and Noémie Elhadad (2010), A comparison of features for automatic readability assessment. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, Association for Computational Linguistics, pp. 276-284.
W. N. Francis and H. Kucera (1964), A Standard Corpus of Present-Day Edited American. Revised 1971, Revised and Amplified 1979. Department of Linguistics, Brown University Providence, Rhode Island, USA. Available at: www.hit.uib.no/icame/brown/bcm.htm
GLPi and V. Smolenka (2000), A Report on the Results of Usability Testing Research on Plain Language Draft Sections of the Employment Insurance Act. Available at: http://www.davidberman.com/wp-content/uploads/glpi-english.pdf
Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten (2009), The WEKA data mining software. ACM SIGKDD Explorations, Vol. 11, No. 1.
J. Harrison and M. McLaren (1999), A plain language study: Do New Zealand consumers get a "fair go" with regard to accessible consumer legislation. Issues in Writing, Vol. 9, pp. 139-184.
Michael Heilman, Kevyn Collins-Thompson, and Maxine Eskenazi (2008), An analysis of statistical models and features for reading difficulty prediction. In Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications, Association for Computational Linguistics, pp. 71-79.
P. Heydari and A.M. Riazi (2012), Readability of texts: Human evaluation versus computer index. Mediterranean Journal of Social Sciences, Vol. 3, No. 1, pp. 177-190.
J. Miller (2005), The development of the legal information institutes around the world. Canadian Law Library Review, Vol. 30, No. 1, p. 8.
Simon James and Ian Wallschutzky (1997), Tax law improvement in Australia and the UK: the need for a strategy for simplification. Fiscal Studies, Vol. 18, No. 4, pp. 445-460.
Rohit J Kate, Xiaoqiang Luo, Siddharth Patwardhan, Martin Franz, Radu Florian, Raymond J Mooney, Salim Roukos, and Chris Welty (2010), Learning to predict readability using diverse linguistic features. In Proceedings of the 23rd International Conference on Computational Linguistics, Association for Computational Linguistics, pp. 546-554.
J. Kimble (1994), Answering the critics of plain language. The Scribes Journal of Legal Writing, Vol. 5, p. 51.
G.R. Klare (2000), Readable computer documentation. ACM Journal of Computer Documentation (JCD), Vol. 24, No. 3, pp. 148-168
Uta Kohl (2005), Ignorance is no defense, but is inaccessibility? On the accessibility of national laws to foreign online publishers. Information & Communications Technology Law, Vol. 14, No. 1, pp. 25-41
Hugo Liu (2004), Montylingua: An end-to-end natural language processor with common sense. Available at: http://web.media.mit.edu/~hugo/montylingua/
P.W. Martin (2000), The mushrooming virtual law library on the net. In Cornell Law Forum, Vol. 27.
D. Melham (1993), Clearer Commonwealth Law: Report of the Inquiry into Legislative Drafting by the Commonwealth. Technical report, House of Representatives Standing Committee on Legal and Constitutional Affairs.
Jay Milbrandt and Mark Reinhardt (2012), Access Denied: Does Withholding the Law Violate Human Rights? Regent Journal of International Law, Forthcoming. Available at SSRN: http://ssrn.com/abstract=2132672
Robert Munro, Steven Bethard, Victor Kuperman, Vicky Tzuyin Lai, Robin Melnick, Christopher Potts, Tyler Schnoebelen, and Harry Tily (2010), Crowdsourcing and language studies: the new generation of linguistic data. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, Association for Computational Linguistics, pp. 122-130.
PCO NZ (2007), Presentation of New Zealand Statute Law: Issues Paper 2. Technical Report 2, New Zealand Law Reform Commission and New Zealand Parliamentary Counsel's Office.
PCO NZ (2008), Presentation of New Zealand Statute Law. Technical Report 104, New Zealand Law Reform Commission and New Zealand Parliamentary Counsel's Office.
OLR (2003), Inland Revenue Evaluation of the Capital Allowances Act 2001 rewrite, Opinion Leader Research. Technical report, UK Inland Revenue.
OPC-Australia (2003), Plain English. Technical report, Australian Commonwealth Office of Parliamentary Counsel.
OPC-UK (2013), When Laws Become Too Complex: A Review into the Causes of Complex Legislation. Technical report, United Kingdom Office of Parliamentary Counsel.
PCO-NZ (2011), A Review of Methods for Measuring the Quality of Legislation. Technical report, New Zealand Parliamentary Counsel's Office.
N. Pettigrew, S. Hall, and D. Craig (2006), The Income Tax (Earnings and Pensions) Act - Post-Implementation Review, Final Report MORI.
Emily Pitler and Ani Nenkova (2008), Revisiting readability: A unified framework for predicting text quality. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 186-195.
G. Richardson and D. Smith (2002), Readability of Australia's goods and services tax legislation: An empirical investigation. Federal Law Review, Vol. 30, p. 475.
Adrian Sawyer (2010), Enhancing compliance through improved readability: Evidence from New Zealand rewrite experiment. Recent Research on Tax Administration and Compliance.
Sarah E Schwarm and Mari Ostendorf (2005), Reading level assessment using support vector machines and statistical language models. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp. 523-530.
Luo Si and Jamie Callan (2001), A statistical model for scientific readability. In Proceedings of the tenth international conference on Information and knowledge management, ACM, pp. 574-576.
Johan Sjöholm (2012), Probability as readability: A new machine learning approach to readability assessment for written Swedish. PhD thesis, Linköpings University, Sweden. Available at: http://www.ida.liu.se/projects/webblattlast/Rapporter/lasbarhet.pdf
D. Smith and G. Richardson (1999), The readability of Australia's taxation laws and supplementary materials: an empirical investigation. Fiscal Studies, Vol. 20, No. 3, pp. 321-349.
Edwin Tanner (2002), Seventeen years on: Is Victorian legislation less grammatically complicated. Monash University Law Review, Vol. 28, p. 403.
C van Noortwijk, RV De Mulder, and RW van Kralingen (1995), Word use in legal texts: statistical facts and practical applicability. Legal Knowledge Based Systems: Telecommunication and AI & Law (JURIX95), Lelystad: Koninklijke Vermande, pp. 91-100.
G. Venturi (2008), Parsing legal texts. A contrastive study with a view to Knowledge Management Applications. In Language Resources and Evaluation LREC 2008 Workshop on the Semantic Processing of Legal Texts, p. 1.
G. Wagner (1986), Interpreting cloze scores in the assessment of text readability and reading comprehension.
B. Woods, G. Moscardo, T. Greenwood, et al. (1998), A critical review of readability and comprehensibility tests. Journal of Tourism Studies, Vol. 9, No. 2, pp. 49-61
License
Copyright Agreement with Authors
Authors submitting a paper to JOAL automatically agree to confer a limited license to JOAL if and when the manuscript is accepted for publication. This license allows JOAL to publish a manuscript in a given issue, by any means, anywhere in the world. Authors whose submissions have been accepted then have a choice of:
- Dedicating the article to the public domain. This allows anyone to make any use of the article at any time, including commercial use. A good way to do this is to use the Creative Commons Public Domain Dedication Web form; see http://creativecommons.org/license/publicdomain-2?lang=en.
- Retaining some rights while allowing some use. For example, authors may decide to disallow commercial use without permission. Authors may also decide whether to allow users to make modifications (e.g. translations, adaptations) without permission. A good way to make these choices is to use a Creative Commons license.
- Go to http://creativecommons.org/license/.
- Choose and select a license. Choose "generic" if you are in the U.S. and "text" for JOAL articles.
- What to do next: you can then e-mail the license HTML code to yourself. Do this, and then forward that e-mail to JOAL's editors. Put your name in the subject line, and include your name and article title in the body of the e-mail.
- Retaining full rights, including translation and reproduction rights. Authors may use the statement: © Author 2013 All Rights Reserved. Authors may choose to use their own wording to reserve copyright. If you choose to retain full copyright, please add your copyright statement to the end of the article.