Machine Learning
Machine learning is a subfield of AI that involves “computer algorithms that have the ability to ‘learn’ or improve in performance over time on some task” (Harry Surden, Machine Learning and Law, Washington Law Review 2014, 88).
Machine learning algorithms predict outputs based on previous instances of relationships between input data and outputs. Such an algorithm is gradually improved by testing its predictions and correcting them. Researchers have successfully used machine learning to automate a variety of sophisticated tasks that were previously presumed to require human cognition. These applications range from self-driving cars, facial recognition, fraud detection, speech recognition and spam filtering to automated language translation.
These algorithms do not learn in the human sense, and they cannot replicate the human cognitive system. They “learn” only in a functional sense: they change their behaviour to enhance their performance on a specific task through experience (Stuart Russell/Peter Norvig, AI: A modern approach, 2010, 693). Machine learning techniques have nonetheless produced “intelligent” results in complex, abstract tasks, often not by engaging directly with the underlying conceptual substance of the information, but by engaging with it indirectly, through detecting proxies and patterns in data that lead to useful results.
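The test-and-correct loop and the reliance on proxies described above can be sketched in a few lines of Python. The example below is a minimal illustration, not a method taken from the cited sources: it uses a simple perceptron update rule, and the messages, labels and function names (`featurize`, `predict`, `train`) are invented for this sketch. The classifier never understands what a message means; it only adjusts numeric weights on word occurrences whenever one of its predictions turns out to be wrong.

```python
# Toy spam filter: all messages and labels are invented for illustration.
# Label 1 = spam, label 0 = not spam.
TRAINING_DATA = [
    ("win money now", 1),
    ("cheap money offer", 1),
    ("meeting agenda attached", 0),
    ("draft contract attached", 0),
    ("win cheap offer now", 1),
    ("contract meeting tomorrow", 0),
]


def featurize(text, vocabulary):
    """Turn a message into a proxy: 1 if a known word occurs, else 0."""
    words = set(text.split())
    return [1 if word in words else 0 for word in vocabulary]


def predict(weights, bias, features):
    """Predict spam (1) if the weighted sum of features is positive."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(data, epochs=10):
    """Perceptron rule: test each prediction and correct the weights on errors."""
    vocabulary = sorted({word for text, _ in data for word in text.split()})
    weights = [0.0] * len(vocabulary)
    bias = 0.0
    errors_per_epoch = []
    for _ in range(epochs):
        errors = 0
        for text, label in data:
            features = featurize(text, vocabulary)
            error = label - predict(weights, bias, features)
            if error != 0:
                # Wrong prediction: shift the weights towards the correct label.
                errors += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        errors_per_epoch.append(errors)
    return vocabulary, weights, bias, errors_per_epoch


vocabulary, weights, bias, errors_per_epoch = train(TRAINING_DATA)
print(errors_per_epoch)  # mistakes made in each pass over the data
print(predict(weights, bias, featurize("cheap money now", vocabulary)))
```

On this toy data the number of mistakes per pass falls to zero by the second pass, which is the functional sense of “learning” discussed above: performance on one narrow task improves with experience, while words the model has never seen are simply ignored.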
The rise of machine learning was enabled by advances in processing capacity and the better availability of training data (Big Data). Big Data refers to “a massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques” (What is Big Data? A Webopedia Definition, www.webopedia.com). The advent of Big Data is a game changer for many industries. Google, for instance, processes more than 100 petabytes (100 million gigabytes) of data every single day.
In law, however, Big Data might be a misnomer. The legal industry does not have enough data to label it as “big.” Legal datasets are still mostly measured in gigabytes or perhaps terabytes, nowhere near the petabytes and exabytes of Google. Some authors, therefore, suggest using the term Medium Data in a legal context to clarify that Big Data techniques in the narrower sense are not applicable. Medium Data is nevertheless valuable for streamlining and automating legal work, and is hence also referred to as “Smart Data”. The increasingly important role of Smart Data in the legal industry can be seen in companies like Lex Machina, Juristat, eBrevia, Justly and LEVERTON.
3.3