30 November 2018 (Paris, France) – Those of you who read my blog on a regular basis know I write a lot about Amazon. It takes up a good chunk of my book-in-progress. Working title? “A (relatively) short history of technology”. I know. Pathetic. But I have hired Charles Christian as my “pre-editor” so hope springs eternal.
The book is not about all technology, of course. Not the lathe, the powerloom, the motor car, the MRI scanner or even the F1 fighter jet. I mean to take a linear look specifically at the digital media platforms associated with Silicon Valley. Those social media, big data, mobile technology and artificial intelligence platforms that increasingly dominate our economic, political and social life.
The power of Amazon: what you need to understand before I talk about legal artificial intelligence
Why does Amazon dominate the retail market? Two of many reasons: content and artificial intelligence. Amazon has over 210,000,000 pages indexed on their website, which translates into consumers being able to find unique stuff that they want – not the commoditized stuff large retailers decide that you want and push out everywhere. That’s a key reason why so many national retailers are crashing. It’s because Amazon is commoditizing everything.
And its overall power? As I noted in a (very) long piece last year which became my pitch to publishers for my book … The power of GAFA (Google, Amazon, Facebook, Apple): a forced reinterpretation of narratives, data, software, and news events … Amazon has captured over one-half of all dollars spent online, and is now challenging Google and Facebook in digital advertising (Google and Facebook have 62% of total world-wide digital ad spend).
But more importantly, what sets these new digital superstars apart from other firms is not their market dominance; many traditional companies have reached similarly commanding market shares in the past. What’s new about these companies is that they themselves are markets. Amazon operates a platform on which over $220 billion worth of goods are bought and sold each year. Amazon’s retail and cloud opportunities are effectively unlimited, its market size unconstrained.
Side note and worthy of a separate post: my Chief Technology Officer has been in Las Vegas this week at the Amazon Web Services mega-event “AWS re:Invent 2018” (which is also held in London and which I attend). It’s why I am writing this post about the legal artificial intelligence sector (which I will get to in a moment; patience, gentle readers). At one of his dinners this week the chatter turned to China. Said a well-known technology maven (not associated with Amazon): “To compete with China, America needs companies that are innovative and customer-centric. That could be Google, and it certainly is not Facebook. But it’s almost perfectly Amazon. Bezos has created the perfect funnel into an array of consumer orientated services. Amazon in 2019 is so much more than e-commerce or the Cloud. It is quite possibly THE future of consumer facing … and business facing … artificial intelligence.”
This is exactly what I wrote about two years ago when I commented on the Amazon/Whole Foods deal. lt was not about groceries. It was about artificial intelligence. As I noted in a graphic:
It is about becoming a “full stack” AI company. As I have noted before, machine learning/artificial intelligence really requires massive volumes of data filtered through algorithms to create systems that continuously learn from the information fed back into them. Google was (well, still is) the King because it was the only full stack AI company … the “new new” thing … and Amazon had to challenge it:
All Amazon was missing was the chips. It now has them.
Consider what Amazon has been doing for the past five years or so: developing not one but two different custom chips, building a range of machine learning tools including free (for now) training programs, and rolling out features and function to keep the often creaky Amazon Web Services engine chugging along.
Net net: Amazon seems to be taking bits and pieces from the Google, Palantir, and IBM playbooks. Chef Bezos mixes the ingredients and rolls out a mind boggling array of new stuff.
Amazon and the legal artificial intelligence sector
There was an interesting article yesterday in Artificial Lawyer titled “Will Amazon + Big Tech Devour the Legal AI Sector?” and it noted Amazon had started selling a NLP/machine learning software service that can analyse and extract information from medical records. The program is called “Amazon Comprehend Medical”. It was announced this week at “AWS re:Invent 2018” in Las Vegas (see, I did come back around to that). The Artificial Lawyer article raises a question: will Amazon ever move into the legal AI space? And if they do, what would happen?
Oh, it will move in, all right. But first …
Amazon has had a health innovation stealth unit for quite some time (it was called “1492”; pretty cool, huh?) With the announcement of “Amazon Comprehend Medical” (ACM) we’re slowly starting to understand the sweeping changes Amazon is contemplating. In a nutshell ACM involves analyzing medical data that could revolutionize patient diagnosis. You have machine learning at the intersection of EMR (electronic medical records) that could help predict outcomes to save costs. And eventually … if you work your way through this … you can see where Amazon Prime members would opt-in to ACM so Amazon would have access to their health data and patient records, which could be a win-win for both parties. And as Tencent scales to the enterprise sector and eventually the Cloud itself, it has massive potential to intersect with Healthcare as well, where Google and Apple will also contribute. In my humble opinion corporations at the service of healthcare will be the major story in the future of technology in the 2020s.
I have read the briefing papers on ACM and what Amazon has done vis-a-vis machine learning is impressive:
- The program scans medical files to pick out relevant information such as the medical condition and patient’s procedures and prescriptions.
- Amazon claims to have trained its system to recognize the idiosyncrasies in how doctors take notes.
- For Amazon, this is another move into the health care market on the heels of the retailer buying the online pharmacy PillPack in June.
- Machine learning at the intersection of niche healthcare is one of the hottest VC sectors circa 2019, and it appears Amazon already has a head-start.
- So in sum: the Amazon cloud software combines text analysis and machine learning to read patient records that often consist of prescriptions, notes, audio interviews, and test reports. Once those records are digitized and uploaded to ACM, it picks out and organizes information about diagnoses, treatments, medication dosage, and symptoms.
Google’s DeepMind Health is also well documented, and very impressive. Apple’s Apple Watch shows consistent improvements related to healthcare analysis. But Amazon already owns the smart home sector for consumers, and we know the future of healthcare monitoring starts at home as EMRs become better at data harvesting for well-being. Early diagnosis really is the key in most health outcomes.
Oh, and something else I picked up on. Since Amazon has also developed an app that’s embedded in electronic medical records and provides doctors direct links to products to market to patients, this could be a significant B2B cash-cow as well, where Amazon facilitates and disrupts the supply chain of medical technology and supplies, and not just pharma and Biotech for the consumer.
Bottom line: Amazon as a consumer facing company is uniquely positioned to be at the intersection of pharma and biotech with the consumer. You cannot underestimate Amazon’s relationship with consumer convenience here.
Now, back to legal AI
The Artificial Lawyer article I noted above says that legal texts tend to have similarities to medical texts (complexity) and that legal language is very specific and full of professional jargon, but then, so is medical language. Can a company like Amazon handle that? The answer is: if they wanted to, yes. And you can bet they will.
Amazon has looked at text analytics for a very long time, and they have been looking specifically at the law. Their IT teams have quizzed Big Law firms and they have had some pretty deep discussions on the complexity of data in “law systems”. One Amazon source told me they were “coming to grips” with analyzing “incredibly nuanced documents written in highly technical language” and understanding the complexity “when it is particular to a specific business”.
Disclosure: my discovery/information/legal job posting service has a job posting agreement with Amazon so I chat a lot with Amazon folks.
So how is the medical side linked with the legal side? With “Amazon Comprehend Medical” essentially we are talking about improving medical language processing with machine learning. The exact same thing can be done with legal language. And just as “Amazon Comprehend Medical” allows developers to identify the key common types of medical information automatically, with high accuracy, and without the need for large numbers of custom rules, the same can be done for legal.
And Amazon has made the first steps.
Not discussed much this week in Las Vegas was Amazon’s other news: Amazon Textract. It is a service that automatically extracts text and data from scanned documents. It goes beyond simple optical character recognition (OCR) to also identify the contents of fields in forms and information stored in tables. At several private demos, they used legal documents to show how it worked.
Now, we all know the drill. Many companies today extract data from documents and forms through manual data entry that’s slow and expensive or through simple OCR software that is difficult to customize. As the Amazon folks noted “rules and workflows for each document and form often need to be hard-coded and updated with each change to the form or when dealing with multiple forms. If the form deviates from the rules, the output is often scrambled and unusable”.
So imagine: Amazon Textract overcomes these challenges by using machine learning to instantly “read” virtually any type of document to accurately extract text and data without the need for any manual effort or custom code. With Textract you can quickly automate document workflows, enabling you to process millions of document pages in hours. And … you can create smart search indexes, and even install compliance document archival rules by flagging data, for example, that may require redaction.
According to the briefing papers, Amazon Textract’s pre-trained machine learning models eliminate the need to write code for data extraction, because they have already been trained on tens of millions of documents from virtually every industry: straight business (invoices, receipts, tax documents, sales orders, etc.), insurance (enrollment forms, benefit applications, insurance claims, policy documents, etc.) and legal (contracts, email threads, etc.) According to Amazon “you no longer need to maintain code for every document or form you might receive or worry about how page layouts change over time”.
And what is really impressive is the ability to extract structured data from documents and create a smart index using what they call “Amazon Elasticsearch Service” which allows you to search through millions of financial statements quickly … as well as SEC filings. And the speed. My CTO noted it could extract “lightening fast” … processing millions of scanned documents and emails … where “John Doe” appeared or “where a document or one of our employees used the word xxxx”.
Then a lightbulb went off and I saw another aspect I want to dive into a some point, but not in this post. Those of you who have used “Spector Pro” for internal corporate investigations know this already. You can track and record chat conversations (as transcripts), emails (sent and received), websites visited and, keystrokes made, not only what has been typed within a document, but the mouse and keystroke usage across the whole computer system. I’d like to see if you can somewhat “merge” a software like Spector Spector Pro with Amazon Textract.
Much of this is still in beta and Amazon has several partners working with it. One is an “intelligent information processing” company (was that generic enough for you?) that is using it to “mass ingest” information and classify its data faster than it can with any other product on the market. The company also realized the powerful capabilities for image-oriented applications that are scalable via AWS services. My CTO said one e-discovery vendor was present and “he was stunned”.
Obviously the speed … and one assumes, the accuracy … you can build on AWS results in dramatic cost savings. And the potential (possible?) impact on the e-discovery world and the legal artificial intelligence sector remains to be seen.
But it sure seems to me that Amazon is moving closer to that legal artificial intelligence sector.