The Automation Roadmap: CMU Professors Use Machine Learning to Forecast Disruptive Tech

By Scottie Barsotti

By applying machine learning to patent filing data, Professor Lee Branstetter and the Center for the Future of Work are working with other CMU experts to map what jobs and industries are likely to see the most intense applications of artificial intelligence—and where policymakers need to focus their attention.

Some of the coolest research coming out of Heinz College is originating from an unlikely source: patent filings. A recent report from Carnegie Mellon University’s Center for the Future of Work explains how a team of researchers applied machine learning algorithms to patent filing documents in order to learn about the nascent artificial intelligence (AI) industry and its potential impacts on the economy.

While patent documents may not immediately strike you as captivating subject matter, they are an absolute treasure trove of data. In fact, the U.S. Patent and Trademark Office processes hundreds of thousands of documents each year in the form of patent applications and trademark registrations from companies and individual inventors.

Lee Branstetter, Heinz College Professor of Economics and Public Policy and director of the Center for the Future of Work, looked at this ever-growing mountain of paperwork and had an idea. Maybe, he thought, we can use the copious data in patent applications to get a clearer picture of how artificial intelligence is developing. In time, that data might even serve as a crystal ball of sorts, allowing us to see which industries and which parts of the U.S. will be most heavily disrupted by technology in the future.

To implement this idea, Branstetter teamed up with Ed Hovy, Research Professor in the Language Technologies Institute (LTI) of the School of Computer Science; Andrew Runge, a master's student in LTI working under Hovy; and Dean Alderucci, a doctoral candidate in the Department of Engineering and Public Policy who spent years as a patent lawyer before coming to CMU.

"Patents are clear indicators. They represent a specific technology...and there are explicit ways to tie them back to the economy. They're very rich in terms of the meta-information attached to them," said Runge. "Once we know that the document is for an AI-related technology, we can use other techniques to determine how different types of patents map into different sectors of the economy."

The general consensus among economists is that while advancements in artificial intelligence and robotics have the potential to spur growth and create new job categories that don’t even exist yet, the resulting automation of job tasks will likely have dramatic impacts on the current labor market—the real questions are where and to what extent those impacts will be felt.

Branstetter’s study provides a diagnostic and predictive tool, driven (appropriately) by machine learning, that attempts to answer those questions.

By understanding the nature of patent filings, where they are coming from, who is applying for them, and what kind of technologies they foretell, Branstetter’s team has been able to chart which industries are investing most heavily in artificial intelligence, and what has come of those investments. That not only paints a picture of what's been done so far, it will also help project what might happen in the future.

"There's a lot we can do with this information," said Runge.

A Map of Disruption Begins to Emerge

According to the report, the algorithm found that over 71,000 AI-related patents have been filed in the U.S. since 1990, with activity rising in the past decade—now over 8,000 AI-related patents are awarded every year, and to “an extremely diverse group of firms.”

U.S. companies and inventors account for most of those patents (47,967 as of this report); the next-largest country of origin is Japan (8,881). Within the U.S., AI-related inventions are currently concentrated on the coasts, with Microsoft, IBM, and Google outpacing others in patents assigned to date.
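To put those counts in proportion, the short sketch below works out each country's approximate share of the total. It uses only the figures quoted above. Because the report gives the total as "over 71,000," treating it as exactly 71,000 is an assumption, so the resulting percentages are best read as rough upper bounds. The variable and function names are illustrative and are not taken from the report.

```python
# Figures quoted in the article. The total is reported as "over 71,000",
# so using 71,000 exactly is an assumption made for this illustration.
TOTAL_AI_PATENTS = 71_000
PATENTS_BY_ORIGIN = {"United States": 47_967, "Japan": 8_881}


def share_of_total(count, total=TOTAL_AI_PATENTS):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Approximate share of AI-related patents for each country of origin.
shares = {origin: share_of_total(n) for origin, n in PATENTS_BY_ORIGIN.items()}

for origin, pct in shares.items():
    print(f"{origin}: roughly {pct}% of AI-related patents")
```

On these numbers, U.S. filers hold roughly two-thirds of the AI-related patents identified, and Japan about one-eighth.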

But regardless of what company invents a new technology, that invention might be applied broadly across many industries. Branstetter suggests that future work will allow his team to map which industries can expect to sustain the most disruption once certain technologies are adopted. Perhaps more importantly, in time they may be able to map emerging technologies to areas of the U.S. that have high concentrations of jobs that could be impacted by those specific technologies, thereby illustrating where the pain of disruption might be most acute over time.

Policy leaders could use such insights to create interventions, funds, and programs targeted to those areas most at risk, strengthening the social safety net and minimizing the drawbacks of technological change.

A Research Win for the Center for the Future of Work

As a project of the Center for the Future of Work, Branstetter’s patent study draws on expertise from across the university. Both Branstetter and Runge indicated that without machine learning capability, analysis like this wouldn’t be possible.

"Patents are difficult to parse. They are very complex legal documents with their own formalisms and structures that are very different from common language. That precludes a lot of people from going in and understanding them," said Runge.

He adds that reading patents is extremely time-consuming. The algorithm developed by the CMU team can search and analyze patent filings much more quickly and accurately than any human could while also effectively capturing patents across a broad category (artificial intelligence) that has become a catch-all for numerous types of related technologies.

This project aims to tell the story of a massive societal challenge, serving as a warning and helping make the case for swift action on new policy: When automation sweeps the nation, here are the people who will be left behind.

Professor Lee Branstetter

With a story so potent and so clearly laid out, Branstetter says that inaction is not an option.

At a recent CMU event in Silicon Valley, Branstetter spoke about the importance of viewing the coming wave of disruptive technology—the “Fourth Industrial Revolution” as many call it—through the lens of economic history. To him, the biggest concern stemming from automation is not mass unemployment, but rather an explosion in inequality.

“Our less educated fellow citizens have seen their real incomes fall since the early 70s, and what many would regard a middle-class standard of living is increasingly out of reach,” said Branstetter. “The past half century of technological progress has had a pronounced skill bias, and the best labor economists predict that the next wave of disruptive innovation will bring more of the same, furthering the growth of inequality that has placed such a strain on our national politics.”

Moving forward, the team hopes to partner with the U.S. Census Bureau to access the agency’s data. For example, census data would allow the researchers to link AI invention to labor demand at the enterprise level. That analysis would enable estimates of the direct impact of AI on labor demand and wages.

A major aim of the Center for the Future of Work is to create policy innovations that will make the gains of automation more widely shared. Already, Branstetter and his colleagues are generating insights that could change the course of policy decisions and proactively help the populations most vulnerable to automation.

Read the Report

This research was made possible by funding from the Heinz Endowments. The Center for the Future of Work is an initiative of the Block Center for Technology and Society.

This article has been updated to include findings from the team's report. A link to the report has also been provided.
