Georgia Tech Researcher Uses Machine Learning in Search for Ways to Predict Political Violence

By Michael Pearson

While instances of genocide and political violence around the world seem to have been falling in recent years, such episodes are never far from the headlines, or the lives of millions of people around the world.

They also have important consequences for international political stability, global commerce, and human suffering. Consequently, finding ways to better understand where violence is likely to break out could be a boon to policymakers and non-governmental organizations seeking to foster peace and stability.

David Muchlinski, an assistant professor in Georgia Tech’s Sam Nunn School of International Affairs, is at the forefront of a handful of political scientists around the world studying ways to do just that.

In 2016, he authored an influential paper that helped start the conversation about using machine learning in political violence scholarship. He has been working on a project to help give researchers more accurate information about incidents of genocide. And last month, he published a paper detailing the groundbreaking use of a convolutional neural network to analyze social media posts surrounding three disputed elections.

“I am looking at data that’s been previously ignored to uncover patterns of violence that were previously not understood,” he said. “I think it is time to take seriously the messages people send on social media about why they engage in political violence.”

Why Predicting Political Violence, Genocide is Hard

When it comes to determining whether it is even possible to forecast genocide and political violence, the debate comes down to “clouds” and “clocks,” in the words of philosopher Karl Popper.

Are such events irregular and unpredictable, like cloud formation? Or are they more like clocks: well-ordered machines that are highly predictable?

The truth, Muchlinski says, lies somewhere in between, making efforts to accurately predict such events possible, but complicated and difficult — especially on shorter time frames.

Researchers well understand many of the root causes that lead to political violence: key indicators include poverty, overpopulation, a country’s level of democratization, and episodes of recent political violence.

“What we don’t know as well is at the individual level: what motivates a person to participate in a genocide, or join a rebel group, or commit an act of electoral violence,” Muchlinski said.

“That makes this a very interesting methodological and substantive problem. We have to develop the right tools to be able to predict these things. But we also have to have the proper definitions, and that’s where I think we are somewhat more behind.”

There’s also the issue of rarity. Fortunately, genocide, civil war, and electoral violence aren’t daily occurrences, but that scarcity makes it harder to draw valid conclusions about the factors that feed such incidents.

“If you predicted every country in the world would not experience a civil war, all the time, your error rate would be very low, but you’d miss all the important observations,” Muchlinski said.
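
Muchlinski's point about error rates can be illustrated with a few lines of Python. The sketch below uses invented numbers (1,000 country-years, 20 of them with a civil war), not figures from his research, and the function names are chosen here purely for illustration. It shows how a forecast of "no war, ever" earns a very low error rate while catching none of the wars.

```python
def accuracy(y_true, y_pred):
    """Share of all observations that were predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual civil wars (label 1) that the forecast caught."""
    predictions_for_wars = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for p in predictions_for_wars) / len(predictions_for_wars)


# Hypothetical data: 1,000 country-years, 20 of which saw a civil war.
y_true = [1] * 20 + [0] * 980

# The naive forecast: predict peace everywhere, all the time.
always_peace = [0] * len(y_true)

print("accuracy:", accuracy(y_true, always_peace))  # 0.98
print("recall:", recall(y_true, always_peace))      # 0.0
```

A forecast that is right 98 percent of the time but finds zero of the 20 wars is of little use, which is why researchers working with rare events look beyond the overall error rate.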

Making Better Predictions

One solution to better predictions is a technology increasingly deployed across a variety of disciplines but still a rarity in political science: machine learning.

In 2016, Muchlinski wrote a pioneering paper that helped lay the groundwork for the use of machine learning in violence prediction.

The paper made use of a machine learning algorithm called random forests to make better predictions about civil war.

The random forest algorithm is, in many ways, a more sophisticated version of the traditional carnival game Plinko. Players drop a ball into the top of a box. As the ball falls, it bounces off pegs placed randomly throughout the board and eventually falls into a bin. Depending on where the ball drops, the player wins, or loses, the game.

In Muchlinski’s deployment of the random forest algorithm, the “balls” were observations of political violence and the “pegs” were variables that could lead to such violence. The bins? A prediction of whether a given observation pointed to civil war or not.

He found that such algorithms, when run hundreds or thousands of times, are better at predicting civil war than other methods available to researchers, such as traditional regression models. In fact, his method correctly “predicted” nine of 20 civil wars, where regression models predicted none.
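
For readers who want to see the idea in code, here is a heavily simplified sketch in Python. It is not the model from the 2016 paper: each "tree" is reduced to a single yes-or-no split, the two variables and all of the data are invented, and the function names are made up for this example. What it keeps is the core recipe of a random forest: train many small models on random resamples of the data, let each look at a random subset of the variables, and take a majority vote.

```python
import random


def train_stump(rows, labels, feature_ids):
    """Find the single split (one 'peg') that misclassifies the fewest rows."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            for above in (0, 1):  # label assigned when the value exceeds the threshold
                errors = sum(
                    (above if r[f] > threshold else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, above)
    return best[1:]  # (feature index, threshold, label above threshold)


def predict_stump(stump, row):
    f, threshold, above = stump
    return above if row[f] > threshold else 1 - above


def train_forest(rows, labels, n_trees=101, seed=0):
    """Train many stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # resample with replacement
        feats = rng.sample(range(n_features), k=max(1, n_features // 2))
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict_forest(forest, row):
    """Majority vote across all stumps: 1 = civil war predicted, 0 = not."""
    votes = sum(predict_stump(stump, row) for stump in forest)
    return 1 if votes * 2 > len(forest) else 0


# Invented observations: [poverty score, recent-violence score], scaled 0 to 1.
rows = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.4], [0.4, 0.3],
        [0.6, 0.7], [0.7, 0.6], [0.8, 0.9], [0.9, 0.8]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = civil war followed (hypothetical)

forest = train_forest(rows, labels)
print(predict_forest(forest, [0.95, 0.95]))
print(predict_forest(forest, [0.05, 0.05]))
```

In practice, researchers would typically rely on an established library and full decision trees rather than one-split stumps; the toy version only shows why running the procedure many times and pooling the votes gives a steadier answer than any single pass.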

“That article proved the most influential regression models in the field, models which were used to explain why civil wars happen, failed to predict any civil war at all,” he said. “If the explanation as to why these things occur as given by the model does not predict any occurrence of these events, can we be sure the causes these models identify are really true?”

Muchlinski also helped prepare a database of countries at risk of electoral violence.

More recently, he contributed to a new database of targeted mass killings designed to give researchers better insight into those events.

The Targeted Mass Killings Data Set provided more precise data and also a clearer understanding of what constitutes a targeted mass killing or genocide, with more emphasis on the intent behind the act than had been provided before.

“When we looked around at other data sets, we noticed that a lot of them were not trying to get at that legalistic sense of what this kind of violence is. They were, in a sense, just counting bodies,” Muchlinski said.

Electoral Violence Paper

In August, he brought the focus back to machine learning with a new paper analyzing the ability of the technology to predict electoral violence.

That paper, published in Political Science Research and Methods, is believed to be one of the first projects in political science to use a convolutional neural network, a kind of machine learning model, to estimate political violence directly from social media texts.

Muchlinski and his co-authors (Xiao Yang with the European Bioinformatics Institute, Sarah Birch of King’s College, and Craig Macdonald and Iadh Ounis of the University of Glasgow) used the Twitter API to collect more than 13,000 tweets related to a 2015 election in Venezuela and 2016 elections in the Philippines and Ghana.

They used open-source software called word2vec to transform the tweets into numerical representations that a machine learning model could work with, then fed them into a convolutional neural network, a type of model loosely inspired by the functioning of the human brain.
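
To give a sense of what "numerical representations" means, the Python sketch below assigns each word a short list of numbers and measures how close two words are. The three-number vectors are invented by hand for this example; real word2vec vectors are learned from large amounts of text and typically have hundreds of dimensions. The names `VECTORS`, `embed`, and `cosine` are likewise made up here and are not from the paper.

```python
import math

# Hand-made, hypothetical word vectors (not real word2vec output).
VECTORS = {
    "violence": [0.9, 0.1, 0.0],
    "clashes":  [0.8, 0.2, 0.1],
    "ballot":   [0.1, 0.9, 0.2],
}
UNKNOWN = [0.0, 0.0, 0.0]  # placeholder for words without a vector


def embed(words):
    """Turn a list of words into a list of vectors, one per word."""
    return [VECTORS.get(w, UNKNOWN) for w in words]


def cosine(a, b):
    """Cosine similarity: close to 1.0 for similar directions, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine(VECTORS["violence"], VECTORS["clashes"]))  # high: related words
print(cosine(VECTORS["violence"], VECTORS["ballot"]))   # low: unrelated words
print(embed(["clashes", "ballot"]))
```

Once every word in a tweet has been swapped for its vector, the tweet becomes a grid of numbers, which is the form a neural network can process.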

How Convolutional Neural Networks Function

Such algorithms analyze text that has been converted into a series of numbers representing words based on their similarity to one another, and pass it through several layers of analysis.

In the first, the convolutional layer, the model uses software filters to scan text segments of varying lengths, learning about the relationships of words in the text and looking, in the case of Muchlinski’s study, for words that seem most likely identified with electoral violence. The end result of that analysis is called a “feature map” of words or phrases the model deduces to be most representative of the tweet’s meaning.

For instance, Muchlinski’s model would look at the tweet, “Ten people are dead in election day violence,” and pull out feature maps for “ten people dead,” “dead in election,” and “violence.”

A second layer, called the max pooling layer, takes those three phrases and condenses them down into a single numerical representation summing up the most salient information: “people dead violence.”

The final layer would examine this truncated tweet, compare it to the hand-coded labels for tweets that were linked to election violence, and then classify the tweet as being related to election violence or not. The model retains that information to become better at making predictions as it analyzes more tweets.
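
The three layers described above can be sketched in plain Python. This is a toy under stated assumptions, not the authors' network: the two-number word vectors, the two filters, and the output weights are all set by hand for illustration, whereas a real convolutional neural network learns its filters and weights from the hand-coded training tweets. All names in the code are hypothetical.

```python
import math

# Hypothetical word vectors: [violence-related, election-related].
EMBEDDINGS = {
    "dead": [1.0, 0.0],
    "violence": [1.0, 0.0],
    "election": [0.0, 1.0],
    "polls": [0.0, 1.0],
}


def embed(tweet):
    """Replace each word with its vector; unknown words become zeros."""
    return [EMBEDDINGS.get(w, [0.0, 0.0]) for w in tweet.lower().split()]


def feature_map(vectors, filt):
    """Convolutional layer: slide a filter across the tweet, one window at a time."""
    width = len(filt)
    scores = []
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        total = sum(
            w * x
            for filter_row, vec in zip(filt, window)
            for w, x in zip(filter_row, vec)
        )
        scores.append(max(0.0, total))  # ReLU: keep only positive evidence
    return scores


# Hand-set filters (a trained model would learn these).
FILTERS = [
    [[1.0, 0.0]] * 3,  # three-word window that responds to violence words
    [[0.0, 1.0]] * 2,  # two-word window that responds to election words
]
OUTPUT_WEIGHTS = [3.0, 3.0]  # hand-set weights for the final layer
OUTPUT_BIAS = -4.5


def election_violence_probability(tweet):
    vectors = embed(tweet)
    # Max pooling layer: keep only the strongest response of each filter.
    pooled = [max(feature_map(vectors, f), default=0.0) for f in FILTERS]
    # Final layer: combine the pooled features into a probability.
    z = sum(w * p for w, p in zip(OUTPUT_WEIGHTS, pooled)) + OUTPUT_BIAS
    return 1.0 / (1.0 + math.exp(-z))


print(election_violence_probability("Ten people are dead in election day violence"))
print(election_violence_probability("Long lines at the polls on election day"))
```

With these hand-picked numbers, the first tweet scores above 0.5 because both a violence filter and an election filter fire, while the second scores below 0.5 because only the election filter does. Training replaces the hand-picked numbers with ones fitted to labeled examples.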

Machine Learning Model Proved More Effective

Muchlinski and his co-authors found that, when compared to another machine learning technique called the support vector machine, their convolutional neural network found 15 more violent events related to Venezuela’s election, 11 more in the Philippines, and three more related to Ghana’s election.

Muchlinski’s method also found that many events coded as political violence in one popular database frequently used by researchers were not actually related to an election at all.

“This platform proved vastly superior to other sources of data in identifying violent events that were truly related to an election rather than violence that just happened to occur at the same time as the election,” he said. “We demonstrate that reporting by ordinary citizens on social media can be used to uncover patterns of violence world-wide, especially when traditional media sources, like newspapers, are inaccessible, or don’t pick up these stories.”

His next project, with Senior Research Scientist Courtney Crooks of the Georgia Tech Research Institute, will look at social media posts to see if specific changes in language can be used to predict imminent outbreaks of violence.

Why it Matters

If successful, the project could move researchers one step closer to the goal of being able to accurately deliver short-term violence forecasts to policymakers.

But, he warned, the ability to render hyper-accurate violence forecasts remains a distant dream.

“The best we will most likely be able to do in the short- or medium-term is to identify what kind of traits predict different kinds of behavior by large, similar groups on social media,” he said. “And while it is important to learn how to predict these events more accurately, it is also important to keep in mind that implementing policies to prevent these kinds of events is not always straightforward nor easy.”

Still, even marginally more accurate forecasts would be useful to policymakers debating whether to get involved in a brewing crisis, Muchlinski said. They also could give people living in regions prone to violence more time to evacuate, or help election monitors gear up to watch a disputed vote.

“Fundamentally, I hope this work changes the world for the better,” he said. “I hope it allows us to understand on a deeper level what is going on with our societies, why these problems are occurring, and how we might design strategies to fix some of these problems. I hope that in being able to see a hazy version of the future, we can take steps in the present to make that future one we all want to live in.”

Muchlinski’s latest paper, “We Need to go Deeper: Measuring Electoral Violence Using Convolutional Neural Networks and Social Media,” is available at https://doi.org/10.1017/psrm.2020.32.

The Sam Nunn School of International Affairs is a unit of the Ivan Allen College of Liberal Arts.

Contact: Michael Pearson michael.pearson@iac.gatech.edu

Created:09/09/2020