• Use Case: Voice of the Public
  • Segment: Government Agencies
  • Product: Rosette

Better Border Security: Going Beyond Watch Lists


The Syrian refugee and migrant crisis is overwhelming Europe. The reality of ISIS fighters entering in the guise of refugees has made it clear that watch lists are not enough to secure borders. Furthermore, 6,000 Europeans, including 1,800 French, have left home and joined ISIS in its war.1 Among the French, 118 returned to France in 2014.2 The question is, where are those people now? It is unlikely that all of them will be on a watch list.

Our borders can be significantly strengthened through the use of applications built on text analytics to find insights from diverse data sources. Advanced social mining through natural language processing is already in use in the commercial world; making a transition into the public sector is a logical and lifesaving next step. These applications can increase the ability of analysts and security personnel to see the most vital data quickly, and form the final conclusions that still require human experience and intellect.

Text analytics can support border security in three ways: (1) to identify present dangers near borders and at screening time, (2) to identify potential dangers that need follow-up, and (3) to forecast future troubles at borders.

Identifying Present Dangers

Monitoring all data from all channels means the ability to catch tweets like:

“Saw a woman abandon a bag at the toilet at gate 35 @Heathrow terminal 5.”

Such a system is programmed for key phrases such as “abandoned bag” and could verify the tweet was sent from Heathrow and extract the location “gate 35 @Heathrow terminal 5” to trigger an alert to airport authorities. Security could then check CCTV tapes to determine if the bag owner was a harried mother changing a baby’s diaper who forgot her bag, or something more ominous. The key is an ability to act on vital signals from any channel.
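The alerting step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any production system works: the phrase patterns, the location pattern, and the names `KEY_PHRASES`, `LOCATION`, `Alert`, and `screen_post` are all assumptions made for this example. It also skips the step of verifying where the post was sent from, which would need geotag data from the social network.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical key phrases. A real deployment would maintain a curated,
# multilingual list rather than two hand-written patterns.
KEY_PHRASES = [
    re.compile(
        r"\babandon(?:s|ed|ing)?\b(?:\s+\w+){0,3}?\s+(?:bag|suitcase|backpack|package)\b",
        re.IGNORECASE,
    ),
    re.compile(r"\bunattended\s+(?:bag|suitcase|backpack|package)\b", re.IGNORECASE),
]

# Hypothetical location pattern: "gate N", optionally followed by an
# @handle and a terminal number.
LOCATION = re.compile(
    r"\bgate\s+\w+(?:\s+@\w+)?(?:\s+terminal\s+\w+)?", re.IGNORECASE
)


@dataclass
class Alert:
    phrase: str              # the key phrase that triggered the alert
    location: Optional[str]  # extracted location text, if any
    text: str                # the full post, for the human reviewer


def screen_post(text: str) -> Optional[Alert]:
    """Return an Alert if the post contains a key phrase, else None."""
    for pattern in KEY_PHRASES:
        hit = pattern.search(text)
        if hit:
            loc = LOCATION.search(text)
            return Alert(hit.group(0), loc.group(0) if loc else None, text)
    return None
```

Run against the tweet above, this sketch would flag the phrase "abandon a bag" and pull out "gate 35 @Heathrow terminal 5" as the location. The decision about what the alert means stays with airport security.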

“We run people against watchlists and that’s how we decide if they get extra screening. In cases where those lists don’t hit, there’s nothing that distinguishes them from people we would love to welcome to this country.”

—C. Stewart Verdery Jr., a senior Homeland Security official during George W. Bush’s administration 3

Unlike the U.S., which can pre-screen visitors because of visa requirements, the EU relies on border officers who are under pressure to decide whether a person should be allowed in at the moment that person arrives at the border. If the person is not on a watch list, the officer has nothing to go on but the traveler’s papers and word. But it does not have to be that way.

In The Commercial World

The sharing economy is already leveraging social media to strengthen security. While companies like Uber and Airbnb are revolutionizing their industries to benefit consumers, the nature of peer-to-peer transactions created novel means to commit crime. User reviews proved inadequate to manage the risk involved in trusting strangers for a ride or a place to stay.4

In response, Airbnb built its Verified ID5 feature, which checks online IDs such as Facebook, LinkedIn, and Google+ against passports and other government-issued IDs. With it, Airbnb provides both hosts and travelers with unprecedented trust in the transaction.6

At Passport Control

Similarly, border security could use online IDs to corroborate a person’s passport and increase confidence that the person standing there is indeed who he claims to be. Can the visitor answer questions of profession and education, which are easily verified on LinkedIn? Conversely, a lack of corroboration with online data plus suspicious circumstances, such as sharing the same name as a person posting jihadist messages on Facebook, may alert security to the need for further questioning.

In the aftermath of the December 2015 attacks in San Bernardino, CA, officials discovered that one shooter, Tashfeen Malik, may have posted about her interest in violent jihad on social media. Despite this, she passed three background checks by American immigration officials as she moved to the United States from Pakistan. Malik may not have been cleared had U.S. immigration been able to review her online persona.

Identifying Potential Dangers


While pre-screening is not always possible, it is possible to look into the people a watch-listed person has been in contact with through social media. Where have these other people been? Do any of them have a ticket to travel to Europe in the near future?

In the past, parents have contacted authorities or posted on Facebook about fears that their child has been radicalized or is hanging out at the “wrong” mosque. Bringing that information to the border allows a potential security risk to be flagged. Text analytics for English, Arabic, or other languages can find these connections and names in the native language of online postings.

Although text analytics can gather information on potential associates of a terrorist, it still takes a person to make that judgement call, especially when many people share similar names. New productivity tools built on text analytics can accelerate these analyst workflows. Given a diverse set of data on a suspicious person, the analyst chooses data that is relevant for the report. The system learns from the analyst’s choices and presents more recommendations that are likely about the same person, but also carry new information that expands the analyst’s picture of the situation. A shortened report cycle means quicker alerting of the correct task force or local commander and more timely information to the border.
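The feedback loop described above, where the system learns from what the analyst keeps and discards, can be illustrated with a very small term-weighting sketch. It is an assumption-laden simplification: the class name `FeedbackRanker` and its methods are invented for this example, the scoring is plain word overlap, and it does not attempt the harder part of preferring documents that carry new information.

```python
import re
from collections import Counter


def tokens(text: str) -> set:
    """Lowercase word set for a document (very crude tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class FeedbackRanker:
    """Hypothetical ranker that learns term weights from analyst choices."""

    def __init__(self) -> None:
        self.weights = Counter()

    def record(self, text: str, relevant: bool) -> None:
        # Terms in accepted documents gain weight, terms in rejected ones lose it.
        delta = 1 if relevant else -1
        for term in tokens(text):
            self.weights[term] += delta

    def rank(self, candidates: list) -> list:
        # Score each candidate by the summed weight of the terms it contains.
        def score(text: str) -> int:
            return sum(self.weights[term] for term in tokens(text))

        return sorted(candidates, key=score, reverse=True)
```

Terms that appear in both accepted and rejected documents, such as a shared surname, cancel out, so the ranking is driven by the details that distinguish one person from another. The analyst still makes the final call on every recommendation.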

Forecasting Future Troubles

Forecasting future unrest based on big data is not as futuristic as it sounds. The EMBERS project at Virginia Tech in the U.S. is funded by the U.S. Intelligence Advanced Research Projects Activity (IARPA), and has been successfully forecasting civil unrest in Latin America.7 Highly tuned algorithms look for signals in open data indicating disease outbreak, political protests, and more, sending alerts as much as seven days in advance of an event. Might the same not be possible for border areas?

Such a system looks for patterns from millions of data points, much the way weather forecasters know the key indicators for hurricanes. Border unrest signals might include social media hashtags for topics related to “ISIS” or known hotbeds of Islamic radicals. Threats and calls for action against authorities could be geocoded and analyzed for names of people, places, and organizations. Although it would take a significant investment of time to train and tune such a system, it might be worth considering as a long-term goal.
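To give a feel for what "looking for signals" can mean at its simplest, the sketch below flags days on which the count of a monitored hashtag jumps well above its recent average. This is a rolling z-score spike detector, used here as a stand-in. It is not the EMBERS method, which combines many models and data sources. The function name `spike_days` and its default parameters are assumptions for illustration.

```python
from statistics import mean, stdev


def spike_days(daily_counts, window=7, z_threshold=3.0):
    """Return indexes of days whose count is far above the trailing window.

    daily_counts: list of daily mention counts for one hashtag or topic.
    window:       number of preceding days used as the baseline.
    z_threshold:  how many standard deviations above the mean counts as a spike.
    """
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            sigma = 1.0  # flat history: fall back to a unit scale
        if (daily_counts[i] - mu) / sigma >= z_threshold:
            flagged.append(i)
    return flagged
```

A detector like this only reports that something unusual is happening now. Forecasting days ahead, as described above, requires trained models and far more tuning.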

Making It Real

There is a spectrum of options that each border authority can select from to meet short-term and long-term goals.

In the short term, name matching software that supports names in English, Arabic, Pashto, and Dari (the latter two are Afghan languages) already exists and can be rapidly deployed. Installed on a laptop, it can screen one name against a 30-million-name watch list in under a second. That software could immediately improve watch list screening in areas that have no more than a laptop, or at borders with frequent power outages.
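The basic idea behind fuzzy watch list screening can be sketched with Python's standard library alone. This is a toy under stated assumptions, not the commercial software described above: the function names `normalize` and `screen_name` and the 0.8 threshold are invented for the example, the similarity measure is a generic string ratio from `difflib`, and nothing here handles matching across scripts (for example, an Arabic-script name against its Latin transliteration).

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Strip accents and punctuation, lowercase, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_marks.lower())
    return " ".join(sorted(cleaned.split()))


def screen_name(query: str, watch_list: list, threshold: float = 0.8) -> list:
    """Return (entry, score) pairs at or above the threshold, best first."""
    q = normalize(query)
    hits = []
    for entry in watch_list:
        score = SequenceMatcher(None, q, normalize(entry)).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 2)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

With a made-up list containing "John Smith", a query for "Jon Smyth" would still surface that entry despite the spelling differences. A linear scan like this one would not reach sub-second speed on 30 million names; that takes indexing and candidate filtering, which this sketch leaves out.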

In the medium term, there are social media monitoring applications, many of which are already deployed for brand/reputation tracking or as an extension of customer support. These applications use entity extraction and resolution to find the names of people, places, and organizations, and link them to corresponding real-world entities in a knowledge base. Software that speeds up the work of human analysts is currently used in the area of cybersecurity.
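Entity extraction and resolution can be pictured with a small dictionary-based sketch: find known names in text, then map each mention to one record in a knowledge base. Everything here is hypothetical, including the knowledge base contents, the `kb:` identifiers, and the function names. Real applications use statistical models and context to handle names the dictionary has never seen and names shared by several entities.

```python
import re

# Hypothetical knowledge base: one record per real-world entity,
# with the alternative names under which it may be mentioned.
KNOWLEDGE_BASE = {
    "kb:001": {
        "name": "Heathrow Airport",
        "type": "PLACE",
        "aliases": ["Heathrow", "London Heathrow", "LHR"],
    },
    "kb:002": {
        "name": "Virginia Tech",
        "type": "ORGANIZATION",
        "aliases": ["Virginia Polytechnic Institute"],
    },
}


def build_alias_index(kb: dict) -> dict:
    """Map every lowercase name or alias to its entity id."""
    index = {}
    for entity_id, record in kb.items():
        for alias in [record["name"], *record["aliases"]]:
            index[alias.lower()] = entity_id
    return index


def extract_and_link(text: str, kb: dict) -> list:
    """Find known mentions in text and link each to a knowledge base entry."""
    index = build_alias_index(kb)
    # Try longer aliases first so "London Heathrow" wins over "Heathrow".
    aliases = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in aliases) + r")\b", re.IGNORECASE
    )
    results = []
    for match in pattern.finditer(text):
        entity_id = index[match.group(0).lower()]
        results.append(
            {
                "mention": match.group(0),
                "entity_id": entity_id,
                "type": kb[entity_id]["type"],
            }
        )
    return results
```

The point of the linking step is that "London Heathrow" and "LHR" resolve to the same record, so an analyst sees one place rather than two unrelated strings.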

In the long term, investment in predictive analytics, which requires significant labor to train and adjust algorithms, can surface possible future incidents.


Consider what “all data” means to your organization, and whether it is time to adopt technologies that are already in widespread use as “big data analytics” in diverse industries around the world, including healthcare, government, customer feedback analysis, customer relationship management, voice of the employee, financial services, insurance, and more.

With the ongoing civil war in Syria, the refugee crisis will only continue to grow. Getting legitimate refugees out of war-stricken regions and stopping those with violent intentions is key to supporting immigration policies and boosting domestic and global security. By using technology that is available now to mine the social web, immigration and border control can be strengthened.

1 France has also estimated that some 1,800 of its citizens and residents have now joined ISIS and other jihadist networks. Danner, Chas, “Report: ISIS Has Recruited as Many as 30,000 Foreigners in the Past Year”, New York Magazine, Sept. 27, 2015
2 According to the French authorities, the number of native jihadis in Syria and Iraq has soared from 555 to 932 this year. Of those, 118 have returned to France. According to experts consulted by European officials involved in the effort, an estimated one in nine of those returning represents a terrorist threat. Traynor, Ian, “Major terrorist attack is ‘inevitable’ as Isis fighters return, say EU officials”, The Guardian, Sept. 24, 2014
3 TechCrunch
4 “Case Study of Airbnb: Identity Resolution for the Sharing Economy” by Basis Technology
5 Apuzzo, Matt; Schmidt, Michael S.; Preston, Julia, “U.S. Visa Process Missed San Bernardino Wife’s Online Zealotry”, New York Times, Dec. 12, 2015