Top 5 Takeaways from AI for Human Language
AI for Human Language 2021 brought together hundreds of professionals in cybersecurity, financial security, and compliance to explore technology available today that is enabling us to verify identities and anticipate world events.
In this (almost in-person) virtual experience, speakers from IDF Unit 8200, Cybersixgill, Recorded Future, Metis Augmented Intelligence, and more came together to demonstrate how AI is solving real-world issues, and discuss predictive intelligence analytics, cyber threat intelligence, and event extraction.
If you missed the event, don’t worry: you can still experience the action! We compiled a list of key takeaways to recap the event, and you can also dive into the virtual sessions.
1: The Rewards of the Early Adoption of NLP for Intelligence
Keynote speaker Asaf Kochan, former Commander of IDF Unit 8200, kicked off the event by sharing his experience. One of Unit 8200’s greatest challenges was processing vast amounts of data. Even with a massive workforce dedicated to translation and language processing, the unit regularly saw events misinterpreted and often found pieces of the big picture missing. This reality needed to change.
It was clear that machine-based NLP was an accelerator and a force multiplier for achieving far more accurate and comprehensive outcomes. After deciding to introduce this cutting-edge technology, 8200 overcame language barriers, and analysts who had never been able to access the data suddenly could. The data also became more relevant, because it reached the consumer in a much more accurate context.
“The ability to transform data into actionable knowledge has never been so real and vivid as it is now.” – Asaf Kochan, Former Commander of IDF Unit 8200
As machine learning and NLP continued to detect events that were never before detected, the lives of innocent civilians were saved not only in Israel but around the world.
2: How Machine Learning Instills Confidence in Intelligence Organizations
Humans are incredibly intelligent. We have the ability to connect the dots in our minds and understand data, but we’re not always able to predict what happens next. Relying on human brain power alone is not the most efficient way to become a scalable intelligence organization.
AI empowers intelligence officers to feel confident in their ability to predict future events. It’s like having thousands of virtual analysts at your disposal. But the main challenge of intelligence is not just the data. It is bridging the gap between collecting data and making the right decision at the right time, and the only way to do that is by leveraging machine learning.
Machine learning and NLP solutions give organizations confidence in their ability to make decisions based on the data. As a result, analysts understand what is happening or about to happen, and are also exposed to information they should know about but don’t. The Holy Grail of detecting threats is simply being in the know, and NLP technology does that for us.
3: How Event Extraction Can Be Used for Cyber Threat Intelligence
Many vendors that deal with threat intelligence started out with a manual approach, meaning they had a team of analysts who tapped into different areas of the dark web to extract insights. The output was mostly generic reports about different topics in various industries. So, over the years, vendors started to develop some automation to ease the process of creating valuable insights. Today, most of these vendors are still semi-automated and keep a team of professionals who generate most of the intelligence pieces manually. But AI is on our side, and it’s here to take over the tedious manual labor and produce insights more quickly and accurately.
So, how can AI benefit cyber threat intelligence? With threat intelligence, especially on the dark web, data arrives from regions worldwide, and each region brings a different language, culture, and mysterious lingo, which adds complexity when extracting insights.
AI-driven event extraction software helps figure out what people are doing, what attacks may happen, and when they are going to happen. It analyzes statistics and puts them in the context in which they were provided. This allows vendors to detect different entities and the relationships between them without the hassle of the busy work.
4: How Name Matching Works
Name matching systems often seem like a black box. You’re working with a huge database containing a sanctions list or a no-fly list, for example. Then a name comes in from a passport check or an application, and you feed it into a system that gives you a thumbs up or a thumbs down. But how does this work?
Name matching solutions use two separate tools: name indexing, which makes sure the correct name is retrieved from the database, and name matching, which determines if two names are the same. The indexing aspect is important because comparing a query name against every name in the database, one by one, quickly becomes extremely expensive. For the matching itself, the technology uses machine learning, rather than generated lists of name variations, to perform fuzzy name matching. This approach matches never-before-seen names and avoids the problem of an exponentially growing list. Even a three-element name (first, middle, last) in another language with 12 variations for each element would add 1,728 (12 × 12 × 12) variations to a list.
Machine learning has made it possible to run these algorithms on every name put into a database and find the correct person. This technology reduces the need to manually review questionable results, saving both time and money.
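To make the variation-list arithmetic concrete, here is a minimal Python sketch. It is an illustration only, not how Rosette works: the helper names are hypothetical, the spelling variants are made up for the example, and the standard library's difflib.SequenceMatcher stands in for a trained fuzzy-matching model.

```python
from difflib import SequenceMatcher
from itertools import product


def count_variations(variants_per_element):
    """Size of a generated list: the product of the variant counts."""
    total = 1
    for n in variants_per_element:
        total *= n
    return total


def expand_variations(elements):
    """Enumerate every full-name spelling from per-element variant lists."""
    return [" ".join(combo) for combo in product(*elements)]


def similarity(a, b):
    """Character-level similarity in [0, 1]; a crude stand-in for a learned matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Three name elements with 12 variants each, as in the text.
print(count_variations([12, 12, 12]))  # 1728

# Made-up variants: 3 first-name spellings x 2 last-name spellings.
example = expand_variations([
    ["Mohammed", "Muhammad", "Mohamed"],
    ["Alrashid", "Al-Rashid"],
])
print(len(example))  # 6

# A scoring approach compares two strings directly, so a spelling that was
# never added to any list can still receive a high score.
print(similarity("Muhammad Alrashid", "Mohammed Al-Rashid"))
```

The point of the sketch is the design trade-off: a generated list has to contain every spelling in advance and grows multiplicatively with each name element, while a scoring function handles unseen spellings at comparison time.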
5: Rosette Now Matches Names in Hebrew
Basis’s own Gil Irizarry and Fiona Hasanaj, who are responsible for Rosette Name Indexer, the part of Rosette text analytics that matches names across multiple languages, announced that it now supports Hebrew!
Rosette Name Translator users are looking for Hebrew transliterations that are pronounceable and more compatible with those that show up in a database or would be used in searches. To fulfill these specific requirements, Basis Technology created the “folk” transliteration scheme.
The Basis Technology scheme is more user-friendly, as it prioritizes pronounceable transliterations and doesn’t contain diacritic marks. It should be noted, however, that this scheme has ambiguity, so it is not possible to round-trip (that is, convert from Hebrew to Latin and back to Hebrew) with fidelity. In Hebrew, some letters are pronounced the same, so the Basis Technology scheme maps them to the same Latin letter in the transliteration.
In brief, the Basis Tech folk transliteration scheme attempts to balance fidelity with how words are pronounced in Hebrew, while producing name translations that will resemble what people type into search and database systems.
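Why a pronunciation-based scheme cannot round-trip can be shown with a toy Python sketch. The mapping table below is a hypothetical four-letter example chosen for illustration, not the actual Basis Technology scheme; it relies only on the fact that in modern Hebrew tet and tav are both pronounced "t", and kaf (in its hard form) and qof are both pronounced "k".

```python
# Hypothetical toy table, NOT the actual Basis Technology folk scheme.
TET, TAV = "\u05D8", "\u05EA"  # both pronounced "t"
KAF, QOF = "\u05DB", "\u05E7"  # both pronounced "k" (kaf in its hard form)

TOY_FOLK_MAP = {TET: "t", TAV: "t", KAF: "k", QOF: "k"}


def to_latin(hebrew_text):
    """Map each known Hebrew letter to a Latin letter; pass others through."""
    return "".join(TOY_FOLK_MAP.get(ch, ch) for ch in hebrew_text)


def hebrew_candidates(latin_letter):
    """All Hebrew letters in the toy table that produce this Latin letter."""
    return sorted(h for h, l in TOY_FOLK_MAP.items() if l == latin_letter)


# Two different Hebrew letters collapse to the same Latin letter...
print(to_latin(TET) == to_latin(TAV))  # True

# ...so going back from "t" yields two candidates, not one original.
print(len(hebrew_candidates("t")))  # 2
```

Because the forward mapping is many-to-one, no reverse mapping can recover the original Hebrew spelling with certainty, which is the trade-off a scheme accepts in exchange for readable, diacritic-free output.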
That’s not all!
We loved bringing together thought leaders and practitioners to discuss cybersecurity and financial crime and showcasing our Hebrew capabilities. But we also announced that Cybersixgill uses Rosette Entity Extractor to find people, places, and organizations for deep and dark web threat analysis. Check out the press release!
So if you’re interested in an adaptable text analytics and discovery platform, book your demo today.
We’re so pleased to have hosted AI for Human Language 2021. It was a powerful experience and we’re happy to bring it to you on demand.