Allure Security, a Boston-based "data loss detection" startup, recently closed a $5.3 million seed funding round led by GlassWing Ventures.
Allure deploys artificial intelligence in a new approach to securing data itself, rather than the network or endpoints. The firm's product was initially developed at Columbia University in New York with $10 million in funding from the Defense Advanced Research Projects Agency (DARPA).
Security Now contributing editor Simon Marshall spoke with Allure's CTO Dr. Salvatore J. Stolfo, who holds 47 patents, about the development and application of AI in cybersecurity.
Security Now: You've been professor of AI at Columbia University since 1979. What was the "light bulb moment" that drove you to develop this technology? What could you foresee that others could not?
Salvatore Stolfo: I had the idea for "behavior-based security" after my years of consulting for Citicorp in the 1980s on its credit card fraud system, and I proposed a project to DARPA in 1996 to study machine learning applied to security. The idea was to use machine learning as a security technique to detect malicious and abnormal behavior. I thought of the concept as quite general, applicable to any computational object or entity, and ultimately to data and data flows. I focused my attention specifically on data behavior later, about 2006 or so. It occurred to me that data behavior was the key to stopping data loss: essentially tracking data, particularly planted bogus decoy data.
SN: What did the commercialization of that look like?
SS: The key insight was to place beacons inside documents so they could be tracked anywhere on the Internet, an idea I had well over a decade ago. I observed commercial "solutions" were aimed at problems that were somewhat pedantic and at least five years behind the sophistication understood in government-oriented environments. I recognized it would be a slow process for the commercial space to catch up with more advanced thinking about their security problems. But I'm a patient person.
SN: Allure claims data loss detection and response (DDR) is a new category in cybersecurity. How so?
SS: DDR is an approach to data security that focuses on the data itself, rather than the surrounding users and IT infrastructure. With DDR, you enhance the data with self-activating beacons, allowing you to track and protect documents wherever they go, inside or outside the organization’s network. It’s a new approach to the age-old problems of data loss and data control.
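Allure has not published the details of its beacon mechanism, but the idea of a self-activating beacon can be illustrated with a minimal Python sketch: embed a unique, per-document reference to a tracking server inside an HTML document, so that rendering the document anywhere triggers a fetch that reveals when and where it was opened. All names here (`embed_beacon`, `beacons.example.com`) are hypothetical.

```python
import uuid

def embed_beacon(html: str, beacon_host: str) -> tuple[str, str]:
    """Insert a unique, invisible beacon reference into an HTML document.
    Returns the tagged document and the beacon id to register server-side.
    A sketch only; real beaconing products use more robust techniques."""
    beacon_id = uuid.uuid4().hex
    # A 1x1 remote image: rendering the document fetches this URL,
    # telling the beacon server when and from where it was opened.
    tag = (f'<img src="https://{beacon_host}/b/{beacon_id}.png" '
           f'width="1" height="1" alt="">')
    return html.replace("</body>", tag + "</body>"), beacon_id

doc, bid = embed_beacon("<html><body>Q3 forecast</body></html>",
                        "beacons.example.com")
```

The same token-per-copy idea extends to other file formats; the essential property is that the identifier is unique to the document, so a single callback ties an open event to one specific file.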
SN: How does this measure up versus securing the network or endpoints?
SS: The advantage of securing the data itself is that you're not relying on network and endpoint measures that inevitably fail. The challenge is that you really need to understand the constructs of the data you’re protecting in order to effectively secure it.
SN: How do you pitch securing the data itself to a market ingrained in securing the infrastructure around the data? Does anyone "get it"?
SS: It is a new approach, but we've been pleasantly surprised by how quickly companies and government agencies really do "get it." We quickly see heads nodding, and often people we talk to start coming up with use cases we hadn’t originally thought of.
SN: Please describe the role machine learning plays here.
SS: One of the key capabilities is the automatic generation of highly believable decoy documents and deceptive data that are automatically strategically placed in an enterprise's environment. For the very sophisticated threats we detect, "believability" of the content hackers are targeting for exfiltration is very important.
AI and machine learning are crucial for generating unbounded amounts of deceptive material in real time. We devised a novel way to do this by placing decoys in strategic locations that are enticing to attackers. With the addition of beacons in these materials, we detect data exfiltration and see where documents were remotely opened. This is a very strong signal of a real attack, the kind of alert security personnel want to know about.
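The generation pipeline itself is proprietary, but the placement-and-alert logic Stolfo describes can be sketched in a few lines: mint each decoy with a unique identifier, record where it was planted, and treat any access to a decoy as a high-confidence alert, since legitimate users have no reason to touch it. The class and template below are illustrative inventions, not Allure's implementation.

```python
import uuid

class DecoyRegistry:
    """Toy registry: any access to a registered decoy is a
    high-confidence alert, because no legitimate workflow uses decoys."""
    def __init__(self):
        self._placements = {}  # decoy id -> where it was planted

    def mint_decoy(self, template: str, location: str) -> str:
        """Fill a template with a unique decoy id and remember its placement."""
        decoy_id = uuid.uuid4().hex
        self._placements[decoy_id] = location
        return template.format(decoy_id=decoy_id)

    def on_access(self, decoy_id: str):
        """Return an alert string if the id is a known decoy, else None."""
        loc = self._placements.get(decoy_id)
        return f"ALERT: decoy planted at {loc} was opened" if loc else None

reg = DecoyRegistry()
doc = reg.mint_decoy("AWS_SECRET={decoy_id}", "fileserver:/finance/keys.txt")
```

Because decoys generate no legitimate traffic, this kind of alert has an essentially zero false-positive rate, which is what makes it "the kind of alert security personnel want to know about."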
SN: Where does the data to train the machine learning come from?
SS: We use AI and machine learning techniques in many ways. Perhaps the most interesting is the unique way we generate deceptive materials. The algorithms we use are applied to archives of old data from which we learn how to modify that content to make it appear recent. There are untold stores of old data available for us to use.
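As a crude stand-in for the learned models Stolfo alludes to, one simple way to make archived content "appear recent" is to shift every date in the text forward so the most recent one lands in the present, preserving the relative spacing between dates. The function below is a hypothetical illustration of that idea, not Allure's technique.

```python
import re
from datetime import date

def refresh_years(text: str, target_year: int = None) -> str:
    """Shift all four-digit years so the most recent one becomes
    target_year (default: the current year), keeping relative spacing."""
    target = target_year or date.today().year
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", text)]
    if not years:
        return text
    shift = target - max(years)
    return re.sub(r"\b(?:19|20)\d{2}\b",
                  lambda m: str(int(m.group()) + shift), text)

print(refresh_years("Audited 2009; revised 2011.", target_year=2018))
# prints "Audited 2016; revised 2018."
```

A real system would also have to rewrite names, figures, and context consistently, which is where the machine learning comes in; this sketch only shows the "modify old content to look current" idea at its simplest.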
SN: How did the application of AI get to this point?
SS: It was my idea to apply AI and machine learning to security problems that were not obvious twenty years ago. Conventional wisdom was focused on prevention, with the mindset that one could build a secure system from hardware architectures up the stack to user interfaces. I thought that was a fool's errand and would fail to solve the problem; no system would be absolutely secure.
Thus, detection technology was a required element of any security architecture, and that's where I focused. The complexity of communication behaviors and the volume and variety of data flowing through today's systems make detecting attacks very hard; we cannot rely entirely on human experts to devise the detection technologies. Only machine learning can do an effective job of learning what is malicious behavior and what is expected normal behavior.
SN: What's the potential you believe that AI has to be a force for good in the cybersecurity world?
SS: I cannot imagine a better good for the world than to create security solutions that make it more difficult for attackers to win. My goal is to make the Internet safe. There just isn't enough human expertise and work effort to devise a very good detection system by hand. Thank goodness, the community now knows that machine learning is the only hope for securing our systems.
Editor's note: This article was condensed and edited for clarity.
— Simon Marshall, Technology Journalist, special to Security Now