A Cozy Community of Data Scientists in Information Security


Every scientist needs a home.  

Like most PhD research topics, mine was “special”. It was unique enough to straddle a few research communities, but fit snugly into none.  Because conferences often reflect these “communities”, I considered my “home academic community” for machine learning to be ICML and NIPS.  But the significant signal and image processing themes in my work often didn’t fit there, and so I also found a “home” at ICASSP, SSP, and Electronic Imaging. The collegial exchange within each subcommunity enriched my research. Still, it frankly took a few years for me to feel comfortable with the simple fact that my presence at each venue would always present a slight mismatch in disciplines.

If you’re a data scientist in information security, you know this feeling. There are a host of excellent venues for machine learning researchers and practitioners, some of which are mentioned above, but these are generally application-agnostic or focus predominantly on computer vision or natural language understanding themes.  At the same time, security conferences like USENIX Security and Virus Bulletin provide academics and practitioners a venue for deep dives in infosec, but historically haven’t been appropriate for equal depth into data science methodologies. In fact, until relatively recently, your paper about (fabricated) “Siamese neural networks vs. fuzzy imphashing of PE’s import address table” would likely be viewed as at least slightly esoteric to either community.

To be fair, the burden of communicating and educating the security industry on machine learning rests squarely on you as a data scientist.  For many years, there’s been a seat at the table for high-level or introductory machine learning talks at conferences like BlackHat, DefCon, BSides, ShmooCon, DerbyCon, and a host of others.  All these are excellent venues and communities, whose machine learning sophistication is growing.  Your data science presence at these must continue. You must be clear and articulate as you describe “new” applications of machine learning in infosec, while also largely speaking above the math (but without hype!) for a broader audience.  Such a venue is not the place to geek out.

Fortunately, there *is* a burgeoning community within information security where geeking out about machine learning is welcome, even if relatively boutique.  For instance, if you’re academically inclined, then AISec is a great venue for rigorous peer review of machine learning applications in information security.  We lauded AISec last year for its singular ML-for-infosec focus.  It leans heavily academic, with great technical talks.

For a slightly broader audience that also includes infosec data science practitioners, we are excited to co-sponsor the inaugural Conference on Applied Machine Learning for Information Security (CAMLIS).  This is intended to be “in the weeds” for infosec data science practitioners, beginning “where the C-level BlackHat talk left off”.  This year, the lineup boasts data science leads and wonks from various (competing) security companies and government.  Indeed, the lineup of invited speakers may be the richest collection of infosec data scientist friends and foes giving deep-dive technical talks in any one place.  The conference is in its first year and, in sharp contrast to large machine learning conferences that sell out before you realize registration is open, aims to keep the small-batch nature that invites participation and community inclusion of every attendee.

So, if you’ve felt altogether too “special” in your quest to find a conference home, then chin up! There’s a thriving community of infosec data scientists. If you’re in the DC metro area on October 28, register and meet them at CAMLIS!  You might just find a “home”!