Instagram is using AI to detect bullying in photos and captions

Last year, Instagram introduced an enhanced comment filter that uses machine learning to spot offensive words and phrases in challenging contexts. Now, the company is expanding similar coverage to photos and captions. Today, it announced that it will use AI to “proactively detect bullying” before sending content to human moderators for review.

The new feature will roll out to users in the coming weeks, launching in time for October’s National Bullying Prevention Month in the US and just before Anti-Bullying Week in the UK. The same technology is also being added to live videos to filter comments there.

This is the first product announcement under new Instagram chief Adam Mosseri, who took over following the hasty departure of co-founders Kevin Systrom and Mike Krieger last month. The split was reportedly due to simmering tensions between the pair and parent company Facebook, which has frequently meddled with Instagram’s product.

With public trust in Facebook continuing to fall, Instagram remains the bright spot in the company’s product lineup. It’s popular, profitable, and it has yet to be tainted by the scandals that have undermined Facebook. In this context, using AI to help weed out offensive content and keep Instagram a home for good vibes is extremely important.

A story published in Wired last year explained some of the details of Instagram’s machine learning comment filters, but it’s well-established that this sort of technology is no silver bullet for content moderation. AI is cheap to deploy at scale, yes, but it still has trouble dealing with human context and nuance. That’s why it’s good that these new bullying filters also send content to human moderators to perform the final check. Automation without oversight is a recipe for disaster.

Interestingly, Instagram says it’s not just analyzing photo captions to identify bullying, but also the photo itself. Speaking to The Verge, a spokesperson said the AI looks for split-screen images as one sign of potential bullying, since one person might be negatively compared to another. What other factors the AI will look for, though, isn’t clear. Keeping those details vague might be a good idea, considering that when Facebook announced it would scan memes using AI, people immediately started thinking of ways to get around such filters.

Along with the new filters, Instagram is also launching a “kindness camera effect,” which sounds like a way to spread a positive message that doubles as a method to boost user engagement. While using the rear camera, the effect fills the screen with an overlay of “kind comments in many languages.” Switch to your front-facing camera, and you get a shimmer of hearts and a polite encouragement to “tag a friend you want to support.”

Instagram’s new “kindness camera effect” as launched by teen author, dancer, and actor Maddie Ziegler. Image: Instagram