Okay. So, give us a little preview of the talk and sort of the topics you’re hitting.
Basically what it is, is over the course of the past twenty or thirty years, we’ve been having these problems where you have a computer, it’s connected to a network, suddenly people start uploading porn, and people want to filter that. And back in 1996, the way that they would do it is design features by hand. They would sort of say, here’s a stick figure, it’s mostly skin, it looks like this picture, therefore remove this picture. It works okay, but it’s definitely not optimal.
There’s been a lot of attempts made at this problem, in academia and also industry, with varying levels of complexity. For obvious reasons, the data sets are never the same across, you know, different companies, different universities, ’cause you can’t just release the data. So, what we do at my company is build a model that actually will filter adult content for our customers. And the way we built it is we first had to collect a bunch of data labeled ‘safe for work’ or ‘not safe for work,’ and then instead of designing features by hand, we train neural nets to understand: this image is ‘safe,’ this image is ‘not safe.’ And the neural net is sort of like a black box. You don’t know what it’s doing or why it’s making its decisions unless you use this interesting trick called a deconvolutional network.
So, you can use a deconvolutional network to take an image the network has made some prediction about and sort of ask the network, “Why did you say this about this picture?” It will show you the parts of the image it activated on, which kinds of things were important in that decision-making process, and then you can debug your data sets or your network to understand whether it’s actually learning what you want it to learn.
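The “ask the network why” idea can be sketched with a gradient-based saliency map, a close cousin of the deconvnet visualizations described here. This is a toy stand-in (a hypothetical linear scorer over pixels), not the actual model, but the attribution question is the same:

```python
import numpy as np

# Toy stand-in for a trained network: a linear "NSFW" scorer over an
# 8x8 grayscale image. A real filter would be a convnet, but the
# question -- "which pixels drove the score?" -- is answered the same way.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))      # "learned" weights, one per pixel
img = rng.uniform(size=(8, 8))   # the image being explained

def score(x):
    """Unnormalized NSFW score for an image."""
    return float((w * x).sum())

def saliency(x, eps=1e-4):
    """Finite-difference gradient of the score w.r.t. each pixel.
    Large values mark the pixels the model keyed on -- the kind of
    map a deconvnet or guided-backprop pass gives you for a convnet."""
    g = np.zeros_like(x)
    base = score(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            bumped = x.copy()
            bumped[i, j] += eps
            g[i, j] = (score(bumped) - base) / eps
    return np.abs(g)

s = saliency(img)
print("most influential pixel:", np.unravel_index(np.argmax(s), s.shape))
```

For this linear toy the saliency map is just the weight magnitudes, but the same perturb-and-measure logic applies to any black-box scorer.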
So, I talk about deconvolutional nets, and I apply them specifically to the problem of nudity filtering because it’s an interesting problem: the data sets you work with are always different.
Nobody ever releases the data set for it, so you always have this process of creating a data set and then building a model on data which you sort of guessed was the right data set.
So discussing this sort of sensitive topic, do you find it sort of awkward? How do you approach the sensitivity issue?
It’s totally not awkward, because SafeSearch is a thing; every time you go onto the internet, there’s an option to filter the stuff. So, it’s not awkward at all. The thing that gets a little bit funny is that some of the people who take the most interest in the problem don’t always take interest in it for the banal reasons it exists. ’Cause people want to filter websites; it’s a pretty boring problem at the outset. Sometimes people get really, really excited about all the possibilities, and sometimes that gets a little awkward. But in general, no.
How well do these models work? How would you say?
Yeah, so that’s the thing I want to talk about more in detail tomorrow because on one data set, your model can work extremely well. The thing is that whenever you’re building these models, you make the data set yourself. So, if one of your categories is extremely explicit, like definite, definite nudity, and the other is cute dogs, your model is going to be perfect every single time. If you don’t throw enough hard data into the middle, then you can never get a good understanding of how well the model performs, because all the evaluation metrics that you use, they’re sensitive not only to the performance of the model, but also to the difficulty of the test set. And making sure that your test set has the right level of difficulty is kind of … I guess the creative aspect of the work I do when you try to build these types of things.
I know you can’t talk about the source of your model, but can you, at least, sort of expand more about how you craft the data sets?
How do you avoid the subjectivity seeping into your data? So, do you include a lot of images of something that isn’t a penis, but looks like one, stuff like that? How do you include some of those in-between images?
Yeah, definitely. There’s a lot of ways that you can crowdsource the borderline stuff, but the thing that’s really cool about the talk is, once you’ve decided on a data set, you can take these borderline images that you collect in various ways, that are like sexy but not porn, and just ask the network what it’s thinking. You’ll discover stuff via these deconvolutional nets, like ‘red lips’ is something that just turns out to be very significant for filtered media. Or belly buttons. And of course, every time you have a porn photo, there’s going to be a belly button in it. You learn all these weird side effects. So the process is: you build a data set that’s your guess at something that’s biased, then you run a deconvolutional net, look at the features you’ve actually learned, and you identify, “Oh look, here’s something it learned that I didn’t expect it to learn,” because you don’t design the features by hand. And once I’ve learned, okay, here’s something wrong, I should probably throw a bunch of pictures of belly buttons into the safe stuff so that it doesn’t latch onto those. It’s kind of like having a data scientist in there, using this feature-visualization technique.
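That debugging loop can be sketched numerically. Here a plain logistic regression stands in for the neural net, and two made-up features (“amount of skin,” “has a belly button”) stand in for what a deconvnet would surface; none of this is the production model:

```python
import numpy as np

def train(X, y, steps=2000, lr=0.5):
    """Plain logistic regression by gradient descent
    (a stand-in for the real neural net)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # add a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

# Features: [amount_of_skin, has_belly_button].
# Biased first guess at a data set: every NSFW image happens to show a
# belly button and no safe image does, so the feature looks predictive.
nsfw = np.tile([0.9, 1.0], (50, 1))
safe = np.tile([0.1, 0.0], (50, 1))
X1 = np.vstack([nsfw, safe])
y1 = np.array([1.0] * 50 + [0.0] * 50)
w_biased = train(X1, y1)

# After feature visualization flags the problem: throw a bunch of safe
# pictures that also contain belly buttons into the data, and retrain.
safe_bb = np.tile([0.1, 1.0], (50, 1))
X2 = np.vstack([X1, safe_bb])
y2 = np.concatenate([y1, np.zeros(50)])
w_fixed = train(X2, y2)

print("belly-button reliance before vs. after:",
      w_biased[1] / w_biased[0], w_fixed[1] / w_fixed[0])
```

The retrained model leans far less on the spurious feature relative to the real one, which is exactly the effect of seeding the safe class with counterexamples.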
What’s the bigger issue for your model: false negatives or false positives?
It totally depends on our customers. We have two types of customers, really. We have people who use us for image retrieval, like stock photo companies. 500px. Places like Trivago. All they care about is that people can search images better. And if you want to have good image search on your site, you want to make sure that false positives are low: somebody searches for a dog, they’re going to get pictures of dogs. The other type of customer we have is image removal. Image-removal people want to moderate images; they want to remove things that are undesirable. For those people, false positives are not as bad, because if you accidentally remove something, the user of your site just doesn’t see the thing that was blocked. But if you accidentally show an offensive image, that really hurts the visual experience somebody has on your website.
So, for image-removal companies, which are typically the users of this NSFW model, I suggest people pick a threshold that errs on the side of more false positives and fewer false negatives, but for a lot of our other models, it’s the opposite.
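A sketch of that threshold decision, with simulated model scores. The 0.3 and 0.7 cutoffs are illustrative placeholders, not anyone’s production settings:

```python
import numpy as np

rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 1000)   # 0 = safe, 1 = NSFW
# Simulated model confidence scores in [0, 1]: safe images cluster
# around 0.2, NSFW images around 0.8, with overlap in the middle.
scores = np.clip(0.2 + 0.6 * labels
                 + rng.normal(scale=0.25, size=labels.size), 0.0, 1.0)

def error_rates(threshold):
    """False-positive rate (safe images wrongly flagged) and
    false-negative rate (NSFW images let through) at a cutoff."""
    flagged = scores >= threshold
    fp = float(np.mean(flagged[labels == 0]))
    fn = float(np.mean(~flagged[labels == 1]))
    return fp, fn

# Moderation customer: block aggressively -> low threshold, few FNs.
fp_mod, fn_mod = error_rates(0.3)
# Search customer: keep aggressively -> high threshold, few FPs.
fp_srch, fn_srch = error_rates(0.7)
print(f"moderation: FP={fp_mod:.2f} FN={fn_mod:.2f}")
print(f"search:     FP={fp_srch:.2f} FN={fn_srch:.2f}")
```

The model is identical in both cases; only the cutoff moves, trading one error type for the other to match each customer’s cost of a mistake.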
This problem you talk about sort of reminds me of Chatroulette, which was about seven years ago, way before most people were coding their own networks. I don’t know if they’re still around, but back when it was really popular, would CNNs be able to tackle that problem?
Oh, totally. That’s super easy.
So, even though it’s live …
Our SLA on our site is 200 milliseconds once the image hits our server, so that’s not a lot of time. So, you just send the images to us, and then we can get them right away. And the chat roulette problem, because that was just close-up penises? That’s super easy to filter. Like, no problem.
So, you’re saying that if they were to start back up, we should hire some neural network scientists?
Cool. So, how do you grade big social media on how well they do with this problem? Like Twitter, or someone else. Do you think they’re doing a good job of filtering out bad photos but keeping the good photos as well?
You have to be a little bit more careful, I think, at some of those companies, because once you start to moderate your users … Companies like Twitter or Facebook end up with a lot of criticism for their moderation policies, and I think they’re held to a higher standard, based on their scale, than a lot of other people are. If Facebook gets something wrong, there’s lots and lots of news articles about it, which isn’t true for everyone, and I think you’ll see a much heavier financial investment. Just two days ago, Facebook said they’re going to hire 3,000 people to moderate their site. That’s 3,000 humans, full-time employees, moderating it, which is just a much bigger budget than most of us have. So, I think they can definitely filter nudity quite well. I’ve seen Twitter’s nudity filter work pretty decently. Whenever I go to the site while not logged in and visit an account that I know is going to be blocked, they make sure I’m logged in first. The user-submitted feature on Twitter?
Would you say there’s an ethical component as well?
There’s an ethical component, I think, but the ethical call you have to make is that you’re on a mission not to offend your users. This is something we always have to talk about with our customers when they decide on their false-positive rates: if you’re moderating content and you tell someone they’re doing something wrong when they’re not, that offends your users, and that’s always a problem. In the absence of perfectly accurate models, you do have to make some decisions about which mistakes you’re going to make.