Sketch Simplification: Fully Convolutional Networks for Rough Sketch Cleanup

clickok · on April 28, 2016

The paper itself: http://hi.cs.waseda.ac.jp/~esimo/publications/SimoSerraSIGGR...

Part of what makes this neat is that their architecture includes exclusively convolutional layers. Networks that just use such layers have some incredibly useful emergent properties[1] and are about as good as networks with more varied layers[2].

Here they use that to allow for a wide variety of input sizes, because without a fully-connected layer, there's no need to resize images as part of the preprocessing, or indeed at any point inside the network.
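To make the input-size point concrete, here is a minimal NumPy sketch. It is not the paper's network: the helper `conv2d_same` and the 3x3 box filter are made up for illustration. It only shows that one set of convolution weights applies to images of any size, whereas a fully-connected layer's weight matrix is tied to one input length.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive zero-padded 2D cross-correlation for a single-channel image.

    Assumes an odd-sized kernel; the output has the same shape as the input.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical filter: a 3x3 box blur standing in for learned weights.
kernel = np.ones((3, 3)) / 9.0

small = np.random.rand(32, 48)
large = np.random.rand(100, 75)

# The same nine weights work for both images; output size follows input size.
print(conv2d_same(small, kernel).shape)
print(conv2d_same(large, kernel).shape)

# A fully-connected layer sized for the small image cannot take the large one.
dense_weights = np.random.rand(32 * 48, 10)
_ = small.reshape(-1) @ dense_weights
try:
    _ = large.reshape(-1) @ dense_weights
except ValueError as err:
    print("dense layer rejects the larger image:", err)
```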

I am a bit skeptical about it being able to handle "any" input size, however, because ultimately the convolutions are applied to pixel values, and I can imagine some sadist blowing up a vector sketch to ludicrous dimensions, so that the majority of the filters would see either the uniform color of a line or blank paper. Even supposing that worked, it would be expensive w/r/t memory and computation. But I haven't checked that on my own[3], so I may be wrong, and even if I'm right it could be fixed rather easily by down-sampling or cropping.
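A rough way to see the worry, again as a NumPy sketch under my own assumptions (a toy 16x16 "sketch" with two one-pixel lines, a 3x3 window, and hypothetical helpers `upscale`, `downsample`, and `uniform_patch_fraction`): after blowing the image up, a larger share of the 3x3 windows contain only line or only paper, and block-averaging back down undoes it.

```python
import numpy as np

def uniform_patch_fraction(image, k=3):
    """Fraction of k x k windows whose pixels are all identical."""
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    flat = windows.reshape(windows.shape[0], windows.shape[1], -1)
    return float((flat.max(axis=-1) == flat.min(axis=-1)).mean())

def upscale(image, factor):
    """Nearest-neighbour upscaling: each pixel becomes a factor x factor block."""
    return np.kron(image, np.ones((factor, factor), dtype=image.dtype))

def downsample(image, factor):
    """Block-average downsampling by an integer factor."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Toy sketch: blank paper (0) with one horizontal and one vertical line (1).
sketch = np.zeros((16, 16))
sketch[8, :] = 1.0
sketch[:, 5] = 1.0

huge = upscale(sketch, 16)  # 256x256: 256 times the pixels to process

print("uniform 3x3 windows, original:", uniform_patch_fraction(sketch))
print("uniform 3x3 windows, upscaled:", uniform_patch_fraction(huge))
print("round trip recovers sketch:", np.allclose(downsample(huge, 16), sketch))
```

The pixel count, and with it the memory and compute for each convolutional layer, grows with the square of the scale factor, which is why down-sampling first is the cheap fix.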

All in all, awesome stuff! It's always nice when you see a project that augments the human in the loop rather than just outright replacing them.

-----

1. For example, you can get segmentation "for free" when training a classifier, cf. Fully Convolutional Networks for Semantic Segmentation -- http://cs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

2. These guys show that you can implement something like pooling via adjusting the stride length, and that the networks in question are pretty much state-of-the-art (circa 2014). Striving for simplicity: The all convolutional net -- http://arxiv.org/pdf/1412.6806.pdf

3. My graphics cards are busy at the moment. I just built a computer, named it (Serene Doreen the Mean Green Deep Dreamin' Machine[4]), and now I'm already looking for more capacity.

4. I'm not completely sold on the name; another contender was "Kim Jong GPU". At some point I will probably have to train a net just to name my computers.
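As a small illustration of footnote 2 (my own toy example, not code from that paper): 2x2 average pooling is exactly a stride-2 convolution with a fixed kernel of 0.25s, so a strided convolution with learned weights can at least represent that kind of downsampling. The paper's claim concerns replacing max pooling with learned strided convolutions, which this sketch does not reproduce. The function names here are hypothetical.

```python
import numpy as np

def strided_conv2d(image, kernel, stride):
    """Naive unpadded 2D cross-correlation with a given stride."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def avg_pool2x2(image):
    """2x2 average pooling with stride 2 (drops a trailing odd row/column)."""
    h, w = image.shape
    h, w = h - h % 2, w - w % 2
    return image[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

x = np.arange(36, dtype=float).reshape(6, 6)
pool_like_kernel = np.full((2, 2), 0.25)

conv_out = strided_conv2d(x, pool_like_kernel, stride=2)
print(conv_out.shape)                        # same downsampled shape as pooling
print(np.allclose(conv_out, avg_pool2x2(x)))
```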

pippy · on April 28, 2016

Soon we'll have neural networks taking care of most of the animation pipeline. Most animators work long hours for little pay, and in most western countries the work is simply outsourced to the third world. There was a bit of drama late last year when the average salaries of Japanese animators were made public.

It would lead to a loss of jobs; however, it would also lead to an influx of more diverse content. Flash, for example, was hailed as saving the western animation industry at the expense of quality.

These techniques will hopefully lead to an increase of both quality and quantity.

ryptophan · on April 27, 2016

I mean, it's cool and all... but I'm not sure I'd consider the input images to be rough sketches.

To me it's more like a pencil -> ink converter.

wodenokoto · on April 27, 2016

I'm very impressed they did this with only 68 image pairs!

gwern · on April 28, 2016

I am too. Maybe it's just a really easy task since it's mostly just throwing away information and emphasizing lines? Denoising, almost. But it helps that they do so much data augmentation (>9x).

pluskid · on April 27, 2016

This is cool! This will save a lot of time when converting a hand drawing into high-quality digital contours for adding colors and post-processing on a computer.

sxates · on April 28, 2016

Is this available for use anywhere?

Siemer · on April 28, 2016

It says "Model (pending)" on the project page and that's an empty link, so presumably they will release the code at some point in the future.

SapphireSun · on April 28, 2016

This is quite cool. I wonder if its core technology is related to waifu2x (http://waifu2x.udp.jp/) as its neural net needs to perform a kind of implicit vectorization.