iainhaslam.com/slides
(easier in this direction)
(generally more flexible in this direction)
Images taken shamelessly from the Wikipedia article on ConvNets.
The method minimises a loss function, which measures how close the optimised image is getting to the target content and style.
$L = \alpha L_{content} + \beta L_{style}$
# The main loss function - mixes content and style
loss = alpha * content_loss + beta * style_loss
# Use the Adam optimizer to minimize the difference between the target
# image and:
# 1) the content representation, which is stored in one layer of the CNN,
# 2) the style representation, which is stored in multiple layers of the
#    CNN - each layer considers styles over different areas.
optimizer = tf.train.AdamOptimizer(2.0)
# Compute and apply gradients to optimize for minimal `loss`
operation = optimizer.minimize(loss)
for it in range(iterations):
    # ... progress reporting code omitted ...
    sess.run(operation)
mixed_image = sess.run(model['input'])
save_image(output_filename, mixed_image)
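The `content_loss` term above comes from a single layer's feature maps. A minimal sketch of how it might be computed, in the same style as `_style_loss` further down (the helper name and the simple half-sum-of-squares form follow Gatys et al., and are my assumptions rather than the exact code behind these slides):

def _content_loss(p, x):
    """Squared-difference content loss for one layer.

    p: feature response of the content image at the chosen layer
    x: feature response of the generated image at the same layer
    """
    return 0.5 * tf.reduce_sum(tf.pow(x - p, 2))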
Style loss for a single layer is
$E_l = \frac{1}{4N_l^2M_l^2}\sum_{i,j}\left(G_{ij}^l - A_{ij}^l\right)^2$
… and the per-layer losses are summed according to heuristically determined (guessed at and tested) weights:
$L_{style} = \sum_{l=0}^L w_l E_l $
def _style_loss(a, x):
    """Calculate style_loss for a single layer."""
    # Number of filters
    N = a.shape[3]
    # Area of feature map
    M = a.shape[1] * a.shape[2]
    # Style representation of the original image
    A = _gram_matrix(a, N, M)
    # Style representation of the generated image
    G = _gram_matrix(x, N, M)
    style_loss = (1 / (4 * N**2 * M**2)) * tf.reduce_sum(tf.pow(G - A, 2))
    return style_loss
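`_gram_matrix` isn't shown on the slide. A minimal sketch of it, plus the weighted sum over layers from the equation above (the reshape-and-matmul form and the `weights`/`layer_losses` names are my assumptions):

def _gram_matrix(F, N, M):
    """Gram matrix: correlations between the N filter responses in F."""
    # Flatten each of the N filter maps into a column of length M
    Ft = tf.reshape(F, (M, N))
    return tf.matmul(tf.transpose(Ft), Ft)

# L_style = sum over layers l of w_l * E_l
style_loss = sum(w_l * E_l for w_l, E_l in zip(weights, layer_losses))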
There are lots of variables. To keep things comparable, I’ll be using this familiar view as the content for style transfers. Compare the shapes of the moon.
Wait, you mean some Instagram-alike “make your photo look like art” thing? Not really. These examples are from fotoram.io (Mosaic, Picasso, The Scream); they all have the feel of a filter overlay with edge detection.
What about big bold colours?
Loss function (aka objective function) vs epoch
Some of the increasing losses (a feature of SGD) are just discernible in the animation
Needs less white?
Still needs less white?
With apologies to the bear.
What about this as a style?
How well will this transfer?
Hulk Smash.
Can I transfer the style of a photograph onto a painting?
Answer: No.
This dappling is art, not a compression artifact.
The style in this example is an abstract canvas by our colleague Richard Hall.
Generation time scales (essentially) linearly with the number of pixels.
Return of the loss function
$L = \alpha L_{content} + \beta L_{style}$
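A sketch of how the α:β trade-off can be explored (these particular ratios are illustrative, not the ones used for these slides):

# Re-run the optimisation with different content:style weightings
for alpha, beta in [(100, 1), (1, 1), (1, 100)]:
    loss = alpha * content_loss + beta * style_loss
    operation = tf.train.AdamOptimizer(2.0).minimize(loss)
    # ... re-initialise the input and run the optimisation loop as before ...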
Influence of initial image
Flickering due to differing noise in the starting image.
Flickering due to non-deterministic optimizer.
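For reference, a common way to build the starting image (the function name, `noise_ratio` value and noise range here are my assumptions) is to blend white noise into the content image; fixing the random seed keeps it identical across video frames:

import numpy as np

def generate_initial_image(content_image, noise_ratio=0.6, seed=0):
    """Blend uniform noise into the content image to seed the optimisation."""
    rng = np.random.RandomState(seed)
    noise = rng.uniform(-20.0, 20.0, content_image.shape).astype('float32')
    return noise_ratio * noise + (1.0 - noise_ratio) * content_image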
This is not a cup of coffee (after 100, 300, 3300, 6500 and 30000 epochs)