Background

Gatys, Ecker and Bethge (NIPS 2015) showed that the Gram matrices of intermediate VGG feature maps capture texture statistics well enough to synthesize novel images with matching mid-level statistics. This is related to but distinct from Neural Style Transfer: there is no content image and no content loss, just an image initialized with random noise and iteratively optimized to match the Gram-matrix statistics of a single texture donor across five VGG19 layers (the first conv layer plus the four pooling layers).
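The Gram matrix of a layer is the matrix of inner products between its channel activations, with spatial position averaged out. A minimal sketch of that computation (the function name and normalization choice are illustrative, not taken from the notebook):

```python
import torch

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Channel-by-channel inner products of one layer's activations.

    features: (batch, channels, height, width) activation map from a VGG layer.
    Returns a (batch, channels, channels) matrix of second-order statistics;
    all spatial arrangement is discarded in the flattening step.
    """
    b, c, h, w = features.shape
    flat = features.view(b, c, h * w)          # flatten spatial dims
    gram = flat @ flat.transpose(1, 2)         # (b, c, c)
    # Normalize by the number of spatial positions so statistics from
    # differently sized layers are on comparable scales.
    return gram / (h * w)
```

Because spatial structure is integrated away, two images with very different layouts can share identical Gram matrices, which is exactly what makes the representation a texture descriptor rather than an image descriptor.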

Motivation

The original Caffe implementation is effectively unusable today without significant environment archaeology. PyTorch is the current standard for deep learning research, so this re-implementation exists to make the algorithm accessible without the dependency burden of a deprecated framework.

What It Does

Given an input texture, the algorithm synthesizes a new image that shares its statistical fingerprint at multiple scales of the feature hierarchy, without copying any specific spatial structure. The output is called a texform.

Implementation

The code is delivered as a Jupyter notebook rather than a packaged library, which fits the scope: this is primarily a clear walkthrough of the algorithm rather than production tooling. The loss is the mean squared error between Gram matrices at the five selected layers, optimized directly on the pixel values of the output image.
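The optimization loop can be sketched as follows. Everything here is a stand-in for the notebook's actual code: `extract_features` is a hypothetical callable returning activations at the chosen layers, and Adam is used for brevity where an implementation might prefer L-BFGS as in the original paper.

```python
import torch
import torch.nn.functional as F

def gram(f: torch.Tensor) -> torch.Tensor:
    # Gram matrix normalized by spatial size (see Background section).
    b, c, h, w = f.shape
    flat = f.view(b, c, h * w)
    return flat @ flat.transpose(1, 2) / (h * w)

def synthesis_step(output_img, extract_features, target_grams, optimizer):
    """One gradient step matching the donor's Gram statistics.

    output_img:       pixel tensor with requires_grad=True (the optimized image)
    extract_features: callable returning a list of layer activations (assumed)
    target_grams:     precomputed Gram matrices of the donor texture
    """
    optimizer.zero_grad()
    loss = sum(
        F.mse_loss(gram(f), g)
        for f, g in zip(extract_features(output_img), target_grams)
    )
    loss.backward()   # gradients flow back to the pixels themselves
    optimizer.step()
    return loss.item()
```

Note that no network weights are trained: VGG stays frozen, and the only parameters handed to the optimizer are the pixels of the output image.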