These are the assorted notes I gathered from my CS 798 lectures in winter 2008. I'm happy to answer questions or fill holes in these notes. Feel free to contact me whether or not you took the course.
I begin by laying out some of the history of western art, particularly the goal of creating a realistic visual record of people, places and events. What does the world look like? In the limit, progress in this direction leads to the "trompe l'oeil" style of painting, in which we literally attempt to fool the eye. I refer to the early example of Zeuxis and Parrhasius, as related by Pliny the Elder. I also show Pozzo's masterful fresco on the ceiling of the St. Ignatius Church, and some wonderful contemporary work by John Pugh. Other wonderful examples of illusion-based art can be found in Al Seckel's book Masters of Deception.
The birth of photography freed painters from their duty as documenters, and allowed them to explore a more subjective approach to painting, one in which they could present their response to the world rather than its raw appearance. The Impressionist movement began at least partly as a response to the new-found realism of photography. Interestingly, a twentieth-century backlash operated in the opposite direction: photorealist painters created reproductions of photographs in minute detail with technical virtuosity.
Computer graphics can be seen as developing along an analogous path. The state of the art in photorealistic rendering is remarkable. In some sense, research in non-photorealistic rendering is like Impressionism: a deliberate attempt to explore the freedom available in computer-generated images in the face of photorealism. We seek to build outward from principles of human perception and aesthetics rather than inward from the physical properties of light.
A venerable and popular image abstraction technique is painterly rendering using coloured strokes. The idea is presumably inspired by an idealization of a real painting process, in which an artist would continuously refine an approximation of a source image by adding strokes. Of course, we don't want the approximation to be too close -- part of the interest in these algorithms comes from the limitations of this means of reproduction. For a general introduction to stroke-based rendering, see the excellent survey by Aaron Hertzmann.
I begin by presenting a trivial algorithm: sample the colour of a source image at some location (x,y) and draw a coloured disc centered on that location. From there it's a small step into Haeberli's classic Paint by Numbers paper (SIGGRAPH 1990). Haeberli offers several strategies for controlling the shapes, sizes, orientations and colours of strokes, and a few other nice extensions. You can play with a Java implementation of his Impressionist program here.
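The trivial algorithm above can be sketched in a few lines of numpy. This is my own minimal illustration, not Haeberli's code; the function name and parameters are invented for the example:

```python
import numpy as np

def paint_discs(source, n_strokes=2000, radius=4, seed=0):
    """Approximate `source` (an H x W x 3 float array in [0, 1]) by
    stamping solid-colour discs at random locations, each disc taking
    its colour from the source pixel under its centre."""
    rng = np.random.default_rng(seed)
    h, w, _ = source.shape
    canvas = np.ones_like(source)          # start with a white canvas
    yy, xx = np.mgrid[0:h, 0:w]
    for _ in range(n_strokes):
        cy, cx = rng.integers(0, h), rng.integers(0, w)
        mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
        canvas[mask] = source[cy, cx]      # stamp a disc of the sampled colour
    return canvas
```

Haeberli's strategies can all be seen as elaborations of this loop: vary the stroke shape, size, and orientation, or choose the sample locations more carefully than uniform randomness.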
This early work spawned a sequence of papers:
Several attempts have also been made to extend these methods to video. This idea leads naturally to a discussion of the nature of temporal coherence in NPR. Temporal coherence is something we all think we want, but as far as I know there is no real definition of it that explains why it's good and why standard artifacts (like the shower door effect) are bad. I think there's fundamental research to be done into the aesthetic question of whether temporal coherence even matters in the first place. Plenty of traditional animation (such as that of Bill Plympton) has nothing resembling coherence; Plympton deliberately adds randomness to keep up the energy in his animations! Still, it's a standard research question in NPR.
Litwinowicz used optical flow to advect strokes from frame to frame, with extra work to keep stroke density roughly constant. Hertzmann and Perlin (NPAR 2000) suggest painting over the previous frame's rendering in order to minimize the number of new strokes generated. Hays and Essa also use optical flow, but more carefully and with fairly convincing results.
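The core of the optical-flow approach is simple: carry each stroke centre along the flow field from one frame to the next. Here is a minimal sketch (my own, with invented names), assuming a dense flow field is already available:

```python
import numpy as np

def advect_strokes(positions, flow):
    """Move stroke centres along a dense optical flow field.
    positions: (N, 2) array of (x, y) stroke centres.
    flow: (H, W, 2) array giving the (dx, dy) displacement at each pixel."""
    h, w, _ = flow.shape
    xi = np.clip(positions[:, 0].astype(int), 0, w - 1)
    yi = np.clip(positions[:, 1].astype(int), 0, h - 1)
    return positions + flow[yi, xi]
```

The "extra work" Litwinowicz describes then amounts to bookkeeping on top of this step: deleting strokes that bunch up or leave the frame, and inserting new strokes where the density has dropped.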
See also the video for The dumbing down of love by Frou Frou.
A major component of NPR research is concerned with recreating the appearance of traditional artistic media. Of course, we don't want to reproduce the dynamics of these media exactly -- computer simulations gain their power from the extent to which they depart from reality (editability, undoability, combinations of styles, etc.). I identify three main approaches to this problem:
I begin with Strassmann's Hairy Brushes (SIGGRAPH 1986) as an early example of faking the appearance of ink brush drawing. I then present two more recent examples of faking natural media: Batik (Wyvill, van Overveld and Carpendale, NPAR 2004) and Felt (!) (O'Donovan and Mould, NPAR 2006).
A good example of simulation is the now classic paper Computer-generated watercolor (Curtis et al, SIGGRAPH 1997), which tries to manifest the typical visual effects of watercolour painting from the physics describing the interaction of water, pigment and paper. In contrast, two recent papers by Bousseau et al. (Interactive watercolor rendering with temporal coherence and abstraction, NPAR 2006, and Video watercolorization using bidirectional texture advection, SIGGRAPH 2007) yield convincing results, even animations, by faking it. Be sure also to check out Bousseau's simple tutorial on watercolourization in Photoshop.
Another medium that has been well explored is oil painting, due mostly to the excellent doctoral work of Bill Baxter. I present his papers dAb: interactive haptic painting with 3D virtual brushes (SIGGRAPH 2001) and IMPaSTo - a realistic, interactive model for paint (NPAR 2004). The former is especially interesting for its natural interface based on a 6DOF haptic brush device. The latter is highly realistic, and runs in real-time using GPU acceleration.
Finally, I consider pencil rendering, based on the paper Observational models of graphite pencil materials by Costa Sousa and Buchanan (Computer Graphics Forum 19(1), 2000). Although they develop what is essentially a fake-it approach, their work is notable in that they derive their model from close observation of pencil marks on paper. Really close: electron microscope images! They also handle the interaction of blenders and erasers with paper, and the deformation of paper fibres by the pencil tip.
There are a couple of commercial products that support painting with realistic natural media. Corel Painter has been around for a long time. I also recommend ArtRage, a recent tool by Ambient Design in New Zealand. It's a fun way to explore the potential of computer graphics in this context.
Of course, there's an interesting longer-term question: what artistic media will the computer enable that are completely divorced from tradition? Surely this is the true potential of NPR.
Halftoning refers to a collection of algorithms that span traditional and non-photorealistic rendering. The original goal of halftoning was to approximate continuous tones on devices that can only support discrete tones (i.e., white marks on a black screen or black marks on a white page). While these algorithms matter less now for display devices (mind you, new display technologies tend to start out in black and white), they're still very important for printing. The ultimate reference for halftoning is Ulichney's book Digital Halftoning (MIT Press, 1987); see also his 2000 survey. The course textbook also offers a pretty good survey of halftoning, with an eye towards adapting the algorithms to non-photorealistic rendering.
The simplest halftoning algorithm is thresholding: set a colour to black or white depending on whether its luminance is below or above some threshold. This produces pretty bad results. We can do better by converting pixels in blocks (say, 3x3). We can then define ten matrices of black and white pixels, and convert blocks of 3x3 pixels from the source image by subdividing the luminance range into ten corresponding intervals. This is called ordered dithering. Dithering produces a lot of structured artifacts, which the textbook claims can be harnessed for NPR. You can trade off image detail against tone reproduction by varying the size of the blocks.
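Ordered dithering is usually implemented with a single dither matrix rather than ten separate patterns: each entry of the matrix acts as a per-pixel threshold, which is equivalent to the block scheme described above. A minimal sketch (the matrix entries here are one plausible arrangement, not a canonical one):

```python
import numpy as np

# A 3x3 dither matrix: pixel (y, x) turns black once the local
# luminance drops below (M[y, x] + 0.5) / 9.
M = np.array([[6, 8, 4],
              [1, 0, 3],
              [5, 2, 7]])

def ordered_dither(gray):
    """gray: (H, W) luminance array in [0, 1]. Returns a boolean
    array where True means 'draw a black pixel'."""
    h, w = gray.shape
    thresh = (np.tile(M, (h // 3 + 1, w // 3 + 1))[:h, :w] + 0.5) / 9.0
    return gray < thresh
```

A constant mid-grey input turns on a fixed fraction of the pixels in every 3x3 block, which is exactly the ten-level tone reproduction described above.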
Another approach is to convert pixels one at a time, keeping track of the approximation error that accumulates by choosing black or white for any given pixel. You can then dole out fractions of that error to neighbouring pixels not yet processed. This technique is known as error diffusion dithering, and it produces far fewer artifacts than ordered dithering. The Floyd-Steinberg algorithm is by far the most well known error diffusion algorithm, but the question of how best to distribute error is still an active area of research.
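Floyd-Steinberg can be written down in a dozen lines. This sketch uses the classic error-distribution weights (7/16 right, 3/16 down-left, 5/16 down, 1/16 down-right):

```python
import numpy as np

def floyd_steinberg(gray):
    """Error-diffusion dithering; gray is (H, W) in [0, 1].
    Returns an array of 0s and 1s."""
    img = gray.astype(float).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            img[y, x] = new
            err = old - new
            # Distribute the quantization error to unprocessed neighbours.
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return img
```

Because the error is carried forward rather than discarded, the average tone of the output tracks the average tone of the input, with far less visible structure than a fixed dither matrix produces.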
Conceptually, screening is a bit like using an arbitrary greyscale image in the place of the dither matrices in ordered dithering. When placed atop an image, the greyscale screen is treated as a spatially varying threshold used to convert individual source pixels. You can also visualize this process using height fields: draw the source image as a black height field and the screen as a white height field. When drawn on top of each other, the screen height field will only allow black pixels to be shown that are locally darker than the screen at that point. In class I showed a demo that worked in exactly this way, using the OpenGL depth buffer to perform screening. Two great papers that extend the basic notions of screening are Digital Facial Engraving (Ostromoukhov, SIGGRAPH 1999) and Artistic Screening (Ostromoukhov and Hersch, SIGGRAPH 1995).
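The height-field picture above reduces to a single comparison per pixel. Here is a minimal sketch (my own names), tiling a small greyscale screen across the image:

```python
import numpy as np

def apply_screen(gray, tile):
    """Treat a greyscale tile as a spatially varying threshold: a
    source pixel becomes a black mark wherever the image is locally
    darker than the screen. gray and tile are arrays in [0, 1]."""
    h, w = gray.shape
    th, tw = tile.shape
    screen = np.tile(tile, (h // th + 1, w // tw + 1))[:h, :w]
    return gray < screen    # True = black mark
```

The OpenGL depth-buffer demo from class performs exactly this comparison in hardware, with the screen and the source image rendered as the two height fields.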
Stippling also involves the placement of black marks on a white background, but the marks are no longer constrained to a pixel grid. It's an important illustration technique in the sciences, useful for its ability to convey shape, tone and texture simultaneously in black and white. I talk about stippling in the context of a couple of papers: Floating Points: A Method for Computing Stipple Drawings (Deussen et al., Eurographics 2000), and Weighted Voronoi Stippling (Secord, NPAR 2002). I really like Secord's paper. It's a simple, elegant idea: use a weighted version of Lloyd's method to construct an importance sampling of an image. The results are quite nice.
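The heart of Secord's idea can be sketched with brute force, skipping the actual Voronoi diagram: assign every pixel to its nearest stipple point, then move each point to the density-weighted centroid of its region. This is a simplified take with invented names, not Secord's implementation:

```python
import numpy as np

def weighted_lloyd(density, n_points=64, iters=10, seed=0):
    """Relax stipple points so that each sits at the density-weighted
    centroid of its Voronoi cell. density: (H, W), larger = darker,
    so points accumulate in dark regions."""
    rng = np.random.default_rng(seed)
    h, w = density.shape
    pts = rng.uniform([0, 0], [w, h], size=(n_points, 2))
    yy, xx = np.mgrid[0:h, 0:w]
    pix = np.stack([xx.ravel() + 0.5, yy.ravel() + 0.5], axis=1)
    wts = density.ravel()
    for _ in range(iters):
        # Nearest-point assignment stands in for the Voronoi diagram.
        d2 = ((pix[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
        owner = d2.argmin(axis=1)
        for i in range(len(pts)):
            m = owner == i
            total = wts[m].sum()
            if total > 0:
                pts[i] = (pix[m] * wts[m, None]).sum(axis=0) / total
    return pts
```

A real implementation computes the Voronoi regions explicitly (Secord does it on the GPU), which is what makes the method fast enough for large images.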
Just for fun, I closed the lecture by showing some results from TSP Art, a line-art technique derived from Weighted Voronoi Stippling.
Pen-and-ink rendering might theoretically be lumped in with the earlier material on simulation of natural media, but it makes sense to separate it out. The simulation aspect of ink illustration isn't as important in this body of work -- we don't worry about the physics of ink or paper, or the dynamics of the drawing hand. Most papers on this topic simply assume ink drawings are geometrically perfect black strokes on a white background.
As with halftoning, our ink strokes must play three roles simultaneously: they must depict tone, texture and shape. In a way, it's amazing that a few simple ink strokes can do that at all, and that our visual systems will reassemble the original scene.
I discuss a sequence of four papers that laid out the field of pen-and-ink rendering (and which helped to define NPR as a research area):
As an addendum, I show some of the recent work Jie Xu and I did on mazes, which solves some related line-drawing problems.
NPR packing is a somewhat nebulously defined area, but one that I think is coalescing enough that it deserves to be treated as its own topic. The goal in NPR packing is usually to pack a collection of geometric primitives tightly within a container region in such a way that they achieve a set of aesthetic goals. As a motivating example, you may think of Roman mosaics or the portraits by Giuseppe Arcimboldo. Potential goals include:
Photomosaics and stippling may both be considered as a kind of NPR packing, though with photomosaics the packing problem is straightforward and the challenge lies in colour approximation.
I discuss several papers that have extended the original stippling algorithms to allow a wider range of primitives:
I then move on to algorithms for drawing Roman mosaics. A good first attempt is Simulating decorative mosaics (Hausner, SIGGRAPH 2001), which uses Lloyd's method in conjunction with the Manhattan distance metric to pack rectangular primitives oriented according to a direction field. It's not clear that this is what you want -- the algorithm produces many gaps and overlaps, whereas Roman mosaic artists seem to prefer reshaping tiles in order to keep grout even. A more principled approach is given in Rendering Traditional Mosaics (Elber and Wolberg, Visual Computer 2003). They line mosaic tiles up inside channels formed by successive parallel offset curves around an object. This approach can produce a better layout with more uniform grout. Some improvements, and a closer approximation of one particular mosaic style, can be found in A novel technique for Opus Vermiculatum rendering (Battiato et al., WSCG 2006).
I end with a few other papers on NPR packing that I couldn't, er, pack into the topics above:
Before moving on to real-time, 3D NPR, I stop and consider the question of optimization in isolation. Optimization is a powerful technique in the context of NPR, where we often can describe the results we'd like to see but can't provide an algorithm that produces them. An optimization algorithm can search over a large space of possible drawings for one that fulfills our measurement of quality.
A good reference for optimization techniques is the chapter on optimization in the Numerical Recipes series (note in particular the online copy of the 1992 edition). I explain the perturbation-based simulated annealing method, and then discuss Rendering Effective Route Maps: Improving Usability Through Generalization (Agrawala and Stolte, SIGGRAPH 2001) as an example. You can try out their technique for yourself on Mapblast (click on "Get Directions" and select "LineDrive" as the map style). I also show some of my recent work with Jie Xu that uses simulated annealing.
At this point in the course, I make a rough transition from 2D graphics to 3D graphics. My aim is primarily to build up to some of the recent research on extracting fancy view-dependent curves from 3D models. In order to do that, I need to review a lot of basic concepts and simpler algorithms.
It's safe to assume that we'll be operating on polygonal meshes (or the pixels that represent drawings of them). A quick review of meshes can be found in the CS 488 course notes. Note that a mesh might have one of two interpretations:
In the latter case, we can safely assume that the vertices of the mesh lie on the ideal smooth object, and that the vertices have normals taken from the smooth object. If we don't have vertex normals, we can always invent them by averaging the normals of the faces adjacent to each vertex. In any case, we can then smoothly (and approximately) interpolate the normals across triangle faces using barycentric coordinates.
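Both steps — inventing vertex normals from face normals, and interpolating them across a face — are short enough to show directly. A minimal sketch (my own function names):

```python
import numpy as np

def vertex_normals(vertices, faces):
    """Invent vertex normals by accumulating the (area-weighted)
    normals of the faces adjacent to each vertex, then normalizing.
    vertices: (V, 3) array; faces: (F, 3) integer array."""
    vn = np.zeros_like(vertices)
    for f in faces:
        a, b, c = vertices[f]
        vn[f] += np.cross(b - a, c - a)   # cross product length ~ 2 * area
    return vn / np.linalg.norm(vn, axis=1, keepdims=True)

def interpolate_normal(n0, n1, n2, bary):
    """Smoothly interpolate three vertex normals across a triangle
    using barycentric coordinates (b0, b1, b2), then renormalize."""
    n = bary[0] * n0 + bary[1] * n1 + bary[2] * n2
    return n / np.linalg.norm(n)
```

Note the renormalization in the second function: a convex combination of unit normals is generally shorter than unit length, so it must be rescaled before use in lighting.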
After a review of the Phong illumination model, I discuss how to achieve a two-tone toon shading look with 3D objects. The standard approach evaluates the diffuse lighting at every vertex in the mesh and uses it to look up colour in a 1D texture map. We can add a stylized specular highlight by using multitexturing to add in a lookup to the specular lighting. This approach is explained in Stylized rendering techniques for scalable real-time 3D animation (Lake et al., NPAR 2000), though these days the correct approach is simply to use GPU programming.
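The 1D-texture trick amounts to quantizing the diffuse term through a small colour ramp. A CPU sketch of the idea (in a shader, the ramp would be a 1D texture and the lookup a texture fetch):

```python
import numpy as np

def toon_shade(normal, light_dir, ramp):
    """Two-tone (or n-tone) shading: evaluate diffuse lighting and
    use it to index a 1D colour ramp instead of shading smoothly.
    ramp: (K, 3) array of colours from dark to light."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    diffuse = max(0.0, float(n @ l))
    idx = min(int(diffuse * len(ramp)), len(ramp) - 1)
    return ramp[idx]
```

With a two-entry ramp this gives the classic hard shadow boundary; adding entries gives banded shading, and the stylized specular highlight described above is just a second lookup added on top.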
Toon shading is effective for simple animations and games (see, for example, the Zelda game "The Wind Waker"). But there are some annoying artifacts, and the results are hard to control. There's some good recent research that tries to add more directorial control to toon shading -- see Locally controllable stylized shading (Todo et al., SIGGRAPH 2007) and other papers by the same authors.
Another interesting bit of recent work is X-Toon: an extended toon shader (Barla et al., NPAR 2006). They propose that the standard 1D texture map approach to toon shading be extended to a 2D texture map. They suggest a couple of different ways to map aspects of the scene to that second axis, and different contents for the texture map that support a number of common rendering effects.
Finally, I look back over some "classic" NPR work: A non-photorealistic lighting model for automatic technical illustration (Gooch et al., SIGGRAPH 1998). They suggest a combination of several illumination models motivated by traditional illustration techniques and the desire not to obscure detail as would normally happen with Phong illumination. First, they remap the diffuse illumination term N⋅L from [-1,1] to [0,1], so that no detail is lost in areas that point away from the light source. Then they define two lighting models. One runs from the object's ambient colour to its fully lit colour based on the remapped diffuse illumination. The other runs from a cool (blue) colour to a warm (yellow) colour. And they introduce a parameterized lighting model that blends between the dark-to-light shading and the cool-to-warm shading. The result gives an illustrative look that is held up as a classic example of NPR.
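The cool-to-warm half of the Gooch model fits in a few lines. This sketch follows the blended form of their shading (the default colours and blend factors here are illustrative values, not the paper's exact constants):

```python
import numpy as np

def gooch_shade(normal, light_dir, object_color,
                cool=np.array([0.0, 0.0, 0.55]),
                warm=np.array([0.3, 0.3, 0.0]),
                alpha=0.25, beta=0.5):
    """Cool-to-warm shading: remap N.L from [-1, 1] to [0, 1], then
    blend between a cool colour (plus a fraction alpha of the object
    colour) and a warm colour (plus a fraction beta of it)."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    t = (float(n @ l) + 1.0) / 2.0          # remapped diffuse term
    k_cool = cool + alpha * object_color
    k_warm = warm + beta * object_color
    return t * k_warm + (1.0 - t) * k_cool
```

Because t never reaches zero except for surfaces pointing directly away from the light, every visible point keeps some colour variation — exactly the detail-preserving property the paper is after.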
I introduce the problem of creating a line drawing of a 3D model. There's lots of evidence, both in the history of art and more recently in models of human perception, to suggest that simple line drawings can communicate a great deal about shape.
What sorts of lines (well, curves) might we be interested in? I introduce four types for starters (we'll add to this list later):
We must immediately consider how to identify such lines on a surface. Explicit curves require no work, though representing a curve on a surface requires some kind of parameterization of the surface. Boundaries are purely topological features and easy to find with the right kind of mesh representation. Creases can be found approximately by thresholding against the dihedral angle between two mesh faces (or sometimes by storing multiple normals at vertices).
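The dihedral-angle test for creases is a one-liner per edge once face normals are available. A minimal sketch, assuming we already have a map from each interior edge to the pair of faces sharing it (the names are mine):

```python
import numpy as np

def crease_edges(vertices, faces, edge_faces, angle_thresh_deg=60.0):
    """Flag an edge as a crease when the dihedral angle between its
    two adjacent faces exceeds a threshold. edge_faces maps each
    edge (i, j) to the pair of face indices sharing it."""
    def face_normal(f):
        a, b, c = vertices[faces[f]]
        n = np.cross(b - a, c - a)
        return n / np.linalg.norm(n)
    cos_t = np.cos(np.radians(angle_thresh_deg))
    return [edge for edge, (f0, f1) in edge_faces.items()
            if face_normal(f0) @ face_normal(f1) < cos_t]
```

The comparison works because the dot product of unit normals is the cosine of the angle between them, and cosine decreases as the fold sharpens.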
Contours are the interesting case. Aaron Hertzmann's tutorial introduces some of the basic contour identification algorithms. On a faceted mesh, we can identify edges that are shared by a front face and a back face (in which case we can use Buchanan and Sousa's edge buffer to locate these edges). Hertzmann also presents a smoother algorithm that interpolates contours through faces of a mesh with vertex normals. A truly wonderful approach that still gives me warm fuzzies is the dual-space method presented in Illustrating smooth surfaces (Hertzmann and Zorin, SIGGRAPH 2000 -- see also the source code). They show that the contours can be found by intersecting a plane representing the dual of the camera with a "dual surface". The difficulty comes from the fact that the dual surface may include points at infinity. Asymptotically, this is probably the fastest possible deterministic method for identifying contours. Note that Pop et al. developed a very similar technique at around the same time. See their paper Efficient perspective-accurate silhouette computation and applications (SoCG 2001). Question: this algorithm must be the dual of an algorithm that runs directly on the original mesh. What is that algorithm?
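The basic front-face/back-face test on a faceted mesh can be sketched directly. As with the crease example, I assume an edge-to-faces map is available, and the names are mine:

```python
import numpy as np

def contour_edges(vertices, faces, edge_faces, view_point):
    """Find contour edges of a mesh under perspective: an edge is a
    contour when one adjacent face points toward the viewer and the
    other points away."""
    def facing(f):
        a, b, c = vertices[faces[f]]
        n = np.cross(b - a, c - a)
        return (n @ (view_point - a)) > 0   # front-facing?
    return [e for e, (f0, f1) in edge_faces.items()
            if facing(f0) != facing(f1)]
```

Note that the test is view-dependent: the same mesh yields a different set of contour edges for every camera position, which is what makes efficient contour extraction (and the dual-space method above) interesting.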
Once contours (and other lines) have been identified, the difficult problem of visibility processing arises. How do we draw only those lines that are not occluded by the surface? We know this will be a hard problem, since occlusions can be non-local, and we may need to consider partial visibility.
Once we've processed a bunch of lines and computed their visibility as above, how do we render them? And what other lines might we care about? In this lecture I present some of the techniques for applying stylizations to contours and other lines, and some more recent research about extracting other meaningful lines from surfaces.
Given a bunch of line segments in screen space (say, taken from contour edges in a mesh), we typically link them together into long chains, approximate the polygonal chains with smooth Bézier curves, and draw stylized strokes along the chains. We can texture map a stroke image along the curve, use transparency and anti-aliasing, maybe even texture synthesis. Some of these effects are demonstrated in Artistic silhouettes: a hybrid approach (Northrup and Markosian, NPAR 2000).
A very nice, complete system for stylized strokes and shading is presented in WYSIWYG NPR: drawing strokes directly on 3D models (Kalnins et al., SIGGRAPH 2002). They provide the stylizations above, allow the user to sketch the stylized strokes interactively, and support several forms of hatching strokes on surfaces.
One problem with rendering stylized strokes in the style of WYSIWYG NPR is temporal coherence. If the 3D model moves, rotates and scales, how shall we move the strokes to make the resulting animation look "right"? That turns out to be a hard question, which depends in part on the way we interpret those stylized strokes (do they reflect a simulated artistic medium or detail on the underlying 3D surface?). These issues are tackled in Coherent stylized silhouettes (Kalnins et al., SIGGRAPH 2003). They present a technique for assigning a parameterization to each contour chain in a line drawing, and for propagating that parameterization from frame to frame without too much distortion.
I also review some of the more recent work on identifying new lines that help to communicate 3D shape and shading. I briefly review fundamental principles from differential geometry, namely curvature of a curve, curvature of a surface, principal curvature, and radial curvature. I then discuss three classes of lines:
I show how some NPR algorithms can be implemented simply by manipulation of pixel data, assuming that a rich set of buffers (beyond just R, G, and B) have been provided. I cover the seminal SIGGRAPH 1990 Comprehensible Rendering paper by Saito and Takahashi, the Piranesi system used commonly in architecture, and Curtis's Loose and Sketchy Animation technique. Curtis gives more details about his technique in the course notes of the SIGGRAPH 1999 NPR course. I also tie in image-based silhouette algorithms, as presented in Hertzmann's silhouette tutorial.
The techniques of Lectures 9 and 10 can extract attractive line drawings from 3D models. We know a little bit about how to fill in faces too, having discussed non-photorealistic lighting models. How can we further stylize the interiors of surfaces? This topic has cropped up occasionally in the past, for example in some of the watercolour work and the Comprehensible Rendering paper from Lecture 11. In this lecture, I explore a few additional approaches. As usual, one of the big difficulties to overcome is temporal coherence.
In Illustrating smooth surfaces, Hertzmann and Zorin present a technique for creating an illustration of a 3D shape using hatching strokes. The strokes are found by smoothing streamlines that flow in directions of principal curvature on the surface. They use a few heuristics derived from mathematical illustration to decide on the density of hatching strokes, and whether to show strokes in zero, one, or two directions.
Painterly rendering for animation (Meier, SIGGRAPH 1996) presents a technique that adapts the stroke-based painterly rendering algorithms from early in the course to 3D models. She generates a fixed set of point samples on the 3D surfaces in a scene. Then, for a given frame in an animation, strokes are rendered in screen space at the projected locations of the samples. Appropriately enough, she uses the Painter's Algorithm: strokes are painted in order from furthest from the camera to closest.
Real-time hatching (Praun et al., SIGGRAPH 2001, and the followup in NPAR 2002) introduces the tonal art map, a 2D array of textures that represents increasing tone in one direction and increasing level of detail in the other (via MIPmaps). The individual textures are nested, in the sense that every texture is a subset of its darker and more detailed neighbours. Taken as a whole, a tonal art map represents a single medium, such as pencil hatching. The paper uses a lot of fancy texture mapping hardware combined with lapped textures (SIGGRAPH 2000) to render attractive, seamless hatching in real time on circa 2001 hardware. These days, a lot of the careful hackery is obviated by more general GPU programming capabilities.
Most recently, Dynamic 2D patterns for shading 3D scenes (Breslav et al., SIGGRAPH 2007) shows how surface texture and halftoning techniques such as screening can be adapted to 3D models with temporal coherence. They observe the motion of a set of point samples on the surface to infer a best-fit similarity motion (a combination of translation, rotation and uniform scaling) that maps the texture between successive frames. They need a few additional tricks to make scaling work correctly and possibly break a surface into independent surface textures.
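The best-fit similarity step can be solved in closed form from point correspondences; the standard tool is an SVD-based least-squares fit (the Procrustes/Umeyama construction). This sketch is my own 2D version, not Breslav et al.'s code:

```python
import numpy as np

def fit_similarity(src, dst):
    """Best-fit 2D similarity (uniform scale s, rotation R,
    translation t) mapping point set src onto dst in the
    least-squares sense, so that dst ~ s * R @ src + t.
    src, dst: (N, 2) arrays of corresponding points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    ps, pd = src - mu_s, dst - mu_d
    u, sing, vt = np.linalg.svd(pd.T @ ps)   # cross-covariance (summed)
    d = np.sign(np.linalg.det(u @ vt))       # guard against reflection
    R = u @ np.diag([1.0, d]) @ vt
    s = (sing * [1.0, d]).sum() / (ps ** 2).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t
```

Given the tracked sample positions in two successive frames, the recovered (s, R, t) is the transform used to carry the screen-space texture forward.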
We've talked throughout the course about distorting shapes, colours, pixels, and screen space primitives. One facet of rendering that has remained fixed is the camera. Until now, we have relied almost exclusively on straightforward linear perspective. Yet the camera is undoubtedly a participant in the process of storytelling and artistry. We might discover new NPR techniques by deliberately departing from linear perspective.
In full generality, we know that we're producing pixels, so we can allow every pixel in the output image to have its own camera. Of course, that would produce total chaos. We need high-level techniques to control all these cameras in a sensible way. I discuss five papers on this subject.
I close out the course by talking about some of the "grand challenges" in NPR. NPR is a bit of a strange research area. What's the common thread that holds together NPR research? What have we accomplished so far? What is NPR for? Where are we going with it? This is something we're very conscious of in this field -- almost every NPR meeting includes an invited talk or panel in which the field tries to understand itself and the challenges that await.
I take advantage of these previous discussions to recapitulate the talks others have given on this subject in the past. I borrowed slides from David Salesin's NPAR 2002 keynote talk, and the talks by Fredo Durand, Doug DeCarlo and Aaron Hertzmann from NPAR 2007. These talks have not yet been organized; if that ever happens, I'll update these notes. In the meantime, I'll repeat the shameless ad in Durand's talk -- his paper An invitation to discuss computer depiction (NPAR 2002) is a great place to start thinking about the big questions.
Craig S. Kaplan. Last updated: