Controlling visual appearance is a long-standing and challenging research problem in computer graphics. Artists have limited options for realistically portraying real objects, places, and people, that is, for recreating and responsibly altering their visual appearance. Existing tools that offer sufficiently precise control, such as 3D modeling and rendering applications, are labor-intensive. Recent generative AI models can produce plausible images quickly, but limit artists’ creative input. In this dissertation, we present novel methods that bridge the gap between ease of use and controllability by repurposing image diffusion models to both capture and edit real-world appearance.
This dissertation begins with techniques for appearance estimation. Given a photograph, we construct a digital representation of the depicted material properties in a format suitable for 3D graphics artists. A key observation is that the estimation problem is ambiguous: it admits many solutions that are equally accurate but not equally useful. Leveraging recent advances in generative diffusion models to sample multiple plausible appearances lets the user choose whichever best matches their artistic vision. Further, by reusing pre-existing image diffusion models, our techniques benefit from built-in visual knowledge and reduced training cost.
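As a rough, illustrative sketch of the multi-sample idea only (not the dissertation’s actual estimation pipeline), the snippet below draws several candidates from an off-the-shelf pretrained diffusion model via the Hugging Face diffusers library; the checkpoint name and prompt are placeholder assumptions.

```python
# Minimal sketch, assuming the Hugging Face `diffusers` library and a
# generic pretrained checkpoint (placeholders, not this work's method):
# an ambiguous problem admits many plausible solutions, so we sample
# several and let the artist pick the one that fits their vision.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

prompt = "albedo map of weathered oak planks, flat lighting"  # placeholder
candidates = [
    # Each random seed yields a different, equally plausible sample.
    pipe(prompt, generator=torch.Generator("cuda").manual_seed(s)).images[0]
    for s in range(4)
]
for i, image in enumerate(candidates):
    image.save(f"candidate_{i}.png")  # the user reviews and chooses one
```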
This dissertation also demonstrates three complementary methods for editing the visual appearance of real objects post-capture. First, our neural rendering model can synthesize realistic images from easy-to-edit color maps. Second, we contribute a texture synthesis pipeline that extends real photographed detail across an effectively infinite surface. Finally, our novel “overpainting” task allows artists to make intuitive, tightly controlled modifications directly in image space, sidestepping the appearance representation problem entirely.