Google Labs, Google’s experimental arm, is testing a new image generator called Whisk. The tool lets people prompt with images instead of text, so they can remix a photo by altering its subject, scene, and style.
Whisk uses Google’s image-generation model, Imagen 3, to combine three images: one for the subject, another for the scene, and one for the style. For instance, you can select a photo of yourself as the subject, a futuristic landscape as the scene, and an anime style for the final look.
Whisk automatically generates a detailed caption of your images, which is then used to guide Imagen 3 in creating a remix of the photo. You can also add text prompts to further define the desired outcome, such as “Subject is riding a flying bike.”
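As a rough illustration of that flow, the sketch below combines three captions and an optional text instruction into a single prompt. It is only a guess at the general shape: the names `WhiskInputs` and `compose_prompt` and the "Subject / Scene / Style" layout are hypothetical, the captions are typed by hand rather than generated from images, and nothing here calls Imagen 3 or any Google API.

```python
from dataclasses import dataclass


@dataclass
class WhiskInputs:
    """Hypothetical container: one caption per input image, plus optional text."""
    subject_caption: str
    scene_caption: str
    style_caption: str
    extra_text: str = ""


def _clean(text: str) -> str:
    # Trim whitespace and a trailing period so joined parts read cleanly.
    return text.strip().rstrip(".")


def compose_prompt(inputs: WhiskInputs) -> str:
    """Merge the three captions (and any extra text) into one prompt string.

    Assumption: the real prompt format is not described in the article;
    this labeled layout is purely illustrative.
    """
    parts = [
        f"Subject: {_clean(inputs.subject_caption)}",
        f"Scene: {_clean(inputs.scene_caption)}",
        f"Style: {_clean(inputs.style_caption)}",
    ]
    if inputs.extra_text.strip():
        parts.append(f"Details: {_clean(inputs.extra_text)}")
    return ". ".join(parts) + "."


if __name__ == "__main__":
    example = WhiskInputs(
        subject_caption="a person with short dark hair",
        scene_caption="a futuristic city skyline",
        style_caption="anime illustration",
        extra_text="Subject is riding a flying bike.",
    )
    print(compose_prompt(example))
```

Because the image generator in this picture sees only the composed text, any detail the captions leave out is not carried over to the result.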
Google explains that because Whisk focuses on only a few key characteristics from each image, the results may not always meet your expectations. For example, the generated subject could differ in height, weight, hairstyle, or skin tone. Google says you can view and edit the underlying prompts at any time.
The experiment is currently available only to U.S.-based users at labs.google/whisk.