They say a picture is worth a thousand words. But what words, damnit? If I see a stunning creation from Midjourney, Stable Diffusion or DALL-E 2, how can I go about creating a similar prompt if I don’t know the arcane magic words the creator used?
We launched PromptPerfect a while ago to help creators perfect their prompts (as you may have guessed from the name). It’s a tool to take your prompt (for text or image generation) and make it a lot more detailed.
Now it can reverse engineer prompts too, by extracting a prompt from any image you upload:
I can just feed in an image and it’ll reverse engineer the prompt so I can r̵i̵p̵ ̵o̵f̵f̵ be inspired by my favorite artists:
And here’s the prompt that PromptPerfect returns:
Japanese garden with a wooden bridge, mountain range in the background, traditional pagoda, lush greenery, serene atmosphere, family taking photo on the bridge, woman and child admiring pagoda, flowing river sound. Hyper-realistic digital art inspired by Hokusai's "Thirty-six Views of Mount Fuji," highly detailed with intricate patterns and textures, cinematic lighting and depth of field. Art style influenced by Yoshida Hiroshi and Utagawa Hiroshige.
PromptPerfect vs BLIP-2: Fight!
Let’s put it to the test by comparing it to BLIP-2:
BLIP-2: a girl with long hair and a feathered headdress is standing in the woods
PromptPerfect: Portrait of a young Native American warrior princess, standing tall and proud on the edge of a cliff overlooking a vast canyon. She wears traditional clothing and accessories, including a feathered headdress, intricate beadwork, and tattoos that tell stories of her tribe's history. In one hand she holds a sharp spear, ready to defend her people against any threat. The scene is bathed in warm golden light as the sun sets behind her, evoking feelings of strength, courage, and fierce determination. Art by Amalia Chitulescu and Brian Sum.
As you can see, BLIP-2 spits out a very short, vague description, while PromptPerfect gives you a long, detailed prompt more suitable for image generation. This longer prompt contains a lot more variables for you to tinker with, though it does tend to hallucinate a bit (like specifying that she holds a spear and there’s a sunset behind her).
Now let’s feed both of those into Midjourney to see what we get. We’ll get four images from each prompt to ensure fairness:
BLIP-2 (left), PromptPerfect (right)
In the images generated from the PromptPerfect prompt, the lighting is more in line with the original image and the patterns of dress are more consistent. The BLIP-2 prompt seems to have led to more “creative” interpretations of a feathered headdress, with what seem to be leaves, and less of a traditional Native American vibe.
Let’s try a few more, like one of my favorite comic books, Marvel’s Nextwave:
BLIP-2: a comic book cover featuring a group of characters
PromptPerfect: comic book cover featuring a superhero in the center, surrounded by various action scenes and villains. The top left piece shows the hero flying through the city, while the top right piece features a battle with a giant robot. The bottom left piece depicts the hero fighting off a group of henchmen, and the bottom right piece shows an intense showdown between the hero and his arch-nemesis on a rooftop. Despite the chaotic nature of the scene, there is a clear sense of heroism and triumph emanating from the central figure.
Neither of them is perfect, but BLIP-2’s prompt is so minimalist and vague that it isn’t very useful. Again, PromptPerfect hallucinates a rooftop and a city, but this gives a prompt engineer more knobs and dials to fiddle with, by deleting or changing some words.
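If you'd rather not prune hallucinated details by hand, here is a minimal sketch of one way to do it: drop every sentence that mentions a word you don't want. The helper name `drop_sentences` is hypothetical (it is not part of PromptPerfect), and the prompt is an abbreviated version of the one above.

```python
import re
from typing import Iterable


def drop_sentences(prompt: str, unwanted: Iterable[str]) -> str:
    """Remove every sentence that mentions one of the unwanted words."""
    unwanted = list(unwanted)
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", prompt.strip())
    kept = [
        s for s in sentences
        if not any(
            re.search(rf"\b{re.escape(w)}\b", s, re.IGNORECASE)
            for w in unwanted
        )
    ]
    return " ".join(kept)


# Abbreviated version of the PromptPerfect comic book prompt.
prompt = (
    "comic book cover featuring a superhero in the center, surrounded by "
    "various action scenes and villains. The top left piece shows the hero "
    "flying through the city, while the top right piece features a battle "
    "with a giant robot. Despite the chaotic nature of the scene, there is a "
    "clear sense of heroism and triumph emanating from the central figure."
)

cleaned = drop_sentences(prompt, ["city", "rooftop"])
print(cleaned)
```

This is a blunt instrument: the whole sentence goes, so the giant robot disappears along with the city. For finer control you would edit individual phrases instead.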
Here’s the result in Midjourney again:
BLIP-2 (left), PromptPerfect (right)
Here we can see that the image generated from BLIP-2’s prompt lacks the dynamic and frenetic feel of the original image. It’s more like a bunch of folks standing around. While PromptPerfect doesn’t quite capture the delicious bonkers-ness of Nextwave, it gets a lot closer, showing crowded action-packed scenes of superheroes superheroing.
PromptPerfect: What is it good for?
Absolutely…lots of stuff.
Let’s say you’re running a project and want to re-design the logo. You want to maintain the same general vibe, but put a new spin on it. In this case, PromptPerfect’s hallucination becomes a feature rather than a bug, since it adds new features to what you already have.
Take our (at the time of writing) logo for Jina:
If we upload that to PromptPerfect we get this prompt:
magnifying glass, central focus, vibrant colors, inspecting, highlighting, circle logo, multiple shapes and hues| colorful elements| adjacent multicolored circles in yellow, green, blue| striking contrast between black background and vivid circle| recurring magnifying glasses throughout the scene| exploration of color and detail through lenses| dynamic composition that encourages viewers to examine intricate details within design
And then plugging that into Midjourney gives us wild new options:
If we manually tweak the prompt a bit (by fiddling with the knobs and dials PromptPerfect added), we get:
magnifying glass logo, central focus, vibrant colors, inspecting, highlighting, circle logo, multiple shapes and hues| colorful elements| adjacent multicolored circles in yellow, green, blue| striking contrast between white background and vivid circle| recurring magnifying glasses throughout the scene| exploration of color and detail through lenses| dynamic composition that encourages viewers to examine details within design
Which gives us some more promising candidates:
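For anyone who wants to script that kind of knob-twiddling, here is a rough sketch in Python. It assumes the prompt is just a string of phrases separated by commas and pipes, as in the example above. The helpers `split_phrases` and `tweak` are hypothetical names made up for illustration, and the prompt is abbreviated.

```python
import re


def split_phrases(prompt):
    """Split a prompt on commas and pipes into trimmed, non-empty phrases."""
    return [p.strip() for p in re.split(r"[,|]", prompt) if p.strip()]


def tweak(prompt, replacements):
    """Apply each old -> new substitution to the prompt text."""
    for old, new in replacements.items():
        prompt = prompt.replace(old, new)
    return prompt


# Abbreviated version of the logo prompt PromptPerfect returned.
original = (
    "magnifying glass, central focus, vibrant colors, circle logo| "
    "striking contrast between black background and vivid circle| "
    "dynamic composition that encourages viewers to examine "
    "intricate details within design"
)

tweaked = tweak(
    original,
    {"black background": "white background", "intricate details": "details"},
)

# Pair up phrases before and after to see which knobs actually moved.
changed = [
    (before, after)
    for before, after in zip(split_phrases(original), split_phrases(tweaked))
    if before != after
]
for before, after in changed:
    print(f"- {before}\n+ {after}")
```

Listing the changed phrases side by side makes it easier to remember which tweak produced which batch of images.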
Let’s say I want to make a riff on the Mona Lisa. The problem is, it’s been done so often that it’s difficult for me to think of any creative thing that hasn’t been done before. Mona Lisa with a cigar? With Groucho Marx specs? As a punk? Done, done, and done.
Maybe PromptPerfect can throw in some new ideas? Let’s see the prompt it creates:
Reimagined Mona Lisa, Leonardo da Vinci, enigmatic smile, gentle gaze, elegant attire, serene landscape, intricate artwork | surrealism | cubism | double exposure effect | overlapping elements | inconsistencies | depth and realism | Salvador Dalí and Pablo Picasso inspired fusion | high-resolution digital painting | thought-provoking interpretations and discussions about Mona Lisa's identity.
I like to take pictures of my puppy, Mister Whifflekins:
Absolutely certainly my non-fictional puppy who is totally real and not an image I googled
For his birthday I want to make a chocolate cake with his picture on it (dogs eat cake, right? I’m sure it’ll be fine). But I don’t want a boring picture of him. I want something that reflects his creativity, daringness, and intense personality. So I can feed his image to PromptPerfect and see what we get:
Playful white poodle in a park, surrounded by lush greenery and vibrant flowers, wagging its tail, exploring surroundings, fur glistening in sunlight, eyes sparkling with excitement, puzzle piece shapes overlapping with poodle as centerpiece, joyful demeanor and lively energy captivating attention, intricate details and vivid colors enhancing composition, inspired by Thomas Kinkade's style
And in Midjourney that becomes:
I’m sure Mister Whifflekins will be overjoyed to see himself in Thomas Kinkade’s jigsaw form!
Perfect your prompts now
Article by Alex C-G of Jina AI. Originally posted here and reposted with permission.