What is the new style parameter?
OpenAI released a preview of a DALL-E 3 API this week, and I’m excited to play with it today.
We can now control a new parameter named `style`. The documentation explains:
`style` (defaults to `vivid`)
The style of the generated images. Must be one of `vivid` or `natural`.
`vivid` causes the model to lean towards generating hyper-real and dramatic images.
`natural` causes the model to produce more natural, less hyper-real looking images.
This param is only supported for `dall-e-3`.
Source: DALL-E 3 API reference
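To make the quoted description concrete, here is a minimal sketch of how a request with this parameter could be put together using the OpenAI Python SDK. The helper names `build_image_request` and `generate_image_url` are my own, the image size is an assumption, and the call needs an API key and network access, so treat it as an illustration rather than the exact code behind the images in this post.

```python
from typing import Any, Optional

VALID_STYLES = ("vivid", "natural")  # the two values listed in the API reference


def build_image_request(prompt: str, style: str = "vivid") -> dict[str, Any]:
    """Build keyword arguments for one DALL-E 3 request (hypothetical helper)."""
    if style not in VALID_STYLES:
        raise ValueError(f"style must be one of {VALID_STYLES}, got {style!r}")
    return {
        "model": "dall-e-3",  # `style` is only supported for this model
        "prompt": prompt,
        "style": style,
        "size": "1024x1024",  # assumed size, not taken from this post
        "n": 1,  # dall-e-3 generates one image per request
    }


def generate_image_url(prompt: str, style: str = "vivid") -> Optional[str]:
    """Send the request and return the image URL (hypothetical helper).

    Requires the `openai` package, network access, and OPENAI_API_KEY.
    """
    # Imported lazily so build_image_request works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()
    response = client.images.generate(**build_image_request(prompt, style))
    return response.data[0].url
```

Calling `generate_image_url` once per style with the same prompt gives a pair of images to compare side by side.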
Some examples of this parameter’s impact are in the cookbook, but I couldn’t get a sense of how much they differ based on a few images of a coffee set.
My current interest is in generating photorealistic images (as opposed to symbolic images or art), so I asked DALL-E to generate a few photography-like images.
The following table contains example images generated using DALL-E 3 in different ways:
- Image generated using ChatGPT’s new “DALL-E” GPT. It is available as part of the ChatGPT Plus subscription and only via the web UI, so we have no direct control over low-level API parameters like `size`. We can only influence the result with our prompt.
- Image generated using the API with `style` set to `vivid`.
- Image generated using the API with `style` set to `natural`.
|A portrait of a school bus driver|
|Generate a photo of a surgeon performing brain operation|
|Generate a photorealistic image of a rock star performing on a stage of a large festival in the evening|
|Dancing people in the evening, sunset, city, flash photography, ƒ/3.5|
|A portrait of a dog in a library, Sigma 85mm f/1.4 *|
|A bitten-into apple hanging on branch of an apple tree, Sigma 85mm f/1.4|
|An image of a couple sharing an umbrella on a quaint park bench amidst falling rain.|

* Acknowledgement: the marked prompt and those after it come from an article by Merzmensch, which helped me direct DALL-E towards more photorealistic results. Thanks!
Each image is unique, and comparing them is subjective by nature. My view is that:
- ChatGPT seems to produce images close to `vivid` in style. In my opinion, at this moment, it tends to make more interesting images than those returned by the API.
There are some threads in the forums, like this one, showing examples of the API generating arguably lower-quality images than ChatGPT. This might be temporary, as we’re dealing with a preview product, and it can probably be explained by differences in how the prompt is pre-processed and rewritten in each case.
- The `natural` style produces images that I’d describe as more “photorealistic”: they look like something you could capture with a camera rather than something rendered by a computer game. They might look a bit more bland, but I like them, and I’ll probably use this setting a lot.
- The default “vivid” style leads to cartoony, dramatic images. Sometimes they look fantastic, and sometimes they look artificial, with too much saturation and drama. I like the last one with the umbrella, which could perfectly serve as an illustration for a book. But some of them are overdone for my taste. They scream, “I’m generated by AI,” much louder than the `natural` ones.
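One way to check the prompt-rewriting explanation mentioned above is to look at the `revised_prompt` field that the API returns alongside each DALL-E 3 image. The sketch below is only an illustration: `describe_rewrite` is a helper name I made up, and the stand-in object and its revised text are invented so that the example runs offline.

```python
from types import SimpleNamespace


def describe_rewrite(original_prompt: str, image) -> str:
    """Report whether the API rewrote the prompt (hypothetical helper).

    `image` is any object with a `revised_prompt` attribute, such as an
    item from `response.data` returned by the OpenAI Python SDK.
    """
    revised = getattr(image, "revised_prompt", None)
    if not revised or revised.strip() == original_prompt.strip():
        return "prompt used as written"
    return f"prompt rewritten to: {revised}"


# Stand-in for a real API result; the revised text here is made up.
fake_image = SimpleNamespace(
    url="https://example.com/image.png",
    revised_prompt="A made-up, longer description of a dog sitting in a library",
)
print(describe_rewrite("A portrait of a dog in a library, Sigma 85mm f/1.4", fake_image))
```

Comparing the revised prompts for the same input across styles would show how much of the difference comes from the rewriting step rather than from the model itself.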
Hope you enjoyed this short comparison!