I tested Google’s fancy image generator and quickly found its limits


Imagen 3 is Google’s AI image generator, which was announced back in May at the company’s I/O developer conference. It launched in a limited capacity in the US in August but became available to free Gemini users last month. I’ve been using it ever since to create all sorts of images, and while it’s an impressive tool overall, it does have several limitations that hinder the overall experience.

Here’s where Imagen 3 struggles

The first limit to be aware of is that you can’t generate images of people, at least with a free Gemini account. This doesn’t just apply to creating images of famous people, which not many image-generating tools allow for anyway, but people in general. So a prompt like, “create an image of two random people dancing” will not return any results. For reference, ChatGPT also has this limit in place for its free tier.


However, you can create images of people, excluding famous ones, if you opt for a Gemini Advanced subscription. I tried it out, and it’s hit-and-miss. While it can generate images so realistic that it’s hard to tell whether they are AI-generated, the results are sometimes subpar. Check out the two examples below. The one on the left comes across as very realistic and looks like it was taken by a professional photographer, while the other one just looks cartoonish. Even when I prompted the tool multiple times to make the photo more realistic, the changes it made were minimal.

Speaking of professional photographers, let’s move on to the second limit or issue I see with Imagen 3. Even when producing a realistic image, whether of a person, an animal, or an object, the result looks professional instead of casual. Every image is picture-perfect, with the bokeh effect frequently added to make it look more appealing. Every picture Imagen 3 creates looks like it was heavily edited, which is fine if that’s the look you’re going for, but having the ability to make images look more casual would be great.

I think the best photos are sometimes the raw ones: the unedited shots you took without much thought, when the lighting wasn’t perfect and the people you captured didn’t even know you had snapped a photo. That’s where Imagen 3 struggles, although it’s worth mentioning that this is true for almost every AI image generator out there.

This brings me to the third major issue with Imagen 3, which is editing the images it creates. If I create a funny image of a cat wearing a hat and eating a popsicle and then want to edit it with an additional prompt, Imagen 3 will create a brand new image in Gemini. So, for example, if I like the image but just want to change the color of the hat from black to blue, the tool will generate a new image altogether with a blue hat instead of changing only the hat’s color and leaving everything else as is. Granted, the new image does look relatively similar to the old one when using the right prompt, but it’s still not the same, which is not ideal. This makes it impossible to edit a picture to perfection, since every additional prompt generates a new image. Check out the example below and see for yourself.

Another issue is that I can’t change the aspect ratio. Images are created in a 1:1 aspect ratio by default and can’t be modified. If I prompt the tool to change it to 16:9, Gemini just says it will but then generates a new image with the same aspect ratio. However, it looks like this will change soon, as the ability to change the aspect ratio is already in the works.

Limits aside, Imagen 3 is great

Let me just make it clear that I’m not trying to bash Google’s fancy AI image generator. I just want to highlight the limits I ran into while testing it so that you know what to expect. Limits aside, Imagen 3 is actually a very impressive tool. I’ve tried out a few of its rivals as well, and while each AI image generator has its pros and cons, I’d say Imagen 3 is among the best ones out there. My colleague Calvin agrees. He compared the tool against rivals and found that it’s the best one out there in terms of quality.


When Imagen 3 just gets it right, the results are outstanding. Images of animals, cities, people, and anything else for that matter come out great — if you can live with a photoshopped look. Don’t take my word for it. Take a look at the gallery below to see for yourself. And keep in mind that we’re still in the early stages of AI-generated content, so just imagine what the software will be able to do a few years down the line.

Other limits to be aware of

Aside from the inability to generate images of people as a free user, these are the limits I didn’t expect to run into while testing the tool. There are other limits in place as well, which Google clearly states on its website. It’s worth listing them out so that you know what to expect.

Imagen 3 will not create an image it deems inappropriate, even with a paid plan. That includes pictures related to violence, harassment, sex, discrimination, and the like. This also goes for images that encourage dangerous activity and those with harmful factual inaccuracies that would pose a risk to someone’s safety.

These are all appropriate limits, and most of the big AI image-generating tools have them in place, not counting FLUX.1 used by Grok.
