Every cent you spend on this, remember: The people who made this possible are not even getting a millionth of a cent for every billion USD made with it (they are getting nothing). Same with code; that code you spent years poring over, fixing, etc. is now how these companies make so much money and get so much investment. It's like open source, except you get shafted.
minimaxiryesterday at 11:19 PM
So during my Nano Banana Pro experiments I wrote a very fun prompt that tests the ability for these image generation models to follow heuristics, but still requires domain knowledge and/or use of the search tool:
The NBP result is here, which got the numbers, corresponding Pokemon, and styles correct, with the main point of contention being that the style application is lazy and that the images may be plagiarized: https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:oxaerni...
It did more inventive styles for the images that appear to be original, but:
- The style logic is applied by row, not by the raw numbers, and is therefore wrong
- Several of the Pokemon are flat-out wrong
- Number font is wrong
- Bottom isn't square for some reason
Odd results.
parastitoday at 6:04 AM
A great technical achievement, for sure, but this is kind of the moment where it enters uncanny valley to me. The promo reel on the website makes it feel like humans doing incredible things (background music intentionally evokes that emotion), but it's a slideshow of computer generated images attempting to replicate the amazing things that humans do. It's just crazy to look at those images and have to consciously remind myself - nobody made this, this photographed place and people do not exist, no human participated in this photo, no human traced the lines of this comic, no human designer laid out the text in this image. This is a really clever amalgamation machine of human-based inputs. Uncanny valley.
simonwyesterday at 7:26 PM
I've been trying out the new model like this:
OPENAI_API_KEY="$(llm keys get openai)" \
uv run https://tools.simonwillison.net/python/openai_image.py \
-m gpt-image-2 \
"Do a where's Waldo style image but it's where is the raccoon holding a ham radio"
Here's what I got from that prompt. I do not think it included a raccoon holding a ham radio (though the problem with Where's Waldo tests is that I don't have the patience to solve them for sure): https://gist.github.com/simonw/88eecc65698a725d8a9c1c918478a...
vunderbayesterday at 8:32 PM
OpenAI's gpt-image-1.5 and Google's NB2 have been pretty much neck and neck on my comparison site which focuses heavily on prompt adherence, with both hovering around a 70% success rate on the prompts for generative and editing capabilities. With the caveat being that Gemini has always had the edge in terms of visual fidelity.
That being said, gpt-image-1.5 was a big leap in visual quality for OpenAI and eliminated most of the classic issues of its predecessor, including things like the "piss filter."
I'll update this comment once I've finished running gpt-image-2 through both the generative and editing comparison charts on GenAI Showdown.
Since the advent of NB, I've had to ratchet up the difficulty of the prompts especially in the text-to-image section. The best models now score around 70%, successfully completing 11 out of 15 prompts.
For reference, here's a comparison of ByteDance, Google, and OpenAI on editing performance:
gpt-image-2 has already managed to overcome one of the so-called "model killers" on the test suite: the nine-pointed star.
Results are in for the generative (text to image) capabilities: Gpt-image-2 scored 12 out of 15 on the text-to-image benchmark, edging out the previous best models by a single point. It still fails on the following prompts:
- A photo of a brightly colored coral snake but with the bands of color red, blue, green, purple, and yellow repeated in that exact order.
- A twenty-sided die (D20) with the first twenty prime numbers (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71) on the faces.
- A flat earth-like planet which resembles a flat disc is overpopulated with people. The people are densely packed together such that they are spilling over the edges of the planet. Cheap "coastal" real estate property available.
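When grading outputs for the D20 prompt above, the expected face values can be regenerated instead of retyped. This is a minimal sketch, assuming plain trial division is good enough for a list this short; `first_primes` is a hypothetical helper name and not part of any test suite mentioned in the thread.

```python
def first_primes(count):
    """Return the first `count` primes using trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no smaller prime up to its square root divides it.
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


# Expected labels for the twenty faces of the die.
d20_faces = first_primes(20)
print(d20_faces)
```

Comparing this list against the numbers read off a generated die gives a quick pass/fail for that prompt.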
Here is my regular "hard prompt" I use for testing image gen models:
"A macro close-up photograph of an old watchmaker's hands carefully replacing a tiny gear inside a vintage pocket watch. The watch mechanism is partially submerged in a shallow dish of clear water, causing visible refraction and light caustics across the brass gears. A single drop of water is falling from a pair of steel tweezers, captured mid-splash on the water's surface. Reflect the watchmaker's face, slightly distorted, in the curved glass of the watch face. Sharp focus throughout, natural window lighting from the left, shot on 100mm macro lens."
Ran a bunch both on the .com and via the api, none of them are nearly as good as Nano Banana.
(My file share host used to be so good and now it's SO BAD. I've re-hosted with them for now; I'll update to a Google Drive link shortly.)
swalshyesterday at 10:25 PM
Been using the model for a few hours now. I'm actually really impressed with it. This is the first time I've found value in an image model for stuff I actually do. I've been using it to build PowerPoint slides and mockups. It's CRAZY good at that.
madroxyesterday at 11:06 PM
This seems like a great time to mention C2PA, a specification for positively affirming image sources. OpenAI participates in this, and if I load an image I had AI generate in a C2PA Viewer it shows ChatGPT as the source.
Bad actors can strip sources out so it's a normal image (that's why it's positive affirmation), but eventually we should start flagging images with no source attribution as dangerous the way we flag non-https.
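The "positive affirmation" point can be illustrated with a rough check. This is only a sketch, and it rests on an assumption worth verifying against the C2PA specification: that an embedded manifest store carries the ASCII label `c2pa` somewhere in the file bytes. The function names are hypothetical. It does not validate any signature, so a hit is only a hint, and a miss says nothing about where the image came from.

```python
def looks_like_it_has_c2pa(data: bytes) -> bool:
    """Crude heuristic: report whether the ASCII label 'c2pa' appears in the bytes."""
    return b"c2pa" in data


def file_looks_like_it_has_c2pa(path) -> bool:
    """Apply the same heuristic to a file on disk."""
    with open(path, "rb") as handle:
        return looks_like_it_has_c2pa(handle.read())


# Synthetic stand-ins for illustration, not real image files.
with_label = b"\xff\xd8 header bytes jumb c2pa manifest bytes"
without_label = b"\xff\xd8 plain image bytes"
print(looks_like_it_has_c2pa(with_label))
print(looks_like_it_has_c2pa(without_label))
```

A real check needs a C2PA-aware tool such as the viewer mentioned above, because only signature validation ties the manifest to a source.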
The improvement in Chinese text rendering is remarkable and impressive! I still found some typos in the Chinese sample pic about Wuxi though. For example the 笼 in 小笼包 was written incorrectly. And the "ęå°äøęä¹ęø ę°åÆčÆ»" section contains even more typos although it's still legible. Still, truly amazing progress. Vastly better than any previous image generation model by a large margin.
schneehertztoday at 1:05 AM
Generating a 4096x4096 image with gemini-3.1-flash-image-preview consumes 2,520 tokens, which is equivalent to $0.151 per image.
Generating a 3840x2160 image with gpt-image-2 consumes 13,342 tokens, which is equivalent to $0.4 per image.
This model is more than twice as expensive as Gemini.
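The arithmetic behind that comparison can be spelled out. The sketch below uses only the four figures quoted in this comment (resolution, tokens, and dollars per image); none of it comes from an official price sheet, and the helper names are made up for illustration. It also normalizes by pixel count, since the two images are not the same size.

```python
# Figures quoted in the comment above; not taken from any price sheet.
gemini = {"width": 4096, "height": 4096, "tokens": 2520, "usd": 0.151}
gpt = {"width": 3840, "height": 2160, "tokens": 13342, "usd": 0.40}


def usd_per_megapixel(image):
    """Cost divided by output size in millions of pixels."""
    megapixels = image["width"] * image["height"] / 1_000_000
    return image["usd"] / megapixels


def usd_per_million_tokens(image):
    """Per-token price implied by the quoted token count and cost."""
    return image["usd"] / image["tokens"] * 1_000_000


per_image_ratio = gpt["usd"] / gemini["usd"]
per_megapixel_ratio = usd_per_megapixel(gpt) / usd_per_megapixel(gemini)

print(f"per image:     {per_image_ratio:.2f}x")
print(f"per megapixel: {per_megapixel_ratio:.2f}x")
print(f"implied $/1M tokens: gemini {usd_per_million_tokens(gemini):.2f}, "
      f"gpt {usd_per_million_tokens(gpt):.2f}")
```

On these numbers the per-image gap is a bit over 2.6x, and because the Gemini image has roughly twice as many pixels, the per-megapixel gap comes out above 5x. The implied per-token price is actually lower for gpt-image-2; it simply uses far more tokens per image.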
bsenftnertoday at 11:41 AM
My problem with all of this is the terrible education everyone has: they cannot discriminate images from art, nor art from communications, and if they could they would realize the points this entire debate hinges on are a manipulation to create people that will not help themselves with the latest technologies. But explaining it causes people to get angry, because they either think I'm trying to manipulate them, or they fall into despair when they realize the magnitude of this crime.
Oarchtoday at 10:12 AM
Every groundbreaking new AI release feels like a volley of cannonfire towards the soul. Oof.
TrackerFFtoday at 9:40 AM
This is the first model I've used for mockups where I feed reference images, and they truly look real and good enough for pro use. I'm impressed.
dktpyesterday at 8:13 PM
One interesting thing I found comparing OpenAI and Gemini image editing is - Gemini rejects anything involving a well known person. Anything. OpenAI is happy to edit and change every time I tried
I have a side project where I want to display standup comedies. I thought I could edit standup comedy posters with some AI to fit my design. Gemini straight up refuses to change any image of any standup comedy poster involving a well-known human. OpenAI does not care and is happy to edit away
This is not as exciting as previous models were, but it is incredibly good. I am starting to think that expressing thoughts in words clearly is probably the most important and general skill of the future.
louiereedersonyesterday at 7:43 PM
The image of the messy desktop with the ASCII art is so impressive - the text renders, the date is consistent, it actually generated ASCII art in "ChatGPT", etc. I was skeptical that it was cherry-picked but was able to generate something very similar and then edit particular parts on the desktop (i.e. fixing content in the browser window and making the ASCII dog "more dog like"). It's honestly astounding, to me at least.
____tom____yesterday at 7:58 PM
No mention of modifying existing images, which is more important than anything they mentioned.
I think we all know the feeling of getting an image that is ok, but needs a few modifications, and being absolutely unable to get the changes made.
It either keeps coming up with the same image, or gives you a completely new take on the image with fresh problems.
Anyone know if modification of existing images is any better?
This is insanely good. But wow, prompting to get any one of these images is way more complicated than prompting Claude Code. There is a ton of vocabulary that comes with it relating to the camera, the lighting, the mood etc.
throwaway2027yesterday at 7:11 PM
I know people like to dunk on ChatGPT and Gemini and say Claude is or used to be better, but you can still use worse models when you're out of usage AND make use of Nano Banana and ChatGPT Image generation with separate limits for your subscription. I think it could make it more of a package as a whole for some people (non-programmers). I do like having the option and am excited for what improvements they've made to ChatGPT Image generation, because in the past it had this yellow piss filter, and 1.5 sort of fixed it but made things really generic, with Nano Banana beating it (although Gemini also had a too aggressively tuned racial bias which they fixed). It seems the images ChatGPT generates have gotten better.
sanextoday at 2:09 AM
Having the launch website just scrollable generated images is so slick. I love this.
mercaconatoday at 10:40 AM
Every improvement in image generation seems to reduce the value of the images themselves. When anything can be faked or created in seconds, what is an image really worth? With text or code, you can dig into a meaningful dialogue because their reality is digital too. But images become like the generic people who show up in photo frames.
I guess it's just a completely personal feeling.
overgardyesterday at 11:11 PM
Pretty mixed feelings on this. From the page at least, the images are very good. I'd find it hard to know that they're AI. Which I think is a problem. If we had a functioning congress, I wonder if we might end up with legislation that these things need to be watermarked or otherwise made identifiable as AI generated.
I also don't like that these things are trained on specific artists' styles without really crediting those artists (or even getting their consent). I think there's a big difference between an individual artist learning from a style or paying it homage, vs a machine just consuming it so it can create endless art in that style.
super256today at 10:26 AM
I tried using it for creating 2D logos, which many tools suck at (except mid journey).
Looks like ChatGPT Images 2 is now good at this too!
squidsoupyesterday at 11:20 PM
Are camera manufacturers working on signed images? That seems like the only way our trust in any digital media doesn't collapse entirely.
lossyalgoyesterday at 11:30 PM
Someone remind me again why this is a good idea to be able to create perfect fake images?
joegibbsyesterday at 9:16 PM
The quality of the text is really impressive and I can't seem to see any artefacts at all. The fake desktop is particularly good: Nano Banana would definitely slip up with at least a few bits of the background.
bensyversonyesterday at 7:22 PM
I caught the last minute of this. Was it just ChatGPT Images 2.0?
thelucentyesterday at 8:46 PM
It seems to still have this gpt image color that you can just feel. The slight sepia and softness.
rambojohnsontoday at 8:40 AM
Just tried it and got six fingers and half a thumb on a simple portrait. Mickey Mouse stuff.
Was this an oversight? Or did their new image generation model generate an image that was essentially a copy of an existing image?
nickandbroyesterday at 10:54 PM
200+ points in Arena.ai, that's incredible. They are cleaning house with this model
kibibuyesterday at 9:28 PM
Genuine question: what positive use cases are sufficient to accept the harm from image generators?
One that I can think of:
- replacing photography of people who may be unable to consent or for whom it may be traumatic to revisit photographs and suitable models may not be available, e.g. dementia patients, babies, examples of medical conditions.
Most other vaguely positive use cases boil down to "look what image generators can do", with very little "here's how image generators are necessary for society".
On the flip side, there are hundreds of ways that these tools cause genuine harm, not just to individuals but to entire systems.
JimsonYangtoday at 12:00 AM
> you can make your own mangas
No you can't.
You still have the Studio Ghibli look from the video. The issue with generating manga was the quality of the characters; there's multiple software to place your frames.
But I am hopeful. If I put in a single frame, can it carry over that style for the next images? It would be game changing if a chat could have its own art style
Orasyesterday at 10:40 PM
My test for image models is asking it to create an image showing chess openings. Both this model and Banana pro are so bad at it.
While the image looks nice, the actual details are always wrong, such as showing pawns in wrong locations, missing pawns, etc.
Try it yourself with this prompt: Create a poster to show opening game for Queen's Gambit to teach kids to play chess.
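To judge a generated poster for that prompt, it helps to have the reference position on hand. The Queen's Gambit is reached after 1. d4 d5 2. c4. The sketch below is a bare-bones board with no move legality checks, written only to print where the pawns should be; all function names are illustrative.

```python
def starting_board():
    """Return the standard starting position as a dict of square -> piece letter."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, file in enumerate("abcdefgh"):
        board[f"{file}1"] = back_rank[i]          # white pieces are uppercase
        board[f"{file}2"] = "P"
        board[f"{file}7"] = "p"
        board[f"{file}8"] = back_rank[i].lower()  # black pieces are lowercase
    return board


def move(board, src, dst):
    """Move whatever sits on src to dst, without any legality checks."""
    board[dst] = board.pop(src)


def render(board):
    """Return a text diagram with rank 8 at the top, as on a printed chess diagram."""
    rows = []
    for rank in "87654321":
        squares = " ".join(board.get(file + rank, ".") for file in "abcdefgh")
        rows.append(f"{rank} {squares}")
    rows.append("  a b c d e f g h")
    return "\n".join(rows)


board = starting_board()
# Queen's Gambit: 1. d4 d5 2. c4
for src, dst in [("d2", "d4"), ("d7", "d5"), ("c2", "c4")]:
    move(board, src, dst)
print(render(board))
```

In the correct position White has pawns on c4 and d4, Black has a pawn on d5, and all 32 pieces are still on the board, which makes missing or misplaced pawns in a generated image easy to spot.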
samiwamiyesterday at 7:22 PM
do they have anything similar to SynthID, or are they just pretending that problem doesn't exist?
I know this is probably mega cherry-picked to look more impressive, but some of the images are terrifyingly realistic. They seem to have put a lot of effort into the lighting.
baalimagotoday at 6:41 AM
"Benchmarks" aside, does anyone actually use these image models for anything?
rambojohnsontoday at 8:41 AM
Just tried it and got the usual six fingers, and half a thumb. What are they actually iterating on with these models by now…
c16today at 9:55 AM
That video seems like it was made for the tiktok generation. Slow down.
RigelKentaurusyesterday at 8:48 PM
If every single image on their blog was generated by Images 2.0 (I've no reason to believe that's not the case), then wow, I'm seriously impressed. The fidelity to text, the photorealism, the ability to show the same character in a variety of situations (e.g. the manga art) -- it's all great!
codebolttoday at 7:18 AM
Anyone test it out for generating 2D art for games? Getting Nano Banana to generate consistent sprite sheets was seemingly impossible last time I tried a few months ago.
platinumradyesterday at 11:41 PM
Why do all of the cartoons still look like that? Genuinely asking.
mrzhangbotoday at 10:55 AM
I'm exhausted. I've developed many products, but most of them were abandoned halfway through.
modelessyesterday at 10:25 PM
Can it generate transparent PNGs yet?
elAhmotoday at 9:19 AM
I am super out of the loop here, what happened with Dall-E?
PDF_Geektoday at 8:34 AM
The free tier for ChatGPT feels pretty much nerfed at this point. I'm barely getting 10 prompts in before it drops me down to the basic model. The restrictions are getting ridiculous. Is anyone else seeing this?
tezzatoday at 7:22 AM
I've rushed out my standardised quality check images for gpt-image-2:
gpt-image-2 has a lot more action, especially in the Apple Cart images.
VA1337today at 9:25 AM
So is it better than nano-banana after all?
jumploopstoday at 4:28 AM
Looks like analog clocks work well enough now, however it still struggles with left-handed people.
Overall, quite impressed with its continuity and agentic (i.e. research) features.
mvkeltoday at 2:43 AM
I wonder if this confirms version 1 of some kind of "world model."
It has an unprecedented ability to generate the real thing (for example, a working barcode for a real book)
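One cheap sanity check on a claim like this is the check digit. The comment does not say which barcode format was generated, so this sketch assumes a book barcode encoding an ISBN-13 (an EAN-13 number); the sample ISBN is a commonly used textbook example and is not taken from the thread.

```python
def is_valid_isbn13(code):
    """Return True if the 13 digits in `code` satisfy the EAN-13 check-digit rule."""
    digits = [int(ch) for ch in code if ch in "0123456789"]
    if len(digits) != 13:
        return False
    # Weights alternate 1, 3, 1, 3, ... across the first twelve digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    check = (10 - total % 10) % 10
    return check == digits[12]


print(is_valid_isbn13("978-0-306-40615-7"))  # True: check digit matches
print(is_valid_isbn13("978-0-306-40615-8"))  # False: last digit altered
```

This only confirms that the printed number is internally consistent. Whether the bars actually scan, and whether the number belongs to the book shown, still has to be checked with a scanner and a catalog lookup.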
naseemali925today at 5:47 AM
It's amazingly good at creating UI mockups. Been trying this to create UI mockups for ideas.
thevinteryesterday at 7:34 PM
Every time a new image gen comes out I keep saying that it won't get better just to be surprised again and again. Some of the examples are incredible (and incredibly scary. I feel like this is truly the point where understanding if something is AI becomes impossible)
souravroy78today at 11:08 AM
Cool!
franzetoday at 12:15 AM
the tragedy of image generating ai is that it is used to massively create what already exists instead of creating something truly unique - we need ai artists - and yeah, they will not be appreciated
aledevvtoday at 9:36 AM
Only vintage-style images?
etothetyesterday at 10:34 PM
I would love to see prompt examples that created the images on the announcement page.
vunderbatoday at 1:42 AM
I decided to run gpt-image-2 on some of the custom comics I've come up with over the years to see how well it would do, since some of them are pretty unusual. Overall, I was quite impressed with how faithfully it adhered to the prompts, given that multi-panel stuff has to maintain a sense of continuity.
Was surprised to see it be able to render a decent comic illustrating an unemployed Pac-Man forced to find work as a glorified pie chart in a boardroom of ghosts.
I wonder if this will be decent at creating sprite frame animations. So far I've had very poor results and I've had to do the unthinkable and toil it out manually.
james2doyleyesterday at 11:58 PM
In the next round of ChatGPT advertisements, if they don't use AI generated images, then that means they don't believe in their own product right?
...buuuuuuuuut the price per image has changed. For a high quality image generation the 1024x1024 price has increased? That doesn't make sense that a 1024x1024 is cheaper than a 1024x1536, so assuming a typo: https://developers.openai.com/api/docs/guides/image-generati...
The submitted page is annoyingly uninformative, but from the livestream it purports to have the same exact features as Gemini's Nano Banana Pro. I'll run it through my tests once I figure out how to access it.
lifeisstillgoodyesterday at 11:12 PM
Pretty much all of the kerfuffle over AI would go away if it was accurately priced.
After 2008 and 2020, vast amounts of money (tens of trillions) were printed (reasonably) by Western governments and not eliminated from the money supply. So there are vast sums swilling about, funding things like using massively computationally intensive work to help me pick a recipe for tonight.
Google and Facebook had online advertising sewn up - but AI is waaay better at answering my queries. So OpenAI wants some of that - but the cost per query must be orders of magnitude larger
So charge me, or my advertisers the correct amount. Charge me the right amount to design my logo or print an amusing cat photo.
Charge me the right cost for the AI slop on YouTube
Charge the right amount - and watch as people just realise it ain't worth it 95% of the time.
Great technology - but price matters in an economy.
kanodiaayushyesterday at 10:38 PM
It stands out to me that this page itself is wonderful to go through (the telling of the product through model generated images).
fizlebittoday at 5:35 AM
Scrolling through those images it just feels like intellectual theft on a massive scale. The only place I think you're going to get genuinely new ideas is from humans. Whether those humans use AI or not I don't care, but the repetitive slop of AI copying the creative output of humans I don't find that interesting. Call me a curmudgeon. I guess humans also create a lot of derivative slop even without AI assistance. If this leads somehow to nicer looking user interfaces and architecture maybe that is good thing. There are a lot of ugly websites, buildings and products.
dakiolyesterday at 10:30 PM
> On the flip side, there are hundreds of ways that these tools cause genuine harm, not just to individuals but to entire systems.
Yeah, agree. I think it's the first time I'm asking myself: Ok, so this new cool tech, what is it good for? Like, in terms of art, it's discarded (art is about humans); in terms of assets: sure, but people are getting tired of AI-generated images (and even if we cannot tell if an image is AI-generated, we can know if companies are using AI to generate images in general, so the appeal is decreasing). Ads? C'mon that's depressing.
What else? In general, I think people are starting to realize that things generated without effort are not worth spending time with (e.g., no one is going to read your 30-page draft generated by AI; no one is going to review your 500-file PR generated by AI; no one is going to be impressed by the images you generate by AI; same goes for music and everything). I think we are gonna see a Renaissance of "human-generated" sooner rather than later. I see it already at work (colleagues writing in slack "I swear the next message is not AI generated" and the like)
jcattletoday at 7:09 AM
Can we talk about how jarring the announcement video is?
AI generated voice over, likely AI generated script (You see, this model isn't just generating images, it's thinking!). From what it looks like only the editing has some human touch to it?
It does this Apple style announcement which everyone is doing, but through the use of AI, at least for me, it falls right into the uncanny valley.
agnishomtoday at 12:15 AM
I don't know how this benefits humanity. In what way was ChatGPT Images 1.0 not already good enough? Perhaps some new knowledge was created in the process?
cyberjunkietoday at 4:30 AM
Looks like AI and I look away from any image generated by an LLM. It's my easy internal filter to weed out everything that isn't art.
Melatonicyesterday at 8:21 PM
Can it generate anything high resolution at increased cost and time? Or is it always restricted?
jwpapiyesterday at 11:48 PM
Why is it all so asian?
XCSmeyesterday at 11:48 PM
Oh wow, scrolling through the page on mobile makes me dizzy
Melatonicyesterday at 8:13 PM
We were afraid it would be Skynet and instead we got the ultimate meme generator !
ChrisArchitectyesterday at 8:26 PM
Fake layouts, fake handwritten kid story, fake drunk photos? All from training on real things people did.
As with anything AI, we are not ready for the scale of impact. And for what? Like, why are you proud of this?
dazhbogyesterday at 11:05 PM
Yay, let's burn the planet computing more slopium..
StefanBatorytoday at 8:26 AM
Do you think those working at ChatGPT have ever wondered how they are contributing to dismantling democracy and ensuring nothing is true by now? The ultimate technological postmodernism.
RyanJohntoday at 2:58 AM
Oh my god, it's very nice!
tomchui157today at 7:24 AM
Img2+ seed dance 2 = image AGI
BohdanPetryshyntoday at 9:16 AM
Am I the only one for whom videos in OpenAI releases never load? Tried both Chrome and Safari
bitnovusyesterday at 7:21 PM
great obfuscation idea - hidden message on a grain of rice
apparenttoday at 12:13 AM
I find the video to be very annoying. Am I supposed to freeze frame 4x per second to be able to see whether the images are actually good? I've never before felt stressed watching a launch video.
retrac98yesterday at 8:14 PM
The page keeps crashing on my iPhone 17 Pro.
ibudialloyesterday at 10:42 PM
And here I was proud of myself, having taught my mom and her friends how to discern real from fakes they get on WhatsApp groups. Another even more powerful tool for scammers. I'm taking a break.
gfodyyesterday at 10:50 PM
there's something funny going on with the live stream audio
szmarczakyesterday at 7:45 PM
Wow, the difference between AI and non-AI images collapses. I hate the future where I won't be able to tell the difference.
dahuangftoday at 8:01 AM
good job
Bennettheynyesterday at 8:52 PM
fal has the endpoint under openai/gpt-image-2
ieie3366yesterday at 8:11 PM
It's great. Also doesn't seem to have any "slop" standard look, the images it produces are quite diverse.
I would imagine this will hit illustrators / graphics designers / similar people very hard, now that anyone can just generate professional looking graphical content for pennies on the dollar.
throw310822yesterday at 8:21 PM
Ok, I can hear the sound of entire industries crumbling right now.
How hard is it to have a video player with a fucking volume toggle?
dzongayesterday at 10:59 PM
for video game assets this is massive.
but in general though - will people believe in anything photographic ?
imagine dating apps, photographic evidence.
I'm guessing we're gonna reach a point where - you fuck up things purposely to leave a human mark.
andaiyesterday at 11:30 PM
lol at the fake handwritten homework assignment. Know your customer!
davikryesterday at 11:53 PM
It definitely lost the characteristic slop look.
OutOfHeretoday at 12:38 AM
ChatGPT image generation is and has been horrific for the simple reason that it rejects too many requests. This hasn't changed with the new model. There are too many legal non-adult requests that are rejected, not only for edits, but also for original image generation. I'd rather pay to use something that actually works.
irishcoffeeyesterday at 11:28 PM
This is so stupid. As a free OSS tool it's amazing. Paying money for this is fucking stupid. How blind are we all to now before this tech?
rqa129yesterday at 7:38 PM
Can it generate Chibi figures to mask the oligarchy's true intentions on Twitter and make them more relatable?
volkkyesterday at 7:53 PM
the guys presenting are probably all like 25x smarter than I am but good god, literally 0 on screen presence or personality.
Suggest renaming this to "OpenAI Livestream: ChatGPT Images 2.0"
sho_hnyesterday at 7:49 PM
In 5 years and 3 months between DALL-E and Images 2.0 we've managed to progress from exuberant excitement to jaded indifference.
weldertoday at 12:27 AM
Introducing DeepFakes 2.0 /s
zb3yesterday at 7:24 PM
Image generation? Hmm, would be cool if OpenAI also made a video-generation model someday..
biosubterraneanyesterday at 10:58 PM
Oh no.
ai4thepeopleyesterday at 10:40 PM
Each day when my AI girlfriend wakes me up and shows me the latest news, I feel: This is it! We are living in a revolution!
Never before in history did humanity have the possibility of seeing a picture of a pack of wolves! The dearth of photographs has finally been addressed!
I told my AI girlfriend that I will save money to have access to this new technology. She suggested a circular scheme where OpenAI will pay me $10,000 per year to have access to this rare resource of 21st century daguerreotype.
green_wheeltoday at 4:11 AM
Well artists, you guys had a good run thank you for your service.
manishfptoday at 4:40 AM
Goated release tbh. The text work inside the images is nice
aliljetyesterday at 7:21 PM
I am hopeful that OpenAI will potentially offer clarity on their loss-leading subscription model. I'd prefer to know the real cost of a token from OpenAI as opposed to praying the venture-funded tokens will always be this cheap.
prvctoday at 3:46 AM
I hope they will consider releasing DALL-E 2 publicly, now that there has been so much progress since it was unveiled. It had a really nice vibe to it, so worth preserving.
tkgallytoday at 1:07 AM
I had it produce a two-page manga with Japanese dialogue. Nearly perfect:
Sam Altman, in his meeting with Tim Cook two and a half years ago: "Give me money. I think it'll take $150 billion." Tim Cook: "Well, here's what we're going to do, this is what I think it's worth…"
Later Google tried the same thing. Apple: "We will give you a $1 billion a year refund." What's changed in two and a half years?