Fundamentals of AI Art Generation

CBR JGWRR · Feb 19, 2025

Not sure; Aurora refuses to install whenever I've tried...

Falling Frontier does look interesting.

Chac1 · Feb 22, 2025

Jumping back into this conversation, if I may....

MacGowan said:
Not an expert by any means, more a graphic designer with an interest in AI. so my help would be more in that style.

Well, in my book that makes you an expert. I do not have professional graphics experience although I have worked on teams with graphic experts and asked for their help on projects. I have picked up some photo-editing skills in my time but rarely put those to work professionally. So I approach this as an amateur who likes images and also as someone who cobbled together my own learning of Photoshop. (Wish I had taken a class in that but I no longer have access to that program.)

MacGowan said:
- Leonardo and MJ look the best out of the box. rn those are the ones i use the most.

Thanks for the recommendation. I looked at Midjourney after what you wrote. No free version and doesn't seem right for me at the moment. May come back to it later. If I find a program that can lock in character images, that will be the one that keeps me. Also, nothing is as versatile as Playground AI was and I wonder why?

MacGowan said:
No problem! would love to see some more workflows here.

Yes, I think that is really the purpose of this thread. Some of us have shared some workflows. For those who are beginners, just getting some ideas about what can work is useful. Most folks have to understand that yes, you can create images by just using words, but often they need some polishing beyond that. Even the words need sharpening sometimes.

MacGowan · Feb 23, 2025

CBR JGWRR said:
Falling Frontier does look interesting.

Right? It has an interesting The Expanse look to it. And it's supposed to have planet customisation. No land battles though.

Chac1 said:
I do not have professional graphics experience although I have worked on teams with graphic experts and asked for their help on projects.

AI has definitely awakened the "artist spirit" in a lot of friends. I love how it's lowered the barriers to entry for them.

Chac1 said:
Thanks for the recommendation. I looked at Midjourney after what you wrote. No free version and doesn't seem right for me at the moment. May come back to it later. If I find a program that can lock in character images, that will be the one that keeps me. Also, nothing is as versatile as Playground AI was and I wonder why?

I think MJ is supposed to let you use a character reference? But I never really got it to work. Civitai has a lot of character models to use. You can also train your own, but that seems a little more complicated, and very tiresome if you're doing a lot of characters. Character consistency is at the top of my wishlist.

Chac1 said:
Yes, I think that is really the purpose of this thread. Some of us have shared some workflows. For those who are beginners, just getting some ideas about what can work is useful. Most folks have to understand that yes, you can create images by just using words, but often they need some polishing beyond that. Even the words need sharpening sometimes.

Yeah, it's still the wild west, and workflows seem to be very fluid still. Right now I have a working setup, but it's ever-changing with AI. There's always some site that shuts down or begins enshittification, or new tech that changes the landscape.

MichOrion · Mar 29, 2025

Hi all! I was invited here by Chac1—thanks again. I’ve been really enjoying the image generation conversations, and it’s awesome to see people using AI tools to support their storytelling worlds.

My biggest challenge has been visual consistency across cultures, characters, and long-term development. ChatGPT has helped a ton for writing and brainstorming, but I still need to guide image generation closely to keep things feeling like the same world.

My project is called Hollow Crown, a FATE-based TTRPG set in the year 2666, long after modern civilization has collapsed. The remnants of the U.S. have splintered into medieval-style kingdoms, folk religions, and warlords clinging to industrial ruins like saints to relics. Think Crusader Kings meets Twilight 2000, with Appalachian mystics, Revelationist preachers, and Deltaic rebels.

Image Showcase: “No Lords but the Living”

Here are a few examples we’ve generated, using a visual style guide to keep each culture consistent across prompts:

Count Lamar Cain of Memphis
Warlord of the Deltaic Rebellion. Known as The Lion, Lamar's forces draw strength from the Mississippi's edge, defiant and culturally proud.

Coat of Arms – House Cain
The chained fiddle: a symbol of rebellion, art, and resistance.

King Ellis Crockoone of Tenna
Crowned in Nash but unrecognized by the Holy Columbia Commonwealth. A Revelationist monarch trying to hold together a broken kingdom.

Lorekeeper’s Study – Riverlander Court
Medical, prophetic, and philosophical texts crowd the desks of Revelationist scholars.

Industrial Revival: A Shipyard Reclaimed
Post-collapse Detroit industry reborn through sheer will and labor. The Great Lakes are now a source of arms, armor, and ships for whoever claims them.

Detroit Market Day
Trade thrives in pockets of stability. Fine metalwork and mass-produced quality items from Michigan artisans are traded far and wide beyond Detroit.

Underground Faith: Secret Marks of the Veil
Used by secret Catholics of the Northeast to communicate.

Industrialist Cathedral in Detroit
Rebuilt from old-world ruins, this cathedral blends forgotten skyscrapers and Gothic piety.

Knights of the Bridge
The Circuit Riders of Michigan keep the vital trade roads open and safe for commerce. Shown here are warriors of the Saginaw Circuit Riders beside their famous bridge; note the distinct Michigander helms.

Image Generation Tips from Our Workflow

Theme-locked sessions: I never bounce between cultures in a single gen run—one image chat per faction or style.
Prompt seeding with lore: I open each image session by pasting in a short blurb or cultural traits from our docs. The AI does way better when it knows what it’s drawing.
Style guides per group: We use reference docs that define colors, symbols, facial hair norms, armor, material use, etc. Keeps things coherent.
Everything gets a caption: Even if the image’s use isn’t immediate, it goes into an archive with name, context, and origin.
Diegetic vs. metadiegetic art: Some images are “in-universe woodcuts” or banners. Others are clean textbook-style reconstructions. Both serve different functions.
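
For anyone who prefers to keep this kind of workflow in a script rather than in loose docs, here is a minimal sketch of the tips above: one style guide per faction, a prompt that opens with the lore blurb, and a caption record for the archive. Everything in it is an assumption made for illustration. The class, function, and field names are hypothetical and do not belong to any image platform; it only builds text that you would paste into whichever generator you use.

```python
import json
from dataclasses import dataclass, field


@dataclass
class StyleGuide:
    """Hypothetical per-faction style guide; field names are assumptions."""
    faction: str
    lore_blurb: str
    colors: list[str] = field(default_factory=list)
    symbols: list[str] = field(default_factory=list)
    armor: str = ""
    art_mode: str = "in-universe woodcut"  # or "textbook-style reconstruction"


def build_prompt(guide: StyleGuide, subject: str) -> str:
    """Put the lore first, then the style constraints, then the subject."""
    parts = [f"Setting notes: {guide.lore_blurb}", f"Faction: {guide.faction}."]
    if guide.colors:
        parts.append("Palette: " + ", ".join(guide.colors) + ".")
    if guide.symbols:
        parts.append("Symbols: " + ", ".join(guide.symbols) + ".")
    if guide.armor:
        parts.append(f"Armor and materials: {guide.armor}.")
    parts.append(f"Render as: {guide.art_mode}.")
    parts.append(f"Subject: {subject}")
    return "\n".join(parts)


def caption_record(guide: StyleGuide, name: str, context: str, origin: str) -> str:
    """Return one JSON line for the archive: name, context, and origin."""
    record = {
        "name": name,
        "faction": guide.faction,
        "context": context,
        "origin": origin,
        "art_mode": guide.art_mode,
    }
    return json.dumps(record, ensure_ascii=False)


if __name__ == "__main__":
    # Example values are taken from the showcase captions above.
    cain = StyleGuide(
        faction="Deltaic Rebellion",
        lore_blurb="Warlords along the Mississippi's edge, defiant and culturally proud.",
        symbols=["chained fiddle"],
    )
    print(build_prompt(cain, "Count Lamar Cain of Memphis, known as The Lion"))
    print(caption_record(cain, "Count Lamar Cain of Memphis",
                         "Warlord portrait", "AI image generator"))
```

Keeping one `StyleGuide` per faction mirrors the theme-locked sessions: a run only ever sees one culture's traits, and the JSON lines can be appended to a single archive file.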

If you’re working on anything similar—especially multi-faction worlds or anything pseudo-historical—I’d love to see what your workflow looks like. Always down to swap prompts, notes, or lore-building strategies.

Happy worldbuilding!

Bullfilter · Mar 29, 2025

Lots of good work there @MichOrion

MacGowan · Mar 29, 2025

MichOrion said:
Hi all! I was invited here by Chac1—thanks again. I’ve been really enjoying the image generation conversations, and it’s awesome to see people using AI tools to support their storytelling worlds.

My biggest challenge has been visual consistency across cultures, characters, and long-term development. ChatGPT has helped a ton for writing and brainstorming, but I still need to guide image generation closely to keep things feeling like the same world.

My project is called Hollow Crown, a FATE-based TTRPG set in the year 2666, long after modern civilization has collapsed. The remnants of the U.S. have splintered into medieval-style kingdoms, folk religions, and warlords clinging to industrial ruins like saints to relics. Think Crusader Kings meets Twilight 2000, with Appalachian mystics, Revelationist preachers, and Deltaic rebels.

Image Showcase: “No Lords but the Living”

Here are a few examples we’ve generated, using a visual style guide to keep each culture consistent across prompts:

Count Lamar Cain of Memphis
Warlord of the Deltaic Rebellion. Known as The Lion, Lamar's forces draw strength from the Mississippi's edge, defiant and culturally proud.

Coat of Arms – House Cain
The chained fiddle: a symbol of rebellion, art, and resistance.

King Ellis Crockoone of Tenna
Crowned in Nash but unrecognized by the Holy Columbia Commonwealth. A Revelationist monarch trying to hold together a broken kingdom.

Lorekeeper’s Study – Riverlander Court
Medical, prophetic, and philosophical texts crowd the desks of Revelationist scholars.

Industrial Revival: A Shipyard Reclaimed
Post-collapse Detroit industry reborn through sheer will and labor. The Great Lakes are now a source of arms, armor, and ships for whoever claims them.

Detroit Market Day
Trade thrives in pockets of stability. Fine metalwork and mass-produced quality items from Michigan artisans are traded far and wide beyond Detroit.

Underground Faith: Secret Marks of the Veil
Used by secret Catholics of the Northeast to communicate.

Industrialist Cathedral in Detroit
Rebuilt from old-world ruins, this cathedral blends forgotten skyscrapers and Gothic piety.

Knights of the Bridge
The Circuit Riders of Michigan keep the vital trade roads open and safe for commerce. Shown here are warriors of the Saginaw Circuit Riders beside their famous bridge; note the distinct Michigander helms.

Image Generation Tips from Our Workflow

Theme-locked sessions: I never bounce between cultures in a single gen run—one image chat per faction or style.

Prompt seeding with lore: I open each image session by pasting in a short blurb or cultural traits from our docs. The AI does way better when it knows what it’s drawing.

Style guides per group: We use reference docs that define colors, symbols, facial hair norms, armor, material use, etc. Keeps things coherent.

Everything gets a caption: Even if the image’s use isn’t immediate, it goes into an archive with name, context, and origin.

Diegetic vs. metadiegetic art: Some images are “in-universe woodcuts” or banners. Others are clean textbook-style reconstructions. Both serve different functions.

If you’re working on anything similar—especially multi-faction worlds or anything pseudo-historical—I’d love to see what your workflow looks like. Always down to swap prompts, notes, or lore-building strategies.

Happy worldbuilding!

This is really cool. I've been trying to think of a medieval worldbuilding concept. but it's always just GoT or LoTR so i never start. This has a pretty cool ground concept and execution.

Oh, has anyone tried the new ChatGPT4o Image generator that is sweeping the web? it is by far the best right now, blew my mind.

MichOrion · Mar 29, 2025

MacGowan said:
This is really cool. I've been trying to think of a medieval worldbuilding concept. but it's always just GoT or LoTR so i never start. This has a pretty cool ground concept and execution.

Oh, has anyone tried the new ChatGPT4o Image generator that is sweeping the web? it is by far the best right now, blew my mind.

Thank you! That's what I'm using for the images here: Hollow Crown: No Lords but the Living

Chac1 · Apr 2, 2025

First, thanks for taking up the invite @MichOrion . Hollow Crown is quite interesting so far. Thanks for restoring some of the images in that thread. The previews here of what you have on your work bench are also intriguing. I wonder how much of this springs from the mod and how much of it comes from your own imagination?

Either way, I like the style of what you are deploying. I have followed a similar route, starting with woodcuts and branching out from there. I sometimes feel I should go back to using more woodcuts and gothic style in my own work.

MichOrion said:
My biggest challenge has been visual consistency across cultures, characters, and long-term development. ChatGPT has helped a ton for writing and brainstorming, but I still need to guide image generation closely to keep things feeling like the same world.

This is always the issue. I have yet to find a platform that will do this.

MacGowan said:
Oh, has anyone tried the new ChatGPT4o Image generator that is sweeping the web? it is by far the best right now, blew my mind.

MichOrion said:
Thank you! That's what I'm using for the images here: Hollow Crown: No Lords but the Living

I will have to try this. My last experiment with ChatGPT did not end well. I have not written about that yet, but I will soon. I will have to see if I can conjure up the ChatGPT4o in the free version. More experimenting ahead.

MichOrion said:
Image Generation Tips from Our Workflow

Theme-locked sessions: I never bounce between cultures in a single gen run—one image chat per faction or style.

Prompt seeding with lore: I open each image session by pasting in a short blurb or cultural traits from our docs. The AI does way better when it knows what it’s drawing.

Style guides per group: We use reference docs that define colors, symbols, facial hair norms, armor, material use, etc. Keeps things coherent.

Everything gets a caption: Even if the image’s use isn’t immediate, it goes into an archive with name, context, and origin.

Diegetic vs. metadiegetic art: Some images are “in-universe woodcuts” or banners. Others are clean textbook-style reconstructions. Both serve different functions.

Yes, I try to use the same style in my sessions. However, when I run out of patience and I don't think the AI is responding well, I will move to different styles. This is why more Romantic painting styles are dominating my current work: the AI usually responds better to those prompts than to woodcuts and Gothic work.

I will have to try what you say about prompt seeding. However, I find an inaccurate bias is baked into the AI, largely because of the Romantic ideas about the Norse that entered the culture in the 19th Century. You will likely find this easier in your alternative future history world setting.

I like your idea of a style guide. That is something I will need to consider in the future.

I like your organization. My desktop and computer are a mess and I need to devote serious time to clean up. Your workflow creates that organization. Usually I just create with the intent of cleaning up later. Thus my large mess at the moment.

Thanks for sharing all you have with us. I'm sure this will give other creators some great ideas.

Now, I need to get organized....

P.S.: You reference "our workflow" so are you part of a group or is there a different reason for that reference? Perhaps you are creating for a larger group of gamers?

Chac1 · Apr 9, 2025

The Frustrations of Dealing with AI Content Moderation

(This is a portrait created with ChatGPT-4.0-turbo to a specific prompt, and it remains unused in my AARs, at the moment.)

The discussion here recently teased the results of ChatGPT4o. (I had thought originally that I was working with the new ChatGPT to get this image but I have since learned there is no free version of ChatGPT4o. When I went back to check the versions, I learned I was actually using the newer ChatGPT-4.0-turbo, which although newer is not the newest. This is what happens with dynamic systems, especially those that aren't visited daily.) However, as you can see from the above image, my experiment with 4.0-turbo was quite successful. We will see if that eventually pulls me away from other platforms. However, I went into the experiment after many hours of banging my head against content restrictions. With that painful experience behind me, I was very careful with my prompt for the above image.

This was the prompt: “Please paint a profile picture of an elderly clean-shaven bald Norse nobleman dressed with his hood lowered around his neck inside a dark longhouse in the 8th Century, painted like an oil painting in the Romantic style of Danish artist J. L. Lund.”

Pretty direct and simple. I got just what I wanted on the first try. That is rather amazing given the many generations an image usually takes. The only part of this prompt that worried me was naming the exact artist. The content moderation on some platforms now will flag your prompt and perhaps not produce anything if you attempt using a specific artist’s name, even if that artist has been deceased for a long time, has no copyright expectations, and is relatively obscure.

So, ChatGPT-4.0-turbo is quite good, but there is no way for us to actually test ChatGPT4o without paying. We don't know if that actually lives up to the hype or not. If one of us decides to plunk down the $20 per month perhaps we will see if it lives up to the hype. But this post is not about that. It’s about content moderation.

As I concluded the experiment of my short mythical AAR that featured images of very attractive women, you might have expected me to run afoul of the content moderation guidelines, which clearly don’t want your images to be too revealing. Instead, my biggest recent problems came from trying to create an image of a bald and elderly chieftain for my long-running AAR.

This was unexpected and quite annoying.

(This was the image finally created by OpenArt AI.)

In my frustrations over getting the above image for a recent chapter, I experimented with at least seven different platforms, finally using OpenArt, a platform I had not used for anything before this frustrating image creation session. As we have not discussed OpenArt before, here are some basics: it uses a token system and says the tokens refresh every month. Also, on a free plan, your images exist in an archive for only seven days. Having learned my lesson, I always download all interesting images, even if not using them. OpenArt says you can create characters and templates for characters to create images with some character uniformity. There may be merit in experimenting further with OpenArt to see if this platform is worth paying for a subscription.

However, the upside to creating a character image in OpenArt was that the content moderation didn’t stop me from generating a profile image of someone who is older and bald. That other systems wouldn’t allow this was irritating and, for me, a very unwelcome shift.

To be fair, this happened because I was all out of tokens with one of my usual platforms, LeonardoAI. (I know. I know. This is the penalty I should have to pay for being cheap and only using free plans.) So I went back to see if Leonardo would have fixed all the annoyances. The results were not as good as either of the images above. It is clear I would need to use a negative prompt because Leonardo always wants to put windows in my Norse longhouse. Here’s a cropped version of the best result; even so, the hood is too grand, which makes the picture look lopsided.

(This is the LeonardoAI version.)

So if I had more tokens, this would likely be fixable. But that was not the case, so I went to my next best option: Microsoft’s Copilot. And because I don’t use these generators daily, I discovered another oddity. Microsoft has split its image creator back into something like what it started with, a Bing image generator that is now separate from Copilot. Microsoft is encouraging other uses for Copilot, but you can still attempt to use it to create art.

However, Copilot refused to generate any images for me.

As with all of the attempts discussed in this post, I was also providing a reference image to make it easier for the AI.

Here was the prompt I developed after many attempts:

“Please paint a profile picture of Chief Tryggve of Sunnmære in Norway in the 8th Century: he is completely bald and hairless. No beard. No hair on his head. No mustache. He is a 53-year-old, bald chieftain of two Norwegian tribal provinces during the days of rugged Norse rulers, but he is old and frail. Please render this picture like an oil painting in the Romantic style of Norwegian artist Gerhard Munthe.”

The AI flagged the prompt for various reasons: I had named a specific artist, and I believe the words “old” and “frail” were also a problem. I discovered, though, that I could get Bing’s Image Creator to respond to this prompt without the content moderation barriers. However, this was the best result:

(This image is from Bing’s Image Creator.)

This is an excellent image, just not what was requested. Certainly, I could have used this but my stubborn side was now showing. I tried to edit the image to get rid of the beard using Microsoft’s Designer, but again the content moderation guidelines prevented any changes. Why would eliminating a beard trigger such guidelines? Is shaving a Norseman seen as elder abuse?

So I pressed onward.

My next choice was ImageFX, a rare stop for me. ImageFX has long prevented the use of a specific artist’s style, but you can request something like a Norwegian artist from the Romantic era, if you wish. Here was the best result of about two dozen tries.

(This image was created with ImageFX.)

What is it with the beard? My prompt clearly asked for someone with no hair.

So I moved on to ChatGPT, but not the new version.

Again the AI refused to make an image.

After several rounds, I asked the AI why no images were being created and how the prompt had violated the content guidelines. Here is the response from ChatGPT:

“The system doesn’t specify exactly which part of the request violated the content policy, but possible reasons could include:

The depiction of a historical or semi-historical figure in a way that might be considered sensitive.
Certain descriptions related to age, frailty, or physical characteristics.
The setting (such as a longhouse with torches) potentially being flagged for themes related to historical conflict.”

With help from the AI, I scrubbed the prompt. However, four rounds later, still nothing. Here was the reasoning for why no image was forthcoming from ChatGPT:

“Depicting an elderly person isn’t necessarily a problem on its own, but the system might flag certain descriptions if they are interpreted as overly emphasizing frailty, age-related decline, or anything that could be seen as sensitive in some way.”

After five more attempts, still nothing. Eventually ChatGPT revealed there may be more guidelines to deal with when it comes to Norse culture. Here is a further explanation of the guidelines from ChatGPT:

“Detailed Historical Figures – Requests for specific individuals, even fictional or legendary ones, might be flagged.

Depictions of Deities – Some AI systems avoid generating religious or mythological figures due to sensitivity concerns.

Physical Descriptions – Emphasizing age, frailty, or specific physical traits may be flagged if perceived as sensitive.

Warrior Themes – Norse culture is often linked to battles, which could trigger restrictions if the description is seen as violent.”

At this point, I gave up on ChatGPT (at least the older version) thinking there were just too many content guidelines for me to work around. Yes, it may be possible to stay within the guidelines, but this does raise the question of whether there are too many guidelines. At least, that is how I felt at the time. As you could see from how this post began, however, I did figure out the right combination of words to eventually get results with the newer version.

However, in my frustrations during that image creation session, I journeyed onward to NightCafé. NightCafé uses a credits/token system for free users. I had tried NightCafé some years ago and found it wanting. This is the first time I have written about it here. Almost two dozen images later, I was no closer. The image below is nice but still not what was requested, even with the help of reference images. NightCafé’s image editing system also could not erase the beard.

(This image was created with NightCafé.)

So on to the next stop, which was Ideogram. This was new territory. I had never ventured on to that platform before. Like others, it uses a credits/tokens system to limit the generations of free users. Again, about two dozen images later, this was the best result:

(This image was created with Ideogram.)

Not sure what that knitting needle is doing in his hand, and technically he still has a thin blond beard. So, out of tokens, I moved onward.

That is when I came upon OpenArt AI and had my best results.

So, the lesson here is that the content guideline system is lurking in both the creation and editing processes. You may have to work around which words you input, especially if you are trying to create old Norsemen.

Perhaps I was being too stubborn or too particular during this creation session. Perhaps I could have been more clever in outwitting the AI’s controls. But it does raise the issue of whether there are too many controls or if others have already abused and ruined the systems for those of us who do not have evil designs on the use of AI art. Certainly open to further discussion. Perhaps my frustrations will save others some valuable time when the AI appears to be uncooperative.

jak7139 · Apr 9, 2025

It's interesting that specifically the beard (apparently) is what flagged your images. I wonder if you create a clean-shaven image of a young man, then age that image up, if you would experience the same issue?

And, to clarify, you did not get flagged for any content violations when creating attractive women? Or you just experienced less of an issue?

I've heard and seen videos of the latest ChatGPT algorithm. It seems to be much more open than before, but maybe that is only in certain areas. And that was only with its chat function though, not with its image generation.

Chac1 · Apr 10, 2025

jak7139 said:
It's interesting that specifically the beard (apparently) is what flagged your images. I wonder if you create a clean-shaven image of a young man, then age that image up, if you would experience the same issue?

That would be a way around this, although time consuming and possibly a quick way to drain tokens from the free systems. For now, I think I will continue to test the new ChatGPT considering its quick return on my instructions, with no reference image. That last part tells me how good it might be.

jak7139 said:
And, to clarify, you did not get flagged for any content violations when creating attractive women? Or you just experienced less of an issue?

I learned some time ago what words would get me in trouble when trying to create curvaceous and attractive women. I'm not creating nudity or compromised figures, although you'd be surprised how sometimes the AI creates just that without being asked!

I was expecting that I would run into such content guideline issues when I was pushing those boundaries, not when I was trying to create a bald, clean-shaven, frail old man. Still makes me shake my head about why some of those words are off limits in some AI systems.

jak7139 said:
I've heard and seen videos of the latest ChatGPT algorithm. It seems to be much more open than before, but maybe that is only in certain areas. And that was only with its chat function though, not with its image generation.

We will see. I will report back with any significant findings. As others are also using the new ChatGPT, other reports are certainly welcome.

Chac1 · Apr 10, 2025

Postscript: I had totally forgotten that I also experimented recently with CivitAI as recommended by @MacGowan . This was also part of the attempt to get the perfect old beardless Norse chieftain. CivitAI works on a token/credit system that they call "buzz." But image creation is expensive there. After four images I was tapped out. The best one is here:

Yes, he is a bit young looking for my 53-year-old chieftain, and there's that damn beard again. The other images were more age appropriate but they tended to look more like Santa Claus. Like other image generating systems, CivitAI wants to put windows in the longhouse (they did eventually have them but likely not during the 8th Century). This image has been cropped to keep those out. Better results can be seen above in the other examples.

MacGowan · Apr 10, 2025

Great rundown!
I agree on the AIs I've used.

Content moderation is the bane of my existence. The filters never work as intended: they throw false positives, make the AI dumber in the process, and view everything through "worst-case intentions."

Some are so guard-railed they become useless.
I have gotten policy violations that make absolutely no sense, for things that are squeaky clean PG.
If Google will let you search stuff that ChatGPT isn’t allowed to answer, then why use ChatGPT?

Not to mention I hate the fact that like 4 giant corporations have scraped the world of all its art, and are now in control of what people say with this tech. Nations should be in charge of that. Not Mark Zuckerberg.

CBR JGWRR · Apr 14, 2025

A Brief Consideration of ChatGPT4.0

As some of you may know, Stars Of Wonder is my take on a 1 and 4.5 hardness science fiction narrative: the 1 being that it's a Stellaris narrative, the 4.5 being that, with the exception of a Jump Drive and a monopole conversion fusion rocket (and technically this is plausible, if we get to the point of capturing and producing magnetic monopoles), the Xenayan space program is running on hard science.

On the technology of 1860 +/-20 years. Eager Explorers indeed...

We begin with this:

Readers of Life2.0 will immediately notice this Xenaya has had their horns and sabre teeth (and more) removed to save mass. She stands wearing the amount they'd expect to wear for a very short EVA mission. (Xenayan men have been established in Life2.0 as being ~50kg heavier, which means no moon mission for men, as the extra mass just is not viable.) A perk of having claws instead of fingernails is that they wouldn't suffer as badly as we do from capillaries bursting under fingernails, which means they can tolerate vacuum for longer. The mask would normally be connected to an air tank (20% partial pressure pure oxygen has been considered but rejected because of the severe risk of having to complete re-entry by bailing out with parachutes), but being planetside, she doesn't need one.

You'll also note the chalkboard. That's their communications system.

But the first thing you noticed is the lack of cover - that is intentional, as it is easier for them to just accept the injuries from swelling and dermal capillary bursts than it is to build a suit. It's also easier to soak themselves in oil to create a barrier that slows the rate at which the skin freezes from evaporation than to try to make a conventional spacesuit.

Overall, I'm left somewhat inclined to say that it doesn't quite feel like a Xenayan, but some of the details are correct.

This second image actually has the toe-claw, although it isn't as curved back as it should be.

Hands are hideously difficult to protect in space; we learned this very early on. And on Xenayan technology, I simply can't develop a way to have a glove that works enough that a tough and pragmatic typical Xenaya would actually choose the glove. So, they cheat; tight wraps around the ankles and wrists as seen in the first image are the mounting point for a glove-and-stick arrangement that seals the hand inside a balloon, and they instead have a claw extension - all user interfaces for EVA are therefore designed to be operated with a claw wielded somewhat haphazardly.

This particular Xenaya is also taking advantage of a gag I used with Rivkah Of Unity in Life2.0, where she cheats a staring contest by giving Xenaya two sets of eyelids as an adaptation to their desert homeworld. They have an outer pair they use the way we use ours, and an inner transparent pair they use for shielding their eyes against dust storms and for water retention. (goggles are used as they allow easy access to lenses that shield against direct sunlight) As a result, the harmful effects a Human in vacuum without eye protection suffers - bloodshot eyes, damage from the eye drying out from evaporation - take much longer to affect Xenaya.

The suit represents an early effort to make a space suit that is composed of Cloud Strider canvas bound by leather. Bulky and inconvenient.

It got confused making this one. The bits it got right here are the extra pockets for storing stuff and actually having a tank with a pipe going to a mask, even if the mask is obviously wrong.

This one gains a helmet without a visor, but the chalkboard is now flexible. Claw pointed the wrong way, and as with the previous image, the gloves are completely wrong. Also has a zip, which is anachronistic. (invented 1892, if you are wondering)

At this point, I'd used up the free allocation for ChatGPT4.0.

On the one hand, I accept the concept of this is extremely off-piste. But, on the other, I do feel a measure of disappointment with the results as all the images have issues; it especially struggles with getting the gloves right, and most pipes are pipes to nowhere. (granted, that is potentially appropriate for early open-circuit air breathing systems that don't have CO2 scrubbers)

I then decided to run it through on LeonardoAI, but... Well, LeonardoAI seems to think ChatGPT's prompt (I asked it to state what it used for the pictures above) of:

"A highly detailed, realistic digital painting of a muscular, digitigrade, bipedal alien resembling a cross between a gorilla and a theropod dinosaur. The figure has thick, dark brown fur, neatly trimmed and slicked down with oil for insulation in vacuum. The head is partially covered by a tightly strapped breathing mask that includes a visible air tank line, and protective goggles with flip-down lenses.

She wears a roughly stitched but well-made suit constructed from thick, pitch-treated canvas, covered in tightly bound cloth wraps around the joints. Symbols of a patron deity are embroidered in a subtle pattern on the fabric. On her hands and feet are single-claw EVA tools: large, pincer-style extensions strapped over her limbs, allowing manipulation of tools in vacuum without needing full-fingered gloves or boots.

A chalkboard slate is strapped to her torso with a simple harness, and various canvas pockets are visible across the suit for storing gear. Her pose is stoic and contemplative, standing on a rugged, windswept cliffside as if mentally preparing for the void of space. The style is oil-on-canvas realism with subtle lighting and rich, earthy tones."

Means lots of images of shaved gorillas wearing little more than a bra and shorts... Quite disappointing.

Chac1 · Apr 18, 2025

First, I have to say thank you to @CBR JGWRR for this excellent tour through ChatGPT4.0. Even with their faults, these are interesting images.

I must confess I am far behind in reading Stars of Wonder. Folks know I apologize all the time because I am so far behind in all of my reading, especially in the Stellaris sub-forum. However, I have yet to see any of your wonderful art in either Stars of Wonder or Mandate of Heaven. Is there a reason you're saving it? Or does it appear in the later chapters where I haven't reached yet? The art here gives me a different way to imagine the characters. My mind had different pictures. I suppose this is why some prefer no illustrations in their AARs.

Secondly, I need to apologize for my sloppiness. I thought I was writing a preamble about ChatGPT4o in an earlier post. I went back to check my work (which should have been done before posting) and discovered my errors. Apologies to all for rushing about with incomplete information. That post has been fixed and now reads as awkwardly as this apology.

That is why this explanation by @CBR JGWRR of ChatGPT4.0 is so important because that version is free. After doing some checking and rechecking this week, I learned ChatGPT4o ("o" is for omni, the AI tells me) costs $20 per month, but it does claim to allow character uniformity in its creation. Not in a position to experiment with it now, but eventually I may move in that direction.

In a discussion with the AI in checking my latest image creations, I learned I have actually been using ChatGPT4.0-turbo which is based upon the DALL·E 3 image generator. For new users, I'm sure this all just makes your head swim. If I was using these image generators daily perhaps I would have been up to date on all the latest. The bottom line for me is that ChatGPT4.0-turbo is quite good and free. Despite the issues @CBR JGWRR reveals, there's a lot of merit in the images he created.

So writing this just to set the record straight and to say no doubt there is more experimentation ahead.

CBR JGWRR · Apr 18, 2025

Well...

I do agree that these images look very little like Xenaya; the first looks the least un-Xenayan, but none of them come close to the images of Buri, Rivkah and Livi seen upthread. Now part of that is intentional; the brutal approach to mass saving that they follow means the tools of predation they used to require are removed to save weight.

But yeah, when I look at these images, I don't see that raw predatorness that defines the Xenaya in Life2.0 and Mandate Of Heaven; if you go back to page 1, Buri and Naomi stood together on their wedding day, and you can look at Buri and wonder if he's going to eat her. Rivkah's images manage to convey that she is somewhat less inclined to evaluate people based on their edibility, and even Livi Unitatis - the least conventional Xenaya in Mandate Of Heaven, she even uses cutlery - manages to look like a predator who is choosing diplomacy. A Xenayan should be intimidating, and these... Aren't.

With Buri, you could imagine him squaring up against a T. rex. That is a mental image that I will write in Mandate Of Heaven, somehow. But these Xenaya look like they'd flee. And that isn't Xenayan at all, they go down fighting...

But then, when you look again... The last one. Sure, the gloves are wrong, the suit is wrong, there's a zip that is from the wrong era, the visor is missing, the pipe routing is weird... But those eyes. She looks like a steely-eyed missile woman ready to ride a rocket that shakes the earth and rends the skies and pushes the limits of what is technically possible beyond the edge Humanity was willing to try or consider, that classic Xenayan determinatorism that drives her to sacrifice so much to complete the mission.

So...

I guess they are still Xenaya, but the Xenaya of a new age.

More on topic, for Stars Of Wonder, my challenge is that A) I'm still evaluating their mission and working through the challenges they face, detailing the problems and developing solutions, but also B) AI art just lacks the context to be able to do ideas for Stars Of Wonder - it's definitely not steampunk anymore, but aesthetically it definitely isn't solarpunk as we know it either. (which is interesting given one might argue the Xenaya technological base being built around solar-thermal is an example of a solarpunk ethos)

Mandate Of Heaven, like Life2.0, has the challenge that alien protagonists abound for the obvious reasons, but also that most of the cast are now post-Human, with transformations ranging from Human in appearance with massively enhanced capabilities through to Naomi's transformation discussed earlier. So, it's tricky.

MacGowan · Apr 21, 2025

Here's a quick little breakdown on some portraits I generated for "Worlds War". The goal was to make it the same style, and also better and more detailed.

Here's what I started with. This is what I meant with farming images btw. These two images are 2 years old, made with Stable Diffusion 1.5, on HuggingFace. I made a ton of them. Those portraits have more life in them than newer models, but they look terrible under close inspection. So I wanna give them real facial features and unify their uniforms.

(SD1.5)

I tried a few different sites. ChatGPT sort of turns everything into the same kind of ChatGPT-looking thing. Leonardo img2img almost got there with a lower %, but it was either too blobby, or too different from the original.

(ChatGPT4o - img2img)

(Leonardo 50% img2img)

I know MidJourney has an excellent Retexture, but I don't wanna spend $30 to retexture 7 images. So I went googling and landed on this architecture tool. It keeps everything locked from the original, and you just upload an image and take the art style from that (same as MJ Retexture).

(mnml.ai Style Transfer Render V2)

Next job was to give them matching uniforms, so I went to Google AI Studio and inpainted the uniforms by just prompting "Cyan admiral uniform, brown fur coat." and making a few until I landed on the ones I liked.

(Gemini 2.0 Flash Image + Photoshop polishing)

With some photoshopping to polish them up. Now we got some solid portraits that look like they're by the same artist, in the same setting.

Here's the final product. A page out of my latest worldbuilding post:

CBR JGWRR · Apr 24, 2025

That ends up impressively cohesive.

Speaking of which, another look at my 19th century space program. Today, while ill and off work and in the process of trying to distract myself from said being ill, I spent 1458 tokens in LeonardoAI trying to get decent pictures of the Xenayan space program, using the style reference feature, with the first picture in my previous post as the reference image.

And... Well, the results aren't as disappointing as the earlier attempt.

Take this one:

Leonardo_Phoenix_10_A_highly_detailed_vintage_oil_painting_of_0.jpg

This was the first non-rejected image, even if that was only because of frustration; while I like the ambiguous "is it a rocket on its side, or is it a landed Cloud Strider" in the bottom left corner, and it is very Xenayan to think of the idea of using fuel tanks as legs, even they would not try to stand something on just two legs. Points for having sufficient space in principle for some kind of shock-absorbing system, and hey, at least it was the first image to generate that did sort of look like it wasn't trying to be a photo...

Leonardo_Phoenix_10_A_highly_detailed_vintage_oil_painting_of_0(1).jpg

This one... Comes so close. Big stabilising aerofoils (almost certainly something they'd try before they get better ideas), mismatched tanks (they'll learn why it's a bad idea quickly, and that is just the right sort of "try it, and see what happens" approach they're running on), a building that seems like an unreasonably close control tower (which isn't an intolerably bad idea, as they don't have communications apart from semaphore systems capable of communicating with their rockets, which need to be big to be seen), but Humans in the background and attempted gloves (which don't work on their manufacturing base) which have the claws poking out... First glance, it's good. Then you see the details. Oh, and no chalkboard.

Leonardo_Phoenix_10_A_highly_detailed_vintage_oil_painting_of_0(2).jpg

This one is my favourite. It feels like a painting, which was what I was going for. You can be forgiven for thinking the gloves aren't there, and yes, there's a panel missing off the rocket (equally, it could still be being assembled, given they work from Cloud Striders revised to have exceptionally precise control systems) and there are power transmission lines despite electricity not yet being used beyond experiments in labs (it gets developed during the second stage of the space program) and the eyes are too Human-like, but it carries a huge amount of feel.

Leonardo_Phoenix_10_A_highly_detailed_vintage_oil_painting_of_0(4).jpg

This one, I want to like. Again, gloves. But the rest of the suit is adequately primitive - that tank piping to the breathing mask goggles truly sucks - but it is the first one to get the chalkboard even vaguely right. Mission Control is suitably out of view, the drop tanks could be interpreted as equally sized, unlike previous rockets with them, and I do like the painting-like effect in the background.

Leonardo_Phoenix_10_A_highly_detailed_vintage_oil_painting_of_0(5).jpg

This one was the last I had before I was interrupted that I felt was semi-passable. Gloves again, but at least they are pitch-treated leather, a solution that is theoretically workable if you go for the extension tool gauntlet approach. Obviously an R&D model on the rocket - that nozzle has practically no expansion ratio and is fixed in position, as a starting rocket would be. (legs coming from the nozzle is obviously wrong) But, she is leaning over like she's falling asleep.

And no chalkboard again; I know that probably sounds silly, given I keep bringing it up, but the trouble with the sign language approach - although a very obvious first idea to solve the communication in vacuum issue - is it is way too easy to end up de-stabilising yourself because of the change in angular momentum as you move your body and therefore ruin whatever you want to say. Plus, Xenaya are not exactly known for deftness (the Life2.0 running gag about Buri's tongue aside) and their strength further makes it easier for them to move excessively in weightlessness. (it'll come up as the narrative progresses) Chalkboard is slightly less bad for this, although it has the same problem slightly scaled down. But, the breathing tube does at least go somewhere practical. (CO2 scrubbing pack, although it is a little anachronous to the nozzle geometry issue)

Overall, it's slow progress. If I continued burning tokens, I'd probably eventually get what I'm after.

MacGowan · May 3, 2025

CBR JGWRR said:
That ends up impressively cohesive.

Right? I was happy with how it made it all one style. Don't think I could recreate that for say 20-30 images though, but it's a start.

CBR JGWRR said:
View attachment 1285455

Have you tried putting the image back into ChatGPT? Say you find the character you're most happy with, upload that image into ChatGPT and write "give this character gloves". Then you take that image and upload it into ChatGPT again to work on the background? Could work?

Chac1 · May 12, 2025

Thanks @CBR JGWRR & @MacGowan for keeping this thread interesting with further discussions of image making and the image making process.

I do hope @MacGowan that when you launch your AAR with these images that you will let us know in this thread so we can see how you are using them.

MacGowan said:
Have you tried putting the image back into ChatGPT? Say you find the character you're most happy with, upload that image into ChatGPT and write "give this character gloves". Then you take that image and upload it into ChatGPT again to work on the background? Could work?

So far, we have no confirmed reports of a platform that will guarantee cohesive character creation across scenes and different poses, at least not in a free version. (One can dream, no?)

However, I do find this workaround is one way to try it: use your character in a reference image that may help with the next generation. Thanks @MacGowan for reminding us of that. When in a corner, try reusing older generations to spark new creation.
