Stellaris Dev Diary #149 - Technical improvements

Hi everyone, this is Moah. I’m the tech lead on Stellaris and today I’m here to talk about the free 2.3 "Wolfe" update that will be arriving together with Ancient Relics, and what it brings to the table in terms of tech.

Stellaris is going 64 bits.
People have been clamoring for this for a while now, and various factors have led us to finally do this for this patch. I should temper your expectations though: while many have claimed that this would be a miracle cure for all their issues with Stellaris, the reality is somewhat more tame.

What does it mean?
The one solid benefit is that Stellaris is no longer limited to 4 GB of memory, and won’t crash anymore in situations where it was reaching that limit. For people who play on huge galaxies, with many empires, many mods or well into the 3000s, this will be a boon.
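
As a rough illustration of where the 4 GB figure comes from: a 32-bit pointer can distinguish 2^32 byte addresses. The sketch below only does that arithmetic. The function name is made up for illustration, and real per-process limits depend on the OS and can be lower than the theoretical maximum.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a pointer of this width can hold."""
    return 2 ** pointer_bits


print(addressable_bytes(32) // GIB)  # 32-bit address space, in GiB
print(addressable_bytes(64) // GIB)  # 64-bit address space, in GiB
```

The first line prints 4; the second prints a number so large that, in practice, installed RAM becomes the limit rather than the address space.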

In terms of performance, though, it doesn’t change much. Without drowning you in technical details, let’s just say that some things go faster because you handle more data at once, some things go slower because you have more data to handle. In the end, our measurements have shown no perceptible difference.

Finally, the last effect of switching to 64 bits is that the game will no longer be playable on 32-bit computers or OSes. We don’t think this will affect many people, but there you have it.


What about Performance?
I know that’s everyone’s favourite question, so let’s do our best to talk about it. First, let me dispel some notions floating around in various forums: Stellaris does use multithreading, and we’re always on the lookout for new things to thread. In fact, between 2.2.0 and 2.2.7, a huge effort was made to thread jobs and pops, and it’s one of the main drivers of performance improvement between these versions.

Pops and jobs are indeed what’s consuming most of our CPU time nowadays. We’ve improved on that by reducing the number of jobs each pop evaluates. We’ve also found other areas where we were doing too much work, and cut down on:
  • Ships calculating their daily regeneration when they’re at full health
  • Off-screen icons being updated
  • Uninhabitable planets doing the same evaluations as populated planets
Why do these seemingly pointless things happen? Well, we generally focus on getting gameplay up and working quickly so that our content designers can iterate quickly, and sometimes things fall through the cracks. Some of these systems are also quite complex and the scale of the new code is not so easily apparent. Sometimes, not limiting the number of targets is good enough because you’re not doing much but then, months later, someone adds more calculations or the number of objects explodes for unrelated reasons, and suddenly you’ve got a performance issue.
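
To make the first bullet concrete, here is a minimal sketch of the "skip the work when there is nothing to do" idea. The `Ship` fields and `apply_daily_regen` are hypothetical names for illustration only and say nothing about how the game's own code is structured.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Ship:
    hull: float
    max_hull: float
    regen_per_day: float


def apply_daily_regen(ships: Iterable[Ship]) -> int:
    """Regenerate damaged ships; return how many ships actually needed work."""
    updated = 0
    for ship in ships:
        if ship.hull >= ship.max_hull:
            continue  # full health: skip the regeneration calculation entirely
        ship.hull = min(ship.max_hull, ship.hull + ship.regen_per_day)
        updated += 1
    return updated
```

In a fleet where most ships are undamaged, the early `continue` means the daily tick only pays for the few ships that need it.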

Modifiers
One thing that sets Stellaris apart from other PDS titles is how much we use (or abuse) modifiers. Everything is a modifier. Modifiers are modified by other modifiers, themselves modified by other modifiers, and sometimes by themselves. It’s quite hard to follow, and leads to every value being able to change at any time without your noticing.

“Why don’t you just compute jobs when a new one appears?” has often been asked around these parts. Well, a short answer to that is it’s really hard to know when a new job appears. You can get jobs from any modifier to: country, planet, pops. Each of these can get modifiers from ethics, traditions, perks, events, buildings, jobs, country, planets, pop, technology, etc.

Until now we were trying to calculate modifiers manually, forced to follow the chain in its entirety: when you recompute a country modifier, you then recalculate its planets’ modifiers, and then each planet recalculates its pops’ modifiers. Some of our freezes were just that tangled ball of yarn trying to sort itself out.

[Image: modifier flow chart]

This is our modifier flow chart. It’s not quite up to date, but it gives you an idea of the complexity of the system (unpolished because it’s a dev tool, and not made for the article).

No More!
For 2.3 “Wolfe” we have switched to a system of modifier nodes, where each node registers which nodes it follows and is recalculated when used, following the chain itself. We have modifiers that are more up to date, and calculated only when needed. This also reduces the number of pointless recalculations.
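
A minimal sketch of what such a node system could look like, assuming a simple "dirty flag plus lazy recalculation" design. `ModifierNode`, its fields and the additive combination rule are all hypothetical; the actual game code is not shown in this post and may work quite differently.

```python
class ModifierNode:
    """A modifier value that is recomputed only when read after a change."""

    def __init__(self, name, base=0, sources=()):
        self.name = name
        self.base = base
        self.sources = list(sources)   # nodes this node follows
        self.dependents = []           # nodes that follow this node
        self.dirty = True              # no value computed yet
        self.cached = 0
        self.recalculations = 0        # counter, only for the demo
        for source in self.sources:
            source.dependents.append(self)

    def set_base(self, base):
        """Change this node's own contribution and flag everything downstream."""
        self.base = base
        self.invalidate()

    def invalidate(self):
        # A dirty node's dependents are already dirty, so stop early.
        if self.dirty:
            return
        self.dirty = True
        for dependent in self.dependents:
            dependent.invalidate()

    def value(self):
        """Return the up-to-date value, following the chain only if needed."""
        if self.dirty:
            self.cached = self.base + sum(s.value() for s in self.sources)
            self.dirty = False
            self.recalculations += 1
        return self.cached


country = ModifierNode("country", base=10)
planet = ModifierNode("planet", base=5, sources=[country])
pop = ModifierNode("pop", sources=[planet])
idle_planet = ModifierNode("idle_planet", base=3, sources=[country])

print(pop.value())                 # first read walks pop -> planet -> country
country.set_base(20)               # flags dependents, computes nothing yet
print(pop.value())                 # recomputed on demand
print(idle_planet.recalculations)  # never read, so never computed
```

Changing the country value only marks downstream nodes as stale. Nothing is recomputed until someone reads a value, and a node nobody reads (here `idle_planet`) costs nothing.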

This system has shown remarkable promise, and cut the number of “big freezes” happening around the game (notably after loading). It has some issues, but as we continue working with it, it’ll get better and help both with performance and our programmers’ sanity.

So, what’s the verdict?
In our tests, 2.3 “Wolfe” is between 10% and 30% faster than 2.2.7 right now. Hopefully it’ll stay that way until release, but the nature of the beast is that some of these optimizations break things and fixing the issues negates them, so we can’t promise anything.

[Image: performance measurement chart]


Measurements provided by @sabrenity, using detailed info from the beta build. It’s worth noting the “SHIPS_SERIAL” purple line has since been eliminated.

AI
Another forum favorite: we have made some improvements to the AI. First, with @Glavius’s permission, we’ve used his job weights to improve general AI job distribution. We’ve also done the usual pass of polish and improvements, and of course taught the AI how to use all our new features.


What else is new?
We’re also getting a new crash reporter that will send crash reports as soon as they happen rather than the next time you start the game. We’ve also improved our non-Steam network stack to help with connectivity issues, etc.


All right, enough of my yammering. This has turned into a GRRM-length novel, and even though there are many more areas we could cover, we’ll just turn this over for your perusal.
 
Well sure, but not BFS as BFS; that's why it's called A* O;P

Edit: the difference between "pure" BFS -> Dijkstra -> A* is still very small
 
When this update goes through, will it require Mojave for Macs? I know that this may seem like a dumb question, but some newer titles (looking at you Imperator) require OS 10.14. I personally am very squeamish about updating, as I have heard that I will have trouble with my 32-bit apps, and I hope that this will not force me.
 
I think removing Wormholes/Gateways might be a lost cause. They were the main cause of frame drops before 2.0, and that's why the devs made that decision.
We can still keep the L-Gates, because their number is small compared to the others.

While removing them probably isn't an option, I would like to see an option to make the Gateways non-buildable by checking a box in the pre-map generation menu. Make it where as a player I can decide to have Gateways just be lost technology that can be reactivated, but the ability to build them is lost to history. Not only would that keep the AI from spamming them all over the map, it would also make owning one very special and something worth fighting for.

Just my thoughts on the matter.
 
While removing them probably isn't an option, I would like to see an option to make the Gateways non-buildable by checking a box in the pre-map generation menu. Make it where as a player I can decide to have Gateways just be lost technology that can be reactivated, but the ability to build them is lost to history. Not only would that keep the AI from spamming them all over the map, it would also make owning one very special and something worth fighting for.

Just my thoughts on the matter.
Well, then gateways are just another kind of wormhole, and they will still cause late-game pathfinding problems.
 
When this update goes through, will it require Mojave for Macs? I know that this may seem like a dumb question, but some newer titles (looking at you Imperator) require OS 10.14. I personally am very squeamish about updating, as I have heard that I will have trouble with my 32-bit apps, and I hope that this will not force me.
That is actually quite fun to read, since Apple is dropping 32-bit O;) Not sure which macOS SDK we are compiling against, but I do not think we changed anything there.
 
Well, then gateways are just another kind of wormhole, and they will still cause late-game pathfinding problems.

But at least they would be limited for those of us that want the limitation. And again, make it an option up to each player. I already have wormholes and gateways turned down to the absolutely lowest setting I can use, because I prefer it that way. So making it so they aren't build-able in my game would be my personal preference in the settings.

Basically it's an options = good scenario.
 
I'm really looking forward to the performance changes in Stellaris and it using more than 4 gigs of my 32GB of ram. I am wondering however whether you plan to fix the following bugs:
  • Fleets not returning fire at enemies (seriously?)
  • AI fleets bypassing ftl inhibitor stations
  • Upgrade costs having nothing to do with the listed cost (taking an instant 3x the cost listed plus more for each ship)
 
You might have a Windows 7 32b disc from years ago and not want to spend money on a new OS disc...or install Linux.
You can install Windows 7 64-bit using the same license you already have for 32-bit. No need to spend money. (Also note that if you have Windows 7 you only have about 8 months left before you stop getting updates so probably need a new OS anyway.)

All of those cpus you refer to are far under minimum spec for the game so it’s a bit of a ridiculous argument to make.

Absolutely everyone who meets the requirements has a 64bit cpu. Everyone.
For that matter, the Steam requirements for the game already list "Windows 7 SP1 64-bit" for minimum Windows OS.
It can be fun to see how far under the "official" specs will still run the game but don't expect to be supported while doing it (although PDX has still occasionally made fixes that weren't strictly needed if you actually met the official requirements).
 
When it comes to AI empires building gateways, I wish they built fewer gateways and more productive Megastructures of their own.

Gateways are useful of course, but it seems that in the late game the AI spams them everywhere.
 
Pretty much the only megastructures I see the AIs regularly build are gateways and habitats. I think I had maybe one game where the AI built a ringworld, but that might have just been the standard wild ruined one. Don't know if the AI just doesn't take the perk ever or just is never able to save up enough to start building one in the first place (probably because it's burning all the alloys and influence on unneeded gateways & habitats).

Having the AI build less of both would be a good QoL change and probably help performance. I like that habitats are going baseline, but given current AI behavior, I also dread the change because the AI seems like it'll spam a ton of them and be even worse at doing what it needs to do. I for one don't want a ton of lag from tons of unhappy, homeless & unemployed pops on hundreds of habitats that aren't even a third full. So I'm really hoping this behavior is on the docket for AI improvements.
 
Don't know if the AI just doesn't take the perk ever
I've looked at AI via observer - unlike players, they don't save up ascension perks.

If a player gets Perk 2, they'll go "Alright, second perk. Just 30 months left until my Psi Theory finishes researching, then I'll get Mind over Matter!", but an AI goes, "Alright, second perk. One Vision it is!"

Even when you give an AI empire EVERY tech and EVERY tradition via console commands at the start, they still seem to pick their perks randomly, and rarely stumble onto the 'good' ones.
 
I've looked at AI via observer - unlike players, they don't save up ascension perks.

If a player gets Perk 2, they'll go "Alright, second perk. Just 30 months left until my Psi Theory finishes researching, then I'll get Mind over Matter!", but an AI goes, "Alright, second perk. One Vision it is!"

Even when you give an AI empire EVERY tech and EVERY tradition via console commands at the start, they still seem to pick their perks randomly, and rarely stumble onto the 'good' ones.
Another reason it'd be nice to be able to see what Traditions and Ascension Perks AI empires have chosen- it'd make decoding their behaviour a bit easier and would give a good reason for the devs to poke at this particular aspect and make it work a little better.
 
I've looked at AI via observer - unlike players, they don't save up ascension perks.

If a player gets Perk 2, they'll go "Alright, second perk. Just 30 months left until my Psi Theory finishes researching, then I'll get Mind over Matter!", but an AI goes, "Alright, second perk. One Vision it is!"

Even when you give an AI empire EVERY tech and EVERY tradition via console commands at the start, they still seem to pick their perks randomly, and rarely stumble onto the 'good' ones.

Could be an interesting change to make perk choice influenced by the AI's personality.
For example Migratory Flocks going for things like Grasp the Void and Voidborne; or Harmonious Collectives aiming for Imperial Prerogative, One Vision, and Shared Destiny.
 
Could be an interesting change to make perk choice influenced by the AI's personality.
For example Migratory Flocks going for things like Grasp the Void and Voidborne; or Harmonious Collectives aiming for Imperial Prerogative, One Vision, and Shared Destiny.
Or Erudite Explorers swearing off any ascension that isn't Synthetic, and likewise with Psionic for Evangelizing Zealots and Spiritual Seekers.
 
This joke just needed to be done.

It's funny that the free patch accompanying the "Ancient Relics" DLC drops support for an actual ancient relic.
 
For anyone interested, here is a link to Steam's hardware survey: https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam (scroll down a bit to see OS distributions).

As you can see, the biggest 32-bit OS segment is Windows 7, at 1.15% of the userbase, which is not exactly a very large fraction; I would also venture a guess that people who play Stellaris/Paradox games probably have disproportionately newer hardware/OS, so the percentage may be even lower there.

I also found it very interesting just how low the Mac userbase was on Steam... I wasn't expecting much, but only 3% of all players is incredibly low. I guess Macs just really aren't popular for gaming.
 
The last 32-bit CPU was 17 years ago. Can those computers even run Stellaris? And there's literally no reason for anyone with a newer CPU to have a 32-bit OS. In short, anyone who's affected by this kinda deserves it.

Technically, everything since the socket 775 Pentium 4s and Athlon X2s has been 64-bit capable. It's really a matter of people still using 32-bit OSes, rather than HW support.

The move to 64 bits is the occasion to get rid of the many overflow bugs that plague the game, either by using bigger variables or by setting caps to prevent values from overflowing and becoming negative.
Examples:
Weapons range overflow prevents whole fleets from firing.
Research points overflow prevents your empire from building anything.
etc.

Technically, you can use 64-bit datatypes on a 32-bit CPU/OS, though at a performance hit. And going to 64-bits by itself doesn't solve the problem; all those datatypes still need to be changed in code to make them 64-bits in length.
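
A small sketch of the two fixes mentioned above (a wider type versus a cap), assuming the common two's-complement wraparound behaviour for a signed 32-bit counter. Python integers never overflow, so `wrap_int32` simulates the 32-bit variable; both helper names are made up for illustration.

```python
INT32_MIN = -2 ** 31
INT32_MAX = 2 ** 31 - 1
INT64_MAX = 2 ** 63 - 1


def wrap_int32(value: int) -> int:
    """Value a signed 32-bit two's-complement variable would end up holding."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


def saturating_add_int32(a: int, b: int) -> int:
    """Add two values but clamp the result to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


research_points = INT32_MAX

# Plain 32-bit arithmetic: one more point flips the total to a negative number.
print(wrap_int32(research_points + 1))

# Fix 1: a cap keeps the value pinned at the maximum instead of wrapping.
print(saturating_add_int32(research_points, 1))

# Fix 2: a 64-bit variable has plenty of headroom for the same sum.
print(research_points + 1 <= INT64_MAX)
```

Either fix has to be applied variable by variable, which is the point made above: a 64-bit build does not widen existing 32-bit fields on its own.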

Still, bodes well for other Paradox games going forward. Especially for games of the scale they make, there *really* isn't a reason to be 32-bit at this juncture.
 
We haven't worked on that (yet). Unfortunately this is a problem for which we don't have a good solution yet.
Our issue is that we have a cache that contains the distance from any system to any other system, and when you add/remove bypasses or systems we need to recalculate this cache. This is further compounded by the fact not everyone has the same access to every bypass.
We have the "basic" cache which is only for hyperlane distances, and then we have a cache patch that adds distance through gateways accessible to that country. This country-specific cache needs to be emptied whenever a bypass gets added, and towards the end game, every country starts building gateways, leading to mass cache invalidations and reconstructions.
Add to that the fact that, invariably, the pathfinding itself becomes more complicated because you get many more ways to reach the same point.
Until we find a genius idea, I'm not sure we can do much to improve that. I've suggested removing gateways/wormholes/L-gates but for some reason nobody likes it when I suggest we remove features. Go figure!
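
The two-level cache described in the quote can be sketched roughly as follows. This is only an illustration under stated assumptions: distance is counted in hops, each country's bypasses are stored as plain system pairs, and `DistanceCache` and its methods are hypothetical names rather than anything from the game's code.

```python
from collections import deque


def hop_distances(graph, start):
    """Breadth-first search: number of jumps from start to every reachable system."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


class DistanceCache:
    def __init__(self, hyperlanes):
        self.lanes = {}
        for a, b in hyperlanes:
            self.lanes.setdefault(a, set()).add(b)
            self.lanes.setdefault(b, set()).add(a)
        self.base_cache = {}     # start -> distances, hyperlanes only
        self.country_cache = {}  # country -> {start -> distances}
        self.bypasses = {}       # country -> list of (system, system)
        self.rebuilds = 0        # counter, only for the demo

    def add_bypass(self, country, a, b):
        """Register a bypass and empty that country's cache."""
        self.bypasses.setdefault(country, []).append((a, b))
        self.country_cache.pop(country, None)

    def distance(self, country, start, goal):
        links = self.bypasses.get(country)
        if not links:
            cache, graph = self.base_cache, self.lanes
        else:
            cache = self.country_cache.setdefault(country, {})
            graph = None  # built below only on a cache miss
        if start not in cache:
            if graph is None:
                graph = {n: set(nb) for n, nb in self.lanes.items()}
                for a, b in links:
                    graph.setdefault(a, set()).add(b)
                    graph.setdefault(b, set()).add(a)
            cache[start] = hop_distances(graph, start)
            self.rebuilds += 1
        return cache[start].get(goal)
```

Every `add_bypass` call throws away everything cached for that country, so when many countries build gateways in the late game, the rebuild counter climbs quickly. That is the mass invalidation described above.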
Are you partitioning the connectivity graph for the pathfinder? If not, it's likely to bring significant improvements, at least for default settings. On default settings, the map is a graph of clusters of systems with a small number of exit points (including wormholes and gateways). So instead of dealing with a system-to-system cache, you would have a cluster-to-cluster cache and, for each cluster, a system-to-system cache (in total a much smaller size, which has obvious advantages). Cache updates will also be easier, since in most cases you would only need to recompute the cluster-to-cluster cache and the system-to-system cache in the affected cluster. There are some cases where it would be a bit more than that (if there are multiple paths to the same cluster with a difference smaller than the maximal distance within a cluster).
 
Currently, the pathfinding discussed in this thread is generally assuming (or at least I am assuming) graph distance, counted in number of nodes. So basically how many hyperlanes do you travel over. However, since there is travel in-system between hyperlane entrances, there is additional distance not being accounted for by this approach.
I was assuming that each system has its own local graph reflecting the travel time between the exit points. Since it is isolated and rarely changes it probably doesn't add much cost.

Generally speaking, caching frequently-accessed values is a way of increasing space complexity (your memory footprint) to decrease time complexity (the speed of the algorithm). In simple terms, looking something up is almost always faster than running an algorithm.
If only... :) On modern hardware, the time to access RAM or even cache is very significant in comparison to the time instructions take, so it's often faster to recompute values than to read them from somewhere. As long as you can recompute them without reading anything :) Unfortunately, in the pathfinding case it's not likely to help, since there isn't much to compute without reading. I wonder if the current pathfinding code spends most of its time waiting for loads (from cache or RAM)...