
Stellaris Dev Diary #149 - Technical improvements

Hi everyone, this is Moah. I’m the tech lead on Stellaris and today I’m here to talk about the free 2.3 "Wolfe" update that will be arriving together with Ancient Relics, and what it brings to the table in terms of tech.

Stellaris is going 64 bits.
People have been clamoring for this for a while now, and various factors have led us to finally do this for this patch. I should temper your expectations though: while many have claimed that this would be a miracle cure for all their issues with Stellaris, the reality is somewhat more tame.

What does it mean?
The one solid benefit is that Stellaris is no longer limited to 4 GB of memory, and won’t crash anymore in situations where it was reaching that limit. For people who play on huge galaxies, with many empires, many mods or well into the 3000s, this will be a boon.

In terms of performance, though, it doesn’t change much. Without drowning you in technical details, let’s just say that some things go faster because you handle more data at once, some things go slower because you have more data to handle. In the end, our measurements have shown no perceptible difference.

Finally, the last effect of switching to 64 bits is that the game will no longer be playable on 32-bit computers or OSes. We don’t think this will affect many people, but there you have it.


What about Performance?
I know that’s everyone’s favourite question, so let’s do our best to talk about it. First, let me dispel some notions floating around in various forums: Stellaris does use multithreading, and we’re always on the lookout for new things to thread. In fact between 2.2.0 and 2.2.7, a huge effort was made to thread jobs and pops, and it’s one of the main drivers of performance improvement between these versions.

Pops and jobs are indeed what’s consuming most of our CPU time nowadays. We’ve improved on that by reducing the number of jobs each pop evaluates. We’ve also found other areas where we were doing too much work, and cut down on:
  • Ships calculating their daily regeneration when they’re at full health
  • Off-screen icons being updated
  • Uninhabitable planets doing the same evaluations as populated planets
Why do these seemingly pointless things happen? Well, we generally focus on getting gameplay up and working quickly so that our content designers can iterate quickly, and sometimes things fall through the cracks. Some of these systems are also quite complex and the scale of the new code is not so easily apparent. Sometimes, not limiting the number of targets is good enough because you’re not doing much but then, months later, someone adds more calculations or the number of objects explodes for unrelated reasons, and suddenly you’ve got a performance issue.

Modifiers
One thing that sets Stellaris apart from other PDS titles is how much we use (or abuse) modifiers. Everything is a modifier. Modifiers are modified by other modifiers, themselves modified by other modifiers, and sometimes by themselves. It’s quite hard to follow, and leads to every value being able to change at any time without your noticing.

“Why don’t you just compute jobs when a new one appears?” has often been asked around these parts. Well, a short answer to that is it’s really hard to know when a new job appears. You can get jobs from any modifier to: country, planet, pops. Each of these can get modifiers from ethics, traditions, perks, events, buildings, jobs, country, planets, pop, technology, etc.

Until now we were calculating modifiers manually, forced to follow the chain in its entirety: when you recompute a country modifier, you then recalculate its planets’ modifiers, and then each planet recalculates its pops’ modifiers. Some of our freezes were just that tangled ball of yarn trying to sort itself out.

[Image: modifier flow chart]

This is our modifier flow chart. It’s not quite up to date, but gives you an idea of the complexity of the system (unpolished because it’s a dev tool, and not made for the article).

No More!
For 2.3 “Wolfe” we have switched to a system of modifier nodes, where each node registers which nodes it follows and is recalculated when used, following the chain itself. As a result, modifiers are more up to date and are calculated only when needed. This also reduces the number of pointless recalculations.
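
To make the idea concrete, here is a minimal sketch of lazily recalculated modifier nodes. It is not the Stellaris implementation: the class name `ModifierNode`, the additive combination of values, and the dirty flag are all assumptions made for illustration. Each node remembers which nodes it follows, a change only marks followers as stale, and the actual recalculation happens when a value is read.

```python
class ModifierNode:
    """One value in the modifier graph, recalculated lazily (hypothetical design)."""

    def __init__(self, name, base=0, sources=()):
        self.name = name
        self.base = base
        self.sources = list(sources)  # nodes this node follows
        self.dependents = []          # nodes that follow this node
        self.recalculations = 0       # counter for illustration only
        self._cached = None
        self._dirty = True            # nothing has been computed yet
        for source in self.sources:
            source.dependents.append(self)

    def set_base(self, value):
        """Change this node's own contribution and invalidate its followers."""
        self.base = value
        self._invalidate()

    def _invalidate(self):
        # A dirty node's followers are always dirty too, so we can stop early.
        if self._dirty:
            return
        self._dirty = True
        for dependent in self.dependents:
            dependent._invalidate()

    def value(self):
        """Return the current value, recomputing only if something changed."""
        if self._dirty:
            # Assumption: a node's value is its base plus its sources' values.
            self._cached = self.base + sum(s.value() for s in self.sources)
            self._dirty = False
            self.recalculations += 1
        return self._cached


# Country -> planet -> pop chain, values in percentage points.
country = ModifierNode("country", base=10)
planet = ModifierNode("planet", base=5, sources=[country])
pop = ModifierNode("pop", base=1, sources=[planet])

print(pop.value())    # 16: the whole chain is computed once
country.set_base(20)  # marks planet and pop as stale, computes nothing
print(pop.value())    # 26: recomputed only now, on demand
```

The point of the design is that changing a country modifier costs only a walk that flips flags. A planet or pop that nobody reads before the next change is never recalculated at all.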

This system has shown remarkable promise, and cut the number of “big freezes” happening around the game (notably after loading, for example). It has some issues, but as we continue working with it, it’ll get better and help both with performance and our programmers’ sanity.

So, what’s the verdict?
In our tests, 2.3 “Wolfe” is between 10% and 30% faster than 2.2.7 right now. Hopefully it’ll stay that way until release, but the nature of the beast is that some of these optimizations break things and fixing the issues negates them, so we can’t promise anything.

[Image: performance measurements chart]


Measurements provided by @sabrenity, using detailed info from the beta build. It’s worth noting the “SHIPS_SERIAL” purple line has since been eliminated.

AI
Another forum favorite: we have made some improvements to the AI. First, with @Glavius’s permission, we’ve used his job weights to improve general AI job distribution. We’ve also done the usual pass of polish and improvements, and of course taught the AI how to use all our new features.


What else is new?
We’re also getting a new crash reporter that will send your crash reports as soon as they happen rather than the next time you start the game. We’ve improved our non-Steam network stack for connectivity issues, etc.


All right, enough of my yammering. This has turned into a GRRM-length novel, and even though there are many more areas we could cover, we’ll just turn this over for your perusal.
 
Why would this not be efficient? I don't understand what is inefficient here
Let me put it this way: pathfinding on graphs (like hyperlane graphs with gateways and wormholes) is a well-studied problem and all the algorithms are known. All of them are based on brute force and as such usually depend exponentially on the number of vertices and edges (connections between vertices), except in special cases. A gateway network represents *a lot* of such edges.
 
Yes, I assumed that much. Now you basically turn the problem into: find a way to the destination or the nearest gateway, whichever happens first. Then stop and find a way from the destination to the nearest gateway. I can't imagine that the actual algorithm just makes a path like 100 jumps long and then, oops, didn't find the target, tries again, instead of going one jump away in all directions with each step. And then once you hit a gateway you suddenly have a few dozen more options to travel through, which is precisely the thing that would explain why performance goes to poop, compared to, well, not continuing from literally every accessible gateway with brute force.
 
If the gateway is not open to you then the system should not be open either, right? Either way it doesn't matter much: with my system you just need to hit an open gateway. The rest is the same, and as it worked before gateways were built it will work just the same afterwards, only you do 2 calculations from different starting points instead of 1 from a single point (which will bloat a graph a LOT more, I guess).
Not quite.
If you're at war with someone their gateways are closed to you, but their systems are accessible. (Give or take FTL inhibitor stations).
Yes, I assumed that much. Now you basically turn the problem into: find a way to the destination or the nearest gateway, whichever happens first. Then stop and find a way from the destination to the nearest gateway. I can't imagine that the actual algorithm just makes a path like 100 jumps long and then, oops, didn't find the target, tries again, instead of going one jump away in all directions with each step. And then once you hit a gateway you suddenly have a few dozen more options to travel through, which is precisely the thing that would explain why performance goes to poop, compared to, well, not continuing from literally every accessible gateway with brute force.
This doesn't work.
If the path to the destination is 20 jumps, and the nearest gateway to the starting location is 15 away, and the nearest gateway to the destination is also 15 away, then your solution reaches gateways after 15 jumps from each end, assumes that is closest, and terminates the search. In fact this means the gateway path is 30 jumps (plus the gateway/gateway link) rather than the *real* closest path which is 20 jumps.
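
The counterexample above can be checked with a small breadth-first search. This is only a sketch on a made-up map: the system names, the `jumps` and `add_chain` helpers, and the choice to count the gateway-to-gateway link as one jump are assumptions for illustration, not anything taken from the game.

```python
from collections import deque


def jumps(adjacency, start, goal):
    """Number of jumps on a shortest path, or None if the goal is unreachable."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for nxt in adjacency.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


def add_chain(adjacency, names):
    """Connect consecutive systems in `names` with two-way hyperlanes."""
    for a, b in zip(names, names[1:]):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)


lanes = {}
# Main route: m0 .. m20, i.e. 20 jumps from start to destination.
add_chain(lanes, [f"m{i}" for i in range(21)])
# A gateway 15 jumps from the start, and another 15 jumps from the destination.
add_chain(lanes, ["m0"] + [f"a{i}" for i in range(1, 15)] + ["gate_A"])
add_chain(lanes, ["m20"] + [f"b{i}" for i in range(1, 15)] + ["gate_B"])

direct = jumps(lanes, "m0", "m20")
# "Stop at the first gateway from each end", gateway link counted as 1 jump.
stop_at_gateway = jumps(lanes, "m0", "gate_A") + 1 + jumps(lanes, "m20", "gate_B")

add_chain(lanes, ["gate_A", "gate_B"])  # now search with the gateway link as an edge
full_search = jumps(lanes, "m0", "m20")

print(direct, stop_at_gateway, full_search)  # 20 31 20
```

On this map, terminating at the first gateway from each end yields a 31-jump route, while a search that keeps going finds the 20-jump hyperlane route.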
 
Yes, but then you can also simply check for a path that ignores gateways altogether (we know that works without lag, because there is no reported problem before gateways are mass produced) and then compare the 2. The worst case should be triple the effort, which definitely wouldn't cause lag as we see it. And true, I forgot the war case, though when you only consider open gateways (and systems) it doesn't matter anymore.
 
So we aren't terminating the search when we hit the gateway, like you suggested?

How does this not simply add more load to the search program than we currently have?
 
Not precisely. We do terminate the search when we hit a gateway, but only the search that includes gateways, not the one that simply ignores them (worst case, this takes about triple the effort of a search without gateways). But as gateways are a problem and it's definitely not a huge deal to triple the effort, this suggestion should be a lot better.
 
STOP whenever pathing hits a gateway first, as every gateway is connected to every other gateway
While looking up a route, you need to check every suggested system for:
- closed borders;
- restriction of movement (player may set this manually);
- presence of hostile fleets or stations, if the fleet stance is set to 'evasive' or you are looking up a route for a trade route;
- presence of a hostile FTL inhibitor, if the suggested system is not the target system;
- maybe more, if I forgot something;
- maybe even more, if mods may add something in.

Plus, you have to re-check a route (meaning re-check all of the above) pretty frequently - you may see it in-game when fleets and ships take another route due to changes in actual conditions.

Besides causing massive pipeline flushes due to branching, that many conditions "break" any heuristic.
Plus, every empire has its own set of conditions stored, as each has its own visibility, closed borders, hostility conditions, etc. Meaning, every time pathfinding is called for a route, there are massive CPU cache misses due to the enormous (for the caching subsystem) amount of data to look things up in.

And all this is just about checking a system. You also have to check if nodes are connected *and* if the would-be user of a route can use the connections present (restrictions, techs, projects, local conditions, etc). Pathfinding also must find an at least somewhat optimal route, in a mess of base hyperlane map, junctions, conditions and who knows what more.
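
As a rough illustration of the per-system checks listed above, here is a sketch of a breadth-first route search that asks an empire-specific predicate before entering each system. Everything in it is hypothetical: the `System` fields, the `can_enter` rules, and the empire names are invented for the example and simplify the real conditions a great deal (for instance, an inhibitor here simply blocks passing through a system that is not the target).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class System:
    name: str
    closed_borders_to: set = field(default_factory=set)  # empires denied entry
    hostiles_for: set = field(default_factory=set)       # empires that see hostile fleets here
    inhibitor_against: set = field(default_factory=set)  # empires blocked by an FTL inhibitor


def can_enter(system, empire, target, evasive):
    """Per-empire accessibility check, evaluated for every candidate system."""
    if empire in system.closed_borders_to:
        return False
    if evasive and empire in system.hostiles_for:
        return False
    if empire in system.inhibitor_against and system.name != target:
        return False
    return True


def find_route(systems, lanes, empire, start, target, evasive=False):
    """Shortest route in jumps as a list of system names, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            route = []
            while current is not None:
                route.append(current)
                current = parent[current]
            return route[::-1]
        for neighbour in lanes[current]:
            if neighbour in parent:
                continue
            if not can_enter(systems[neighbour], empire, target, evasive):
                continue
            parent[neighbour] = current
            queue.append(neighbour)
    return None


systems = {name: System(name) for name in "ABCDE"}
lanes = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
         "D": ["B", "E"], "E": ["C", "D"]}
systems["B"].closed_borders_to.add("blorg")

print(find_route(systems, lanes, "humans", "A", "D"))  # ['A', 'B', 'D']
print(find_route(systems, lanes, "blorg", "A", "D"))   # ['A', 'C', 'E', 'D']
```

The same map gives different routes for different empires, which is why a single precomputed result cannot simply be shared between them in this sketch.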
 
Have you read the message from Moah where he, a Paradox employee, states that gateways and L-gates cause the problem with pathfinding? Because everything else you wrote is not about gateways, with the exception of connections of course. So with all that in mind, and given the obvious fact that smaller maps should make pathfinding a lot easier and yet there is still a problem with gateways on medium (or whatever size), do you have any reason I overlooked why simply checking for

1. One path that entirely ignores gateways
2. Two paths that only go to the closest gateways from start and destination

does more than triple the effort (excluding the minimal effort of comparing the two ways)? Because tripling itself surely isn't nice, but it still shouldn't mean that everything lags now.
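
For reference, the two-candidate comparison described in this post can be sketched as follows. It rests on strong assumptions that are not confirmed anywhere in the thread: hyperlanes are two-way and cost one jump each, every gateway in `gateways` is open to the traveller and linked to every other gateway at a fixed `gateway_cost`, and no per-empire restrictions apply. The function names are made up. For simplicity it runs two full breadth-first searches instead of stopping each one at the first gateway.

```python
from collections import deque


def distances_from(lanes, start):
    """Jump counts from `start` to every reachable system, hyperlanes only."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in lanes.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def shortest_jumps(lanes, gateways, start, goal, gateway_cost=1):
    """Minimum of the gateway-free route and the nearest-gateway route."""
    from_start = distances_from(lanes, start)
    from_goal = distances_from(lanes, goal)
    candidates = []
    # Candidate 1: a path that ignores gateways entirely.
    if goal in from_start:
        candidates.append(from_start[goal])
    # Candidate 2: nearest gateway to the start, one hop, nearest gateway to the goal.
    to_gate = [from_start[g] for g in gateways if g in from_start]
    from_gate = [from_goal[g] for g in gateways if g in from_goal]
    if to_gate and from_gate:
        candidates.append(min(to_gate) + gateway_cost + min(from_gate))
    return min(candidates, default=None)


# A straight line of 11 systems, s0 .. s10 (10 jumps end to end).
lanes = {f"s{i}": [] for i in range(11)}
for i in range(10):
    lanes[f"s{i}"].append(f"s{i + 1}")
    lanes[f"s{i + 1}"].append(f"s{i}")

print(shortest_jumps(lanes, set(), "s0", "s10"))         # 10
print(shortest_jumps(lanes, {"s1", "s9"}, "s0", "s10"))  # 3
```

Under those assumptions, taking the gateway network more than once can never beat a single hop, so the smaller of the two candidates is the true shortest jump count. Once gateways differ per empire, or other conditions apply along the way, that shortcut no longer holds as written.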
 
From an algorithmic point of view gateways and L-gates are just a fancy set of hyperlanes with dynamically set accessibility. They cause problems because they connect locations that are otherwise far away from each other.
Not because of the distance. But because if you have 15 open gateways, then yeah, have fun continuing from EVERY one of those trying to find the target location by the shortest way, as suddenly you basically have to consider a lot more places at once. And due to wormholes or spiral galaxies that lack connections, you can't just say "yeah, but that's on the other side of the galaxy, so there is noooo way this would be a shorter way"...
 
Yes, I've read all the developers' replies in this DD. I've also read the whole topic (a rare case, tbh).

Gateways, L-gates, wormholes and other junctions aren't exactly the problem. The problem has two sides:
- every empire has its own set of accessibility conditions (diplomacy, techs, etc);
- the base hyperlane map and layers of junctions can't be merged into a single pre-calculated map for pathfinding due to differential conditioning and modifiability of the game.

Those sides aren't problems in themselves, just the technical side of gameplay features. In order to describe and store all this, you have to store a (relative) tonload of data. The problem is the unpredictable usage of that data: which chunks will be requested, and when, if at all. Plus massive conditional branching. Plus the latency of loading data from RAM to the CPU.

As for any heuristic (not only yours), it has already been clearly stated:
for a 1000 star system an optimal heuristic might be worse than none at all.
 
Yes, and that's all true, but as written in dev diary 150, in the optimization section, last point: "Fixed a case where every pop would check for faction membership on every daily tick, creating a performance drain". Stupid crap happens, and since the problem appears with gateways, I simply assumed that this is because with gateways you now have (depending on the number of open gateways) like 15 new starting locations... have fun calculating that stuff. And if that is indeed the case, the very simple solution is to calculate the way from both the end and the start, always stop once you hit a gateway, and only continue from the other places until both searches hit each other. Depending on what's better, either repeatedly check whether they have hit each other or, the "lazy" yet maybe still better option, let only one continue as it normally would until it either hits the target or is certainly longer than simply combining the two paths to the gateways. If I am wrong and it's because of the stuff you described, well, I think we might never see a solution. But again (I'm getting tired of this crap, because so many people doubt my idea will work but none of them are actual Paradox employees, so who knows), they stated that removing gateways would help a lot...

Edit: The optimal heuristic can never be worse than none at all, simply because the optimum would then be none at all, making them equal :p
 
The thing is that even making the effort 10% more than the current system is unacceptable.
You appear to think that increasing the effort is acceptable, and not to see that this will make things lag more than they do now.
 
Are you crazy? Which lunatic would want to increase the amount of CPU usage? And which part of GATEWAYS CAUSE THE PROBLEM do you not understand? I simply mean that with the lag problem that gateways cause, it seems that they don't just triple the effort but increase it by even more than a factor of 3 (once a lot are built).