Stellaris Dev Diary #149 - Technical improvements

Hi everyone, this is Moah. I’m the tech lead on Stellaris and today I’m here to talk about the free 2.3 "Wolfe" update that will be arriving together with Ancient Relics, and what it brings to the table in terms of tech.

Stellaris is going 64 bits.
People have been clamoring for this for a while now, and various factors have led us to finally do this for this patch. I should temper your expectations though: while many have claimed that this would be a miracle cure for all their issues with Stellaris, the reality is somewhat more tame.

What does it mean?
The one solid benefit is that Stellaris is no longer limited to 4 GB of memory, and won’t crash anymore in situations where it was reaching that limit. For people who play on huge galaxies, with many empires, many mods or well into the 3000s, this will be a boon.

In terms of performance, though, it doesn’t change much. Without drowning you in technical details, let’s just say that some things go faster because you handle more data at once, some things go slower because you have more data to handle. In the end, our measurements have shown no perceptible difference.

Finally, the last effect of switching to 64 bits is that the game will no longer be playable on 32-bit computers or OSes. We don’t think this will affect many people, but there you have it.


What about Performance?
I know that’s everyone’s favourite question, so let’s do our best to talk about it. First, let me dispel some notions floating around in various forums: Stellaris does use multithreading, and we’re always on the lookout for new things to thread. In fact between 2.2.0 and 2.2.7, a huge effort was made to thread jobs and pops, and it’s one of the main drivers of performance improvement between these versions.

Pops and jobs are indeed what’s consuming most of our CPU time nowadays. We’ve improved on that by reducing the number of jobs each pop evaluates. We’ve also found other areas where we were doing too much work, and cut down on:
  • Ships calculating their daily regeneration when they’re at full health
  • Off-screen icons being updated
  • Uninhabitable planets doing the same evaluations as populated planets
Why do these seemingly pointless things happen? Well, we generally focus on getting gameplay up and working quickly so that our content designers can iterate quickly, and sometimes things fall through the cracks. Some of these systems are also quite complex and the scale of the new code is not so easily apparent. Sometimes, not limiting the number of targets is good enough because you’re not doing much but then, months later, someone adds more calculations or the number of objects explodes for unrelated reasons, and suddenly you’ve got a performance issue.

Modifiers
One thing that sets Stellaris apart from other PDS titles is how much we use (or abuse) modifiers. Everything is a modifier. Modifiers are modified by other modifiers themselves modified by other modifiers, and sometimes by themselves. It’s quite hard to follow, and leads to every value being able to change at any time without your noticing.

“Why don’t you just compute jobs when a new one appears?” has often been asked around these parts. Well, a short answer to that is it’s really hard to know when a new job appears. You can get jobs from any modifier to: country, planet, pops. Each of these can get modifiers from ethics, traditions, perks, events, buildings, jobs, country, planets, pop, technology, etc.

Until now we were trying to calculate modifiers manually, forced to follow the chain in its entirety: when you recompute a country modifier, you then calculate their planets modifiers, and then each planet would recalculate their pops modifiers. Some of our freezes were just that tangled ball of yarn trying to sort itself out.

[Image: modifier flow chart]

This is our modifier flow chart. It’s not quite up to date, but it gives you an idea of the complexity of the system (unpolished because it’s a dev tool, and not made for the article).

No More!
For 2.3 “Wolfe” we have switched to a system of modifier nodes, where each node registers which nodes it follows and is recalculated when used, following the chain itself. This gives us modifiers that are more up to date and calculated only when needed. This also reduces the number of pointless recalculations.
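
As an illustration only, here is a minimal sketch of what a lazily recalculated node system could look like. It is not Stellaris's actual code: the `ModifierNode` class, its fields, and the "sum of base plus followed nodes" rule are all assumptions made for the example.

```python
class ModifierNode:
    """Hypothetical modifier node: caches its value and recomputes only when used."""

    def __init__(self, name, base=0, follows=()):
        self.name = name
        self.base = base
        self.follows = list(follows)   # nodes this node depends on
        self.followers = []            # nodes that depend on this node
        self.dirty = True              # True means the cached value is stale
        self.cached = 0
        self.recalculations = 0        # counter, only here to show the savings
        for node in self.follows:
            node.followers.append(self)

    def set_base(self, base):
        """Change this node's own contribution and mark dependants as stale."""
        self.base = base
        self.invalidate()

    def invalidate(self):
        # A stale node always has stale followers already, so we can stop early.
        if self.dirty:
            return
        self.dirty = True
        for node in self.followers:
            node.invalidate()

    def value(self):
        """Return the value, recalculating (and following the chain) only if stale."""
        if self.dirty:
            self.cached = self.base + sum(n.value() for n in self.follows)
            self.recalculations += 1
            self.dirty = False
        return self.cached


# Assumed example chain: country -> planet -> pop, values in percentage points.
country = ModifierNode("country", 10)
planet = ModifierNode("planet", 5, [country])
pop = ModifierNode("pop", 2, [planet])

print(pop.value())         # first use walks the whole chain
country.set_base(20)       # only marks nodes stale, nothing is recomputed yet
print(pop.value())         # recomputed now, because it is actually used
```

The point of the design is that changing a country value costs almost nothing by itself; the work happens later, and only for the nodes somebody actually reads.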

This system has shown remarkable promise, and cut the number of “big freezes” happening around the game (notably after loading, for example). It has some issues, but as we continue working with it, it’ll get better and help both with performance and our programmers’ sanity.

So, what’s the verdict?
In our tests, 2.3 “Wolfe” is between 10% and 30% faster than 2.2.7 right now. Hopefully it’ll stay that way until release, but the nature of the beast is that some of these optimizations break things and fixing the issues negates them, so we can’t promise anything.

[Image: performance measurements chart]


Measurements provided by @sabrenity , using detailed info from the beta build. It’s worth noting the “SHIPS_SERIAL” purple line has since been eliminated.

AI
Another forum favorite: we have made some improvements to the AI. First, with @Glavius’s permission, we’ve used his job weights to improve general AI job distribution. We’ve also done the usual pass of polish and improvements, and of course taught the AI how to use all our new features.


What else is new?
We’re also getting a new crash reporter that will send your crash reports as soon as they happen rather than the next time you start the game. We’ve improved our non-Steam network stack for connectivity issues, etc.


All right, enough of my yammering. This has turned into a GRRM-length novel, and even though there are many more areas we could cover, we’ll just turn this over for your perusal.
 
I'm so hyped about the update now!
I have been rooting for 64 bits for a long time now. The only gripe I have left is multiplayer file transfer rates: in the late game a transfer takes 10+ minutes, which makes many sessions impossible to arrange (especially if the game crashes or desyncs).
 
Do I not need one of these tables for each system with a Gateway? I guess having separate tables is okay for avoiding Concurrent Modifications, but I can't help feeling that simply filtering the first table would be easier.



I'd have to think about this a bit. The argument here is that the more times we update the tables, the less we have to update, and since we don't call BFS as much, there's less to compute? There are other sources of performance--such as updating access to gateways--that will limit the gains, but it would indeed seem that this would be better.



I hope devs are taking notes, and can implement something like this. It may not be possible in a short timeframe, given that we have no idea what the code base looks like, but it does look like a path forward. (NPI)
Ok, here's how this one second table works. As I mentioned there will also be a third table for L-Gates but it works exactly the same:
For each of your n systems you store the distance to its nearest Gateway and the Hyperlane leading to it. So exactly n entries and only one table. And this is the table you update. As more and more Gateways open you need to update more frequently, but each update is faster.

If you have enough Gateways that each system is at most 10 jumps from a gateway and you open a new one, the BFS will at most reach a depth of 10 before terminating.
 
Okay. Sorry for asking questions. So you have three tables in total. The 'big' one that we assume already exists, the Gateway one, and the L-Gate one. We need to figure out how to handle Wormholes. The issue with storing routes is that if we don't store all the routes, then we have to have a fourth table for Wormholes, right? The 'big table' doesn't know whether wormholes have been found.

Anyway, putting aside that wrinkle for the moment, when a ship/fleet requests a path from A to B, we would return the path with the minimum value among the following:

1) 'Big Table' directly. (Just hyperlanes)
2) A -> Gateway + Gateway -> B
3) A -> L-Gate + recursive call from L-Gate location?

(Not too sure about how L-Gates work. Don't all L-Gates connect to a single location in the L-cluster, so you would have to go to the L-Cluster to go to any other L-Gate?)

Now we need to work on the Wormhole issue.
 
Please please please put non-military crises on the cards for future :D it probably makes sense if we get a diplomacy overhaul at some point. Currently the mid and end game crises are only solvable by fighting. Having crises like plagues, galactic stock crash, hyper storms that rearrange lanes etc with diplomatic, economic and scientific solutions would be great.

Couldn't agree more.

And add disasters like supernovae which destroy parts of the galaxy.
 
Wormholes are practically Hyperlanes you need to open. To be more precise, Wormholes, like Hyperlanes and unlike gates, connect exactly two systems, and you can't block access. I'm not sure how exactly the existing Hyperlane table works, but I think wormholes could be treated like hyperlanes and thus be included.

No, the more complicated issue is closing borders and your access to a gateway being blocked by another empire. This could really result in a need to throw away both tables. I'm afraid I can't make any realistic assumption on whether or not this is an improvement without knowing how the existing code works, what it is used for and what causes it to update currently.

Edit: and yes, all L-Gates connect to the L-cluster, from where you could again reach any other L-Gate, so it's like Gateways: A -> L-Gate + L-Gate -> B
 
This right there is not correct. Opening a Gateway only affects the shortest distance calculation for systems in its vicinity. Let me elaborate:
Let's for a moment only use Hyperlanes (and Wormholes, which in this regard act like very long Hyperlanes).
Assume we have a table that gives us, for each two systems A and B, the distance from A to B (and of course the Hyperlane used on the shortest path). Needless to say, this table is symmetric, so it also gives us that for the route from B to A.

Now a Gateway opens. We create a second table, that for every System A gives the distance from A to the Gateway (and the Hyperlane). So far not a lot of calculations necessary.
Whenever a new Gateway opens we start doing a BFS from the system it is in. For each visited system there are two options:
  1. The new Gateway is closer to the visited system than its shortest distance to a previous Gateway. Then we update its entry in the shortest-distance-to-gateway table. Then we add all its neighbours to the BFS list.
  2. The visited system already has an equally long or shorter route to a previous Gateway. Then we don't add its neighbours to the BFS list.
After this we have an updated table that gives us shortest distances to gateways. Now for every two Systems A and B the shortest distance between them including both Hyperlanes and Gateways is Min(HyperlaneDistance(A,B), GatewayDistance(A) + GatewayDistance(B))
In both cases the tables deliver the next system on the route as well.
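
The update described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `open_gateway` and `shortest_distance`, the dictionary layout, and the seven-system example galaxy are made up for the example, distances are counted in jumps, and travel between two Gateways is assumed to cost nothing.

```python
from collections import deque


def open_gateway(hyperlanes, gateway_dist, next_hop, new_gateway):
    """Update the nearest-gateway table after a Gateway opens in new_gateway.

    hyperlanes:   dict system -> list of neighbouring systems
    gateway_dist: dict system -> jumps to the nearest Gateway (absent = none yet)
    next_hop:     dict system -> neighbour to jump to on the way to that Gateway
    """
    queue = deque([(new_gateway, 0, None)])
    while queue:
        system, dist, via = queue.popleft()
        # Option 2: an existing Gateway is at least as close, so stop expanding here.
        if dist >= gateway_dist.get(system, float("inf")):
            continue
        # Option 1: the new Gateway is closer, so update the entry and go on.
        gateway_dist[system] = dist
        next_hop[system] = via
        for neighbour in hyperlanes[system]:
            queue.append((neighbour, dist + 1, system))


def shortest_distance(hyperlane_dist, gateway_dist, a, b):
    """Min(HyperlaneDistance(A,B), GatewayDistance(A) + GatewayDistance(B))."""
    return min(hyperlane_dist[a][b], gateway_dist[a] + gateway_dist[b])


# Assumed example: seven systems in a row, 0 - 1 - 2 - 3 - 4 - 5 - 6.
hyperlanes = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
hyperlane_dist = {a: {b: abs(a - b) for b in range(7)} for a in range(7)}

gateway_dist, next_hop = {}, {}
open_gateway(hyperlanes, gateway_dist, next_hop, 0)  # first Gateway: full BFS
open_gateway(hyperlanes, gateway_dist, next_hop, 6)  # second one stops halfway

print(gateway_dist)
print(shortest_distance(hyperlane_dist, gateway_dist, 1, 5))
```

In the example the second BFS stops at system 3, because the first Gateway is already just as close to it, which is the reason each update gets cheaper as more Gateways open.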

Now we do the same with L-Gates. This opens up one problem: it's possible that the shortest route uses both a Gateway and an L-Gate. But in this case we know that we will use the shortest route between a Gateway and an L-Gate, and this can be included in the above calculations without much trouble.

Distance(A,B) = Min(HyperlaneDistance(A,B), GatewayDistance(A) + GatewayDistance(B), LGateDistance(A) + LGateDistance(B), LGateAndGateWayDistance(A,B))
With LGateAndGateWayDistance(A,B) = Min(GateWayDistance(A), LGateDistance(A)) + Min(GateWayDistance(B), LGateDistance(B)) + DistanceBetweenLGateAndGateway.
This only works if wormholes, gateways and L-Gates are hardcoded, right? The issue is that all those types are moddable, and there could be dozens of networks of each of them.
 
My concern with Wormholes is basically a special case of the closing-border issue: how do you know/update the 'big table' when you open a Wormhole? At a guess, one of the main reasons for slowdowns is that the default approach (with the big table) is what is used, but a check is made to see if the route is valid (there are no closed borders or wormholes in the way). If there is an obstruction, it just BFSs a route from A -> B. This will be slightly faster than regular BFS because you don't have to check any nodes in the graph for being the destination until you get to distance N, where N is the shortest path in the 'big table'.

Another option that might be faster would be to do BFS starting at A, and for every node, check its path in the 'big table' and then ask if that's the shortest open path to B. If it is, it has to be (one of) the shortest paths, and take that. If not, continue BFS. This will likely take less time than the approach in the preceding paragraph, because the node list to check likely grows slower in this case.

Edit:

Are you saying that it is possible to mod different types of Gateways, so that they are not interchangeable? The same approach could be used, but you would have n! different checks to make by Spartakus's algorithm where n is the number of different types of Gateways.
 
So this means techs costing 10 million points will be possible?
Let's see... googled 2^64 = 18 446 744 073 709 551 616. With the point shifted left by three digits for fixed-point arithmetic for in-game representation, the new cap would be 18 446 744 073 709 551 for unsigned and 9 223 372 036 854 775 for signed numbers.
So, ultimately, yes, 10M will be.
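
The arithmetic above can be checked with a few lines. This is a sketch under the post's own assumption that values are stored as integers with three decimal digits of fixed-point precision; the name `fixed_point_cap` and the `SCALE` constant are made up for the example, and the source does not confirm that the game actually stores tech costs this way.

```python
SCALE = 1000  # assumed: three decimal digits are reserved for the fraction


def fixed_point_cap(bits: int, signed: bool) -> int:
    """Largest whole number representable in a fixed-point integer of this width."""
    max_raw = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return max_raw // SCALE


print(fixed_point_cap(64, signed=False))  # 18446744073709551
print(fixed_point_cap(64, signed=True))   # 9223372036854775
print(fixed_point_cap(32, signed=True))   # 2147483
```

Under the same assumption, a signed 32-bit value would cap out at about 2.1 million, which is why 10 million would only fit with the wider type.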

I've been wanting a space flu kind of crisis, forcing nations to cooperate to solve an intergalactic extinction level event without fighting for a while. It's unfortunately not in the plans right now.
It shouldn't be exactly a crisis, but rather random events firing here and there, with very heavy weights for crime, low stability, unemployment, devastation and other bad stuff. With mechanics to "spread" such events via the migration network and trade routes. Just another countermeasure to snowballing and economy overheating, one that might lead to a downward spiral. If everyone's sick, then there's no one left to work.
Except robots.
Which could also be subject to some sort of scrap code pox, running in parallel to the bio pops' one.

Just another day-to-day sanitary cost.
 
Yikes! OK, but don't you still have a huge improvement by using a table for each of these networks? In an n-sized galaxy you only need n-sized tables for each network as opposed to an n*(n-1) sized table to cache all distances. And even in small galaxies the number of stars should be a lot bigger than the number of Gateway networks. So even if you're having M-Gates, O-Gates and Y-Gates it doesn't make a big difference.

Of course I don't really understand how your cache patch works or how moddable gateway networks really are, so I'm probably talking a bit too much out of my A-Gate here.
 
So far, I've only seen one mod adding a new gateway system - Ancient Caches of Technologies: Sins of the Fallen Empires, and it adds a fixed number of 4 gates along with a cluster.
 
Actually, you would need a new table, but otherwise it works. For every system with a gate of a given type, store the shortest distance and path to a gate of every other type.

Then, when you need a path, do the following (this example uses 5 networks):

Lookup A -> B, A -> G1, A -> G2, A -> G3, A -> G4, A -> G5

If A -> B is shortest, use that.

Otherwise,

Lookup the connections between each Gate type and the destination.

There will be n^2 of these (symmetry is broken once we start building paths piecewise). Trick is that once we move over a gateway, we won't use it again, because otherwise we would have used it more directly the first time.

So the number of lookups maxes out at the summation of i^2, where i = 1 to n

With 8 networks, that's a total possible number of checks of 204.

That seems pretty reasonable, right?
 
We haven't worked on that (yet). Unfortunately this is a problem for which we don't have a good solution yet.
Our issue is that we have a cache that contains the distance from any system to any other system, and when you add/remove bypasses or systems we need to recalculate this cache. This is further compounded by the fact that not everyone has the same access to every bypass.
We have the "basic" cache which is only for hyperlane distances, and then we have a cache patch that adds distance through gateways accessible to that country. This country-specific cache needs to be emptied whenever a bypass gets added, and towards the end game, every country starts building gateways, leading to mass cache invalidations and reconstructions.
Add to that the fact that, invariably, the pathfinding itself becomes more complicated because you get many more ways to reach the same point.
Until we find a genius idea, I'm not sure we can do much to improve that. I've suggested removing gateways/wormholes/l-gates but for some reason nobody likes it when I suggest we remove features. Go figure!

To be honest I do not know why you have that cache and calculate it all at once...
*you have your map, all stars - these are nodes.
*each node is connected to some other nodes and distance is defined for each connection
*for gate to gate distance is 0 or very small amount - they are special nodes
*some nodes are more important than others - ala chokepoints.
*when you build a graph of such important nodes number of them decreases significantly.
https://www.redblobgames.com/pathfinding/grids/algorithms.html
*check special nodes in close proximity when you calculate shortest path
*Use pathfinding algorithms - they can calculate 5000 nodes graph in milliseconds (for sure you know)
https://www.researchgate.net/figure...h-with-10-blocked-node-in-grid_tbl1_315509846
*people are calculating paths for 100.000 nodes and more
*here is a web page that shows different algorithms in action
https://github.com/qiao/PathFinding.js/
https://qiao.github.io/PathFinding.js/visual/
*the results are in milliseconds (10-30) - the animation is slower to show humans how it works (grid 40x60 = 2400 nodes - large galaxy), each node is connected to 6 around it - so similar to the number of hyperlanes from one star.


additionally:
*distance is bad - you should calculate by time needed to traverse from system A to B and add time needed for a fleet to go through the system (sublight speed) and number of jumps (cooldown counts) and node distances in a star system from one entry point to the other.
*it doesn't have to be perfect path - it has to be close enough - no one will notice!
*if you are building a cache with all distances for all stars, you are calculating it all in one moment. Why? Why do something 5000x? Why not calculate it on the fly when the distance is needed?

if you use on-demand pathfinding calculation new things will be possible for Stellaris - like:
*hyperlanes with different base speeds or even one-directional (A>B but not B>A)
*hyperlanes that cannot be used till a certain tech level (aka distances... or simplified warp...)
*hyperlanes that can be used only by a certain empire or group of empires (aka old wormholes...)
*unlimited hyperlane creation and destruction that can reshape the galaxy map
*some structures could modify the time multiplier for hyperlanes (aka inhibitors - 10x slower travel time... or faster)
all you need is to store nodes and have additional parameters for each node connection (like distance, speed modifier, list of restrictions/allowed techs/empires).
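
A minimal sketch of the on-demand idea, using Dijkstra's algorithm over connections that carry a travel time and an optional access restriction. Everything here is an assumption made for illustration: the `travel_time` function, the `(neighbour, time, allowed)` tuple layout, and the example empires are not taken from the game.

```python
import heapq


def travel_time(lanes, start, goal, empire=None):
    """Dijkstra on demand. Returns (total_time, path), or (inf, []) if unreachable.

    lanes: dict system -> list of (neighbour, time, allowed), where allowed is
           None (open to everyone) or a set of empire names that may use the lane.
    Lanes are one-directional: list them on both ends for a two-way connection.
    """
    best = {start: 0}
    previous = {}
    heap = [(0, start)]
    while heap:
        time, system = heapq.heappop(heap)
        if system == goal:
            # Walk the predecessor links back to the start to rebuild the path.
            path = [goal]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return time, path[::-1]
        if time > best.get(system, float("inf")):
            continue  # stale heap entry, a faster route was already found
        for neighbour, cost, allowed in lanes.get(system, []):
            if allowed is not None and empire not in allowed:
                continue  # this empire may not use the lane
            new_time = time + cost
            if new_time < best.get(neighbour, float("inf")):
                best[neighbour] = new_time
                previous[neighbour] = system
                heapq.heappush(heap, (new_time, neighbour))
    return float("inf"), []


# Assumed example: a fast lane A -> C that only "empire_1" may use.
lanes = {
    "A": [("B", 10, None), ("C", 4, {"empire_1"})],
    "B": [("D", 10, None)],
    "C": [("D", 4, None)],
    "D": [],
}
print(travel_time(lanes, "A", "D", "empire_1"))  # uses the restricted lane
print(travel_time(lanes, "A", "D", "empire_2"))  # has to take the long way
print(travel_time(lanes, "D", "A"))              # lanes are one-way here
```

Nothing is cached in this sketch, so opening or closing a lane only means editing `lanes`. Whether that beats a precomputed table depends on how often paths are requested, which the thread does not establish.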

I am no coder, and I might be wrong but this is what people do in the internet to calculate distances/time.

Regards,
TC
 
I wasn't expecting a 64-bit update so soon. Glad to see that, come 2.3, I will no longer be plagued by OOM crashes.


I have but one thing to say...

ScaredSkeletor.jpg

Funnily enough, killing 75% of all pops would resolve our performance issues.

Don't tempt me.

Calling Thanos.jpg


@Jamor @Moah Any word on if the bug that prevents Synthetically Ascended empires from choosing to build their own main species has been looked into yet? This bug is the bane of my existence, as my preferred empire is consistently screwed by it, forcing me to get creative with the console to salvage the game.

https://forum.paradoxplaza.com/foru...-synthetically-ascended-still-broken.1175938/
 
Could the new modifier system be used in other games in the future?
 
One problem is that different ships have different speeds, both due to jump time and sublight speed. You could probably do something like "[number of jumps required in path] x [charge time]", but also the point that it doesn't have to be perfect, just "good enough" is very true.
 
additionally:
*distance is bad - you should calculate by time needed to traverse from system A to B and add time needed for a fleet to go through the system (sublight speed) and number of jumps (cooldown counts) and node distances in a star system from one entry point to the other.

I think I am missing something here. Why is the set value of distance worse than the dynamic value of travel time?