
Stellaris Dev Diary #170 - Performance and other technical issues

Hello, my friends! This is Moah, Tech Lead of Stellaris, typing. I can finally talk about what you’ve all been waiting for: How many new platypi will there be in Federations? After weeks of…

Well, apparently, I should be "more technical." But before we jump into the mysteries of the Stellaris code, I want to take the time to talk a little about the balance between adding new features, improving performance and stability – especially in terms of multiplayer and the dreaded out-of-syncs (dreaded at least by me).

The Delicate Balance
Stellaris, like most decently sized code bases, is like a complex game of Mikado or Jenga: every part is connected in some way to every other part. When you add a feature, you add more connections. If you’re careful, you add only a few; if you’re in a rush, you add a few too many. This generally leads to Unplanned Features (aka bugs). In addition, once we see them perform in the actual game, we tend to expand features in new, unexpected ways, leading to more Unplanned Features (tm).

Once we realize what is happening, we start being more careful. Maybe too careful. Checking too many things, too often, ensuring that this interaction that is supposed to never actually happen is actually not happening. Not now, not later. Not ever.

So you have removed the unplanned features, but the game is a bit, ah… too careful. Some would say slow.

So you remove some of these checks. You realize that you don’t need to loop around the galaxy, you can just loop around this one tiny planet. Then you go one step further, and think “well I can maybe do that check only every three weeks, and this calculation needed by all these checks, I could store it in here and reuse it until the next time it changes.”

So now the game isn’t so careful anymore, and we’re back in unplanned feature territory. But if the caching (storing/reusing calculations) happens at different times on different machines, you get slightly different results (like asking a developer for something before and after they had coffee).

Slightly different results are what OOS thrives on! Client and server disagree on a cost by 0.0001; compounded over time, that corvette is bought on the server but not on the client.
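To make that concrete, here is a minimal Python sketch of how a 0.0001 difference turns into a desync. It is an illustration only, not Stellaris code: `simulate`, `first_desync_day`, the income and the fixed-point `UNITS` scale are all made up for the example. Costs are kept as integers so the toy itself stays deterministic.

```python
UNITS = 10_000  # assumed fixed-point scale: 1 credit = 10_000 units


def simulate(cost, income, days):
    """Return the (ships, balance) state at the end of each day."""
    balance, ships, history = 0, 0, []
    for _ in range(days):
        balance += income
        if balance >= cost:  # buy a corvette as soon as it is affordable
            balance -= cost
            ships += 1
        history.append((ships, balance))
    return history


def first_desync_day(server, client):
    """Return the first 1-based day on which the two states differ, or None."""
    for day, (s, c) in enumerate(zip(server, client), start=1):
        if s != c:
            return day
    return None


# Same income on both machines; the client uses a cost that is 0.0001 higher.
server = simulate(cost=100 * UNITS, income=10 * UNITS, days=30)
client = simulate(cost=100 * UNITS + 1, income=10 * UNITS, days=30)
desync_day = first_desync_day(server, client)
```

On the day the server can exactly afford the ship, the client is one unit short, and from then on the two game states no longer match.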

So you remove your “smart” algorithm. You replace it with the correct algorithm. You lose half of what you gained in step 2 and reintroduce some bugs. Probably.
Rinse and repeat.

But enough about my morning routine! Let’s talk about…

Performance
Stellaris fans are like C++ programmers: performance is always on their mind. To be fair, it has also been on ours a lot lately. We know that it’s not all that it could be, especially in late game and with the bigger galaxies. With that in mind, we’ve taken time to improve performance in a bit more depth than we usually can. We looked at what was taking the most time, and as everyone knows that is…



Pops.

There are many reasons why pops consume a lot of time in Stellaris, but the main one is that by endgame we have SO MANY of them. SO So so so so many. And they do so much! Pops have to calculate how good they’d be at every job (they do so every 7 days). Then they have to fight every other pop on the planet to get the job they’re best at. They also have to check if they could have a specific ethic. If they could join a specific faction. How happy they are. How happy they could be. How happy they would be on that planet over there.
All these things trigger modifier calculations. If you remember my last dev diary, you know that modifiers are the only thing more numerous than Pops in Stellaris. And they all depend on each other. Calculating them is like pulling on a thread and getting the whole sweater.
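The sweater effect can be pictured with a small Python sketch. This is a toy model under loud assumptions: the `Modifier` class, its additive formula and the names `country_bonus`, `planet_bonus` and `pop_output` are hypothetical and say nothing about how the real modifier system is built. It only shows why asking for one value drags in everything upstream, and why one change upstream invalidates everything downstream.

```python
class Modifier:
    """Hypothetical modifier node: value = own base + sum of its inputs."""

    def __init__(self, name, base, inputs=()):
        self.name = name
        self.base = base
        self.inputs = list(inputs)
        self.dependents = []
        self.recalculations = 0
        self._cached = None
        for parent in self.inputs:
            parent.dependents.append(self)

    def value(self):
        # Recompute only when there is no cached result.
        if self._cached is None:
            self.recalculations += 1
            self._cached = self.base + sum(m.value() for m in self.inputs)
        return self._cached

    def set_base(self, base):
        self.base = base
        self._invalidate()

    def _invalidate(self):
        # Drop our cache and the cache of everything that depends on us.
        self._cached = None
        for child in self.dependents:
            child._invalidate()


country = Modifier("country_bonus", 1.0)
planet = Modifier("planet_bonus", 0.5, inputs=[country])
pop = Modifier("pop_output", 0.25, inputs=[planet])

first = pop.value()         # computes pop, planet and country once each
again = pop.value()         # served from the cache
country.set_base(2.0)       # one change at the top invalidates all three
after_change = pop.value()  # the whole chain is recalculated
```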


OK, but what did we actually do about it?
Well first, I’ll admit I may have been a bit pigheaded on the whole “we need to do the jobs distribution every day because we don’t know when new jobs are added.” We reexamined this assumption, and jobs distribution is now only done on demand. It was also rewritten to iterate over a lot fewer things.
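A minimal sketch of the "on demand" idea, in Python. Everything here is an assumption for illustration: the `Planet` class, the dirty flag and the trivial first-come matching are hypothetical stand-ins, not the actual job distribution code.

```python
class Planet:
    """Hypothetical planet that redistributes jobs only when something changed."""

    def __init__(self):
        self.pops = []
        self.jobs = []
        self.assignment = {}
        self.distribution_runs = 0
        self._dirty = False

    def add_pop(self, pop):
        self.pops.append(pop)
        self._dirty = True

    def add_job(self, job):
        self.jobs.append(job)
        self._dirty = True

    def daily_tick(self):
        # Only pay for the distribution when pops or jobs actually changed.
        if self._dirty:
            self._distribute_jobs()
            self._dirty = False

    def _distribute_jobs(self):
        # Placeholder matching: first pop gets the first job, and so on.
        self.distribution_runs += 1
        self.assignment = dict(zip(self.pops, self.jobs))


colony = Planet()
colony.add_pop("pop_a")
colony.add_pop("pop_b")
colony.add_job("miner")
colony.add_job("farmer")
for _ in range(30):
    colony.daily_tick()
runs_after_month = colony.distribution_runs  # one run covers the whole month

colony.add_job("clerk")
colony.daily_tick()
runs_after_new_job = colony.distribution_runs  # the new job triggers one more
```

The trade-off is the one described earlier: every code path that adds or removes a job or pop has to remember to set the flag, or the distribution goes stale.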

We also noticed a few triggers going through every pop of an empire to check if one or more are enslaved, decadent, or other things that can be tested at the species level. So we made new triggers to test these things at the species level. In the same spirit, we had events going through every ship to find a fleet, so we added triggers at the fleet level.
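The difference in work can be sketched in Python as follows. The data model (`Species`, `Pop`, `Empire`) and both functions are hypothetical and assume, as stated above, that being enslaved can be decided per species; they are not the real trigger implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Species:
    name: str
    enslaved: bool = False


@dataclass
class Pop:
    species: Species


@dataclass
class Empire:
    species: list = field(default_factory=list)
    pops: list = field(default_factory=list)


def any_pop_enslaved(empire):
    """Pop-level check: walk every pop. Returns (result, checks made)."""
    checks = 0
    for pop in empire.pops:
        checks += 1
        if pop.species.enslaved:
            return True, checks
    return False, checks


def any_species_enslaved(empire):
    """Species-level check: walk every species. Returns (result, checks made)."""
    checks = 0
    for species in empire.species:
        checks += 1
        if species.enslaved:
            return True, checks
    return False, checks


empire = Empire(species=[Species("Human"), Species("Blorg"), Species("Lithoid")])
for i in range(20_000):
    empire.pops.append(Pop(empire.species[i % 3]))

# Worst case for both: nobody is enslaved, so every element is visited.
pop_result = any_pop_enslaved(empire)
species_result = any_species_enslaved(empire)
```

Both checks give the same answer, but with 20,000 pops and three species the first one does 20,000 comparisons and the second does three.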

Second, we’ve also reworked the approach to checking if pops can change ethics (and also made it work again), or if they can join factions.

Finally, we’ve looked for (and found) opportunities to use more multithreading.

But enough talk! What’s the result? Well, if a picture is worth a thousand words, here’s the answer at 30000 words a second:


The video compares the performance of 2.5.1 “Shelley” to 2.6 “Verne” when running a save game from the community, which can be found attached to this post, with over 20000 pops. It was recorded on my work computer (Intel Core i7-7900X @ 3.30 GHz, 10 cores and 20 threads, and AMD R9 Fury). You won’t necessarily get the same results: the exact difference in performance will vary with your computer and with the exact situation in your own save games, of course. On average, we’ve found something between 15% and 30% improvement in late game situations.
This save is just ideal to showcase the impact of the pops improvement.



What is this average anyway? How do you know?
Well, we have synths playing the game all night, every night. In the morning, we check how far they were able to go. We also ask them how many errors they encountered, what their endgame looked like, whether they got any OOS and then put all of that in tables and graphs, with many colors. Then we wipe the synths, so they don’t ask pesky questions about souls and whatnot.



In conclusion
Although we keep performance in mind and do our best to keep it reasonable, we’re happy we had a chance to take a deeper dive into the issue. Hopefully the changes will spark as much joy for you as they did for us, and we’re looking forward to your feedback!

Next week will feature another dev diary about the other thing you’ve all been waiting for… MORE PLATYPI!

PS: The save file we're using is from the community, one of the performance threads. We are however unsure where we originally got it from. So if you recognize it, or if it's yours please tell us so we can credit you properly.
 

Attachments

  • perf_massive.sav
    4 MB
Thank you for the answer anyways :). Please go on with the good work! But as a general idea, why not dedicate a thread to every specific game task (as long as enough threads are available, of course)? That would maybe give much better (balanced) CPU usage and reduce lag. But I'm sure you already had that idea and there are reasons why it hasn't happened until now.
One of the problems is that the game is based on modifiers. Almost everything is a modifier, and modifiers depend on other modifiers, and you never know in what state they are, so we can't easily thread modifier calculations.
But you also have chains of dependencies, like pops depending on planets depending on countries, so you can't parallelize these side by side. Then fleets depend on systems, countries and maybe planets? etc...
It's not impossible, but it would be us saying "all right, let's stop working on Stellaris for six months while we rewrite the way we handle EVERYTHING."
At the end of the day, I don't think we need to do that to improve performance. Tactical usage of thread pools helps a lot, and I'm already thinking of several other places where we can get a great result with a much smaller rewrite.
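As a rough picture of thread pool usage around a dependency chain, here is a small Python sketch. It is purely illustrative: `update_in_stages`, `make_task` and the shared `state` dictionary are hypothetical, and the stage layout (countries, then planets, then pops) is a simplification of the chain described above. It only shows the structure: work inside a stage runs in parallel, while stages themselves run strictly in order.

```python
from concurrent.futures import ThreadPoolExecutor

state = {}  # shared results, keyed by object name (assumption for the demo)


def make_task(name, depends_on=None):
    """Build a task that reads its parent's result and stores its own."""
    def task():
        parent_value = state[depends_on] if depends_on else 0
        state[name] = parent_value + 1
        return name
    return task


def update_in_stages(stages, workers=4):
    """Run each stage's tasks on a thread pool, one stage after another."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for tasks in stages:
            # list(...) waits for the whole stage before the next one starts.
            results.append(list(pool.map(lambda task: task(), tasks)))
    return results


stages = [
    [make_task("country")],
    [make_task("planet_a", "country"), make_task("planet_b", "country")],
    [make_task("pop_1", "planet_a"), make_task("pop_2", "planet_b")],
]
results = update_in_stages(stages)
```

Each task only writes its own key and only reads keys from an earlier stage, which is what makes the parallel part safe in this toy. In CPython this layout will not speed up CPU-bound work because of the interpreter lock; the point is the ordering, not the timing.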

Are any of your synths purchased from a certain fruit-themed Megacorp? There have been consistent problems with desyncs and crashes on Macs in MP (Linux too, pretty sure). Purging some more of those unplanned features would be greatly appreciated by both myself and many other mp players.
We have fixed several non-Windows OOS. We're still on the lookout for more.

You just reminded me that the Empire Opinion galactic map filter is a good way to hang a game. Even when paused it grinds my machine down from 60 fps to 2-3 sometimes lol.
I actually remember seeing a JIRA about that and it getting fixed. So it should be much better in 2.6

Here is that one hard to read graph with a few labels.
You people are incredible!

If you want to keep the design of the game but offer speed to those who want it, why not make a slider that says half pop growth but x2 pop production, so those with slower systems can play?
Well there's a half formed idea of adapting "Game Rules" from CK2, etc and have a full on sliderfest for people who love that, but it hasn't been worked on yet.

The problem is that no hardware can handle 1k stars galaxy with 20+ AIs in the endgame.
Both my work and home machine handle this ok. YMMV and all that.

Okay, I did forget to ask: what is platypi?
It's one of the plurals of "platypus." One of the more controversial ones.

1) You mention desync issues in the dev diary, but you don't refer to it later. Does that mean you are still working on that? Or it is to be assumed that those got improved as well?

2) Would it be possible to make the job checking configurable ? Doesn't have to be in the GUI anywhere, it's enough to put it in a config file. I feel it will allow players to tweak the performance of the game to their liking while retaining the original feeling for those that might feel weird if their jobs don't update immediately. I would like this option for my multiplayer group, as one of my friends has a rather crappy PC. I think with this option enabled we could play more comfortably, while the 15-30% average performance increase in late game might be not enough for us.
So as I said during the stream and earlier in this post: we've worked on OOS. We've fixed several, we're unsure how many are left because it's hard to know for sure without a huge number of people playing at once.
 
... So I was right. :/

Unless one already owns a beefy system, it will be negligible and still a slow crawling mess.

Sorry, but "15%-30% on already good specs" is not what I had hoped for.

Time to shelve Stellaris for good, I guess. I already didn't have much hope in anything meaningful, but this... Meh. :(

Haven't started the game in weeks now. Time to finally uninstall for good and just stop looking back.


I think we need to find out what they mean by 15% to 30%. If they are counting the EARLY game numbers into that average as some charts show then they are short-changing the performance uplift of the late game.

Plus the i7 used at the time was costly but it's the equivalent of a 130$ CPU now if Tom's Hardware CPU performance rankings are to be taken seriously.
 
Performance improvements look promising but I also hope you decide to implement a feature to disable certain notifications because it's way too much. You were only streaming the game for about 10 minutes before it ended up looking like this.

View attachment 546887
I had an idea to group similar notifications together, but I haven't had time to work on it.
 
One of our betas asked me to post this picture. I quote their comments after the picture:
image (3).png

"@moah grab this picture and say that it is from i5-3570K based PC"
"hmm, the ratio of improved to old build showed slowdown up to year 2300 by about 10 percent on average... after that it was gaining speed, being 50% faster by year 2500
I think that can be explained by having more pops in first 100 years and improvements not kicking in yet and then it is winning, becoming super fast by the end. more than impressive"
"I think it is in fact linear and, again, tied to pops count, but due to wars happening at different times it has this funny shape (edited)
so in the end it is like two times faster, taking into account increased number of pops"
 
I actually remember seeing a JIRA about that and it getting fixed. So it should be much better in 2.6
Excellent! I think the last time I really used that Opinion map filter was near release ... of the base game lol.

Well there's a half formed idea of adapting "Game Rules" from CK2, etc and have a full on sliderfest for people who love that, but it hasn't been worked on yet.

Mmm i'd kill for more game/galaxy-sliders (and map filters). My dream:

5aQmVVM.jpg
 
Given that performance is a pretty difficult problem to actually deal with, this update exceeds my expectations. The videos show a late-game on a large galaxy that isn't perfect by any means, but runs smoothly enough to play decently. I think people are really hung up on the 15-30% gain, which is actually fairly substantial, particularly since it seems to be down-weighted by the early game (the ~50% gain in the late game on the graph above is comforting).

I'm kind of curious about what DD we are expecting next week though, and when we will hear about a release date. Given the additional DD, the earliest possible release date is now March 19th.
 
Honestly, congratulations! This is a promising step in the right direction and might make the lategame more bearable, but don't forget that performance is still improvable and you shouldn't stop optimising the game
 
On average, we’ve found something between 15% and 30% improvement in late game situations.
Just to clarify: is that with all the "Federations" features running, or just 2.5 + improvements (as, I assume, that video was run on)?
 
Reading 15-30% was a bit of a let down but then I watched the video and... oh my Lord! Hope! Can't wait to try it out! The only thing that would make me happier is if the next Diary was about AI improvements.

Keep up the great work! :)
 

I think a lot of people don't understand that 15-30% improvement was overall from 2200. The graph, esp. if it's from an older CPU, is much more telling. Here's the post for reference:
https://forum.paradoxplaza.com/forum/index.php?posts/26275405/
 
This DD makes me quite hopeful. Performance has been the biggest reason why I stopped playing Stellaris some time ago (the other being balance issues). Once I get this update in my hands I will probably buy Lithoids and Federations.
 
The additional performance comparison graph is very helpful to provide context, thank you! Great idea to show it

I, and I imagine tons of others, saw the discrepancy between what we saw on Twitch + YouTube and the posted numbers. Turns out the numbers were from game start and not just late game. While I'd prefer a 300% performance uplift, 10-50% during late game is still pretty good.
 
One of our betas asked me to post this picture.
That's helpful, especially considering that CPU is more within range of "average" PC.
the ratio of improved to old build showed slowdown up to year 2300 by about 10 percent on average...
It'll probably go unnoticed: pre-2300 performance is not something people usually complain about. IMO, the more even game speed is across the timeline, the better.
 
That's helpful, especially considering that CPU is more within range of "average" PC.

That's exactly what I'm running at this point.

Devil's Advocate: If I were Paradox I'd CONSIDER updating specifications for 2.6 and later. Right now their "recommended" machine for FEDERATIONS [according to Steam] is basically an 8 year old CPU with only 4 GB ram. I'm thinking MINIMUM should be that same 8 year old CPU with 8 GB ram. That MIGHT give the devs some headroom to tinker with performance more esp. if they can cache data better.
 
It'll probably go unnoticed: pre-2300 performance is not something people usually complain about. IMO, the more even game speed is across the timeline, the better.

No, sorry.

If the game actually gets slower in the early game with the improvements? That's pretty damn horrible! :(

This gets worse with every new piece of information.