Stellaris Dev Diary #170 - Performance and other technical issues

Hello, my friends! This is Moah, Tech Lead of Stellaris, typing. I can finally talk about what you’ve all been waiting for: How many new platypi will there be in Federations? After weeks of…

Well, apparently, I should be "more technical." But before we jump into the mysteries of the Stellaris code, I want to take the time to talk a little about the balance between adding new features, improving performance and stability – especially in terms of multiplayer and the dreaded out-of-syncs (dreaded at least by me).

The Delicate Balance
Stellaris, like most decently sized code bases, is like a complex game of Mikado or Jenga: every part is connected in some way to every other part. When you add a feature, you add more connections. If you’re careful, you add only a few; if you’re in a rush, you add a few too many. This generally leads to Unplanned Features (aka bugs). In addition, once we see new features perform in the actual game, we tend to expand them in new, unexpected ways, leading to more Unplanned Features(tm).

Once we realize what is happening, we start being more careful. Maybe too careful. Checking too many things, too often, ensuring that this interaction that is supposed to never actually happen is actually not happening. Not now, not later. Not ever.

So you have removed the unplanned features, but the game is a bit, ah… too careful. Some would say slow.

So you remove some of these checks. You realize that you don’t need to loop around the galaxy, you can just loop around this one tiny planet. Then you go one step further, and think “well I can maybe do that check only every three weeks, and this calculation needed by all these checks, I could store it in here and reuse it until the next time it changes.”

So now the game isn’t so careful anymore, and we’re back in unplanned feature territory. Worse, if the caching (storing and reusing calculations) happens at different times on different machines, you get slightly different results (like asking a developer for something before and after they had coffee).

Slightly different results are what OOS thrives on! Client and server disagree on a cost by 0.0001, the difference compounds over time, and eventually that corvette is bought on the server but not on the client.
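
To make the corvette example concrete, here is a toy sketch in Python. It is not Stellaris code: the `Machine` class, the drifting discount, the refresh schedule and every number in it are invented for illustration. Two machines run the same rules but refresh a cached modifier on different days, and they end up disagreeing on whether the corvette is affordable.

```python
from fractions import Fraction

BASE_COST = Fraction(100)  # hypothetical corvette base cost


def discount_on_day(day):
    """Hypothetical modifier that grows by 0.01% per day."""
    return Fraction(day, 10_000)


class Machine:
    """One participant in a multiplayer game (server or client)."""

    def __init__(self, first_refresh_day, refresh_every=7):
        self.first_refresh_day = first_refresh_day
        self.refresh_every = refresh_every
        self.cached_discount = Fraction(0)

    def tick(self, day):
        # The "smart" optimisation: recompute only on refresh days and
        # reuse the cached value in between.
        if (day - self.first_refresh_day) % self.refresh_every == 0:
            self.cached_discount = discount_on_day(day)

    def corvette_cost(self):
        return BASE_COST * (1 - self.cached_discount)

    def can_buy_corvette(self, budget):
        return self.corvette_cost() <= budget


def simulate(days):
    server = Machine(first_refresh_day=0)
    client = Machine(first_refresh_day=3)  # same rules, different cache timing
    for day in range(days + 1):
        server.tick(day)
        client.tick(day)
    return server, client


server, client = simulate(30)
budget = Fraction(9975, 100)  # 99.75 in the bank on both machines
print(float(server.corvette_cost()), server.can_buy_corvette(budget))
print(float(client.corvette_cost()), client.can_buy_corvette(budget))
```

With these made-up numbers the server prices the corvette at 99.72 and the client at 99.76, so only the server lets the purchase through. Exact fractions are used on purpose, to show that the divergence here comes from cache timing alone and not from floating-point rounding.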

So you remove your “smart” algorithm. You replace it with the correct algorithm. You lose half of what you gained in step 2 and reintroduce some bugs. Probably.
Rinse and repeat.

But enough about my morning routine! Let’s talk about…

Performance
Stellaris fans are like C++ programmers: performance is always on their mind. To be fair, it has also been on ours a lot lately. We know that it’s not all that it could be, especially in late game and with the bigger galaxies. With that in mind, we’ve taken time to improve performance in a bit more depth than we usually can. We looked at what was taking the most time, and as everyone knows that is…

Pops.

There are many reasons why pops consume a lot of time in Stellaris, but the main one is that by endgame we have SO MANY of them. SO So so so so many. And they do so much! Pops have to calculate how good they’d be at every job (they do so every 7 days). Then they have to fight every other pop on the planet to get the job they’re best at. They also have to check if they could have a specific ethic. If they could join a specific faction. How happy they are. How happy they could be. How happy they would be on that planet over there.
All these things trigger modifier calculations. If you remember my last dev diary, you know that modifiers are the only thing more numerous than Pops in Stellaris. And they all depend on each other. Calculating them is like pulling on a thread and getting the whole sweater.


OK, but what did we actually do about it?
Well, first, I’ll admit I may have been a bit pigheaded on the whole “we need to do the job distribution every day because we don’t know when new jobs are added.” We reexamined this assumption, and job distribution is now only done on demand. It was also rewritten to iterate over a lot fewer things.
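
One common way to get "only on demand" behaviour is a dirty flag, sketched below in Python. This is an assumption about the general technique, not the actual Stellaris implementation: the `Planet` class, its method names and the greedy assignment rule are all hypothetical.

```python
class Planet:
    def __init__(self):
        self.job_slots = {}    # job name -> number of slots
        self.pop_skills = {}   # pop name -> {job name: suitability score}
        self.assignment = {}   # pop name -> job name
        self.needs_distribution = False
        self.distribution_runs = 0  # counter, only here to show the saving

    def add_job_slot(self, job):
        self.job_slots[job] = self.job_slots.get(job, 0) + 1
        self.needs_distribution = True  # something changed: request a rerun

    def add_pop(self, name, skills):
        self.pop_skills[name] = dict(skills)
        self.needs_distribution = True

    def daily_tick(self):
        # Every-day approach: always call _distribute_jobs() here.
        # On-demand approach: only when a change was flagged.
        if self.needs_distribution:
            self._distribute_jobs()
            self.needs_distribution = False

    def _distribute_jobs(self):
        # Hypothetical greedy rule: best (pop, job) scores are served first,
        # with names as a tie-breaker so every machine picks the same order.
        self.distribution_runs += 1
        free_slots = dict(self.job_slots)
        self.assignment = {}
        candidates = sorted(
            (
                (score, pop, job)
                for pop, skills in self.pop_skills.items()
                for job, score in skills.items()
                if job in free_slots
            ),
            key=lambda c: (-c[0], c[1], c[2]),
        )
        for _score, pop, job in candidates:
            if pop not in self.assignment and free_slots[job] > 0:
                self.assignment[pop] = job
                free_slots[job] -= 1


planet = Planet()
planet.add_job_slot("miner")
planet.add_job_slot("researcher")
planet.add_pop("A", {"miner": 5, "researcher": 3})
planet.add_pop("B", {"miner": 4, "researcher": 1})
for _ in range(30):
    planet.daily_tick()
print(planet.distribution_runs, planet.assignment)
```

In this sketch thirty daily ticks trigger a single distribution run, because nothing changed after the first day. The price of the approach is that every code path that adds or removes jobs or pops has to remember to set the flag.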

We also noticed a few triggers going through every pop of an empire to check if one or more are enslaved, decadent, or other things that can be tested at the species level. So we made new triggers to test these things at the species level. In the same spirit, we had events going through every ship to find a fleet, so we added triggers at the fleet level.
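
The difference between a pop-level and a species-level check can be sketched as follows. This is a simplified Python model, not the game's trigger system: `Species`, `Pop`, `Empire` and both method names are hypothetical, and it assumes that the property lives on the species and that the empire's species list only contains species with at least one pop in the empire.

```python
from dataclasses import dataclass


@dataclass
class Species:
    name: str
    enslaved: bool = False  # assumed to be a species-wide setting


@dataclass
class Pop:
    species: Species


class Empire:
    def __init__(self, species, pops):
        self.species = species  # assumed: only species present in the empire
        self.pops = pops

    def any_pop_enslaved(self):
        # Pop-level check: may visit every pop in the empire.
        return any(pop.species.enslaved for pop in self.pops)

    def any_species_enslaved(self):
        # Species-level check: visits at most one entry per species.
        return any(species.enslaved for species in self.species)


humans = Species("Human")
blorg = Species("Blorg", enslaved=True)
pops = [Pop(humans) for _ in range(20000)] + [Pop(blorg)]
empire = Empire([humans, blorg], pops)
print(empire.any_pop_enslaved(), empire.any_species_enslaved())
```

Both checks give the same answer under the stated assumptions, but the first one walks up to 20001 pops while the second one looks at two species.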

Second, we’ve reworked the approach to checking if pops can change ethics (and also made it work again), or if they can join factions.

Finally, we’ve looked for (and found) opportunities to use more multithreading.

But enough talk! What’s the result? Well, if a picture is worth a thousand words, here’s the answer at 30000 words a second:


The video compares the performance of 2.5.1 “Shelley” to 2.6 “Verne” when running a save game from the community with over 20000 pops; the save can be found attached to this post. It was recorded on my work computer (Intel Core7-7900X @ 3.30GHz, 10 cores and 20 threads, and an AMD R9 Fury). You won’t necessarily get the same results: the exact difference in performance will vary with your computer and with the exact situation in your own save games, of course. On average, we’ve found an improvement of between 15% and 30% in late-game situations.
This save is just ideal to showcase the impact of the pops improvement.

What is this average anyway? How do you know?
Well, we have synths playing the game all night, every night. In the morning, we check how far they were able to go. We also ask them how many errors they encountered, what their endgame looked like, whether they got any OOS and then put all of that in tables and graphs, with many colors. Then we wipe the synths, so they don’t ask pesky questions about souls and whatnot.

In conclusion
Although we keep performance in mind and do our best to keep it reasonable, we’re happy we had a chance to take a deeper dive into the issue. Hopefully the changes will spark as much joy for you as they did for us, and we’re looking forward to your feedback!

Next week will feature another dev diary about the other thing you’ve all been waiting for… MORE PLATYPI!

PS: The save file we're using is from the community, from one of the performance threads. We are, however, unsure where we originally got it from. So if you recognize it, or if it's yours, please tell us so we can credit you properly.
 

Attachments

  • perf_massive.sav
    4 MB
 
...How speedy do you play grand strategy games if 13 hours is too long? 50 hours for 250 years seems like a far more realistic number (for a typical long campaign), if still quite speedy. I play slow, but I never thought I was glacial compared to others.
Stellaris is rather front-loaded on the decisions. I would probably spend a couple of hours planning the strategy before starting the game, but during the game there aren't many big decisions that would require thinking, so it's mostly small tactical decisions and busy-work. How much time to spend on the game is obviously subjective, but if I have 13 hours to spend on the game I would rather play 3-4 campaigns with different builds and strategies instead of playing just one (where most of my time is spent role-playing a glorified data entry clerk).
 
Finally! Now my friend can play wide on his toaster :D

I was wondering, are there any plans to allow for outfit customisation? E.g. in one of my imperial empires, the heir is using a different outfit to the starting ruler, and it does not fit with the theme of my empire.

Again, thanks for the hard work you all put in on Stellaris.
 
Nice to see the streams are back up, the performance improvements are looking nice. Keep up the good work, I’m upgrading to plain old optimistic. If the AI improvements work out well, then I’ll have a full smile :).
 
Would it make sense at this point to scale down the amount of pops? Make them half or one third of what they are now, and performance should benefit proportionally.
Of course that would mean adjusting pop growth, production rates, costs etc. across the board but that should not be a huge hassle I suppose?
 
Most importantly, it would make pop growth less important, and that is desperately needed.
 
I'm not sure if it will help you, but in my case nuking NVIDIA drivers with some 3rd party tool and then reinstalling them solved microfreeze issues both for Stellaris and another game.
Would you be so kind as to explain a little further? I've been using GeForce Experience and have always updated my GPU drivers 2-3 weeks after each release. I think the program deletes the old drivers before installing new ones. Do you suggest completely uninstalling the GPU drivers and GeForce Experience and trying it that way? And what kind of 3rd party app are you talking about? Why didn't you try just uninstalling GeForce Experience? Thanks in advance.
 
Well well well. Look what we've got here.

Nice to see this progress, kudos to you and your team Moah... this is some desperately needed improvement for Stellaris.
Heck, you might even get me to play again sometime in the near future!
 
Without showing CPU loads per thread this is almost useless. At some point the CPU will be maxed and you won't get any performance gains. A real comparison would be to run the test as he did, then run it again on a much "weaker" CPU and compare the gains per CPU. That tells you if your code actually helps for older CPUs. Once a CPU is maxed no code changes will result in any improvement.
 
Buried in the thread were some results with a 7-8 year old Core i5 3570K. It didn't give real-time-seconds-per-game-month BUT what it did was show % improvement. I think the 3570K showed a 50% improvement in year 2400. (??)

I agree with others asking that a "Beta" or "Performance Alpha" be released so that the users can get a better idea of how things perform on their specific computers. The gotcha is that there's likely a lot of "2.6" functional release code in that and I suspect Paradox isn't willing to showcase those pieces yet. Hopefully we'll get something soon that will answer a lot of the questions.
 
Does this pop management also address the way they are assigned?
Meaning, do they now actually go to the jobs they are best at, or is there still flickering, for example with enforcers when prioritized?
 
There is nothing else to discuss. The Literally Unplayable Bug is fixed. The patch is perfect in every imaginable way.