Performance Megathread

Less2 · Dec 1, 2019

The Founder said:
And then you loose all that performance gain and then some on the CPU, OS and game having to manage 200 Threads.
Congrats. You just made the code more complex/prone to error. More memory demanding. And most importantly, slower.

Do not feel bad. That is literally the easiest, most common mistake of Multitasking and -threading.
Seriously, it is among my top 5 responses by count on StackOverflow.

And btw, very important things about the planet that affect other planets can change anytime in the month. Stuff like being occupied, for example.

Maybe you shouldn't try and act like a know-it-all when you are too illiterate to understand the difference between "lose" and "loose".

No, the game would absolutely not be slower if it was properly multi-threaded so that job processing could expand to fill all available CPU time.

The Founder · Dec 1, 2019

Nigro Angelicus said:
In my humble opinion human life is finite. Therefore I believe that a game that might take more than a lifetime to be played could be described as "unplayable for a human being". The time needed for a new game to reach the end got longer and longer over time. If this trend will go on, sooner or later, a single human life might not be enough to play it.

Considering the game has no definitive end anyway, the possible runtime is already longer than a human lifetime. It has been since 1.0.

Sorry. The only thing you proved is that you have no failure conditions for your theory.

Less2 said:
Maybe you shouldn't try and act like a know-it-all when you are too illiterate to understand the difference between "lose" and "loose".

No, the game would absolutely not be slower if it was properly multi-threaded so that job processing could expand to fill all available CPU time.

I do know the concept of Multitasking Slowdown.
All you seem to know is "you know nothing".

So how about you show me some pseudo-code for this, so I can show you all the places where your theory falls down?

BlackUmbrellas · Dec 1, 2019

Nigro Angelicus said:
Nevertheless, an unplayable game cannot be played; therefore if there is not a way to fix the performance issues without “destroying” the game or without asking the Devs to work for free, then there is not a "feasible" way to make the game playable again.

The game isn't unplayable, though. Does it lag as time goes on? Yes. Does that prevent it from being played? No, not really. The point where lag becomes too frustrating for someone to keep playing is entirely subjective.

McAwesome123 · Dec 1, 2019

GnoSIS said:
3. Breakthrough.

By changing how I played, I managed to speed up late game performance 500%+. In fact, on tiny it feels like the game runs almost like day 1 of a new game/galaxy and it seems like there is no upper bound to the number of pops you can have!! You can check the attached autobots save below, load it up, see it and play it for yourself. It has about 70 colonies and 4.5 – 5k POPs.

I wanted to scale this up and was preparing to try another test game/empire, but then AlkincTeos supplied the performance megathread in the forums with a save on the latest version of the game (with Lithoids DLC activated) that was slightly modded and had 18.5k pops and 261 colonies in it. I had to give it a go and, as I posted yesterday, the results were equally stunning. I would never have considered being able to play with 18k POPs on my hardware, let alone having it play butter smooth. Here is the original post with the attachments. At the end of the post, you can download and try the new saves.

https://forum.paradoxplaza.com/foru...ance-megathread.1253705/page-19#post-25965104

4. The POPs are innocent: How it actually works (approximately!).

Everyone, including me, stated that the problem was the number of pops and the fact that the game spends too much time processing them. That is wrong. The true problem is vacant jobs. When a planet has no vacant jobs, no processing is taking place, as in:

“When a planet has no vacant jobs, the game engine doesn’t touch a single POP datum. It doesn’t even read them from ram. It’s zero processing”

The pops don’t check for jobs. It’s the other way around: the jobs check for pops, and we all know how this goes:

a. Every month:

b. For each vacant job:

c. Check if even employed pops can fill it – check all pops.

So, if you have a colony with even a single job and 300 pops, that’s 300 checks, including:

i. Pulling all pop data including all traits.

ii. Pulling population rights, and calculating if a pop can be eligible for promotion.

iii. Calculating and comparing weights for both jobs: the job you are looking for and the job the current employed pop candidate you are inspecting is already doing.

Performance is O(c*v*p), where c is the number of colonies, v the number of vacant jobs per colony and p the number of pops per colony.
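The loop described above can be sketched roughly as follows. This is only a model of the description in this post, not the actual engine code: the `Pop` and `Colony` types and the `can_fill` check are hypothetical stand-ins, and the sketch assumes no grouping or caching at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pop:
    job: Optional[str] = None  # None means the pop is unemployed


@dataclass
class Colony:
    vacant_jobs: List[str] = field(default_factory=list)
    pops: List[Pop] = field(default_factory=list)


def can_fill(pop: Pop, job: str) -> bool:
    """Hypothetical stand-in for the trait, rights and weight comparison."""
    return pop.job != job


def monthly_job_checks(colonies: List[Colony]) -> int:
    """Count pop/job pair checks for one monthly tick under the described model."""
    checks = 0
    for colony in colonies:              # c colonies
        for job in colony.vacant_jobs:   # v vacant jobs on this colony
            for pop in colony.pops:      # p pops, employed ones included
                can_fill(pop, job)
                checks += 1
    return checks


# One vacant job and 300 employed pops: 300 checks, as in the example above.
busy = Colony(vacant_jobs=["researcher"], pops=[Pop(job="miner") for _ in range(300)])
# Same pops but no vacant jobs: the inner loops never run.
full = Colony(pops=[Pop(job="miner") for _ in range(300)])

print(monthly_job_checks([busy]))  # 300
print(monthly_job_checks([full]))  # 0
```

If the model is right, the second case is why filling every job would matter so much: with no vacant jobs the pop list is never walked.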

Now, I really hope this is optimized. I don’t know if the game groups jobs and pops for performance; if it does, the performance model changes. The resettlement window shows pops grouped. I’m not sure if this is just a visual thing or if it has engine significance. I'm also not sure what caching is done, if any, and whether it can be relevant at these large data set numbers – more on that later.

Case study: A colony has 40 vacant jobs in 4 groups (4 different job types) and 200 existing pops working in 20 job groups:

a. No job grouping, no pop grouping: 40x200 = 8000 pair checks – Your CPU is suffering.
b. No job grouping, pop grouping: 40x20 = 800 pair checks – Still a lot!!!
c. Both pop and job grouping: 4x20 = 80 pair checks – Manageable? How about doing it for 3000 colonies?

Note: pop rights and societal strata make the actual numbers smaller – you don’t check if slaves can fill researcher jobs, or at least I hope these optimizations are already implemented! Only the devs know, and only the source code could reveal that.
Note 2: Colonies with low numbers of pops make this calculation cheaper.
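The arithmetic of the case study can be reproduced with a few lines of Python. The function name and parameters are made up for illustration; the numbers are the ones from the case study, and whether the engine groups anything is still unknown.

```python
def pair_checks(vacant_jobs: int, pops: int, job_groups: int, pop_groups: int,
                group_jobs: bool, group_pops: bool) -> int:
    """Pair checks for one colony, with or without grouping on each side."""
    jobs_side = job_groups if group_jobs else vacant_jobs
    pops_side = pop_groups if group_pops else pops
    return jobs_side * pops_side


# 40 vacant jobs in 4 job types, 200 pops working in 20 job groups.
case = dict(vacant_jobs=40, pops=200, job_groups=4, pop_groups=20)

no_grouping = pair_checks(**case, group_jobs=False, group_pops=False)
pops_grouped = pair_checks(**case, group_jobs=False, group_pops=True)
both_grouped = pair_checks(**case, group_jobs=True, group_pops=True)

print(no_grouping)          # 8000
print(pops_grouped)         # 800
print(both_grouped)         # 80
print(both_grouped * 3000)  # 240000 for 3000 such colonies
```

Even the best case scales linearly with the number of colonies, which is the point of the "3000 colonies" question.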

Now there seem to be break points. In my experiments I tried minimizing vacant jobs early, but it seems that you must be thorough: the game speeds up only once the number of free jobs is very small. You have to take the AI colonies into account here as well. It may be that:

a. Caching is there and helps up to a specific point, then the engine gives up and dies.
b. Data structures that assist in job calculations suffer as they grow larger, so if you cross a threshold as free jobs are reduced and they get downsized, things become dramatically faster.
c. Other unaccounted engine stuff that I can’t know about.

I have decided to run some experiments based on this.

I modified Nexus Districts to give 1k housing and 1k jobs as such:

Code:

    planet_modifier = {
        planet_housing_add = 1000
        job_maintenance_drone_add = 300
        job_technician_drone_add = 220
        job_mining_drone_add = 200
        job_agri_drone_add = 140
        job_alloy_drone_add = 70
        job_calculator_add = 70
        planet_crime_add = -10000
        planet_crime_no_happiness_add = -10000
    }

With this, I ran 3 tests in each of 3 scenarios, for each of 5 pop "setups".
They all had:
Default Galaxy settings, except no default, fallen and marauder empires
Machine Empire
1 colony
Just over 10k pops
10 ± x Nexus Districts
1 Drone Storage building

All tests were run using:
'ticks_per_turn 3600'
'one_year'
(console commands)

The 3 scenarios:
A - 10 Nexus Districts
B - +2 Nexus Districts
C - -2 Nexus Districts

The 5 pop "setups":
1 - 1 species (1 machine)
2 - 2 species (1 machine, 1 bio)
3 - 3 species (1 machine, 2 bio)
4 - 6 species (2 machine, 4 bio)
5 - 15 species (3 machine, 12 bio)

Setups 2 to 5 use almost exactly the same save, except that some random pops have had their species changed to another.
Setup 1 is tested from 2200.02.01 (I believe), while the others are tested from the start of 2201.
However, for a reason that I do not know, 4 and 5 had a huge lag spike (several minutes long) at 2201.01.02, 1 day after the save I tested from. As such, any tests for 4 and 5 start from 2201.01.02 instead of 2201.01.01.

The results of my tests:

Code:

1A - 29.089s, 29.761s, 30.59s  (29.813s)   1B - 29.776s, 29.916s, 30.586s (30.093s)   1C - 29.335s, 29.096s, 32.843s (30.425s)
2A - 25.371s, 27.985s, 27.175s (26.844s)   2B - 28.684s, 27.476s, 27.456s (27.872s)   2C - 31.791s, 28.389s, 29.175s (29.785s)
3A - 27.181s, 27.557s, 28.969s (27.902s)   3B - 27.086s, 27.3s,   27.396s (27.261s)   3C - 30.438s, 29.267s, 31.728s (30.478s)
4A - 27.054s, 33.898s, 27.908s (29.62s)    4B - 31.222s, 32.879s, 35.092s (33.064s)   4C - 34.141s, 33.837s, 27.878s (31.952s)
5A - 34.304s, 30.845s, 30.434s (31.861s)   5B - 29.302s, 39.085s, 32.354s (33.58s)    5C - 40.258s, 27.363s, 27.206s (31.609s)

Control (game without any modifications) - 6.876s, 6.462s, 6.493s (6.61s)

For me, these results seem a bit inconclusive.

The only conclusion that I have from this is that having more pops drastically decreases performance.

Now, this is by no means a perfect test. For one, the exact times vary by a lot in some cases, and also, I'm only using 1 planet.
But it does not seem like having a ton of free jobs decreases performance. At least not in these conditions.
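For anyone who wants to slice the numbers differently, here is a small Python sketch that averages the timings per scenario and compares them with the control. The timings are copied from the table above; the helper name is made up. Reading B as "more vacant jobs" and C as "more unemployed pops" is an inference from the setup (each Nexus District adds 1k jobs), not something that was measured directly.

```python
from statistics import mean

# Seconds per in-game year, copied from the results table in this post.
runs = {
    "1A": [29.089, 29.761, 30.59],  "1B": [29.776, 29.916, 30.586], "1C": [29.335, 29.096, 32.843],
    "2A": [25.371, 27.985, 27.175], "2B": [28.684, 27.476, 27.456], "2C": [31.791, 28.389, 29.175],
    "3A": [27.181, 27.557, 28.969], "3B": [27.086, 27.3, 27.396],   "3C": [30.438, 29.267, 31.728],
    "4A": [27.054, 33.898, 27.908], "4B": [31.222, 32.879, 35.092], "4C": [34.141, 33.837, 27.878],
    "5A": [34.304, 30.845, 30.434], "5B": [29.302, 39.085, 32.354], "5C": [40.258, 27.363, 27.206],
}
control = [6.876, 6.462, 6.493]


def scenario_mean(letter: str) -> float:
    """Mean time over every run of one scenario (A, B or C), across all setups."""
    return mean(t for key, times in runs.items() if key.endswith(letter) for t in times)


for letter in "ABC":
    # A: 10 districts, B: +2 districts (assumed more vacancies), C: -2 (assumed more unemployed).
    print(letter, round(scenario_mean(letter), 3))

overall = mean(t for times in runs.values() for t in times)
print("slowdown vs control:", round(overall / mean(control), 2))
```

With only three runs per cell and a spread of several seconds inside some cells, small differences between the scenario means should not be over-read.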

I have included all the saves I tested from (except control, as I haven't done anything special there). Every test was done on the latest save.

Less2 · Dec 1, 2019

The Founder said:
I do know the concept of Multitasking Slowdown.
All you seem to know is "you know nothing".

So how about you show me some pseudo-code for this, so I can show you all the places where your theory falls down?

My PC idling with ~1.6k threads with your normal firefox, steam, etc stuff running in background, performing ~28k context switches per second, completely responsive and normal.
"200 THREADS IZ TOO HARD THE OS CAN'T HANDLE IT!!!" - certified internet expert.

Starless2001 · Dec 1, 2019

sortulv said:
Since I'm currently playing a 1.9.1 game - nope, pretty much the same level of performance in the latest patch as then. Mind you, I have a prewarp computer, so my values may be a bit on the low end...

Pretty bizarre statement of the person you answered to... If you're happy with Stellaris pre-Le Guin, then don't ever update it. That's precisely when performance got hit by the jobs/districts/no-more-tiles hurricane. A great idea on paper, a nightmare for performance (I heard the guy who headed Le Guin/MegaCorp has lost his job as head of development, and I wonder if he got promoted or demoted...)

Anyway, even though Le Guin is a great patch on paper, like I said, the DLC that came along with it (MegaCorp) is just bad. Maybe Verne will begin to fix the damage Le Guin caused (at least that's what we have been promised) but until then, if you are having fun with 1.9.1 do not update it.

Starless2001 · Dec 2, 2019

CMDR_HERNE said:
I'm quite reluctant to revert back to 2.1, as I like the changes brought about since and with Ancient Relics. I've finished a couple of games, reaching the end using the default settings of a huge galaxy with the maximum amount of AI opponents, but after a while I tried the default settings and had a problem with the time each day takes to pass. The longest I got was the mid 2400s, and the time each day took to pass was such an issue that I daren't play up to the 3000s.

Now trying a 400 size galaxy with the 0.25 chance of habitable planets with AI opponents set to random and I'm having a far, far better time of it. There were only 5 AI civs that spawned at first and I'm at 2400 and each day is only taking about a second to pass. Only 50 years left before the end of the game. It's still Stellaris and it's still enjoyable, just relatively smaller.

I really want to try a default sized galaxy again with the same settings to take advantage of the Federations update when it drops. As an aside, a random number of AI is better for me now as it doesn't make every game feel the same. Just need to get more civs to play against with a few more stars to explore.

Noted Stellaris Immortal; will give that a go when you can use it with gestalts. I haven't given up on this game at all, far from it. Will try a game of 2.1 just to remind myself of what it was like.

For context I'm using a 3rd gen i5/GTX750 and I first started playing Stellaris on lower than recommended specs on a Core 2 Duo. Now that was laggy....

I would like it if we had an option to remove everything Le Guin did and still retain the later installments. Go back to micromanaged tiles, no trade routes, no jobs (whatever tile you put a pop, that's its job), no districts... the game would get a lot more minimalistic, but given the almost unbearable state of the game, it could be nice. I never actually reach the victory date in any of the sessions I play. As soon as the AI empires either stomp me or get stomped I quit that save. Often even the fallen empires are pathetic, and the game becomes an exponential chore of finding room for your pops and researching repeatables so quickly you don't even see what you're doing anymore, and leaders become immortal because the added +5 years is researched.

sortulv · Dec 2, 2019

Starless2001 said:
Pretty bizarre statement of the person you answered to... If you're happy with Stellaris pre-Le Guin, then don't ever update it. That's precisely when performance got hit by the jobs/districts/no-more-tiles hurricane. A great idea on paper, a nightmare for performance (I heard the guy who headed Le Guin/MegaCorp has lost his job as head of development, and I wonder if he got promoted or demoted...)

Anyway, even though Le Guin is a great patch on paper, like I said, the DLC that came along with it (MegaCorp) is just bad. Maybe Verne will begin to fix the damage Le Guin caused (at least that's what we have been promised) but until then, if you are having fun with 1.9.1 do not update it.

Playing 1 game for old times' sake is hardly being happy with it. My point was simply that the performance is pretty much the same when you compare them now.

Starless2001 · Dec 2, 2019

sortulv said:
Playing 1 game for old times' sake is hardly being happy with it. My point was simply that the performance is pretty much the same when you compare them now.

Was THAT your point? OMG!!! I can't... there are no words in any human language to communicate to you just how wrong you are!! After Le Guin the game just broke! And if somehow you disagree with that, you must have a very, very unusual device!

BlackUmbrellas · Dec 2, 2019

Starless2001 said:
Was THAT your point? OMG!!! I can't... there are no words in any human language to communicate to you just how wrong you are!! After Le Guin the game just broke! And if somehow you disagree with that, you must have a very, very unusual device!

Or your experience isn't as universal as you think.

matthobbit · Dec 2, 2019

I just upgraded my computer to an 8 core CPU, an M.2 SSD, and an rtx 2080 super (not for Stellaris), and my days STILL crawl at 3x speed on the massive 1000 star map.

Do I just need to select fewer planets or something? I would love to play long into the late game, but by the time it comes along, it takes a minute for each month to crawl by.

I would love it if Stellaris were re-optimized to play faster late game.

Marissa · Dec 2, 2019

That's the dream. Some day..

sortulv · Dec 2, 2019

Starless2001 said:
Was THAT your point? OMG!!! I can't... there are no words in any human language to communicate to you just how wrong you are!! After Le Guin the game just broke! And if somehow you disagree with that, you must have a very, very unusual device!

Your poor grasp of linguistics aside, I was commenting on the state of the game as it stands versus the state of the 1.9.1 available to anyone at the moment.
When is the last time you played 1.9.1? 20 months ago? When looking back do you see the rose tint on everything? Did you perhaps play with hyperlanes only back then?

(your post otherwise reminds me a lot of a certain character spouting "inconceivable" all the time)

Aleriez · Dec 2, 2019

Until performance is fixed I can recommend the following mod: https://steamcommunity.com/sharedfiles/filedetails/?id=1882139456

Not ironman compatible.

Edit: Has to be placed at the bottom of the mod list in the new launcher.

sortulv · Dec 2, 2019

Less2 said:
My PC idling with ~1.6k threads with your normal firefox, steam, etc stuff running in background, performing ~28k context switches per second, completely responsive and normal.
"200 THREADS IZ TOO HARD THE OS CAN'T HANDLE IT!!!" - certified internet expert.

1. Are any of these threads dependent on any other?
2. Do the threads switch CPUs at any point?
3. How much of the CPU activity is dedicated to switching processes vs actually running them?
4. How much memory does each of these threads actually use?
5. How many threads the OS can handle is rather beside the point, isn't it - the question is if you will see any performance improvement by rewriting the entire game to be able to run everything in parallel.
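One standard way to frame question 5 is Amdahl's law: if only a fraction of the work per tick can run in parallel, the serial remainder caps the speedup no matter how many workers are added. The sketch below is a generic illustration; the parallel fractions are made-up values, not measurements from Stellaris, and the formula ignores thread-management overhead entirely.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work is spread over `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical parallel fractions; the real value for the game is unknown.
for fraction in (0.5, 0.9):
    for workers in (4, 8, 200):
        speedup = amdahl_speedup(fraction, workers)
        print(f"parallel={fraction:.0%} workers={workers:>3} speedup={speedup:.2f}x")
```

Under this idealised model, going from 8 to 200 workers buys very little unless almost all of the work is parallel, and any real scheduling or synchronisation cost only makes the picture worse.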

Pancakelord · Dec 2, 2019

I'm now able to play long games with 5x worlds at much faster speeds than vanilla with a mod I made here - if Aleriez's suggestion doesn't work for you, you can give mine a go.

Essentially I make pops grow exceedingly fast (and purge fast too), to ensure that there are never vacant jobs available for very long, with pops stopping growing at 2% over the housing cap (to make sure the AI always wants to expand - this is also in-line with frictional unemployment so it seemed fitting).

Is this balanced? No, not really (you'll want to keep an eye on crime, stability and food in the first century of a game, from the persistent low unemployment, for example - and you have to manually disable machine assembly as the player (AI auto-stops building pops when housing is hit though), so that is very annoying when playing as gestalt machines). The AI also seemed a little more competitive when it always had maxed out jobs, so it probably struggles with the jobs system due to overcapacity in vanilla, too.

But for me, and others who've given me feedback, it's now fast again. I created this as a test to see if the vacant jobs theory (that vacant jobs cause daily job checks, slowing the PC down a lot) in the performance thread was true. Seems to hold up.

The other late game slowdown comes from fleet/ship pathing, so turning off wormholes and reducing hyperlane connections will help there, to a degree. Pathing is basically a mathematical problem (a shortest-path search over the hyperlane graph, in the same family of route problems as the travelling salesman problem) and relies on optimisation in areas modders cannot account for (only PDX can). This is why you might also see lag spikes at the start of big wars: lots of AI fleets are mobilising and maxing out the pathing thread with calculations, holding up other parts of the game.
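To illustrate why fewer connections can help, here is a minimal shortest-path sketch using Dijkstra's algorithm on a toy hyperlane graph. The game's real pathing algorithm is not public, so Dijkstra is an assumption, and the system names and jump costs are invented. The sketch counts how many lane entries the search looks at.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def undirected(edges: List[Tuple[str, str, float]]) -> Graph:
    """Build an adjacency list where every lane can be travelled both ways."""
    graph: Graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    return graph


def shortest_path_cost(graph: Graph, start: str, goal: str) -> Tuple[float, int]:
    """Return (cheapest cost, lane entries examined) using Dijkstra's algorithm."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    examined = 0
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost, examined
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph.get(node, []):
            examined += 1
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return float("inf"), examined


# Hypothetical four-system chain, then the same chain with extra long lanes added.
chain = [("A", "B", 1.0), ("B", "C", 1.0), ("C", "D", 1.0)]
extra = [("A", "C", 5.0), ("B", "D", 5.0), ("A", "D", 10.0)]

sparse = shortest_path_cost(undirected(chain), "A", "D")
dense = shortest_path_cost(undirected(chain + extra), "A", "D")
print("sparse:", sparse)
print("dense: ", dense)
```

Both searches find the same route cost, but the denser graph makes the search examine more lanes to get there, and that extra work is paid for every fleet that needs a path.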

This is also why higher single-core clock frequencies help more. I'm running an i7-4790K at 4.55 GHz with 16 GB of low latency RAM (it's the speed/timings that are important, less so the capacity).

NiclasCage · Dec 2, 2019

Am I the only one without any huge problems here? I'm currently at around 2750 in a huge galaxy (playing pacifist/inward perfection and struggling a bit with that) and one month takes me between 18 and 20 seconds at the fastest speed. It's considerably slower than at the start of the game when the numbers go by so quickly that they're blurry, but it's absolutely playable.

Less2 · Dec 2, 2019

sortulv said:
1. Are any of these threads dependent on any other?
2. Do the threads switch CPUs at any point?
3. How much of the CPU activity is dedicated to switching processes vs actually running them?
4. How much memory does each of these threads actually use?
5. How many threads the OS can handle is rather beside the point, isn't it - the question is if you will see any performance improvement by rewriting the entire game to be able to run everything in parallel.

This thread (hehe) is not an educational course. Please find one for yourself.

sortulv · Dec 2, 2019

Less2 said:
This thread (hehe) is not an educational course. Please find one for yourself.

I would, but the overhead of switching threads was too high...

Riince2 · Dec 2, 2019

sortulv said:
I would, but the overhead of switching threads was too high...

This might be why the game feels faster on GNU/Linux than Windows to me.
