EU4 - Development Diary - 19th of February 2019

Fantastic bits and where to find them
Hi everyone, I am Lorenzo aka Duplo aka The Battlepope aka The Caped Crusader and I am a programmer on our beloved Europa Universalis 4. You may have seen me in the Dharma Dev Clash, struggling to spread Catholicism and occasionally getting betrayed by fellow Italians and wrecked by the Venice AI :confused:

In these last weeks we announced that 1.28.3 was the last Europa Universalis 4 release supporting 32-bit, since we are moving toward 64-bit. I thought it would be nice to give all of you some insight into what this means, what to genuinely expect in the near future and what definitely not to expect.

There's been a lot of fuss about 64 vs 32-bit apps lately. But what is all of this about?

It is no mystery that Apple decided to deprecate 32-bit apps in MacOS 10.14 "Mojave" (2018) in order to drop the support completely in the future: they are doing an amazing job reminding us about this every time we launch Steam or our beloved Europa Universalis 4.

What may be less known is that, while 32-bit apps may still run on the latest MacOS release, the new development environment was stripped of all the fancy 32-bit support, making it painful for the poor developers to even compile the game. They are trying hard to make everyone move to 64-bit. But why?

If I got your attention so far, we can finally dive into the topic. It's going to be great.

The fellowship of the bits
When we say 64-bit, we are talking about x86-64, a CPU architecture designed by AMD whose specification was released in the year of our Lord 2000.

In the year 2001, the Linux kernel started supporting this new architecture, even if there were no processors available on the market yet.

In the year 2003, the very first x86-64 processor was released: the AMD Opteron. Several Linux distributions were already supporting x86-64.

In the year 2005, Microsoft discontinued the IA-64 (another 64-bit architecture from Intel) version of Windows XP, and released Windows XP Professional x64 Edition.

In the year 2009, Apple's latest state-of-the-art system, Mac OS X 10.6 "Snow Leopard", was released with full support for 64 bits on the x86-64 platform. Windows 7 also started to be loaded by default in its 64-bit version on most new computers.

In the year 2011, with Mac OS X 10.7 "Lion", Apple decided to drop support for 32-bit CPUs, effectively starting the modernization process of their platform that last year resulted in the deprecation of 32-bit apps.

Nice. But still, why should I care?
Well, there are several reasons to actually move to 64-bit. The looming threat of deprecation on MacOS is maybe the most evident, but there is much more. The x86-64 architecture is better than its older brother, by far.

Without going too much into technical details, the advantages of going full 64-bit can be summarized in three main points:

Extended address space
64-bit applications can break the hard limit of 4 GB of RAM. By several orders of magnitude. Do you like blobbing in Europa Universalis 4? Think about how happy the CPUs would be if they could do the same with RAM.
extended-address-space.jpg
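
To put a number on "several orders of magnitude", here is a small back-of-the-envelope sketch. It only computes the theoretical limits implied by the pointer width; real operating systems and CPUs expose less than the full 64-bit range, and the helper name is made up for the illustration.

```python
# Sketch: how many bytes a pointer of a given width can address.
# These are theoretical limits; actual OS and CPU limits are lower.

GIB = 2 ** 30  # one gibibyte, in bytes


def addressable_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses a pointer of this width can hold."""
    return 2 ** pointer_bits


limit_32 = addressable_bytes(32)
limit_64 = addressable_bytes(64)

print(limit_32 // GIB)       # 4 (GiB reachable with 32-bit pointers)
print(limit_64 // limit_32)  # 4294967296 (how many times larger the 64-bit space is)
```

The second number is itself 2^32, so the 64-bit address space is roughly nine to ten orders of magnitude larger than the 32-bit one.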


Capacity increase
64-bit registers and 64-bit operands allow the CPU to perform trivially operations that were costly on the 32-bit architecture. Didn’t you want to find your way from Paris to Beijing in the blink of an eye?
capacity-increase.jpg
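
As a rough illustration of why wide operations are costlier with narrow registers, the sketch below adds two 64-bit numbers using only 32-bit-sized pieces, the way a 32-bit CPU has to (an add followed by an add-with-carry). It is not taken from the game's code, and the function name is invented for the example; on x86-64 the same addition fits in a single register operation.

```python
MASK32 = 0xFFFFFFFF  # the largest value that fits in a 32-bit register


def add64_with_32bit_ops(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using only 32-bit-sized pieces."""
    # Split each operand into a low and a high 32-bit half.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low halves and remember whether they overflowed.
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    lo = lo_sum & MASK32

    # Step 2: add the high halves plus the carry, wrapping like real hardware.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo


print(hex(add64_with_32bit_ops(0xFFFFFFFF, 1)))  # 0x100000000
```

Two dependent steps instead of one is a small cost per operation, but it adds up in code that does a lot of 64-bit arithmetic.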


Larger number of general-purpose registers
In compute-intensive code, the 64-bit compiler will make use of these additional registers to better optimize the generated program. Russia will be able to recruit Streltsy even faster!
number-of-general-purpose-registers.jpg


There is also another, non-technical, advantage: support. While Microsoft is not as zealous as Apple in its crusade against legacy technology, they are obviously putting most of their efforts into the new - and definitely more common - architecture. Of course the 32-bit tools are still doing an honest job, but having the chance to move to the new 64-bit ones will allow us to take advantage of the new hardware features, with fewer bugs and better performance.

Wow! Great!
That said, moving to 64-bit won’t magically make the game run twice as fast, nor will it make more content appear out of nothing. While the migration to the new architecture itself will have little impact on Europa Universalis 4 at the beginning, it will lay the foundation for future development, allowing us - the programmers - to gradually try to squeeze the most out of the CPUs! But important questions first...

Multithreading when?
Really? C’mon… The lack of multithreading in Europa Universalis 4 has been a common misconception for a long time.

What is multithreading about? Multithreading involves taking small pieces of logic and potentially executing them at the same time, taking advantage of the multiple cores of modern CPUs. The advantage of multithreading is that we can squeeze the most out of the processor and get things done faster, because they are executed at the same time. The main problem of multithreading is that it’s impossible to predict when the Operating System is going to let each thread run, as the scheduling of threads is completely out of the application's control. Working with multithreading is basically finding the proper balance between the potentially better performance given by running tasks at the same time and the performance loss due to synchronization.
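
A minimal sketch of what "synchronization" means in practice, assuming nothing about how the game itself does it: several threads update one shared counter, and a lock makes sure only one of them touches it at a time. The function name and numbers are made up for the illustration.

```python
import threading


def count_with_lock(num_threads: int, increments: int) -> int:
    """Have several threads bump one shared counter, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may enter this block. That keeps
            # the counter correct, but every thread has to wait its turn.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


print(count_with_lock(num_threads=4, increments=1000))  # 4000
```

The lock guarantees the right answer, but the threads spend most of their time queueing for it. When the work inside the lock is this small, the waiting can easily cost more than running everything on a single thread, which is exactly the balance described above.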

This tradeoff is especially noticeable in Europa Universalis 4 as the game has tons of small pieces of information to calculate, and each of them is heavily influencing the AI behavior.

There are things that can be safely executed in parallel - and indeed they are executed that way - like loading and processing data, updating independent pieces of the gamestate, cache calculations, most AI behaviors… But there are also things that definitely cannot: pieces of gamestate that depend on other pieces being calculated first, and so on... Trying to force those interdependent operations to be executed in parallel would produce non-deterministic results - different results even with exactly the same initial state - and inconsistent data. This non-deterministic behavior can lead to the game crashing, because the data an operation depends on might still be in an incomplete state.
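
The difference between the two kinds of work can be sketched like this. The rules and names below (province income, a treasury clamped at zero) are invented for the example and are not the game's actual logic.

```python
from concurrent.futures import ThreadPoolExecutor


def province_income(base_tax: int) -> int:
    """Independent work: the result depends only on this one input."""
    return base_tax * 12


def run_independent(base_taxes):
    """Safe to spread over threads: no item needs another item's result."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in input order, whatever order they ran in.
        return list(pool.map(province_income, base_taxes))


def run_dependent(start_treasury, monthly_changes):
    """Each step needs the previous result, so the steps must run in order."""
    treasury = start_treasury
    history = []
    for change in monthly_changes:
        treasury = max(0, treasury + change)  # clamping at zero makes order matter
        history.append(treasury)
    return history


print(run_independent([1, 2, 3]))    # [12, 24, 36]
print(run_dependent(10, [-20, 15]))  # [0, 15]
print(run_dependent(10, [15, -20]))  # [25, 5]
```

The last two calls apply the same two changes in a different order and end up with different treasuries (15 versus 5). If threads were free to apply them in whatever order the scheduler picked, the same starting state could produce either outcome.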

A mysterious and shady voice I can hear behind me explains this problem this way:
Imagine the data as existing as a quantum wave function. Until observed we cannot know what state the data is in. When one thread needs to access and change that data the wave function collapses as it's being observed. If there was an expectation that the data would be in a specific state before the wave function collapses, it will just be a garbled mess and the universe crashes.
multithreading.jpg


One of our main focuses as programmers is to get the best performance possible, while adding new cool features. Multithreading is already a powerful tool in our toolbox. Multithreading is a thing in Europa Universalis 4, and it’s here to stay ;)

This old but still valid Development Diary is an example of our effort to get the best performance out of our game, and the charts show how multithreading is actually a fundamental part of this effort!

If you have any question, please write them in this thread. I'll follow the discussion and try to answer all of you!
 
Most likely it's not even the CPU, as I find it highly unlikely that someone today can use one for anything good (we're talking about stuff that has been out of production for almost 20 years here). Most probably it is just people stuck on a 32 bit OS. Usually Windows guys that only own a legit 32 bit license and never decided to fork out a few grand to buy a 64 bit one or to embrace the light and switch to an open OS entirely.
Yes, 32 bit CPUs have been off the market for several years. The main problem may be the OS: Windows was not installed in its 64 bit guise from the beginning.
 
Minor question, but does this affect things like buffer overflows? A recurring problem in Clausewitz games that's often been addressed by applying a cap, is that at certain values (usually 2 million and change - IIRC, the largest value of a signed 32-bit number if you use three places for decimals), the amount will overflow to negative 2 million (and change). Will these values still be 32-bit values with this size limitation, or will they now be 64-bit values, and as such, be able to be much much larger without overflowing?
On its own, the 64 bit transition won't affect how the fixed point arithmetic is implemented. That said, having 64 bit support may allow us to change the implementation of the arithmetic to overcome that hard limit without risking performance loss.
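
For readers wondering where "2 million and change" comes from, here is a small sketch. It assumes, as the question above does, that a value is stored as a signed 32-bit integer counting thousandths; the real engine's implementation is not described in this thread, so the helper names and the layout are assumptions made for the illustration.

```python
SCALE = 1000  # three decimal places, as guessed in the question above


def wrap_signed(raw: int, bits: int) -> int:
    """Wrap an integer into the two's-complement range of the given width."""
    raw &= (1 << bits) - 1
    if raw >= 1 << (bits - 1):
        raw -= 1 << bits
    return raw


def fixed_add(a: float, b: float, bits: int) -> float:
    """Add two values stored as fixed-point integers of the given width."""
    raw = round(a * SCALE) + round(b * SCALE)
    return wrap_signed(raw, bits) / SCALE


print((2 ** 31 - 1) / SCALE)                  # 2147483.647, the largest 32-bit value
print(fixed_add(2_000_000.0, 200_000.0, 32))  # -2094967.296, wrapped to negative
print(fixed_add(2_000_000.0, 200_000.0, 64))  # 2200000.0, fits comfortably
```

With the same three decimal places, a 64-bit raw value would only overflow somewhere around 9.2 quadrillion, which is why a wider representation removes the need for the cap.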
 
DUPLO is so far my favourite developer and I like what he wrote, even if some aspects are hard to explain without going deeply technical.
What he forgot to mention is that 64 bit architecture is a mandatory prerequisite for VICTORIA III! This was a well kept Paradox secret until this DD.

EDIT: He told me that 128 bit architecture is the real prerequisite... sorry.
Oh :)
You are making me blush now!
 
I quite like how he tries to explain why threading isn't a giant "i win" button. I have had some of my own code run 5-15% slower after being ordered to make it threaded while trying to argue that there was no benefit from it (the program was doing sequential calculations, where each calculation used the end state of the last iteration as its start state).

Point being, while this doesn't explain anything about the game it's still a good development diary, as changing architecture support is a large development task, so ignore the hate and carry on ;)
One thing I learned, as a programmer, is that there are no "I win" buttons. Just small pieces and tools to be combined trying to find the proper trade off.

itsatrap.jpg
 
And what about previous versions? Will they remain on 32 bit? Cuz, for example, I can't afford to buy a new processor and RAM for 64 bit, but I want to play EU4. If not, then I have to say goodbye to EU4.
All the previous versions up to 1.28.3 are going to stay 32 bit.
 
Wasn't the 32bit limit 2GB of RAM that was extended with some tricks to 4GB? So one process couldn't address more than 2GB even if the OS could use 4GB?

"If you have any question, please write them in this thread. I'll follow the discussion and try to answer all of you!" - Forums are a perfect example to illustrate the problems with multithreading. Imagine if you had to synchronize 10-20 threads between each other.
 
Wasn't the 32bit limit 2GB of RAM that was extended with some tricks to 4GB? So one process couldn't address more than 2GB even if the OS could use 4GB?
PAE was the way to overcome the 3 GB RAM barrier on the x86 architecture. The x86-64 architecture overcomes that limit inherently.

"If you have any question, please write them in this thread. I'll follow the discussion and try to answer all of you!" - Forums are a perfect example to illustrate the problems with multithreading. Imagine if you had to synchronize 10-20 threads between each other.
This example is pure gold :D
 
Since there is a lot of technological talk here, I will translate some of it for the rest of us illiterates:

The 64-bit register needs to grab a hold of the 32-bit register and wrestle him to the ground.
After being choked, the 32-bit register will tap out of the match.
Then the 64-bit register will take his bits and become even stronger. All that strength is then used to push the cogwheels inside the CPU even faster.
But that wrestling match will take some time though, as the 32-bit register is a tough opponent.
 
Since there is a lot of technological talk here, I will translate some of it for the rest of us illiterates:

The 64-bit register needs to grab a hold of the 32-bit register and wrestle him to the ground.
After being choked, the 32-bit register will tap out of the match.
Then the 64-bit register will take his bits and become even stronger. All that strength is then used to push the cogwheels inside the CPU even faster.
But that wrestling match will take some time though, as the 32-bit register is a tough opponent.
giphy.gif
 
This is such great news! I really hope this eventually gets into other Paradox titles.
 
As to the multithreading, I spent some time working on multi core, multi processor simulation programs running in a controlled supercomputer environment. We had an expert come in to improve a program and after almost a year he found that the natural areas for multithreading were not cost efficient, while other trivial areas could be multithreaded for a massive improvement.
 
EU4 uses less than 60 percent of one core of my Ryzen 2700x CPU at peak (rest of cores barely hit 10 percent mark), and frame-rate drops to 5 fps in late game campaigns. I think it is the definition of INEFFICIENCY.
The game is not IO bound (measured, runs on an NVMe 256GB Samsung 970 Evo) and is not GPU bound (measured, uses less than 40 percent of my GTX 1080 at peak). So what bounds it? The hardware has the capacity to run it much faster. It is obviously algorithm bound (memory access patterns?), and you know better than us that your implementation is not efficient (old tech/engine?).
just zoom out and move in map in late game, and notice how many crazy frame drops to zero happen, at pretty specific intervals. the game is pretty much unplayable in late games.

and yes, I am a software engineer.
I complain because I like this game like crazy, and I wish it ran fast.
 
Anything that helps the performance at this point. I am one of those "the more provinces the better" people, but what good is a fancy detailed map and dozens of new mechanics, if the game loads and runs like you're doing constant heavy benchmarking. Seriously, there should be more hype, because I sure as hell can't wait for better performance!
 
Wow, ok. You know that feeling when you know you're almost completely ignorant about something, but then find out that the amount of things you don't know is actually vastly greater than you ever imagined? Like if early astronomers, already aware they were mostly ignorant of the thousands of stars they could see in the sky, had been informed about the actual size of the universe, that what they were aware they didn't know was nothing compared to the amount of things they didn't even know they didn't know.


That's how I feel now.


(Seriously, I'd kind of just assumed 32-bit vs. 64-bit was referring to, like, color definition or something. ...It's not, right? I'm still not entirely confident. Yikes.)
 
If you want to know a little about the whole 32 vs 64 thing, and why the size of RAM is always brought up as one of the first things,
I suggest reading up on the barebones basics of binary systems and how they differ from the base 10 system used in everyday math.

That helps if, for example, you don't know what kind of significance the number ~4.29 billion holds in computing.
Or the old joke about how there are 10 kinds of people in the world: those that understand binary and those that don't.
 
EU4 uses less than 60 percent of one core of my Ryzen 2700x CPU at peak (rest of cores barely hit 10 percent mark), and frame-rate drops to 5 fps in late game campaigns. I think it is the definition of INEFFICIENCY.
The game is not IO bound (measured, runs on an NVMe 256GB Samsung 970 Evo) and is not GPU bound (measured, uses less than 40 percent of my GTX 1080 at peak). So what bounds it? The hardware has the capacity to run it much faster. It is obviously algorithm bound (memory access patterns?), and you know better than us that your implementation is not efficient (old tech/engine?).
just zoom out and move in map in late game, and notice how many crazy frame drops to zero happen, at pretty specific intervals. the game is pretty much unplayable in late games.

and yes, I am a software engineer.
I complain because I like this game like crazy, and I wish it ran fast.

Unless you try doing that at speed 5 you are always going to get limited use of your hardware, because speeds 1-4 are specifically defined and as long as you meet the specs to run those speeds at 100% it shouldn't change (which you do in spades - when I upgraded to my 2700x there was no difference in the early game on speeds 1-4). This goes for the CPU. On speed 5...... there are bottlenecks. I assume the merge points to the main thread for the AI agents would be a major one, not to mention the AI agents themselves taking longer as the game goes on - so you are waiting for them to finish as these are spun out to your other threads (the Dev Diary linked by either Duplo or Groogy above shows the optimization of the multithreaded agents and the load balancing it does, and it improved a lot going by those graphs). Using the better instruction sets native to 64 bit, as Duplo alluded to, could probably speed up those sections but they will most likely have to be looked at individually in due time.

Hard drive access after loading into memory won't matter outside of save games. This can be clearly shown with HDDs, as disk access is minimal.

GPU, it's DX9 with DX9 issues and if it's anything like Stellaris the UI could be a huge culprit (don't quote me on that though). But 5 fps..... that seems like a stretch, I have a GTX970 and I don't get 5 fps on max settings. Also, is it actually fps or game cycle processing showing up as stutter? One doesn't equal the other and I get a lot of the latter late game sometimes and not the former. With your line of "at pretty specific intervals" it sounds like that.

Now RAM..... RAM use could be improved... I always wondered how much extra data is computed that could be cached in memory if we went past the ~3-3.5 GB that EU4 currently uses. Could more information be cached to make AI transactions cost less CPU time (or do more for the same cost since it has to compute less data on the whole)? Also mods...... MODS..... MOOOODDDDDDSSSSS!!!!!!!!!!! ahem.....

On the whole though I hear you, I love the game too. But it's not like it's a "quote-unquote" brand new game. Makes me really interested to see what Imperator's performance ends up as, since that will also be 64 bit but sounds like it was designed from the ground up in that framework.

Also let's not forget that a bunch of the code base dates back to 2007. I don't know how much code is still around from EU3 but it did make the foundation of EU4. That and the many, many systems bolted on after launch. And not to mention a WHOLE lot has changed since 2007 in terms of not just programming styles, but even the CPU architectures are very different and so is how you program for them. Like how much does the latency hit from Ryzen hurt performance in EU4 with legacy code.... or even new code? How much is the Windows Scheduler just being stupid on Ryzen or Threadripper compared to Linux, cause it wouldn't be the first time that happened. I personally noticed a slight climb in stutter when I got my 2700x (will be getting a 3000 series when it comes out so I'll see what happens then), so there was a hit somewhere, but it's inconsistent so I am really not sure where the real issue is.

@Duplo and the rest of the EU4 team, good luck on the transition and I only hope it proves to not be a "huge pain in the ass" (cause it will be a pain in the ass... it always is :p ). Looking forward to the updates that come from the transition.

Wall of text.... because I love me wall of text.

P.S. Love the memes.