
Duplo

Stellaris (Tech Lead)
Paradox Staff
Jan 26, 2018
Fantastic bits and where to find them
Hi everyone, I am Lorenzo aka Duplo aka The Battlepope aka The Caped Crusader and I am a programmer on our beloved Europa Universalis 4. You may have seen me in the Dharma Dev Clash, struggling to spread Catholicism and occasionally getting betrayed by fellow Italians and wrecked by the Venice AI :confused:

In these last weeks we announced that 1.28.3 was the last Europa Universalis 4 release supporting 32-bit, since we are moving toward 64-bit. I thought it would be nice to give all of you some insight into what this means, what to expect in the near future, and what definitely not to expect.

There's been a lot of fuss about 64 vs 32-bit apps lately. But what is all of this about?

It is no mystery that Apple decided to deprecate 32-bit apps in macOS 10.14 "Mojave" (2018), with the aim of dropping support completely in the future: they are doing an amazing job reminding us about this every time we launch Steam or our beloved Europa Universalis 4.

What may be less known is that, while 32-bit apps may still run on the latest macOS release, the new development environment was stripped of all the fancy 32-bit support, making it painful for the poor developers to even compile the game. They are trying hard to make everyone move to 64-bit. But why?

If I got your attention so far, we can finally dive into the topic. It's going to be great.

The fellowship of the bits
When we say 64-bit, we are talking about x86-64, a CPU architecture designed by AMD whose specification was released in the year of our Lord 2000.

In the year 2001, the Linux kernel started supporting this new architecture, even if there were no processors available on the market yet.

In the year 2003, the very first x86-64 processor was released: the AMD Opteron. Several Linux distributions were already supporting x86-64.

In the year 2005, Microsoft discontinued the IA-64 (another 64-bit architecture from Intel) version of Windows XP, and released Windows XP Professional x64 Edition.

In the year 2009, Apple's latest state-of-the-art system, Mac OS X 10.6 "Snow Leopard", was released with full support for 64 bits on the x86-64 platform. Windows 7 also started to be preinstalled in its 64-bit version on most new computers.

In the year 2011, with Mac OS X 10.7 "Lion", Apple decided to drop support for 32-bit CPUs, effectively starting the modernization process of their platform that last year resulted in the deprecation of 32-bit apps.

Nice. But still, why should I care?
Well, there are several reasons to actually move to 64-bit. The looming threat of deprecation on MacOS is maybe the most evident, but there is much more. The x86-64 architecture is better than its older brother, by far.

Without going too much into technical details, the advantages of going full 64-bit can be summarized in three main points:

Extended address space
64-bit applications can break the hard limit of 4 GB of RAM. By several orders of magnitude. Do you like blobbing in Europa Universalis 4? Think about how happy the CPUs would be if they could do the same with RAM.
extended-address-space.jpg
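To put numbers on "several orders of magnitude", here is a minimal sketch of the arithmetic. The helper name `addressable_bytes` and the constant `GiB` are made up for this illustration. The 48-bit case is included because current x86-64 CPUs commonly expose 48 bits of virtual address rather than the full 64.

```cpp
#include <cstdint>

// Number of bytes addressable with a pointer of the given width in bits.
// Valid for widths from 1 to 63; a full 64-bit count would not fit in uint64_t.
constexpr std::uint64_t addressable_bytes(unsigned pointer_bits)
{
    return std::uint64_t{1} << pointer_bits;
}

constexpr std::uint64_t GiB = std::uint64_t{1} << 30;

// 32-bit pointers: 2^32 bytes, the 4 GB hard limit mentioned above.
static_assert(addressable_bytes(32) == 4 * GiB, "32-bit pointers reach 4 GiB");

// 48-bit virtual addresses: 2^48 bytes = 262,144 GiB = 256 TiB.
static_assert(addressable_bytes(48) / GiB == 262144, "48-bit addresses reach 256 TiB");
```

Even the 48-bit figure is 65,536 times larger than the 32-bit one, which is why the limit stops being a practical concern.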


Capacity increase
64-bit registers and 64-bit operands allow the CPU to trivially perform operations that were costly on the 32-bit architecture. Didn’t you want to find your way from Paris to Beijing in the blink of an eye?
capacity-increase.jpg


Larger number of general-purpose registers
In compute-intensive code, the 64-bit compiler will make use of these additional registers to better optimize the generated program. Russia will be able to recruit Streltsy even faster!
number-of-general-purpose-registers.jpg


There is also another, non-technical, advantage: support. While Microsoft is not as zealous as Apple in its crusade against legacy technology, they are obviously putting most of their efforts into the new - and definitely more common - architecture. Of course the 32-bit tools are still doing an honest job, but having the chance to move to the new 64-bit ones will allow us to take advantage of the new hardware features, getting fewer bugs and better performance.

Wow! Great!
That said, moving to 64-bit won’t magically make the game run twice as fast, nor will it make more content appear out of nothing. While the migration to the new architecture itself will have little impact on Europa Universalis 4 at the beginning, it will lay the foundation for future development, allowing us - the programmers - to gradually try to squeeze the most out of the CPUs! But important questions first...

Multithreading when?
Really? C’mon… The lack of multithreading in Europa Universalis 4 has been a common misconception for a long time.

What is multithreading about? Multithreading involves taking small pieces of logic and potentially executing them at the same time, taking advantage of the multiple cores of modern CPUs. The advantage of multithreading is that we can squeeze the most out of the processor and get things done faster, because they are executed at the same time. The main problem of multithreading is that it’s impossible to predict when the operating system is going to let each thread run, as the scheduling of threads is completely out of the application's control. Working with multithreading is basically finding the proper balance between the potentially better performance given by running tasks at the same time and the performance lost to synchronization.

This tradeoff is especially noticeable in Europa Universalis 4 as the game has tons of small pieces of information to calculate, and each of them is heavily influencing the AI behavior.

There are things that can be safely executed in parallel - and indeed they are executed that way - like loading and processing data, updating independent pieces of the gamestate, cache calculations, most AI behaviors… But there are also things that definitely cannot: pieces of gamestate that depend on other pieces being calculated first, and so on... Trying to force those interdependent operations to run in parallel would produce non-deterministic results - different results even from exactly the same initial state - and inconsistent data. This non-deterministic behavior can lead to the game crashing, because the data an operation depends on might still be in an incomplete state.
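As a rough illustration of that split, here is a minimal C++ sketch. It is not code from the game: `province_income`, `total_income` and the income formula are hypothetical names and numbers invented for this example. The independent per-province work runs on several threads, each writing only to its own slice of the output, and the dependent step (the total) runs only after every thread has been joined.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical per-province calculation: the result depends only on its own
// input, so different provinces can safely be processed at the same time.
long long province_income(long long development)
{
    return development * 3 + 1;
}

// Computes every province's income on `worker_count` threads, then sums them.
long long total_income(const std::vector<long long>& development, std::size_t worker_count)
{
    if (worker_count == 0)
        worker_count = 1;

    std::vector<long long> income(development.size());
    std::vector<std::thread> workers;
    const std::size_t chunk = (development.size() + worker_count - 1) / worker_count;

    for (std::size_t w = 0; w < worker_count; ++w)
    {
        const std::size_t begin = std::min(w * chunk, development.size());
        const std::size_t end = std::min(begin + chunk, development.size());
        // Each worker writes only to its own slice of `income`:
        // no two threads touch the same element, so no locks are needed.
        workers.emplace_back([&development, &income, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                income[i] = province_income(development[i]);
        });
    }

    // Synchronization point: wait until every slice is complete.
    for (std::thread& worker : workers)
        worker.join();

    // Dependent step: it needs all incomes, so it runs only after the joins.
    return std::accumulate(income.begin(), income.end(), 0LL);
}
```

Calling `total_income` with the same input gives the same answer for any `worker_count`, because nothing reads a value before the thread that produces it has finished. Moving the `std::accumulate` call above the joins would be exactly the kind of interdependent operation that produces inconsistent results.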

A mysterious and shady voice I can hear behind me explains this problem this way:
Imagine the data as existing as a quantum wave function. Until observed, we cannot know what state the data is in. When one thread needs to access and change that data, the wave function collapses as it's being observed. If there was an expectation that the data would be in a specific state before the wave function collapses, it will just be a garbled mess and the universe crashes.
multithreading.jpg


One of our main focuses as programmers is to get the best performance possible while adding cool new features. Multithreading is already a powerful tool in our toolbox. Multithreading is a thing in Europa Universalis 4, and it’s here to stay ;)

This old, but still valid, Development Diary is an example of our effort to get the best performance out of our game, and the charts show how multithreading is actually a fundamental part of this effort!

If you have any questions, please write them in this thread. I'll follow the discussion and try to answer all of you!
 
Multithreading can be the cause of a desync, yes, for the reasons The Biso said. But those are also usually the easiest desyncs to fix, because it is very evident why something went wrong.
 
Please don't add anything that causes our universe to crash when playing EU4. Restarting the universe from a saved state always gives me a headache... (ohh - that sounds like an idea for an event in stellaris)

Some people end the universe through the task manager anyway just to avoid small inconveniences.
 
For whoever may be interested in more technical details, this is a totally legit real life scenario code(*) and the resulting assembly for 32 bit and 64 bit.
*this code is fictional, no game was harmed in the making of this example

Let's say that Sweden, after a long time of border tension, decides to declare on Russia. The Russian AI has to prepare for war, and needs to recruit the appropriate amount of troops. Let's say that it wants to recruit Streltsy. We all know that Streltsy strength is in numbers so we need a variable big enough to hold the maximum amount of Streltsy that we can recruit. Let's say at least 9,223,372,036,854,775,807 regiments.

Let's say we have this totally fine and perfect-looking code calculating the amount of Streltsy regiments to recruit:
Code:
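// Fictional example: keeps multiplying until the count reaches 100 billion.
// Assumes how_many > 0; zero would never grow and the loop would not end.
long long how_many_streltsy_should_i_recruit(long long how_many, bool put_more_to_be_safe)
{
    while (how_many < 100000000000)
    {
        // Multiply by 10 when playing it safe, otherwise by 2.
        how_many *= put_more_to_be_safe ? 10 : 2;
    }
    return how_many;
}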

Compiling it with clang (using "-O3 -m32" for optimizations and 32-bit output) we obtain this assembly:
Code:
how_many_streltsy_should_i_recruit(long long, bool): # @how_many_streltsy_should_i_recruit(long long, bool)
        push    ebx
        push    edi
        push    esi
        mov     ecx, dword ptr [esp + 20]
        mov     eax, dword ptr [esp + 16]
        cmp     eax, 1215752192
        mov     edx, ecx
        sbb     edx, 23
        jge     .LBB0_3
        mov     dl, byte ptr [esp + 24]
        xor     ebx, ebx
        mov     esi, 23
        mov     edi, 1215752192
        test    dl, dl
        setne   bl
        lea     ebx, [8*ebx + 2]
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        mul     ebx
        imul    ecx, ebx
        add     ecx, edx
        cmp     eax, edi
        mov     edx, ecx
        sbb     edx, esi
        jl      .LBB0_2
.LBB0_3:
        mov     edx, ecx
        pop     esi
        pop     edi
        pop     ebx
        ret

On the other hand, compiling the same code for 64-bit with the same optimizations ("-O3 -m64") we obtain this:
Code:
how_many_streltsy_should_i_recruit(long long, bool): # @how_many_streltsy_should_i_recruit(long long, bool)
        mov     rax, rdi
        movabs  rcx, 100000000000
        cmp     rdi, rcx
        jge     .LBB0_3
        movzx   edx, sil
        lea     rdx, [8*rdx + 2]
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        imul    rax, rdx
        cmp     rax, rcx
        jl      .LBB0_2
.LBB0_3:
        ret

While it is evident that the 64-bit version is definitely shorter, what is really interesting to see is why: the 64-bit version can hold each value in a single 64-bit register, so every multiplication and comparison is one instruction, and it does not need to push registers onto the stack and pop them back. This makes the 64-bit version consistently faster than the 32-bit one. And God only knows how critical it is to recruit Streltsy if war happens.

Of course, this is an example made with the only purpose of showing what could happen. More work is indeed needed to take full advantage of this transition to 64-bit in real life.
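For readers who prefer C++ to assembly, the following sketch spells out what the 32-bit loop has to do for each multiplication. It is an illustration written for this post, not compiler output, and `mul64_via_32` is a made-up name. The three steps correspond to the `mul`, `imul` and `add` instructions in the 32-bit listing, while the 64-bit build does the same work with a single `imul`.

```cpp
#include <cstdint>

// Multiplies a 64-bit value by a 32-bit factor using only 32-bit pieces,
// keeping the low 64 bits of the result (the same wrap-around as native
// 64-bit arithmetic).
constexpr std::uint64_t mul64_via_32(std::uint64_t value, std::uint32_t factor)
{
    const std::uint32_t lo = static_cast<std::uint32_t>(value);
    const std::uint32_t hi = static_cast<std::uint32_t>(value >> 32);

    // lo * factor can need up to 64 bits: this is the widening `mul`.
    const std::uint64_t lo_product = static_cast<std::uint64_t>(lo) * factor;
    // Only the low 32 bits of hi * factor survive: this is the `imul`.
    const std::uint32_t hi_product = hi * factor;
    // The upper half of the low product carries into the high half: the `add`.
    const std::uint32_t result_hi = hi_product + static_cast<std::uint32_t>(lo_product >> 32);

    return (static_cast<std::uint64_t>(result_hi) << 32)
         | static_cast<std::uint32_t>(lo_product);
}

// 100,000,000,000 * 10, computed from 32-bit halves.
static_assert(mul64_via_32(100000000000ULL, 10) == 1000000000000ULL,
              "matches the native 64-bit product");
```

The comparison against 100,000,000,000 is split in the same way, which is why the 32-bit listing needs a `cmp` followed by an `sbb` with the constants 1215752192 and 23 (the low and high halves of that number).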
 
As this is a technical DD which DirectX Version is used?
We're still on DX9 and have been since release (like all Clausewitz games from that generation).
Imperator will be the first to use DirectX11 but we're not excluding future upgrades of existing titles. As always it's a question of cost versus benefits ;)
 
Most likely it's not even the CPU, as I find it highly unlikely that someone today can use one for anything good (we're talking about stuff that has been out of production for almost 20 years here). Most probably it is just people stuck on a 32-bit OS. Usually Windows guys who only own a legit 32-bit license and never decided to fork out some grand to buy a 64-bit one or to embrace the light and switch to an open OS entirely.
Yes, 32-bit CPUs have been off the market for several years. The main problem may be the OS: Windows was not installed in its 64-bit guise from the beginning.
 
Minor question, but does this affect things like buffer overflows? A recurring problem in Clausewitz games that's often been addressed by applying a cap, is that at certain values (usually 2 million and change - IIRC, the largest value of a signed 32-bit number if you use three places for decimals), the amount will overflow to negative 2 million (and change). Will these values still be 32-bit values with this size limitation, or will they now be 64-bit values, and as such, be able to be much much larger without overflowing?
The 64-bit transition won't, on its own, affect how the fixed-point arithmetic is implemented. That said, having 64-bit support may allow us to change the implementation of the arithmetic to overcome that hard limit without risking performance loss.
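To make the numbers in the question concrete, here is a small sketch of fixed-point arithmetic with three decimal places. The scale of 1000 is taken from the question's recollection and is an assumption, as are the names `kScale`, `max_whole_32`, `max_whole_64` and `fixed_mul`. None of this is the actual Clausewitz implementation.

```cpp
#include <cstdint>
#include <limits>

// Assumed scale: three decimal places, so 1.000 is stored as the raw value 1000.
constexpr std::int64_t kScale = 1000;

// Largest whole number that fits when the raw value is a 32-bit signed integer:
// 2,147,483,647 / 1000 = 2,147,483 (the "2 million and change").
constexpr std::int64_t max_whole_32 = std::numeric_limits<std::int32_t>::max() / kScale;

// The same limit with a 64-bit raw value: roughly 9.2 quadrillion.
constexpr std::int64_t max_whole_64 = std::numeric_limits<std::int64_t>::max() / kScale;

// Multiplies two 32-bit fixed-point values. The intermediate product needs
// 64 bits: 2.000 * 3.000 is 2000 * 3000 = 6,000,000 before rescaling.
constexpr std::int32_t fixed_mul(std::int32_t a, std::int32_t b)
{
    return static_cast<std::int32_t>(static_cast<std::int64_t>(a) * b / kScale);
}

static_assert(max_whole_32 == 2147483, "32-bit limit with three decimals");
static_assert(fixed_mul(2000, 3000) == 6000, "2.000 * 3.000 == 6.000");
```

Even the 32-bit format relies on a 64-bit intermediate for multiplication, and that kind of operation is what becomes cheap on x86-64. A wider raw value would in turn need an even wider intermediate, so this sketch says nothing about how an actual change would be implemented.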
 
DUPLO is so far my favourite developer and I like what he wrote, even if some aspects are hard to explain without going deeply technical.
What he forgot to mention is that 64 bit architecture is a mandatory prerequisite for VICTORIA III! This was a well kept Paradox secret until this DD.

EDIT: He told me that 128 bit architecture is the real prerequisite... sorry.
Oh :)
You are making me blush now!
 
I quite like how he tries to explain why threading isn't a giant "I win" button. I have had some of my own code run 5-15% slower after being ordered to make it threaded while trying to argue that there was no benefit from it (the program was doing sequential calculations, where each calculation used the end state of the last iteration as its start state).

Point being, while this doesn't explain anything about the game, it's still a good development diary, as changing architecture support is a large development task, so ignore the hate and carry on ;)
One thing I learned, as a programmer, is that there are no "I win" buttons. Just small pieces and tools to be combined while trying to find the proper trade-off.

itsatrap.jpg
 
And what about previous versions? Will they remain on 32-bit? Cuz, for example, I can't afford to buy a new processor and RAM for 64-bit, but I want to play EU4. If not, then I have to say goodbye to EU4.
All the previous versions up to 1.28.3 are going to stay 32 bit.
 
Wasn't the 32bit limit 2GB of RAM that was extended with some tricks to 4GB? So one process couldn't address more than 2GB even if the OS could use 4GB?
PAE was the way to overcome the 3 GB RAM barrier on the x86 architecture, although each process still kept a 32-bit address space. The x86-64 architecture overcomes that limit inherently.

"If you have any questions, please write them in this thread. I'll follow the discussion and try to answer all of you!" - Forums are a perfect example to illustrate the problems with multithreading. Imagine if you had to synchronize 10-20 threads between each other.
This example is pure gold :D
 
Proper multithreading: Run the same process in multiple CPUs, synchronize at set points. If one process disagrees - that CPU is probably faulty, so stop that thread, and continue in the rest (if no 2 threads agree, you have an actual crash). If a CPU goes down, you can continue running in the other threads, since they were in different CPUs... I don't think I've ever heard of a game running that kind of multithread though...
Actually, come to think of it - I can't think of anywhere that has used this for the last 20 years... Something about CPUs being too fast and too reliable...

That's not how it works
 
So will the game have to reinstall on my laptop (which is 64-bit) or will I never notice the change?
It's going to just work out of the box.

Wait, so will it no longer work on older computers? If not, then bye bye EU4.
Players that have a 32-bit system and for any reason cannot upgrade their system to 64-bit will still be able to play 1.28.3 without any problem. All the versions before 1.28.3 are going to stay 32-bit.