
unmerged(6563)

Private
Nov 27, 2001
I see a lot of threads here about how to mitigate the impact of having a large number of insignificant countries, and several suggest that the main performance bottleneck is memory-related. This isn't really a technical forum, but I'm going to take the liberty of making a technical post on how to go about solving the performance issue.

Firstly, you have to have tools. Either write them yourselves or get someone else to write them for you (hint, hint :D ). VTune is good for some things but inadequate for industrial-strength performance analysis.

The first thing you need is a UDP talker/listener class, so whatever you log on your main machine can be piped to another machine and displayed graphically. We have to be able to look over and "see" the bottleneck, but displaying it on the machine running the app is going to invalidate your measurements. You all seem to be pretty savvy net programmers, but you can just lift this directly from Microsoft Press's "Network Programming for Windows" book, one of those orange thingies.
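A minimal sketch of such a talker/listener pair, for illustration only. It uses POSIX sockets to stay short; a Winsock version would follow the same shape but would also need WSAStartup and closesocket. The class names UdpTalker and UdpListener, the one-second receive timeout, and the text format of the log line are all assumptions, not anything taken from the book mentioned above.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical sender: runs inside the game and fires log lines at the display machine.
class UdpTalker {
public:
    UdpTalker(const char* ipv4, std::uint16_t port) {
        fd_ = ::socket(AF_INET, SOCK_DGRAM, 0);
        std::memset(&dest_, 0, sizeof dest_);
        dest_.sin_family = AF_INET;
        dest_.sin_port = htons(port);
        ok_ = fd_ >= 0 && ::inet_pton(AF_INET, ipv4, &dest_.sin_addr) == 1;
    }
    ~UdpTalker() { if (fd_ >= 0) ::close(fd_); }
    UdpTalker(const UdpTalker&) = delete;
    UdpTalker& operator=(const UdpTalker&) = delete;

    bool ok() const { return ok_; }

    // Sends one datagram; no retry and no ordering guarantee, which is
    // usually acceptable for per-frame statistics.
    bool send(const std::string& line) {
        if (!ok_) return false;
        ssize_t n = ::sendto(fd_, line.data(), line.size(), 0,
                             reinterpret_cast<const sockaddr*>(&dest_), sizeof dest_);
        return n == static_cast<ssize_t>(line.size());
    }

private:
    int fd_ = -1;
    bool ok_ = false;
    sockaddr_in dest_;
};

// Hypothetical receiver: runs on the second machine. Port 0 asks the OS for a free port.
class UdpListener {
public:
    explicit UdpListener(std::uint16_t port) {
        fd_ = ::socket(AF_INET, SOCK_DGRAM, 0);
        sockaddr_in addr;
        std::memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        ok_ = fd_ >= 0 &&
              ::bind(fd_, reinterpret_cast<const sockaddr*>(&addr), sizeof addr) == 0;
        if (ok_) {
            socklen_t len = sizeof addr;
            ok_ = ::getsockname(fd_, reinterpret_cast<sockaddr*>(&addr), &len) == 0;
            port_ = ntohs(addr.sin_port);
            timeval tv;
            tv.tv_sec = 1;   // assumed timeout so receive() cannot block forever
            tv.tv_usec = 0;
            ::setsockopt(fd_, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
        }
    }
    ~UdpListener() { if (fd_ >= 0) ::close(fd_); }
    UdpListener(const UdpListener&) = delete;
    UdpListener& operator=(const UdpListener&) = delete;

    bool ok() const { return ok_; }
    std::uint16_t port() const { assert(ok_); return port_; }

    // Returns one datagram as a string, or an empty string on timeout or error.
    std::string receive() {
        char buf[1500];
        ssize_t n = ::recv(fd_, buf, sizeof buf, 0);
        return n > 0 ? std::string(buf, static_cast<std::size_t>(n)) : std::string();
    }

private:
    int fd_ = -1;
    bool ok_ = false;
    std::uint16_t port_ = 0;
};
```

UDP fits here because a dropped statistics packet costs little, while a blocking send in the middle of a frame would distort the very timings being measured.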

Next, you need a capable inline profiling DLL. I'd start by implementing the one suggested by Wyatt's article in the May 1998 Game Developer magazine; the source is available for download.
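To give a rough idea of what "inline profiling" means in practice, here is a small sketch of a scoped timer that accumulates time per named section. This is not the code from the article mentioned above; Profiler, ProfileEntry and ScopedTimer are hypothetical names, and std::chrono stands in for whatever high-resolution counter a real profiler would use.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical per-section totals.
struct ProfileEntry {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

// Hypothetical collector; a real one would be dumped and cleared once per frame.
class Profiler {
public:
    void record(const std::string& name, std::chrono::nanoseconds elapsed) {
        assert(elapsed.count() >= 0);  // steady_clock never runs backwards
        ProfileEntry& e = entries_[name];
        ++e.calls;
        e.total += elapsed;
    }

    // Returns a zeroed entry for sections that were never timed.
    ProfileEntry get(const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? ProfileEntry{} : it->second;
    }

    void reset() { entries_.clear(); }

private:
    std::map<std::string, ProfileEntry> entries_;
};

// Times the enclosing scope: starts in the constructor, records in the destructor.
class ScopedTimer {
public:
    ScopedTimer(Profiler& profiler, std::string name)
        : profiler_(profiler),
          name_(std::move(name)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        profiler_.record(
            name_, std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    Profiler& profiler_;
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};
```

Usage is one line at the top of whatever you want measured, for example `ScopedTimer t(profiler, "update_ai");` as the first statement of a hypothetical AI update function. The map lookup keeps the sketch short; a production profiler would likely use fixed slots to keep its own overhead out of the numbers.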

Someone is going to have to write a graphical display to put the results up on the second machine. I've got a hokey bar-code app that I wouldn't admit to having written, but it works. Also, on the client side, you have to be able to display a histogram of excessively lengthy frames, and have the option of enabling a bar-code "freeze" if there's an exceptionally large time/memory spike. Basic Windows programming.
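The bookkeeping behind the histogram and the "freeze" can be sketched without any display code. The following is an illustration under assumptions: FrameHistogram, the 16-bucket layout, and the idea that a spike simply latches a flag for the display to check are all made up for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical frame-time histogram with a latching spike flag.
class FrameHistogram {
public:
    static constexpr std::size_t kBuckets = 16;

    // bucket_ms: width of each bucket in milliseconds.
    // spike_ms:  frames at or above this duration trigger the freeze.
    FrameHistogram(double bucket_ms, double spike_ms)
        : bucket_ms_(bucket_ms), spike_ms_(spike_ms), counts_{} {
        assert(bucket_ms > 0.0);
    }

    // Records one frame. Returns true if this frame counts as a spike.
    bool add(double frame_ms) {
        assert(frame_ms >= 0.0);
        double slot = frame_ms / bucket_ms_;
        // Everything past the last bucket is lumped into it.
        std::size_t i = slot >= static_cast<double>(kBuckets)
                            ? kBuckets - 1
                            : static_cast<std::size_t>(slot);
        ++counts_[i];
        if (frame_ms >= spike_ms_) {
            frozen_ = true;  // the display stops scrolling until unfreeze()
            return true;
        }
        return false;
    }

    std::size_t count(std::size_t bucket) const { return counts_.at(bucket); }
    bool frozen() const { return frozen_; }
    void unfreeze() { frozen_ = false; }

private:
    double bucket_ms_;
    double spike_ms_;
    std::array<std::size_t, kBuckets> counts_;
    bool frozen_ = false;
};
```

With 10 ms buckets and a 100 ms spike threshold, a 16 ms frame lands in bucket 1 and a 250 ms frame lands in the last bucket and sets the freeze flag.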

If you're doing your own memory allocation, instrumenting the total amount is trivial. I suspect this isn't the case for this app, as it isn't for most Windows programs.

Picking out abuses of memory we actually allocate *at the moment it is allocated*, and displaying them immediately, is a bit tricky but not impossible. When the app starts, make a call to your allocation function. Log your stack frame right after it returns and grab the address of the allocation function. Then disable write protection on the code segment containing the function, overwrite the start of the library routine with a jump to your own allocator, and hook in your instrumentation there. Again, pipe the results to the second machine on a frame-by-frame basis and display them. Actually, I've forgotten some of this, and there might be better ways to instrument where and how much memory is allocated, perhaps replacing the C++ "new/new[]" operators with your own. The vital thing is that it be done and reported in a timely fashion.
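Of the two approaches mentioned, replacing the global new/new[] operators is the easier one to show. The sketch below covers only that route, not the code-patching one, and it only counts calls and bytes; it does not record where an allocation came from. The memstats namespace, Snapshot, snapshot() and since() are hypothetical names invented for the example. Memory obtained through malloc or directly from the OS would not be seen by it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace memstats {  // hypothetical instrumentation namespace

std::atomic<std::size_t> g_calls{0};  // number of allocations so far
std::atomic<std::size_t> g_bytes{0};  // bytes requested so far

struct Snapshot {
    std::size_t calls;
    std::size_t bytes;
};

inline Snapshot snapshot() { return {g_calls.load(), g_bytes.load()}; }

// Allocation activity since an earlier snapshot, e.g. one taken at frame start.
inline Snapshot since(const Snapshot& start) {
    Snapshot now = snapshot();
    assert(now.calls >= start.calls && now.bytes >= start.bytes);
    return {now.calls - start.calls, now.bytes - start.bytes};
}

}  // namespace memstats

// Replacement global allocation functions. The counters are plain atomics,
// so nothing here allocates and re-enters operator new.
void* operator new(std::size_t size) {
    memstats::g_calls.fetch_add(1, std::memory_order_relaxed);
    memstats::g_bytes.fetch_add(size, std::memory_order_relaxed);
    if (size == 0) size = 1;  // malloc(0) may return a null pointer
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
}

void* operator new[](std::size_t size) { return ::operator new(size); }

void operator delete(void* p) noexcept { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }
```

Taking a snapshot at the start of each frame and sending `since(start)` to the second machine at the end of it gives the per-frame allocation count and volume in a timely fashion, which is the part that matters.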

There is no way I know of to directly instrument a virtual memory swap, which isn't to say it can't be done; I just haven't done it. It's trivial, however, with an inline profiling system, to isolate exactly what event triggered the swap. To be frank, I doubt that we'd have to go further than this plus some basic memory usage instrumentation, but if we have to, I've done it and I can dig up the code.

None of this is easy. After it's been done a few times, though, it's significantly easier, and again, I'm off for some indeterminate amount of time and would love to work on it pro bono, so why not? I can't see a reason this app shouldn't scream on a 256M machine.

Best,
Randall
 
Now I understand why I'm not a programmer :D

This thread does not belong here imho, but frankly I don't know where it should go, so I'll leave it here for now :)

Maybe Uglyduck will decide otherwise.
 
Originally posted by viper37
Now I understand why I'm not a programmer :D

This thread does not belong here imho, but frankly I don't know where it should go, so I'll leave it here for now :)

Maybe Uglyduck will decide otherwise.

Leave it here - might be of interest to Johan. :)
 
Originally posted by Randall
None of this is easy. After it's been done a few times, though, it's significantly easier, and again, I'm off for some indeterminate amount of time and would love to work on it pro bono, so why not? I can't see a reason this app shouldn't scream on a 256M machine.

Indeed, it could be done, but it is a lot of work. Another problem is that a lot of the questions can be answered more easily when you have the source. Visual C++ has debuggers and profilers, which can be used by the people who own the source of the game.