Friday, January 1, 2010

Warhammer Online haz Blades!

If you didn't know, I'm a tech geek. Have been for over 20 years, and I currently work for a Fortune 500 company as an IT consultant. So it's with some interest that I read the Gamasutra article about Warhammer Online's Virginia server farm. Keep in mind it's an "Intel sponsored" article, which is a euphemism for "advertisement," and it is replete with what one would expect from that sort of thing. However, the gist of what I was really interested in starts on page 2 and gives you a bit of a glimpse into how those pixels come to life.
"Our Server Farm in Virginia, for example," Mann said, "has about 60 Dell Blade chassis running Warhammer Online-each hosting up to 16 servers. All in all, we have about 700 servers in operation at this location."
Problem is that Lum predicts Warhammer will be down to a single "server" (meaning world cluster) as 2010 comes to a close. Unfortunately, I'm inclined to agree, as Mythic has been forced to shutter three-quarters of their "servers" over the past several months. Warhammer just didn't pan out like many claimed it would, and from most accounts Mythic isn't doing the things it really needs to be doing in order to correct the underlying issues that have led to that exodus of players. All we really know is that Warhammer has somewhere under 300k players currently, though I suspect they're south of 200k.

This is the part I think a lot of people really need to become more aware of when they question development decisions. Why, for instance, don't you want Auction Houses in Dalaran? Answer: because that would certainly concentrate too many people in one small place too often.

As might be expected in a game scenario in which the levels of participation vary dramatically at different times of the day, in different regions, and with different types of activity, performance scaling is an essential component of successful server operation.

"One of our ongoing challenges," Mann commented, "is where to distribute people in the world. Our processes-that we distribute across the physical hardware-correspond to locations in the virtual world. One of the focuses of our game, the big focus, is to get a lot of people in one place and have them all fighting with each other. And that, unfortunately, works against us in the process distribution model."

"When you put a lot of people in one place, you're putting their entire server load onto one piece of hardware. We do have some technology to mitigate that. Our scenario system (which spawns up smaller arenas for smaller teams dynamically) allows us to split people off to different pieces of hardware if we need to, dynamically, in smaller chunks."

Using this approach, the application, instead of coping with 800 people in an area on one system, can take 400 of those people out of an area and engage them in smaller fights. Most of the parallelism for these kinds of operations, Shaw noted, is done by process, not by thread.
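To make that a little more concrete, here's a minimal sketch of what process-level splitting looks like. This is my own illustration, not Mythic's code; the names (run_scenario, split_zone, the 400-player threshold) are all assumptions for the sake of the example. The point is simply that each chunk of players gets its own OS process, which the cluster is then free to place on a different piece of hardware.

    # Hypothetical sketch of process-per-scenario splitting, not Mythic's code.
    # When a zone gets too crowded, peel players off into smaller instances,
    # each running in its own OS process (process-level, not thread-level,
    # parallelism, as Shaw describes).
    from multiprocessing import Process

    MAX_PLAYERS_PER_PROCESS = 400  # assumed threshold, for illustration only

    def run_scenario(player_ids):
        """Stand-in for one scenario instance's game loop."""
        print(f"scenario process handling {len(player_ids)} players")

    def split_zone(player_ids):
        """Split an overcrowded zone into chunks, one process per chunk."""
        chunks = [player_ids[i:i + MAX_PLAYERS_PER_PROCESS]
                  for i in range(0, len(player_ids), MAX_PLAYERS_PER_PROCESS)]
        procs = [Process(target=run_scenario, args=(chunk,)) for chunk in chunks]
        for p in procs:
            p.start()
        return procs

    if __name__ == "__main__":
        # 800 players piled into one area -> two scenario processes of 400 each
        split_zone(list(range(800)))

Because the split happens at the process boundary rather than the thread boundary, those two instances don't even have to live on the same blade, which is exactly the escape hatch Mann is describing.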

Most players are simply ignorant of the fact that for every pixel they push around on a screen, there have to be CPU cycles and RAM on the server side keeping track of it. Not only must the server keep track of "it," but it must also broadcast your location to every other player in the vicinity, and to every NPC in that location as well, along with weather elements and anything else that must be tracked or rendered. Your client might make the pretty renders for you, but it's the server back end that is coordinating the show. So when you complain about "lag," it's usually one of two things: either the "server" is experiencing runaway processes induced by code errors, or it's a simple matter of CPU utilization. It's amazing how tuned these clusters have to be to operate a virtual world, and when you have too much activity going on in one zone, the server eventually gets to a point where it simply exceeds its limits. Player, meet lag; lag, meet player. Now, don't QQ when you have to hobble back to the old world for your auctions!
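To put rough numbers on that broadcast cost, here's a back-of-the-envelope sketch (my own, not from the article): every tick, each player in a zone has to be told about every other player nearby, so the update count grows roughly with the square of the local population.

    # Rough illustration of why piling players into one zone hurts the server.
    # Assumes a naive model where every player hears about every other player
    # each tick; real games use interest management to trim this down.
    def updates_per_tick(players_in_zone):
        """Each of N players receives updates about the other N-1."""
        return players_in_zone * (players_in_zone - 1)

    for n in (50, 200, 400, 800):
        print(f"{n:4d} players in one zone -> {updates_per_tick(n):,} updates per tick")

Going from 200 players to 800 is only 4x the population, but under this naive model it's roughly 16x the position updates the server has to push out, which is why designers avoid anything that herds everyone into one small spot.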