Balancing a very non-linear game... feedback?

DeVore · 23-01-2014 03:05PM #1

Heya...

I've an interesting problem. We have a game coming out and it's multiplayer. Each player has several attributes, let's say Armour, Primary Weapon Strength, Secondary Weapon Strength, NumberOfWeaponsCarryable, Speed etc.

Most are on a 1-10 scale and each scale is non-linear (so the first step up from 1-2 is more effective than the last step from 9-10 by a good bit).

Players will come in all variations, some highly equipped and upgraded and some basic grunts. I want to group players into groups of 4 who are closely matched in capacity of their upgrades. It doesn't have to be perfect but it can't be too far off either.

So, my first idea is to assign a value to each step of each attribute according to how likely it is to help a player win, based on all else being equal.

I can determine these values by having 4 different bots battle each other 100,000+ times and record the data and analyse it afterwards.

That's my first thought, but it's a non-trivial problem, neh?

Anyone got any thoughts as to how to ensure that when live we auto-match players against each other of similar upgrade levels given what I've explained?
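
Once every player has a single power score, the grouping step itself can be quite simple: sort by score and cut the queue into fours. The sketch below assumes such a score already exists. The function name `group_by_power`, the `max_spread` threshold and the sample scores are all made up for illustration and are not from the game.

```python
from typing import Dict, List, Tuple


def group_by_power(power: Dict[str, float], group_size: int = 4,
                   max_spread: float = 10.0) -> Tuple[List[List[str]], List[str]]:
    """Sort players by power score and cut the sorted list into groups.

    A group is accepted only if the gap between its strongest and weakest
    member is at most max_spread. Otherwise the weakest candidate is left
    waiting and the window slides up by one player.
    """
    ordered = sorted(power, key=power.get)
    groups, waiting = [], []
    i = 0
    while i + group_size <= len(ordered):
        window = ordered[i:i + group_size]
        if power[window[-1]] - power[window[0]] <= max_spread:
            groups.append(window)
            i += group_size
        else:
            waiting.append(ordered[i])
            i += 1
    waiting.extend(ordered[i:])  # too few players left to form a group
    return groups, waiting


# Invented scores, purely for illustration.
queue = {"a": 10, "b": 12, "c": 11, "d": 15,
         "e": 40, "f": 41, "g": 43, "h": 44, "i": 90}
groups, waiting = group_by_power(queue)
print(groups)   # [['a', 'c', 'b', 'd'], ['e', 'f', 'g', 'h']]
print(waiting)  # ['i']
```

A wider `max_spread` keeps more of the pool matchable at the cost of less even games, so it would probably need tuning against queue times.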

Torakx · 23-01-2014 06:27PM

How about a ranking system?
I have played some games where you can't get into certain dungeons without a certain equipment rating.
Maybe a variable can be assigned to each player's character that is based on the integers of several different systems, like weapon str, armor etc.
If each system is 1-10 and there are 4 of them, the Equip score could range from 4-40 or something like that.
Maybe one system like weapons is dominant over secondary weapons and has a score of 1-20, making the overall variable 4-50, and carry on like that, upgrading "equip rank" as you add new systems to the game.
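
A minimal sketch of that kind of weighted equip score, assuming four systems and counting the primary weapon double. The attribute names and weights are placeholders, not values from the game.

```python
# Hypothetical weights: the primary weapon counts double, everything else once.
WEIGHTS = {"armour": 1, "primary_weapon": 2, "secondary_weapon": 1, "speed": 1}


def equip_rank(levels):
    """Weighted sum of attribute levels (each level is an int from 1 to 10)."""
    return sum(WEIGHTS[name] * level for name, level in levels.items())


grunt = {"armour": 1, "primary_weapon": 1, "secondary_weapon": 1, "speed": 1}
maxed = {"armour": 10, "primary_weapon": 10, "secondary_weapon": 10, "speed": 10}
print(equip_rank(grunt), equip_rank(maxed))  # 5 50
```

Adding a new system later is then just one more entry in `WEIGHTS`.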

If you are projecting a lot of different systems maybe it's easier to split it into def and off and modularize it again.

You could even grant the character a smexy title when they reach a certain equip rank or open up more items. Pushing them to focus on the rewards from playing and building their character.

With groups of 4 you can then add up the rank score of each member and use that int to do group matching. Giving you a "team equip rank" to measure off.

That's my first thought on how I might view it.

Kilgore__Trout · 23-01-2014 11:36PM

Your idea of mass testing the effect of different attributes on the outcome of a match seems good. I guess you only need to get it reasonably well balanced at launch, you'll likely get tons of player feedback and can record the outcomes of actual player matches post launch.

Just out of curiosity, how would you run that many matches? Automation and increasing timescale? I'm guessing you aren't going to statistically simulate the outcomes of that many matches, as it's this information you are hoping to get.

Will player skill factor into matchmaking too?

Anyway, sounds like an interesting project, and I wish you luck with it.

DeVore · 24-01-2014 12:19AM

Torakx, some good and some problematic ideas there:

1. We can't "gate" the player community (a la your dungeon suggestion) because of something called "liquidity". That's the pool of people we choose our combatants from. If we section the community into Bronze, Silver and Gold ratings then we reduce the potential matching that we want to facilitate, i.e. someone at the top of Bronze could have been matched against someone at the bottom of Silver but now won't be because of the need to have a certain "upgrade score" to gain "entry" to the upper echelons.

2. The idea of ranking is what I'm suggesting and your approach is fine except we need to determine the relative worth of a point in armour vs say a point in weapons vs the capacity to carry two weapons. You didn't elaborate how you would determine that but otherwise, yep... that's basically the suggestion I'm coming up with myself.

3. The game is competitive, not co-operative, so the grouping value of the 4 combatants isn't an issue.

4. Splitting the rank into Def and Off occurred to me, but then it seems like we have the same problem of "ranking" the value of upgrades but now we have two numbers to match on.

Kilgore: We can run 300 matches a minute on each of 2 machines. That's 600 matches a minute, 36,000 matches an hour, or about 3 hours to get 100,000 matches, which is what I'm looking for at first. So gathering data from the bots isn't an issue. Gathering GOOD data and asking good questions of it... that's a different matter.

So, the data will detail the various levels of the player attributes and where they came (1st, 2nd, 3rd, 4th) and the time they did it in.
My puzzle is how to derive the value of a single point of upgrade in a specific weapon from that data. Clearly I can select out games where everything was equal except that one attribute and then look at the results as that attribute varied... that's my current plan of attack.
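
As a rough sketch of that plan of attack: filter the match records down to bots that sit at a baseline on everything except one attribute, then average the finishing place per level of that attribute. The record layout, the attribute names and the toy numbers below are assumptions for illustration, not real match data.

```python
from collections import defaultdict
from statistics import mean


def mean_place_by_level(records, attribute, baseline):
    """Average finishing place (1 = won) for each level of one attribute.

    records  : list of (attrs dict, finishing place) pairs, one per bot per game
    baseline : the levels every *other* attribute must have for a record to count
    """
    by_level = defaultdict(list)
    for attrs, place in records:
        others_match = all(attrs[k] == v for k, v in baseline.items()
                           if k != attribute)
        if others_match:
            by_level[attrs[attribute]].append(place)
    return {level: mean(places) for level, places in sorted(by_level.items())}


# Toy records, invented for the example.
records = [
    ({"armour": 1, "speed": 5}, 4),
    ({"armour": 1, "speed": 5}, 3),
    ({"armour": 2, "speed": 5}, 2),
    ({"armour": 2, "speed": 7}, 1),  # ignored: speed differs from the baseline
]
result = mean_place_by_level(records, "armour", {"armour": 1, "speed": 5})
print(result)
```

The drop in average place from one level to the next is then a crude estimate of what that upgrade step is worth.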

Bot skill is quite high, and they regularly beat us in testing so I'm not TOO concerned about their skill. If a player can't get as much out of his setup as a bot/top-player, that's a failing on their side and they deserve to struggle/lose.

My final plan is to be able to predict the outcome of about 1M matches with the ranking system designed and then run that many matches and see if I was close or not, because that will give me the answer to the question "Is my ranking system reflective of the real world". Of course the only real answer to that is to have a sh*t ton of players play the game, but finding 10,000 people to play a game 100 times before launch is not simple.

satchmo · 24-01-2014 12:37AM

My first thought was "playtesting, and lots of it". But even with your idea of doing some automatic testing and classification with bots (I like this), even a small number of attributes and levels is going to make the number of possible combinations explode. And that's assuming that all attributes are fairly balanced; that a high primary weapon strength isn't going to dominate all other attributes, for example.

I'd suggest coming at it more heuristically - take a rough guess at how each stat would affect the player's overall strength, but also take into account how the player has fared in previous matches. The more they play, the more you get an idea of their level, and you can start to populate teams with a variety of players to match the teams evenly. This of course makes various assumptions about your game, so might or might not work depending on gameplay.

This all reminds me of Modern Warfare's matchmaking. They use a (closely-guarded) fuzzy algorithm based on many factors to try and fill each match with a variety of players at different levels in order to give a good game for everyone. If you know anyone at Demonware, now's the time to take them out and get them drunk

Drift · 24-01-2014 12:32PM

Hi Devore,

I stumbled over this thread and it intrigues me. I have no games development experience but I've done a lot of numerical programming (i.e. FORTRAN .. for shame!) so what I'm about to say below might be pure useless.

Firstly I presume there's some way of ranking players after a game .... I'm thinking either a straight out SCORE or REMAINING HEALTH. Something that's not just a 1/0, dead/alive type result.

The score of any bot at the end of a game can then be expressed as a function of the relative strength of each attribute the bot has compared to other bots in the game. The fact that the attributes are non-linear is going to make that a lot harder!!!

Score: SC
Relative Armour score of the player in question: AR
Relative Primary Weapon Strength of the player in question: PWS
Relative Secondary Weapon Strength of the player in question: SWS
Relative Speed of the player in question: SP
Relative Other Attribute of the player in question: OA

a,b,c,d,e ...... factors denoting the relevant importance of each attribute in determining the score.

So for every player at the end of a game:
SC = (a * AR) + (b * PWS) + (c * SWS) + (d * SP) + (e * OA)

After you run your 100,000 bot matches you can just do a regression analysis to determine a,b,c,d & e. You'll then have factors telling you the importance of each relative attribute in determining a player's final score within a game. You could then use this as an initial starting point and after a defined period of beta testing re-run the regression analysis based on results using human players.

A formula could then be developed to give each player a hidden TOTAL STRENGTH rating based on a summation of the various attributes using the results of the regression analysis.
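
To make the regression step concrete, here is a small self-contained sketch that fits a, b, c, d and e by ordinary least squares. The data is synthetic: the "true" weights and the noise level are invented so the fit has something to recover, and `fit_weights` is just an illustrative name. It follows the formula above in having no intercept term, which assumes the relative attribute scores are centred around zero.

```python
import random


def fit_weights(rows, scores):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    n = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * s for r, s in zip(rows, scores))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


# Synthetic stand-in for bot results. These weights are made up.
random.seed(0)
true_w = [3.0, 5.0, 2.0, 1.0, 0.5]          # a, b, c, d, e
rows = [[random.uniform(-1, 1) for _ in true_w] for _ in range(2000)]
scores = [sum(w * x for w, x in zip(true_w, r)) + random.gauss(0, 0.1)
          for r in rows]

est = fit_weights(rows, scores)
print([round(w, 2) for w in est])


def total_strength(relative_attrs, weights=est):
    """Hidden TOTAL STRENGTH rating: weighted sum of the relative attributes."""
    return sum(w * x for w, x in zip(weights, relative_attrs))
```

With real data, a library routine such as `numpy.linalg.lstsq` would do the same fit in one call and cope better with badly conditioned inputs.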

The real difficulty would lie in defining the relative attribute scores for each game. If the attributes were linear you could just get the difference between a player's attribute and the mean attribute score of the four players in the game. But it's not linear!!!

Do you have a function that defines the non-linearity of the attributes? Or is it also unknown to you? If you have a function defining each attribute it would be easier.

I hope that makes sense and isn't way off the mark. I find these types of things interesting from a maths point of view. Actually maybe the maths forum might help you because they'd know a lot more about this than me!!

DeVore · 24-01-2014 01:44PM

Heya, that's great... lots of food for thought.

Satchmo, you are right. I forgot to mention we will have a chess-like ELO rating for the player too, so that a terrible player who has managed to drag his equipment up to medium level doesn't always get matched with people who are really good with medium equipment.

But this isn't relevant for bots, yet! We will be giving bots personalities and then we will be able to test matching between high-skill bots and low-skill bots, but that's for another day.

And yes, the number of attribute combinations alone is considerable. Add to that the non-linearity and the vagaries of play and play style and you have a big ball of chaos.

Drift, that could be a quick enough way to determine rough relative strengths. It's a 12 variable system with 6 of those variables ranging from 1-10, so if I let it simply randomly pick attributes for bots then I would have to run something of the order of 10,000,000 games just to have a few identical-but-for-one-attrib bots in games required to do the regression. But I think the approach has merit: if we constrain the attributes to be within a range of each other on any given bot, then we hugely cut down the solution space.

Also, I do know the non-linear equation for the progression so I can un-non-linear it (it's a word!:) ) and make decent guesses that way.

The proof of all this I guess will be to take several random bots and predict the outcome of their matches based on this Power Index (let's call it) calculation for each. Actually I suppose the TRUE test is to do that for humans but that brings in human skill which, as I mentioned above, is a whole other can of worms!
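
One rough way to score that kind of prediction test: call the bot with the highest Power Index the predicted winner and count how often it actually wins. The game records below are invented, and `winner_hit_rate` is only an illustrative name.

```python
def predicted_winner(power_index):
    """The bot with the highest Power Index is the predicted winner."""
    return max(power_index, key=power_index.get)


def winner_hit_rate(games):
    """Fraction of games where the predicted winner actually won.

    games: list of (power_index dict, name of the actual winner) pairs.
    """
    hits = sum(1 for power, winner in games
               if predicted_winner(power) == winner)
    return hits / len(games)


# Invented results, purely for illustration.
games = [
    ({"bot_a": 55, "bot_b": 40, "bot_c": 38, "bot_d": 30}, "bot_a"),
    ({"bot_a": 20, "bot_b": 45, "bot_c": 44, "bot_d": 25}, "bot_c"),
    ({"bot_a": 33, "bot_b": 31, "bot_c": 60, "bot_d": 29}, "bot_c"),
    ({"bot_a": 50, "bot_b": 52, "bot_c": 10, "bot_d": 12}, "bot_b"),
]
print(winner_hit_rate(games))  # 0.75
```

In a 4-bot game, blind guessing gets the winner right about 25% of the time, so that is the number a useful index has to beat. For closely matched bots a rate only a little above 25% would actually be a good sign, since it means the matches are hard to call.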

Drift · 24-01-2014 03:25PM

DeVore wrote: »

Actually I suppose the TRUE test is to do that for humans but that brings in human skill which, as I mentioned above, is a whole other can of worms!

Yup, the human element doesn't lend itself to any mathematics!!!

After a certain number of games have been played though it might be possible to include the "ELO" style rating as one of the unknowns in the regression. It might go some way to including a player skill rating when balancing the games.

Even with 12 variables I don't think 10,000,000 games would be necessary, particularly when you are only making an approximate correlation. But the more the better obviously!!!

I'd love to know how it turns out ... any possibility of keeping us up to date on how it works?

DeVore · 24-01-2014 04:01PM

Sure, absolutely. Especially when the NDA on the actual game is lifted and we can recruit alpha testers. Then I'll be able to release various bits of real data I hope and show everyone what we have been doing under the hood. In the meantime I will keep you all updated with progress on this topic.

The truth is that I could probably fairly easily muddle something together which would adequately do the job, but since there is a mix of people looking to know what Games Design is about, I thought this would be an interesting example of something real which will (hopefully) be played by a lot of people in time!

The reason I estimated 10M is because there are at least 6 variables which vary from 1-10; that's a million combinations between them. The other 6 are generally On/Off or 1-3, so that's about a 2^6 multiplier on 1M. 2^6 = 64, so that's 64M bot combinations. Each game has 4 bots in it so we'd need 16M games just to have one bot of each kind in a game. So in fact, to have anything like a statistically significant sample size (say 20 games for each bot combination), we'd need 320M games. That's a lot of games.
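
The same back-of-envelope estimate as a few lines of arithmetic, treating the six smaller attributes as simple on/off switches as in the paragraph above (the variable names are only for readability):

```python
ten_level_combos = 10 ** 6          # six attributes with ten levels each
on_off_combos = 2 ** 6              # six attributes treated as on/off
bot_configs = ten_level_combos * on_off_combos

bots_per_game = 4
games_for_one_of_each = bot_configs // bots_per_game
games_per_config = 20               # rough "enough samples" figure
games_needed = games_for_one_of_each * games_per_config

print(bot_configs)             # 64000000
print(games_for_one_of_each)   # 16000000
print(games_needed)            # 320000000
```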

We're going to have to cut those combinations down a little and also do some extrapolation of results but it should be do-able.

So, next week we are going to run the first series of bot games and capture the data from them. I'll let you know how it goes.

DeVore · 31-01-2014 06:57PM

Quick update. It transpires that there are 9 variables ranging from 1-10 and 3 ranging from 1-3 so the combination-space is off the scale for any sort of complete testing.

Currently we are running each step in each attribute separately in a single bot against a baseline of 3 standard bots and recording the result. We're doing that for 500 iterations per step, so that's about 100 attribute-steps at 500 game-iterations each... which is only 50,000 actual games. Reasonably manageable.
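
A small sketch of how that one-attribute-at-a-time test plan could be enumerated, with the full combination space alongside for comparison. The attribute names are placeholders; only the counts (9 ten-level attributes, 3 three-level attributes, 500 iterations per step) come from the posts above.

```python
# Hypothetical attribute names; only the counts come from the thread.
TEN_LEVEL = [f"attr_{i}" for i in range(1, 10)]      # 9 attributes, levels 1-10
THREE_LEVEL = [f"option_{i}" for i in range(1, 4)]   # 3 attributes, levels 1-3
ITERATIONS_PER_STEP = 500


def one_at_a_time_plan():
    """List every (attribute, level) step to test against the baseline bots."""
    plan = [(name, level) for name in TEN_LEVEL for level in range(1, 11)]
    plan += [(name, level) for name in THREE_LEVEL for level in range(1, 4)]
    return plan


plan = one_at_a_time_plan()
total_games = len(plan) * ITERATIONS_PER_STEP
full_space = 10 ** 9 * 3 ** 3    # every possible single-bot configuration

print(len(plan))      # 99
print(total_games)    # 49500
print(full_space)     # 27000000000
```

The trade-off is that varying one attribute at a time cannot see interactions between attributes, which is why it only gives a rough estimate of each step's value.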

While this only tells us how each upgrade-step performed against a baseline, we are hoping that that will be a rough enough estimate of "value" of each attribute to do us.

Additionally, a friend of mine is the lecturer in Big Data and Machine Learning in the Uni of Amsterdam (he probably has a much posher title than that) and he's going to approach it from a more theoretical/experimental approach and see how he gets on.

All this for a simple mobile game...

Drift · 03-02-2014 05:33PM

DeVore wrote: »

Additionally, a friend of mine is the lecturer in Big Data and Machine Learning in the Uni of Amsterdam (he probably has a much posher title than that) and he's going to approach it from a more theoretical/experimental approach and see how he gets on.

If he does a multiple regression analysis I want my name on the paper!!!
