Rating systems

Solfi said:
  1. don't like the theme of it, or
  2. don't feel certain about any rules it leans on, and whether it upsets game balance significantly.
Sounds like a +0 rating to me.


Remember, the rating system is meant to gauge people's opinions of submissions, not their objective worth.


These are the criteria I personally use:


I like it: +1


I don't like it, but there's nothing inherently wrong with it: +0


I dislike it because it's game-breaking or poorly put together: -1


-S
 
Solfi said:
Yokai. Ganbatte.
You see, this is what happens when I delete the "International" forum. ;)


I have no idea what the hell you just said.


-S
 
... sorry 'bout that, it's an old anime-related injury  :wink:


yokai = Roger that


ganbatte = will do my best (might be the incorrect form, but who cares)
 
More musings on the rating system:


As it stands, if two submissions have the same numerical rating, the one with more votes takes precedence in the rankings. This is a good thing.


However, a submission with MANY good ratings and one bad one will always rank lower than one with only a single +1 vote, because its average will be less than 1. For example, ten +1s and one -1 average out to about +0.82, which loses to a lone +1 at +1.0.


This means that, on the "best rated" lists, items that have been rated many times will most likely be bumped off by items rated only once or twice, unless everyone considers them perfect. This, I think, is not a good thing.


Perhaps a more complicated metric could be used to generate the "best rated" lists to compensate for this.


Math isn't my strong suit, but a couple of ideas off the top of my head are:

  • Requiring that a submission be rated X times before it qualifies for the "best rated" list.
  • Somehow taking into account a ratio between the numerical rating and the number of times it was rated (both ideas are sketched below).
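
A minimal sketch of both ideas, in Python. The cutoff value and the damping formula are placeholders I made up for illustration, not a worked-out design:

```python
# Sketch only: MIN_RATINGS and the n / (n + 1) damping factor are
# arbitrary placeholders, not a finished design.
MIN_RATINGS = 3

def qualifies(ratings):
    """Idea 1: only admit submissions rated at least X times."""
    return len(ratings) >= MIN_RATINGS

def weighted_score(ratings):
    """Idea 2: fold the vote count into the score by damping the
    average with n / (n + 1)."""
    n = len(ratings)
    if n == 0:
        return 0.0
    return (sum(ratings) / n) * (n / (n + 1))

print(weighted_score([1] * 10 + [-1]))  # ten +1s, one -1 -> 0.75
print(weighted_score([1]))              # a single +1     -> 0.5
```

With the damping in place, the much-rated submission finally outranks the lone +1.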


I'm sure someone with a better head for number crunching could offer alternatives.


-S
 
Perhaps "X" in the first example could be = to the average number of ratings for a submission in that category. This makes it automatically self-scaling as the number of rated submissions increases.


For instance, if 90 ratings have been handed out amongst the spells, and there are 30 spells in the database, the cutoff number would be 3. Then, only spells that have received 3+ ratings would qualify for the best-rated list.


As the average number of ratings per submission rises or falls, the qualification criterion adjusts itself automatically.
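
In code, that amounts to something like this (a sketch; the rating_counts dict and the spell names stand in for whatever the real database query returns):

```python
# Self-scaling cutoff sketch. rating_counts maps submission -> number
# of ratings received; the cutoff is the category-wide average.
def qualifying_submissions(rating_counts):
    cutoff = sum(rating_counts.values()) / len(rating_counts)
    return [name for name, count in rating_counts.items()
            if count >= cutoff]

counts = {"fireball": 5, "haste": 1, "sleep": 3}
print(qualifying_submissions(counts))  # cutoff is 3 -> ['fireball', 'sleep']
```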


Does this make any sense?


-S
 
I think any system that has a cutoff like "you must have X ratings on this submission" will fail, given the actual number of ratings we have now.


It may be useful to have another classification: "most discussed submissions".


But before I did that, I would ask this question: What is the core goal we're trying to accomplish?  I think the answer is, "to present a view of the highest-quality submissions we have, as judged by our users".  "Most discussed" is not necessarily "highest quality".  Nor, necessarily, is "highest rated".
 
memesis said:
I think any system that has a cutoff like "you must have X ratings on this submission" will fail, given the actual number of ratings we have now.
Which is a problem that I think my follow-up idea addresses adequately.

memesis said:
But before I did that, I would ask this question: What is the core goal we're trying to accomplish?  I think the answer is, "to present a view of the highest-quality submissions we have, as judged by our users".
Agreed. However, I'd personally put more weight behind a rating made up of forty +1s and three -1s than behind one composed of just one or two +1s, or even twenty +1s. It would be nice to have a system that reflected this; as it stands now, it doesn't. I think the ranking system needs to be weighted.
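
One way to get that kind of weighting (a sketch only; the prior weight of 10 is an arbitrary knob I picked for illustration) is a smoothed average that pulls every score toward 0 until enough votes accumulate:

```python
# Smoothed average: treat each submission as if it started with
# PRIOR_WEIGHT phantom +0 votes. PRIOR_WEIGHT = 10 is arbitrary.
PRIOR_WEIGHT = 10

def smoothed_score(ratings):
    return sum(ratings) / (len(ratings) + PRIOR_WEIGHT)

print(smoothed_score([1] * 40 + [-1] * 3))  # ~0.70 (forty +1s, three -1s)
print(smoothed_score([1] * 20))             # ~0.67 (twenty +1s)
print(smoothed_score([1] * 2))              # ~0.17 (two +1s)
```

With that in place, the heavily rated submission comes out on top, which is exactly the ordering I'd want.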


-S
 
Another (possible) system would be to have the ratings be summed instead of averaged. For example, a submission with the following ratings:

  • +0
  • +1
  • +0
  • +1
  • -1
  • +1
would have an overall rating of +2 instead of +0.333.


That would create a bit more nuance between the items on a "best rated" list, rather than just a list that's entirely +1s.
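
For anyone who wants to check the arithmetic:

```python
ratings = [0, 1, 0, 1, -1, 1]
print(sum(ratings))                 # 2
print(sum(ratings) / len(ratings))  # 0.333...
```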


-S
 
Re: Thumbs

wordman said:
In this system, users are offered only three choices: thumbs up, thumbs down or uncommitted. Scores for new items start at zero. A thumbs up vote gives the item +1, a thumbs down -1 and an uncommitted +0. The total is then divided by the number of votes cast, yielding a final score of -1.0 to 1.0, with 0 as the average. You cannot vote on your own item.
Pros:

  • Items start in the center of the voting range. Good items will bubble up, the bad ones down, and the average or ignored will stay average.
  • Provides reasonable guarantee that people will have a similar notion of what the different votes mean.
Cons:

  • Binary good/bad doesn't allow room for nuanced votes.
  • Doesn't solve problem of fewer votes skewing the result.
Variations:

  • Number range could change from -1.0 to 1.0 to anything else (0 to 10 with five being the starting value, etc.)
  • Rather than average the score, the final score could just be the total of the vote points. This would keep the average centered at zero, but allow theoretically infinite range. This would mean that items with more votes could (potentially) have much higher scores, which may be desirable.
Quoted from earlier in the thread, as a good refresher.
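
For anyone who wants to tinker with it, here's a quick Python mock-up of the system as described (the class and names are mine, not the actual site code):

```python
# Toy mock-up of the thumbs system quoted above; not the real site
# code. Votes are +1 (up), -1 (down) or 0 (uncommitted).
class Item:
    def __init__(self, owner):
        self.owner = owner
        self.votes = {}  # voter -> vote

    def vote(self, voter, value):
        if voter == self.owner:
            raise ValueError("You cannot vote on your own item.")
        if value not in (-1, 0, 1):
            raise ValueError("Vote must be -1, 0 or +1.")
        self.votes[voter] = value

    def average_score(self):
        """Base system: average of votes, -1.0 to 1.0; 0 if unrated."""
        if not self.votes:
            return 0.0
        return sum(self.votes.values()) / len(self.votes)

    def summed_score(self):
        """Variation: total of the vote points, theoretically unbounded."""
        return sum(self.votes.values())

item = Item(owner="wordman")
item.vote("solfi", 1)
item.vote("memesis", 0)
print(item.average_score())  # 0.5
print(item.summed_score())   # 1
```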


-S
 
Another alternative might be to show the score using the format A:B:C:D:E, where:

  • A = Total sum of scores
  • B = Average score
  • C = Total +1 scores
  • D = Total 0 scores
  • E = Total -1 scores
...and items are sorted by comparing these values in that order.
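
In code, that comparison falls out naturally if each submission's key is a tuple, since tuples compare element by element. A sketch: the vote lists are invented, and I've negated the -1 count on the assumption that fewer -1s should rank higher:

```python
# A:B:C:D:E sort key sketch. Later fields only break ties in earlier
# ones, because Python compares tuples element by element.
def score_key(ratings):
    total = sum(ratings)
    average = total / len(ratings) if ratings else 0.0
    return (total,                 # A: total sum of scores
            average,               # B: average score
            ratings.count(1),      # C: total +1 scores
            ratings.count(0),      # D: total 0 scores
            -ratings.count(-1))    # E: total -1 scores (fewer is better)

submissions = {"fireball": [1, 1, 0, -1], "haste": [1, 1]}
ranked = sorted(submissions, key=lambda s: score_key(submissions[s]),
                reverse=True)
print(ranked)  # ['haste', 'fireball']
```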
 
As an example, to show how things work, I've changed the "best rated" view from AVG() to SUM(), meaning that good items are sorted purely by the sum of their ratings. Take a look and let me know how it works for everyone.
 
First impressions:


While the results haven't changed too much (not surprising), the at-a-glance usefulness of the ratings is improved by this method.


The summed ratings also give (at least the illusion of) more nuance, rather than having every list be a long string of "+1". I think that with time, and more ratings overall, there will be more noticeable variance in the results, which, to my mind, is good.


-S
 
wordman said:
Another alternative might be to show the score using the format A:B:C:D:E, where:
  • A = Total sum of scores
  • B = Average score
  • C = Total +1 scores
  • D = Total 0 scores
  • E = Total -1 scores
...and items are sorted by comparing these values in that order.
That sounds like it might be slightly overkill. Perhaps sorting by sum, then by total +1s, would be sufficient?
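
The reduced key, for comparison (same invented vote lists as in the earlier sketch):

```python
# Sort by sum first, then by the count of +1 votes to break ties.
submissions = {"fireball": [1, 1, 0, -1], "haste": [1, 1]}
ranked = sorted(submissions,
                key=lambda s: (sum(submissions[s]), submissions[s].count(1)),
                reverse=True)
print(ranked)  # ['haste', 'fireball']
```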


-S
 
One major difference that just occurred to me is that with sums, a neutral (+0) rating does not affect the total, whereas with averages, a neutral rating effectively lowers it. (Two +1s average +1.0; add a +0 and the average drops to +0.67, while the sum stays at +2.)


-S
 
Stillborn said:
One major difference that just occurred to me is that with sums, a neutral (+0) rating does not affect the total, whereas with averages, a neutral rating effectively lowers it.
-S
Arguments can be made both ways about whether this is a good thing or not. With sum(), "neutral" means "I abstain", whereas with avg(), it means "this is a so-so submission".
 
memesis said:
Arguments can be made both ways about whether this is a good thing or not.  With sum(), "neutral" means "I abstain", where with avg(), it means "this is a so-so submission".
In the case of sums, a +0 could still mean "this is a so-so submission". Since all submissions start at +0, giving another +0 rating denies it any mobility, whether upwards or downwards, helping to keep it relegated to "so-so-land". Of course, in this case, not voting has the exact same effect. At least with a +0 vote, you can leave comments.


-S
 
