Proposal: a reputation system for blog comments


UPDATE - We received lots of brilliant feedback about our proposal. A revised version is now available here, with more “philosophical” background here.


The problem

Every blogger has to deal with comment spam, and the strategy adopted is always a trade-off between reader convenience and effectiveness.

Publish or discard? The blogger has to make this decision alone. He cannot rely on any external wisdom to support it, and the commenter cannot even present her credentials.

The blogger is left alone with a bunch of tools: grey/black/white lists, Bayesian filters, and so on. Wouldn’t it be beautiful if the blogosphere could seamlessly cooperate to defeat the evil of comment spam, spam blog networks and trackback abuse?

The approaching digital identity big bang will shape a new world where you could store your identity attributes within the framework of an online identity, owned by you and managed by one of the several identity providers competing in this field. Our proposal is workable only in this future scenario.


As of today you can easily get an i-name, an LID URL or an OpenID URL. Nonetheless, they are unlikely to disrupt the usual way of exchanging identity attributes any time soon. You may occasionally stumble across an LID-enabled website or an online service accepting i-names, but this is still rare.

This proposal also aims to accelerate the success of digital identity providers by enabling them to add a simple (and hopefully useful) application to their services: reputation management for blog commenters. It will be the first real problem solved with an Identity 2.0 approach.

A quick summary

Our idea mimics real-life reputation mechanisms. Building on that, we suggest a protocol that identity providers could easily implement. This will help blog owners manage comment approval more conveniently.

We envision two kinds of intertwined “reputations”: the reputation a blog B assigns to a user U, and the reputation a blog B1 assigns to another blog B2. We refer to them as R(U, B) and R(B2, B1) respectively. Both pieces of information are handled implicitly by the identity providers, and the details of the implementation could be an open ground for competition between the players in this field.

It is important to note that a “reputation” is not an absolute value associated with a user, but a relative value that each blog holds about a user.
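To make the "relative, not absolute" point concrete, here is a minimal sketch of how a provider might store reputations. The class name `ReputationTable` and its methods are hypothetical; the 0 to 100 range and the neutral value of 50 are taken from the details given later in this proposal.

```python
NEUTRAL = 50  # neutral value the proposal assigns when nothing is known


class ReputationTable:
    """Stores R(subject, observer): what `observer` thinks of `subject`.

    Hypothetical structure: the key is the pair, so one user (or blog)
    can hold a different score with every blog that has judged it.
    """

    def __init__(self):
        self._scores = {}

    def get(self, subject, observer):
        # Unknown pairs fall back to the neutral value.
        return self._scores.get((subject, observer), NEUTRAL)

    def set(self, subject, observer, value):
        # Clamp to the 0..100 range used in the proposal.
        self._scores[(subject, observer)] = max(0, min(100, value))


table = ReputationTable()
table.set("U", "B1", 80)  # B1 has accepted most of U's comments
table.set("U", "B2", 20)  # B2 has rejected most of them
```

The same user U ends up with two unrelated scores, and a blog that has never seen her gets the neutral value.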

The “workflow” is quite streamlined:

  • a blog reader asks her identity provider for a unique comment token;
  • when submitting the comment, the user signs it with her identity URL and the above token;
  • the blog, accredited by the token, asks the identity provider for the “reputation” of the commenter;
  • the blog accepts or deletes the comment and sends feedback about this decision to the identity provider;
  • the identity provider uses the feedback to update the reputations of the commenter and of those blogs that have previously received her comments.


More details

The reputation of a commenter is just one of the identity attributes managed by her own identity provider. In our scheme it is just a number, more precisely a value between 0 and 100. The higher the number, the better the reputation.

Each blog owner should set two reputation thresholds T1 and T2, with T1<=T2 (these values could be set differently for each provider, based on the trust the blogger places in it).

Comments from people with a reputation lower than T1 will be automatically discarded or filed as spam, while comments from people with a reputation higher than T2 will be automatically published. Comments from people in the range from T1 to T2 need an explicit manual review.
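The threshold rule can be sketched in a few lines. The function name `triage` and the returned labels are hypothetical; the boundaries follow the text above, so a reputation exactly equal to T1 or T2 goes to manual review.

```python
def triage(reputation: int, t1: int, t2: int) -> str:
    """Decide what to do with a comment, given the commenter's reputation.

    t1 and t2 are the blog owner's thresholds (0 <= t1 <= t2 <= 100).
    """
    if not 0 <= t1 <= t2 <= 100:
        raise ValueError("thresholds must satisfy 0 <= T1 <= T2 <= 100")
    if reputation < t1:
        return "discard"  # or file as spam
    if reputation > t2:
        return "publish"
    return "review"       # T1 <= reputation <= T2: manual review
```

With T1 = 30 and T2 = 70, a brand new commenter with the neutral reputation of 50 would land in the manual review queue. Setting T1 = T2 removes most of the manual step, at the cost of trusting the provider's numbers more.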

So, if blog reader U1 wants to post a comment C1 to blog B1 …

  1. She has to request a comment token K1 from her identity provider in advance (this could be easily accomplished via a bookmarklet). The token is just a unique key, never issued twice by the same provider. The provider keeps track of the user and the blog associated to the issued token in order to validate future queries from the blog.

  2. Then U1 can submit the comment along with the token and some pointer to her identity provider, e.g. her full identity-URL.

  3. Afterward, blog B1 resolves the identity URL of the commenter and requests her reputation from the correct identity provider. The token K1 proves that its request is legitimate.

  • If U1 has never written a comment before, the identity provider returns a neutral reputation R(U1, B1)=50.

  • Otherwise the identity provider computes the reputation R(U1, B1) from those of the blogs that received a comment from U1 in the past: R(U1, B2), R(U1, B3), …

  4. Then blog B1 reviews the actual comment (manually or automatically) and notifies the identity provider about the acceptance or rejection of the comment. With this feedback the identity provider can now:

  • increase or decrease the reputation R(U1, B1);

  • update the reputations R(Bi, B1) of all the blogs Bi which assessed the reputation of U1 in the past, i.e. increase the reputation of blogs that made a similar evaluation of U1 and decrease the reputation of the others.
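The exchange above, seen from the identity provider's side, might look like the following sketch. Everything here is an assumption made for illustration: the class and method names, the single-use tokens, the fixed step of 10 points, and the plain average used as a placeholder for the real reputation computation, which the proposal leaves to each provider. Blog-to-blog reputations are omitted to keep the sketch short.

```python
import secrets

NEUTRAL = 50  # reputation returned for a commenter with no history


class IdentityProvider:
    """Hypothetical in-memory provider covering token issue, query, feedback."""

    def __init__(self):
        self._issued = set()   # every token ever issued, to avoid reuse
        self._pending = {}     # token -> (user, blog), awaiting feedback
        self._scores = {}      # (user, blog) -> reputation in 0..100

    def issue_token(self, user, blog):
        # Step 1: a unique key, never issued twice by this provider.
        token = secrets.token_urlsafe(16)
        while token in self._issued:
            token = secrets.token_urlsafe(16)
        self._issued.add(token)
        self._pending[token] = (user, blog)
        return token

    def reputation(self, token, user, blog):
        # Step 3: the token proves that the blog's query is legitimate.
        if self._pending.get(token) != (user, blog):
            raise PermissionError("token does not match this user and blog")
        known = [s for (u, _blog), s in self._scores.items() if u == user]
        if not known:
            return NEUTRAL
        return sum(known) / len(known)  # placeholder: unweighted average

    def feedback(self, token, accepted, step=10):
        # Step 4: the blog reports its decision; the token is then retired.
        if token not in self._pending:
            raise PermissionError("unknown or already used token")
        user, blog = self._pending.pop(token)
        current = self._scores.get((user, blog), NEUTRAL)
        delta = step if accepted else -step
        self._scores[(user, blog)] = max(0, min(100, current + delta))


provider = IdentityProvider()
k1 = provider.issue_token("U1", "B1")
first = provider.reputation(k1, "U1", "B1")   # no history yet: neutral
provider.feedback(k1, accepted=True)          # B1 publishes the comment
k2 = provider.issue_token("U1", "B2")
second = provider.reputation(k2, "U1", "B2")  # now based on B1's verdict
```

Retiring the token after feedback is one possible policy; it prevents a blog from reporting several verdicts for a single comment.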

The balance between commenter reputations and blog reputations will make building “networks of spammers” ineffective. A commenter with a very good reputation built by posting on fake blogs will be downgraded as soon as she tries to post outside the fake blogs, and the reputations of the fake blogs will quickly be destroyed as well. On the other hand, a good, well-deserved reputation cannot be easily ruined. The system is intrinsically decentralized.

There are a lot of details that are specific to the implementation of each identity provider. Among them:

  • the computation of R(U1, B1) given R(U1, B2), R(U1, B3), …, which must also take R(B2, B1), R(B3, B1), … into account
  • the computation of R(Bi, B1) given R(U1, Bi) and R(U1, B1)
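As one possible reading of these two computations, the sketch below uses a trust-weighted average for R(U1, B1) and a simple agree/disagree rule for R(Bi, B1). The formulas, the function names, the `step` and `tolerance` parameters and the sample numbers are all assumptions; the proposal deliberately leaves these choices to each provider.

```python
NEUTRAL = 50


def commenter_reputation(user_scores, blog_trust):
    """Estimate R(U1, B1) from the scores R(U1, Bi) given by other blogs.

    user_scores: {Bi: R(U1, Bi)}
    blog_trust:  {Bi: R(Bi, B1)}, how much B1 trusts each blog Bi
    Each score is weighted by B1's trust in the blog that gave it.
    """
    total_weight = sum(blog_trust.get(b, NEUTRAL) for b in user_scores)
    if total_weight == 0:
        return NEUTRAL  # no history, or no trusted source at all
    weighted = sum(score * blog_trust.get(b, NEUTRAL)
                   for b, score in user_scores.items())
    return weighted / total_weight


def update_blog_trust(blog_trust, user_scores, own_score, step=5, tolerance=25):
    """Return new R(Bi, B1) values after B1 has formed its own R(U1, B1).

    Blogs whose R(U1, Bi) is within `tolerance` of B1's own score gain
    `step` points of trust; the others lose `step` points.
    """
    updated = dict(blog_trust)
    for blog, score in user_scores.items():
        current = updated.get(blog, NEUTRAL)
        delta = step if abs(score - own_score) <= tolerance else -step
        updated[blog] = max(0, min(100, current + delta))
    return updated


# B2 is a fake blog praising U1; B3 is a blog that B1 already trusts.
scores = {"B2": 90, "B3": 30}
trust = {"B2": 10, "B3": 90}
estimate = commenter_reputation(scores, trust)   # (90*10 + 30*90) / 100
new_trust = update_blog_trust(trust, scores, own_score=20)
```

In this example the glowing score from the poorly trusted blog B2 barely moves the estimate (36.0), and once B1 rejects the comment, B2 loses further trust while B3 gains some.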

Furthermore each provider can apply different policies in relation to anonymity, number of reputations linked to the same user, etc.

Next steps

To test this proposal we are planning to:

  • design a reputation API to access the functions needed to implement the above scheme (the reputation module of the blogging platforms will call these functions, and the reputation application of the identity provider should be compliant with the API)

  • implement a functional and free reputation application exclusively intended for users registered at clipperz.com (the site will go live shortly)

  • try to implement reputation modules for a couple of leading blogging platforms (with a lot of help from their communities)
