Project Quality Metrics on Drupal.org (meta document)

Posted by calebgilbert on November 18, 2007 at 12:19am

According to a survey of over 1,000 Drupal users, the most requested feature for Drupal.org is a recommendation system for the modules section. With the exponential proliferation of Drupal modules it's certainly easy to imagine why.

Because the discussion about this subject has been splintered into so many places, it was suggested (see here and here) to create a thread within the D.O. Redesign group in order to centralize the brainstorming, mockups, proposals, etc. for this initiative.

What this thread is intended for (comments from dww):

Meta-discussion on the proposals: which ones are clearly a good idea now, which ones need more thought, which ones aren't going to work at all, etc.
Volunteers to work on the "low hanging fruit" -- the proposals that we can right now agree would be worth doing, but which won't take a huge amount of code to accomplish.
Further discussion on the specific proposals that need to be fine-tuned or otherwise better thought-out. I consider this particular proposal for end-user ratings and reviews (issue #50605) in this category. There are a lot of potential problems this will create, so we need to have good answers for those if we want to continue exploring this.

Here's an initial list of all known efforts in this space:

#32124: Enable download statistics
#187019: Checkout statistics from CVS
#50605: Project.module: Add user ratings for projects
#52475: browse projects by # of downloads
#66013: Create project sorting method using data from update.module
#77976: Project quality indicator: rating system [duplicate with #50605 -- but still some potentially useful ideas]
#79550: Automate gathering of quality metrics
#99466: add 'orphaned' module category?
#63491: Drupal Version-Module Support Matrix
#165380: Make usage statistics (from update_status) visible
#192410: Deploy module recommendations based on content in page, and recommend modules on module project pages (aka "Pivots")
#203313: Add a way for maintainers to indicate multiple supported branches
http://groups.drupal.org/node/3314 drewish's original SoC proposal about project quality metrics
http://groups.drupal.org/node/5022 Drupal Project Metrics (wiki during Drewish's SoC work)
http://groups.drupal.org/node/6186 Project node UI redesign

There are probably others scattered around, but this is at least a start... ;)

Before posting, please consider reviewing some of the links included above, if you haven't already. There has already been lots of discussion and some activity on this topic; the goal now is to reach a point of broader consensus and action.

Comments

Something to keep in mind...

Posted by webchick on November 18, 2007 at 1:07am

Joomla!'s experience with allowing user-generated ratings/reviews of their modules:
http://www.joomla.org/component/option,com_jd-wp/Itemid,105/p,344/

Any efforts I spend on this issue will be spent on #79550. I trust objective measurements a lot more than humans. :)

I must say

Posted by calebgilbert on November 18, 2007 at 1:32am

...that joomla link is quite educational. Honestly, any proposal should have to address most/all of the concerns raised there (namely the clone accounts) before being accepted.

Along those lines, one method I have seen implemented is a "trusted user status", which is required to be able to vote on certain things. Presumably, this is a role/permission which gets activated/deactivated at a given threshold, based on an algorithm 'deemed to be meaningful'.

My experience...

Posted by eaton on November 18, 2007 at 1:47am

...Is that any web site that includes any kind of voting mechanism will be gamed. If the voting mechanism is used to rank anything, in any way, it'll be gamed hard. It's tough.

True, that.

Posted by dww on November 18, 2007 at 3:27am

Take it from Mr. VotingAPI... he speaks the truth. I'm with webchick: I'd rather see effort spent on the automated metrics that are harder (that's right, I said "harder", not "impossible") to game and will very frequently be more accurate and useful than user votes or reviews.

Ratings vs Reviews

Posted by alpritt on November 18, 2007 at 1:46pm

The only way I have found star ratings to be useful is to find a mixture of positive and negative reviews so that I can read a cross section of opinions. Other than that I find them pretty meaningless. I think this is particularly true of Drupal modules, because modules are useful or not depending more on the project you are working on than some overall quality. It's not like we are voting on movies. So I completely agree that objective measurements are the way to go.

Reviews are a little different, I think. If I read a review I can pretty easily tell whether it is well thought out, honest, unbiased and so forth. They are useful in ways that simple ratings are not. However, their signal-to-noise ratio can easily drop, so I think if you are going to implement reviews you also need a system like on Amazon where the best reviews get more prominence -- based on a 'was this review useful' vote. Not sure if that would cause gaming problems as well though.

But either way, user ratings and reviews should be considered as separate entities with different merits and issues.

-
www.alanpritt.com

Adding

Posted by Amazon on November 18, 2007 at 2:01am

Adding http://drupal.org/node/188993 which recommends we have a custom module to parse the AWSTATS data files to get download statistics.
I am also adding this mostly off-topic thread about measuring CVS statistics: http://drupal.org/node/187019

To seek, to strive, to find, and not to yield

New Drupal career! Drupal profile builders.
Try pre-configured and updatable profiles on CivicSpaceOnDemand

Amazon reviews

Posted by Amazon on November 18, 2007 at 4:17am

I think there is a more authoritative source on Amazon review cheating.

http://weblogs.java.net/blog/monsonhaefel/archive/2003/11/amazoncom_revi...

Gaming always can happen, but smartmobs persist

Posted by laura s on November 18, 2007 at 5:54am

Votes alone won't win the day, but votes + comments can make a big difference. The smart/savvy admin/webmistress will look at the votes, yes, but also look at the text behind the votes. When I look at CNet ratings, for example, I look at the reviews -- the discussions -- to see what's behind the numbers. This isn't new. You never can trust just numbers in an opt-in "poll" of any sort.

Is this a community that is so likely to game the system? You never can tell, when the community is truly open, such as this one. But I have to ask: When it comes to quality ratings of modules, aren't the ad hoc case studies -- the anecdotes -- going to balance out the voting systems in a way that can yield informative data for the attentive prospective downloader?

Laura
pingVision, LLC

Laura Scott
PINGV | Strategy • Design • Drupal Development

don't be paralyzed by imperfection

Posted by moshe weitzman on November 18, 2007 at 4:19pm

sure, gaming happens in most of these systems. so what. does amazon shy away from reviews because of obvious gaming? does google quit working on search because of SEO sneakiness? My point is that user reviews are hugely valuable despite known imperfections. I think that engineers tend to discount imperfect solutions. To do so in this case is to deny the community a very valuable resource.

Paralysis vs. caution

Posted by dww on November 18, 2007 at 7:53pm

If I was being "paralyzed by imperfection" this thread wouldn't exist. I'm just SO BURNED OUT on doing d.o support for important things with incredibly minor problems that people get all bent out of shape over. This topic is a huge can of worms, with lots of potential pitfalls, so I'm moving forward carefully, not rushing into anything half-baked.

buddylist as a trust web

Posted by moshe weitzman on November 18, 2007 at 4:23pm

On the "gaming" problem, I wrote some code that I need to contribute for votingapi. It is an integration between voting and buddylist. So, in the extreme only the votes of your buddies count toward the scores that you see. The formula can be tweaked so that buddy votes are x% of the score and overall is the other 100-x%. This basically gets rid of the gaming problem, since unknown voters won't affect the scores. Further, it answers the "what module should i use" question in precisely the way i want it answered. Namely, I want to use the modules that Earl, Eaton, Dries, Angie, dww, etc. are using. Those are my trusted sources. I'm sure others have theirs.
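The weighting described above can be sketched roughly as follows. This is only an illustration of the arithmetic, not the actual votingapi/buddylist integration: the `Vote` record, the `blended_score` function, and the fallback to the overall average when no buddy has voted are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    project: str
    value: float  # e.g. a 0-100 score


def mean(values):
    """Average of an iterable, or None when it is empty."""
    values = list(values)
    return sum(values) / len(values) if values else None


def blended_score(votes, project, buddies, buddy_weight):
    """Mix the buddy average and the overall average for one project.

    buddy_weight is the "x%" from the post, expressed as a fraction in [0, 1];
    the overall average receives the remaining 1 - buddy_weight.
    """
    if not 0.0 <= buddy_weight <= 1.0:
        raise ValueError("buddy_weight must be between 0 and 1")
    project_votes = [v for v in votes if v.project == project]
    overall = mean(v.value for v in project_votes)
    if overall is None:
        return None  # nobody has voted on this project
    buddy = mean(v.value for v in project_votes if v.voter in buddies)
    if buddy is None:
        return overall  # assumption: fall back when no buddy has voted
    return buddy_weight * buddy + (1.0 - buddy_weight) * overall
```

With `buddy_weight=1.0` only buddy votes count, which is the "extreme" case in the post; lowering it lets the site-wide average back in.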

One small issue with this is that new drupal people don't have a trusted network. I think that's easily remedied by showing which users have been befriended often. Or just list the members of the security team, the committers, and the Board.

I was thinking of running this ratings system on my own domain for a while until drupal.org gets its act together. But I haven't dedicated enough time to the site.

Anyway, thoughts on this approach are welcome.

Agree it's useful enough to try

Posted by Amazon on November 18, 2007 at 5:15pm

I think user reviews are good. Here are some suggestions; pick and choose.
1) They should link to the site that is using the module. This way end users can evaluate the recommendation as part of an evaluation of the site. If they chose not to use the module because it didn't meet their criteria then I am not sure a review is appropriate.
2) They should indicate what version they are using.
3) The criticism should be constructive, acknowledge that the reviewer is using a volunteer's work, and avoid personal criticism.
4) I think we might want to consider whether project maintainers can choose to have ratings. There are projects you want to share with other developers in case they find them useful, where you want bug reports or welcome patches. However, you didn't necessarily sign up to have public reviews of your work, and you don't want putting your reputation out there to be a prerequisite to sharing code. It's one thing to have criticism from your peers; it's another to have potential clients read a scathing review of a module you have agreed to maintain but that isn't a showcase of your work.
5) Apple has something called staff picks. It's what I personally use, and I'd be interested in seeing a rollout of project reviews start off with a closed, trusted set of reviewers. It could allow for a certain tone to be set, and I think having reviews from trusted sources is better to start with. We can then slowly grant reviewer rights to more people and ultimately consider opening it up.
6) I used to use Yelp, a lot. Initially having a few reviews with a rating was good and I found it quite useful. However, now when there are over a hundred reviews I find the ratings are basically worthless. The reviews are bland and generic and it's almost easier to just go to the restaurant than make sense of which review is reflective of the experience I am likely to have. In general, there's a lot of value being created by rating and doing reviews, but it seems like a lot of work is being created to manage that value creation.

We need some volunteers to manage reviews if they go forward. I think it's going to be a focused job and having some new volunteers babysit reviews is a pre-requisite to deployment. Analytical assessments might not be better than human reviews, but once implemented they are probably a lot more cost effective.

Cool idea, but I fear the performance implications

Posted by dww on November 18, 2007 at 7:53pm

This is slick, and would be a great enhancement to ratings in general, and d.o project ratings in particular.

However, a few concerns:

A) I'm wondering how impractical a d.o buddylist implementation would become, and what kind of resources (both hardware and admin support) it would require. I've never used or looked at buddylist, so I have no direct experience.

B) I just thought of a very scary problem. :( One of the main features of a rating system would be the ability to sort modules based on ratings. The project browsing pages are already some of the most expensive queries on d.o, and some of the most complicated code. I shudder to think of how much more complicated the code and queries would have to get, and how many additional JOINs we'd be doing to say:

Show me all of the modules compatible with 5.x, from the "Images" category, that have official releases, ordered by the total rating of all votes from all the people I think are a buddy.

Holy cow is that going to kill d.o's DB, especially since you can't cache any of this because it's different for every single user. :( This would be multiple full table scans since the ORDER BY clause would be a computed value and there'd be no indexes to help us.
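To make the shape of such a query concrete, here is a rough sketch using Python's built-in sqlite3 module. The `projects`, `votes`, and `buddies` tables and their columns are invented for illustration and do not reflect the real d.o schema. The points to notice are that the sort key is an aggregate computed per viewing user, so the ordering differs for every `:uid`, and no stored index can supply it.

```python
import sqlite3

# Hypothetical, heavily simplified schema (not the real d.o tables).
SCHEMA = """
CREATE TABLE projects (name TEXT PRIMARY KEY, category TEXT, core TEXT, has_release INTEGER);
CREATE TABLE votes (voter TEXT, project TEXT, value INTEGER);
CREATE TABLE buddies (uid TEXT, buddy TEXT);
"""

# The ORDER BY column is computed from the viewer's own buddy list.
BUDDY_SORTED = """
SELECT p.name, COALESCE(SUM(v.value), 0) AS buddy_total
FROM projects AS p
LEFT JOIN votes AS v
       ON v.project = p.name
      AND v.voter IN (SELECT buddy FROM buddies WHERE uid = :uid)
WHERE p.core = :core AND p.category = :category AND p.has_release = 1
GROUP BY p.name
ORDER BY buddy_total DESC, p.name
"""


def make_demo_db():
    """Build a small in-memory database with made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO projects VALUES (?, ?, ?, ?)", [
        ("image", "Images", "5.x", 1),
        ("imagecache", "Images", "5.x", 1),
        ("gallery", "Images", "5.x", 0),     # no official release
        ("imagefield", "Images", "6.x", 1),  # wrong core version
        ("views", "Views", "5.x", 1),        # wrong category
    ])
    db.executemany("INSERT INTO votes VALUES (?, ?, ?)", [
        ("earl", "imagecache", 5), ("eaton", "imagecache", 4),
        ("earl", "image", 3), ("sock1", "image", 5), ("sock2", "image", 5),
    ])
    db.executemany("INSERT INTO buddies VALUES (?, ?)", [
        ("me", "earl"), ("me", "eaton"),
        ("other", "sock1"), ("other", "sock2"),
    ])
    return db


def browse(db, uid, core="5.x", category="Images"):
    """Return (project, buddy_total) rows ordered for one specific user."""
    params = {"uid": uid, "core": core, "category": category}
    return db.execute(BUDDY_SORTED, params).fetchall()
```

Calling `browse(db, "me")` and `browse(db, "other")` on the demo data returns the same two projects in opposite order, which is exactly why a single shared result set cannot be reused.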

If those problems can be solved, this would be a great solution to the gaming problem.

Partial solution for performance issue

Posted by calebgilbert on November 18, 2007 at 8:15pm

In addition to whatever greater-minds-than-I come up with, asynchronous loading for this kind of search/sort would lighten the load a good bit, wouldn't it? (e.g., at least we wouldn't have to load a full page request every time)

"asynchronous loading" == caching == not possible

Posted by dww on November 18, 2007 at 8:25pm

The only way to minimize the heavy lifting by doing the big terrible queries at some other time is to be able to cache the results of the queries and save them for later. However, as I pointed out above, this is basically impossible since all of the queries are specific for each user. So, we could cache the results for each user, for each project category, for each page in the pager. :( I suppose that'd have some minor benefit for users that frequently browse and go back and forth between different pages a lot in a short period of time. But, basically, it'd be uncached, since the cache results would only be valid for an incredibly small % of the requests we'd have to serve.
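A toy illustration of why the per-user key makes the cache nearly useless. The request tuples and the `hit_rate` helper are made up for this example; the cache is just a set of keys already seen, with no expiry.

```python
def hit_rate(requests, key_fn):
    """Fraction of requests answered from a cache keyed by key_fn(request)."""
    seen = set()
    hits = 0
    for request in requests:
        key = key_fn(request)
        if key in seen:
            hits += 1
        else:
            seen.add(key)  # first request for this key pays the full query cost
    return hits / len(requests) if requests else 0.0


# 100 different users each load page 1 of the "Images" category once.
requests = [(uid, "Images", 1) for uid in range(1, 101)]

# Results identical for everyone: the key can ignore the user.
shared = hit_rate(requests, lambda r: (r[1], r[2]))

# Buddy-weighted results: the key has to include the user.
per_user = hit_rate(requests, lambda r: r)
```

In this scenario the shared cache serves 99 of the 100 requests, while the per-user cache serves none of them; it only starts to help when the same user reloads the same page.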

In the interest of not letting the thread die...

Posted by calebgilbert on November 22, 2007 at 2:37am

I guess caching would definitely be out of the question for such queries, for authenticated users at least. I'm not sure what percentage of D.O.'s browsing population is authenticated, but I'm wondering if asynchronous loading (so that full page loads could be skipped) would alleviate things enough that the infrastructure could weather it. I'm thinking it could, based on my experience with playing around with such nuances.

Webchick, Amazon, et al -- is there a budget for helping fund this kind of work? Undoubtedly, dww and hunmonk and possibly others would be asked to devote many hours to helping implement this. Can the D.A. help facilitate something that the community-at-large is crying out for?

Possibly the right way to do

Posted by Chris Johnson@d... on December 12, 2007 at 9:28am

Possibly the right way to do this is to extract the objective information from the d.o database once a day and put it into another database. Then users interested in filtering and sorting modules by various means would hit that second database, instead of d.o's.

This very much parallels classic business duplication of data from their live, transactional databases (OLTP) to decision-support databases (DS) for precisely this same reason. Transactions have to remain fast; reports and metrics can take a while. OLTP is usually a mix of read/write while DS is usually mostly read. The DS database can also be optimized for the kind of queries it will see.
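A minimal sketch of that daily extract, using two in-memory SQLite databases to stand in for the live and reporting servers. All table and column names (`projects`, `downloads`, `project_metrics`) are hypothetical; the real job would read whatever objective data d.o actually stores.

```python
import sqlite3


def make_demo_live():
    """Stand-in for the live (OLTP) database, with made-up sample rows."""
    live = sqlite3.connect(":memory:")
    live.executescript("""
        CREATE TABLE projects (name TEXT PRIMARY KEY, category TEXT);
        CREATE TABLE downloads (project TEXT, day TEXT, hits INTEGER);
    """)
    live.executemany("INSERT INTO projects VALUES (?, ?)", [
        ("image", "Images"), ("imagecache", "Images"),
        ("lightbox", "Images"), ("cck", "Content"),
    ])
    live.executemany("INSERT INTO downloads VALUES (?, ?, ?)", [
        ("image", "2007-12-10", 120), ("image", "2007-12-11", 30),
        ("imagecache", "2007-12-11", 200), ("cck", "2007-12-11", 500),
    ])
    return live


def refresh_snapshot(live, reporting):
    """Run once a day: aggregate on the live side, rewrite the reporting table."""
    rows = live.execute("""
        SELECT p.name, p.category, COALESCE(SUM(d.hits), 0)
        FROM projects AS p
        LEFT JOIN downloads AS d ON d.project = p.name
        GROUP BY p.name, p.category
    """).fetchall()
    with reporting:  # commits on success, rolls back on error
        reporting.execute("""
            CREATE TABLE IF NOT EXISTS project_metrics (
                name TEXT PRIMARY KEY,
                category TEXT NOT NULL,
                total_downloads INTEGER NOT NULL)
        """)
        # Index chosen for the read pattern: filter by category, sort by downloads.
        reporting.execute("""
            CREATE INDEX IF NOT EXISTS idx_metrics_category_downloads
            ON project_metrics (category, total_downloads)
        """)
        reporting.execute("DELETE FROM project_metrics")
        reporting.executemany(
            "INSERT INTO project_metrics VALUES (?, ?, ?)", rows)


def top_projects(reporting, category, limit=10):
    """Browsing query: touches only the reporting database."""
    return reporting.execute("""
        SELECT name, total_downloads FROM project_metrics
        WHERE category = ?
        ORDER BY total_downloads DESC, name
        LIMIT ?
    """, (category, limit)).fetchall()
```

The expensive aggregation runs once per refresh instead of once per page view, and the reporting table can carry indexes that would be a burden on the transactional side.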

Ratings vs. reviews vs. automated metrics

Posted by dww on November 18, 2007 at 7:52pm

[Ratings]: It's not clear what you're rating. As many have pointed out, modules are specific to the problem you're trying to solve or the feature you're trying to add. What works great for one site/scenario might be totally wrong for another. So, what exactly are you voting on? Would the people who believe strongly in ratings (votes) please start posting a list of example questions? Ratings are basically survey research, and a huge part of that is asking the right question(s).

[Reviews]: IMHO, reviews could be more useful than ratings. However, in addition to the potential pitfalls that Amazon pointed out, I'd like to add this one: people will embed support requests, bug reports, and feature requests in their reviews, making more work for the maintainers. I believe a restricted set of trusted "staff picks" is a great solution to most of these problems, but that sounds like the dreaded "golden contrib" debate again. ;) Who maintains the "project reviewers" role on d.o? What are the criteria for membership? Who reviews the reviews? It's a lot of (valuable) work, but it's work nonetheless, and I doubt the association is planning to hire anyone to coordinate this as a part-time job.

[Automated metrics]: I never said that I'm totally opposed to human opinions regarding quality factoring into this. I just said that I'd rather see the initial effort going into the automated metrics, since I believe that in many cases, they'll tell you more about a project than a human's vote on a set of (at this point, undefined) questions, or even reviews. They're harder to game. They'll require much less on-going support/babysitting. One of the downsides is that you have to know how to interpret the numbers. So it's not going to be as immediately helpful for brand new Drupal users as a simple "5 golden stars" next to the project name whenever it's listed on the site. But, at some level, anything is better than what we have now, and I'd be more interested in starting with the automated metrics and moving out to the human-driven metrics once we've a) made some progress and b) have some experience. Other than a minor question on exactly where/how to display the metrics (see http://groups.drupal.org/node/6186) in the d.o UI, none of the automated metrics are really blocked on anything beyond someone standing up and producing the code. The human-metrics are blocked on many social and technical problems. That doesn't mean they can't be solved, but it'll be more work to get any human metrics in place.

Multiple rating axes?

Posted by dman on January 7, 2008 at 1:32am

RE point one, I think that anyone who's done a rating system (or planned one as I have) should be aware that there's a difference between:
"It's no good"
"I don't like it"
"It doesn't solve my particular problem today"
"It's written badly or is hard to use"
"it's badly documented/supported"
"It's not very popular or widely used"

So, like decent software reviews do, a fuller picture would be derived from voting on (possibly) several criteria.
This of course is exponentially more difficult to code, and somewhat trickier to browse and weight, so I don't seriously suggest this be attempted in the first round, but it's an aspect to consider...

Added another link

Posted by dww on November 18, 2007 at 8:04pm

FYI: In case you're not looking at the revisions tab closely, I just added another link to the big list: http://drupal.org/node/63491

Votes + Comments + Editors Pick

Posted by theborg on November 18, 2007 at 11:29pm

Agree with Laura (comment-21052): I always look at the comments after the rating. CNet also has the "editors rating", but maybe this means more work for the group of savvy people Moshe mentioned. They also have "the good" and "the bad", which are difficult to evaluate without being influenced.

Software sites like snapfiles do something like that with:

Our ratings
Popularity
User opinions

An issue I see in all the ratings thing is that the popular modules will continue being popular and the new ones will slowly die because of the higher ratings ones, I've tried nearly every taxo module and found the one I was looking for after some time.

Ratings/reviews/metrics are important, but an accurate explanation of what the module does, a demo site, and a description of how it interacts with the whole system and other modules are needed also.

Rating with review

Posted by jayjenxi on November 23, 2007 at 6:38pm

I suggest a system where the user gets to vote but would have to give a compulsory review along with the rating. I believe this would give the readers a better idea of the rationale behind such votes. I've seen such a system being used on the Firefox Add-ons site. I think the system is feasible for Drupal.org.

https://addons.mozilla.org/en-US/firefox/

It also has a recommendation section, where they have a list of add-ons that are popular and recommended. I'm not sure how exactly they come up with the recommendations. However, this can be implemented on Drupal.org by having the staff list down certain modules that users should try out to improve on the site functionality. This means that, unlike the review system where the staff would have to go through each and every module, they would only need to review modules that are popular and highly-rated by users.

In this system that I suggest, there should be a field for the reviewer to post link(s) to sites where they implemented the modules. This would allow readers to have a better idea of how the module helped the particular reviewer. This field can also be used to count the number of sites using the module. Although it might not give a complete picture of the usage of the module, readers can get an idea of the popularity of the module. I got this idea from vbulletin.org, where they have a "Mark as installed" option for users of the mods to indicate they're using the mod.

I believe that readers who see a high rating accompanied with unconvincing review would not be swayed by the vote.

Modules would not be ranked according to rating. This rating system would not have an average rating. The users would have to judge the rating from the individual ratings given by the reviewers.

I hope I managed to put my idea across well enough. Please do give feedback with your thoughts and concerns.

Usage statistic created by module update_status?

Posted by Thomas_Zahreddin on November 27, 2007 at 10:12am

I think the update_status module requests information about new versions, so there is a statistic about the usage of a certain module on the Drupal sites - am I right?

And I can't find this statistic though I think I saw this page.

Can someone post the link?

Please read the initial post for this thread...

Posted by dww on November 27, 2007 at 5:41pm

#165380: Make usage statistics (from update_status) visible

I almost went off with

Posted by Chris Johnson@d... on December 12, 2007 at 9:34am

I almost went off with greggles, Michelle and MattKelly and implemented a module review system. In fact, I think Matt has already done some basics on his website.

I think we need to rope everyone who had such thoughts together here and make some progress.

I'm off to see if all the ideas the 4 of us put together in a Google doc have been expressed in this group. Matt is already active here, I think.

Pivots getting ready for deployment

Posted by Amazon on December 12, 2007 at 3:33pm

This is a quick reminder that Pivots and Double pivots are being deployed on http://scratch.drupal.org, on the way to being deployed on D.O. Modules are being recommended based on content and a series of algorithms that are tuned to make those recommendations. Double pivots will recommend modules that are known to be used together based on context, and also the update status information.

Chris, looks like a good objective set of criteria for evaluating. You might want to also consider issue queue and commit activity.

Cheers,
Kieran

Dependencies

Posted by KarenS on December 28, 2007 at 3:51pm

Another metric to use to assess the quality/importance of a project is the number of other modules that depend on it. Is there any way to get metrics on that using the dependencies data in the .info files?

That'd require d.o to parse all the .info files

Posted by dww on December 28, 2007 at 5:38pm

We could in theory have a cron job that runs on d.o to parse all the .info files and keep a DB table populated with dependency info. Actually, we could put that in the packaging script when it's considering all the dev snapshots, since it already has to check out everything from CVS, anyway (to compare timestamps and see if anything changed and needs to be repackaged). It even already finds all the .info files, so it can add its extra attributes automatically (datestamp, project, version, etc). So, it wouldn't be that hard to have it always parse all the .info files it finds for each project, record the dependencies and update the DB. Creating an issue about this, and starting a patch, is left as an exercise for the interested reader. ;)
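A rough sketch of the parsing step, assuming .info files declare dependencies either as a single space-separated `dependencies = a b` line (5.x style) or as repeated `dependencies[] = a` lines (6.x style), with `;` starting a comment. The function names and the in-memory mapping of module name to file contents are invented for the example; this is not the packaging script's code, and the real thing would write to a DB table rather than return a counter.

```python
import re
from collections import Counter

# Matches "dependencies = a b" and "dependencies[] = a".
_DEPENDENCY_LINE = re.compile(r"^\s*dependencies\s*(\[\])?\s*=\s*(.*?)\s*$")


def parse_dependencies(info_text):
    """Return the module names a single .info file declares as dependencies."""
    deps = []
    for line in info_text.splitlines():
        if line.lstrip().startswith(";"):
            continue  # comment line
        match = _DEPENDENCY_LINE.match(line)
        if not match:
            continue
        value = match.group(2).strip('"')
        # Array form carries one name per line; scalar form is space-separated.
        names = [value] if match.group(1) else value.split()
        deps.extend(name for name in names if name)
    return deps


def count_dependents(info_files):
    """Map each module to the number of other modules that depend on it.

    info_files: mapping of module name -> contents of its .info file.
    """
    counts = Counter()
    for module, text in info_files.items():
        for dep in set(parse_dependencies(text)):
            if dep != module:
                counts[dep] += 1
    return counts
```

`count_dependents(...).most_common()` would then give the "number of other modules that depend on it" ranking KarenS asked about.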

I really want to help out,

Posted by Steven Jones on March 28, 2008 at 12:16am

I really want to help out, but I'm not sure what to do. I've explored a few of the issues listed but they either seem very, very old or people just aren't too interested. I'm going to have a lot of free time in the next week or two and would really like to get stuck into something, just not sure what!

Guidance please!

old != bad

Posted by dww on March 29, 2008 at 7:53am

just because an issue is old doesn't make it a bad idea. it just means i don't personally have time to work on it, and no one else did, yet, either. my advice would be to find something you're personally interested in and get it closer to done.

Project metrics and comparison

Posted by sun on April 11, 2008 at 3:03am

I've just posted my findings about Project module's hidden project metrics in a new article in this group. Too much to summarize in a comment. I hope I didn't duplicate existing efforts.

Daniel F. Kudwien
unleashed mind

Daniel F. Kudwien
netzstrategen

quality metrics

Posted by earnie@drupal.org (not verified) on June 3, 2008 at 1:05pm

For me quality metrics would be the considerations I would use to decide a module's worth. I do not give too much credit for user supplied rankings on any piece of software. Metrics such as the number of downloads, the date of the last release, the number of issues resolved, the date of the oldest issue, the date of the youngest issue and the date of the last commit are all good factors for considering a module's value. I would also include the number of project page views to gather a sense of interest in a module. Modules with good interest but poor maintainership are good candidates for improvement or takeover and this information can benefit the community. IMO modules with poor interest and poor maintainership should be removed from the list of modules since they get in the way of the ones that do have good maintainership or good interest.
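The interest/maintainership split above could be sketched like this. It uses only a subset of the metrics listed, and every threshold (1000 downloads, 5000 page views, 365 idle days) is an arbitrary placeholder rather than a proposed value; the `ProjectMetrics` record and function names are likewise invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProjectMetrics:
    name: str
    downloads: int
    page_views: int
    last_release: date
    last_commit: date


def has_interest(m, min_downloads=1000, min_page_views=5000):
    """Placeholder rule: enough downloads or enough project page views."""
    return m.downloads >= min_downloads or m.page_views >= min_page_views


def is_maintained(m, today, max_idle_days=365):
    """Placeholder rule: a release or commit within the last max_idle_days."""
    latest_activity = max(m.last_release, m.last_commit)
    return (today - latest_activity).days <= max_idle_days


def triage(m, today):
    """Classify a project along the two axes described in the post."""
    if is_maintained(m, today):
        return "ok"
    if has_interest(m):
        return "takeover candidate"  # good interest, poor maintainership
    return "removal candidate"       # poor interest, poor maintainership
```

Issue-queue metrics (resolved count, oldest and youngest issue dates) could feed `is_maintained` in the same way once their cutoffs are agreed on.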
