This is my list of possible metrics used to rate the quality of project nodes on Drupal.org. I'm envisioning many different metrics grouped into three categories: development, usage, and support. Development would focus on determining how active the module's developers are. Usage would look at how popular the module is. Support would focus on how well the maintainers and other users help answer questions.
Development
- Developer count - cvslog.inc - Done: SELECT COUNT(cpm.uid) FROM {cvs_project_maintainers} cpm WHERE cpm.nid = %d
- project has a release node for the current version of Drupal - Done
- how often are new releases offered
- how recent are commits
- the number of feature requests opened vs. the number of closed
- the number of uncommitted patches, etc.
Usage
- the number of sites using it, based on the update_status module's callback data. See: #128827
- used on this site - module_exists({project_projects}.uri);
- The number of downloads... Probably won't do this one since it would involve parsing apache logs.
Support
- project has a docs link defined - Done: valid_url($node->documentation)
- support request issues closed in the last month - project_issue.inc
- the number of support requests opened vs. the number closed in the last month.
- the number of unique people commenting on issues.
Comments
translations
As I have written in my own status report, we will have a pretty accurate picture of how well the project interface translations are maintained, which could be another factor (possibly different for each release, of course). When you get to a stage where you're able to receive data about these things, we should talk about integrating localization data in.
Of those...
I'd say the # of downloads/# of sites using them should be toward the top of the priority list, although they're going to be a bit tricky to code, I think...
There are quite a few of the developer metrics that I also use... I'd love to see "whether or not there's an official release" weigh very highly, as it's incredibly annoying/scary ;) to have to use a dev module. The ones around commit activity and issue activity should also go near the top of the priority list, and will probably be easier to code.
The support ones (with the exception of 'how many people are commenting'; that's kind of interesting) unfortunately are going to be nearly useless because almost everyone uses the forum for support. So I'd call those "nice to haves" but definitely not critical.
Suggestion: Instead of "in the last month", have a configurable length of time ... 1 day, 7 days, 30 days... 3 months, 6 months, one year, all time?
Suggestion: If concerned about where to focus first, I'd try to specifically make http://drupal.org/files/issues/project-information-at-a-glance.png possible. Looks like you've got them all listed above?
some other wish-list metrics
number of simpletest tests that exist.
number of simpletest tests that are currently passing. ;)
some basic documentation metrics:
--- existence of a README.txt
--- implementation of hook_help()
Does potx.php extract translation templates without errors?
What does coder.module say about code style?
Those are all that immediately come to mind. Big +1 on webchick and gabor's suggestions, too. If I think of others, I'll post here. I'm just brainstorming, not to say that all of these are really important or must be completed during this SoC, but I wanted to get them all in one place and then we can prioritize...
What about some qualitative feedback?
I really like webchick's "at a glance" overview, and I think that it should be boiled down to its essence, but with the option to look at more information about it, possibly in a collapsible field or on another page.
I also think that some of these could be rendered as health bar images styled via CSS, or boiled down to a single number or series of numbers. This could come in a second iteration, after people get a chance to look at the numbers you come up with and gauge how accurately they reflect the health of the module.
I realize that this is going on Drupal.org, and that there are performance issues to consider, but I'd love to see a fivestar rating system in addition to being able to add favorites. And also some qualitative reviews from users that specify the release number that is being talked about. Some modules may be really well maintained in one version and have deprecated support for a past version.
There are also different levels of commitment that module developers have towards their own modules, and it might be insightful to have the writer of a module publicly declare how active they intend to be, ranging from:
* Actively Co-Maintained
* Actively Maintained
* Looking for a Co-Maintainer
* Maintaining while looking for a New Maintainer
* Completely Orphaned
I imagine that most would say "Looking for a Co-Maintainer," but it might free some orphaned modules up if maintainers could declare that they're not being supported any more, or that it'd be great if someone could take them off their hands. As of now, those conversations take place in the backchannels.
I'd be into some sort of
I'd be into some sort of voting system that's factored into the popularity number. It's been discussed on the dev list and in the project module issue queue in the past. Dries had opposed installing a voting module out of security concerns.
There was discussion last fall about adding a vocabulary to drupal.org to allow module maintainers to specify their level of involvement with the module. The conversation sort of stalled out, but it seems like it would be a great addition from my point of view.
Ideally the metrics would be hard for unscrupulous maintainers to game (e.g. voting up their own modules with dummy accounts), but a maintainer should be able to lower their module's score by indicating that it isn't maintained.
maintainership status
Yes, as someone who's been long involved with the system, and who had some maintainer changes (from and to myself) before, I think this status marker would be really nice.
Time to look at VotingAPI again -- Data visualizations
Hey Drewish,
Lots of great links there.
Clearly we're tapping into a perennial issue that has been in high demand for a while, and I'm sure that's just scratching the surface of all of the discussions that have happened.
My first response is that the opposition from Dries on the development listserv was from January 27, 2006! He was skeptical about VotingAPI's complexity, which as a project had only been in existence for about three months at that point. For a bit of perspective, the Fivestar module didn't come out until about a year later, at the end of 2006. Since then, Fivestar has been deployed by Lullabot on some very high-traffic websites. It's a lot more battle-tested and evolved by this point, and hopefully scalable.
I think it's time to revisit VotingAPI/Fivestar combination for this, and get them both into shape and up to whatever other standards are necessary. I mean, I'd be really disappointed if Drupal can't even handle voting on our flagship site. I'm sure eaton and the lullabot folks have enough lessons learned by now to make it happen.
I was also happy to see that many of the suggestions that I had for the maintainer to be able to declare their level of commitment were mentioned in this thread. I agree w/ merlinofchaos when he suggested that the three levels of "Maintained", "Needs help" and "Unmaintained" should probably be enough.
It also jumped out at me what Dries said about the project vocabulary:
Dries then says, "I'm not convinced we're approaching this the right way. For example, I still think this is a better approach:
"
So I think it's important to grab as many different metrics as possible, and then find out a quick and efficient way to present the most pertinent info.
It'll probably take a couple of iterations of data gathering and presenting, but I think we should try to avoid coming up with something that looks like this:

And looks something more like one of these:


downloads
[somehow got my posts mixed up... I'm sure that'll screw up everyone who's following via email]
i think the big thing for downloads is to figure out what the maximum is and then throw it into some kind of a log function to normalize it into a 1-10 range.
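A minimal sketch of that idea (in Python for illustration; the log base and the 1-10 output range are assumptions here, not decisions from this thread):

```python
import math

def log_score(downloads, max_downloads):
    """Map a raw download count onto a 1-10 scale with a log curve.

    Using the site-wide maximum as the log base means the top module
    scores 10 and everything else is compressed toward it. Both the
    base and the 1-10 range are illustrative assumptions.
    """
    if downloads < 1 or max_downloads < 2:
        return 1.0
    normalized = math.log(downloads) / math.log(max_downloads)
    return 1 + 9 * normalized  # stretch 0..1 onto 1..10

# A module with 1,000 downloads, when the most-downloaded module has
# 100,000, still lands mid-scale instead of near the bottom.
print(round(log_score(1_000, 100_000), 2))   # mid-scale
print(round(log_score(100_000, 100_000), 2))  # top of scale
```

The point of the log is visible in the example: a module with 1% of the leader's downloads still scores above the midpoint rather than rounding to zero.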
log discussion
So, we discussed this in IRC a bit and I'm curious about "log(# downloads, # max downloads)" vs. "# downloads / # max downloads" vs. "rank weight (#modules + 1 - rank)/# modules".
As drewish pointed out, this helps modules that have relatively low numbers of downloads.
So, from the april download statistics here are those three methods applied to the top 20 modules.
The last two are fakes to provide a sense of what the middle and bottom tier would look like.
Any thoughts on which system seems best given the data?
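Since the April table itself isn't reproduced here, a small sketch with made-up download counts shows how the three candidate formulas behave side by side (all numbers below are illustrative, not the real statistics):

```python
import math

def log_score(downloads, max_downloads):
    """log(downloads) with base max_downloads; the top module scores 1.0."""
    return math.log(downloads) / math.log(max_downloads) if downloads > 1 else 0.0

def ratio_score(downloads, max_downloads):
    """Plain linear fraction of the maximum."""
    return downloads / max_downloads

def rank_score(rank, n_modules):
    """Rank weight: (n + 1 - rank) / n, so rank 1 scores 1.0."""
    return (n_modules + 1 - rank) / n_modules

# Made-up counts standing in for the April download table.
downloads = [50_000, 20_000, 5_000, 500, 50]
max_dl = max(downloads)
n = len(downloads)
for rank, dl in enumerate(sorted(downloads, reverse=True), start=1):
    print(f"{dl:>6}  log={log_score(dl, max_dl):.2f}  "
          f"ratio={ratio_score(dl, max_dl):.2f}  rank={rank_score(rank, n):.2f}")
```

Running this makes the earlier observation concrete: the 500-download module gets about 0.57 under the log formula but only 0.01 under the plain ratio, while the rank weight spaces all modules evenly regardless of the gaps between their counts.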
--
Knaddisons Denver Life | mmm Chipotle Log | The Big Spanish Tour
knaddison blog | Morris Animal Foundation
rank weight eh...
the rank weight looks really good. i wonder if it would be efficient to compute for a single node. it seems like you'd have to sort the whole list and then compute the whole list at once.
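To that efficiency question: computing the rank weight for a single node does seem to require knowing its position among all modules, so one option is to sort once and cache the whole mapping. A hypothetical sketch (function and variable names are mine, not from the project module):

```python
def build_rank_weights(download_counts):
    """Sort once, then assign every module its rank weight.

    download_counts: dict mapping module name -> download count.
    Returns a dict mapping module name -> weight in (0, 1], where
    the most-downloaded module gets 1.0. The sort is O(n log n) for
    the whole list; after that, a single node's weight is an O(1)
    lookup, so the table could be rebuilt on a cron run rather than
    per page view.
    """
    n = len(download_counts)
    ordered = sorted(download_counts, key=download_counts.get, reverse=True)
    return {name: (n + 1 - rank) / n
            for rank, name in enumerate(ordered, start=1)}

weights = build_rank_weights({"views": 9000, "cck": 8000, "token": 100})
print(weights["views"])  # 1.0
```

So yes, you can't score one node in isolation, but you only pay the sort once per refresh of the statistics rather than once per node.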
Project metrics
There are some good ideas in this thread that could be merged with my proposal of project metrics based on already available project data (statistics): Project module's hidden project metrics
Daniel F. Kudwien
unleashed mind