Posted by webchick on September 26, 2011 at 5:59am
Right now, the project application process only spot-checks security and API problems that happen to be in the specific lines of code of a specific project. A contributor can walk away with a "swiss cheese" understanding of Drupal best practices.
One way to combat this is to set up a "quiz" of some kind in front of granting people access. This idea has sign-off from the security team. I've created an issue for this over here: http://drupal.org/node/1290624
To populate the contents of the quiz, I'd love some feedback on what are some of the most common problems you find in new applications (as well as links to resources on where to learn how to fix them).

Comments
My impression: ca. 99% of
My impression:
ca. 99% of them are pushing into the master branch (naming conventions: http://drupal.org/node/1015226, git branching: http://drupal.org/node/1066342)
ca. 80% of them do not even test their modules with coder (http://drupal.org/project/coder, http://drupal.org/coding-standards)
ca. 60% of them do not create a README.txt (http://drupal.org/node/447604)
ca. 40% of them still include $id$ lines (http://couldnot.find)
ca. 20% percent of them push a LICENSE.txt file (http://couldnot.find)
Not testing with coder would
Not testing with coder would be a common problem while the others in the list doesn't matter as much for drupal security or API best practices.
I think encouraging the use of libraries would cut down on the support needed for more of the code, which would in turn free up time to fix bugs. More focus on that front should help code review more broadly in regards to drupal security or API best practices as well.
Branching.
I went through the process this summer...
Is a quiz really going to do
Is a quiz really going to do anything? Is it that hard to look up the answers? If the questions are hard enough, abstract enough, that they can't be looked up, will they be fair to a new contributor?
As far as problems go, I often see misuse of t() and the other sanitisation functions. Adding it in places where it isn't needed. I also find that assumptions are made about the safety of data coming from other modules, or Drupal core. People not sanitising data passed to them from core, like node titles, thinking it is safe. Things like non-parameterised db queries tend to stick out, and are easily caught by coder. Agreeing with patrickd, I almost always find a mistake by running it through coder.
Maybe, maybe not
The research we did on the project applications queue before DrupalCon London though shows that the current process is not actually accomplishing what it was originally set out to do—finding security holes—nor what it eventually morphed into being—a mentorship process for new contributors on Drupal best practices. This is a way to ensure up-front exposure to major Drupal concepts that people struggle with. Even if they cheat to pass it (which we hope they won't), at least they had to read the question and answer. ;)
It will never be a mentoring
It will never be a mentoring process. Such a thing requires way too much time and energy on the part of both parties. (Unless people are really dedicated) We are asking random people in the world to look over the work of other random people and ensure they are good at their job. It's crazy, and internet-based relationships don't have the attention span for such things. On top of that, everyone is complaining the process needs to go MUCH, MUCH faster. If the point of a quiz is to introduce people to a concept, as they quickly hit a "next" button, what information is going to be retained? It all sounds rather pie in the sky. Is there a proposal to stop the application process and just replace it with this quiz? I can see that being better than just a fully open system. (Everyone can create projects, but we want you to read this info first.)
The slides in
The slides in http://london2011.drupal.org/coreconversation/1730-1800-project-applicat... present some reasonable ideas toward this, I think:
And then slide 18 has the full recommendations.
As you say, mentoring will be really hard to do given time constraints and pressure to speed up the process. If we automate as much as possible then we hopefully free up some mentoring time & educate without human intervention.
I think the fact that the presentation at Drupalcon and proposals here come from a lot of well-intentioned folks who have done few, if any, project reviews is a little off-putting to those "in the trenches" slogging this out. But let's not let that stand in the way of some good progress.
I am in favor of the quiz and actually already started working on it :)
knaddison blog | Morris Animal Foundation
Does quiz provide adaptive
Does quiz provide adaptive questions based on the previous answers like they do for MS certs? Making a quiz does provide some heads up on knowledge that a person is missing. If the questions were targeted for the components that you're building, that would be an additional plus. For example, if writing code for views, you get views related api/security questions along with some standard ones, while for core, you get questions for core along with the same, do only once if you pass standard questions. 7.x-dev quiz is more or less ready for people who don't need to migrate from what I read in the issue queue, but I haven't tried it out yet.
Coder
Is that research online somewhere? It sounds like the results don't match my experience as one of the more active reviewers. I regularly find problems with both security and best practices, and applicants are generally appreciative to have them pointed out. I'm surprised and disappointed to hear I haven't been a successful mentor. :(
That's not to say we don't have problems with the application process, but I share davidhernandez's concern that a quiz may not help. One of the biggest problems I see with the current process is applicants seeing reviewers as gatekeepers to be placated by performing arbitrary tasks, rather than collaborators in a shared effort of making their project better. A quiz seems likely to increase this disconnect between new contributors and the wider community.
To answer the original question, I agree with others that not checking Coder is the most common problem. Beyond what Coder finds, the problems tend to be very specific (hence Coder not finding them). For security problems, lack of sanitization and lack of access checks are probably the most common issues I see.
Here were the slides:
Here were the slides: http://london2011.drupal.org/coreconversation/1730-1800-project-applicat... The data is imperfect, but shows general trends. The vast majority of effort in the application process is going to coding standards review rather than pointing out security holes in the modules. Comments here seem to support that as well. This is work that robot brains should be doing, not human brains, and jthorson is actively working on that.
My impression of the code review team is that the mentorship that's being provided is excellent. So no disrespect there. However, the problem is that no two modules are alike (cos if they were, we'd deny them full project permissions :P) and so someone might go through the application process learning about the proper use of t(), but never about the proper use of db_query() or when to add a node_access tag to their query, or when to use filter_xss_admin(), or the proper way to call a theme function. They'll only learn specific lessons about the specific code they wrote.
The intent of this is to help fill in those gaps, so that new contributors are knowledgeable on multiple facets of secure Drupal development.
Regarding coding style being
Regarding coding style being the top item in the cited stats, I think it might be a little misleading. The main reason is it being low hanging fruit. From my experience, reviewers are not really concentrating on coding style, but it is easily noticed. When you first start a review you look at the code, and see crazy spacing, things out of place, hard to read, etc. Comment on it, set it to needs work, and then go about spending time actually playing with the module and looking over the code. So it might have the most comments, but I don't think it garners the most attention from reviewers when judging a module. (reviews/reviewers may vary)
I like the idea of a quiz
I like the idea of a quiz very much, but the hurdle must be kept high. I think a minimum of 100 random question would be a lot, but it would ensure quality much better than the current project application queue ever could.
This would really force people to start reading the documentation.
I'd really like to help finding a lot of good quiz questions!
Sure, a quiz... why not
I think it is good in the sense that everyone will hopefully have a mutual understanding of risks and issues associated with module development. We already get a bit of this through the documentation on how to start a project but people sometimes disregard that.
Having completed the process my current concern is that I have another three modules in my sandbox now and I would like an external review on at least one or two of those. I don't feel like I should put myself back into the project applications queue... so what is the best practice? I have a "mentor" I co-maintain with but my new modules are completely unrelated to the co-maintainership.
I would like a way to flag my module for review for awhile after the project application queue to ensure I'm staying within the guidelines (especially formatting, uninstall functions, etc. non-security stuff). Maybe I'm being too paranoid... I should probably just branch and post a stable release already.
What are other communities doing?
With regard to the communities and maintainers of other extremely popular open-source projects:
Answer - Nobody's requiring some quiz. I've never, ever heard of any such thing. (However, the Drupalverse is also doing something else which I'm not aware of elsewhere - see next post for details.)
OK, my answer can be longer than "what are they not doing". What they are doing includes among the following:
The reason Drupal.org has such a great problem now is that we've taken it upon ourselves to do a completely different job than other communities do, with regard to representing and offering the non-core code. For better or for worse.
What is the Drupalverse doing differently from other communities
Is Drupal really the only community experiencing something like this, with regard to community contributions?
Yes, it probably is (I don't actually know - it's good rhetoric though).
Why?
Because other communities do have "open systems".
The Drupal project made a decision long ago to have a review process for creating projects on Drupal.org. The result? Drupal.org is widely perceived as definitive, reliable, and "official", as a source for contrib modules. Non-developers already think that every contrib module on drupal.org has been painstakingly reviewed for security problems and can be trusted unconditionally in that regard.
It's beyond obvious that project review is unsustainable. It's not even fair as it is currently - one successful project candidate review yields unlimited project creation rights for that contributor.
Does anybody care to guess:
So, let's ask ourselves:
Do we Drupal people want to become more like other communities around extremely popular open-source projects with large collections of available extensions? Represent third-party extensions as strictly third-party? Let the users "rank" and "review" modules for quality? Get involved only when some contributed extension is revealed to be philosophically or securitally objectionable? Free-up developers to get to work?
Or do we want to continue to be less like such communities? Preserve the perception that "If it comes from Drupal.org, it must be secure and reliable!"? Impose gates on potential contributors? Systemically treat some contributors "more equally" than others? Obscure what end-users actually think of the contrib products?
I'm sure there's a middle ground. Here's a +1 for automating Coder and other basic checks. We don't even have to let those checks bar the contribution - we could just let the results of those checks be available for potential users to evaluate.
Take-over maintainers
When I went through the process the "take-over maintainer" route was no longer an option. If you go that route post-git you still need to go through the project applications queue for full access. Sorry I don't have a link to "prove" this, I found this out in IRC.
Correction
The 'abandoned module' and 'co-maintainer' processes are both still valid options, and do not require project app approval ... in fact, I've seen many cases where we try to steer people in that direction because both are much simpler processes than we impose here.
The project application
The project application process was put into place in the first place due to something that is unique to our community: a security team that manages and issues security advisories for all contributed add-ons that have a stable release. This is a tremendous benefit of using Drupal over some of the other projects with communities we could be more like.
However, preserving this capability means that we need to ensure people have somewhat of a cluestick before we turn them loose with project access, or else the work of the security team goes up exponentially. This is one approach to ensuring general knowledge of security best practices, as well as proper use of APIs for interoperability with other modules. Automated tools are also necessary, and are being actively worked on, but they only cover specifics to one code base, not the bigger picture. Adding this can help with that bigger picture, and make sure people actually understand the text in http://drupal.org/writing-secure-code without being dependent on a human to read it for them.
something that is unique to our community
Yes indeed, I recognize the security team's tremendous benefit.
So, I don't mean to hijack your thread about quizzing. I do suggest however that we think of ways to separate the different goals.
If security were treated totally separately from all other issues around project creation on d.o, the security job would be far more efficient, and, what we're all talking about here, so would the application queue.
My own opinion is that security is the only subject which justifies not "turn[ing] them loose with project access".
My other opinion is that "project access" sounds scarier than it has to. It's hard to loosen the grip now, after it's been tight for so long. It used to be a comfortable fit, but we grew 'til it's a painful choke hold now.
I've observed other FLOSS communities where the contrib repositories are half full of crappy code, duplicated functionality, and world-class software too. These communities don't have d.o's reputation for security, but, they don't (generally) deserve it any less - at least not according to the current state of the art on d.o. On the other hand, the presence of crappy code and duplicated functionality hasn't harmed those communities, because they have effective ways of rewarding quality without (much) policing of the losers.
"On the other hand, the
"On the other hand, the presence of crappy code and duplicated functionality hasn't harmed those communities, because they have effective ways of rewarding quality without (much) policing of the losers."
Well, you're preaching to the choir on that one. ;) I would much rather us "filter on output" rather than filter on input when it comes to projects, and I know many others feel the same. However, since the security team has to bear the brunt of any policy changes, we need to weigh their feelings on the matter pretty heavily.
At DrupalCon London, I shared the results of the research with the members of the security team who were there, and basically the consensus there (later corroborated on the sec. mailing list among the wider team) was that as long as we had decent automated scanning for security vulnerabilities on new projects, coupled with some basic sanity-checking of security skills (particularly for the most common vulnerabilities), they would be comfortable removing the human-driven application process, at least concerning their needs. Hence, my post to ask the code review team, who handles the day-to-day execution of this process, their perspectives on what might go into such a sanity-checking.
Now, over time, this basic security sanity-checking approval process has morphed from its original intent to now being a mentorship process for new contributors on The Drupal Way, a detailed code review of everything from API usage to coding standards, and an opportunity to block duplicate modules from being created. It's long been my position that those other goals are fine and good, but should not be coupled to granting access to someone to contribute more to Drupal. Sounds like we agree.
...make sure people actually
I think we all agree this is the goal; however, there is a big difference between "did you read this and can answer a multiple choice question?" and "can you apply this to your code?" If the ultimate goal is to make people good developers before allowing their work to show up on drupal.org, that requires a lot of work. There is no short cut. Either keep the bar for entry high and have higher quality content, or lower it and accept lower quality content. There are pros and cons for both.
Looking over the slides, I'm interested in point #3, and how that would play out. If we move to having tar balls in sandboxes, and dev releases, it probably renders some of this moot. Hopefully, it would lessen the mentoring crunch if it is only for full releases.
Background
After getting some background on this in IRC, I suggest taking a step back and sharing the background in public before moving ahead on this.
The London presentation, which as far as I can tell was never posted to this group before this thread, apparently prompted some private conversations that eventually turned into a plan to add a quiz. But almost no one knows anything about this plan or where it came from. This makes it difficult to give useful input, and also sends a bad and unintended message about our community. Clarifying as people ask questions is good; starting over would be better.
Agreed. We are probably
Agreed. We are probably rehashing a lot of conversation that has already happened, because this is the first many of us are hearing about it.
Sure, we're working on a
Sure, we're working on a larger post on the conclusions from the talk. I've been holding off until I had a chance to sit down and vet it with the other folks who co-prepared the presentation to make sure it was saying something we all agreed with; this was an incredibly long (and, at times heated) discussion. Finding this time post-DrupalCon has been a bit tough, though.
However, for the purposes of this post, all that entire background isn't really necessary. The deal is:
I feel like people are taking this entire subject way personally, and that definitely is NOT the intent. We are trying to come to you for ideas on how to lessen the workload the code review team has to do that could be automated by testbots and other processes, so we can empower new contributors to help themselves without manual intervention, and so your brains can be firmly focused on more nuanced/challenging/fun problems, like the actual mentoring of new contributors.
The project application
Are you talking about the post-git-migration process or the original process? I think the cvs application process predates the security team: Sec team October 1, 2005 vs. cvs application mentioned in 2003. Given that in 2003 access to CVS meant you could commit to any project I imagine it was a general sanity check. The process from when I applied was "read the TOS, read the FAQ, don't mess anything up."
As a general comment: I think the process has improved a lot over the years and the current review/mentoring process that people work to achieve is pretty amazing. I look forward to automating more of it as a way to keep improving it and free people up to work on the more important and more difficult to automate aspects.
knaddison blog | Morris Animal Foundation
.
I can't remember when I started helping out with the apps but it was quite a while ago, 3, maybe 4 years, I think? At that time, we weren't even specifically looking for security issues. It wasn't until just before the apps moved to the issue queues that the code was really looked at beyond "is this total crap?". And this was the idea of one person, not any community decided policy. At the time, it was mostly the two of us handling apps (via email) with a few others helping here and there. When ajk decided to start doing code reviews, the whole system bogged down and got moved to the queues to try and get more people helping out.
So when people act like code reviews is how this has always been and is community policy, I have to laugh because it was just one person deciding he would help out by making sure the code of new contributors was reasonable and it spiraled out of control from there with additional reviewers getting pickier and pickier until we ended up in the mess we are now.
Michelle
I don't think anyone is
I don't think anyone is taking it personally, at least they shouldn't, but there was certainly some lack of communication (for some, if not for everyone.) You'll have to expect some odd expressions on people's faces (in comment form.) Any positive change to the process is welcomed, especially automation. (Hooray for automation!) I just think this quiz idea is a bit naive; thinking people will actually make changes based on what they read in it. (We see A LOT of ..."interesting" coding) I'm a bit pessimistic though, when it comes to human nature, so maybe it's just me. :-/ Anything's worth a shot?
To be more on topic, some of us have already posted some content for the quiz. Is any of it what you were looking for? do you want to expand on anything in particular?
Did some quick research on
Did some quick research on the https://drupal.org/security/contrib rss feed for 7.x contrib modules and here are the results:
Security Risk: Critical
Information Disclosure - 1
Security Risk: Moderately Critical
Cross-Site Scripting - 7
Privilege Escalation - 1
Validation bypass - 1
Open redirect - 1
Multiple Vulnerabilities - 1
Security Risk: Not Critical
Cross Site Request Forgery - 1
So the biggest bang for the buck is presumably making cross-site scripting impossible to do, if that's possible, or mitigate the attack surface via various means.
Yes, take a look at this
Yes, take a look at this distribution:
http://www.flickr.com/photos/gregglesworth/6190961996/
This is based on work for the Drupal Security Report.
Yes, XSS is the worst. We should and are working on making that better.
Access bypass is harder to pinpoint where/how it happens.
knaddison blog | Morris Animal Foundation