Discussion for RCE in Contrib PSA and announcements

greggles's picture

Hi!

Based on a lot of discussion, especially in this thread, the security team adjusted the way we announced the most recent contrib module releases.

  • We published a PSA 24 hours in advance and named the time of the release
  • The PSA included the rough number of installed sites (added in an edit; whoops, that should have been in the initial version)
  • We have a security scale and could specifically say how critical the upcoming issues were
  • We used Twitter to talk about it as well
  • We posted the PSA to the front page
  • The PSA went out to the normal email and RSS lists, so people got that heads-up

Things we didn't do, but that happened:

  • Several people blogged about the event with some advice on proactive steps to take
  • Several people (sometimes the same people) blogged with a fair bit of dramatic/sensational language

Some specific questions for the future:

  • Was one day enough without being too much?
  • If the affected module is used on > 10,000 sites, how specific can/should we be about how many sites are affected?
  • Any questions or feedback about the process and if it worked well or not?

Comments

Great official communication

ultimike's picture

I thought the official communication was great.

I was a bit disconcerted by some of the "dramatic/sensational language" in non-official posts I saw during the past 24 hours. I'm willing to attribute this to the fact that the official communication pattern was new. I'd bet the next time we won't see a similar level (see "The boy who cried wolf"). Things like this have a way of self-regulating.

I think it was great that loads of people were spreading the word - that's nothing but good.

Mike

This was the first time where

chrisshattuck's picture

This was the first time where I felt like I was ready to patch my stuff when the SA came out. I mostly learned about it via Twitter, but saw re-posts of it all day, so it was hard to miss. That was nice. This morning was kind of exciting, with discussions going on in Slack and Twitter about what it could be; it was almost a letdown to discover that I didn't have any updates to do.

I didn't see a ton of drama on Twitter, but I feel like overreaction is a good thing when it comes to further getting the word out. Getting more eyeballs on it seems like a nice side effect of folks tapping into some FUD for effect.

One day was enough for me, but I can imagine that there were some slower moving entities out there (cough, government, cough) that might have been freaking out about the limited lead time. But so much better than zero days.

Really proud of you guys, thank you!

Chris Shattuck

Happy with the process

natted's picture

I think the process was good. I know a few people who heard about it and prepared to be ready for patching.

Even though this time the alert didn't affect me, it was useful to have the heads up. If I had been affected, I would have been even more grateful for the announcement.

You nailed it. Great

corbacho's picture

You nailed it. Great communication

Pretty great PSA process

NickWilde's picture

I personally felt that it was a great pre-alert. It gave me time to be fully ready, so I was able to check all my sites within 10 minutes after release.

I think that if a module is used on > 10k sites, that is the most that can be said about usage; anything more specific would be as good as naming the module.

I mentioned it on the other

mpdonadio's picture

I mentioned it on the other thread, but if it isn't a disclosure, then saying whether or not a site would be protected by maintenance mode and/or would require a full lockdown if not patched in time would be nice.

Same thing about whether a module needs to be enabled or not to be exploited. I did see people prepare by making lists of modules used on sites, but these didn't typically include disabled ones.

Those two points are just to help people triage and prepare. I double checked my clients' sites ASAP with this. I just turned off Apache on my personal sites at 11:55, and will deal with them when I have time.
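For anyone who wants their module lists to include disabled copies too, here is a minimal sketch of a disk scan. It assumes modules can be recognised by their `.info` (Drupal 7) or `.info.yml` (Drupal 8) file; the function name and the example paths are hypothetical, not part of any Drupal tool.

```python
import os


def find_modules_on_disk(docroot, names):
    """Map each wanted module name to the .info / .info.yml files found under docroot.

    This looks at files on disk only, so it also reports modules that are
    present in the code base but disabled in the site configuration.
    """
    wanted = set(names)
    found = {}
    for dirpath, _dirnames, filenames in os.walk(docroot):
        for filename in filenames:
            # Check the longer suffix first, then fall back to the Drupal 7 form.
            for suffix in (".info.yml", ".info"):
                if filename.endswith(suffix):
                    module = filename[: -len(suffix)]
                    if module in wanted:
                        path = os.path.join(dirpath, filename)
                        found.setdefault(module, []).append(path)
                    break
    return found
```

Calling `find_modules_on_disk("/var/www/example", ["coder"])` (a made-up path) would list every copy of the module in the tree, including ones bundled inside a distribution's profile directory.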

The above are just things for the SecTeam to think about, not complaints. This PSA/SA process was top notch. I feel like I was better prepared than when SA-CORE-2014-005 landed, and I think I had more lead time when that happened based on who I followed on Twitter.

As an exercise, did anyone draft what a PSA under the current guidelines for SA-CORE-2014-005 would have read as?

Close to perfect

jurgenhaas's picture

The process and communication have been great, nothing to complain about.

Personally, I had a problem with the timing this week and wondered if a longer lead time would be better. But in the personal situation that triggered that thought, even that wouldn't have made any difference, as my unavailability couldn't have changed. Fortunately, it turned out that none of the sites I'm hosting were using any of those three modules.

Let's briefly think about what the situation would be if a similarly critical update were required for a more popular module or even core. If some people can't update within hours, would it be possible to provide some "gateway protection" for those who utilise proxies like HAProxy?

I remember Drupageddon where it was communicated that big hosters like Acquia, Pantheon and others got involved early in the process so that they were able to protect their infrastructure at the gateway level for the time period until they were able to update each of their hosted sites.

Not every security issue can be mitigated by filtering at the gateway level. But where one can, is the security team considering providing such filters (or other means) for the popular proxy systems out there in the wild? That could help take some pressure off, especially for the slower-moving entities.
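To make the idea concrete, here is a rough sketch of the kind of decision such a gateway filter would make. The rule is purely hypothetical and is not taken from any SA: it assumes that PHP scripts inside module directories are never meant to be requested directly, which would need exceptions on a real site. A real filter would live in the proxy's own configuration and would also need to handle encoded or otherwise unnormalised paths.

```python
import re

# Hypothetical deny rule: a direct request for a .php file that sits inside a
# module directory (core, sites/<name>/ or profiles/<name>/).
MODULE_SCRIPT = re.compile(
    r"^/(?:(?:sites|profiles)/[^/]+/)?modules/.+\.php$",
    re.IGNORECASE,
)


def should_block(request_path):
    """Return True if the request path matches the hypothetical deny rule."""
    # Ignore the query string; only the path part is matched.
    path = request_path.split("?", 1)[0]
    return bool(MODULE_SCRIPT.match(path))
```

With this rule, a request such as `/sites/all/modules/example/scripts/run.php` would be rejected at the gateway, while `/index.php` and `/node/1` would pass through.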

What I'm talking about here is no criticism, it's just a suggestion to improve an otherwise almost perfect process. Thanks guys and gals for helping us all to keep our infrastructure as secure as possible.

Longer notice period?

ressa's picture

I agree that the communication has been great. I do feel that the 24 hour notice is not enough, especially during the Industrial Vacation, where the Drupal specialists might be on vacation.

I for one contacted my former work place to remind them of the upcoming patch, and the one Drupal person left hadn't heard about the PSA, and was about to go on vacation himself Wednesday night... And I myself could have easily been away on vacation right now.

Perhaps a week's notice is worth considering?

I think it's good to do

catch's picture

I think it's good to do pre-announcements for highly critical vulnerabilities like this.

Couple of things though:

Putting numbers on the post was good (and helped mitigate some of the over-reaction), but it also underestimated the number of sites, because coder didn't need to be enabled. I'd normally expect very few production sites to have coder in the code base (as with devel, it's very rare to add it to version control), but Open Atrium completely changed that. Should Open Atrium have had its own SA? Is there a way we could automate flagging of distributions that include modules for cases like this?

More worryingly, the commit to coder was made 16:49 on July 12th: https://www.drupal.org/commitlog/commit/2542/46fe6f6171df7f4505f502f3439...

The SA came out at 14:59 on July 13th https://www.drupal.org/node/2765575

Combined with the pre-announcement, someone could very easily have had 22 hours to search through commits and find this one, prepare an exploit, etc.
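As a quick check of that window, using the two timestamps quoted above. The year (2016) and the assumption that both times are in the same time zone are mine, not stated in the posts; the function name is just for illustration.

```python
from datetime import datetime


def exposure_window(commit_time, release_time):
    """Return (hours, minutes) between a public commit and the SA release."""
    total_seconds = int((release_time - commit_time).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    return hours, remainder // 60


# Timestamps from the comment; the year and shared time zone are assumptions.
commit = datetime(2016, 7, 12, 16, 49)
release = datetime(2016, 7, 13, 14, 59)

hours, minutes = exposure_window(commit, release)
print(f"{hours}h {minutes}m")  # prints "22h 10m"
```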

While the rules for contrib security releases allow for commits/releases to happen up to 24 hours before (https://www.drupal.org/node/101497) I don't feel like that applies if there's a pre-announcement of a vulnerability, much more incentive to do some digging in this case.

What stops people checking coder in?

jp.stacey's picture

Personally, I worry that a LOT of production sites have coder in the codebase: the long tail of newbie software development! It's very easy for newbies to check coder in, and nothing apparently goes wrong when they do so. Why wouldn't you, for convenience's sake, if you didn't really appreciate the ramifications?

I know we can't ever completely mitigate against people following bad practice, but the Drupal site maintainers who don't know what they don't know are often the ones we would benefit most from protecting and keeping better informed about the decisions they make or have made.

Regardless, +1 to everyone saying this was a good process this time around. A client of mine that maintains one of their sites themselves was 100% on the ball with this, ready and waiting for the announcement, then really happy to stand down. They didn't treat it as excessive at all.

Nothing stops you from doing

catch's picture

Nothing stops you from doing it (and clearly Open Atrium did!), it's just a module where you shouldn't ever need to do that (compared to restws or webform_multifile, which are supposed to be used on production).

The process for distributions

mpotter's picture

The process for distributions has always been somewhat tricky.

In order to update Open Atrium, we had to wait for the actual release of Coder 2.6 (and oa_devel) to become available via the drush-make process that drupal.org uses to build distributions. Then we could tag and release the new Atrium version. The packager sometimes still fails to see the new module release and gets stuck. But we still had the Open Atrium release out within 3 hrs.

And while the 1,000-10,000 range was probably an under-estimate because it affected disabled Coder sites, there was really no way to measure that.

Normally a distribution only gets its own SA when the security issue is related to custom code within the distribution itself. For example, we have done an SA in the past when the issue was in oa_core. But when the issue is in Drupal Core or in a Contrib module, the distribution itself doesn't get an SA. However, because of the severity of the issue, we did mark the new Atrium release as a Security Update and had it approved by the security team in order to raise more attention to sites using the distribution.

We also did the same with Open Public and marked it as a Security Update.

In general I think the security process worked pretty well in this situation. The 24 hr prior notice was a big help in this case to a lot of people to raise awareness of what was coming. It would be difficult to give more notice than this given that you don't often know what patches and releases will be ready on any given Wednesday.

Some potential improvements for the future would be

1) a way to list the actual modules any distro is using. This can be tricky since it's a recursive build process. In this specific case, Coder was part of oa_devel which was part of Open Atrium. But this information would be useful to be listed automatically by the packager on the project release page even beyond security issues.

2) if a PSA is issued, better coordination with the module maintainer to delay the commit until closer to the actual release time
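For point 1, the recursive part could be sketched roughly like this. The mapping below is hand-written and hypothetical (the machine name `openatrium` and the exact contents are assumptions); a real version would have to read each project's make file rather than a dictionary.

```python
def flatten_projects(name, includes, seen=None):
    """Return every project reachable from `name`, depth first, without repeats."""
    if seen is None:
        seen = []
    for child in includes.get(name, []):
        if child not in seen:
            seen.append(child)
            # A child may itself bundle further projects, so recurse into it.
            flatten_projects(child, includes, seen)
    return seen


# Hypothetical, simplified picture of the case described above:
# Coder is pulled in by oa_devel, which is pulled in by Open Atrium.
INCLUDES = {
    "openatrium": ["oa_core", "oa_devel"],
    "oa_devel": ["coder"],
}

print(flatten_projects("openatrium", INCLUDES))  # ['oa_core', 'oa_devel', 'coder']
```

The `seen` list also keeps the walk from looping forever if two projects happen to include each other.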

we had to wait for the actual

greggles's picture

"we had to wait for the actual release of Coder 2.6 (and oa_devel) to become available via the drush-make process that drupal.org uses to build distributions. Then we could tag and release the new Atrium version. The packager sometimes still fails to see the new module release and gets stuck. But we still had the Open Atrium release out within 3 hrs."

If you're willing to invest some time here we can probably figure out what caches need to be cleared to make this work properly. Michael now has a list of caches he clears in cases like this so that the news and releases are all available ASAP even if people are browsing as anonymous.

"a way to list the actual

Mixologic's picture

"a way to list the actual modules any distro is using."

They should display on the release pages, but there is currently a bug where some do not get that listing: https://www.drupal.org/node/1638208

Yeah, 24 hours in advance is

greggles's picture

Yeah, 24 hours in advance is a bit long in this instance, particularly when the commit message doesn't follow the standard of being opaque. I think it would be ideal to limit that down to 1 hour or less in cases where the issue is extreme and/or in a popular project.

Thanks for the feedback here and, of course, before the releases in the Security Team queue.

dsnopek's picture

Thanks for the feedback! You raise some really good points.

"While the rules for contrib security releases allow for commits/releases to happen up to 24 hours before (https://www.drupal.org/node/101497) I don't feel like that applies if there's a pre-announcement of a vulnerability, much more incentive to do some digging in this case."

Controlling when commits happen is hard because the maintainer is the one who does them. In a distributed Open Source community made up of unaffiliated volunteers, I'm not sure we could do better without the security team taking over and doing the commit (and I guess locking the repo so nothing could be committed until release time?).

I'm not crazy about the idea of doing that, however. I like that the maintainer is the one who fixes the issue and creates the release with the help of the security team, rather than the other way around. But maybe there's a place for that in the case of a highly critical vulnerability?

Meant to be in response to @catch

dsnopek's picture

Eep! This was meant to be in response to @catch on https://groups.drupal.org/node/512587#comment-1148681 - I hit the wrong reply link. Sorry!

Kudos

nedjo's picture

Kudos to the security team for carefully and responsibly planning and executing a revised SA process that appears to hit the mark, responding to community feedback without compromising security standards.