The launch of schema.org got me thinking about how we could help the major search engines extract the bits of information they care about from Drupal 7 pages. They will try to follow the spec of whatever format they are parsing, but there will be bugs, and we should be able to detect those bugs easily so they can be fixed. From a search engine's perspective, it is hard to know how to work with Drupal pages without knowing much about Drupal's internals and the kind of information it outputs. They could look at random Drupal 7 sites online, but they might miss the bigger picture if they don't work with the Drupal community.

What if we built a set of typical Drupal 7 HTML output pages featuring the structured data that search engines should expect? Bing, Google (and also Facebook) could use these pages to test and tune their parsers. We could also use these pages to test the various search engines and ping them if something breaks on their side (this would be manual testing at first, but we could think about automating it in the long term, to provide some form of QA for the Drupal SEO community). I think this would also allow us to learn more about how consumers like search engines use our markup, and potentially help us make decisions if we ever needed to tweak Drupal's output.
We could start with a few common types of pages that search engines are familiar with, such as news article, person, event, and recipe. Each of these types could be documented as a site recipe (maybe using Features, which would include the appropriate schema.org mappings), and we could generate an HTML snapshot from each of them. (I think Lin has already started to work on some of these site recipes.) For each supported type, we would take an HTML snapshot and assign it a version number, incrementing that number as we make fixes or module updates. Each version would be hosted online so that it can be tested against the various search engine tools. We would have, for example:
http://qa.semantic-drupal.com/snippets/person/1
http://qa.semantic-drupal.com/snippets/event/1

If we find a bug in one of our modules, we fix it and release a new set of snapshots:
http://qa.semantic-drupal.com/snippets/person/2
http://qa.semantic-drupal.com/snippets/event/2

and so forth...
The testing can be a manual copy/paste at the beginning but, as I said, it could potentially be automated in the long run if each test comes with its expected results.
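To give an idea of what such an automated check could look like: each snapshot could ship with the list of structured-data properties it is expected to expose, and a small script could flag any that go missing between versions. The sketch below assumes microdata markup for simplicity (Drupal 7 core actually emits RDFa, so a real checker would need an RDFa extractor); the class, function, and sample snapshot are all hypothetical, not part of any existing tool:

```python
from html.parser import HTMLParser

class ItempropParser(HTMLParser):
    """Collects the itemprop attribute values found in microdata markup."""
    def __init__(self):
        super().__init__()
        self.props = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "itemprop":
                self.props.add(value)

def check_snapshot(html, expected_props):
    """Return the set of expected properties missing from the snapshot."""
    parser = ItempropParser()
    parser.feed(html)
    return expected_props - parser.props

# A hypothetical person snapshot that exposes "name" but forgot "jobTitle":
snapshot = ('<div itemscope itemtype="http://schema.org/Person">'
            '<span itemprop="name">Jane Doe</span></div>')
missing = check_snapshot(snapshot, {"name", "jobTitle"})
# missing == {"jobTitle"}
```

A QA job could run a check like this against every versioned snapshot URL and ping us (or the search engine) whenever the diff is non-empty.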
Thoughts?

Comments
This makes a lot of sense.
This makes a lot of sense. Have you gotten an indication from any of those parties that they would test against such docs? If not, we might want to get them involved early.
Steve Macbeth from the Bing
Steve Macbeth from the Bing team already expressed an interest in collaborating on the main schema.org thread. Hopefully Google will join us soon too.
Sounds Like a Good Idea To Me
+1! This sounds like a good idea to me.
I like it
I think this is a great idea. I hope the other parties agree.
#1: "Bing, Google (and also Facebook) could use these pages to test and tune their parsers."

#2: "We could also use these pages to test the various search engines and ping them if something breaks on their side."

Basically, #1 requires the involvement of other parties, and #2 could still be useful for us and would not necessarily rely upon the involvement of other parties.