cosmicdreams's picture

Looks like there is a lot of support for the schema definitions that can be found on schema.org.

Should we be trying to bring our standards in line with these?



kardave's picture

I think it's up to the site builder. The definitions sound great, but too much extra work for an average site.
BTW how can it be done? Just in the theme, or by module?



Yes, I think that it would be

linclark.research's picture

Yes, I think that it would be excellent to bring this set of vocabularies into modules. Because mappings can accept multiple properties in both RDFa and Microdata, this just means adding the mappings to the existing mappings.
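To make the idea of adding mappings to the existing mappings concrete, here is a minimal sketch in Python. It is not Drupal's RDF Mapping API; the field names and the terms in both dictionaries are assumptions chosen for illustration only.

```python
# Hypothetical field-to-property mappings. The field names and terms are
# illustrative and are not taken from any particular Drupal module.
existing = {
    "title": ["dc:title"],
    "created": ["dc:date", "dc:created"],
}
additions = {
    "title": ["schema:name"],
    "body": ["schema:articleBody"],
}


def merge_mappings(base, extra):
    """Return a new mapping in which each field keeps its old properties
    and gains the new ones, without duplicates and in stable order."""
    merged = {field: list(props) for field, props in base.items()}
    for field, props in extra.items():
        target = merged.setdefault(field, [])
        for prop in props:
            if prop not in target:
                target.append(prop)
    return merged


merged = merge_mappings(existing, additions)
```

The existing mapping is left untouched, so a site could keep emitting its current terms and simply gain the extra ones per field.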

The first step would be going through modules in contrib and seeing where there are content types that match the types on schema.org. I've mentioned this to an intern whom I am supposed to be supervising this summer; I'm not sure whether it will figure into his work or not.

So you don't see this as a

mlangfeld's picture

So you don't see this as a "threat" to RDFa or a potential takeover of the semantic web space? I don't know, and haven't seen much in the blogosphere discussing this. I do wonder why they don't just use RDFa, especially Google, who purchased Freebase, I thought with the idea of moving more into semantic search.

Best, Marilyn

Yes, my impression is that

milesw's picture

Yes, my impression is that RDFa will be treated as legacy software – it'll be supported for a while until it dies out.

From the Google Webmaster blog...

If you’ve already done markup on your pages using microformats or RDFa, we’ll continue to support it. One caveat to watch out for: while it’s OK to use the new markup or continue to use existing microformats or RDFa markup, you should avoid mixing the formats together on the same web page, as this can confuse our parsers.

So unfortunately it's one or the other – either RDFa or Microdata. Manu Sporny, one of the W3C folks working on RDFa, wrote an interesting blog post sharing his point of view.

I'll bet Google will fix

no2e's picture

I'll bet Google will fix their parsers in the near future, so that it should be no problem to use RDFa, microdata and microformats together on the same page. Or is there any technical reason why this might not be possible?

RDFa is not only useful for search engines.
Microdata is still a Working Draft (in the sense of the W3C process).
So, I think, we made the right choice to support RDFa out of the box.

And I hope we won't shift from RDFa to microdata; we should support both.

RDFa and Microdata are

linclark.research's picture

RDFa and Microdata are extremely similar, they are both Entity-Attribute-Value models that support using URIs as universal identifiers for things. There is an algorithm for converting Microdata to RDF.
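A small Python sketch of the Entity-Attribute-Value point: the same statement written RDFa-style (with a CURIE) and microdata-style (with a short name scoped by the item type) reduces to the same triple. This is a deliberate simplification and not the Microdata-to-RDF conversion algorithm mentioned above; the subject URI and the helper functions are assumptions made up for the example.

```python
SCHEMA = "http://schema.org/"


def rdfa_triples(subject, prefixes, properties):
    """Expand CURIEs such as 'schema:name' into full URIs."""
    triples = set()
    for curie, value in properties.items():
        prefix, _, local = curie.partition(":")
        triples.add((subject, prefixes[prefix] + local, value))
    return triples


def microdata_triples(subject, itemtype, properties):
    """Resolve short property names against the item type's vocabulary.
    Names that are already absolute URIs are kept as they are.
    (Simplified assumption: vocabulary = item type minus its last segment.)"""
    vocab = itemtype.rsplit("/", 1)[0] + "/"
    triples = set()
    for name, value in properties.items():
        predicate = name if "://" in name else vocab + name
        triples.add((subject, predicate, value))
    return triples


subject = "http://example.org/people/alice"  # hypothetical entity URI
from_rdfa = rdfa_triples(subject, {"schema": SCHEMA}, {"schema:name": "Alice"})
from_microdata = microdata_triples(subject, SCHEMA + "Person", {"name": "Alice"})
same = from_rdfa == from_microdata
```

Both calls end up with one (entity, attribute, value) triple whose attribute is a full URI, which is the sense in which the two syntaxes share a data model.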

AFAIK, you can use RDF vocabularies in Microdata and you can provide RDFS definitions for vocabularies like the schema.org ones. Two of the folks I collaborate with at DERI, Michael Hausenblas (my thesis advisor) and Richard Cyganiak, have worked with ScraperWiki to develop an RDFS definition for the schema.org terms.

I have started a project to ensure compatibility between Microdata and RDFa + the RDF Mapping API as much as possible... and I think that a good integration is possible. I don't see this as a threat at all, I believe that this is a great step in getting people more interested in this technology.

As one of the maintainers of the FOAF vocabulary said in the wake of the announcement, there are plenty of sane reasons for wanting a self contained vocab, explaining multiple vocabularies is hard. This provides an easy on-ramp, and I think that the whole Linked Data effort will benefit from that. I also think that the development effort in Drupal will benefit from involving all these new people. And once everyone knows how to use this technology, we can start seeing how we can build further upon it. I think it helps the Web of Data develop incrementally, which is essential to evolvable systems.

Unless there is a lock-down on vocabularies that you can use, this poses no threat, but is actually a big step towards pushing the Web of Data forward.

EDIT: I thought that ARC2 had a parser for Microdata already, but it does not. I have started an issue in the Microdata queue to see if anyone wants to take this on.

That's good to hear, Lin.

mlangfeld's picture

That's good to hear, Lin. Seems the semantic web community does have some concerns: Manu Sporny and David Wood, who notes the work DERI just did, which is awesome. It does seem that semantic search could threaten large search companies, so we'll see how this plays out. I hope your vision of this pushing forward the Web of Data comes to pass.

Best, Marilyn

Syntax lock in

scor's picture

I applaud the major search engine companies for finally getting together and defining a general vocabulary for the Web. Structured data FTW.

What worries me though is the push for microdata as the exclusive long-term syntax for semantic markup. They go to the extent of discouraging the use of RDFa, arguing that using both syntaxes will confuse their parsers. These two syntaxes do not share any attribute, so I'd like to understand where this argument is coming from... as if they don't have the skills to write parsers, or are they just too lazy to bother with RDFa? They're forcing Web developers to use a specific syntax, ignoring all the RDFa use cases which are not covered by microdata. If you still don't get it, Mark Pilgrim sets the tone in a post on moving to schema.org: "If you’ve been using microformats or RDFa to mark up your Google Rich Snippets, sorry, you backed the wrong horse."

Schema.org is promoting a standard that no one uses yet. Are they trying to get the whole web developer community to adopt it and see how it works out? Is that the way Web standards get built now, without community consensus, not based on technology merits but on personal agendas instead? The HTML5 spec has only one editor, and he's a Google employee. Coincidence?

We have yet to see microdata deployed on the Web to the level of RDFa: Drupal 7, Facebook OGP, Best Buy, and all the e-commerce sites which have incorporated GoodRelations (which is only really available in RDFa). So now you're asking all of us to switch to a new syntax if we want to continue to be indexed by search engines? They say: "We will also be monitoring the web for RDFa and microformats adoption and if they pick up, we will look into supporting these syntaxes". They clearly have missed or ignored the announcements of the last couple of years about the adoption of RDFa.

RDFa shines at mixing vocabularies, but true, many webdevs only care about SEO, and will just do as Google/Yahoo!/Bing say in a sheep-like fashion. One thing to note though: if you want to support more than the schema.org taxonomy, you just can't do it with microdata. You are limited to one single vocabulary per data element, so it's either their way or the highway. You can't do it with RDFa either, because your site will not "comply" with the microdata-only policy. You can certainly extend their schema and reuse their namespace, though I wonder how that plays with their patent infringement.

As for Drupal, we already have a structured data syntax baked in (RDFa), I don't see why we should throw out all the work we've put into it to just follow what Google says, and downgrade to a less feature rich markup: If it ain't broke, don't fix it.

I'm not trying to advocate a particular syntax, I'm only fighting for the freedom of choice for web developers to choose the syntax that works best for their use case, whether it's Microdata, RDFa, or even microformats. Search engines should support whatever syntaxes the Web community decides to use, and whatever syntax makes the job of search engines parsers easier. Microdata is great for simple use cases, but the Web of structured data is not only for search engines to benefit from, and this arrogant push for an exclusive syntax is not a healthy direction for the Web community.

I'm with scor on this

netsensei's picture

I'm with scor on this one.

Microdata and RDFa can live next to each other, but schema.org pushes the community towards a standard that inhibits extensibility and flexibility due to patents and vendor lock-in.

The problem is not the technology (mapping of the schema over standards should be next to trivial). It's the mere fact that the vocabularies provided by schema.org won't suffice if you want to cover more complex domains without any freedom to extend them.

What's more, schema.org is mainly intended for indexing and SEO purposes specifically. RDFa covers that and far more interesting use cases. The demand to not mix both syntaxes forces developers at large to choose a conservative path: if Google/Bing/Yahoo! can't or doesn't index your clients' site, you put yourself at a commercial disadvantage.

If a large part of the semantic web, the part Drupal targets with baked-in RDFa functionality, is described with limiting vocabularies, we're losing out on a lot of potential. To me, the semantic web is only as good as the quality of the metadata that describes its content.

If anything, I support the idea that developers, and their clients, should have the freedom to choose whatever syntax works best for them. That's why I oppose the fundamental idea behind schema.org.

I acknowledge though that from the point of view of the search engines, it's probably easier (more cost efficient) for them to build and optimize applications that work with a controlled vocabulary rather than rely on external vocabularies.

I think it is important to

linclark.research's picture

I think it is important to maintain the distinction between vocabularies and formats. Schema.org is a vocabulary which is independent of the syntax used to place it. RDFa does cover more interesting use cases, but microdata can cover almost all of those use cases as well because you can use vocabularies defined in RDF in microdata.

Does schema.org give any indication whether you will be penalized for using other terms on your page? I do know that Google's Rich Snippets tool did that with RDFa already, and I think that would be a very bad thing. I think that is an issue where they should be pressured; people should be allowed to mark up things with multiple different vocabularies. However, they might fear that this opens the door for spammers (the same way spammers put thousands of words in meta tags).

I wish it was syntax neutral

scor's picture

Schema.org is a vocabulary which is independent of the syntax used to place it.

I wish it was, but the FAQ says "Changing to the new markup format could be helpful over time because you will be switching to a standard that is accepted across all three companies". Currently, the format they are referring to is the one that everyone will see on their main pages: microdata. You have to be pretty hard core to really look for the RDFa 1.1 mapping page.

Does schema.org give any indication whether you will be penalized for using other terms on your page?

Schema.org doesn't, but the announcement from Google and the post from Mark Pilgrim both do. These restrictions depend only on each search engine's implementation of its parsers; all syntaxes can live alongside each other on the same page.

The vocabulary is independent

linclark.research's picture

The vocabulary is independent of the syntax. Whether Google, Microsoft and Yahoo will parse and use RDFa that contains this schema is a tangential (and, yes, political) issue. But we need to be quite clear with our use of terms here so that others can understand the real issues.

Can you please include the parts of the Google announcement and the Mark Pilgrim post that you are referring to? It would be helpful to my understanding of the issues.

It's the second link in my

scor's picture

Both links were in my initial posts, but I updated my second comment to add them there as well.

As far as I can tell, they

linclark.research's picture

As far as I can tell, they both say that you will be penalized for mixing formats, but they don't say you will be penalized for mixing vocabularies.

Mixing formats

Their parser already gets confused just with RDFa, so I'm guessing that if you mix formats, it would actually confuse their parser. I don't think many people using Drupal would want to mix formats anyway, because it would negatively impact front-end performance and make the HTML convoluted. I personally think this is fine advice, I don't see a problem with it.
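For anyone who wants to check whether a page actually mixes formats, here is a rough Python sketch that reports which syntaxes appear in a piece of HTML, based only on attribute names. It says nothing about how any search engine's parser behaves. The RDFa attribute set is a deliberate subset (an assumption of this sketch): rel, rev, href, src and content are left out because plain HTML uses them too.

```python
from html.parser import HTMLParser

MICRODATA_ATTRS = {"itemscope", "itemtype", "itemprop", "itemid", "itemref"}
# Subset of RDFa attributes that plain HTML does not otherwise use.
RDFA_ATTRS = {"about", "typeof", "property", "vocab", "prefix", "resource", "datatype"}


class SyntaxDetector(HTMLParser):
    """Record which structured data syntaxes show up in start tags."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if names & MICRODATA_ATTRS:
            self.found.add("microdata")
        if names & RDFA_ATTRS:
            self.found.add("rdfa")


def detect_syntaxes(html):
    detector = SyntaxDetector()
    detector.feed(html)
    detector.close()
    return detector.found


# Hypothetical page fragment that uses both syntaxes.
page = (
    '<div itemscope itemtype="http://schema.org/Person">'
    '<span itemprop="name">Alice</span></div>'
    '<p about="#me" typeof="foaf:Person">'
    '<span property="foaf:name">Alice</span></p>'
)
syntaxes = detect_syntaxes(page)
```

Because the two attribute sets are disjoint, a check like this never has to guess which syntax a given attribute belongs to.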

Mixing vocabularies

This is what I was asking about, do they say you can't use FOAF and their Person terms on the same item? That would be a problematic issue, although one I could see them having a valid opposition to. If they allow vocabulary mixing, it could lead to spammers using every vocabulary on the same item just to get the search result into every vertical. However, I think it is important for the evolution of the Web that people be able to mix and match vocabularies so that new standards can emerge. If they do penalize pages that mix vocabularies, I would oppose that.

These are two quite separate issues, which is why it is important to be clear with our use of terms.

sorry Lin, I was still in

scor's picture

Sorry Lin, I was still in syntax mode when I misread your post, so the links are irrelevant. However, I know Facebook's parser does not support multiple CURIEs in the same attribute; I did mention it to them last year at SemTech. Other than poorly implemented parsers, I don't know of a case where they would not allow that. I'm all for term mixing of course, so that's something we must point out whenever we encounter such a bug.

Yeah, the Google Rich

linclark.research's picture

Yeah, the Google Rich Snippets testing tool would give warnings for including info about the Node author on the page and also for using foaf:Image as the type on image files. I do think that we as the SemWeb community should mobilize around that issue and ensure that multiple vocabularies can be used on a page... it would truly be terrible if the big three halted independent vocabulary creation and use, whether intentionally or unintentionally.

I'll keep an eye out and see whether it looks like any of them are penalizing the use of multiple vocabs.

you are limited to one single

linclark.research's picture

you are limited to one single vocabulary per data element

I've been hearing this about mixing vocabularies from a few folks. I'm unclear on what people are referring to. From the specification, it seems that you can use full URIs for properties that are outside the vocabulary that the element is scoped in. They have an example that scopes the item with one vocabulary but also uses terms from another. The advantage that RDFa 1.1 would have in a similar use case (when using the vocab attribute to scope an element to a vocabulary) would be that you could use a CURIE for the outside term. But it seems like, beyond the CURIE, the mechanism is basically the same. Please let me know if I'm misunderstanding; I've been wondering what people have been referring to for a while.

I agree with your point, we should definitely continue to support both of them and try to integrate the two as much as possible so that it is easy for Drupalistas to make their own choice (and even change between the two whenever they want). It makes no sense to make Drupal a battleground for warriors on either side.

You are right Lin, what I

scor's picture

You are right Lin, what I said was unclear without more context. You are not limited to one vocabulary in microdata, if you want to use multiple vocabularies, you can use full URIs for properties. The case of type is more complex though: the itemtype attribute can only hold one type (the default vocabulary) and you have to use some hacks to workaround that limitation which RDFa does not have. I felt this debate needed its own post, so I wrote a blog post about handling multiple vocabularies in microdata.
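Here is a minimal Python sketch of the property side of this: one item with a single itemtype, one short property name that is resolved against the type's vocabulary, and one property from another vocabulary given as a full URI. It only handles a single, non-nested item, and treating the vocabulary as "item type minus its last path segment" is a simplifying assumption, not the specification's algorithm.

```python
from html.parser import HTMLParser


class FlatItemParser(HTMLParser):
    """Collect (property URI, text) pairs from one non-nested microdata item."""

    def __init__(self):
        super().__init__()
        self.itemtype = None
        self.current = None  # property URI waiting for its text content
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.itemtype = attrs["itemtype"]
        name = attrs.get("itemprop")
        if name:
            if "://" in name or self.itemtype is None:
                # Full URI (or no type to scope against): keep the name as is.
                self.current = name
            else:
                # Short name: scope it to the item type's vocabulary.
                vocab = self.itemtype.rsplit("/", 1)[0] + "/"
                self.current = vocab + name

    def handle_data(self, data):
        if self.current and data.strip():
            self.pairs.append((self.current, data.strip()))
            self.current = None


# Hypothetical markup: a schema.org item that also carries one FOAF property.
html = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Alice</span>
  <span itemprop="http://xmlns.com/foaf/0.1/nick">ali</span>
</div>
"""
parser = FlatItemParser()
parser.feed(html)
parser.close()
pairs = parser.pairs
```

The type limitation discussed above is not addressed here: the sketch simply records whatever single value sits in itemtype.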

stevemacbeth's picture

Hi. I stumbled upon your debate on schema.org from a search on Bing. I am one of the founders of schema.org, from the Bing Search Quality team.

I wanted to clarify two points and solicit feedback on a third.

The first is about the potential future support of RDFa. On the FAQ page for schema.org, which is referenced extensively in the above debate, our position on this is relatively clear. At least we thought it was relatively clear.

Q: Why microdata? Why not RDFa or microformats?

Focusing on microdata was a pragmatic decision. Supporting multiple syntaxes makes documentation for webmasters more complex and introduces more overhead in terms of defining new formats. Microformats are concise and easy to understand, but they don't offer an open extensibility mechanism and the reuse of the class tag can cause conflicts with website CSS. RDFa is extensible and very expressive, but the substantial complexity of the language has contributed to slower adoption. Microdata is the most recent well-known standard, created along with HTML5. It strikes a balance between extensibility and simplicity, and is most suitable for building the schema.org. Google and Yahoo! have in the past supported both microformats and RDFa for certain schemas and will continue to support these syntaxes for those schemas. We will also be monitoring the web for RDFa and microformats adoption and if they pick up, we will look into supporting these syntaxes. Also read the section on the data model for more on RDFa.

Our decision, as indicated above, to support microdata was a trade-off between complexity and expressiveness. We tried to find the right balance between these as we believe having sufficient expressiveness is necessary for us (major search engines) to use this data in a meaningful way. On the other hand keeping complexity low is necessary to drive adoption.

I hope from the above that it is clear we have not ruled out RDFa as a syntax that we would consider supporting in the future. It wasn't our first choice for the reasons I already stated, but is something we are open to and already in discussion with key RDFa community members about.

The second point I wanted to try to clarify is about the ability for a parser to parse a page that includes multiple syntaxes. This is currently supported by Bing. I can't speak for Google, but I know this is an issue they are aware of and I would expect to hear something on this front shortly. It was an oversight in the original work we did and wasn't directional in our thinking. In other words, we did not intend to limit support of other syntaxes by publishers/webmasters, as many have stated.

The third point is with regards to potential collaboration with the Drupal development community. We believe, as I believe a number of members of this community have shown interest in, that large scale adoption will require implementation support in key publishing tools. Drupal being a significant player in the publishing industry seems like a reasonable place to start with collaboration discussions. We have had some internal discussions already within Microsoft about support through our web publishing and hosting stack and would like to engage with other third parties. If this is something that you are interested in please let me know and I would be happy to get you connected as those discussions mature.

On a side note I think the point that Lin made early in this thread is a very important one and one I think she made very eloquently. It is important to separate syntax and vocabulary. We are passionate about having a single, shared, well-defined vocabulary. We are less passionate about syntax. We are also very open to making industry-specific extensions. We designed schema.org with extensibility in mind and announced our first broad partnership today with AEP & CC.

This is a partnership between schema.org, the Association of Education Publishers, Creative Commons, The Bill & Melinda Gates Foundation and a large number of the leading educational content publishers.

Thank you for pitching in.

netsensei's picture

Thank you for pitching in. Indeed, the issue is less about syntax.

I wonder though, how this quote from Lin:

Mixing vocabularies
This is what I was asking about, do they say you can't use FOAF and their Person terms on the same item? That would be a problematic issue, although one I could see them having a valid opposition to. If they allow vocabulary mixing, it could lead to spammers using every vocabulary on the same item just to get the search result into every vertical. However, I think it is important for the evolution of the Web that people be able to mix and match vocabularies so that new standards can emerge. If they do penalize pages that mix vocabularies, I would oppose that.

relates to:

We are passionate about having a single, shared, well-defined vocabulary.

Vocabularies are the building blocks of the semantic web. They define the concepts we need to describe resources on the semantic web. If we want to be able to describe things in new ways, these should be extensible as Lin says.

Major search engines play a key role in the adoption of the semantic web as their services are a fundamental functional part of the web today. Endorsing their own vocabulary where other well established vocabularies are already out there (FOAF vs. Person) can hurt the semweb:

A few questions I heard these past days:

  • If I have a lot of FOAF-described data, do I have to convert it to the Person vocabulary?
  • Does a search engine company have enough knowledge about a given domain to curate a single, well defined vocabulary?
  • Is it even the task/role of search engines to write and push vocabularies?
  • Can I extend the vocabularies using other vocabularies?
  • Will search engines still support other vocabularies like FOAF, OWL, DC,...?

Hi Steve, Thanks for reaching

scor's picture

Hi Steve,

Thanks for reaching out and posting directly here. Having open conversations with search engine staff like you will definitely help us understand how you process pages, so we can make decisions and fix bugs if they exist. We'll definitely also help you understand how we generate our markup so you can make good use of it.

I understand that Microdata was a pragmatic choice to launch and I apologize if I jumped the gun too quickly. Really, my concerns came from the way Google announced how they would make use of schema.org. I'm glad to hear we're on the same page regarding being syntax agnostic, and that Bing supports both syntaxes. I'm sure Google will fix their parsers too with time.

As for your third point about collaboration with the Drupal development community, I'd be thrilled to see this happening. Feel free to use this group to post public discussions (if there are particular issues that need to be fixed, we can then direct you to the right place depending on whether it pertains to Drupal core or a particular module). If you want some of us to participate in discussions in Bing forums, please send us some links.

I'm excited to check out the Bing webmaster tool, I'll give it a shot. Feel free to post some relevant links here. Bing is actually the first search engine to engage in conversation with the Drupal community, that's really awesome. Thanks Steve.


Thank you

alexandercoxs's picture

Hi Steve,
Thanks for reaching out and having this open conversation directly here; it is much appreciated. This is great news. We are trying this schema so that it will help us give a better user experience with the semantic web.

Clarity and supporting your early adopters

ronald_istos's picture

Steve - thanks for pitching in. Very useful info there.

I think there is one issue on which we are all in agreement: structured data will be a component of the future web, and efforts that take us in that direction are generally a good thing. The question, then, is how to introduce that data and how such efforts should coordinate.

My main concern is not so much with schema.org as a technology - you've clearly thought about it a lot and reached what you believe is the best conclusion at the time. The main problem as I see it is the method. Although schema.org is not about replacing RDFa (after all, schema.org mostly covers one specific use case that is of primary and immediate concern to search engines), the perception to a lot of people is that RDFa has lost some sort of battle.

To me it looks like Google and Microsoft missed the opportunity to champion the cause of structured data while introducing schema.org. Pitching it as an essentially either/or choice damaged a lot of hard work people did to convince communities to adopt structured data. Scor, Lin, Dries and many others worked hard to educate and convince the Drupal community to pick up RDFa - a risky way to spend limited resources on an open source project. Now, to a lot of people it looks like Drupal bet on the wrong horse (that is by far not the case - but I am talking perceptions).

For example now a lot of companies will be wondering whether to even switch on the RDFa capabilities in Drupal 7 - which means less adoption for structured data. They will certainly be questioning whether they should invest to expand on the existing RDFa capabilities. People working with companies in that direction will now have to introduce another level of complexity in motivating choices introducing lots of "maybe", "probably", "they said that perhaps", etc.

My question given the situation now is: what are the difficulties for the companies behind schema.org in saying that you will unequivocally support an RDFa vocabulary that cleanly translates into your vocabulary, in the interest of:
1. Supporting all the hard work of the communities that have championed the common cause of structured data
2. Ensuring that people who want to do more than just the basic SEO use case can do so without having to mix technologies.

The people most likely to provide your search engines with useful structured data the earliest are the people that you've just made life harder for. By explicitly and clearly providing support for an RDFa equivalent you can benefit as well, and the cause of structured data definitely benefits.

Very basic question

pamelalies's picture

I feel like this is a very basic question that I should be able to figure out myself, but can't find confirmation in any of the documentation I've found.

Is it the case that you can only get schema.org markup to work for new content types that you create after you have the module installed and enabled? Or should you be able to go into your existing content types, map them to an "Event" or "Product" type say, and then map their fields to the appropriate properties?

I've tried to do this and no markup is created for existing content types for which I've mapped fields to properties, but the sample content types that come with the install of the module do produce markup when you create content with them.

Is that really the case that you can't use schema.org with existing content types?

Thanks! Pamela

Re: Very basic question

garyebickford's picture


My associate and I are discussing your question now. We're going to have to run a couple of tests to be sure, but we think the following is true.

You can add properties to an existing content type. This will not be reflected in any existing content with that type, but will be reflected in any new content you create of that content type after that.

This leaves open the question of whether editing existing content would result in the RDF being applied. There are some other related questions - if one uses the Node Convert module to convert from one type to another, then what happens?

There are also interactions that may be happening with some of the deeper RDF modules as well.

[... time ...] My associate just ran some tests. He found that re-indexing after adding the mapping seems to work. To do this, add the mappings you want to the content type. Then go to the Search API configuration, and delete the index (whichever one applies to this content type), and rebuild it. He says that after this is done, both the existing and the new content will show the RDF mappings.

As my associate noted, "This must all be learned through sweat and tears." :} - and a lot of trial and error.

Thanks for taking a look at this...

pamelalies's picture

I figured it might be the case that existing content would not have the schema markup applied, but I also tried creating a new node from the content type I had previously mapped the fields to the schema properties on - and no schema markup was produced on that new node either.

I will try the re-indexing you suggest (after I get through my next meeting!), and report back as to whether that was the fix. Fingers crossed!

Is that really the case that

scor's picture

Is that really the case that you can't use schema.org with existing content types?

In Drupal 7, any RDF mapping that is set for any new or old content type should work. The schema.org module uses the core RDF mappings, and therefore its output should be part of the markup unless there is a contrib module or a theme that alters or takes over the rendering pipeline. First make sure you are logged in as admin when you are debugging; that will keep caching issues from interfering. What I suspect is happening is that for your "legacy" content type, you might have configured some module like Panels to take over the layout and rendering of your nodes. This is a known issue. Could you create a new issue here with as many details as possible so I can help you debug this?

Thanks so much Scor! I'm

pamelalies's picture

Thanks so much Scor! I'm going to try the suggestion above to re-index, and if that doesn't work I will create a new issue with further details. Thanks!

I'm intrigued by the

scor's picture

I'm intrigued by the re-indexing you are talking about. If you are only talking about RDFa in the regular HTML output, there should be no re-indexing of any kind required. Are you using Search API or any other kind of search tool?

This is in reference to

pamelalies's picture

This is in reference to having the markup apply to existing content that was created before the fields were mapped to the schema properties on the content type.

Garyebickford suggested that:

[... time ...] My associate just ran some tests. He found that re-indexing after adding the mapping seems to work. To do this, add the mappings you want to the content type. Then go to the Search API configuration, and delete the index (whichever one applies to this content type), and rebuild it. He says that after this is done, both the existing and the new content will show the RDF mappings.

I do suspect, however, that I

pamelalies's picture

I do suspect, however, that I have another issue to deal with first, since I created a new node from a content type I had added the mappings to - and the schema markup was not created on that page.

Our configuration may be different as well

garyebickford's picture

We've been working with so many different configurations and modules it's hard to tell what we have in place or how our installation might differ. For example, we've been working with almost all of the various RDF, Schema, and Sparql related tools together and separately at one time or another in the last month. I suppose the definitive test would be to start from a completely generic initial Drupal install, and install only the SchemaOrg module - in one's spare time of course :)

One thing that can sometimes be useful is to run 'drush pm-list > /your/filename.ext' from the command line, if you have drush running. This lists all the modules and themes you have installed and puts the output into a text file. That makes it easy to use gvimdiff or equivalent to compare different drupal instances.
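If gvimdiff is not at hand, the same comparison can be sketched with Python's standard difflib. The function below takes the text of two saved module lists and returns a unified diff of their non-empty lines. It treats each line as an opaque string, and the sample module names are made up for illustration; they are not real drush output.

```python
import difflib


def diff_module_lists(text_a, text_b, name_a="site-a", name_b="site-b"):
    """Return unified-diff lines between two saved module listings.
    Lines are stripped and sorted so ordering differences do not show up."""
    lines_a = sorted(line.strip() for line in text_a.splitlines() if line.strip())
    lines_b = sorted(line.strip() for line in text_b.splitlines() if line.strip())
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
        )
    )


# Hypothetical listings from two installs (not actual drush output).
site_a = "rdf\nsearch_api\nschemaorg\n"
site_b = "rdf\nschemaorg\n"
diff = diff_module_lists(site_a, site_b)
```

In practice the two strings would come from reading the files written by the drush command on each instance; lines starting with "-" or "+" in the result mark modules present on only one side.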

Re: I'm intrigued by the

garyebickford's picture

Thanks, that's a good point. I'm a bit removed from the actual work on this at this time, as someone else has taken over this part of the work, and also we have more than just the schema.org things going on. I'll ask my associate - maybe I can get him to register here and put in his thoughts.

Reindexing after adding an RDF Mapping to a Field

bshambaugh's picture

I found that after I added an RDF mapping for a particular field in a content type I needed to clear the index and re-index the content. Otherwise a SPARQL query at the ARC2 SPARQL endpoint would not show the mappings (unless of course I waited a while for cron, I suppose). I was using a setup described by "Indexing RDF data and providing a SPARQL endpoint" (with the exception that I also added an index for taxonomy terms). I was using Search API, so the answer to scor's inquiry is yes.
