Posted by jensimmons on July 25, 2010 at 8:48pm
We are discussing adding functionality to the HTML Tools module that would add more HTML tags ("elements") to the "filtered" input filter.
Let's create a list here of what tags should be allowed.
Examples:
<article>, <section>, <mark>, <time>
Comments
Context?
It might be good to add some indication of best uses for an element. For example, how is article or section to be used? The W3C spec indicates that these should not be used as general purpose containers like div. The article element seems a good match for Drupal syndication, and the section tag might be appropriate for use with Drupal tabs. I imagine there are many more uses for these two.
trickiness...
if section and article are allowed, shouldn't header and footer be allowed as well, since you can have either of those within a section?
in a talk by tantek celik (@ voices that matter san francisco 2010), he mentioned that with the new html5 standard a couple tags got revisions:
a lot of old, presentational elements have been repurposed to have semantic meaning. for example:
moreover, li value and ol start have been restored.
see: http://www.w3.org/TR/2010/WD-html5-diff-20100624/#changed-elements
FWIW
I'm theming a site where the designer is delivering mock-ups using HTML5, CSS3 and jQuery :)
So far, the only new elements that he has used include header, footer and nav.
Also, we are using a very simple html tag with only the language attribute specified at this time (no namespace).
Also here's a list of all the
Also here's a list of all the text level semantic tags http://www.w3.org/TR/html5/text-level-semantics.html
video
DOH! What am I thinking, I'd love to see video tag allowed :)
Oh, right. We probably could
Oh, right. We probably could allow the video tag — yes? I'm so used to excluding the object and embed tags for security reasons. Are the video and audio tags secure? If so, let's include them! Wow — videos in comments, ftw!
Also, let's not use this space for debating what the html tags do, or everything about them. Let's list and discuss which tags should go into something that's just like the current "filtered html" input filter in Drupal 6.
For reference, the default D6 input filter is
<a>
<em>
<strong>
<cite>
<code>
<ul>
<ol>
<li>
<dl>
<dt>
<dd>
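To make the allow-list idea concrete, here is a minimal sketch in Python (not Drupal's actual PHP filter). The names `AllowListFilter` and `filter_html` are made up for illustration, and the sketch drops every attribute, which is stricter than what a real filter would do with something like `href`.

```python
from html import escape
from html.parser import HTMLParser

# The D6 "Filtered HTML" defaults quoted above.
D6_ALLOWED = {"a", "em", "strong", "cite", "code",
              "ul", "ol", "li", "dl", "dt", "dd"}


class AllowListFilter(HTMLParser):
    """Hypothetical filter: keep allowed tags, drop other tags, keep text."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = set(allowed)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Simplification: attributes are discarded entirely.
        if tag in self.allowed:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it can never turn back into markup.
        self.out.append(escape(data, quote=False))


def filter_html(text, allowed=D6_ALLOWED):
    parser = AllowListFilter(allowed)
    parser.feed(text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Adding a tag to the filter is then just a matter of extending the set, for example `filter_html(text, D6_ALLOWED | {"mark"})`.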
It looks to me like D6 core HTML4/XHTML defaults were chosen with several things in mind:
1) don't allow anything that's insecure
2) keep the list small and simple
There's no h1, h2, h3, etc. There's no blockquote, del, q, sub, pre, b, u, sup, img, table, strike, acronym, etc.
If we want to not have too many new tags, how can we decide what's more likely to be needed?
Here's a list of what's been suggested above:
section
article
header
footer
video
audio
mark
time
b
i
wbr
hr
small
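For anyone who wants to try the combined list, here is a small Python sketch that merges the D6 defaults with the suggestions so far and prints them in the space-separated, angle-bracket form. That format is my assumption about what the "Allowed HTML tags" setting expects, and `allowed_tags_setting` is a made-up helper name.

```python
# D6 defaults and the tags suggested in this thread so far.
D6_DEFAULT = ["a", "em", "strong", "cite", "code",
              "ul", "ol", "li", "dl", "dt", "dd"]
SUGGESTED = ["section", "article", "header", "footer", "video", "audio",
             "mark", "time", "b", "i", "wbr", "hr", "small"]


def allowed_tags_setting(*groups):
    """Merge tag lists in order, skip duplicates, format as '<a> <em> ...'."""
    seen = []
    for group in groups:
        for tag in group:
            if tag not in seen:
                seen.append(tag)
    # Assumed format: each tag in angle brackets, separated by spaces.
    return " ".join(f"<{tag}>" for tag in seen)


if __name__ == "__main__":
    print(allowed_tags_setting(D6_DEFAULT, SUGGESTED))
```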
What is considered insecure?
from html4: embed, object, img... what else?
Are any of the new HTML5 tags dangerous? Canvas for sure. Any others?
Jen Simmons
http://jensimmons.com
block vs. inline
It's also been noted that the original list is only inline elements, though I'm not sure if that was on purpose.
Regarding security, I think it's somewhat too early to say for sure which are going to be safe/unsafe.
There is a list at http://heideri.ch/jso/#html5 that is long, but I haven't reviewed it fully to see which of the html5 tags specifically cause new problems.
knaddison blog | Morris Animal Foundation
new elements will also be
new elements will also be inline unless specifically stated otherwise, this is the default behavior for unknown elements (like if you were to serve XML and make up your own tags)
also do note that our dearest friends, the IE family (excluding 9), refuse to render unknown elements unless they're shivved in -> http://remysharp.com/2009/01/07/html5-enabling-script/
that one also includes the print fix (even with a regular shiv, IE doesn't apply those styles when printing, so u need extra magic)
I also think that IE8 refuses to apply it if there is no body tag present (which is optional now btw, but for obvious reasons, should be retained)
this means that the IEs won't style these elements if js is disabled, a usable workaround is to use an inner wrapper with an element IE does understand like this for example:
instead of just
<!doctype html>
<html>
<head>
<title>foo</title>
</head>
<body>
<header>
...stuff
</header>
</body>
</html>
you could use
<!doctype html>
<html>
<head>
<title>foo</title>
</head>
<body>
<header>
<div class="header">
...stuff
</div>
</header>
</body>
</html>
and then style .header instead of header
I know, it brings back the horrible nightmares of divitis...
Headings & Accessibility
Just a quick note that h2, h3, etc. should be part of the default but didn't make the cut for D7. See http://drupal.org/node/514008
It's the semantic way to break up long pieces of content (even if you aren't blind) so really should be considered when adding to the default.
The trick is you don't want to allow someone to add an h1 or sometimes even an h2 depending on the document structure.
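One possible way to handle this, sketched in Python: instead of rejecting h1 and h2 outright, demote any heading above a configured minimum level. Demoting rather than stripping is my own assumption here, `demote_headings` is a hypothetical name, and a regex like this only copes with simple, well-formed markup.

```python
import re

# Matches the start of an opening or closing h1..h6 tag, e.g. "<h2" or "</h2".
_HEADING = re.compile(r"<(/?)h([1-6])(?=[\s>/])", re.IGNORECASE)


def demote_headings(markup, min_level=3):
    """Rewrite headings above min_level (h1, h2 by default) down to min_level."""

    def _replace(match):
        level = max(int(match.group(2)), min_level)
        return f"<{match.group(1)}h{level}"

    return _HEADING.sub(_replace, markup)
```

With the default `min_level=3`, a comment containing an h1 comes out with an h3 in its place, while h4 and below are left alone.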
--
OpenConcept | Twitter @mgifford | Drupal Security Guide
if img is not included for
if img is not included for security reasons, i would guess that video and audio might be subject to the same issues. maybe this is something to ask the drupal security team, i dunno. with the video and audio tags you can specify multiple sources, and browsers can choose which source to use, based on which codec they support. could specifying fake src's, sort of like tracking pixels, be a security threat?
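To make the tracking-pixel worry concrete, here is a rough Python sketch that lists the URLs a piece of markup could ask a visitor's browser to contact. The attribute table and the names `RequestCollector` and `urls_requested` are assumptions for illustration, not a complete inventory of fetching attributes, and a real browser would normally pick just one of several source elements.

```python
from html.parser import HTMLParser

# Assumed (incomplete) table of attributes that can trigger a request.
FETCHING_ATTRS = {
    "img": ("src",),
    "video": ("src", "poster"),
    "audio": ("src",),
    "source": ("src",),
}


class RequestCollector(HTMLParser):
    """Hypothetical helper: record URLs found in fetching attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = FETCHING_ATTRS.get(tag, ())
        for name, value in attrs:
            if name in wanted and value:
                self.urls.append(value)


def urls_requested(markup):
    collector = RequestCollector()
    collector.feed(markup)
    collector.close()
    return collector.urls
```

Running this over a comment before saving it would at least show how many third-party hosts a single video element can name.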
as for section and article, they are considered tags for "sectioning content" (http://www.w3.org/TR/html5/content-models.html#sectioning-content-0). sectioning content tags "potentially [have] a heading and an outline". the heading includes the h1-h6 tags, which are currently not allowed. so if you can't have headings in the current D6 html input format, then should you be allowed sections?
Headings being recursive is new for HTML5, so i think two arguments could be made:
My personal vote is the latter, since it seems to me most consistent with the intent of the original D6 Filtered HTML input format.
Hmm...
I think that the document structural elements like section, article, footer, header, nav, aside and figure should be left out of a basic input filter. The degree of latitude allowed to each content creator should be left to each site builder/owner. Furthermore, it's inconsistent to allow the new elements without also allowing div.
Also if you provide <audio> and <video> you also have to provide <source>.
Here's a flowchart from HTML5Doctor about the new HTML5 elements.
http://html5doctor.com/wp-content/uploads/HTML5Doctor-sectioning-flowcha...
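On the source point, a quick Python sketch of a sanity check for a proposed tag list: it reports companion tags that are missing. The `COMPANIONS` table and the name `missing_companions` are hypothetical, and strictly speaking source is only needed when offering several formats, since audio and video can also take a single src attribute.

```python
# Assumed pairing: tags that are of limited use without a companion tag.
COMPANIONS = {
    "audio": {"source"},
    "video": {"source"},
}


def missing_companions(allowed):
    """Return companion tags that the allowed list should probably add."""
    allowed = set(allowed)
    needed = set()
    for tag, companions in COMPANIONS.items():
        if tag in allowed:
            needed |= companions
    return needed - allowed
```

For example, a list that allows video but not source would get `{"source"}` back.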
Don't forget FIGURE and FIGCAPTION!
Two of my favorite new tags - I would love to be able to use them in the text editor.
Oh my.
Most all of these new tags belong in the 'full' input filter but not in the 'filtered'. People need an easy to use, very simple, secure filter as a default, something that will work for comments, for instance. Much better to think of the minimum required, rather than what would be 'kewl' to a front end web dev.