Input formats. A different point of view

claudiu.cristea's picture

Drupal formats user input by applying a set of customized filters before the content is sent to the visitor's browser. In other words, the user input is stored untouched in the database (apart from the SQL injection filtering), and the filter module cleans up and formats the code when the HTML is output. In effect, the input formats are... output formats :)

There are web administrators (I am one of them) who prefer to filter content at input. They want clean code stored in the database backend. They want to strip all potentially dangerous code and fix XHTML issues before the content is stored. Why? There are many reasons. Maybe they use the Drupal database with other client applications that have no output filtering system, and they want to be sure that the content retrieved from the database is "clean". Maybe they will need to migrate to another system in the future and want well-formed content in their database.

Drupal's output formatting/filtering is valuable too. I'm thinking especially of the URL filter, the Line break converter and the PHP evaluator. These tools deal more with formatting than with filtering. They alter the content in ways that do not need to be stored in the database, so they should be applied at output rather than at input.

My proposal: for each input format, administrators should be able to configure whether each filter is an input filter or an output filter. Each filter checkbox would turn into a combo box with three values: "none", "input filter", "output filter". For example, the Filtered HTML input format could have the HTML filter configured as an input filter, while the Line break converter and the URL filter are configured as output filters.
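To make the idea concrete, here is a rough sketch of how a per-filter stage setting could behave. It is written in Python purely for illustration and is not Drupal's filter API: the `Stage` enum, the two toy filters and the `on_save` / `on_view` helpers are all hypothetical names invented for this example.

```python
import re
from enum import Enum


class Stage(Enum):
    """The three values of the proposed combo box."""
    NONE = "none"
    INPUT = "input filter"
    OUTPUT = "output filter"


def strip_script_tags(text):
    """Toy stand-in for the HTML filter: drop <script> elements."""
    return re.sub(r"<script\b.*?</script>", "", text,
                  flags=re.IGNORECASE | re.DOTALL)


def line_breaks_to_html(text):
    """Toy stand-in for the Line break converter."""
    return text.replace("\n", "<br />\n")


# Hypothetical configuration of one format: each filter paired with a stage.
FILTERED_HTML = [
    (strip_script_tags, Stage.INPUT),
    (line_breaks_to_html, Stage.OUTPUT),
]


def apply_stage(text, fmt, stage):
    """Run, in order, only the filters configured for the given stage."""
    for func, configured in fmt:
        if configured is stage:
            text = func(text)
    return text


def on_save(text, fmt):
    """What would be written to the database."""
    return apply_stage(text, fmt, Stage.INPUT)


def on_view(stored, fmt):
    """What would be sent to the browser."""
    return apply_stage(stored, fmt, Stage.OUTPUT)


raw = "Hello<script>alert(1)</script>\nWorld"
stored = on_save(raw, FILTERED_HTML)      # script removed before storage
rendered = on_view(stored, FILTERED_HTML)  # line breaks converted at output
```

In this sketch the stored text no longer contains the script element, while the line break markup exists only in the rendered output and never reaches the database.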

Of course, this requires rewriting the filter module. Before creating patches, I thought a discussion would be welcome. Any comments or ideas?

Comments

one small problem

kyle_mathews's picture

One problem this would create: when I set up a new site, I frequently have to tweak the Filtered HTML input format to be less restrictive, because I often miss a tag or two that should be allowed through. If the full HTML is stored in the database, you just tweak the allowed tag list and your content displays correctly. If the input filter were actually an input filter, those tags would be gone forever.

But beyond that, I agree with your reasoning. It would be useful to filter content as it comes in -- just be careful what you filter away forever.
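A small illustration of that risk, as a Python sketch rather than anything Drupal actually does: `filter_tags` is a hypothetical, deliberately naive tag stripper, and the two allowed-tag sets are made up for the example.

```python
import re


def filter_tags(text, allowed):
    """Naive tag stripper: remove tags not in `allowed`, keep their inner text."""
    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in allowed else ""
    return re.sub(r"</?([a-zA-Z][a-zA-Z0-9]*)[^>]*>", keep_or_drop, text)


submitted = "<p>Some <em>important</em> text</p>"
strict = {"p"}         # first attempt at the allowed tag list
relaxed = {"p", "em"}  # list after noticing that <em> was missing

# Filtering at output: the raw text is stored, so relaxing the list
# brings <em> back the next time the content is viewed.
stored_raw = submitted
view_after_tweak = filter_tags(stored_raw, relaxed)

# Filtering at input: the filtered copy is stored, so <em> was already
# discarded and relaxing the list later cannot restore it.
stored_filtered = filter_tags(submitted, strict)
view_after_tweak_input = filter_tags(stored_filtered, relaxed)
```

With output filtering, `view_after_tweak` still has the emphasis; with input filtering, `view_after_tweak_input` has lost it for good.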

Kyle Mathews


RE: one small problem

claudiu.cristea's picture

Hi Kyle,

I totally agree with your point of view. As I wrote, I'm also a "fan" of the formatting filters that are applied at output. But even in the cases you describe, there are tasks that need to be performed before the content is stored in the database. Things that you may want to "filter away forever" include:

  • Fixing XHTML issues: unclosed tags, wrong nesting, uppercase tags, etc.
  • Removing garbage: stripping MS attributes from tags (e.g. <p class=MsoNormal>)
  • Stripping dangerous tags (like <script>)

These are the tasks that I want to "filter away forever"...

Other filtering tasks I'd like to have performed at output:

  • Stripping unwanted (but not dangerous!) tags like <span>, <div>, etc.
  • Transforming URLs into hyperlinks
  • Converting line breaks into paragraphs, etc.

______________
Claudiu Cristea
www.ascentgroup.ro

________________
Claudiu Cristea
webikon.com

Reasonable request

sun's picture

In Drupal, "input filter" means a filter for user input, so your concerns are mostly about UI and terminology.

However, although I don't know if I would use such a feature, I see your point about integration with other systems. Thus, an option to decide whether a filter should be applied upon save or view sounds sane to me.

Daniel F. Kudwien
unleashed mind

Daniel F. Kudwien
netzstrategen

RE: Reasonable request

claudiu.cristea's picture

@sun:

"However, although I don't know if I would use such a feature,"

That's the reason why I posted this on Drupal Groups. I will not start any core patching if...

  • ... there is no interest in such features,

OR

  • ... such a feature is not well accepted as a core feature by the Drupal community

... and thanks for your feedback


Claudiu Cristea
www.ascentgroup.ro

________________
Claudiu Cristea
webikon.com

What do you think of

cwgordon7's picture

What do you think of including the flexifilter module in core as a replacement for the filter module API / UI?

gain an audience, then start a campaign

christefano's picture

Flexifilter will have more fans, I think, once it supports Drupal 5.x. Nearly all my sites are running D5, so I'm speaking as an eager fan-to-be :)

In all honesty, at the

cwgordon7's picture

In all honesty, at the current rate I do not think there will ever be a Drupal 5.x version unless someone steps up and writes the patch. We are trying to look forward: Drupal 6 and 7 are the future. The issue queue is open for patches, but otherwise I don't see this happening, at least in the short term. Besides, sites are about to start switching over to Drupal 6.

Great Idea

glass.dimly's picture

This would be very useful for understanding the ramifications of attempting to corral TinyMCE's output. I've encountered problems around just this very issue and am attempting to resolve them right now.

-Jeremy

mikeschinkel's picture

Hey Claudiu:

I just read this and wanted to say that I think your idea to have both input and output formats is a really, really excellent idea! I support this completely!

-Mike Schinkel
http://mikeschinkel.com/
