Project page is http://drupal.org/node/236461
For more information, see http://groups.drupal.org/node/10890
Discussion: http://groups.drupal.org/node/9929
The goal of this project is to create a plugin-based import module for Drupal that allows the upload of office suite file formats, which would then be parsed into Drupal nodes. This would allow even novice CMS users to generate content using a familiar office productivity suite.
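As a rough illustration of the plugin idea, the sketch below shows a router that keeps a registry of parsers keyed by MIME type and turns an upload into a node-like record. It is written in Python purely for illustration; a real module would be built on Drupal's PHP hook system instead. Every name here (`ParserRegistry`, `register`, `route`, `plain_text_parser`, the node fields, and the default node type) is a hypothetical assumption, not part of any existing module.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A parser plugin takes raw file bytes and returns a dict of node fields.
Parser = Callable[[bytes], Dict[str, str]]


@dataclass
class ParserRegistry:
    """Hypothetical router: maps MIME types to parser plugins."""
    parsers: Dict[str, Parser] = field(default_factory=dict)

    def register(self, mime_types: List[str], parser: Parser) -> None:
        # Each parser plugin declares the MIME types it is interested in.
        for mime in mime_types:
            self.parsers[mime] = parser

    def route(self, mime_type: str, data: bytes) -> Dict[str, str]:
        # Hand the upload to the matching parser, or fail loudly.
        parser = self.parsers.get(mime_type)
        if parser is None:
            raise ValueError(f"no parser registered for {mime_type}")
        node = parser(data)
        node.setdefault("type", "page")  # assumed default node type
        return node


def plain_text_parser(data: bytes) -> Dict[str, str]:
    # Stand-in parser: the first line becomes the title, the rest the body.
    text = data.decode("utf-8")
    title, _, body = text.partition("\n")
    return {"title": title.strip(), "body": body.strip()}


registry = ParserRegistry()
registry.register(["text/plain"], plain_text_parser)
node = registry.route("text/plain", b"Hello\nFirst paragraph.")
```

A .doc or ODT parser would plug in the same way, registering its own MIME types and returning the same kind of field dictionary, so the router never needs to know about individual file formats.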
PROJECT DETAILS:
Features to Implement:
-The Router module
  -Create a set of hooks other modules implement to parse the various data types
  -Allow the parsers to specify MIME types that they are interested in and keep an internal registry of those types
  -Handle files put to the server, routing the data appropriately.
  -Create the appropriate node(s) for the uploaded data
  -Map any metadata to the correct node fields
-The Parser Modules
  -Microsoft Office
    -Parse the .doc file type provided by various versions of Word.
    -Parse the .xls file type provided by various versions of Excel.
  -Open Document Support
    -Support the various file types under Open Document
    -Implement parsers for ODT, ODS, and possibly ODP
Styles to Support:
These should be the minimum styles supported by any parser:
-Lists with the appropriate formatting / styles (ul, ol, dl)
-Bold text (strong, b)
-Italicized text (em, i)
-Underlined text (u)
-Paragraphs and breaks
-Basic symbols: certain symbols should be supported (and encoded) by default (&, copyright, tm, etc.)
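The minimum style set above could map onto HTML roughly as follows. This is only a sketch in Python for illustration, not a proposed implementation: the style names (`bold`, `italic`, `underline`), the choice of `strong`/`em` over `b`/`i`, and all function names are assumptions. It assumes a parser has already split the document into text runs with a list of style names attached to each.

```python
import html
from typing import List, Tuple

# Hypothetical mapping from a parser's style names to HTML tags.
STYLE_TAGS = {"bold": "strong", "italic": "em", "underline": "u"}

# Symbols encoded as named entities, beyond what html.escape covers.
SYMBOL_ENTITIES = {"\u00a9": "&copy;", "\u2122": "&trade;"}

Run = Tuple[str, List[str]]  # (text, list of style names)


def encode_text(text: str) -> str:
    # Escape &, < and > first, then replace the extra symbols,
    # so the entities themselves are not escaped a second time.
    encoded = html.escape(text, quote=False)
    for symbol, entity in SYMBOL_ENTITIES.items():
        encoded = encoded.replace(symbol, entity)
    return encoded


def render_run(text: str, styles: List[str]) -> str:
    # Wrap the encoded text in one tag per recognised style.
    out = encode_text(text)
    for style in styles:
        tag = STYLE_TAGS.get(style)
        if tag:
            out = f"<{tag}>{out}</{tag}>"
    return out


def render_paragraph(runs: List[Run]) -> str:
    # A paragraph is a sequence of styled runs inside a <p> element.
    return "<p>" + "".join(render_run(t, s) for t, s in runs) + "</p>"


def render_list(items: List[str], ordered: bool = False) -> str:
    # Ordered lists become <ol>, everything else <ul>.
    tag = "ol" if ordered else "ul"
    body = "".join(f"<li>{encode_text(item)}</li>" for item in items)
    return f"<{tag}>{body}</{tag}>"
```

For example, `render_paragraph([("Tom & Jerry ", []), ("\u00a9", ["bold"])])` yields `<p>Tom &amp; Jerry <strong>&copy;</strong></p>`. Definition lists (dl) and line breaks are left out to keep the sketch short.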
Comments
I am interested in joining forces on this project
I am going to start working on a proposal for this idea. Anybody know of any good way to parse .doc files that would work on your average cheap shared server?
I don't know of a good way to do it out of the box on shared hosting, but there are a ton of resources for external libraries in the threads from last year. You might also want to contact Chris Bradford to find out how much he got done since maybe you can pick up where he left off. I know he's in SoC again this year, so he should be around.
I would love to see this project get off the ground again.
Mentors?
Anyone willing to be a mentor for this idea if my proposal is accepted?
I think I'm going to look into the Google Documents List Data API to see if that might be helpful in parsing. All the other solutions I've found for parsing the old Microsoft Office files would require permissions that are usually not available on cheap web hosting packages.
Yes, I am ready to mentor this
We plan to include the capability to upload multiple files and create nodes from them.
Willing to co-mentor
Dan DeGeest
Lead Software Developer
iMed Studios
http://www.imedstudios.com/labs
Dan DeGeest
Software Developer
Somewhere or Another