Auto Taxonomy generation - Wiki page


Project Information

Project page on d.o: http://drupal.org/project/autordf
Student: Tushar Mahajan (chia on d.o)
Mentor: Thomas Narres (narres on d.o)

Current Status

The module finds important words and phrases in node content and checks them against a list of stopwords. The test site is at http://gsoc.chia.in
The tagger test page is at http://gsoc.chia.in/autordf
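
As a rough illustration of the idea described above (this is not the module's code, which is a Drupal module), here is a minimal Python sketch that counts words and two-word phrases after dropping stopwords. The stopword list, the function name candidate_tags and the frequency-based ranking are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only;
# the module's own stopword list is not reproduced here.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "to", "for", "on", "it", "with"}


def candidate_tags(text, top_n=5):
    """Return up to top_n (term, count) pairs for words and two-word phrases."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()

    # Single words that are not stopwords.
    for word in words:
        if word not in STOPWORDS:
            counts[word] += 1

    # Adjacent word pairs in which neither word is a stopword.
    # Note: this simple version does not stop at sentence boundaries.
    for first, second in zip(words, words[1:]):
        if first not in STOPWORDS and second not in STOPWORDS:
            counts[first + " " + second] += 1

    return counts.most_common(top_n)


if __name__ == "__main__":
    body = "Drupal taxonomy is great. The Drupal taxonomy module tags content."
    for term, count in candidate_tags(body, top_n=3):
        print(term, count)
```

A real tagger would likely weigh terms by more than raw frequency (for example title vs. body position), but the stopword check works the same way.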

Goals

1) An auto-tagging system that tags node content accurately and learns from existing tags.
2) Automatically finding relevant tags related to different vocabularies and RDFing the content.
3) Finding relations between the different tags and building a taxonomy tree.
The end result should be a robust auto-taxonomy system that also RDFs the content output by Drupal sites.

Project Schedule

  • Finding Tags within node
  • Applying Tags to node
  • Categorizing tags
  • Identifying Tags from Different Vocabulary
  • Learning from existing taxonomy tree
  • Relation Between Different Tags using Association Rules (Roadmap)
  • Create a Taxonomy Tree
  • Assist users in Tagging (UI)
  • Increasing Accuracy of overall system
  • Named Entity Recognition System (basic version implemented using predefined lexicons and taxonomy learning)
  • Using the Services module to expose the Drupal taxonomy entity system, which should help increase accuracy (Roadmap)
  • Sentiment Analyzer using Part of Speech (POS) tagging and predefined lexicons (Roadmap)
  • RDFing content and tags (very basic implementation done)
  • Mapping found tags to correct entity in dbpedia.org
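
The "Relation Between Different Tags using Association Rules" item is still on the roadmap, so the following is only a sketch of how such rules might be derived from tags that occur together on nodes. It computes the usual support and confidence measures for tag pairs; the function name tag_rules, the sample data and the thresholds are assumptions for illustration.

```python
from collections import Counter
from itertools import permutations


def tag_rules(node_tags, min_support=0.5, min_confidence=0.8):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    node_tags is a list of tag lists, one list per node.
    support    = share of nodes carrying both tags
    confidence = share of nodes with the antecedent that also carry the consequent
    """
    total = len(node_tags)
    single = Counter()
    pair = Counter()

    for tags in node_tags:
        unique = set(tags)  # count each tag once per node
        single.update(unique)
        # Ordered pairs, so both (a, b) and (b, a) are counted.
        pair.update(permutations(sorted(unique), 2))

    rules = []
    for (antecedent, consequent), count in pair.items():
        support = count / total
        confidence = count / single[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Made-up sample data, not taken from the test site.
    nodes = [["drupal", "cms"], ["drupal", "cms", "php"], ["php"], ["drupal", "cms"]]
    for rule in tag_rules(nodes):
        print(rule)
```

Rules that hold in one direction only could serve as candidates for parent/child links when building the taxonomy tree, though that mapping is itself an assumption and would need checking against real data.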

Other Ideas

  • Using search query log as additional weighing mechanism

Google Summer of Code 2010
