Project Information
Project page on d.o: http://drupal.org/project/autordf
Student: Tushar Mahajan (chia on d.o)
Mentor: Thomas Narres (narres on d.o)
Current Status
The module finds important words and phrases in node content and checks them against a list of stopwords. The test site is at http://gsoc.chia.in
The tagger test page is at http://gsoc.chia.in/autordf
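As an illustration of the approach described above, the following is a minimal sketch in Python, not the module's actual code. It ranks single words by frequency after dropping stopwords; phrase detection is left out. The function name `candidate_tags`, the tiny `STOPWORDS` set and the `top_n` parameter are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real list would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for", "on", "with"}


def candidate_tags(text, top_n=5):
    """Return the most frequent non-stopword terms in the given text."""
    # Lowercase the text and split it into alphanumeric tokens.
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Count tokens that are not stopwords and are longer than two characters.
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Keep only the terms, most frequent first.
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    body = "Drupal is a CMS. Drupal modules extend Drupal, and modules are fun."
    print(candidate_tags(body, top_n=2))
```

Raw frequency is only the simplest possible weighting; it is used here to keep the sketch short.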
Goals
1) An auto-tagging system that tags node content accurately and learns from existing tags.
2) Automatically finding relevant tags from different vocabularies and RDFing the content.
3) Finding relations between the different tags and building a taxonomy tree.
The end result should be a robust auto-taxonomy system that RDFs the content output by Drupal sites.
Project Schedule
Finding Tags within node
Applying Tags to node
Categorizing tags
Identifying Tags from Different Vocabularies
Learning from existing taxonomy tree
- Relation Between Different Tags using Association Rules (Roadmap)
Create a Taxonomy Tree
Assist users in Tagging (UI)
Increasing Accuracy of overall system
Named Entity Recognition System (basic version implemented using predefined lexicons and taxonomy learning)
- Using the Services module to expose the Drupal taxonomy entity system, which helps increase accuracy (Roadmap)
- Sentiment Analyzer using Part of Speech (POS) and predefined lexicons (Roadmap)
RDFing content and tags (very basic implementation done)
- Mapping found tags to the correct entity in dbpedia.org
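The roadmap item on relations between tags using association rules could look roughly like the sketch below. It is written in Python for illustration only and is not part of the module. It assumes each node is represented as a set of tag names, and computes support and confidence for ordered tag pairs. The function name `tag_rules` and the two threshold parameters are hypothetical.

```python
from collections import Counter
from itertools import permutations


def tag_rules(tagged_nodes, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for tag pairs."""
    n = len(tagged_nodes)
    single = Counter()  # number of nodes carrying each tag
    pair = Counter()    # number of nodes carrying both tags of an ordered pair
    for tags in tagged_nodes:
        tags = set(tags)
        single.update(tags)
        pair.update(permutations(sorted(tags), 2))
    rules = []
    for (a, b), count in pair.items():
        support = count / n             # share of all nodes with both a and b
        confidence = count / single[a]  # share of nodes with a that also have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


if __name__ == "__main__":
    nodes = [{"drupal", "php"}, {"drupal", "php"}, {"drupal", "rdf"}, {"php"}]
    for a, b, s, c in tag_rules(nodes):
        print(f"{a} -> {b}  support={s:.2f} confidence={c:.2f}")
```

A rule that holds with high confidence in one direction only (for example, nodes tagged "rdf" are almost always also tagged "drupal", but not the reverse) might be one hint for placing a tag under another in a taxonomy tree. That is a design assumption of this sketch, not something the project has confirmed.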
Other Ideas
- Using the search query log as an additional weighting mechanism