Question for event search engine

mimilamite

Hi

I've been looking for some time for a system to build my project:
an event search engine. Quite a common thing.
But here in France, very few event sources offer RSS, XML, or anything else that
permits event syndication.

Here's the idea: index given websites, or even scan paper programs with
OCR, and feed the text into the engine.
The engine then works out by itself the start date, the end date, where the event
takes place (city name), and the title and description of the event.
Then it builds a list of these events for people who want to go out.
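
To make the extraction step concrete, here is a minimal sketch in Python of pulling those fields out of plain text. Everything in it is an assumption for illustration: the function name `extract_event`, the tiny `KNOWN_CITIES` list (a real system would need a proper gazetteer), the rule that the first non-empty line is the title, and the two French date patterns it recognizes ("du 12 au 14 juillet 2012" and "3 mars 2012"). Real programs will contain many more date formats than this.

```python
import re
from datetime import date

# French month names mapped to month numbers.
MONTHS = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4, "mai": 5, "juin": 6,
    "juillet": 7, "août": 8, "septembre": 9, "octobre": 10,
    "novembre": 11, "décembre": 12,
}

# Hypothetical gazetteer; a real one would list every city of interest.
KNOWN_CITIES = ["Toulouse", "Albi", "Castres"]

_MONTH_ALT = "|".join(MONTHS)
# Matches ranges such as "du 12 au 14 juillet 2012" or "du 1er au 3 mars 2012".
RANGE_RE = re.compile(
    rf"du\s+(\d{{1,2}})(?:er)?\s+au\s+(\d{{1,2}})\s+({_MONTH_ALT})\s+(\d{{4}})",
    re.IGNORECASE,
)
# Matches single dates such as "3 mars 2012" or "1er mai 2012".
SINGLE_RE = re.compile(
    rf"(\d{{1,2}})(?:er)?\s+({_MONTH_ALT})\s+(\d{{4}})",
    re.IGNORECASE,
)


def _make_date(day, month_name, year):
    """Build a date, or return None if the parts do not form a valid date."""
    try:
        return date(int(year), MONTHS[month_name.lower()], int(day))
    except ValueError:
        return None


def extract_event(text):
    """Guess title, description, dates and city from one event announcement."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0] if lines else ""
    description = " ".join(lines[1:])

    start = end = None
    m = RANGE_RE.search(text)
    if m:
        first_day, last_day, month, year = m.groups()
        start = _make_date(first_day, month, year)
        end = _make_date(last_day, month, year)
    else:
        m = SINGLE_RE.search(text)
        if m:
            start = end = _make_date(*m.groups())

    # First known city mentioned anywhere in the text, if any.
    city = next(
        (c for c in KNOWN_CITIES
         if re.search(rf"\b{re.escape(c)}\b", text, re.IGNORECASE)),
        None,
    )
    return {"title": title, "description": description,
            "start": start, "end": end, "city": city}
```

Calling `extract_event` on a two-line announcement returns a dictionary whose `start` and `end` values are `datetime.date` objects, or `None` when no supported date pattern is found, so entries the rules could not parse are easy to set aside for manual review.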

Would you recommend using Drupal/Nutch for this? (I intend to hook a
community script behind it.)

Thanks from Toulouse, France.
Mitia NOTARAS

www.gite-sidobre.com

Comments


cpliakas

Drupal would most likely be used as the search UI rather than as the underlying engine for what you are looking to do. A Solr / Nutch solution sounds like your best bet; however, the OCR piece seems like it will be tricky. Apache Solr does integrate with the Apache Tika project to extract text from things like Word documents, PDFs, etc. However, if the papers are images or scanned documents, then OCR might be necessary. OCR is very processor-intensive and tricky to get right, so it would be good to evaluate the documents you plan on indexing and determine whether OCR would provide enough value to warrant the effort.
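
One rough way to do that evaluation is to run ordinary text extraction first and flag only the documents that come back nearly empty. The sketch below assumes you already have the extracted text and a page count for each document (for example from a Tika run); the function name `needs_ocr` and the threshold of 100 characters per page are illustrative guesses, not values from Solr or Tika.

```python
def needs_ocr(extracted_text: str, page_count: int,
              min_chars_per_page: int = 100) -> bool:
    """Heuristic: a document with very little extractable text per page
    is probably a scan and would need OCR to be indexed usefully."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    # Count only non-whitespace characters so blank padding is ignored.
    visible_chars = sum(1 for ch in extracted_text if not ch.isspace())
    return visible_chars / page_count < min_chars_per_page
```

Running this over a sample of the programs you plan to index gives a quick estimate of what share would actually require OCR before you commit to that part of the pipeline.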

Lucene, Nutch and Solr
