Bdragon's vision for doing locations "right" in Drupal

bdragon's picture

As discussed in IRC, September 10, 2007

A. Geocoding / Address Storage

We need a unified geocoding API that other modules can call into. Because nobody focuses on it, everybody sucks at it. There are a million and one different web services to do geocoding... Running a local TIGER geocoder is actually possible too, assuming you have a bit of spare disk space on your MySQL / PostgreSQL server.

The problem with geocoding is that addresses vary so much around the world. Some geocoding services have their own address parsing; for example, with Google you just send the whole address, not divided into “street number” or “zip / post code” or any of that, and it figures out how to parse it. But some services need a preparsed address.

So we’d have to come up with a reasonable set of fields to use internally. I’m not worried about the internals, I’m worried about the bridge to other modules. I think that should just start with two modes -- “unparsed gob of data” and “split into fields used by countries likely to have net access.” And worry about adding more fields as needed...
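A rough sketch of what those two modes could look like at the bridge, written in Python purely for illustration. The class names, the field list, and to_single_line() are all assumptions, not anything that exists in a Drupal module:

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class RawAddress:
    """Mode 1: an unparsed gob of data; the geocoding service parses it."""
    text: str


@dataclass
class ParsedAddress:
    """Mode 2: a small set of common fields (hypothetical list)."""
    street: str = ""
    city: str = ""
    region: str = ""
    postcode: str = ""
    country: str = ""
    # Room for fields added later, as needed.
    extra: Dict[str, str] = field(default_factory=dict)


Address = Union[RawAddress, ParsedAddress]


def to_single_line(address: Address) -> str:
    """Flatten either mode into one string, for services that do their own parsing."""
    if isinstance(address, RawAddress):
        return address.text.strip()
    parts = [address.street, address.city, address.region,
             address.postcode, address.country]
    # Skip empty fields so we do not emit stray separators.
    return ", ".join(p.strip() for p in parts if p.strip())
```

Going the other way (raw text to fields) is the hard, per-country part, which is why this sketch only flattens.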

In my opinion, it is critical that geocoding be a dedicated module, as it will be the part that knows all the details about a specific country's postal system (so there will be a rather large list of people with commit access, as there will no doubt be a lot of per-country tweaks and updates trickling in.)

Geocoding is so closely tied to address storage that I believe the geocoder module should handle address storage, retrieval, and query. (Query being "query of the address elements", not any sort of proximity query.)

What if we were to have the geocoder responsible for keeping track of what a country’s address input form looks like? So it could say “Oh, the USA... They use bla and bla, and a postcode field called ‘Zip Code’.” Geocoding is very tied to address parsing, input and storage... I think probably a good bet would be to take those “very common” fields we made when doing the initial geocoder API and give them their own columns... and have a serialized column for the uncommon ones...
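To make the "own columns plus one serialized column" idea concrete, here is a minimal sketch using Python's sqlite3 and json modules. The table name, column names, and COMMON_FIELDS list are assumptions for illustration only; a real module would use Drupal's schema and serialization:

```python
import json
import sqlite3

# Hypothetical "very common" fields that get their own columns.
COMMON_FIELDS = ("street", "city", "region", "postcode", "country")


def create_table(conn):
    conn.execute(
        "CREATE TABLE address ("
        "id INTEGER PRIMARY KEY, street TEXT, city TEXT, region TEXT, "
        "postcode TEXT, country TEXT, extra TEXT)"
    )


def save_address(conn, fields):
    """Store common fields in columns and everything else serialized in 'extra'."""
    common = [fields.get(name, "") for name in COMMON_FIELDS]
    extra = {k: v for k, v in fields.items() if k not in COMMON_FIELDS}
    cur = conn.execute(
        "INSERT INTO address (street, city, region, postcode, country, extra) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        common + [json.dumps(extra, sort_keys=True)],
    )
    return cur.lastrowid


def load_address(conn, address_id):
    """Rebuild the full field dictionary from columns plus the serialized extras."""
    row = conn.execute(
        "SELECT street, city, region, postcode, country, extra "
        "FROM address WHERE id = ?",
        (address_id,),
    ).fetchone()
    fields = dict(zip(COMMON_FIELDS, row[:5]))
    fields.update(json.loads(row[5]))
    return fields
```

The trade-off is that the common columns can be queried and indexed directly, while the uncommon fields can only be read back after loading a row.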

Note that I did not say "make it responsible for making the address form."

B. Geospatial

Geo module is for storage, query, and retrieval of geospatial data. It should not touch geocoding. A proper geospatial database can answer questions like, "Which nodes have points that fall inside the “Japan” polygon?" very quickly. Proper geospatial databases are currently PostGIS and Oracle Spatial. MySQL can answer some of these questions, but it only deals with bounding boxes, so results may need postprocessing to prune out false matches. MySQL 4.1 (iirc) and later come with “good enough for most people” geospatial support. The leader in the OSS world is PostGIS, a PostgreSQL extension that is commonly bundled with PostgreSQL. It’s very nice.
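The bounding-box limitation is easier to see with a small example. The following Python sketch (not how any database implements it; all function names are made up) does a cheap bounding-box pass first and then prunes the candidates with an exact point-in-polygon test using ray casting:

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bounding_box(point, box):
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Ray casting: count how many edges a ray going in +x from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_in_polygon(points, polygon):
    """Cheap bounding-box filter first, then the exact test as postprocessing."""
    box = bounding_box(polygon)
    candidates = [p for p in points if in_bounding_box(p, box)]
    return [p for p in candidates if in_polygon(p, polygon)]
```

For a triangle with vertices (0, 0), (10, 0), (0, 10), the point (9, 9) passes the bounding-box filter but is outside the triangle, which is exactly the kind of result the postprocessing step has to prune.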

Geo will be responsible for point / polygon / line storage and for views filters (proximity and such). Geo already handles most of this stuff; it doesn’t have a purty GUI yet, though. Geo should probably stay behind the scenes and handle the geospatial stuff, and not do other tasks like providing the data entry interface, which is my next point.

C. Data Entry

Data entry is one of the two "user facing" parts of the system. (The other is searching, which can be accomplished with Views' exposed filters.)

Currently, the only game in town is Location.

There are three ways to do data entry:
1) By geocoding an address
2) By drilling down on a map
3) Direct entry (for people who have GPS units, etc)

There should be a module dedicated to providing a usable GUI for this, and using the other pieces of the system to handle the behind the scenes work. If there's a *_ui.module in this plan, THIS is it.

1) Given a country, it would ask geocoder how to set up the address component fields. It would collect the data from the user, and feed it back into geocoder. It would take the result and display it on a map, in case the user wishes to tweak it.

2) We can store a hierarchical set of bounding boxes for continents / countries / administrative sections of countries, and let the user drill down far enough to find the location they want just by clicking on the map. gmap has routines built in for providing this sort of interface, but I wouldn’t want gmap itself to handle this. “Drilldown” is a generic thing gmap provides to others. I want this loosely coupled, because some people might want to use other mapping providers. The API for that would basically be “hey map, move and zoom so this bounding box is inside your viewport” and “This is the map, the user decided on these coordinates:”

3) Just lat/lon boxes with validation / conversion. Internal storage is decimal degrees, but some people might only have Degree/Minute/Second format coordinates, etc. (Well, INTERNAL storage is a format called WKB, but that’s entirely geo.module’s concern, and the other modules should not know or care about this.)
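For the direct-entry case, the validation / conversion step might look something like this Python sketch. The function names and the hemisphere-letter convention are assumptions made for illustration:

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert Degree/Minute/Second input to signed decimal degrees."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("degrees must be >= 0; minutes and seconds in [0, 60)")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Latitude tops out at 90 degrees, longitude at 180.
    limit = 90.0 if hemisphere in ("N", "S") else 180.0
    if value > limit:
        raise ValueError("coordinate out of range for hemisphere " + hemisphere)
    # South and west are negative in decimal degrees.
    return -value if hemisphere in ("S", "W") else value


def validate_lat_lon(lat, lon):
    """Check that decimal-degree input is a possible position on Earth."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    return lat, lon
```

For example, dms_to_decimal(40, 30, 0, "N") gives 40.5 and dms_to_decimal(73, 45, 0, "W") gives -73.75.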


So, to review: geocoder 1) provides special fields for countries and 2) translates the address into latitude/longitude via different hook-in modules. geo.module does things like proximity searches, in-area searches, etc. geo.module only cares about coordinates, and coordinates would be the “output product” of the geocoder.

D. Data Searching

The actual module that this all rotates around is Views. The other parts provide filters for things that are under their control.

geo.module provides all sorts of nifty views filters dealing with things like proximity or being inside a polygon / region / being to the north of another place / etc.

geocoder provides filters for searching postcodes / streets / etc.

E. Data visualization

GMap is definitely the best solution at the moment. However, I would be HIGHLY interested in having some competition in this department, PARTICULARLY if the competition is OpenLayers (http://openlayers.org/). OpenLayers has some very interesting parts that would be useful in the various other parts of the system. (For example, it can operate as a WFS-T client. See section G.)

F. GeoRSS / KML / GML

GeoRSS is geospatial metadata on RSS, and KML and GML are feature markup languages, IIRC. (GML is the most complex of the three.) I haven’t done too much research into this part, as I have been focusing on the other technologies. Say you blog about your travels; GeoRSS is a way to let people plot your posts on a map. All three are somewhat similar, and GeoRSS itself has multiple dialects IIRC. But you can’t ask GeoRSS for, say, “Only the points that fall within this polygon: xxxxxxx”

The BENEFIT to GeoRSS is it is very simple to implement in comparison to a full feature service (see G).

G. Web Feature Service

Web Feature Service is basically a server that answers geospatial questions. So, say your server provides a map layer that has all the fire hydrants in the city. Other people could pull that data (if you wished to share it) into a map that has layers from other places. Likewise, with a WFS client, you could say, pull in a layer containing line features of all the railways in the city. And then you could do analysis on these... Of course, this is something that especially appeals to researchers, but has other uses too. The nice thing about WFS is it’s a way to share your toys, and end up with something altogether more powerful.

Geodaniel had some VERY interesting ideas regarding adding a standards compliant feature server module to Drupal. http://www.dankarran.com/blog/archives/2006/07/27/drupal_as_a_wfs.php This would allow cool things like hooking a GIS client up to Drupal and editing things directly there. (This is assuming we add WFS-T support, which supports both reading and writing data)

Think BlogAPI for cartography.

Now, this would be a very difficult but very rewarding subproject, and if done right, Drupal would be THE geocms. This is a bit far off though, and depends on everything I discussed previously to be specced, written, and working, because it would draw on ALL of it. But an easy to use geospatial CMS with WFS support would be VERY appealing to people doing any geoscientific / geopolitical / geowhatever research, as, despite there being quite a bit of software available, there are few (no?) actual content management systems that have comprehensive geospatial support.

Comments

+1, will buy again

Boris Mann's picture

Where do I pay?

+1, love it

challer's picture

Great vision, what's the plan to tackle it?

+1 and some info on alternative data visualization

larzz's picture

really glad to see that you've taken this on and, on first read, have to say i like the way you're approaching this.

i'm really happy to see that you are seeking alternatives to gMap, in particular open source ones (as those close to me know already, gMap - especially when integrated with/into open source solutions - is a bit of a pet peeve of mine).

not sure if you are aware of Open Source MapGuide, http://mapguide.osgeo.org (Autodesk still have a proprietary version), but you may want to check it out.
it's currently in early development (v1.2 was recently released), but quite mature in feature set and usability.
i've been working with it since pre-1.0 releases (helping some with debugging and participating on the mailing list).
i have experience with a variety of open source mapping software, like UMN MapServer, OpenLayers, and KaMap. i'd have to say that MapGuide is the best i've seen so far.
however i haven't fully thought through how your vision and MapGuide might/would fit together.
i, too, have had similar thoughts, but haven't had the time or opportunity to fully develop some of the ideas i have, but one of those ideas is to integrate MapGuide with drupal in a way similar to how gMaps currently works (and probably additional features/capabilities).

(don't mean to toot my own horn, just want to give a bit of perspective of where i'm coming from.
i have several years of experience in many of the areas you've raised, including building tiger data, using postgresql and mysql (>v5.0) as datastores, creating a tiger-based geocoder and perl-based soap service, and data visualization (with the above-mentioned software) and will be giving a presentation on these topics (http://www.foss4g2007.org/presentations/view.php?abstract_id=246) at the FOSS4G conference at the end of september)

all that being said, great initial work and i'd love to help out with research and refining things.
i look forward to future discussions :)

I'll be there

bdragon's picture

I'm going to foss4g2007, so I'll see you there. :)

More documentation on what's in CVS now?

kreynen's picture

+1

I actually set up a project to port some geocoding code I wrote for OurTahoe.org's Places to utilize Geo's tables when I read about the module back in June (http://drupal.org/project/geocode), but I couldn't figure out where things were headed from the code available then and didn't get much of a response to the questions I asked.

I've installed the latest Geo code from CVS, but still need some explanation of what the Geo-Spatial Tables interface is supposed to do (do I need to configure them manually?) or what type of data belongs in the columns in the geo_link table (left_table, left_field, right_field, additional_fields).

I'd love to help with this, but I need more information about how Geo is supposed to work with MySQL and PostgreSQL.

A handbook page or Dojo screencast would be great, but I'd settle for some information in an Install.txt included with the module!

+1

mlncn's picture

What can be done to help?

~ ben melançon

member, Agaric Design Collective
http://AgaricDesign.com - "Open Source Web Development"

benjamin, agaric

Code, Docs, Money, Press

mfredrickson's picture

Code

Geo is pretty far along, but needs a lot of polish to make a 1.0 release. Originally, I had wanted to get some integration with existing shape tables in the 1.0 release, but I'd be willing to wait until later for that. I would rate my priorities for geo to be improving views integration (specifically distance queries) and better input/output methods. Right now you have to type in Well-Known Text (e.g. POINT(45 90) or POLYGON((45 90, 50 100, 45 60, 45 90)) -- yuck). We need some "draw on a map" style tools. Perhaps OpenLayers has something? (he says optimistically)
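Until there are "draw on a map" tools, a small helper could at least spare people from typing Well-Known Text by hand. WKT separates coordinate pairs with commas and repeats the first point at the end of a polygon ring. This is only a Python sketch with made-up function names, and it treats each pair as "x y" without caring which one is latitude:

```python
def _num(value):
    """Format a coordinate compactly, e.g. 45 -> '45', 45.5 -> '45.5'."""
    return f"{value:.10g}"


def point_wkt(x, y):
    return f"POINT({_num(x)} {_num(y)})"


def polygon_wkt(ring):
    """Build a single-ring POLYGON from (x, y) pairs, closing the ring if needed."""
    points = list(ring)
    # Drop an explicit closing point so we can count distinct vertices.
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    if len(points) < 3:
        raise ValueError("a polygon ring needs at least three distinct points")
    # WKT requires the first point to be repeated at the end of the ring.
    points.append(points[0])
    coords = ", ".join(f"{_num(x)} {_num(y)}" for x, y in points)
    return f"POLYGON(({coords}))"
```

Calling polygon_wkt([(45, 90), (50, 100), (45, 60)]) produces POLYGON((45 90, 50 100, 45 60, 45 90)).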

Docs

Kevin is right. The docs are nonexistent. A good first step might be collecting links to existing PostGIS and MySQL Spatial tutorials. Further work can be done to explain what the geo module actually does.

Money

Any billionaire playboys looking to fund this work, please contact me.

Um. No one? Ok, how about we all apply for Knight grants. I think Benjamin knows a thing or two about those. :-)

The more applications, the more likely it is that one will be accepted. Plus, since geo is so general, it would fit with many applications and projects.

Press

Talk, blog, shout about it. Keep up the chatter, we're sure to get someone's attention.

Code...

bdragon's picture

I can make a simplistic feature creator out of the current GMap macro creator code if you like. It can do points and polys, and can compute circle polys at the moment. There are other features I need to roll into it too, anyway.

Do you think this is worthwhile?

Update...

bdragon's picture

I have started on the openlayers module. It has some very powerful feature creation tools, and can speak WKT natively, amongst many other formats. It's rather elegant.

Billionaire playboys

Boris Mann's picture

Well, if an estimate can be made for a certain bundle of features, we can do a bounty. Having companies support / fund the goal of "Making Drupal the first GeoCMS" is something that in all likelihood will be very attractive. Sun? Google? Yahoo? IBM? Some sort of geo companies? Customers looking for this functionality?

All is possible. A strong roadmap and availability of resources to do everything from code to doc to demo is very attractive.

Starting a "geo" install profile might be very interesting...

some code at least, oh, and press, oh and money

greggles's picture

I'd like to use the geo module as a backend for a geocoding service module that I'm building and need to see a 1.0 release. I've built "yet another lat/lon module" and would really be happy deleting it but if geo is not in a 1.0 state then I need to point people to my lackluster implementation.

For better coordination can we use the issue queue? If you can point to specific "must be done by 1.0 issues" then I'd be happy to help work on them.

And yes, when the geocoding module is done I'll be doing my best on Press. I could even kick a small amount of money towards the module to get some happy views integration, but I'd need to be able to say "this money is a bounty for issue nodes X, Y, Z".

Also, what about creating a 5.x-1.x-dev release and a roadmap on the homepage? I had no idea that deep in the CVS for this module there lurked a CCK field.

Thanks,
Greg

--
Knaddisons Denver Life | mmm Chipotle Log | The Big Spanish Tour

Stable 1.0 release

beeradb's picture

I'm in the exact same boat as Greg. I'm developing my own mapping module, which I was planning to release but would be willing to abandon for a more comprehensive solution. I too would be willing to donate some time to this project if we could get an issue queue going so we know what needs to be completed for a 1.0 release.

Anyway, keep us informed if you can:)

Thanks,
brad

Solidarity

mlncn's picture

Greg, Beeradb, is your code public anywhere?

Mostly, I just want to second everything here. Looking towards working together. (My mess is here.)

Also, I was thinking the roadmap could be made a wiki and linked from the group description? We can then link plans, related approaches, bounties in a central, fairly high-profile spot.

Speaking of bounties, People Who Give a Damn, Inc. just received its 501c3 and could process contributions for this sort of work as tax-deductible donations; it's completely in line with the technology-for-bringing-people-together mission.

ben, Agaric Design Collective


my code

greggles's picture

Yes - I have committed (but not created a project for) my cck lat/lon module but, as I said, I'd really like to delete it from the repository to help solidify geo as the main module. Just today I see http://drupal.org/project/geobrowser which seems a lot like yet-another-gmaps module.

I've built two things:
1) A module to integrate a geocoding service that will take addresses (initially from the cck_address module, but it's built in an extensible way) and turn them into lat/lon, and then store them somewhere (somewhere currently being......
2) cck_latlon field which provides a cck field that holds lat/lon. It's a very naive implementation - just two numbers in a table. I don't think it even validates that they are possible lat/lon numbers. I needed that to build the geocoding integration module and didn't realize that the "geo" module already provided a cck field for points until I was already done with that work.

The geocoding integration module is still being reviewed by the client so it is not ready to be released yet. Most of the code in it, though, is an administration/configuration screen and code specific to the XMLRPC web service that it uses.

Greg


Coming Soon...

beeradb's picture

I've had a few delays and had to pull off the project temporarily. I've restarted work and should have something in a couple of days. I'll be sure to post it here when it's done.

I'll be happy to help with

catch's picture

I'll be happy to help with testing alpha code and 0.x releases. I'm only really interested in (and more importantly able to understand at all!) data entry, google map + views integration but will do what I can.

+++HUGE PLUS ONE+++

bcn's picture

I'm so happy to see a detailed and coherent plan such as this for getting some better geo support into drupal...

With regards to "Part C (Data Entry)", the following links provide some reference to related issues:

(others that are tangential, but interesting)
- Exposed filter selector and term hierarchy- http://drupal.org/node/54365

GeoNames support

ChrisBryant's picture

This is a great writeup, thanks. Count me in. :-)

It would also be great to support GeoNames. There is some great data there that can be accessed and integrated. The website is here: http://www.geonames.org/

There are some 18 web services that you can see here: http://www.geonames.org/export/

It looks like a lot of work has already been done to integrate this with Drupal: http://geonames.edesign.no/

Maybe you will be able to collaborate together on it.

Chris

ALIAN DESIGN

That might also be of

Frando's picture

That might also be of interest: http://drupal.org/project/geonames

It's a Drupal module which seems to be mainly a wrapper around the public geonames webservice API.

Edit: I just realized that this is the code that drives the drupal demo site that you mentioned in your post.

I drafted code for a PHP WMS/WFS awhile back

nedjo's picture

http://sourceforge.net/projects/servicegeo/

code is at:

http://servicegeo.cvs.sourceforge.net/servicegeo/

written as a PEAR package. I wrote it to enable both web map service (WMS) and web feature service (WFS) support, though I only coded the WMS portion.

I included support for data stored in Drupal--that was kinda my whole aim. (For map rendering, I forked the existing PEAR Image_GIS package, after not getting response from the maintainers.)

Didn't pursue the project after I ended up leaving the NGO where I was working. But it could be picked up again. I did at least map out solutions to many of the issues of providing a geographic web service in PHP (responding to capabilities requests, error handling, etc.).

As far as the WFS side, I

tmcw's picture

As far as the WFS side, I wrote the WFS module, which is basically looking for a maintainer. It works quite well for a bunch of use cases, but doesn't have dual support for 1.0 and 1.1, and I've abandoned WFS out of misgivings about the OGC standards and WFS in particular.

Tom, could you elaborate more

R.J. Steinert's picture

Tom, could you elaborate more on your misgivings and/or talk about what standards you are favoring?

Sure: First, caveats, since

tmcw's picture

Sure:

First, caveats, since obviously I'm just one guy with specific usecases and no need for certain properties of a standard or compatibilities. For people who need gold-star government-approved standards which are compatible with ArcGIS and GeoServer, WFS is a godsend.

So, first off: the WFS standard, like the rest of the OGC standards, is behind a sign-in wall and a PDF that is so long and obtuse and completely examples-empty that it's easier to hack an implementation around an existing implementation, like GeoServer, than it is to read the standards document. This way of managing standards is appalling. The WFS module was written by hacking around GeoServer, rather than reading the standards, and I think it could have been way better if the standards were well-written. Ironically, support for WFS in open source is weak - GeoServer's WFS client is pretty much unfinished and unmaintained, though the WFS server is decent. OpenLayers has okay support, QGIS supports one version. But all of these seemed to have quirks, possibly as a result of the standard being unreadable.

Next: the standard itself is WMS-era. As in, it's reliant on lots of GET requests with lots of XML and lots of client and server architecture. It's basically painful and annoying to support. There's 'raw data' coming in raw headers and, big surprise, it's XML that you're posting. WFS-T is not widely implemented. Even OpenLayers, the god of supporting everything, has weak support that's unpatched in core.

Next: permissions are poorly defined, even within GeoServer. Same with multilingual. 'Vendor extensions' to the protocol are painful. GML as a wrapped data format is just as verbose as it is unwrapped, and the fact that different versions of WFS require different versions of GML is annoying. GeoJSON is far less annoying in every way, and has broad support via OGR.

Finally, my opinion after working with GeoServer as an external renderer for Drupal data is that WFS is not a real protocol for syncing data - it's decent for querying, but for large datasets, the expansion of data due to GML encoding and the inability of clients to properly detect when data is updated (or get just updated data) makes it a nonstarter.

Thus, for live rasterization, I prefer GeoJSON as an output format, which is then rendered with Mapnik with TileLive and interaction data is provided by TileLive. This scales down much better than a full WFS solution, in my humble opinion, and does data management better by allowing data to live at the source rather than being synced from place to place.
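For comparison with GML, this is roughly what GeoJSON output for a set of nodes could look like. It is a Python sketch only; the node dictionaries and their keys (nid, title, lat, lon) are hypothetical stand-ins for Drupal data. Note that GeoJSON puts longitude before latitude in its coordinate arrays:

```python
import json


def node_to_feature(node):
    """Turn one hypothetical node record into a GeoJSON Feature."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON coordinate order is [longitude, latitude].
            "coordinates": [node["lon"], node["lat"]],
        },
        "properties": {"nid": node["nid"], "title": node["title"]},
    }


def nodes_to_feature_collection(nodes):
    """Wrap all nodes in a FeatureCollection, ready for json.dumps()."""
    return {
        "type": "FeatureCollection",
        "features": [node_to_feature(node) for node in nodes],
    }


if __name__ == "__main__":
    sample = [{"nid": 1, "title": "Example place", "lat": 39.74, "lon": -104.99}]
    print(json.dumps(nodes_to_feature_collection(sample), indent=2))
```

The whole document is plain dictionaries and lists, so there is no wrapper format to version separately.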

Awesome summary! Thank you :)

R.J. Steinert's picture

Awesome summary! Thank you :)

A. Geocoding / Address Storage

andremolnar's picture

You seem to lump two things in the title - Geocoding and Address Storage

These have to be decoupled. Geocoding is an action that takes place (e.g. I've got this [partial] address, now get me a coordinate). What I do with that point (e.g. store it) after I get it is a whole other action. Where and how I get an address are also totally different actions.

There are tons of geocoding options out there, but I don't know if you can or want to create one (The?) Drupal Geocoding API module to unite them all. But we can have several geocoding modules happily co-exist by trying to impose rules on them. Each geocoding module should be able to stand on its own without any dependencies (besides the service that generates the coordinate and drupal core). Each geocoding module should simply expose an API - data in - data out. Each geocoding module (via its API) knows exactly what format it wants its information in - and defines exactly what format it's going to give its result(s) in.

It would then be the job of a good location module to do two major things:
Address storage/retrieval. (put address fields into the database e.g. country, province, sub-boundary, sub-sub-boundary, postal code, street address)
Address field validation (which itself should rely on other external modules (e.g. a German postal code validator or Reverse Geocoding))

Then it would be up to other modules to decide which geocoding API to use in different situations (during core Drupal API / hook calls) depending on how much address information they had or which country they were dealing with or some other criteria (e.g. geocoding module a is best if I only have a city in the US, module b is best for addresses in Hungary, module c is a good catch-all, etc.).
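As a sketch of that "data in - data out" arrangement, each geocoder below is just a standalone function, and a separate piece decides which one to call for a given country. This is illustrative Python, not a Drupal API: the registry class, the stub geocoders, and the coordinates they return are all placeholders, and nothing here talks to a real service.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)
Geocoder = Callable[[str], Optional[Coordinates]]


class GeocoderRegistry:
    """Picks a country-specific geocoder first, then a catch-all."""

    def __init__(self) -> None:
        self._by_country: Dict[str, Geocoder] = {}
        self._fallback: Optional[Geocoder] = None

    def register(self, geocoder: Geocoder, country: Optional[str] = None) -> None:
        if country is None:
            self._fallback = geocoder
        else:
            self._by_country[country.upper()] = geocoder

    def geocode(self, address: str, country: str) -> Optional[Coordinates]:
        specific = self._by_country.get(country.upper())
        for geocoder in (specific, self._fallback):
            if geocoder is not None:
                result = geocoder(address)
                if result is not None:
                    return result
        return None


def hungary_stub(address: str) -> Optional[Coordinates]:
    """Stand-in for a Hungary-specific module; the values are placeholders."""
    table = {"budapest": (47.4979, 19.0402)}
    return table.get(address.strip().lower())


def catch_all_stub(address: str) -> Optional[Coordinates]:
    """Stand-in for a general-purpose module; returns a dummy point."""
    return (0.0, 0.0) if address.strip() else None
```

Neither stub knows about the other or about the registry, which is the point: the selection criteria live in the calling module, not in the geocoders.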

All in all, I would like maximum flexibility. And ideally, I want highly specialized modules that each do one thing really well - and can all work together to do amazing things (i.e. where the whole is greater than the sum of parts).

andre

I second this, especially the

weder's picture

I second this, especially the address storage part.

There's a need for storing and dealing with international addresses from the geocoding side alright, but also in general. I do have the requirement to be able to show international addresses properly formatted the way it's done in their respective countries (something the Addresses module provides, for example), and I would not want to go to the geocoding module for this. I'd treat address management, display, import, export, etc. separately in its own module. We could have a look at Addresses and/or Postal for a start.

I would have a look at

Adam S's picture

I would have a look at OpenLayers, OpenLayers Geocode and OpenLayers Proximity.

Marine job board with Drupal 7 at http://windwardjobs.com

Another thought on geocoding

andremolnar's picture

As I tinker with the Google Maps API's geocoding feature: If we do store geocoding data it would be nice to know the level of accuracy.
For example:
http://www.google.com/apis/maps/documentation/reference.html#GGeoAddress...

andre

+1e0.25*pi (i'm not all there, but getting around to it)

psi-borg's picture

i can cheerlead and do virtual backflips over this stuff... was wondering though, any chance there could be born a drupal wms for geocoding and layers and stuff, as a central repository?

Data Entry

joe.murray's picture

I'm going to provide a specific use case, in case that helps focus design / tools / architecture.

I have a potential client who could use some geospatial functionality to help match providers with consumers. The application is trying to determine what programs, services, incentives, etc. an individual or business would qualify for. Providers of the programs and services might be businesses, NGOs, governments, etc. Expected areas would include polygons corresponding to municipality, conservation area, ecological region, radius around a business location, perhaps the whole province which the website serves.

The providers need to be able to easily specify areas they serve. Then the users' point location would be used to find relevant providers.

A single set of bounding boxes is insufficient since some providers are municipal jurisdictions, while others are ecologically specified areas that may intersect but not completely contain the municipality. As a result, it would be better to allow more than one map layer to provide a geofilter at the same time.

In terms of data entry tools, it's likely useful to allow for a map-based selection of spatial areas since many users will not know the names of the relevant municipalities in the region they want to serve. A map of the province might be made available with several layers so that providers could specify their area in whichever way is most appropriate for them. Again, one layer might be drainage areas of each river system, another would be by municipality, another by conservation authority, another by ecologically sensitive terrain.... Selecting entries in a multi-select box would highlight those areas on the map, or selecting on the map would highlight (or un-highlight) the entries in the multi-select boxes. Tools would need to allow regions for the current layer to be selected or deselected by clicking or by drawing a rectangle to select, say, all polygons within it, or perhaps all that intersect with it, with perhaps a control key for adding to a selection. Maybe a free-form polygon tool could be used to delineate an area, but I think that is likely presuming a bit more than most users will be able to do. Similarly, allowing selections from multiple layers is likely getting a bit too complicated.

HTH.

Joe Murray


bump

frankcarey's picture

Frank Carey
TwelveGrove Drupal Development
http://www.twelvegrove.com

Location and Mapping
