As discussed in IRC, September 10, 2007
- A. Geocoding / Address Storage
- B. Geospatial
- C. Data Entry
- D. Data Searching
- E. Data Visualization
- F. GeoRSS / KML / GML
- G. Web Feature Service
We need a unified geocoding API that other modules can call into. Because nobody focuses on it, everybody sucks at it. There are a million and one different web services to do geocoding... Running a local TIGER geocoder is actually possible too, assuming you have a bit of spare disk space on your MySQL / PostgreSQL server.
The problem with geocoding is that addresses vary so much around the world. Some geocoding services have their own address parsing; for example, with Google you just send the whole address, not divided into “street number” or “zip / post code” or any of that, and it figures out how to parse it. But some services need a preparsed address.
So we’d have to come up with a reasonable set of fields to use internally. I’m not worried about the internals, I’m worried about the bridge to other modules. I think that should just start with two modes -- “unparsed gob of data” and “split into fields used by countries likely to have net access” -- and we can worry about adding more fields as needed...
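A minimal sketch of what that two-mode bridge could look like, in Python for brevity rather than as actual module code. Every name here (`ParsedAddress`, `DictBackend`, `geocode`, `needs_parsed`) is hypothetical, and the field list is only a guess at the “reasonable set”:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple, Union


@dataclass
class ParsedAddress:
    """Mode 2: address split into a small set of common fields (assumed set)."""
    street: str = ""
    city: str = ""
    region: str = ""
    postcode: str = ""
    country: str = ""
    extra: Dict[str, str] = field(default_factory=dict)  # uncommon fields

    def as_single_line(self) -> str:
        # Flatten to the "unparsed gob" form for services that parse on their own.
        parts = [self.street, self.city, self.region, self.postcode, self.country]
        return ", ".join(p for p in parts if p)


# Mode 1 is simply a plain string.
Address = Union[str, ParsedAddress]


class DictBackend:
    """Toy stand-in for a web service that accepts a single-line address."""
    needs_parsed = False

    def __init__(self, table: Dict[str, Tuple[float, float]]):
        self.table = table

    def lookup(self, query: str) -> Optional[Tuple[float, float]]:
        return self.table.get(query)


def geocode(address: Address, backend) -> Optional[Tuple[float, float]]:
    """Hand the address to the backend in whichever mode it expects."""
    if backend.needs_parsed:
        if isinstance(address, str):
            # We cannot invent a parse; the caller has to supply fields.
            raise ValueError("backend requires a preparsed address")
        return backend.lookup(address)
    if isinstance(address, ParsedAddress):
        address = address.as_single_line()
    return backend.lookup(address)
```

The point of the sketch is that calling modules only ever see `geocode()`; which service sits behind it, and which mode that service wants, stays the geocoder's problem.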
In my opinion, it is critical that geocoding be a dedicated module, as it will be the part that knows all the details about a specific country's postal system (so there will be a rather large list of people with commit access, as there will no doubt be a lot of per-country tweaks and updates trickling in.)
Geocoding is so closely tied to address storage that I believe the geocoder module should handle address storage, retrieval, and query. (Query being "query of the address elements", not any sort of proximity query.)
What if the geocoder were responsible for keeping track of what a country’s address input form looks like? So it could say, “Oh, the USA... They use bla and bla, and a postcode field called ‘Zip Code’.” Geocoding is very tied to address parsing, input and storage... I think probably a good bet would be to take those “very common” fields we made when doing the initial geocoder API and give them their own columns, and have a serialized column for the uncommon ones.
Note that I did not say "make it responsible for making the address form."
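To make the “own columns plus one serialized column” storage idea concrete, here is a rough sketch in Python. The column names in `COMMON_FIELDS` are an assumption, not an agreed list, and JSON stands in for whatever serialization the real module would use:

```python
import json

# Assumed set of "very common" fields that get real columns.
COMMON_FIELDS = ("street", "city", "region", "postcode", "country")


def to_row(address: dict) -> dict:
    """Split an address into common columns plus one serialized 'extra' column."""
    row = {name: address.get(name, "") for name in COMMON_FIELDS}
    extras = {k: v for k, v in address.items() if k not in COMMON_FIELDS}
    row["extra"] = json.dumps(extras, sort_keys=True)
    return row


def from_row(row: dict) -> dict:
    """Rebuild the address from a stored row, dropping empty common columns."""
    address = {name: row[name] for name in COMMON_FIELDS if row[name]}
    address.update(json.loads(row["extra"]))
    return address
```

The common columns stay queryable (postcode searches and so on), while country-specific oddities ride along in `extra` without needing schema changes.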
The Geo module is for storage, query, and retrieval of geospatial data. It should not touch geocoding. A proper geospatial database can answer questions like, “Which nodes have points that fall inside the ‘Japan’ polygon?” very quickly. Proper geospatial databases are currently PostGIS and Oracle Spatial. MySQL can answer some of these questions, but it has limitations that can require postprocessing to prune out results (it only deals with bounding boxes). MySQL 4.1 (iirc) and later come with “good enough for most people” geospatial support. The leader in the OSS world is PostGIS, a PostgreSQL extension that is commonly bundled with PostgreSQL; it’s very nice.
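The bounding-box limitation is easier to see in code. The sketch below, in Python, first filters by bounding box (roughly the part the text attributes to MySQL) and then prunes with an exact point-in-polygon test. It assumes planar x/y coordinates and a simple polygon; the function names are made up, and ray casting is just one common way to do the exact test:

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(point, bbox):
    x, y = point
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_in_polygon(points, polygon):
    """Cheap bounding-box pass first, then prune the false positives."""
    bbox = bounding_box(polygon)
    candidates = [p for p in points if in_bbox(p, bbox)]
    return [p for p in candidates if in_polygon(p, polygon)]
```

With a triangle `[(0, 0), (4, 0), (0, 4)]`, the point `(3, 3)` passes the bounding-box pass but is pruned by the exact test, which is the kind of postprocessing meant above.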
Geo will be responsible for stuff like point / polygon / line storage and the Views filters for proximity and such. Geo already handles most of this stuff; it doesn’t have a purty GUI yet, though. Geo should probably stay behind the scenes and handle the geospatial stuff, and not do other tasks like providing the data entry interface, which is my next point.
Data entry is one of the two "user facing" parts of the system. (The other is searching, which can be accomplished with Views' exposed filters.)
Currently, the only game in town is Location.
There are three ways to do data entry:
1) By geocoding an address
2) By drilling down on a map
3) Direct entry (for people who have GPS units, etc)
There should be a module dedicated to providing a usable GUI for this, one that uses the other pieces of the system to handle the behind-the-scenes work. If there's a *_ui.module in this plan, THIS is it.
1) Given a country, it would ask geocoder how to set up the address component fields. It would collect the data from the user, and feed it back into geocoder. It would take the result and display it on a map, in case the user wishes to tweak it.
2) We can store a hierarchical set of bounding boxes, put together a list of bounding boxes for continents / countries / administrative sections of countries, and let the user drill down far enough to find the location they want just by clicking some more on the map. gmap has routines built in for providing this sort of interface, but I wouldn’t want gmap itself to handle this. “Drilldown” is a generic thing gmap provides to others. I want this loosely coupled, because some people might want to use other mapping providers for this. The API for that would basically be “hey map, move and zoom so this bounding box is inside your viewport” and “This is the map, the user decided on these coordinates:”
3) Just lat/lon boxes with validation / conversion. Internal storage is decimal degrees, but some people might only have Degree/Minute/Second format coordinates, etc. (Well, INTERNAL storage is a format called WKB, but that’s entirely geo.module’s concern and the other modules should not know or care about this.)
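For option 3, the validation / conversion step might look something like the following Python sketch. The function name and the choice to take a hemisphere letter instead of signed degrees are assumptions made for illustration:

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert Degree/Minute/Second input to signed decimal degrees.

    hemisphere is one of N, S, E, W; S and W give negative values.
    Raises ValueError on out-of-range input.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("degrees must be >= 0; minutes and seconds in [0, 60)")

    value = degrees + minutes / 60.0 + seconds / 3600.0

    # Latitude tops out at 90 degrees, longitude at 180.
    limit = 90.0 if hemisphere in ("N", "S") else 180.0
    if value > limit:
        raise ValueError("coordinate out of range for hemisphere " + hemisphere)

    return -value if hemisphere in ("S", "W") else value
```

For example, `dms_to_decimal(40, 30, 0, "N")` gives `40.5` and `dms_to_decimal(73, 45, 0, "W")` gives `-73.75`. Whatever the UI accepts, only the decimal value would be handed on to geo.module.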
So, to review: geocoder 1) provides special fields for countries and 2) translates the address into latitude/longitude via different hook-in modules. geo.module does things like proximity searches, in-area searches, etc. geo.module only cares about coordinates, and coordinates would be the “output product” of the geocoder.
The actual module that this all rotates around is Views. The other parts provide filters for things that are under their control.
geo.module provides all sorts of nifty Views filters dealing with things like proximity or being inside a polygon / region / being to the north of another place / etc.
geocoder provides filters for searching postcodes / streets / etc.
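As an illustration of what a proximity filter boils down to, here is a small Python sketch using the haversine great-circle formula. This is only one way to do it (a spatial database would normally do this work itself), and the names `haversine_km` and `within_radius` and the node structure are hypothetical:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(nodes, center, radius_km):
    """Keep the node ids whose (lat, lon) lies within radius_km of center.

    nodes maps a node id to a (lat, lon) tuple; center is a (lat, lon) tuple.
    """
    c_lat, c_lon = center
    return [
        nid for nid, (lat, lon) in nodes.items()
        if haversine_km(c_lat, c_lon, lat, lon) <= radius_km
    ]
```

An exposed filter would then just collect `center` and `radius_km` from the user and hand them to something like `within_radius`.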
GMap is definitely the best solution at the moment. However, I would be HIGHLY interested in having some competition in this department, PARTICULARLY if the competition is OpenLayers. http://openlayers.org/ OpenLayers has some very interesting parts that would be useful in the various other parts of the system. (For example, it can operate as a WFS-T client. See section G.)
GeoRSS is geospatial metadata on RSS, and KML and GML are feature markup languages, IIRC. (GML is the most complex of the three.) I haven’t done too much research into this part, as I have been focusing on the other technologies. Say you blog about your travels; GeoRSS is a way to let people plot your posts on a map. All three are somewhat similar, and GeoRSS itself has multiple dialects IIRC. But you can’t ask GeoRSS for, say, “Only the points that fall within this polygon: xxxxxxx”
The BENEFIT to GeoRSS is it is very simple to implement in comparison to a full feature service (see G).
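To show how little is involved, here is a Python sketch that tags a feed item with a point in the GeoRSS-Simple dialect, where `georss:point` holds latitude and longitude separated by a space. The helper name `georss_item` and the example values are made up:

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)  # serialize with the georss: prefix


def georss_item(title, link, lat, lon):
    """Build an RSS <item> carrying a GeoRSS-Simple point ("lat lon")."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    point = ET.SubElement(item, "{%s}point" % GEORSS_NS)
    point.text = "%s %s" % (lat, lon)
    return item


if __name__ == "__main__":
    item = georss_item("Day 1", "http://example.com/day-1", 45.256, -71.92)
    print(ET.tostring(item, encoding="unicode"))
```

That one extra element per item is essentially the whole cost, which is why it is so much cheaper than a feature service.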
Web Feature Service is basically a server that answers geospatial questions. So, say your server provides a map layer that has all the fire hydrants in the city. Other people could pull that data (if you wished to share it) into a map that has layers from other places. Likewise, with a WFS client, you could say, pull in a layer containing line features of all the railways in the city. And then you could do analysis on these... Of course, this is something that especially appeals to researchers, but has other uses too. The nice thing about WFS is it’s a way to share your toys, and end up with something altogether more powerful.
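For a feel of what “asking a geospatial question” looks like on the wire, here is a Python sketch that builds a WFS GetFeature request using the key-value-pair parameters (`SERVICE`, `VERSION`, `REQUEST`, `TYPENAME`, `BBOX`). The endpoint URL and the layer name are hypothetical, and the function name is made up:

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, bbox, version="1.0.0"):
    """Build a GetFeature URL limited to a bounding box.

    bbox is (min_x, min_y, max_x, max_y), i.e. lon/lat order here.
    """
    params = [
        ("SERVICE", "WFS"),
        ("VERSION", version),
        ("REQUEST", "GetFeature"),
        ("TYPENAME", type_name),
        ("BBOX", ",".join(str(v) for v in bbox)),
    ]
    # Keep commas and colons readable instead of percent-encoding them.
    return endpoint + "?" + urlencode(params, safe=",:")


if __name__ == "__main__":
    # Hypothetical server and layer: all fire hydrants inside a box.
    print(wfs_getfeature_url("http://example.com/wfs", "city:hydrants",
                             (-71.2, 42.2, -70.9, 42.5)))
```

The server would answer with the matching features (typically as GML), which a client can then layer with data from other servers.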
Geodaniel had some VERY interesting ideas regarding adding a standards-compliant feature server module to Drupal. http://www.dankarran.com/blog/archives/2006/07/27/drupal_as_a_wfs.php This would allow cool things like hooking a GIS client up to Drupal and editing things directly there. (This is assuming we add WFS-T support, which supports both reading and writing data.)
Think BlogAPI for cartography.
Now, this would be a very difficult but very rewarding subproject, and if done right, Drupal would be THE geocms. This is a bit far off though, and depends on everything I discussed previously being specced, written, and working, because it would draw on ALL of it. But an easy-to-use geospatial CMS with WFS support would be VERY appealing to people doing any geoscientific / geopolitical / geowhatever research, as, despite there being quite a bit of software available, there are few (no?) actual content management systems that have comprehensive geospatial support.