In our work with development cooperation GIS we have come to a point where we find it necessary to establish an overall publication system for environmental information. We will do this together with some of our partners. The system, a clearinghouse, should enable our partners to present project related information to the general public.
A draft system was set up and documented in an earlier posting on this website. In the article, Environmental Spatial Data Infrastructure – technology, I described the system and some of the challenges. In this article I am taking it a bit further, hoping to stimulate discussion about how such a system could be implemented.
This posting is about designing a clearinghouse predominantly intended for environmental data. It describes a work in progress. We are working on a requirements document, and this posting is meant to inform interested parties about the work. Input to our work is both welcome and necessary.
What is a clearinghouse?
Clearinghouses come in many shapes and colors. Some are gigantic structures slowly crumbling under their own weight, too complex and expensive to sustain over time. Others are featherweight systems that fail to do the job assigned to them. Others again are perfectly sized constructions, flexible and versatile, which integrate well with other systems. We are of course aiming to build one of the last kind. There are no standards, and we’re starting from scratch on this one.
The objective of the clearinghouse is to provide in-depth information about the state and development of the environment. It shall present environmental topics from projects in a simple and easy-to-follow way, providing access to more detailed scientific presentations where such have been supplied by the contact organizations.
The following concept figure places a clearinghouse within a context. As a concept figure it is not drawn with any particular country in mind.
Let’s start with the main structure of our clearinghouse. Our partner has over the last three years completed two versions of a sensitivity atlas for a relatively big area. The next project is to establish a monitoring plan for the same area.
The structure for the monitoring program looks more or less like this:
We have noted that the structure is general enough to be used with other projects. It is our ambition that the clearinghouse should be able to shoulder more than one project. If it does, it will probably also be used more, and one would avoid having a rather confused family of specialized sub-systems on the hosting organization’s web pages. It would be easier for developers, partners, administrators and users.
Now what if we could also use this system with other partners? And what if others could use it as well? That depends on how well we design our system. Can it be general enough without becoming useless for our purpose?
Our project context
We have taken great care to design the system as general as possible. Let’s have a look at the structure with more and different projects. See how the general structure (dark gray boxes) can be reflected in different projects in the figure below.
The figure above gives us an indication of how our data model fits several of the prospective projects it will serve.
Within the overall structure we also have to consider which functionality the system should provide. The following is a roundup of the core functionality of the system:
- Structured/hierarchical fact pages according to the presented projects. The pages will have text, tables, graphics and maps.
- The system facilitates input of data from stakeholders. The system administrator imports the data to the system.
- The system consists of both spatial and non-spatial data. In addition comes descriptive data (meta-data) providing the users with information necessary to know more about the provided data.
- The spatial data is provided to the user through maps. The maps consist of base layers (imagery to provide context) and the actual data. The actual data are “clickable” and will point to non-spatial data.
- A standard library of geographical objects is used. This means that the user will only have to upload references to the geographical objects and associated values.
- Non-spatial data are files and online texts. The online texts provide context for files stored in the system and let the user move through the system systematically by following the hierarchies.
- An interactive request and comments box, which helps to collect feedback from stakeholders.
- Map layers are available through a content management system, as well as through OGC standards like Web Map Services (WMS), Web Feature Services (WFS) and others.
It’s a tall order, but it can be done.
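To make the OGC part of the list concrete, here is a minimal sketch of how a client could ask a WMS server for a map image. The request parameters follow the WMS 1.1.1 GetMap operation; the server address, layer name and bounding box are made up for illustration and are not part of our specification.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=800, height=600,
                   srs="EPSG:4326", image_format="image/png", style=""):
    """Build a WMS 1.1.1 GetMap URL for a single layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": style,  # an empty string asks the server for its default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server, layer and extent, for illustration only.
url = wms_getmap_url(
    "https://example.org/geoserver/wms",
    "clearinghouse:districts",
    bbox=(21.0, -18.5, 34.0, -8.0),
)
print(url)
```

A URL like this can be used directly as the source of an image on a fact page, which is one simple way for the content management system to consume a map layer.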
One challenge with this functionality is that the system will evidently need two parallel data structures: one for the spatial data and one for the non-spatial data.
We’re thinking of keeping the spatial data in a separate database. The data is formatted and made available as embeddable maps or tables giving an overview of the spatial data.
The non-spatial data provides a framework around the data. This means that we are using a content management system to establish a structure parallel to our custom made database. From within the content management system we then call for relevant tabular presentations and maps. It is not as neat as one could wish for, but drawing on the resources of a professional content management system is far better than building your own.
Our main work will therefore be in building components which will provide the content management with “consumables”.
Including maps in a clearinghouse adds to the complexity. There are two ways of dealing with the spatially related user data:
- The user uploads the geographical objects with attributes
- The user interacts with predefined geographical objects
Uploading geographical objects would mean uploading shapefiles. Unfortunately shapefile uploading without a good quality assurance process could lead to several problems. These are some of them:
- Messed up or missing projections
- Corrupted files
- Inconsistent naming of files, objects and column names
- Duplicate objects
- Objects covering the same area but with different origins and quality
- Questionable legal status on the geographical objects
- It might be necessary to establish a user-role model within the admin module
We think the best thing in this case is to let one administrator handle the geographical objects. He or she should discuss with the users/partners what geographical objects are necessary and a proper process should then lead to the correct objects being imported.
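Even with a single administrator, a few of the problems listed above can be caught automatically before anyone opens the files. The following is a minimal sketch, assuming that shapefiles arrive as zip archives; the function names and messages are hypothetical. It only checks that the archive is readable, that the mandatory parts are present, that a projection file is included and that the files share one base name. It says nothing about duplicates, data quality or legal status.

```python
import io
import zipfile

REQUIRED = {".shp", ".shx", ".dbf"}  # mandatory parts of a shapefile


def check_shapefile_zip(data):
    """Return a list of problems found in a zipped shapefile upload (bytes)."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["upload is not a readable zip archive"]

    problems = []
    with archive:
        # testzip() re-reads every member and returns the first bad name, if any.
        if archive.testzip() is not None:
            problems.append("archive contains a corrupted file")
        names = [n for n in archive.namelist() if not n.endswith("/")]

    stems, extensions = set(), set()
    for name in names:
        stem, dot, ext = name.rsplit("/", 1)[-1].rpartition(".")
        if dot:
            stems.add(stem.lower())
            extensions.add("." + ext.lower())

    for ext in sorted(REQUIRED - extensions):
        problems.append(f"missing {ext} file")
    if ".prj" not in extensions:
        problems.append("missing .prj file, projection unknown")
    if len(stems) > 1:
        problems.append("files do not share one base name: " + ", ".join(sorted(stems)))
    return problems


def make_zip(file_names):
    """Build an in-memory zip with empty members, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for file_name in file_names:
            archive.writestr(file_name, b"")
    return buffer.getvalue()


upload = make_zip(["districts.shp", "districts.shx", "districts.dbf"])
print(check_shapefile_zip(upload))
```

Companion files with double extensions, such as `.shp.xml`, would be reported as a naming mismatch by this sketch and would need special handling in a real check.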
The geographical objects in the system could be points, lines, polygons and multi-polygons. Basically whatever you can throw into a PostGIS table and attach a value to. Examples could be water bodies, districts, rivers, measuring points or even Quarter Degree Grid Cells. The constraint is that the table of geographical objects should have a unique reference known to all potential providers of tabular data to be connected with the geographical object.
How will this pan out? We start by keeping quality assured geographical objects in the database. The next step is to let users provide files with values and references to the geographical objects. Excel files would probably be easiest. Using SQL views within GeoServer will make it easy for us to pull out the correct maps. Using SQL parameters could make it possible for us to limit the number of layer definitions in GeoServer.
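The idea of joining user-supplied values to quality assured objects through a shared reference can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database as a stand-in for PostGIS, a text column instead of a real geometry column, and table, column and indicator names that are all hypothetical. In GeoServer, as we understand its SQL views, the filter value would be written as a placeholder such as `%indicator%` in the view definition and supplied through the `viewparams` request parameter, so that one layer definition can serve many indicators.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE geo_object (
    ref  TEXT PRIMARY KEY,   -- unique reference shared with data providers
    name TEXT NOT NULL,
    wkt  TEXT NOT NULL       -- stand-in for a PostGIS geometry column
);
CREATE TABLE observation (
    ref       TEXT NOT NULL REFERENCES geo_object(ref),
    indicator TEXT NOT NULL,
    value     REAL NOT NULL
);
-- One view joins values to objects; the indicator is chosen at query time.
CREATE VIEW map_layer AS
    SELECT g.ref, g.name, g.wkt, o.indicator, o.value
    FROM geo_object AS g
    JOIN observation AS o ON o.ref = g.ref;
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO geo_object VALUES (?, ?, ?)", [
    ("D01", "North district", "POINT(30 -10)"),
    ("D02", "South district", "POINT(31 -12)"),
])
conn.executemany("INSERT INTO observation VALUES (?, ?, ?)", [
    ("D01", "turbidity", 4.2),
    ("D02", "turbidity", 7.9),
    ("D01", "ph", 6.8),
])


def layer_rows(indicator):
    """Return the rows one map layer would show for the given indicator."""
    return conn.execute(
        "SELECT ref, name, value FROM map_layer "
        "WHERE indicator = ? ORDER BY ref",
        (indicator,),
    ).fetchall()


print(layer_rows("turbidity"))
```

The point of the design is that contributors never touch geometry. They only supply the reference, the indicator and the value.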
In the end this will give us a map pretty much like this:
The users need to know which geographical objects are available, and which unique reference they answer to. The system should have a report engine able to provide them with a default list.
The picture is of course somewhat more complex. We have to restrict the value sets available to the users as well. This is because we will most probably have to keep the number of styles available to the users limited. Many styles would add to the complexity.
By restricting what the data contributors can provide, we hope to give the users of the system more readable maps.
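A contributed file could be checked against both the list of known references and the restricted value set before it is imported. The sketch below assumes the Excel sheet has been exported to CSV with the columns `ref` and `value`. The column names, the references and the three allowed values are hypothetical.

```python
import csv
import io

KNOWN_REFS = {"D01", "D02", "D03"}          # hypothetical object references
ALLOWED_VALUES = {"low", "medium", "high"}  # hypothetical restricted value set


def validate_rows(csv_text):
    """Split contributed rows into accepted rows and rejected (line, reason) pairs."""
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        ref = (row.get("ref") or "").strip()
        value = (row.get("value") or "").strip().lower()
        if ref not in KNOWN_REFS:
            rejected.append((line_no, f"unknown reference {ref!r}"))
        elif value not in ALLOWED_VALUES:
            rejected.append((line_no, f"value {value!r} not in allowed set"))
        else:
            accepted.append((ref, value))
    return accepted, rejected


sample = "ref,value\nD01,High\nD09,low\nD02,extreme\n"
accepted, rejected = validate_rows(sample)
print(accepted)
print(rejected)
```

Returning the rejected rows with line numbers gives the administrator something concrete to send back to the contributor.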
Styling the maps
The map styling is handled by keeping a default styling in the database. We are considering three styling standards. We will have to consider more depending on how many layers we will have in one map.
We might also be able to implement the new styling transformation functions as documented for GeoServer 2.2, which might ease our workload somewhat.
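To show what a dynamic style based on interpolation does in principle, here is a small sketch that maps a data value to a color by linear interpolation between color stops. It illustrates the idea only and is not GeoServer's implementation. The stops and colors are made up.

```python
def to_hex(rgb):
    """Format an (r, g, b) tuple of 0-255 integers as #rrggbb."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def interpolate_color(value, stops):
    """Linearly interpolate a color for value.

    stops is a list of (data_value, (r, g, b)) pairs with strictly
    increasing data values. Values outside the range are clamped.
    """
    if value <= stops[0][0]:
        return to_hex(stops[0][1])
    if value >= stops[-1][0]:
        return to_hex(stops[-1][1])
    for (v0, c0), (v1, c1) in zip(stops, stops[1:]):
        if v0 <= value <= v1:
            t = (value - v0) / (v1 - v0)  # position between the two stops
            return to_hex(tuple(round(a + t * (b - a)) for a, b in zip(c0, c1)))


# Hypothetical stops: black at 0, orange at 10.
stops = [(0, (0, 0, 0)), (10, (200, 100, 0))]
print(interpolate_color(5, stops))
```

With one such rule a single style can cover a whole range of values, which is why this kind of function could reduce the number of styles we have to maintain.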
A data model has been sketched using MS Access. The clearinghouse will however be running on top of a PostGIS database. The database will not integrate directly with the one supporting the content management system.
As one can see from the model the hierarchy is supported by the following tables:
We will leave it to the interested reader to look at the data model in more detail.
Pulling it all together
As explained earlier the system will rely on many modules, some of which are custom made and some of which are standard systems. Luckily there are now many good standard systems to choose from, many of them made by OpenGeo.
The following are the standard modules and tools which will be used:
The main effort for the clearinghouse project will lie in the custom made modules:
The figure below indicates the relation between the different modules. The custom made modules are in light gray and will have to be specified and developed.
The custom modules will leave a lot of work on the structure to be done within WordPress. This means the administrator will have to maintain a mirror of the structure in the clearinghouse database for non-spatial data. A properly designed lists module will be helpful.
Challenges and questions
Challenges to a system like this are likely to be many. Here are some of them:
- How do you handle the many styles necessary to keep several layers in one map?
- The styling standard used in GeoServer (SLD) is flexible. We will probably set up some standard styles and also keep a couple of dynamic styles using the interpolate functionality in SLD.
- Why did we choose OpenGeoSuite over GeoNode?
- GeoNode is an excellent tool for management and presentation of spatial data. But currently the project stops there. In a clearinghouse it is necessary to integrate many pieces of information, both spatial and non-spatial. The data model and the other tools needed to integrate the information are, bluntly speaking, too complex for GeoNode. That is not to say that GeoNode cannot be a tool used to play around with the data we present. It can be, and it probably also will be. But at this stage not by us.
- How do we deal with ownership?
- Since we build on open source modules, it makes sense to keep the chain open. In an initial phase, up until the first version is ready, we will work our way forward together with our developers. The custom made modules will then be released into the wild on a suitable collaborative platform for open source coding.
- What about meta-data?
- Meta-data is an important part of web based services. At some point we contemplated including GeoNetwork in the system. For now we have to focus on preparing a functional first-level clearinghouse. GeoNetwork could be added later.
This article describes my initial thoughts on such a system. We are working on a specification of the system and we do have funding for some development. We expect to have a draft specification document ready for distribution sometime in September, and we expect our developer to have a beta ready by the end of the year.
At this point we are looking for feedback on the above text. Are there issues we should cover better? Do similar systems exist? Where should we host the publicly available code?