
Frans Knibbe

Spatial data on the Web: why?

19 January 2016

Opportunities abound when spatial data are let loose on the World Wide Web. Spatial data are a crucial ingredient for concepts like Big Data, the Internet of Things, Machine Learning, Smart Environments and Smart Organizations. But a few things still need to be improved in order to unlock the full potential of location on the Web. For that reason the Spatial Data on the Web Working Group (SDWWG) was started by the OGC and the W3C. On behalf of Geodan I became a member of that group. As such I want to report on what has been happening and what can be expected.

In this first blog I will describe the value of sharing spatial data in a Web architecture, and I will make a case for collaboration between the Web domain and the geospatial domain.

The power of the Web

In common parlance, the Internet and the Web are often treated as interchangeable concepts, but in fact they are different things. The global network of computer networks that we call the Internet is the older of the two. Born from a predecessor dating from 1969, the Internet as we know it started in 1983. Its foundation is TCP/IP, a set of standards for communication between computers. The existence of the Internet made the creation of the World Wide Web (WWW, or ‘the Web’) possible.

The Web came to life in 1991 and was invented by Tim Berners-Lee, who is presently the director of the W3C. The Web is one of the applications that make use of the Internet (e-mail is an example of another). Like the Internet, the Web is based on standards. The most important ones are HTTP, the protocol for exchanging hypertext (text that contains hyperlinks, like this blog); HTML, a standard format for marking up hypertext; and the URL or URI, the universal identifier of anything on the Web (Web resources).

The Web made it possible to universally publish and link text documents. Later on, the Web also came to be used for sharing images, both still and moving. Nowadays web browsers are so powerful that they can be regarded as full-blown operating systems.

Saying that the Web has an enormous impact on the world today is an understatement. Worldwide information exchange has changed greatly, and in social, economic, cultural and political areas the Web has triggered profound transformations. Along with the world, the Web keeps evolving too. It is now much more than a way to share text documents. There is a growing drive to use the Web as a platform for sharing raw data, an important resource in our society. Raw data are a source of information and knowledge and are the foundation of many decisions that are made by people, organisations and autonomous machines. When raw data are called ‘the modern gold’ or ‘the modern oil’, that is fully justified: raw data are an extremely valuable resource. Speaking of the value of data, the worth of spatial data should definitely be acknowledged.

Spatial data: more than geography

Spatial data are data that tell something about location. That encompasses more than the geographical data that are traditionally stored and used in Geographic Information Systems (GIS).
For starters, ‘spatial’ is a much broader term than ‘geographic’. Geographic data use Earth or a model of Earth as reference. Data about other celestial bodies are spatial, but not geographic. Data about locations on Earth that do not use an earthbound reference system are also spatial, but not geographic. Take building plans, for example: their geometry is often expressed in a local coordinate system that is not directly related to Earth. Another example was mentioned in one of the use cases for the SDWWG: describing shapes and spatial relations in microscope slides to help cancer research. It shows that spatial data are not limited to human scales.

Furthermore, we should realize that spatial data are not always expressed in numbers (e.g. coordinates). There are other ways of expressing location, for example by using names (toponyms) or addresses.

Lastly, it is worth mentioning that data exist about things that have an unknown or vague location. The place name Alesia might be familiar from the classic ‘Asterix and the Chieftain’s Shield’. Its location, however, is not precisely known (although it is clearer now than when the comic book was made). The same goes for many other historical places, events and artefacts. The location of present-day things can also be (temporarily) unknown. That should not inhibit having data about those things.

Another type of vagueness occurs with places like the Middle East, the Sahara or the Orient. A clear definition of their boundaries is hard to give. Still, they are spatial objects. Or what to think of “across the street” or “in my left trouser pocket”? Those are locations too, however vague they may be. The latter two examples also show that spatial data are about more than recording location. They are also about expressions of spatial relations, which can be recorded directly or be derived from location data. Examples of spatial relations that can be described as spatial data are “Belgium borders on the Netherlands” and “I am 20 kilometers away from the nearest filling station”.
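As a rough illustration of deriving a spatial relation from location data, the sketch below computes a “nearest filling station” statement from coordinates. It assumes latitude/longitude in degrees and a spherical Earth (the haversine formula), which is only an approximation. The function names and the example coordinates are made up for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; assumes a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(origin, places):
    """Return (name, distance_km) for the place closest to origin.

    origin is a (lat, lon) tuple; places maps a name to a (lat, lon) tuple.
    """
    distances = ((name, haversine_km(*origin, *coords))
                 for name, coords in places.items())
    return min(distances, key=lambda item: item[1])


# Hypothetical locations, purely for illustration.
me = (52.0, 5.0)
stations = {"Station A": (52.0, 5.1), "Station B": (52.5, 5.0)}

name, km = nearest(me, stations)
print(f"I am {km:.1f} kilometers away from the nearest filling station ({name}).")
```

The derived sentence is itself a piece of spatial data: it records a relation between two things instead of a position.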

So the concept of spatial data is broader than one might think, making spatial data all the more ubiquitous and unavoidable. A large majority of the data that people work with is spatial, directly or indirectly. That gives us a huge number of possibilities to do something interesting with those data. Techniques from the field of geoinformatics can play an important role in unlocking the potential of spatial data.

The need for holism

Historically, developments in spatial data and web data have taken place in separate domains. Fortunately, a clear trend of convergence and integration can be discerned. Take, for instance, the popularity of specifications like GeoJSON (vector geometry in JavaScript Object Notation) and TopoJSON (a topology extension for GeoJSON), or the continuing development of what Google does with geographical data. Still, many developments take place in isolated domains. That is a pity, because a holistic, zoomed-out perspective on current problems can lead to the best and most sustainable progress.
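To give an impression of why a format like GeoJSON appeals to web developers, here is a minimal sketch that builds a GeoJSON point feature with nothing but Python's standard json module. The helper name point_feature and the example place are assumptions made for this illustration, and the coordinates are approximate. Note that GeoJSON puts longitude before latitude.

```python
import json


def point_feature(lon, lat, properties):
    """Build a GeoJSON Feature with a Point geometry (longitude first, then latitude)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Illustrative values only; the coordinates are approximate.
feature = point_feature(4.9, 52.37, {"name": "Amsterdam"})

# The result is ordinary JSON, so any web client can parse it.
text = json.dumps(feature, indent=2)
print(text)
```

Because the geometry is plain JSON, it travels over HTTP and is parsed in a browser like any other web data, with no GIS-specific file format involved.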

From a historical GIS perspective there has always been a desire to improve scalability and interoperability. The first generation of GIS was characterised by data storage in files, using distinct formats specific to a particular GIS package. Interoperability was low. To use data from one GIS in another, the data had to be converted, with the accompanying risks and nuisances. The second generation of GIS saw the use of a more general medium for data storage and processing: relational databases. This made it possible to use larger amounts of data and improved the options for combining geographical data with other types of data. And now we are moving toward a third generation of GIS in which all data, spatial as well as non-spatial, can be shared in one global distributed database and where most user interaction takes place via web applications. Again we take a major step in scalability and interoperability.

A historical Web perspective shows the increasing importance of raw data. The Web is developing from a document sharing platform into an all-encompassing IT platform. With that comes a way of storing, sharing, combining and querying raw data. In that development the knowledge base that has been built in the GIS world over the past decades is indispensable. For dealing with complex spatial objects, spatial relationships, spatial networks, spatial patterns and so on, knowledge that has been cultivated by organizations like the OGC will have to be utilized, expanded and integrated.

The Spatial Data on the Web Working Group

To summarize, getting spatial data to work well on the Web is of great importance and full of promise. In order to do just that, the Spatial Data on the Web Working Group was set up as a joint effort of the OGC and the W3C. In the next blog I will tell you about what the SDWWG wants to accomplish and about the first results of the Working Group.