Cartographic generalization

This short paper gives a subjective view on cartographic generalization, its achievements in the past, and the challenges it faces in the future.


Generalization and abstraction
Even the oldest known map, a clay tablet showing the region around Nuzi, Mesopotamia, is already a piece of high abstraction and strong generalization: only a few objects are selected, and they are presented in a highly schematic way, e.g., cities are represented as points and hills in the so-called mole-hill manner.
Generalization is strongly connected to multi-scale representations, but also to multiple representations in general. Humans do not have only one representation of the objects in their environment [3], but composites of different representations and scales, which allows them to gain a holistic internal representation. Cartographic generalization thus serves as a means to mimic and support this process of gaining and processing multi-scale input of our environment to achieve a highly faceted representation. The different representations are then used to answer queries at the corresponding abstraction level. In order to do so, it is important that the different levels are also linked with each other to allow for a coarse-to-fine approach from overview to detail and vice versa. The different abstraction levels not only govern our human perception, but are general principles of human communication, as talks, papers, and newspapers are organized in this way.
Generalization is therefore used to allow quick comprehension, to support coarse-to-fine analysis, and to process data at the optimal level of detail for a given task.

Challenges in cartographic generalization
Cartographers have developed skills and rules to abstract and generalize spatial information in order to create maps at different scales. These rules include aspects related to the semantics of the objects, but mainly to their geometry. They are implemented in a set of generalization operators, e.g., simplification, aggregation, classification, and displacement. Generalization leads to clear visualizations in which the important objects are preserved and even enhanced, while the (application-specific) irrelevant information is omitted. However, generalizing a map is more than a mere application of individual generalization operators: a human cartographer analyzes the spatial situation and then chooses the adequate operators as well as the sequence of their application. This is a complex task that requires human experience and creativity, and it has turned out to be hard to automate.
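To make one of these operators concrete: line simplification is classically handled by the Douglas-Peucker algorithm, which recursively keeps only the points that deviate from a chord by more than a tolerance. The following is a minimal Python sketch of that operator (the function names and tolerance value are ours, chosen for illustration only):

```python
import math

def perp_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def douglas_peucker(points, tol):
    """Simplify a polyline: keep only points deviating more than tol."""
    if len(points) < 3:
        return list(points)
    # find the point farthest from the chord between the endpoints
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perp_dist(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= tol:
        # everything is close to the chord: replace by the endpoints
        return [points[0], points[-1]]
    # otherwise split at the farthest point and recurse on both halves
    left = douglas_peucker(points[:idx + 1], tol)
    right = douglas_peucker(points[idx:], tol)
    return left[:-1] + right
```

The sketch shows why the interplay of operators is the hard part: the algorithm itself is simple, but it treats each line in isolation and knows nothing about neighbouring objects, displacement, or semantics.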
The problem is important not only in traditional cartography, when it comes to creating maps at different scales. It is equally relevant in today's digital era, where a coarse-to-fine exploration of data on digital devices is of elementary importance. The functionality of zooming in and out, in terms of logical zooms, i.e., not just simple scaling, is essential for the (human) exploration of space. Due to the small displays of (mobile) devices, there is an even higher demand for abstraction. This is very relevant for navigation systems, where it is also a safety issue that humans are able to quickly interpret a situation in order to make the right decision.

Brief review of progress in the last decades
In the 1990s, there was a great surge of activity in the automation of generalization. National Mapping Agencies started striving to automate their labour-intensive manual processes. The scientific community was attracted by a computationally challenging problem; thus, not only cartographers and geoinformation scientists searched for solutions, but also researchers from computational geometry. These activities united scientists in the International Cartographic Association, and especially its Commission on Map Generalization, and led to a vast amount of literature as well as edited books [4,9].
In the beginning, there was a focus on individual operators, e.g., simplification, selection, or aggregation [12,21]. Later, the interplay of operators was recognized as a challenge, shifting the focus to holistic solutions, for which, e.g., agent-based solutions [14] or optimization approaches [7,16] were proposed. Researchers also looked at prerequisites for the application of the operators, namely structure recognition [8], and at the formalization of operators to run them in web interfaces [5].
New challenges were seen in the integrated generalization of topographic data and other data sources, as well as in the generalization of new data types, e.g., volunteered geographic information (VGI) [17] and 3D data, for which no standards exist in traditional generalization. 3D generalization has mainly been tackled in the domain of 3D city models [11]. New data models have been defined (CityGML, including different Levels of Detail, LoDs), and several approaches have been invented to automatically derive the different LoDs.

www.josis.org
Researchers also sought to enhance the visual user experience when zooming in and out on a digital device. Here, continuous representation schemes were proposed (e.g., [1]); in recent years, the so-called vario-scale approach was presented [10].

Major challenges for next 10 years
In recent years, a consolidation and even a decline of scientific research activities in generalization can be observed: for many operators, satisfactory algorithms are available; many approaches have been tried out, and only marginal improvements seem to occur; and web visualizations at several levels of detail are available, which seem to satisfy most users. This is due to the fact that the demands on online visualization are lower than those on printed maps, as there is always the possibility to zoom in or out to see more or less detail in case a situation is unclear. Thus, the high demands of a clear and unambiguous presentation are relaxed when it comes to online maps. This tendency can also be observed in NMAs [20], where the requirements are shifting from the demand to produce high quality first to the demand to guarantee high currency first [15]. Still, there are interesting challenges for the next decade, which are described in this section.

Deep learning: machine learning revisited
As in other domains, the huge potential of deep learning methods has in recent years also been identified for solving the generalization problem. The idea is to use existing data sets and train a supervised model to mimic the generalization process. The problem domain seems very suitable for the application of deep learning: on the one hand, it is a problem that humans have proven capable of solving; on the other, a lot of example data is available that can be used for training.
Interestingly, machine learning was proposed very early in the automation of generalization, in order to mimic human behaviour and human decisions in complex generalization situations. The task is to find a creative solution for a complex problem, where different map objects interact and have to be adequately presented. This complex orchestration process was long considered a realm of human art and experience. At that time, the goal was mainly to reveal the underlying rules, make them explicit, and apply them in the generalization operators [13,23].
Today, the concept of a deep neural network as a "black box" is appreciated, and the expectation is that after training the model with a lot of input samples, it would be capable of generalizing a given input as an end-to-end solution. That is, it is not the immediate goal to reveal the underlying rules, but to have a versatile tool which produces a scaled version of a given input data set.
In the domain of graphics simplification, an early work looked at cleaning sketches of drawings [19]. First approaches in cartographic generalization have recently been presented by several authors, looking at different problem areas, e.g., the generalization of buildings [18], the classification of building patterns [24], or the recognition of patterns in road networks [22]. Different types of networks were applied, mainly convolutional neural networks (CNNs), but also graph CNNs [24] and generative adversarial networks (GANs) [6].

SESTER
When implementing deep learning, an important design decision is the choice of the representation of the data. Most deep learning methods are based on regular structures such as images, or on sequential structures such as texts. First approaches therefore naturally relied on a raster representation in terms of images. These first attempts are promising; yet, there are also many challenges, e.g.:
• What is an adequate representation for the map data, i.e., semantic vector data?
• What are adequate network models that are able to represent the regularities inherent in the data (e.g., right angles in building representations, topological correctness)?
• Is it possible to include knowledge in the learning process in order to accelerate it and also make it more consistent and correct?
• How can mechanisms be devised to explain why a given result was achieved?
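To illustrate the raster representation underlying these first approaches, the following minimal Python sketch (all names and parameters are ours, for illustration only) rasterizes a vector building footprint into a binary grid of the kind typically fed to a CNN, sampling the polygon at cell centres with the even-odd rule:

```python
def point_in_polygon(x, y, poly):
    """Even-odd rule test of point (x, y) against a polygon given as (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        # count edge crossings of a ray going in +x direction
        if (y1 > y) != (y2 > y):
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

def rasterize(poly, size, extent):
    """Rasterize a polygon into a size x size binary grid over
    extent = (xmin, ymin, xmax, ymax), sampling at cell centres."""
    xmin, ymin, xmax, ymax = extent
    dx, dy = (xmax - xmin) / size, (ymax - ymin) / size
    grid = []
    for row in range(size):
        y = ymin + (row + 0.5) * dy
        grid.append([1 if point_in_polygon(xmin + (col + 0.5) * dx, y, poly) else 0
                     for col in range(size)])
    return grid
```

The sketch also makes the first bullet point above tangible: the rasterization discards the semantics and the exact geometry of the vector data, e.g., the right angles of a building survive only approximately at the chosen resolution.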

Incremental data acquisition and map creation
Novel sensors and data acquisition methods are characterized by the ability to collect spatial information in a highly distributed way: cars are equipped with sensors to capture their immediate environment (e.g., cameras, LiDAR, ultrasonic sensors, GPS, . . . ), and crowdsourcing provides a stream of highly up-to-date information. This leads to a paradigm shift, where the environment is constantly scanned, however, not by a single sensor (or institution), and not with the goal of a complete mapping. Map creation becomes possible by adequately integrating and fusing these different information chunks. Due to the variety of sensors and the different acquisition configurations, the data will typically have different resolutions and densities. The concept of generalization is relevant when it comes to compiling and integrating data from different sources in an incremental fashion.
Autonomous cars need very up-to-date information for their localization and navigation. As the environment is constantly changing, it has to be continuously captured. This can only be done in a scalable fashion if the maps of the future are acquired by all available sensors, i.e., by distributed sensors with different capabilities, and thus by the users (here: the cars) themselves. Consider, for example, the acquisition of a 3D city model: a car equipped with cameras or LiDAR can acquire parts of the buildings, namely those facing the road the car is driving along. Another car might pass by another side of the buildings and thus capture those parts; a UAV can fly over the city and acquire data from above. The challenge is to integrate the different information chunks, taking all the characteristics of the data into account.
Generalization is needed in order to ensure that the resulting map is always in a consistent (and quality-assured) state, even though an object may not yet be fully captured. A hierarchical approach allows storing a preliminary, coarse state when only a few observations have been measured. This state can be incrementally refined once more data becomes available. The transition to a more detailed representation can be controlled by taking generalization principles into account, as those principles reflect which level of detail is available at a certain resolution or scale. For example, when a building facade is acquired from far away, with only a few points, it is best represented by a single plane which averages all the measured points. When a sensor subsequently captures the facade from a smaller distance, details such as windows or decorations will be measured and can thus be included in the current model. Still, for this automatic interpretation of the details, the bounds described by the coarse representation (the plane) can be used to guide and control the inference process.
Thus, the new measurements have to fit the coarse representation; otherwise, they are likely to be outliers that do not belong to the detailed facade.
In this way, the incremental data acquisition is guided and controlled by a hierarchical representation and a coarse-to-fine approach. As a by-product, the hierarchical, multi-scale representation can later also be used for other purposes, e.g., for visualization.
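The facade example can be sketched in code. The following minimal Python illustration is our own simplification, not a published method: the facade orientation is assumed known and axis-aligned, so the coarse model reduces to a plane at depth x = offset, maintained as a running mean, with a fixed gating threshold to reject measurements that contradict the coarse state:

```python
class CoarseFacade:
    """Coarse facade model: a vertical plane at depth x = offset,
    refined incrementally from point measurements.
    Simplifying assumptions (ours): known, axis-aligned orientation,
    a single depth value, and a fixed gating threshold."""

    def __init__(self, gate=0.5):
        self.n = 0          # number of accepted measurements
        self.offset = 0.0   # running mean of measured depths
        self.gate = gate    # max deviation accepted once a coarse model exists

    def add(self, x):
        """Integrate one depth measurement; returns False if it is rejected
        as a likely outlier (or a change still to be verified)."""
        if self.n >= 3 and abs(x - self.offset) > self.gate:
            return False
        self.n += 1
        # incremental mean update: no need to store all past points
        self.offset += (x - self.offset) / self.n
        return True
```

For example, after three measurements near depth 10.0, a reading of 12.0 is rejected by the gate; in a full system, such rejected readings would be collected and only promoted to a model change (or a finer detail such as a recessed window) once enough of them agree.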
There are many open questions, e.g.:
• What is the best representation for coarse, intermediate objects?
• How to distinguish between measurement errors and environmental changes?
• How to ensure the quality of the fused information?

3D-generalization
The generalization of 3D data still remains a challenge, especially when it comes to generalizing arbitrary 3D objects in the context of building information modelling [2]. Whereas in the traditional mapping domain generalization mainly served as a means for clear visualization, here the goals also include the reduction of storage demand, the simplification of data processing, and the coarse-to-fine analysis of the data. All aspects require a data reduction; however, which operations are needed is not clear and is very much application-dependent. Such rules are defined in traditional topographic mapping, however not for the generalization of other spatial objects. Thus, open questions are, e.g.:
• the definition of generalization requirements and rules for 3D objects;
• the implementation of these rules; and
• devising mechanisms for optimally selecting appropriate levels of detail for a given analysis.

Summary
Cartographic generalization is a fascinating and challenging topic, as it involves computational methods and has an immediate link to human perception and understanding. This means that the computational solutions always have to be useful and usable for humans. Maps and digital representations of our environment are meant to make human life easier and safer. Thus, they have to be designed to support users optimally. Today's data richness requires processes to adequately sort, filter, aggregate, and integrate information and to optimally present it at different representation levels.