# Michael T. Gastner

Assistant Professor, Yale-NUS College

I am an applied mathematician and data scientist. My main interests are at the interface of data visualization and cartography. Currently, I am spending most of my research time on developing user-friendly methods to construct cartograms.

I am a member of the International Cartographic Association's Commission on Map Projections.

Curriculum Vitae
ORCID: 0000-0002-1097-8833
Profile on Google Scholar and NCBI

## Cartograms

We live in the era of “big data”. Visualizing data is an important step when summarizing information. Geographic maps are a popular means to visualize spatial data, but conventional maps often tell a misleading story. Usually, each map region is displayed with an area proportional to its actual land area.

Unfortunately, equal-area maps usually miscommunicate statistical data. In an election, for example, the land area of a region is irrelevant. Instead, well-designed infographics should visualize the number of votes in a region.

Cartograms are infographics that rescale map areas in proportion to statistical data (e.g. population size or the number of votes).

I am developing cartogram algorithms and software. The objectives are that

• the algorithms guarantee that all areas are scaled correctly.
• geographic neighbours remain neighbours on the cartogram.
• cartograms are presented with an intuitive user interface.

For our latest algorithm (described in this PNAS paper), please visit our GitHub repository. To simplify cartogram generation, we have recently embedded this code into the web application go-cart.io.

## Publications

Below is a selection of representative publications.

### Interactivity in cartograms

I. K. Duncan, S. Tingsheng, S. T. Perrault and M. T. Gastner
Task-based effectiveness of interactive contiguous area cartograms
IEEE Trans. Vis. Comput. Graph. 27(3):2136–2152 (2021)

Cartograms are map-based data visualizations in which the area of each map region is proportional to an associated numeric data value (e.g. population or gross domestic product). Because of their distorted appearance, cartograms have often been criticised as difficult to read. We conducted an experiment to evaluate whether cartograms are more legible if they are accompanied by interactive features (animations, linked brushing, or infotips). With access to interactivity, most participants answered even complex questions about the maps correctly. Among the interactive features, animations had the strongest positive effect, so we recommend them as a minimum of interactivity when cartograms are displayed on a computer screen.

### Efficient cartogram generation

M. T. Gastner, V. Seguy and P. More
Fast flow-based algorithm for creating density-equalizing map projections
Proc. Natl. Acad. Sci. U.S.A. 115(10):E2156–E2164 (2018)

On conventional maps, each region is displayed with an area proportional (or at least nearly proportional) to its geographic area in square kilometres. But equal-area maps can grossly misrepresent demographic data: densely populated cities should be given more prominence than large but sparsely populated territories. Cartograms solve this problem by rescaling map regions in proportion to, for example, population or gross domestic product. Here we describe and benchmark a fast flow-based algorithm that computes cartograms in a matter of seconds.

### Cargo shipping

P. Kaluza, A. Kölzsch, M. T. Gastner and B. Blasius
The complex network of global cargo ship movements
J. Royal Soc. Interface 7(48):1093–1103 (2010)

The global network of merchant ships plays a crucial role in human mobility, the exchange of goods and the spread of invasive species. We use information about the itineraries of 16 363 cargo ships during the year 2007 to construct a network of links between ports. We show that bulk dry carriers, container ships and oil tankers differ in their mobility patterns and networks. Container ships follow regularly repeating paths whereas bulk dry carriers and oil tankers move less predictably between ports. In the network of all ship movements, the connectivity of ports and the loads transported on the links follow heavy-tailed distributions, with systematic differences between ship types.

### Diffusion cartograms

M. T. Gastner and M. E. J. Newman
Diffusion-based method for producing density-equalizing maps
Proc. Natl. Acad. Sci. U.S.A. 101(20):7499–7504 (2004)

Cartograms are maps in which the sizes of geographic regions (e.g. countries, provinces) appear in proportion to their population. Such maps are invaluable for data visualization. Unfortunately, to scale regions and still have them fit together, one is normally forced to distort the regions’ shapes, potentially resulting in maps that are difficult to read. Here we present a technique based on ideas borrowed from elementary physics that avoids this drawback.

## Teaching

### YCC1122: Quantitative Reasoning

This “Common Curriculum” course aims to develop the students’ skills in logical and statistical reasoning so that they become critical and informed readers of quantitative data. The course applies the pedagogy of Team-based Learning to ensure that students who bring diverse talents and backgrounds to the course can learn together and from each other.

Students learn to criticise and question empirical claims, support them with logical arguments and address real-life problems by gathering and visually representing quantitative data. The course teaches quantitative literacy so that students grasp how algorithmic and statistical thinking is used in the natural and social sciences.

### YSC2210: Data Analysis and Visualization (DAVis) with R

This course teaches how to use the programming language R for analyzing and presenting statistical data. Starting from the fundamentals of R (data types, flow control), students learn how to write their own R scripts and functions. They learn how to extract data from web sites and bring the input into a shape (e.g. using regular expressions) that is suitable for further analysis.

Much of the course focuses on R’s graphics features, including network representations and geographic maps. The objective is to present data in ways that are informative, elegant and fun (e.g. as short animated video clips).

Example of a visualization project in DAVis: An animation of Singapore's age pyramid between 1960 and 2017. (Data from Data.gov.sg.)

### YSC3216: Stochastic Processes and Models (SPaM)

What do stock markets, the weather, genetic mutations and the movements of a drunkard have in common? All these phenomena are subject to a certain degree of randomness. Such “stochastic processes” are a vibrant area of interdisciplinary research, with applications ranging from mathematical finance and biology to predicting waiting times in supermarket queues.

In this course, students learn the mathematics behind the most common models of stochastic processes: Markov chains, Poisson and renewal processes, and queuing theory. Students learn how to prove the most important mathematical results and apply them to realistic problems.

### YSC4208: Monte Carlo Simulations in Science and Statistics (MoCaSinSS)

Monte Carlo simulations are computer experiments that solve numerical problems by using random number generators. At first glance, it may seem bizarre to use a computer, arguably the most accurate and deterministic of all human inventions, to perform random experiments. However, Monte Carlo simulations are nowadays an essential component in many quantitative studies. They are used in the natural sciences, industrial engineering, finance and statistics.

This course teaches how to write elegant and efficient Monte Carlo simulations for concrete real-world examples. Students also learn the theoretical foundations of pseudorandom number generators, Markov chain Monte Carlo and the Metropolis-Hastings algorithm.

Example of a Monte Carlo simulation. “Buffon's needle” is an experiment to estimate the value of $$\pi \approx 3.14$$. We randomly drop needles onto a floor with parallel strips (vertical lines at the top). Needles that cross at least one of the strips are coloured blue. Needles that fall between two strips are shown in green. By counting the numbers of blue and green needles, we can obtain an estimate of $$\pi$$. The more needles we drop, the more accurate the estimate (plotted at the bottom).
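The experiment can be sketched in a few lines of Python. This is only an illustration, not the code behind the animation: the function name `estimate_pi`, its parameters and the choice of Python are assumptions made here. It assumes the needle is no longer than the spacing between the lines, in which case a needle crosses a line with probability $$2\ell/(\pi d)$$ for needle length $$\ell$$ and line spacing $$d$$. The needle's direction is drawn by rejection sampling from the unit disk so that the simulation does not itself rely on the value of $$\pi$$.

```python
import random


def estimate_pi(num_needles, needle_length=1.0, line_spacing=1.0, seed=None):
    """Estimate pi with Buffon's needle (hypothetical helper for illustration).

    Assumes parallel vertical lines a distance `line_spacing` apart and
    needles with needle_length <= line_spacing.
    """
    if not 0 < needle_length <= line_spacing:
        raise ValueError("needle_length must be in (0, line_spacing]")
    rng = random.Random(seed)
    crossings = 0  # the "blue" needles; all others are "green"
    for _ in range(num_needles):
        # Distance from the needle's centre to the nearest line.
        centre = rng.uniform(0.0, line_spacing / 2.0)
        # Uniformly random direction: pick a point in the unit disk by
        # rejection sampling and use its direction from the origin.
        while True:
            x = rng.uniform(-1.0, 1.0)
            y = rng.uniform(-1.0, 1.0)
            r2 = x * x + y * y
            if 0.0 < r2 <= 1.0:
                break
        # Half of the needle's extent perpendicular to the lines.
        half_extent = (needle_length / 2.0) * abs(x) / r2 ** 0.5
        if centre <= half_extent:
            crossings += 1
    if crossings == 0:
        raise ValueError("no needle crossed a line; drop more needles")
    # P(cross) = 2 * l / (pi * d), so pi is roughly 2 * l * n / (d * crossings).
    return 2.0 * needle_length * num_needles / (line_spacing * crossings)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n, seed=2017))
```

Running the script with increasing numbers of needles shows the estimates settling near $$\pi$$, although the error shrinks only slowly, roughly in proportion to one over the square root of the number of needles.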

## Contact

Michael T. Gastner
Yale-NUS College, Division of Science
16 College Avenue West, #01-220 Singapore 138527
michael.gastner@yale-nus.edu.sg

My office is RC3-02-05L in Cendana College (2nd floor).