GIS Algorithms: Theory and Applications for Geographic Information Science & Technology

Ningchuan Xiao

    Geographic Information Science and Technology (GIST) is enjoying profound innovation in its research and development. SAGE Advances in GIST will provide students with learning resources to support and enrich their interest in the workings of Geographic Information Science and Technology. These highly visual and highly applied texts will promote curiosity about developments in the field, as well as provide focused and up-to-date tools for doing GIS research.

    Series edited by Mei-Po Kwan, Department of Geography, University of California, Berkeley

    Also in the GIST series:

    Spatial Statistics and Geostatistics

    Yongwan Chun and Daniel A. Griffith

    Acknowledgements

    To Alester, for understanding everything.

    About the Author

    Ningchuan Xiao is an associate professor in the Department of Geography at the Ohio State University. He has taught a wide range of courses in cartography, GIS, and spatial analysis and modeling. He previously served as chair of the Spatial Analysis and Modeling Specialty Group of the Association of American Geographers from 2009 to 2012. Dr Xiao’s research focuses on the development of effective and efficient computational methods for mapping and analyzing spatial and temporal data in various application domains, including spatial optimization, spatial decision support systems, environmental and ecological modeling, and spatial epidemiology. His research has been published in leading journals in geography and GIScience. His current projects include designing and implementing novel approaches to analyzing and mapping big data from social media and other online sources, and developing search algorithms for solving spatial aggregation problems. He is also working with interdisciplinary teams on projects to map the impacts of human mobility on transmission of infectious diseases and to model the impacts of environment on social dynamics in the Far North Region of Cameroon.

    Preface

    Geographic information systems (GIS) have become increasingly important in helping us understand complex social, economic, and natural dynamics where spatial components play a key role. In the relatively short history of GIS development and applications, one can often observe an alienating process that separates GIS as a “machine” from its human users. In some cases, GIS have been reduced to a black box that can be used to generate pleasant mapping products serving various purposes. This trend has already encroached on our GIS education programs where a noticeable portion of teaching has focused on training students how to use the graphical user interface of GIS packages. Addressing this situation presents a great challenge to GIS researchers and educators.

    In this book we focus on critical algorithms used in GIS that serve as a cornerstone in supporting the many operations on spatial data. The field of GIS has always been diverse, and many textbooks typically cover the general topics without providing in-depth discussion for students to fully understand the context of GIS concepts. Algorithms in GIS are often presented in different ways using different data structures, and the lack of a coherent representation has made it difficult for students to digest the essence of algorithms. Students in GIS classes often come from a variety of research and educational backgrounds and may not be fully comfortable with the terms used in traditional and formal algorithm description. All of these have seemingly made algorithms a difficult topic to teach in GIS classes. But they should not be excuses for us to avoid algorithm topics. By examining how spatial data are input into an algorithm and how the algorithm is used to process the data to yield the output, we can gain substantial understanding of two important components of GIS: what geospatial data actually are and how these data are actually processed.

    This book covers algorithms that are critical in implementing some major GIS functions. The goal, however, is not to go over an exhaustive and long list of algorithms. Instead, we only include algorithms that either are commonly used in today’s GIS or have a significant influence on the development of the current algorithms; as such the choice of topics may appear to be subjective. We look at geospatial data from a minimalist perspective by simply viewing them as being spatial and locational, meaning that we focus on the coordinates using an atomic view of geographic information where most geospatial data can be understood as collections of points. In this sense, we boil down a lot of unnecessary discussion about the difference between vector and raster to a fundamental data model. Starting from there, we dive into the diversity of GIS algorithms that can be used to help us carry out some fundamental functions: measuring important spatial properties such as distance, incorporating multiple data sources using overlay, and speeding up analysis using various indexing techniques. We also go to certain lengths to cover algorithms for spatial analysis and modeling tasks such as interpolation, pattern analysis, and decision-making using optimization models. Admittedly all these functions are available in many GIS packages, open source or commercial. However, our goal is not to duplicate what is available out there, but to show how things out there work so that we can implement our own GIS, or at least GIS functions, without relying on a software system labeled as “GIS.” To gain such freedom, we must go to the level where data are transparent, not packaged, processes are outlined as code, not as buttons to click on, and results are as transparent as the input.

    This is not a traditional book on algorithms. In a typical computer science book one would expect algorithms to be presented using pseudo-code that is assumed to concentrate on the critical parts of algorithms but to ignore some of the details. The pseudo-code approach, however, is not suitable for many of those who study GIS because the theoretical aspects of the algorithms are often not the main concern. Instead, the drive to understand algorithms in GIS typically comes from the desire to know “how stuff works.” For this reason, real, working code is used in this book to present and implement the algorithms. We specifically use Python as the language in this book mainly because of its simple programming style that eliminates much of the overhead found in other programming languages. Careful thought has been given to balancing the use of powerful but otherwise “packaged” Python modules in order to show the actual mechanisms in the algorithms. With that, we also maintain a coherent geospatial data representation, and therefore coherent data structures, from the beginning of the book.

    This book would not be in the form it is without help. I am deeply grateful to the open source community that has tremendously advanced the way we think about and use spatial data today, and in this sense, this book would not even exist without open source developments. I have on countless occasions consulted with the Stack Overflow sites for my Python questions. The online LaTeX community (http://tex.stackexchange.com and http://en.wikibooks.org/wiki/LaTeX) always had answers to my typesetting questions during the many all-nighters spent writing this book. Some open source packages are especially crucial to many parts of the book, including those on k-D trees (http://en.wikipedia.org/wiki/K-d_tree and https://code.google.com/p/python-kdtree/) and kriging (https://github.com/cjohnson318/geostatsmodels). My thanks are due to students in many of the classes I have taught in the past few years in the United States and in China. Their feedback and sometimes critiques have enabled me to improve my implementations of many of the algorithms. I thank Mei-Po Kwan for her support, and Richard Leigh, Katherine Haw and Matthew Oldfield for carefully handling the manuscript. Detailed comments from a number of reviewers of an earlier version of this book have greatly helped me enhance the text. Finally, I would like to thank my family for their patience and support during the writing process.

    Ningchuan Xiao
    −82.8650°, 40.0556°
    December 2014
    Postscript

    Donkey: Are we there yet?

    Shrek: No.

    Donkey: Are we there yet?

    Fiona: Not yet.

    Donkey: Are we there yet?

    Fiona: No.

    Donkey: Are we there yet?

    Shrek: No!

    Donkey: Are we there yet?

    Shrek: Yes.

    Donkey: Really?

    Shrek: No!!

    Shrek 2 (2004)

    We started this book with an introduction to spatial data and some basic geometric algorithms. From there we moved on to discuss indexing methods that can be used to expedite the handling of spatial data. Then we explored a number of applications in spatial analysis. In each chapter, we focused on theory, methods, and practice with code. We used examples to show, in a humble way, that by coding we gain the freedom to do things without relying on large software packages. But are we there yet?

    Each chapter in its own right could be extended into a book or even more than one book. We have covered some of the most important topics, concentrating our discussion around how the concepts work and how to write a simple program in Python to implement the methods. We discussed the depth of each area in the notes to each chapter, but only to a limited extent. More efforts, collective ones, will be needed to embrace the breadth and depth of the field involving geospatial data. Also, we can certainly do more with the code included. For example, each of the programs is used individually in this book, for a good reason – we need to introduce them individually for each algorithm when we go over the computational processes. Now that we have seen all the programs, we can compile them together into a Python package that can be used more easily. However, we should point out that the programs in this book are coded in a minimalist fashion so as to put the algorithms under the spotlight. While all the programs work, they may not work under all conditions. We do not check the type or validity of variables unless it is absolutely necessary to do so. For example, we do not make sure to only insert points (as in the Point class) into a k-D tree. These may become “bugs” that may cause a program to crash unexpectedly. Some may consider this to be a limitation of a dynamically typed programming language like Python. However, it also gives us the freedom to think about the computational process from a more intuitive perspective without going into excessive detail.

    Between here and there, then, there is the vast space of the discipline of geographic information science and spatial analysis, and rigorous software engineering processes. But none of that detracts from the true value of the chapters in this book: they are part of the process that prepares us to gain the freedom to handle spatial data.

    Python: A Primer

    Beautiful is better than ugly.

    Explicit is better than implicit.

    Simple is better than complex.

    Complex is better than complicated.

    Flat is better than nested.

    Sparse is better than dense.

    Readability counts.

    Tim Peters, The Zen of Python

    This appendix is mainly for those readers who may not be familiar with some of the Python features used in this book. It can be treated as a quick way to get started with Python but is not meant to be a thorough introduction to the language. For a comprehensive understanding of Python, it would be better to use other tutorials. There are excellent tutorials out there and many of them are free. For example, the official Python tutorial1 covers a wide range of topics and provides many very effective examples that can be easily extended to more complicated cases. Other tutorials, such as How to Think Like a Computer Scientist,2 may provide more specialized content. Nevertheless, here we will look at Python topics that are closely related to many of the algorithms introduced in this book. Our aim is to keep things simple as more detailed information is easily available on the Web or in other publications.

    Python is a cross-platform programming language. In this book we will assume a successful installation of Python and many of the modules that come with the installation. We only use the command line of the Python shell. We will introduce Python using some basic commands, before moving on to discuss some more specific topics. It should be noted that all Python commands are entered after the prompt >>>. The version of Python used to prepare this appendix is 2.7.6.

    We can use Python as a powerful calculator. However, we must be careful about the type of each item. A value of 2 will be treated as an integer; the calculation of 2/3 will be treated as an operation between two integers and the result will be an integer (hence 0) in the output. To ensure a floating point value result we must use a floating point value in at least one of the numbers in the operation (e.g., 2.0/3). We can also use the function float to specify the type of number.

    • >>> 10 * 10
    • 100
    • >>> 2/3
    • 0
    • >>> 2.0/3
    • 0.6666666666666666
    • >>> 2/3.0
    • 0.6666666666666666
    • >>> float(2)/3
    • 0.6666666666666666
    • >>> 2/float(3)
    • 0.6666666666666666

    Variables are important in Python (as in all programming languages). However, variables in Python do not have declared types; the type of a variable is determined by Python at run time. In the following example, we assign an integer to b first and then we change it to a string of text (enclosed in a pair of double quotes or single quotes). When we try to add a string to an integer, Python reports an error. Also note that in Python we use the symbol # to indicate comments, and everything after the comment symbol on the same line is ignored by Python.
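
    A minimal session along these lines (the string 'mystring' is chosen here so that b matches the next example):

    • >>> b = 100            # b is an integer here
    • >>> b = 'mystring'     # now b refers to a string of text
    • >>> b + 100            # adding a string and an integer fails
    • Traceback (most recent call last):
    •  File "<stdin>", line 1, in <module>
    • TypeError: cannot concatenate 'str' and 'int' objects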

    If both variables are strings, we can definitely add them up. Note that a string can be indicated using either single or double quotes, as long as they match.

    • >>> a = "astring"
    • >>> a + ' ' + b
    • 'astring mystring'

    Strings can be duplicated using the “*” operation.

    • >>> a * 5
    • 'astringastringastringastringastring'
    • >>> (a + ' ') * 5
    • 'astring astring astring astring astring '

    We can also combine variable values with strings to format powerful and meaningful expressions using the built-in format method that comes with each string. The first part of the command is a partial string where the braces are used as placeholders whose values will be filled in by the variables in the parentheses. The variables are indexed as 0, 1, and 2, and these indices are used in the partial string.

    • >>> n1, n2, n3 = 10, 1, 2
    • >>> 'I have {0} apples, {1} pears, and {2} others'.\
    • ... format(n1, n2, n3)
    • 'I have 10 apples, 1 pears, and 2 others'

    The first line in the above example represents another useful feature of Python called simultaneous assignment that allows us to assign multiple values at once. Because of this feature, functions in Python can return multiple values at the same time.

    • >>> a,b = 3,4
    • >>> a,b
    • (3, 4)
    • >>> a,b = b,a
    • >>> a,b
    • (4, 3)

    Another way to format a string is to use the % operation. This is a little more confusing but can be useful. A % symbol within the quotes indicates a placeholder for a value; it is followed by a character indicating the type of the variable or value, where the letters d, f, and s denote integer, floating point, and string types, respectively. For a floating point value, we can also use %.2f to make sure only two decimal places are shown. A % symbol after the closing quote introduces the variables or values that will replace the placeholders in the quotes.

    • >>> n1, n2, n3 = 10, 1, 2
    • >>> "n1 = %d, n2 = %d, n3 = %d"%(n1, n2, n3)
    • 'n1 = 10, n2 = 1, n3 = 2'
    • >>> name = "Dr. P"
    • >>> age = 40
    • >>> height = 6.1
    • >>> "%s is %d years old and %.2f feet tall"%(name,age,height)
    • 'Dr. P is 40 years old and 6.10 feet tall'

    Finally, the help function can be used to get useful explanations and examples of Python functions. Quotes must be used for Python symbols passed to this function.

    • >>> help(format)
    • >>> help("<")
    • >>> help("==")
    • >>> help("%")
    A.1 List comprehension

    A list in Python can be used to store multiple things of any data type, and is the most powerful data structure in Python. In the following example, a is a list that contains six different values. The function len can be used to return the length of the list. Each item in a list is referred to using an index that starts from 0.

    • >>> a = ['str1', 10, 'str2', 'str3', 10, 5]
    • >>> a
    • ['str1', 10, 'str2', 'str3', 10, 5]
    • >>> len(a)
    • 6
    • >>> a[0]
    • 'str1'
    • >>> a[0] + ' ' + a[2]
    • 'str1 str2'
    • >>> a[1] + a[5]
    • 15

    When talking about lists, a very useful tool is the range function, which returns a list of numbers. When only one argument is given, it is the stopping value, and the function yields a list of integers from 0 up to, but not including, that value.

    • >>> range(9)
    • [0, 1, 2, 3, 4, 5, 6, 7, 8]
    • >>> range(2, 9)
    • [2, 3, 4, 5, 6, 7, 8]
    • >>> range(2, 9, 3)
    • [2, 5, 8]

    Python provides a concise way of creating lists known as list comprehension. Before we get there, note that there are many ways to create lists. Because a list contains a set of items, a traditional way to fill in the values of these items is through a for loop. We first create an empty list ([]) and then keep adding new items to the list using the append function. The following example generates a list of 10 values, each being the ith power of 2 for i ranging from 0 to 9. (Note that ** in the following code is the operator for “power of” in Python.)

    • >>> exps = []
    • >>> for i in range(10):
    • ...   exps.append(2**i)
    • ...
    • >>> exps
    • [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

    Before we move on to list comprehension, let us focus on a very important rule in Python syntax: indentation. Instead of using brackets or other symbols to indicate a code block, Python uses indentation. In the above example, the code that is logically within the for loop must be written with some leading white space (four spaces in this case). Before the indentation starts, the line above must end with a colon (:). This rule applies whenever a logical group of lines must be identified, as we will often see in this appendix and in the main text of the book.

    The above example can be replaced using the list comprehension method in Python with one line of code (shown below) that returns a list with exactly the same contents. We also compare the two lists using the logical expression == that only returns True if elements in both lists are exactly the same. Each list comprehension contains at least two kinds of operations: assignment and enumeration. In our example below, the assignment (2**i) is done over an enumeration (using the for statement) of each of the i values from a list of 10 integers generated by the range function.

    • >>> exps1 = [2**i for i in range(10)]
    • >>> exps1 == exps
    • True

    We can also put conditions in list comprehension to create more complicated lists. The following example creates a list with only the members from exps1 that can be evenly divided by 32. Here, the symbol % is the modulo operator that returns the remainder from a division operation of the first parameter by the second (e.g., 5%3 returns 2).

    • >>> [x for x in exps1 if x%32==0]

    We can actually plug in the expression for exps1 to make a nested list comprehension.

    • >>> [x for x in [2**i for i in range(10)] if x%32==0]

    In Section 2.8, we use the following two for loops to create a list of points for the parallels where each parallel contains 37 points ranging from –180 to 180 in steps of 10.

    • >>> points = []
    • >>> linenum = 0
    • >>> for lat in range(-90, 91, 10):
    • ...  for lon in range(-180, 181, 10):
    • ...    points.append([linenum, lon, lat])
    • ...  linenum += 1
    • ...

    This can be simplified using list comprehension into one line of code as follows.

    • >>> points = [ [(lat+90)/10, lon, lat]
    • ...      for lat in range(-90, 91, 10)
    • ...      for lon in range(-180, 181, 10) ]

    The X and Y coordinates of the points in the 17th parallel (80°N) can be retrieved using another list comprehension.

    • >>> [[p[1], p[2]] for p in points if p[0]==17]
    A.2 Functions, modules, and recursive functions

    As in any programming language, we can group a collection of lines of code into a function that can be used repeatedly. We use the Python keyword def to start defining a function. Immediately after the keyword is the name of the function followed by a set of input arguments in a pair of parentheses. The line defining a function ends with a colon. We use a simple example below to demonstrate how functions work.

    Here we define a function that calculates the distance between two points with X and Y coordinates given by x1 and y1 for the first point and x2 and y2 for the second. We will need to use the sqrt function to compute the square root of the sum of squares. However, this function is not a built-in Python function. Instead, sqrt is available in a Python module called math that must be specifically imported (in the first line). All the lines in the function must be indented to indicate that they belong to the function. In this example, we do not show Python prompts because it is inconvenient to write multiple lines of code in the interactive Python shell where we cannot edit the previous lines. Instead, we can write the Python commands in a text editor and save them as a file. There are many good editors for us to choose from. My favorite is Emacs, but Notepad++ is also a popular and free option. Let us assume we save the following code in a file called dist.py. In this way, the code we write becomes part of a module and can be imported for use when needed.

    • import math
    • def distance(x1, y1, x2, y2):
    •  dx = x1-x2
    •  dy = y1-y2
    •  d = math.sqrt(dx*dx + dy*dy)
    •  print "my name is:", __name__
    •  return d

    The following example shows how to use the code to calculate distance.

    • >>> from dist import *
    • >>> distance(1.5, 2.5, 3.5, 4.0)
    • my name is: dist
    • 2.5
    • >>> print __name__
    • __main__

    In this way, we have created a Python module called dist. In general, a Python module includes code blocks that can be used somewhere else outside their original file. The above exercise also shows an important built-in variable in Python called __name__. This variable is assigned the value __main__ in the program that was first executed by the Python interpreter, which in our case is the interactive Python prompt. When a module is loaded, Python sets the value of __name__ to the module name.

    The following is an example showing the calculation of the factorial of an integer using the formula n! = n × (n − 1) × (n − 2) × ... × 3 × 2 × 1. Here we simply loop over the integers from 1 to n and multiply them together.

    • def factorial_i(n):
    •  f = 1
    •  for i in range(1, n+1):
    •   f = f*i
    •  return f

    The calculation of n!, however, can be implemented in another manner where the function calls itself repeatedly. This is what we refer to as a recursive function.

    • def factorial_r(n):
    •  if n == 1:
    •   return 1
    •  return n * factorial_r(n-1)

    Recursive functions are often simple and elegant to write. However, we must write them carefully to avoid calling the function endlessly. In general, each recursive function must have a base case where the calling stops. In the above example, the last line returns by calling the function itself; if we did not control this, it would end up calling itself ad infinitum. Therefore, we always have a condition before we call the function itself. Here, if the value of n becomes 1, we return a constant instead of making another function call. This is the base case in this example. A recursive function must also have a way of moving toward the base case; in our example, we make sure we will reach it by calling the function with a decreasing value of n. In the main text of this book there are plenty of examples of recursive functions, especially when we are dealing with trees, where a recursive call makes it easy to go down the tree until we reach the end (a leaf node). Recursive functions, however, can have a disadvantage in terms of performance. For example, when n is 10 and the function calls factorial_r(9), no answer is returned immediately; we must wait until n reaches 1 before the pending calls can be unwound and the values actually multiplied. Each pending call must be kept in memory until it can be resolved, and unwinding them takes additional time compared with a simple loop.

    A.3 Lambda functions and sorting

    Sometimes we need a function but do not need to give it a name. Such an anonymous function can be created using the Python construct lambda. For example, below we define a function that returns True if its input is an even number and False otherwise. This is a quick way of defining a function without using the def keyword and all its syntax rules. It is quite simple: the variable before the colon is the input, and everything after the colon is evaluated and returned.

    • >>> f = lambda x: x%2==0
    • >>> f(10)
    • True
    • >>> f(1)
    • False

    This simple way of defining functions may seem unnecessary and confusing. Nevertheless, it provides great expressiveness and abstraction of our code. We demonstrate its use in the context of list sorting. A list in Python can be sorted using the built-in method sort.

    • >>> exps1 = [2**i for i in range(10)]
    • >>> exps1.sort(reverse=True)
    • >>> exps1
    • [512, 256, 128, 64, 32, 16, 8, 4, 2, 1]

    However, this will not work when each element in the list has a more complicated structure. For example, if each element in the list is another list that contains the X and Y coordinates of a point, sorting that list directly will give results based on the first value in each element. What if we want to sort it based on the second value in each element? Here is where the lambda function comes to our assistance. We use the key parameter of the sort method to specify that the second value in each element should be used for the comparison.

    • >>> points=[[10,1], [1,5], [1,2], [3,7], [10,8], [7,8], [9,8]]
    • >>> points.sort()
    • >>> points
    • [[1, 2], [1, 5], [3, 7], [7, 8], [9, 8], [10, 1], [10, 8]]
    • >>> points.sort(key=lambda points: points[1])
    • >>> points
    • [[10, 1], [1, 2], [1, 5], [3, 7], [7, 8], [9, 8], [10, 8]]
    A.4 NumPy and Matplotlib

    NumPy is a powerful Python module specifically developed to operate on arrays and matrices. We use some of the functions in NumPy extensively in Chapter 8 where interpolation methods are discussed. NumPy is closely tied to many of the features of Python. For example, we can easily convert a Python list into a NumPy array. Many of the NumPy functions are built to handle multiple inputs from an array directly. We demonstrate these features using the following simple example. Here, we apply the sin function in NumPy directly to a list called angles. It appears that NumPy implicitly converts the list into an array and computes the sine of each element, using these to form an array for output. The sin function that comes in the math module, however, does not work in this way because it requires a single float value to be the input.

    • >>> import numpy as np
    • >>> angles = [i*np.pi/10 for i in range(1, 10)]
    • >>> np.sin(angles)
    • array([0.30901699, 0.58778525, 0.80901699, 0.95105652, 1.,
    •  0.95105652, 0.80901699, 0.58778525, 0.30901699])
    • >>> import math
    • >>> math.sin(angles)
    • Traceback (most recent call last):
    •  File "<stdin>", line 1, in <module>
    • TypeError: a float is required

    Matplotlib is a highly useful Python package designed for two-dimensional graphics. Despite its command line style, it is fairly straightforward and quick to use this package to make impressive plots and graphics. We show three examples here to demonstrate its use in three distinct areas that are related to our spatial data handling applications.

    Matplotlib can be used through its pyplot interface or through the pylab mode, which bundles pyplot and NumPy into a single namespace. The pylab mode gives convenient access to the functionality of Matplotlib and we import it first in the following example; note that this way we also get NumPy as np. Then we create a list called angles to hold 60 angles covering more than a full cycle of 2π. The plot function creates a plot with the first two inputs as the X and Y coordinates. A number of parameters can be applied in the plot function to control the look of the plot. Nothing will be displayed until the show function is called. Figure A.1 shows the result of this example.

    • from pylab import * # or: import matplotlib.pyplot as plt
    • angles = [i*pi/20 for i in range(60)]
    • plot(angles, np.sin(angles), color='grey')
    • plot(angles, np.cos(angles), linewidth=2, color='lightblue')
    • show()

    Figure A.1 Plotting using Matplotlib

    The plot function can actually be used to plot line maps. We used this function in Chapter 2 to draw three map projections using the graticule and world coastlines. To plot polygons, we can use the Polygon function in Matplotlib.

    The second example illustrates how to draw points in a scatter plot, which can be equivalent to a “point map”. Here we create 100 points, 50 of which are scattered around the point (0,0) in a Cartesian coordinate system, and the other 50 around the point (10,10). The X and Y coordinates follow a normal distribution with a standard deviation of 1 for the first group and 5 for the second. The NumPy function normal can be used to do that by specifying the mean, standard deviation, and number of samples to be created. We create the X and Y coordinates for each group separately and then combine them using the NumPy function concatenate for the X and Y coordinates, respectively. We mark the points in the first group around (0,0) in red and those in the second group around (10,10) in blue by creating a list called col with 100 elements, where the first 50 are red and the second 50 blue. Finally, the scatter function is used to create the plot, where we can also specify the colors for each point (col) and the transparency. A 0.5 level of transparency will allow us to show the effect when multiple points overlap. Figure A.2 shows the final result of this example. Note that we export the figure into a PDF, which is converted into the EPS format for publication separately because transparency in EPS is not natively supported in Matplotlib.

    • import matplotlib.pyplot as plt
    • import numpy as np
    • n = 100
    • X1 = np.random.normal(0, 1, n/2)
    • X2 = np.random.normal(10, 5, n/2)
    • Y1 = np.random.normal(0, 1, n/2)
    • Y2 = np.random.normal(10, 5, n/2)
    • X = np.concatenate((X1, X2))
    • Y = np.concatenate((Y1, Y2))
    • col = ['red' if i<n/2 else 'blue' for i in range(n)]
    • plt.scatter(X, Y, color=col, alpha=.5)
    • plt.axis('scaled')
    • plt.savefig('matplotlib-scatter.pdf',
    •   bbox_inches='tight', pad_inches=0)
    • plt.show()

    Figure A.2 Plotting points using Matplotlib

    In our last example, we reuse the distance function we developed before to create a raster with two centers, one at the center of the lower-left quadrangle and the other at the center of the upper-right. For each location in the raster, we find the nearest center to that location and set the value at the location using a biweight kernel (see Section 9.4). The function K is used to return such a value.

    • from dist import distance
    • def K(x, y, c1, c2, r):
    •  d1 = distance(x, y, c1[0], c1[1])
    •  d2 = distance(x, y, c2[0], c2[1])
    •  d, i = d1, 0                      # distance to and index of the nearest center
    •  if d2<d1:
    •   d,i = d2,1
    •  if d>r:
    •   v = 0
    •  else:
    •   u = float(d/r)
    •   v = 15.0/16.0 * (1 - u**2)**2    # biweight kernel
    •  return v

    The following lines of code create a raster of 20 × 20 cells and the bandwidth of the kernel function is set to 10. We use list comprehension to create a list of 20 values from 1 to 20. The variables c1 and c2 are used to indicate the two centers. Z is the 2D list (a list of lists) where the value at each cell is generated using the function K. The Matplotlib function imshow is used to create a 2D image using values in Z. We use a color map of Greys to make a black-and-white map. By default, the imshow function will map the lowest value in the data to one end of the color map (white) and the highest to the other end (black). However, this will leave our map with very high contrast. Here, we force the function to use minimal and maximal values, −0.5 and 1.2, respectively, that are outside our data range (0 and 0.9375) so that the actual color used in the map will be in the middle range of the color map. The origin parameter in imshow specifies that the origin of the coordinates is in the lower-left corner of the image, which is consistent with our data. The colorbar function displays a legend and we set the total length of the bar to be 99% of the map height. The xticks and yticks functions take empty lists as input, which helps hide the ticks so that the rendering is more like a real map. Figure A.3 shows the result of this example.

    • from pylab import *
    • n = 20
    • r = n/2.0
    • c1 = [n/4, n/4]
    • c2 = [3*n/4, 3*n/4]
    • Z = [ [K(x, y, c1, c2, r) for x in range(1, n+1) ]
    •  for y in range(1, n+1)]
    • imshow(Z, cmap='Greys', vmin=-0.5, vmax=1.2, origin='lower')
    • colorbar(shrink=.99)
    • xticks([]), yticks([])
    • show()

    Figure A.3 Plotting a raster image using Matplotlib

    A.5 Classes

    We can group data and the functions that are closely associated with the data into a class. In Python terminology, data are called attributes and functions are called methods when they are encapsulated in a class. The simplest way to create a class is to use the class keyword in Python. We will create a class called Shape that stores a number of points. The first line declares the class. We define two methods in this class, both of them built-in methods. A built-in method in a Python class comes with every definition of the class and serves a specific purpose. For example, the __init__ method is always called when an instance of the class is created. For this reason, we also call it the constructor of the class, and we as users can rewrite it with our own code. For example, here we require a Shape instance to be provided with a set of points when it is created. Each method in a Python class must include a variable called self as the first input argument. In this way, we can pass the object itself (along with its data and methods) to the method (function) for convenience. For example, in this __init__ method, we can then assign the input variable values to the member attributes of the Shape class. Defining class attributes is straightforward: we simply write the names there with a self. prefix, where the dot means that whatever comes after the dot is a member of the object. The other built-in method is __repr__, which defines how an instance of the class will be printed as text. Because of that, we will need to return a string. Here we simply convert the list of points into a string.

    • class Shape():
    •  def __init__(self, points, id=None, t=None):
    •   self.points = points
    •   self.id = id
    •   self.type = t
    •  def __repr__(self):
    •   return str(self.points)

    To use the above class, we simply define a “variable” and assign to it a call of the class name as if it were a function, with a set of input arguments. Such a variable is called an object of the class. A class is a conceptualization or blueprint of its objects. In other words, we cannot do anything with the class itself because it is just a definition, but we can create an object from it and operate on that object.

    • >>> s = Shape([[1,1], [2,3], [3,7], [2,8]])
    • >>> print s
    • [[1, 1], [2, 3], [3, 7], [2, 8]]

    There is not much we can do with the Shape object because an actual shape will be more complex than just a set of points. For example, we may have lines and polygons, and we will want different behaviors from them. For this reason, we can extend an existing class into subclasses. This is called inheritance: we define subclasses that inherit features from their parent classes. We first take a look at an example of a subclass called Polyline. This time, we still use the class keyword, but we include the name of the parent class in the parentheses. In the constructor (__init__) of this class, we call the constructor of the parent class first to pass along the information that will be stored with the parent class. Beyond that, we introduce two new attributes in this subclass, name and color. Both get a default value and we can change them later. We also include a new method called draw, where we first use the Polygon function in Matplotlib to create a line object and then add it to the plot. Note that we can use the Polygon function to draw a line by setting both the closed and fill parameters to False. It will be necessary to import Matplotlib here.

    • import matplotlib.pyplot as plt
    • class Polyline(Shape):
    •  def __init__(self, points, id, name=None):
    •   Shape.__init__(self, points, id, "polyline")
    •   self.name = name
    •   self.color = 'grey'
    •  def draw(self):
    •   l = plt.Polygon(self.points, closed=False, fill=False,
    •    edgecolor=self.color)
    •   plt.gca().add_line(l)

    We then continue to define another class called Polygon below. For this subclass, we have two color properties, one for the fill (fillcolor) and one for the polygon outline (edgecolor). A Polygon class will have its own draw function due to the specific features of a polygon. We set the transparency level to 0.5 using the alpha parameter of the Polygon function.

    • class Polygon(Shape):
    •  def __init__(self, points, id, name=None):
    •   Shape.__init__(self, points, id, "polygon")
    •   self.name = name
    •   self.fillcolor = 'white'
    •   self.edgecolor = 'grey'
    •  def draw(self):
    •   poly = plt.Polygon(self.points, closed=True,
    •    fill=True,
    •    facecolor=self.fillcolor,
    •    edgecolor=self.edgecolor,
    •    alpha=0.5)
    •   plt.gca().add_line(poly)

    Finally, we create another class called Shapes as a container to hold and manage multiple shapes. The key feature here is a list called shapes and a method that adds shapes to the list. The draw method first asks each shape in the list to draw its geometric features (without displaying them yet), and then the show function in Matplotlib is called to finally render the data.

    • class Shapes():
    •  def __init__(self):
    •   self.shapes = []
    •  def add_shape(self, shape):
    •   self.shapes.append(shape)
    •  def draw(self):
    •   for s in self.shapes:
    •    s.draw()
    •   if len(self.shapes):
    •    plt.axis('scaled')
    •    plt.xticks([])
    •    plt.yticks([])
    •    plt.show()

    We create three very simple shapes and draw the final map. Figure A.4 shows the result.

    • shapes = Shapes()
    • l1 = Polyline([[1,2], [5,3], [7,2]], 0)
    • l2 = Polyline([[6,2], [1,1], [5,5]], 1)
    • l2.color = 'red'
    • p1 = Polygon([[1,1], [2,3], [3,7], [2,8]], 3)
    • shapes.add_shape(l1)
    • shapes.add_shape(l2)
    • shapes.add_shape(p1)
    • shapes.draw()

    Figure A.4 Plotting shapes using Matplotlib

    GDAL/OGR and PySAL

    Many Python packages have been developed to handle geospatial data. In this appendix we introduce two sets of libraries for different purposes. GDAL (Geospatial Data Abstraction Library) is a powerful library that can be used to support various tasks in handling geospatial data sets. GDAL is designed to handle raster data and supports about 100 raster formats.1 GDAL represents a major advance in free and open source GIS. As a component of GDAL, OGR supports vector data formats. OGR used to mean OpenGIS Simple Features Reference Implementation, but the OGR library does not fully comply with the OpenGIS Simple Features Specification of the Open GIS Consortium (OGC) and therefore was not approved by the OGC. As a consequence, OGR today technically does not stand for anything. In general, OGR supports many vector formats2 such as shapefiles, personal geodatabases, ArcSDE, MapInfo, GRASS, TIGER/Line, SDTS, GML, and KML. It also supports many database connections, including MySQL, PostgreSQL, and Oracle Spatial. More information about GDAL and OGR can be found on their website.3 GDAL/OGR was not initially developed as a Python library, but many programming languages support GDAL/OGR, including Python. In that sense, we are using a Python wrapper of the binary library of GDAL/OGR, which to some extent will have consequences on the performance of the code.

    The second library is an open source software package called PySAL (Python Spatial Analysis Library) that has been used in a wide range of applications of spatial data.4 Different from GDAL/OGR, PySAL is developed in Python and supports a wide range of spatial analysis applications. PySAL has its own core data structure that supports file input and output. A key component of PySAL is spatial weights that are central in many spatial analysis methods.

    In this appendix, we will go through some basic features to demonstrate how these powerful geospatial libraries can help us implement and understand many of the algorithms covered in this book. This appendix is mainly based on coding, with a lot of comments. Some of the lines included here are for instructional purposes and therefore may not be necessary in all cases. However, they should provide a good context in which to understand the functionality of GDAL/OGR and PySAL. We do not intend to explain the meaning of each and every line since it should be relatively straightforward to understand (especially with the extensive comments for most parts of the code). For more applications and details of the use of the GDAL/OGR library, there is an online Python GDAL/OGR Cookbook.5 PySAL has its own tutorials6 that include details of how to use the library.

    There is a great deal we can do with the power of GDAL/OGR and PySAL because these libraries give us a convenient way to convert geospatial data in any format into any data structures. Almost every algorithm introduced in this book can be plugged in here. We should take this opportunity to test and explore those algorithms using our own data.

    Both GDAL/OGR and PySAL can be installed on different operating systems. We do not discuss the installation process here since the websites of these libraries provide detailed information.

    B.1 OGR

    To use OGR, we typically should start with the following lines, where we show how OGR works for a shapefile. In the following example, we assume there is a polygon shapefile called uscnty48area.shp existing in a folder called data in the upper level of the current working directory. To make it work for new data, we will need to have a polygon shapefile and make sure we know the path to that file. We focus on polygon shapefiles here. For other types of shapefiles (e.g., polyline), the processes are similar though the details will be different. We have an example of the use of a polyline shapefile (world coastlines) in Chapter 2. The last line in the following code should return False, indicating successful reading of the shapefile.
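
    A minimal sketch of such opening code (the variable name vector is used so that it matches the next subsection):

    • >>> from osgeo import ogr
    • >>> driver = ogr.GetDriverByName('ESRI Shapefile')
    • >>> vector = driver.Open('../data/uscnty48area.shp', 0)
    • >>> vector is None
    • False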

    B.1.1 Attributes

    The variable vector instantiated in the previous code lets us access the information stored in the shapefile. A general procedure is to go down the hierarchy of file > layer > feature to access different types of information. For a shapefile, one file only contains one layer, which in turn contains multiple features. Below are the lines of code for accessing the attributes associated with a feature.
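
    A sketch of this procedure (the field name 'NAME' is only a placeholder):

    • >>> layer = vector.GetLayer(0)
    • >>> feature = layer.GetFeature(0)
    • >>> feature.GetFieldCount()
    • >>> [feature.GetFieldDefnRef(i).GetName()
    • ...  for i in range(feature.GetFieldCount())]
    • >>> feature.GetField('NAME')        # 'NAME' is a placeholder field name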

    We can write a short function (getattr.py) that returns a list of values in an attribute in a shapefile. We will use this function later in Section B.3.
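
    A sketch of such a function (the function name and its arguments are assumptions; the actual getattr.py may differ):

    • from osgeo import ogr
    • def get_attributes(fname, attr):
    •  """Return a list of the values of attribute attr in shapefile fname."""
    •  driver = ogr.GetDriverByName('ESRI Shapefile')
    •  vector = driver.Open(fname, 0)
    •  layer = vector.GetLayer(0)
    •  return [f.GetField(attr) for f in layer]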

    B.1.2 Geometry and coordinates

    We will now see how to retrieve the geometry and coordinates from the shapefile we just loaded. We only show how to get points of one polygon, but it is of course possible to get points for all polygons using a loop, as we will show later.
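
    One way to get the points of the first polygon looks like this (a sketch; multipolygon features need the extra handling described in Section B.1.4):

    • >>> layer = vector.GetLayer(0)
    • >>> feature = layer.GetFeature(0)
    • >>> geom = feature.GetGeometryRef()
    • >>> ring = geom.GetGeometryRef(0)   # the outer ring of the polygon
    • >>> points = [ring.GetPoint(i)
    • ...            for i in range(ring.GetPointCount())]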

    B.1.3 Projecting points

    We can transform a set of points in a geographic coordinate system (GCS) expressed in longitudes and latitudes into a projected coordinate system. Suppose we have a few locations from northern Cameroon that are recorded in WGS84 datum format, and we want to transform them into UTM zone 33N format. Here we will use the OSR (OGRSpatialReference) library, an OGR module for processing spatial reference systems, to transform between coordinate systems. OSR is based on the PROJ.4 Cartographic Projections Library.7 The general flow of work is to start by defining the source coordinate system, which is what we call gcs in the following code. Then we define the target coordinate system, called utmsr below. There are different ways to define a projected coordinate system, and we choose the easiest way using the EPSG code that is uniquely assigned to UTM 33N. With these two coordinate systems, we can define a transformation using the OSR function CoordinateTransformation, and here we name the transformation coordTransPcsUtm33. The known coordinates we have for the four locations in Cameroon are stored in a list of lists where the string in each element list refers to a place name. We first need to convert these coordinates into a point geometry in OGR and then we can simply use the Transform function to do the coordinate system transformation.
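
    A sketch of this workflow (EPSG 4326 is the standard code for WGS84 and EPSG 32633 for UTM zone 33N; the place name and coordinates below are placeholders, not the actual Cameroon data):

    • from osgeo import ogr, osr
    • gcs = osr.SpatialReference()
    • gcs.ImportFromEPSG(4326)                 # WGS84
    • utmsr = osr.SpatialReference()
    • utmsr.ImportFromEPSG(32633)              # UTM zone 33N
    • coordTransPcsUtm33 = osr.CoordinateTransformation(gcs, utmsr)
    • locations = [['place1', 14.0, 11.0]]     # [name, lon, lat], placeholder values
    • for name, lon, lat in locations:
    •  p = ogr.Geometry(ogr.wkbPoint)
    •  p.AddPoint(lon, lat)
    •  p.Transform(coordTransPcsUtm33)
    •  print name, p.GetX(), p.GetY()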

    B.1.4 Projecting and writing geospatial data

    Now we turn to our shapefile that is in a GCS and this time we will project it to the Albers equal-area conic projection that is commonly used for the conterminous United States. Our shapefile comes with a .prj file and we can read the spatial reference information directly from there. The general idea here is to find each point and transform it to a new coordinate system and then write it to a ring, which is in turn added into a polygon feature. We save the transformed data into a new shapefile.

    A general polygon in a shapefile may have holes, where each hole and the outline polygon are called rings. However, many counties in our data also have islands in a river or lake, and these islands are not holes; in this case, they are organized as multipolygons. As the name suggests, each multipolygon includes a set of polygons, and each of those polygons may still contain holes. To streamline the process of handling such data, we design a specific function, trans_rings, that retrieves every ring in the input geometry; the input must be a polygon, not a multipolygon, though it can have holes. (Having holes or not does not affect how this function works because holes and the outline polygon are all treated as rings, and a polygon has at least one ring.) This function returns a new polygon that contains transformed points. This polygon is then added into the feature for the new shapefile. For a feature that is only a simple polygon (with or without holes), we call this function once and get a new polygon. For a multipolygon feature, we will need to call this function multiple times, each time for a polygon in the multipolygon. Every time a new polygon is added, we create a new feature and add the polygon into the feature.
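
    A sketch of what trans_rings might look like (coordTrans is assumed to be a CoordinateTransformation object built as in the previous section):

    • def trans_rings(geom, coordTrans):
    •  """Return a new polygon with every ring of geom transformed."""
    •  poly = ogr.Geometry(ogr.wkbPolygon)
    •  for i in range(geom.GetGeometryCount()):
    •   ring = geom.GetGeometryRef(i)           # outer boundary or a hole
    •   newring = ogr.Geometry(ogr.wkbLinearRing)
    •   for j in range(ring.GetPointCount()):
    •    x, y, z = ring.GetPoint(j)
    •    p = ogr.Geometry(ogr.wkbPoint)
    •    p.AddPoint(x, y)
    •    p.Transform(coordTrans)
    •    newring.AddPoint(p.GetX(), p.GetY())
    •   poly.AddGeometry(newring)
    •  return poly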

    Figure B.1 Projecting spatial data. Left: original data in GCS. Right: data projected into Albers equal-area conic projection

    The result is shown in Figure B.1. The two maps in this figure are drawn with Matplotlib using the following Python code (draw_shp_polygons.py).
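
    A minimal sketch of how such drawing can be done (it plots only the outer ring of each feature; the actual draw_shp_polygons.py also handles multipolygons and holes):

    • import matplotlib.pyplot as plt
    • from osgeo import ogr
    • def draw_shp_polygons(fname):
    •  ds = ogr.GetDriverByName('ESRI Shapefile').Open(fname, 0)
    •  layer = ds.GetLayer(0)
    •  for feature in layer:
    •   geom = feature.GetGeometryRef()
    •   ring = geom.GetGeometryRef(0)           # outer ring of a simple polygon
    •   pts = [ring.GetPoint(i) for i in range(ring.GetPointCount())]
    •   plt.plot([p[0] for p in pts], [p[1] for p in pts], color='grey')
    •  plt.axis('scaled')
    •  plt.show()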

    B.1.5 Adjacency matrix

    Our final project using OGR is to create an adjacency matrix for a polygon shapefile. The procedure is simple: we check each pair of polygons and test whether they touch each other or one contains the other. We use the OGR Touches and Contains functions to do so. To speed up the process, we design a function called env_touch to test if the bounding boxes (envelopes) of two polygons touch each other. We can then write a function called geom_touch to test whether two OGR geometry objects touch each other by sharing a specified number of points (given in the parameter n0).
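
    A sketch of the bounding-box test (GetEnvelope returns a tuple in the order minX, maxX, minY, maxY):

    • def env_touch(g1, g2):
    •  """Return True if the envelopes of two geometries touch or overlap."""
    •  e1, e2 = g1.GetEnvelope(), g2.GetEnvelope()
    •  return not (e1[1] < e2[0] or e2[1] < e1[0] or
    •              e1[3] < e2[2] or e2[3] < e1[2])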

    The adjacency_matrix function takes three input parameters: the path to the shapefile, the output format (M for matrix and L for list), and the number of shared points for polygons to be considered as adjacent (default is 1). In the test of the function, we use the US county shapefile and dump the final result in matrix form into a Python pickle file. In Python, pickle is a standard way of serializing complicated objects so that they can be stored in a file and restored for future use. The pickle file stored here is used in the main text (Section 9.2) for calculating Moran’s I.
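
    Dumping and restoring the matrix with pickle might look like this (the call to adjacency_matrix follows the parameter order described above, and the file name is an assumption):

    • import pickle
    • adj = adjacency_matrix('../data/uscnty48area.shp', 'M')
    • with open('county_adjacency.pkl', 'w') as f:
    •  pickle.dump(adj, f)
    • # later, e.g. in Section 9.2
    • adj = pickle.load(open('county_adjacency.pkl'))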

    B.2 GDAL

    We now turn to GDAL to explore a few ways to access raster data. In this example we use the 2011 National Land Cover Database (NLCD) to demonstrate the use of GDAL. Similarly to OGR, we will need a driver to start using raster data, but in GDAL we also need to call the Register function before we can actually use it to open the raster file. Driver names (code) are available at http://www.gdal.org/formats_list.html. The NLCD we use here is stored in Erdas Imagine format (.img), and the GDAL code for that is HFA. The purpose here is to load the NLCD data into a NumPy array using the ReadAsArray function. Once the data are in an array format, which is an intuitive format for raster data, we can process them using many other methods such as the contagion metric (Section 9.4).
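
    A sketch of this workflow (the .img file name is a placeholder):

    • from osgeo import gdal
    • driver = gdal.GetDriverByName('HFA')       # Erdas Imagine format
    • driver.Register()
    • ds = gdal.Open('../data/nlcd2011.img')     # placeholder file name
    • band = ds.GetRasterBand(1)
    • data = band.ReadAsArray()                  # data is now a NumPy array
    • print data.shape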

    B.3 PySAL

    PySAL covers a very wide range of topics in spatial analysis and beyond (Rey and Anselin, 2010). Here we demonstrate how to use it to calculate Moran’s I for an area data set. PySAL provides a lot of options to select the type of spatial weights. Here we use binary weights, meaning the weight between two polygons will be either 1 (adjacent) or 0 (non-adjacent). There are also a few different ways to compute the adjacency between polygons, and here we use the queen criterion, in which two polygons are adjacent if they share one or more points (the rook criterion, on the other hand, would require that they share an edge). Based on these choices, we will be able to compare the PySAL results with the code discussed in this book.
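
    A sketch of the PySAL part of this calculation (the attribute column 'VALUE' is a placeholder; the actual pysal1.py may read the data differently):

    • import numpy as np
    • import pysal
    • w = pysal.queen_from_shapefile('../data/uscnty48area.shp')
    • w.transform = 'b'                          # binary weights
    • f = pysal.open('../data/uscnty48area.dbf')
    • y = np.array(f.by_col('VALUE'))            # 'VALUE' is a placeholder column
    • mi = pysal.Moran(y, w)
    • print mi.I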

    In Section 9.2, we introduced the calculation of Moran’s I based on an adjacency matrix (or list), which is equivalent to the binary weights determined using the queen criterion. The full program (pysal1.py) also tests the result of using the Moran’s I code in this book, and the results are almost identical (readers who are interested in this example can examine the source code and explain what causes the very small difference between the two programs). To further validate the results, we can test the same data using the spdep package in R. To the precision reported in the R package, the result is 0.3158373928, which is consistent with the results obtained by the Python programs.

    Code List

    The majority of the programs in this book have a file name so that they can be imported as modules or run from the command line. Each program is typically run in its own directory, and whenever we need to use a module from another directory we append that directory to the system path using sys.path.append from the sys module. Suppose we are in directory interpolation and we need to import the point.py module from the geom directory. We can use the following three lines:

    • import sys
    • sys.path.append('../geom')
    • from point import *

    In Table C.1 we list all the named files, in the order they appear in the text. In addition to these files, there are code blocks in the text that are not listed here because they are mainly meant for interactive commands. Data sets used as examples in this book are also listed in the table.

    Table C.1 Code and data

    Folder             Name                             Section
    geom               point.py                         2.1
                       spherical_distance.py            2.2
                       point2line.py                    2.3
                       centroid.py                      2.4
                       sideplr.py                       2.5
                       linesegment.py                   2.6
                       intersection.py                  2.7
                       point_in_polygon.py              2.8
                       point_in_polygon_winding.py      2.9
                       neville.py                       2.10
                       transform1.py                    2.11
                       worldmap.py                      2.12
                       test_projection.py               2.13
                       transform2.py                    2.14
                       line_seg_eventqueue.py           3.2
                       line_seg_intersection.py         3.3
                       test_line_seg_intersection.py    3.4
                       overlay.py                       3.5
                       test_overlay_1.py                3.6
    indexing           kdtree1.py                       5.1
                       kdtree2a.py                      5.2
                       kdtree2b.py                      5.3
                       kdtree3.py                       5.4
                       prkdtree1.py                     5.5
                       prkdtree2.py                     5.6
                       prkdtree3.py                     5.7
                       kdtree_performance.py            5.8
                       quadtree1.py                     6.1
                       pointquadtree1.py                6.2
                       pointquadtree2.py                6.3
                       pointquadtree3.py                6.4
                       xdcel.py                         7.1
                       extent.py                        7.2
                       pmquadtree.py                    7.3
                       pm1quadtree.py                   7.4
                       pm2quadtree.py                   7.5
                       pm3quadtree.py                   7.6
                       rtree1.py                        7.7
                       rtree2.py                        7.8
                       use_rtree.py                     7.9
    interpolation      idw.py                           8.1
                       read_data.py                     8.2
                       prepare_interpolation_data.py    8.3
                       test_idw.py                      8.4
                       idw_cross_validation.py          8.5
                       semivariance.py                  8.6
                       covariance.py                    8.8
                       fitsemivariance.py               8.9
                       test_fitsemivariance.py          8.10
                       okriging.py                      8.11
                       ordinary_kriging_test.py         8.12
                       skriging.py                      8.13
                       simple_kriging_test.py           8.14
                       interpolate_surface.py           8.15
                       kriging_cross_validation.py      8.16
                       midpoint2d.py                    8.17
    spatialanalysis    nnd.py                           9.1
                       nnd_monte_carlo.py               9.2
                       oval_trees.py                    9.3
                       test_oval_trees_nnd.py           9.4
                       kfunction.py                     9.5
                       test_oval_trees.py               9.6
                       moransi.py                       9.7
                       moransi2.py                      9.8
                       test_moransi.py                  9.9
                       test_moransi_mc.py               9.10
                       test_moransi_grid.py             9.11
                       kmeans.py                        9.12
                       contagion.py                     9.13
                       test_contagion.py                9.14
                       test_contagion_kernel.py         9.15
    networks           network2listmatrix.py            10.2
                       bfs.py                           10.3
                       dfs.py                           10.4
                       dijkstra.py                      10.5
                       gateway.py                       10.6
                       allpairdist.py                   10.7
                       allpairpaths.py                  10.8
    optimization       disc.py                          11.1
                       cp.py                            11.2
                       elzinga_hearn.py                 11.3
                       welzl.py                         11.4
                       test_1center.py                  11.5
                       pmed_lpformat.py                 11.6
                       pmed_greedy.py                   12.1
                       teitz_bart.py                    12.3
                       test_orlib_pmed.py               12.4
                       simulated_annealing.py           12.6
                       test_orlib_pmed_simann.py        12.7
    gdal               getattr.py                       B.1.1
                       transform_cameroon_points.py     B.1.3
                       transform_coordinate_systems.py  B.1.4
                       draw_shp_polygons.py             B.1.4
                       adjacency_matrix.py              B.1.5
                       gdal1.py                         B.2
                       pysal1.py                        B.3
    data               uscnty48area.*
                       franklin.*
                       necoldem.dat
                       necoldem250.dat
    data/orlib         pmed*.orlib
                       pmed*.distmatrix

