In this concise introduction, I explain essential things you need to know when working with geographical co-ordinate systems. I explain why many geographical co-ordinate systems are being used and how to use the free Proj software to convert between them, and I make recommendations on how to store co-ordinates in your database. I assume you already know what latitude and longitude means and what cartesian (x-y) co-ordinates are, and that you have some understanding of the geometrical term “projection”, notably what we mean by “project a 3d shape to a plane”. The last section is intended for programmers.

The shape of the Earth: the geoid

You probably heard in primary school that the Earth is a sphere, and later you were told that the sphere is actually slightly squeezed (in mathematics a squeezed sphere is called a spheroid, which is a special case of ellipsoid). While it is true that the Earth is approximately a spheroid, we first need to clarify what we mean when we say “shape of the Earth”. We could, for example, define as shape of the Earth the surface that consists of the limit between atmosphere and land or ocean. This would be a valid definition, but it is generally more convenient for the purposes of geography and surveying to consider that the mountains are bulges on a smoother surface. Therefore, when we say “the shape of the Earth”, we normally mean a surface that consists of all zero altitude points. This is roughly the surface of the sea, extended under the continents.

Sea level goes up and down because of tides. Therefore, we define as zero altitude the average level of the sea, which we call the “mean sea level”. It happens that the surface described by the mean sea level is perpendicular to gravity, and we extend the (average) sea surface under the continents in such a manner that it is always perpendicular to gravity. This is then, what we mean when we say the “shape of the Earth”: we mean the surface that is perpendicular to gravity at all its points, and which is also level with mean sea level. This is, indeed, close to a spheroid, but it is not a spheroid; it is an irregular surface which needs something like 300 parameters to be described mathematically. We call that surface the geoid. (In fact, the geoid is so close to a spheroid that if the Earth had the size of a ball, the difference between the geoid and the closest matching spheroid would be of the order of a thousandth of a millimetre; but at the actual size of the Earth it makes a practical difference.)

Although in this text we consider that there is only one geoid, which we call “the geoid”, in fact there can be many geoids, for two reasons. First, the Earth’s shape changes with time, and so the geoid of 1900 can be different from the geoid of 2000. In addition, we don’t really know the geoid, but we estimate it from measurements. Therefore, we could talk, for example, of the Vaníček geoid, meaning the geoid that was estimated by Vaníček, which could be different from a geoid estimated by another researcher. I only tell you this so that you appreciate the complexity of the problem. For the rest of this text, this detail does not matter, and it is enough to consider that there is a single, given geoid.

Spheroids are special cases of ellipsoids, and in geodesy the term “ellipsoid” is used more often than “spheroid”. Therefore we will also use the more general term “ellipsoid” for the rest of this text.

Ellipsoids and geodetic CRS’s

The geoid is a complicated surface and it is difficult to work with, and we usually approximate it with an ellipsoid. There are many such ellipsoids you can use, and a given ellipsoid can be placed in many different ways. An ellipsoid with a fixed location and with a given zero-longitude meridian is called a geodetic co-ordinate reference system, or geodetic CRS.

Let’s see an example. The GRS80 is an ellipsoid, and it is fully defined by two parameters: its semi-major axis (radius at equator) and its “flattening”, which shows how shorter its semi-minor axis (distance from center to the pole) is with respect to the semi-major axis (in case you are wondering, the semi-major axis of GRS80 is 6378137 metres and the flattening is 1/298.257222101). Now depending on how we place this ellipsoid on the Earth, we get different geodetic CRS’s. If we place it so that the centre of the ellipsoid is at the centre of mass of the Earth, its minor axis is parallel to the Earth’s rotation axis, and we define as primary meridian (zero-longitude) the meridian that passes through a specific point of the Meridian Building of the Greenwich Royal Observatory in London, then we have the NAD83 geodetic CRS; but if we shift it so that instead of having its centre at the centre of mass of the Earth, we have its surface touching a specific point at Dionysos (an area at Attica, Greece), then we have the GGRS87 geodetic CRS. Note that the word “datum” is often used instead of “geodetic CRS”, but datum is a more general term, so here we will be using “geodetic CRS” to avoid confusion.

The reason we use different ellipsoids and geodetic CRS’s is that the approximation that is best depends on the kind of study you are making, and especially the area you are in. The geoid, being irregular, has an offset from the geodetic CRS, something like 100 metres for example. A geodetic CRS like NAD83, which is created in order to approximate the entire Earth, is probably worse for Greece than GGRS87, which approximates Greece better but probably sucks elsewhere.

The most widely used general-purpose geodetic CRS today is WGS84, which uses the ellipsoid also called WGS84. (The WGS84 ellipsoid is almost the same as GRS80 ellipsoid: they differ by 0.1 mm in the semi-minor axis; apparently the outrageous accuracy by which ellipsoids are defined has to do with rounding errors that are introduced during calculations, but I don’t really know.) The WGS84 geodetic CRS uses the WGS84 ellipsoid, with the centre of the ellipsoid at the centre of the Earth, the minor axis of the ellipsoid parallel to the Earth’s axis of rotation, and the primary meridian is the IERS Reference Meridian, which is close to, but not entirely the same as, the Greenwich meridian (it passes about 100 m east of the Meridian Building) .

Projections and projected CRS’s

The ellipsoids are often inconvenient. Maps are usually flat so that you can put them on walls or tables or view them on screens, which means that you need to project the ellipsoid to a plane. There is no single way to make such a projection. There are projections that keep relative distances, but they distort shapes and alter areas. There are projections that keep areas, but they distort shapes and alter distances. There are projections such that a straight line on the map is the shortest route between two points (these are used in air navigation, for example); and so on. In addition, latitude and longitude are very inconvenient if you want, for example, to calculate a distance between two points, and therefore we use alternative Cartesian co-ordinate systems, which, however, work only in small areas of the Earth, which can be assumed to be flat. We therefore use the so-called projected co-ordinate reference systems, or projected CRS’s. In Greece, for example, we use the GGRS87 projected CRS (not to be confused with the GGRS87 geodetic CRS mentioned in the previous section), which has x and y co-ordinates, with x increasing eastward and y northward. This system works as an approximation because you can assume that the Earth is flat throughout Greece, but if you are in Spain you need to use another CRS.

Except for the horizontal co-ordinates, there is also the altitude. The altitude is usually in a different co-ordinate system than the horizontal co-ordinates; usually we measure altitude from mean sea level, that is, from the geoid. If we used the same co-ordinate system as for the horizontal co-ordinates, then we would measure altitude from an ellipsoid (from a geodetic CRS), and because the ellipsoid has an offset from the geoid, we would find sea level to have a positive or negative altitude rather than zero.

Converting between co-ordinate systems

We use proj to convert between co-ordinate systems. Proj is a free library, the original version of which is in the public domain and more recent modifications are licensed under a MIT license. Proj is one of the most widely used libraries for this purpose. Many GNU/Linux distributions have it packaged; for example, in Debian and Ubuntu you can just “aptitude install proj” in versions up to Debian 5.0 and Ubuntu 9.04, and “aptitude install proj-bin” in later versions.

To convert between two co-ordinate systems, use the cs2cs (co-ordinate system to co-ordinate system) utility, which is included in proj. As an example, let’s convert co-ordinates from the GGRS87 projected CRS to the WGS84 geodetic CRS.

In order to convert, we need to describe the two co-ordinate systems. cs2cs accepts strings that describe the co-ordinate systems. The description for WGS84 is

+proj=latlong +datum=WGS84

which means “use latitude and longitude on the WGS84 geodetic CRS”. Note that using latitude and longitude means that you are not using a projection, and therefore the “+proj=latlong”, which literally means “the projection is latitude and longitude”, is inaccurate; it’s the type of co-ordinates that is latitude and longitude in this case, there is no projection. However, for convenience, the Proj library uses the “+proj” parameter in that case as well.

The description for the GGRS87 projected CRS is more complicated, as we need to find its parameters. First, we go to http://www.eps-registry.org/, and search for GGRS87. It will return something like 5 results. Of those, we are interested in the one that has a code of EPSG::2100 (we say that it has a spatial reference id, or SRID, of 2100), because this is the only one whose type is “projected CRS”. If we now look at its details, we see that it uses a Transverse Mercator projection, with natural origin at (0°N 24°E), scale factor 0.9996, and false easting 500000 metres. The cs2cs string that describes these parameters is

+proj=tmerc +lat_0=0 +lon_0=24e +k=0.9996 +x_0=500000

Remeber that a projected CRS is a projection from a geodetic CRS to a plane. The parameters above tell us how to make the projection, but they don’t tell us which is the geodetic CRS. The EPSG registry page about the GGRS87 projected CRS says that the “base geodetic CRS” is the “GGRS87” (SRID=6121), and although it does provide its definition, including the fact that it uses the GRS80 ellipsoid, it does not tell us how we can convert it to WGS84. For this, we need to go back to the search results page for GGRS87 and pick the “GGRS87 to WGS84” co-ordinate transformation (SRID=1272), and we will find the three parameters -199.87, 74.79, 246.62 (these are the offsets, in metres, of the GGRS87 geodetic CRS from the centre of mass of the Earth). The string that describes the GGRS87 geodetic CRS is therefore

+ellps=GRS80 +towgs84=-199.87,74.79,246.62

Joining everything together, the conversion command is

cs2cs +proj=tmerc +lat_0=0 +lon_0=24e +k=0.9996 +x_0=500000 \
                 +ellps=GRS80 +towgs84=-199.87,74.79,246.62 \
          +to    +proj=latlong +datum=WGS84

(This can actually be only one long line, without the backslashes; I’ve used the common Unix convention of breaking it into lines and using the backslash to indicate line continuation.)

cs2cs will then accept pairs of x-y GGRS87 projected co-ordinates in metres on its standard input (x and y separated by space) and output WGS84 latitude and longitude. The conversion is done in several stages, although this is not visible to the user. First, the Cartesian co-ordinates of the GGRS87 projected CRS are backprojected to the GGRS87 geodetic CRS, resulting in a longitude and a latitude on the GGRS87 geodetic CRS; then the resulting latitude and longitude are converted to the WGS84 geodetic CRS.

cs2cs obviously already knows some things, such as the definition of WGS84 and GRS80. In fact, I’ve so far not told you that it actually also knows about the GGRS87 geodetic CRS as well, and therefore the command above can be somewhat simplified:

cs2cs +proj=tmerc +lat_0=0 +lon_0=24e +k=0.9996 +x_0=500000 +datum=GGRS87 \
          +to    +proj=latlong +datum=WGS84

cs2cs actually even knows the GGRS87 projected CRS, so the whole thing simply becomes:

cs2cs +init=epsg:2100 +to +init=epsg:4326

2100 and 4326 are the SRIDs of the GGRS87 projected CRS and the WGS84 geodetic CRS. The +init=epsg:2100 parameter actually means “lookup the parameters of the CRS in file epsg, entry 2100”. In Debian, the epsg file is located at /usr/share/proj.

More information about cs2cs can be found using “man cs2cs” and in the accompanying documentation at http://trac.osgeo.org/proj/.

To use the co-ordinate transformation functions from a program (using an API) rather than the command line, try “man pj_init” (you may need to install additional packages, such as libproj-dev). There is also the pyproj project, which offers a Python API, and there are probably APIs in other languages as well.

Recommendations

If you use a database, don’t store x-y co-ordinates, or latitude-longitude; instead, store an abscissa, an ordinate, and a crs_id. The abscissa and the ordinate should be floating point numbers; whenever the CRS is a latitude/longitude system, abscissa and ordinate should be decimal degrees, positive for N/E and negative for S/W. Check the required precision for your application, and make sure you have it; double precision is usually required and sufficient. Store what is initially entered, and don’t do any conversions, because co-ordinate systems are complicated and they have complicated parameters, and it’s easy to get something wrong in the conversion; if you have the co-ordinates in the GGRS87 projected CRS, store them in the GGRS87 projected CRS. Maintain a CRS lookup table, with the following attributes:

crs_id, the surrogate primary key
name, a descriptive name like “GGRS87 projected CRS”
srid, such as 2100 for the GGRS87 projected CRS, or 4326 for WGS84
authority, the organisation who has assigned the above SRID to the CRS; in our example, this is the EPSG
proj_definition, the proj string that defines the CRS; as simple as “+init=epsg:2100” if you are confident that the installed proj version will be able to handle it, or a longer string with all the parameters

Even if it happens that all points to be stored in your database use a single CRS, still use the above scheme for the database, and limit usage to a given CRS at the application level. This way you’ll have less trouble when, later, you get international customers.

If you use some kind of GIS system, such as GRASS or PostGIS, first check whether it already has co-ordinate transformation features. Also note that GIS’s already have a table similar to the CRS lookup table described above (in OpenGIS it’s the spatial_ref_sys table), so if you decide to introduce redundancy do it carefully.

Written by Antonis Christofides, October 2009. It would have been impossible to write this essay if it were not for Stefanos Kozanis and Michalis Salachoris, who explained to me all that. Stefanos Kozanis has also written conversion software, available with more theory (in Greek) at http://www.itia.ntua.gr/~soulman/icoordstrans/, which however is for GGRS87 only. It is known to be very accurate, and I used it to verify that I got the parameters for cs2cs correct (except that icoordstrans uses slightly different parameters for the GGRS87 geodetic CRS shift).

Permission is granted to reproduce and modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 license.