Thread: Binary data representations for new protocol
As of version 0. The OpenGIS specification defines two standard ways of expressing spatial objects: the Well-Known Text (WKT) form and the Well-Known Binary (WKB) form. Examples of the text representations (WKT) of the spatial objects of the features are as follows:

The SRID is required when creating spatial objects for insertion into the database. For example, a valid insert statement to create and insert an OGC spatial object would be:

Examples of the text representations (EWKT) of the extended spatial objects of the features are as follows:
For example, a valid insert statement to create and insert a PostGIS spatial object would be:

The "canonical forms" of a PostgreSQL type are the representations you get with a simple query (without any function call) and the ones which are guaranteed to be accepted with a simple insert, update or copy.
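To make the difference between the WKT and EWKT text forms mentioned above concrete, here is a minimal Python sketch. It assumes the EWKT convention of prefixing the WKT string with `SRID=<n>;`. The helper names `to_ewkt` and `split_ewkt` are hypothetical and not part of PostGIS, and the SRID value 4326 is only an illustration.

```python
from typing import Optional, Tuple


def to_ewkt(wkt: str, srid: int) -> str:
    """Prefix a WKT string with its SRID to form an EWKT string."""
    return f"SRID={srid};{wkt}"


def split_ewkt(ewkt: str) -> Tuple[Optional[int], str]:
    """Split EWKT into (srid, wkt). srid is None when no prefix is present."""
    if ewkt.upper().startswith("SRID="):
        prefix, wkt = ewkt.split(";", 1)
        return int(prefix[5:]), wkt
    return None, ewkt


if __name__ == "__main__":
    print(to_ewkt("POINT(0 0)", 4326))
    print(split_ewkt("SRID=4326;LINESTRING(0 0,1 1,1 2)"))
```

Plain WKT carries only the geometry type and coordinates, so a helper like this has to be told the SRID separately. The EWKT form keeps both in one string.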
For the PostGIS 'geometry' type these are:

The well-known text extensions are not yet fully supported. Examples of some simple curved geometries are shown below:

All floating point comparisons within the SQL-MM implementation are performed to a specified tolerance, currently 1E.

In order to ensure that meta-data remain consistent, operations such as creating and removing a spatial column are carried out through special procedures defined by OpenGIS.
There are two OpenGIS meta-data tables: SPATIAL_REF_SYS and GEOMETRY_COLUMNS. The spatial reference table records, among other things, the name of the standard or standards body that is being cited for this reference system.
PostGIS uses the Proj4 library to provide coordinate transformation capabilities. For more information, see the Proj4 web site at http:

The geometry column meta-data records the fully qualified name of the feature table containing the geometry column. Note that the terms "catalog" and "schema" are Oracle-ish. It also records the ID of the spatial reference system used for the coordinate geometry in this table.
The type of the spatial object. To restrict the spatial column to a single type, use one of:

This attribute is probably not part of the OpenGIS specification, but is required for ensuring type homogeneity.
Here is another example, using the generic "geometry" type and the undefined SRID value of -1.

To check validity of geometries you can use the IsValid function:
By default, PostGIS does not apply this validity check on geometry input, because testing for validity needs lots of CPU time for complex geometries, especially polygons. If you do not trust your data sources, you can manually enforce such a check on your tables by adding a check constraint:

The same is true if a PostGIS function returns an invalid geometry for valid input.
The IsValid function won't consider higher dimensioned geometries invalid! Invocations of AddGeometryColumn will add a constraint checking geometry dimensions, so it is enough to specify 2 there.

Once you have created a spatial table, you are ready to upload GIS data to the database.
If you can convert your data to a text representation, then using formatted SQL might be the easiest way to get your data into PostGIS. A data upload file roads.

The loader has several operating modes distinguished by command line flags:

-d  Drops the database table before creating a new table with the data in the Shape file.
-a  Appends data from the Shape file into the database table. Note that to use this option to load multiple files, the files must have the same attributes and same data types.
-c  Creates a new table and populates it from the Shape file. This is the default mode.

A further mode only produces the table creation SQL code, without adding any actual data. This can be used if you need to completely separate the table creation and data loading steps.

The remaining options are:

Use the PostgreSQL "dump" format for the output data. This can be combined with -a, -c and -d. It is much faster to load than the default "insert" SQL format. Use this for very large data sets.

Keep identifiers' case (column, schema and attributes).

Coerce all integers to standard 32-bit integers, do not create 64-bit bigints, even if the DBF header signature appears to warrant it.

Output WKT format, for use with older 0. Note that this will introduce coordinate drifts and will drop M values from shapefiles.

Specify encoding of the input data (dbf file). When used, all attributes of the dbf are converted from the specified encoding to UTF8.

An example session using the loader to create an input file and uploading it might look like this:

In the section on SQL we will discuss some of the operators available to do comparisons and queries on spatial tables.
The most straightforward means of pulling data out of the database is to use a SQL select query and dump the resulting columns into a parsable text file:

However, there will be times when some kind of restriction is necessary to cut down the number of fields returned. In the case of attribute-based restrictions, just use the same SQL syntax as normal with a non-spatial table. For spatial restrictions, several operators are available. One operator tells whether the bounding box of one geometry intersects the bounding box of another.
Another operator tests whether two geometries are geometrically identical. A third is a little more naive: it only tests whether the bounding boxes of two geometries are the same.

Next, you can use these operators in queries. Note that when specifying geometries and boxes on the SQL command line, you must explicitly turn the string representations into geometries by using the "GeomFromText" function.
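The bounding-box tests described above can be sketched in a few lines of Python. This is an illustration of the idea only, not PostGIS code. The names `bounding_box`, `boxes_intersect` and `boxes_equal` are hypothetical, and geometries are simplified to lists of 2D points.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(points: Iterable[Point]) -> Box:
    """Smallest axis-aligned rectangle containing all the points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a: Box, b: Box) -> bool:
    """True when the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def boxes_equal(a: Box, b: Box) -> bool:
    """The 'naive' test: compares the rectangles, not the geometries."""
    return a == b


if __name__ == "__main__":
    rising = [(0, 0), (2, 2)]   # a line going up
    falling = [(0, 2), (2, 0)]  # a different line with the same extent
    print(boxes_equal(bounding_box(rising), bounding_box(falling)))
```

The two lines in the usage example are different geometries with the same bounding box, which is why a box-only equality test is the more naive of the two comparisons.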
The most common spatial query will probably be a "frame-based" query, used by client software, like data browsers and web mappers, to grab a "map frame" worth of data for display. Using a "BOX3D" object for the frame, such a query looks like this:

The value -1 is used to indicate no specified SRID.

The pgsql2shp table dumper connects directly to the database and converts a table (possibly defined by a query) into a shape file.
The basic syntax is:

The command line options let you do the following. In the case of tables with multiple geometry columns, specify the geometry column to use when writing the shape file. Use a binary cursor, which will make the operation faster, but will not work if any NON-geometry attribute in the table lacks a cast to text. Do not drop the gid field, or escape column names.

Indexes are what make using a spatial database for large data sets possible. Without indexing, any search for a feature would require a "sequential scan" of every record in the database.
Indexing speeds up searching by organizing the data into a search tree which can be quickly traversed to find a particular record. PostgreSQL supports three kinds of indexes by default: B-Tree indexes, R-Tree indexes, and GiST indexes. B-Trees are used for data which can be sorted along one axis; for example, numbers, letters, dates.
GIS data cannot be rationally sorted along one axis (which is greater, (0,0) or (0,1) or (1,0)?). R-Trees break up data into rectangles, and sub-rectangles, and sub-sub rectangles, etc.
GiST (Generalized Search Trees) indexes break up data into "things to one side", "things which overlap", "things which are inside" and can be used on a wide range of data-types, including GIS data. In addition to GIS indexing, GiST is used to speed up searches on all kinds of irregular data structures (integer arrays, spectral data, etc) which are not amenable to normal B-Tree indexing.

Once a GIS data table exceeds a few thousand rows, you will want to build an index to speed up spatial searches of the data (unless all your searches are based on attributes, in which case you'll want to build a normal index on the attribute fields).
Building a spatial index is a computationally intensive exercise. After building an index, it is important to force PostgreSQL to collect table statistics, which are used to optimize query plans:

Firstly, GiST indexes are "null safe", meaning they can index columns which include null values. Secondly, GiST indexes support "lossiness". Lossiness allows PostgreSQL to store only the "important" part of an object in an index -- in the case of GIS objects, just the bounding box.
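As a rough illustration of what a lossy index implies, the following Python sketch keeps only a bounding box per object and rechecks candidates against the full geometry. It is not how GiST is implemented. The "index" here is a plain dictionary scanned linearly, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounding_box(ring: List[Point]) -> Box:
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def box_contains(box: Box, pt: Point) -> bool:
    xmin, ymin, xmax, ymax = box
    return xmin <= pt[0] <= xmax and ymin <= pt[1] <= ymax


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Exact test (ray casting) against the full polygon ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_index(polygons: Dict[str, List[Point]]) -> Dict[str, Box]:
    """The lossy 'index': one bounding box per polygon, nothing else."""
    return {name: bounding_box(ring) for name, ring in polygons.items()}


def candidates(index: Dict[str, Box], pt: Point) -> List[str]:
    """Cheap pass over the boxes; may return false positives."""
    return [name for name, box in index.items() if box_contains(box, pt)]


def exact_matches(polygons, index, pt: Point) -> List[str]:
    """Recheck each candidate against the real geometry."""
    return [n for n in candidates(index, pt) if point_in_polygon(pt, polygons[n])]


if __name__ == "__main__":
    polygons = {"tri": [(0, 0), (4, 0), (0, 4)]}
    index = build_index(polygons)
    print(candidates(index, (3, 3)), exact_matches(polygons, index, (3, 3)))
```

The point (3, 3) lies inside the triangle's bounding box but outside the triangle itself, so the box-only pass reports a candidate that the exact recheck then discards.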
Ordinarily, indexes invisibly speed up data access. Unfortunately, the PostgreSQL query planner does not optimize the use of GiST indexes well, so sometimes searches which should use a spatial index instead default to a sequential scan of the whole table. If you find your spatial indexes are not being used (or your attribute indexes, for that matter) there are a couple of things you can do:
Firstly, make sure statistics are gathered about the number and distributions of values in a table, to provide the query planner with better information to make decisions around index usage. Starting with PostgreSQL 8.

You should only use this command sparingly, and only on spatially indexed queries.

The default value for the random_page_cost parameter is 4; try setting it to 1 or 2.
Decreasing the value makes the planner more inclined to use index scans.

The raison d'etre of spatial database functionality is performing queries inside the database which would ordinarily require desktop GIS functionality. Using PostGIS effectively requires knowing what spatial functions are available, and ensuring that appropriate indexes are in place to provide good performance.
Functions such as distance cannot use the index to optimize their operation. For example, the following query would be quite slow on a large table:

It will be slow because it is calculating the distance between each point in the table and our specified point, that is, one distance calculation for every row in the table.

A query that adds a bounding-box restriction selects the same geometries, but does so in a more efficient way. Assuming that our query box is much smaller than the extents of the entire geometry table, this will drastically reduce the number of distance calculations that need to be done.
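The two-step idea (a cheap box test first, the exact distance only for the survivors) can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the PostGIS query itself. The function names are hypothetical, and the box test here is still a linear scan, whereas in the database a spatial index answers it without visiting every row.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def within_distance_naive(points: List[Point], center: Point, radius: float):
    """One distance calculation per point. Returns (matches, distance_calls)."""
    hits = [p for p in points
            if math.hypot(p[0] - center[0], p[1] - center[1]) < radius]
    return hits, len(points)


def within_distance_boxed(points: List[Point], center: Point, radius: float):
    """Box prefilter, then exact distance only for the candidates."""
    cx, cy = center
    xmin, xmax = cx - radius, cx + radius
    ymin, ymax = cy - radius, cy + radius
    # Any point within `radius` of the center must fall inside this box,
    # so the prefilter never discards a true match.
    box_hits = [p for p in points
                if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]
    hits = [p for p in box_hits
            if math.hypot(p[0] - cx, p[1] - cy) < radius]
    return hits, len(box_hits)


if __name__ == "__main__":
    pts = [(0, 0), (1, 1), (1.4, 1.4), (10, 10), (0.5, 2), (50, -3)]
    print(within_distance_naive(pts, (0, 0), 1.5))
    print(within_distance_boxed(pts, (0, 0), 1.5))
```

Both functions return the same matches. The second count shows how many exact distance calculations remain after the box filter, which is the saving the text describes.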