Can big data be an alternative tool for GIS visualization and mapping? Does big data plus location data equal GIS data? Does big data visualization hold any hidden card that surpasses GIS visualization and mapping? I look for answers to these questions in this section. In visualization and demonstration technology, big data and GIS overlap in some aspects.
Many big data visualization outcomes have no geographic traits or variables and belong to this exclusive area. Figure 9 is an example map for area A, while Figure 10 is an instance of area C. Figure 9 shows US cities by elevation, in which a larger bubble indicates a higher city location. I can create this figure using US city and state shapes. Figure 10 shows gender and ethnicity in tech companies, built with the online Tableau Public. In this visualization, there is no evidence of location or mapping technology.
This is a pure big data visualization area that is not related to a spatial context or geographic coordinates.
Figure 10. Big data visualization example: gender and ethnicity in tech companies with online Tableau Public.
What is the overlapping area B, in which GIS and big data work together? In area B, locations or geographic coordinates are an important factor, and big data visualization technologies also play a crucial role in demonstration.
In Figure 11, I provide an example of area B with Chernoff faces and a US map, in which the Chernoff faces denote multivariate big data visualization using human face-like variables, produced with SAS or R programming. Many other visualization examples are available whenever big data expressions are embedded in maps or a spatial context. Figure 11 is also a good example of area B because it clearly tells the location although it does not use a map. Figure 12 shows how much population is moving from one continent to another, using the big data visualization technology of Tableau software.
Does big data visualization overcome GIS and its limitations? I describe some insights on this issue in the following section. GIS visualization is limited because it is rooted in spatial context and geographic maps. Location matters in GIS visualization, as it did in mapping and geography. Big data visualization opens a new horizon for GIS visualization because it not only strengthens the spatial context but also gives new meanings and insights to GIS maps and demonstration. As a comparison of Figures 8 and 10 shows, dots in GIS visualization turn into human faces in big data visualization.
Figure 11 implies that locations can be read without a map. More big data visualization skills and outcomes will emerge, bringing richer insights and implications to GIS visualization. However, applying big data visualization to GIS visualization carries some risks because their fundamental approaches differ in some ways. First, big data engineers and visual technicians are not necessarily geographers, spatial experts, or even urban planners. Big data visualization workers, if assigned GIS-related jobs, should be aware of basic spatial principles and the mapping process.
Second, GIS experts who are creating big data-related visualization should be ready to adapt to engineering guidelines that ask them to set their spatial norms aside in order to set up new GIS-based big data visualization works. When GIS professionals take a step back, they will experience the power of big data visualization technology.
Third, GIS and big data visualization works should be multidisciplinary projects or research, in which all relevant fields of study are involved in the final production. Big data visualization can be a good measure if the people involved are deliberately chosen, called, instructed, and allocated.
Big data is defined as very large, variously formatted datasets and analytic methods based on engineering technology and social network services, including statistical fusion and new visualization. A narrow definition of big data emphasizes data source, collection, storage, and other technical issues, whereas a wider definition also embraces analysis and demonstration.
R programming, Tableau software, and the Python language are gaining new attention as effective visualization tools for big data demonstration. GIS data visualization displays spatial patterns or relationships between or among locations. In particular, big data visualization can be a good measure if the people involved are deliberately chosen, called, instructed, and allocated.
I am indebted to Myongji University for its generous research fund in Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.
Abstract: Geographic information system (GIS) has expanded its area of applications and services into various fields, from geo-positioning services to three-dimensional demonstration and virtual reality.
Big data and geographic information system: Big data and GIS share several aspects because they are similar in their elements of data processing.
Extract: The Extract tools let you select features and attributes in a feature class or table based on a query (SQL expression) or spatial and attribute extraction. The output features and attributes are stored in a feature class or table.
Overlay: The Overlay toolset contains tools to overlay multiple feature classes to combine, erase, modify, or update spatial features, resulting in a new feature class. New information is created when overlaying one set of features with another. There are six types of overlay operations; all involve joining two existing sets of features into a single set of features to identify spatial relationships between the input features.
Pairwise Overlay: The Pairwise Overlay toolset provides an alternative to some of the tools in the Overlay toolset.
Proximity: The Proximity toolset contains tools that are used to determine the proximity of features within one or more feature classes or between two feature classes. These tools can identify features that are closest to one another or calculate the distances between or around them.
Statistics: The Statistics toolset contains tools that perform standard statistical analysis (such as mean, minimum, maximum, and standard deviation) on attribute data, as well as tools that calculate area, length, and count statistics for overlapping and neighboring features. The toolset also includes the Enrich tool, which adds demographic facts such as population, or landscape facts such as percent forested, to your data.
Table 1.
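The proximity idea described above (identifying the feature closest to another and the distance between them) can be illustrated with a minimal sketch in plain Python. This is not the toolset's own implementation; the function name `nearest_feature`, the `schools` dictionary, and the planar coordinates are all hypothetical and chosen only for illustration.

```python
import math

def nearest_feature(target, candidates):
    """Return (name, distance) of the candidate point closest to target."""
    best_name, best_dist = None, math.inf
    for name, (x, y) in candidates.items():
        # Straight-line distance in a planar coordinate system
        dist = math.hypot(x - target[0], y - target[1])
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist

# Hypothetical point features, same units on both axes
schools = {"north": (0.0, 10.0), "east": (3.0, 4.0), "south": (0.0, -8.0)}
print(nearest_feature((0.0, 0.0), schools))  # ('east', 5.0)
```

The sketch assumes projected (planar) coordinates; distances computed directly on latitude and longitude values would need a geodesic formula instead.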
Conflict of interest: No potential conflict of interest was reported by the author.
Cite this chapter: Junghoon Ki, November 27th.
Therefore, by using multiband images, more information can be obtained as a result of having more colour combinations available for analysis.
Sensors can be divided into two groups: passive and active. Passive sensing resembles the way our eyes operate, by detecting the radiation reflected by the sensed objects from the sun or another source of illumination. Active sensing, by contrast, supplies its own illumination: in remote sensing, microwave radar (Radio Detection And Ranging) operates by active sensing. Because of radar's ability to penetrate clouds, water, snow and thin vegetation, a radar image offers the possibility of viewing the examined phenomena through such obstacles. The technique is especially suitable for surveying areas with frequent cloud cover or under grass.
Radar is also very useful because it accentuates surface roughness on remote sensing images. This occurs because relative height increases as the wavelength of the sensing radiation becomes longer. Relative height also increases when the angle of incidence (the degree of camera tilt from the vertical position) is widened.
To the soil scientist, this facility can be very useful when studying phenomena such as stoniness, where relative height must be accentuated in order to improve detection. An image of a stony soil surface, taken by a side-looking radar sensor, can show a distinctly mottled texture. Before any remote sensing image can be made available for mapping, it must be cleared of errors caused by geometric displacement or atmospheric interference. Satellite imagery has several common sources of error. However, image correction is not generally recommended in soil and vegetation inventories, because corrections usually involve pixel-value manipulation and may therefore give false data.
Images marred by significant interference from atmospheric phenomena such as clouds should be avoided. Atmospheric errors are especially difficult to correct, because guesswork is often inevitable. Similarly, images marred by 'noise' (communication errors which occur during the transmission of images from satellites to the receiving stations) should also be avoided.
Digital correction of image errors is made possible by computerised image processing techniques. Remote sensing images are normally kept in digitised form (recorded in binary code). Correcting the geometric errors of individual images is usually undertaken by giving the computer the correct coordinates of a number of control points (points whose position is known exactly) on each image.
The image is then automatically stretched (rescaled) and rotated to produce a geographically correct image map.
Accuracy: The degree to which information on a map or in a digital database matches control values. Accuracy is not the same as precision. See Precision.
Address matching: A mechanism for relating two files using street or postal address as the relate item (the item in common).
Algorithm: A computer procedure used to solve a mathematical or computational problem, or to address a data processing issue. Algorithms usually consist of a set of rules written in a computer language.
See BSI.
Analogue: A continuously varying electronic signal (contrast with binary or digital). The term is also used to describe traditional paper mapping products and aerial photographs.
Application programme interface (API): Computer software designed to access services from programmes across a network.
Arc: A line connecting a set of points that can form one side of a polygon. In a topological GIS (see Topology), arcs are linked to nodes (arc-node topology) and to polygons (polygon-arc topology).
Area: A fundamental unit of geographical information, defined by a continuous, closed boundary. Also known as polygon. Examples include fields, counties, lakes, local authority boundaries, school districts and census enumeration districts.
Aspect: The geographical direction toward which a slope faces, measured in degrees from north, in a clockwise direction.
Attribute data: Descriptive information about features or elements of a database, listed as numbers, characters or images. For a database feature like census districts, attributes might include demographic facts such as population, average income, gender and age.
Each row represents a geographical feature, and each column represents one attribute variable.
Automated cartography: The process of drawing maps with the aid of computer-based display devices such as plotters and visual display units (monitors).
Azimuth: The horizontal direction of a vector, measured clockwise in degrees of rotation from the positive y-axis. For example, degrees on a compass measured from north.
Backup: A copy of a file, or a set of files, saved on a separate device or computer for safekeeping, in case the original data are lost or damaged.
Band separate: An image format that stores each band of data in a separate file. The original data are usually collected by multispectral scanners on board satellites or aeroplanes.
Band: A single range of multispectral data for an interval within the electromagnetic spectrum, such as light or infrared energy.
Bandwidth: A measure of the volume of data that can flow through a communications link (cable). Also known as throughput.
Base map: A map containing geographical features used for locational (map grid) reference. Property boundaries are commonly used as base maps because they are accurately referenced. Base maps provide the background on which other data layers are overlaid and analysed.
Baud rate: The speed of data transmission between a computer and other devices, measured in bits per second.
Benchmark tests: Procedures for comparing the performance of competing hardware and software. Specific benchmark tests can be developed by GIS managers to test how new equipment or software copes under conditions close to those which will be encountered in day-to-day use. See Sensitivity analysis.
Binary code and binary files: Digital information and commands stored and used by hardware and software, as sets of on-off signals. Most systems of binary encoding in GIS are proprietary to particular hardware and software vendors. Binary data are usually the most compact means of storing information. However, binary files are not easy to transfer between computer systems which use different binary configurations.
Bit: The smallest unit of information that a computer can store and process digitally. A bit has two possible values, 1 or 0, which correspond to yes and no or on and off. See Byte.
Boolean operator: A keyword that specifies how to combine simple logical expressions into complex expressions.
Breakline: A linear feature that defines the areal extent and controls the surface interpolation of a digital elevation model. Terrain features containing shorelines are often clipped as breakline features; otherwise sea-level surfaces could be erroneously interpolated as hills or valleys.
Buffer: An enclosed polygon created around points, lines or areas at an equal distance in all directions. The results represent areas at set distances from the original object. For example, the creation of buffer zones around a polluted industrial site may represent the varying extent of pollution measured at specified distances from the source of the contamination. Buffers are therefore useful for proximity analysis and environmental impact assessment.
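As a hedged illustration of the buffer-zone example above, the sketch below assigns a location to the smallest buffer ring around a point source that contains it. It covers only the point case (buffers around lines or areas need real geometry handling), and the names `buffer_ring`, `site` and `rings`, along with the distances, are hypothetical.

```python
import math

def buffer_ring(source, location, ring_distances):
    """Return the smallest buffer distance whose zone contains location, or None."""
    dist = math.hypot(location[0] - source[0], location[1] - source[1])
    for radius in sorted(ring_distances):
        if dist <= radius:
            return radius
    return None  # location lies outside every buffer zone

# Hypothetical polluted site with buffers at 100, 250 and 500 map units
site = (0.0, 0.0)
rings = [100.0, 250.0, 500.0]
print(buffer_ring(site, (60.0, 80.0), rings))   # 100.0 (exactly on the inner ring)
print(buffer_ring(site, (600.0, 0.0), rings))   # None
```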
Bundled GIS: A reference to the way software and hardware are sold together or separately. Some GIS vendors offer a bundled package of hardware and software, at a discount negotiated with the software developers and hardware manufacturers.
Byte: A computer-memory and data storage unit composed of contiguous bits, usually 8. File sizes are measured in bytes or megabytes (one million bytes). A byte of 8 bits holds values of 0 to 255, and a collection of bytes (often 4 or 8 bytes) represents real numbers and integers larger than 255.
Cadastre: A cadastral survey involves the mapping, tracing and recording of private and public land resources. The term cadastre is French in origin, meaning a record of the ownership, extent and market value of a property, for tax and legal purposes.
Cartesian coordinate system: A two-dimensional, planar coordinate system in which x and y represent distances from a point of origin, and where each point on the plane is defined by an x,y coordinate. Locations in the coordinate system can be established using any unit of measurement, such as metres or yards. Relative measures of distance, area, and direction are constant throughout a Cartesian coordinate plane.
Cartography: The geographical discipline concerned with map preparation and communicating geographical information.
Cell: The basic element of spatial information in the raster (grid) description of spatial features. See Pixel.
Centroid: The centre of a polygon. In the case of an irregularly shaped polygon, the centroid is derived mathematically and is roughly the equivalent of its centre of gravity.
Character: An alphanumerical (alphabetical or numerical) value that represents a single unit of data.
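The Centroid entry above notes that the centroid of an irregular polygon is derived mathematically. One standard way to do so is the area-weighted (shoelace) formula, sketched below for a simple, non-self-intersecting polygon. The function name and the L-shaped example are illustrative assumptions, not part of the glossary.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0  # twice the signed area of the polygon
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)

# Hypothetical L-shaped parcel
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(polygon_centroid(l_shape))
```

For strongly concave shapes the centroid can fall outside the polygon, which is why some systems offer a separate "point inside polygon" label position.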
GIS focuses on data with dimensionality or, more specifically, data that can be tied to spatial locations. We can, however, deal with any data that can be stored in a relational database or to which statistical algorithms can be applied, whether spatial, time-series or tabular. Are there any events or workshops about GIS? throughout the world to highlight the progress the integrated systems are making in dealing with global issues.
Choropleth map: A map in which areas of different value are separated by clearly defined boundaries. The values of the underlying data (e.g. soil moisture) are represented by colour or shading densities, and the map legend acts as a look-up key to explain the values shown on the map.
Clip: A polygon which defines the boundaries of features in a map, by cutting off the lines at its edge. Clips are used to restrict the extent of data processing or querying done in GIS. See Breakline.
Column: The vertical dimension of a table, representing the variables of the geographical features listed in the rows.
Command line interface: The software window which allows the GIS operator to type in commands at a prompt, rather than use a Windows-based system. Command line instructions are faster to execute, but typing errors can cause delays. Most modern GIS are Windows-based, but would also allow the user to work from a command line interface.
This type of access regime allows only one user at a time to change the content of the database (i.e. have write-access), while other users will have read-only access. The next user wanting write-access to the database will have to wait until the first person has completed their transaction. All database changes would be logged, showing the time of transaction and the name of the operator.
Conditional operators are used to query a database.
Connectivity: A topological property relating to how geographical features relate to one another spatially. Lines that share a common node are said to be topologically linked. Connectivity is useful in network analysis.
Contiguity: The topological definition of adjacent polygons, by recording the left and right polygons of each edge of the enclosed polygon.
Conversion: The process of transforming data derived from existing records and maps to a digital database, or from one digital form to another.
Input begins at a point, moves along a given bearing for a set distance, and continues in the same fashion until the geographical feature (such as a pond) is completely outlined. Unlike the error-prone process of tablet digitisation, COGO data entry generally establishes more accurate locations and boundaries.
Coordinate: A set of numbers that designate a location in a given geographical reference system, such as x,y in a two-dimensional coordinate system or x,y,z in three-dimensional terrain models.
Coordinate system: A reference system used to measure horizontal and vertical distances on a planimetric (two-dimensional) map.
A coverage represents a single theme such as vegetation type, population density or property boundaries.
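The bearing-and-distance style of COGO data entry described above can be sketched as a simple traverse: start at a known point and convert each (bearing, distance) leg into the next vertex. The sketch assumes bearings in degrees measured clockwise from north, with x as easting and y as northing; the function name `traverse` and the example legs are hypothetical.

```python
import math

def traverse(start, legs):
    """Follow (bearing_degrees, distance) legs from start; return all vertices."""
    x, y = start
    points = [(x, y)]
    for bearing, distance in legs:
        angle = math.radians(bearing)
        x += distance * math.sin(angle)  # easting component
        y += distance * math.cos(angle)  # northing component
        points.append((x, y))
    return points

# Hypothetical rectangular pond: east 50, south 30, west 50, north 30
pond = traverse((100.0, 200.0), [(90, 50), (180, 30), (270, 50), (0, 30)])
closure_error = math.hypot(pond[-1][0] - pond[0][0], pond[-1][1] - pond[0][1])
print(closure_error < 1e-9)  # True: the outline closes on its starting point
```

In practice the closure error of a surveyed traverse is rarely zero, and checking it is a common way to detect entry mistakes.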
Dangle: A line with one or both ends not connected to any other feature in the map. A dangle occurs when the end of the line undershoots or overshoots the map feature to which it is meant to be attached.
Data integrity: Generally speaking, the term data integrity is used to refer to the relevance of the data kept in a database. For example, the presence of characters in a column that is meant to hold only numbers indicates poor data integrity.
Data model: A user-defined, abstract representation of data describing the behaviour of the geographical entities represented by the data. For example, the terrain of a geographical location can be represented with an x,y,z data model known as a Digital Terrain Model (DTM), where x and y represent the horizontal plane and z represents spot heights at the respective x,y coordinates.
Data type: The quantitative or alphanumeric characteristics of variables that define the type of data used. Examples include character, integer and floating point (numbers with decimals).
Database: A collection of organised information, usually stored on magnetic tape or disk. A GIS database includes data about the location and the attributes of geographical features that have been coded as points, lines, polygons or grid cells (pixels).
Datum: The singular of data. However, the term datum is often used in GIS circles to refer to geodetic datum. See Geodetic datum.
Also known as Digital Elevation Model. See Data model.
Digitisation: The process of converting analogue map data into digital codes stored and processed by computers. Digitising involves tracing map features into a computer using a digitising tablet or mouse.
Directory: A set of data files stored on computer disk. Operating systems use directories and subdirectories to organise data. Under Windows-based systems, directories are visually represented with a folder icon.
Discrete data: Geographical features with well-defined boundaries (represented by points, lines or polygons).
Dots per inch (DPI): A measure of the resolution of graphic displays and printers, representing the number of pixels per inch.
DXF is widely used for exchanging map files.
Edge matching errors: Sliver polygons often occur when neighbouring map tiles are misaligned as they are laid edge-to-edge. These misalignment errors occur when features run across the boundaries of their respective map sheets. The errors often originate during digitisation, generalisation of map features, or map projection. Warping (rubber-sheet correction) and the automatic elimination of user-defined tiny sliver polygons are two major methods of mitigating edge-matching errors. See Rubber sheeting.
Also known as map entity.
Ethernet: A local-area network (LAN) protocol used for high-speed communication between linked-up computers.
Event: An additional geographical feature occurring on or along a linear feature, such as a route.
Extended character set: Extended character sets support additional languages which require double-byte characters, such as Arabic or Greek.
Facet: A tile or subset database which contains information about one sub-area of the overall digital map. Facets are an effective way of dividing a continuous map into units which can be worked with separately.
File: A collection of related data (textual or graphical) in a computer. Files are the basic units managed by the computer's operating system.
Filter: A grid of weighted pixel values, used to remove superfluous (noise) features, or to generalise spatial patterns from a raster data set.
Format: The method in which data are organised and stored in a computer, for transmission between computers or between a computer and a device. Most GIS have proprietary formats used to store and process geographical information.
Gazetteer: A work of geographical reference that supplies place names and location information.
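To make the Filter entry above concrete, the following sketch passes a 3x3 grid of weights over a small raster and replaces each interior cell with the weighted average of its neighbourhood. Copying border cells unchanged is a simplifying assumption, and `apply_filter`, `mean_kernel` and the sample raster are hypothetical names and values.

```python
def apply_filter(raster, kernel):
    """Apply a 3x3 weighted kernel to interior cells; border cells are copied."""
    total = sum(sum(row) for row in kernel)
    if total == 0:
        raise ValueError("kernel weights must not sum to zero")
    rows, cols = len(raster), len(raster[0])
    out = [row[:] for row in raster]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += raster[r + dr][c + dc] * kernel[dr + 1][dc + 1]
            out[r][c] = acc / total  # normalise by the sum of the weights
    return out

mean_kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # equal weights: mean filter
noisy = [[1, 1, 1], [1, 10, 1], [1, 1, 1]]        # one noisy cell in the centre
smoothed = apply_filter(noisy, mean_kernel)
print(smoothed[1][1])  # 2.0
```

Changing the weights changes the behaviour: a larger centre weight smooths less, while equal weights give the plain neighbourhood mean shown here.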
Generalisation: In its basic form, the process of removing vertices from a line or polygon, according to a pre-specified tolerance level or algorithm.
Geocoding: The process of identifying or designating the coordinates of a geographical object, given its address. Also known as address geocoding.
Geodesy: The study of the size and shape of the Earth, and the determination of exact longitude and latitude positions on it.
Geodetic datum: A three-dimensional ellipsoidal model used to represent the shape of the Earth in a specified region. A geodetic datum is the basis for the geographical coordinate system adopted by a country. A national geodetic datum gives a more accurate representation of the shape of the Earth at local level, because global models are quite generalised.
Geodetic framework: A spatial framework of points whose positions have been precisely determined on the surface of the Earth. Also known as geodetic network.
Geographical Information System (GIS): A computer-based system for capturing, storing, analysing and displaying locational data.
Georeferencing: Establishing the location of a given geographical object, according to an agreed system of map coordinates such as the National Grid.
Measurements from a fourth satellite are required to calculate altitude (height) position.
Instead of issuing commands at a prompt, the user performs the required tasks by using a mouse, or selecting a menu item. See Command line interface.
Gravity model: An analytical technique used in geographical research to analyse the geographical pattern of economic behaviour. The underlying assumption of the model is that the influence of populations on one another is inversely proportional to the distance between them.
Grey scales: Levels of brightness used in displaying information on monochrome display devices, or on non-colour printers.
Grid cell: A unit that represents a single position on an array of equally sized square cells arranged in rows and columns. Each grid cell is referenced by its geographical x,y location. Also known as pixel in raster GIS.
Hardware: The physical components of a computer system, such as the computer and its attached devices: digitisers, plotters, printers, etc.
Heuristic: A computational method that uses trial and error to approximate a solution for computationally complex problems.
Host: A computer to which other (guest) computers are connected. The host computer usually handles complex or time-intensive computing tasks. The guest computers pass requests to the host whenever its services are required.
Hub: A node in a network that can be used to channel goods from origins to destinations. Hubs are used at strategic locations in a network to reduce transportation costs.
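The Gravity model entry above states only that influence is inversely proportional to distance. A minimal sketch of that relationship follows. Multiplying the two populations, the scaling constant `k`, and the `exponent` parameter are assumptions added for illustration; the exponent defaults to 1 to match the definition above, although other formulations raise distance to a higher power.

```python
def gravity_interaction(pop_a, pop_b, distance, k=1.0, exponent=1.0):
    """Estimated interaction between two places (hypothetical formulation)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_a * pop_b / distance ** exponent

# Doubling the distance halves the interaction when exponent is 1
near = gravity_interaction(1000, 2000, 10)
far = gravity_interaction(1000, 2000, 20)
print(near, far)  # 200000.0 100000.0
```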
Image analysis: The processing and interpretation of raster information held in digital form, such as satellite images. An image is usually stored as values which represent the intensity of reflected light, or other values in the electromagnetic spectrum.
Impedance: The amount of resistance (friction or cost) required to traverse a line in a network from its origin node to its destination node. Resistance may be measured as travel distance, time or other obstacles such as road conditions. An optimum path in a network is the path of least resistance.
Infrastructure: A reference to the basic utility structures that support a local economy, such as roads, electricity pylons, water and drain pipes, etc. GIS are used to hold information about these structures, and map out their location.
Input device: A hardware component for data entry, such as a digitiser, keyboard, scanner, mouse, disk drive, etc.
An important US organisation, which has played a major role in setting standards for many engineering and GIS-related applications.
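The Impedance entry above describes the optimum path as the path of least resistance. One common way to find such a path is Dijkstra's algorithm, sketched here under the assumption that every impedance is non-negative. The glossary does not name a method, so this is an illustrative choice, and the `roads` network and its impedance values are hypothetical.

```python
import heapq

def least_impedance_path(network, origin, destination):
    """Dijkstra's algorithm over {node: {neighbour: impedance}}."""
    queue = [(0.0, origin, [origin])]  # (accumulated impedance, node, path so far)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, impedance in network.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + impedance, neighbour, path + [neighbour]))
    return None  # destination cannot be reached from origin

# Hypothetical directed road network; weights are travel times
roads = {"A": {"B": 4, "C": 1}, "B": {"D": 3}, "C": {"B": 2, "D": 7}, "D": {}}
print(least_impedance_path(roads, "A", "D"))  # (6.0, ['A', 'C', 'B', 'D'])
```

The direct-looking route A to B to D costs 7, so the detour through C wins once impedance rather than the number of links is minimised.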
Inter-application communication (IAC): The capability of different computer software systems to communicate with one another. With IAC, several computer programmes can execute commands simultaneously, share data, and make requests of each other.
Interface: A hardware and software link that connects two computer systems, or a computer and its peripherals.
Private and public networks have joined the Internet since then.
Interpolation: Estimating the value of a point from measurements made at surrounding points. A basic example is the estimation of z (height) values of a surface model, using the known z values of surrounding points.
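The Interpolation entry above does not name a particular method. As one illustrative option, the sketch below uses inverse distance weighting (IDW), in which nearer sample points contribute more to the estimated z value. The function name `idw`, the `power` parameter and the sample heights are assumptions made for this example.

```python
import math

def idw(point, samples, power=2.0):
    """Inverse-distance-weighted estimate of z at point from (x, y, z) samples."""
    num = den = 0.0
    for x, y, z in samples:
        dist = math.hypot(x - point[0], y - point[1])
        if dist == 0.0:
            return z  # the point coincides with a sample: use its value directly
        weight = 1.0 / dist ** power
        num += weight * z
        den += weight
    return num / den

# Hypothetical spot heights at the corners of a 2 x 2 square
heights = [(0, 0, 10), (2, 0, 20), (0, 2, 30), (2, 2, 40)]
print(idw((1.0, 1.0), heights))  # close to 25, the mean of four equidistant samples
```

Other interpolators, such as kriging or splines, make different assumptions about the surface, and the choice affects the result wherever sample points are sparse.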
Intersection: The topological integration of two spatial data sets within the area common to both data sets.
Isopleth map: A map showing the distribution of data as lines connecting points of equal value.
Item indexing: A means of accelerating logical queries by creating an index for key terms in the database table. The benefits of this approach are similar to finding information quickly in a book by consulting its index.
See Cadastre.
Latitude: Angular distance, expressed in degrees, along a parallel north or south of the Equator. Also known as parallel.
Latitude-longitude: A global coordinate system used to measure locations on the Earth's surface. Latitude and longitude are angles measured from the Earth's centre to locations on the Earth's surface. Latitude measures angles in a north-south direction. Longitude measures angles in an east-west direction.
Lattice: A surface representation with a rectangular array of grid points spaced at constant sampling intervals in the x and y directions.
Layer: A thematic plane of GIS features containing geographically and logically related data, such as vegetation types. Layers are the basic components of overlay operations in GIS.
Least-cost path: The path between two points which has the lowest travel cost, where cost is a function of time, distance, or other impedance factors.
Legend: The part of the drawn map explaining the meaning of the symbols and colours used to encode the geographical elements. Also known as map key.
Line: A basic geographical element, defined by two or more points with known x,y coordinates. Examples include motorways, streams and cable paths.
The networked computers on a LAN can share data and peripheral devices, such as magnetic storage devices, printers and plotters.
Also known as Boolean operator.
Longitude: The angular distance east or west from a standard meridian (such as the Greenwich line), expressed in degrees.
Look-up table: A set of data values that can be accessed by a computer programme to convert data from one form to another (for example, from numerical values to colours or symbols).
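The Look-up table entry above can be illustrated with a small sketch that converts numerical raster values into colour names through a dictionary. The class codes, colours, and the `classify` helper are hypothetical and stand in for whatever coding scheme a real data set uses.

```python
# Hypothetical land-cover class codes mapped to display colours
COLOUR_LUT = {1: "blue", 2: "green", 3: "brown"}

def classify(raster, lut, default="grey"):
    """Convert each cell value to its look-up entry; unknown codes get default."""
    return [[lut.get(value, default) for value in row] for row in raster]

print(classify([[1, 2], [3, 9]], COLOUR_LUT))
# [['blue', 'green'], ['brown', 'grey']]
```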
Macro: A script (text file) containing a sequence of commands that can be executed as one command. Macros can be built to perform frequently used or complex operations.
Many-to-one relate: A relational database arrangement in which many records in a table are related to a single record in another table.
Map generalisation: The process of reducing detail on a map as a consequence of reducing the map scale.