Understanding the Classification of Data, Its Classes and Methods in Relation to GIS


Data classification, also known as data classing or selection of intervals, is the process by which a set of interval or ratio data is divided into a small number of classes or categories. Such classification is necessary for the construction of classed choropleth maps, in which a range of different colors or shadings is used to depict the set of data classes. The selection of intervals so strongly influences the apparent information content of a map that knowing how to choose appropriate class intervals is a necessary skill for any GIS user.


Number of Classes

While there is some disagreement as to the precise number, there is general agreement that human cognition limits our ability to visually discriminate more than 10 or 11 different colors or tint shadings in a single map. Most cartographers suggest that no more than seven classes be used. The actual number of classes chosen depends not only on the color used to symbolize the data (far fewer tints of yellow than of blue can be told apart, for example) but also on various characteristics of the data and the map context, including the skill of the map reader, the distribution of the data, and the precision with which class discrimination is needed.

Methods

Data classification begins by organizing the set of data in order by value and possibly by summarizing the data with a distribution graph. Class breaks are then inserted at values along this ordered set by one of many different methods. Evans has outlined a generic classification of class-interval systems that suggests a very large number of possible methods. However, most commercial GIS include a small number of methods within their mapping functionality. The most common systems are as follows:


Equal-interval

Divide the range of data values by the number of classes desired to produce a set of class intervals that are equally spread across the data range. For example, if the data have a range of 1 to 99 and five classes are desired, then class breaks could be created at roughly 20, 40, 60 and 80 (the exact equal-width breaks are 20.6, 40.2, 59.8 and 79.4).
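The equal-interval arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the routine of any particular GIS package; the function name `equal_interval_breaks` and the sample values are assumptions made for the example.

```python
def equal_interval_breaks(values, n_classes):
    """Return the n_classes - 1 interior break values for equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes  # every class spans the same width
    return [lo + i * width for i in range(1, n_classes)]


# Hypothetical data spanning 1 to 99, split into five classes.
sample = [1, 12, 35, 47, 60, 88, 99]
breaks = equal_interval_breaks(sample, 5)
print([round(b, 1) for b in breaks])  # [20.6, 40.2, 59.8, 79.4]
```

Only the minimum and maximum affect the result, so classes can end up empty when the data are strongly skewed.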

Quantiles

Divide the number of data values evenly into the number of classes that have been chosen. Thus, if there are to be five classes, each class will contain 20% of the observations.
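A rank-based sketch of this equal-count approach is shown below. It assumes that tied values may be split across neighboring classes, which real GIS software may handle differently; the function name `equal_count_classes` and the sample data are hypothetical.

```python
from collections import Counter


def equal_count_classes(values, n_classes):
    """Assign each value a class index (0 .. n_classes - 1) by its rank,
    so that every class receives the same share of the observations."""
    n = len(values)
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, i in enumerate(order):
        classes[i] = rank * n_classes // n  # integer share of the ranking
    return classes


# Hypothetical, strongly skewed data: ten observations, five classes.
sample = [3, 7, 8, 12, 15, 21, 40, 55, 90, 300]
assigned = equal_count_classes(sample, 5)
print(assigned)           # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
print(Counter(assigned))  # each class holds 2 of the 10 observations (20%)
```

Note that the outlier 300 shares a class with 90: equal counts are preserved even though the class widths differ greatly.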

Standard deviation


Calculate the mean and standard deviation of the data set and then classify each value by the number of standard deviations it is away from the mean. Often, data classed by this method will have five classes (greater than 2, between 1 and 2, between 1 and –1, between –1 and –2, and less than –2 standard deviations) and will be shaded using two different color ranges (e.g., dark blue, light blue, white, light red, and dark red, respectively).
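The five-class scheme can be sketched as follows. Two choices here are assumptions, since the text does not specify them: the population standard deviation (`statistics.pstdev`) is used, and values lying exactly on a boundary are placed in the class nearer the mean. The names `std_dev_class` and `LABELS` are hypothetical.

```python
import statistics

LABELS = ["below -2 SD", "-2 to -1 SD", "-1 to +1 SD", "+1 to +2 SD", "above +2 SD"]


def std_dev_class(value, mean, sd):
    """Return a class index 0..4 based on how many standard deviations
    the value lies from the mean (boundary handling is an assumption)."""
    z = (value - mean) / sd
    if z < -2:
        return 0
    if z < -1:
        return 1
    if z <= 1:
        return 2
    if z <= 2:
        return 3
    return 4


# Hypothetical data set with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(data)
sd = statistics.pstdev(data)

for v in (0, 2.5, 5, 8, 10):
    print(v, LABELS[std_dev_class(v, mean, sd)])
```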

Natural breaks (Jenks’ method)

Classes are based on natural groupings inherent in the data. Jenks’ method identifies the breaks that minimize the amount of variance within groups of data and maximize variance between them.
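The optimization goal can be illustrated with a brute-force search over every possible set of breaks in the sorted data. This is not Jenks' own procedure or the algorithm used by GIS packages; it is an exhaustive sketch, practical only for very small data sets, that minimizes the within-class sum of squared deviations. Because the total sum of squares is fixed, minimizing the within-class part also maximizes the between-class part. The function names and sample values are hypothetical.

```python
from itertools import combinations


def within_class_ssd(group):
    """Sum of squared deviations of a group's values from the group mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks_exhaustive(values, n_classes):
    """Try every way of cutting the sorted data into n_classes contiguous
    groups and keep the grouping with the smallest within-class spread."""
    data = sorted(values)
    best_groups, best_cost = None, float("inf")
    # Each combination lists the positions where a new class starts.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(within_class_ssd(g) for g in groups)
        if cost < best_cost:
            best_groups, best_cost = groups, cost
    return best_groups


# Hypothetical data with three obvious clusters.
sample = [30, 1, 11, 2, 32, 10, 3, 31, 12]
print(natural_breaks_exhaustive(sample, 3))
# [[1, 2, 3], [10, 11, 12], [30, 31, 32]]
```

The number of candidate groupings grows combinatorially with the size of the data set, which is why production tools rely on more efficient algorithms for the same objective.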

Karen K. Kemp
