
Landuse Mapping Using Self Organizing Map

Authors
Djoko Budiyanto, CSIM AIT, e-mail: f96331@cs.ait.ac.th
R. Sadananda, Professor, CSIM AIT, e-mail: sada@cs.ait.ac.th
Sushil Archaya, Affiliated Faculty, CSIM AIT, e-mail: sushil@ait.ac.th

Abstract

  This paper presents a methodology for landuse mapping. Supervised classification needs a complicated technique to avoid misclassification. We therefore propose the use of the Self Organizing Map (SOM) neural network, an unsupervised classification method, for landuse mapping. The process is divided into two main stages: unsupervised classification and knowledge generation. Unsupervised classification makes the classification process easier. Knowledge generated from the digital image, represented as IF-THEN rules, is used to interpret the image. Our research shows that the SOM neural network is simple to apply yet achieves high performance in landuse mapping.

 Keywords : Knowledge generation, SOM neural network, Landuse mapping, Unsupervised classification.

1. Introduction.

  Landuse mapping, one example of an image processing application, uses classification as its main process. In image processing, classification assigns each pixel a level (class) corresponding to a group with homogeneous characteristics, discriminating multiple objects from each other within the image. Another important step in landuse mapping is interpretation, which communicates the classified map to its users. Both processes have a problem. In classification, the problem is that the data should be classified correctly. In interpretation, knowledge or facts should be integrated in order to give meaning to the map.

The Self Organizing Map (SOM) is an unsupervised learning process whereby the significant patterns or features in the input are discovered [3,4]. The physical result of the SOM process is clustered data. In the neural network field it is called self-organized because the adaptive modification of weights during learning is done by the map itself, according to the input that is applied. The interaction between an input and a neuron affects not only that neuron but also its neighborhood.
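The adaptation described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names `train_som` and `best_matching_unit`, the square neighborhood, the random initialization and the linearly decaying learning rate are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the neuron whose weights are closest to x."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))


def train_som(data, rows, cols, alpha=0.7, radius=1, iterations=750, seed=0):
    """Train a small rectangular SOM and return weights keyed by (row, col).

    The default parameter values are illustrative only; the decay schedule
    and the square neighborhood are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Start every neuron with random weights in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(iterations):
        x = rng.choice(data)
        # Learning rate decays linearly from alpha toward zero (assumed).
        rate = alpha * (1.0 - t / iterations)
        win_r, win_c = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            # The winner and its grid neighbors all move toward the input.
            if max(abs(r - win_r), abs(c - win_c)) <= radius:
                for i in range(dim):
                    w[i] += rate * (x[i] - w[i])
    return weights


# Hypothetical normalized two-band pixel values, for demonstration only.
pixels = [[0.1, 0.2], [0.9, 0.8], [0.15, 0.1], [0.85, 0.95]]
som = train_som(pixels, rows=2, cols=3)
print(best_matching_unit(som, [0.9, 0.9]))
```

After training, each input is assigned to the cluster of its best matching neuron, which is how the map produces clustered data without a training set.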

The SOM neural network is one kind of unsupervised classification technique. Since previous research reports that the SOM neural network performs better than other clustering techniques [5,9,10,11], this study proposes its use in the landuse mapping process.
 

2. Landuse Mapping and its problem

  Two types of classification method are known: supervised and unsupervised. Both have advantages and drawbacks. The advantage of supervised classification is that the classified image is interpretable: we can interpret the image directly after classification is completed. The drawback is that the user must be able to choose satisfactory training areas, which needs a complicated technique [1,6,7]. Otherwise, not only will unclassified data be obtained, but overlapping classes will also occur. For the unsupervised method the advantage and drawback are reversed: the process is easier because no training set has to be chosen, but the interpretation step is difficult.

  Supervised means that the whole classification process is guided by knowledge given as a set of training data. The result is therefore classified data that already carries knowledge. The problem of unsupervised classification arises because the classification is based only on the spectral value, without knowledge. To overcome this problem and guide the interpretation process, knowledge should be added at the interpretation step.

  Since unsupervised classification methods, particularly the SOM neural network, can be used to extract knowledge from databases [8], the use of SOM unsupervised classification has two objectives: an easier classification process and an interpretable classification result. The study methodology for that purpose is shown in Figure 1.

Relating to the methodology, the prediction of clusters within the image is performed by visual interpretation. The image chosen for visual interpretation should be a good image. After geometric correction, visualization on a computer needs only three bands, while the data used (Landsat TM) have six bands; therefore three of the six bands must be chosen. Qualitative analysis [2] is used, based on the Optimum Index Factor (OIF), calculated as follows:
 

    \[
      \mathrm{OIF} = \frac{\sum_{k=1}^{3} S_k}{\sum_{j=1}^{3} \left| r_j \right|}
    \]

    where :

      S_k = standard deviation for band k
      |r_j| = the absolute value of the correlation coefficient between any two of the three bands being evaluated
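The OIF calculation can be sketched as follows: for every combination of three bands, the sum of the three standard deviations is divided by the sum of the absolute pairwise correlation coefficients, and the combination with the largest value is kept for display. The band names and pixel values are hypothetical, the helper names are illustrative, and the use of the sample standard deviation is an assumption.

```python
from itertools import combinations
from math import sqrt


def std_dev(values):
    """Sample standard deviation of a list of pixel values."""
    mean = sum(values) / len(values)
    return sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def correlation(a, b):
    """Pearson correlation coefficient between two bands."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def oif(bands, triple):
    """OIF of three band names; bands maps a name to a flat list of pixels.

    Raises ZeroDivisionError if a band is constant or all correlations are 0.
    """
    total_std = sum(std_dev(bands[name]) for name in triple)
    total_corr = sum(abs(correlation(bands[p], bands[q]))
                     for p, q in combinations(triple, 2))
    return total_std / total_corr


def best_triple(bands):
    """Return the three-band combination with the highest OIF."""
    return max(combinations(sorted(bands), 3), key=lambda t: oif(bands, t))


# Hypothetical pixel values for four bands, for demonstration only.
bands = {
    "b1": [1, 2, 3, 4],
    "b2": [2, 4, 6, 8],
    "b3": [4, 3, 2, 1],
    "b4": [1, 3, 2, 4],
}
print(best_triple(bands))
```

A high OIF favors bands with large variance and low mutual correlation, so the selected composite shows as much independent information as possible.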
 

 

 Figure 1. The study methodology
 

3. Generating Remotely Sensed Knowledge Base for Interpretation

  Interpretation is a problem frequently encountered when an unsupervised method is applied. It arises because the classification is based only on a spectral value, and this spectral value is only the average of the electromagnetic energy reflected from a certain area. The size of that area depends on the spatial resolution of the satellite; for example, Landsat TM has a pixel size of 30 m x 30 m. Interpretation also depends on factors other than the spectral value. For example, a spectral value associated with a high vegetation component and a fine texture could be interpreted as paddy field in an irrigated area, but as sugar cane in a non-irrigated area. Although the classification process is unsupervised, supervision is needed during image interpretation.

Based on this explanation, there is a need to generate knowledge that will be useful for supervised image interpretation. Generally, unsupervised classification models can be used to generate knowledge or rules from databases. Knowledge or rules can be generated using SOM neural network clustering together with other data that can be obtained easily, such as contour maps and field observation data. The generation process using the SOM neural network is shown in Figure 2.

 

Figure 2. Phase in knowledge generation in databases
 

The main process of knowledge generation begins after the clustering process. Figure 3 shows examples of databases, their clusters and their domains; each cluster contains data with similar characteristics.

Knowledge is generated by generalizing the relationship between some facts and other facts. In this research, knowledge generation was done by inferencing the result of clustering with an input vector and other information. In Figure 3, suppose the result of clustering image database 1 is the set of clusters C1..Cn in domain 1, and other classified information, represented by database 2, is the set of regions R1..Rn in domain 2. Let C1..Cn and R1..Rn be logically connected. Both C1..Cn and R1..Rn can then be inferenced to generate knowledge: by inferencing cluster Ci, region Ri and other information, the knowledge can be generated using the induction method.

Figure 3. Two databases clustered according to their domains

  Since the result of the clustering process is homogeneous, the characteristics of all members of a cluster are similar. Therefore, one cluster member should be able to represent all members of the cluster. By limiting the use of the cluster to a particular task, the relation between one member of the cluster and another cluster will also be able to represent all of the cluster members. This means that generalization can be used to generate the knowledge [ ].

  The detailed process of generating knowledge by the induction method is as follows. First, the clustered image, the classified map (based on the condition) and the field observation are overlaid. Then, at certain coordinates, one pixel from each cluster is taken as a sample pixel. The positions of the sampled pixels should be able to represent the cluster in all conditions. This method is based on the assumption that objects are similar if their spectral values and conditions are also similar. Second, the knowledge base resulting from the sampled pixels is generalized, so that the rules can be used in similar situations.

  The altitude map is chosen because it can be used to derive information on:

  • plain or inclined land
  • irrigated or non-irrigated land
  • altitude
The altitude map is classified manually for the knowledge generation process. Based on the field observation and the characteristics of the area, the altitude map is divided into 3 (three) types of region:
  • R1, the region relating to plain, irrigated, low altitude area.
  • R2, the region relating to inclined, non-irrigated area.
  • R3, the region relating to mixed area, which is plain and inclined, non-irrigated and high altitude.
For example, consider the clustered image resulting from SOM training with alpha 0.7, neighborhood size 1 and 750 iterations for stabilization. To perform knowledge generation, pixel samples are taken from the clustered image. The important consideration when taking pixel samples is that their positions together should represent one cluster in all conditions. For example, four pixels are taken, with coordinates (24,78), (87,168), (139,199) and (255,238). The values of the classified map are taken at the same coordinates. The known facts are obtained from the field observation and the map. After the images are overlaid, all the facts can be written as follows:
  • Known facts :
    • a. Cluster 1 is a cluster related to the vegetation object.
    • b. Region 1 is irrigated, plain and low altitude area.
    • c. Region 2 is inclined, non-irrigated area.
    • d. Region 3 is a mix of plain and inclined, high altitude area.
  • The facts from the clustered and classified images :
    • e. The pixels with coordinates (24,78), (87,168), (139,199), (255,238) are members of cluster 1.
    • f. The pixel with coordinate (24,78) is a member of region 1.
    • g. The pixel with coordinate (87,168) is a member of region 2.
    • h. The pixel with coordinate (139,199) is a member of region 2.
    • i. The pixel with coordinate (255,238) is a member of region 3.
  • The facts from the field observation and map :
    • j. The pixel with coordinate (24,78) is paddy field.
    • k. The pixel with coordinate (87,168) is sugar cane.
    • l. The pixel with coordinate (139,199) is forest.
    • m. The pixel with coordinate (255,238) is sugar cane.
By combining facts a, b, e, f and j in an IF-THEN relationship, we can derive:

    IF (pixel with coordinate (24,78)) THEN object is paddy field

The known facts are:

  • The pixel with coordinate (24,78) is a member of cluster 1 and a member of region 1.
  • The characteristics of all members of cluster 1 are similar.
  • The characteristics of all members of region 1 are similar.
The knowledge induced is:
    IF ( [pixel (24,78) is a member of cluster 1] and
    [pixel (24,78) is a member of region 1] ) THEN object is paddy field
The generalization of this knowledge can be written as:
    IF ( [for any pixel (x,y), pixel (x,y) is a member of cluster 1] and
    [for any pixel (x,y), pixel (x,y) is a member of region 1] ) THEN object is paddy field.
Using a similar procedure, from a, c, e, g and k, knowledge can be generated as:
    IF ( [for any pixel (x,y), pixel (x,y) is a member of cluster 1] and
    [for any pixel (x,y), pixel (x,y) is a member of region 2] ) THEN object is sugar cane.
The two rules can be written as follows:
    IF ( cluster 1 and region 1) THEN object is paddy field.
    IF ( cluster 1 and region 2) THEN object is sugar cane.
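The induction step for the two sample pixels worked through above can be sketched as a lookup that generalizes per-pixel facts into (cluster, region) rules. The dictionary layout and the function name `induce_rules` are assumptions made for illustration; each rule stores a set of objects so that conflicting observations for the same cluster and region would remain visible.

```python
# Facts for the two sample pixels used in the worked example above.
cluster_of = {(24, 78): 1, (87, 168): 1}          # from the clustered image
region_of = {(24, 78): 1, (87, 168): 2}           # from the classified map
observed = {(24, 78): "paddy field",              # from field observation
            (87, 168): "sugar cane"}


def induce_rules(cluster_of, region_of, observed):
    """Generalize per-pixel facts into (cluster, region) -> set of objects."""
    rules = {}
    for xy, label in observed.items():
        key = (cluster_of[xy], region_of[xy])
        rules.setdefault(key, set()).add(label)
    return rules


rules = induce_rules(cluster_of, region_of, observed)
for (cluster, region), labels in sorted(rules.items()):
    objects = " or ".join(sorted(labels))
    print(f"IF (cluster {cluster} and region {region}) THEN object is {objects}")
```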
By a similar process we can generate further knowledge. The complete set of knowledge generated for the 8 clusters resulting from the SOM neural network can be written as follows:
    IF cluster 6 and region 1 THEN object is paddy field
    IF cluster 6 and region 2 THEN object is sugar cane
    IF cluster 6 and region 3 THEN object is dense forest
    IF cluster 2 and region 1 THEN object is urban (settlement)
    IF cluster 2 and region 2 THEN object is cassava or mixed crop
    IF cluster 2 and region 3 THEN object is bare soil
    IF cluster 3 and region 1 THEN object is land preparation
    IF cluster 3 and region 2 THEN object is cassava (non irrigated area)
    IF cluster 3 and region 3 THEN object is bare soil
    IF cluster 4 and region 1 THEN object is urban
    IF cluster 4 and region 2 THEN object is urban + orchard (banana crop)
    IF cluster 4 and region 3 THEN object is urban + orchard (mango etc)
    IF cluster 0 and region 1 THEN object is river or water body
    IF cluster 0 and region 2 THEN object is forest
    IF cluster 0 and region 3 THEN object is forest
    IF cluster 8 and region 2 THEN object is orchard
    IF cluster 8 and region 3 THEN object is orchard
    IF cluster 8 and region 2 THEN object is orchard
    IF cluster 7 and region 3 THEN object is orchard (more canopy and high)
    IF cluster 7 and region 2 THEN object is orchard (more canopy and high)
    IF cluster 7 and region 3 THEN object is orchard (more canopy and high)
    IF cluster 9 and region 1 THEN object is orchard
    IF cluster 9 and region 2 THEN object is spreads forest
    IF cluster 9 and region 3 THEN object is spreads forest

Application of this rule set is limited to the clustered image, or to other clustered images with similar spatial characteristics. For example, the rules can be used in another part of Yogyakarta city but cannot be applied to a city in Europe. To generate knowledge, as much classified information as possible can be used. The rule of thumb is that more information gives better knowledge. Therefore the IF-THEN model can be generalized as:
 

    IF ( Cx and Ry ............ and Rz ) THEN An

    where :

    Cx is data related to cluster Cx
    Ry is data related to region Ry
    Rz is data related to region Rz

An is other information, inferred from cluster Cx and regions Ry..Rz by using coordinate position.

Cx, Ry..Rz and An are connected and inferenced through coordinate position.

For example, if the type of vegetation can be connected logically with its altitude and soil type, then the clustered image can be inferenced with a classified altitude map and a classified soil map.

Each resulting cluster can be interpreted using the generated knowledge. The pixels are taken one by one, and the rules are applied to each pixel. In this way each pixel in the image can be identified. After the interpretation process, the landuse map is created.
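The pixel-by-pixel interpretation can be sketched as a table lookup over two aligned rasters, one holding cluster numbers and one holding region numbers. Only a few of the generated rules are reproduced here; the function name `interpret`, the nested-list raster layout and the "unclassified" fallback for pairs without a rule are assumptions made for illustration.

```python
# A subset of the generated rules: (cluster, region) -> object.
RULES = {
    (6, 1): "paddy field",
    (6, 2): "sugar cane",
    (2, 3): "bare soil",
    (0, 1): "river or water body",
}


def interpret(cluster_map, region_map, rules, unknown="unclassified"):
    """Apply the IF-THEN rules to every pixel of two aligned rasters."""
    return [
        [rules.get((cluster, region), unknown)
         for cluster, region in zip(cluster_row, region_row)]
        for cluster_row, region_row in zip(cluster_map, region_map)
    ]


# Hypothetical 2x2 rasters, for demonstration only.
cluster_map = [[6, 6], [0, 2]]
region_map = [[1, 2], [1, 3]]
for row in interpret(cluster_map, region_map, RULES):
    print(row)
```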

4. Accuracy Assessment

Accuracy assessment can be used to measure the accuracy of the clustering process and the correctness of the generated knowledge. The overall accuracy results are shown in Table 1 below:

Table 1. Overall accuracy

No   Classification method and its parameters      Overall accuracy
1    SOM alpha 0.3 neigh 3, network size 6x10      77.8403 %
2    SOM alpha 0.3 neigh 2, network size 6x10      72.1596 %
3    SOM alpha 0.3 neigh 1, network size 6x10      65.0704 %
4    SOM alpha 0.5 neigh 3, network size 6x10      80.8450 %
5    SOM alpha 0.5 neigh 2, network size 6x10      70.1877 %
6    SOM alpha 0.5 neigh 1, network size 6x10      66.2441 %
7    SOM alpha 0.7 neigh 3, network size 6x10      66.4319 %
8    SOM alpha 0.7 neigh 2, network size 6x10      56.3849 %
9    SOM alpha 0.7 neigh 1, network size 6x10      74.1314 %
10   ISODATA conv. threshold 50%                   46.8544 %
11   ISODATA conv. threshold 75%                   73.0046 %
12   ISODATA conv. threshold 90%                   75.6807 %
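Overall accuracy is commonly computed as the percentage of reference pixels whose assigned class matches the reference class. Assuming that definition applies to Table 1, a minimal sketch is given below; the function name and the sample labels are hypothetical.

```python
def overall_accuracy(predicted, reference):
    """Percentage of pixels whose predicted class equals the reference class."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


# Hypothetical labels for four check pixels, for demonstration only.
predicted = ["paddy field", "sugar cane", "forest", "urban"]
reference = ["paddy field", "sugar cane", "orchard", "urban"]
print(f"{overall_accuracy(predicted, reference):.4f} %")
```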

 

Another comparison between ISODATA and the SOM neural network is the ability to capture features within the image. The SOM process is based on pattern matching on normalized data. Normalization is useful to avoid domination by large data values; in this research the effect of normalization is shown in a unique and small cluster (the river feature). SOM captures this feature better than ISODATA. This phenomenon is also reflected in the table of cluster calculation results, where some SOM clustering results have clusters with only a few members. Viewing the neuron weights as cluster centers, the adaptation process of the SOM neural network is simpler than that of the ISODATA method. SOM uses the Euclidean distance between each input and the neuron weights to refine the neuron weights (cluster centers), while ISODATA calculates the mean and standard deviation.

5. Conclusion

Interpretation is a problem that can be solved by knowledge generation. The knowledge needed can be generated by inferencing the clustered image with other connected classified information, and represented using the IF-THEN rule model.

By using the SOM neural network, the assessed accuracy can be increased without adding a complex method. SOM also has a better ability than the ISODATA method to capture small and unique features within the image.

6. References

[1] Kershaw, C.D., and Fuller, R.M., 1992. Statistical Problem in the Discrimination of Landcover from Satellite Images: a Case Study in Lowland Britain. International Journal of Remote Sensing, Vol. 13, No. 16, pp. 3805-3104.

[2] Jensen, J.R., 1986. Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice Hall, Englewood Cliffs, New Jersey.

[3] Kohonen, Teuvo, 1990. The Self Organizing Map. Proceedings of the IEEE, Vol. 78, No. 9, September 1990.

[4] Kohonen, Teuvo, 1989. Self Organizing Maps and Associative Memory. Springer-Verlag, Heidelberg, Germany.

[5] Mangiameli, P., Chen, S.K., and West, David, 1996. A Comparison of SOM Neural Network and Hierarchical Clustering Methods. European Journal of Operational Research, 93 (1996), pp. 402-417.

[6] Murrai, Sunji (editor), 1993. Remote Sensing Note. Japan Association on Remote Sensing.

[7] Richards, 1993. Remote Sensing Digital Image Analysis. Springer-Verlag, Berlin, Heidelberg, New York, London, Paris, Tokyo.

[8] Sushils Acharya, Sadananda R., Ranjit Murti R., 1995. Knowledge Extraction from Socio-economic Databases Using Self Organizing Maps. Asian Institute of Technology, Bangkok.

[9] Yoshida, T., and Omatu, Sigeru. Neural Network Approach to Land Cover Mapping. IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 5, September.

[10] Wan, W., and Fraser, D. M2dSOMAP: Clustering and Classification of Remotely Sensed Imagery by Combining Multiple Kohonen Self Organizing Maps and Associative Memory. Proceedings of the 1993 International Joint Conference on Neural Networks.

[11] Zhang, X., and Li, Yanda, 1993. Self-Organizing Map as a New Method for Clustering and Data Analysis. Proceedings of the 1993 International Joint Conference on Neural Networks.

 


© Copyright 1995-2002 by:   ESS   Environmental Software and Services GmbH AUSTRIA