I.plr.py

From GRASS-Wiki
Revision as of 23:17, 2 August 2010 by MilenaN (added category)

Summary

Probabilistic Label Relaxation is a post-classification algorithm. It is based on Bayes' theorem, using conditional probabilities to further improve the results of a classification that was carried out with class membership probabilities (for example, maximum likelihood).
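The core update can be illustrated on a single toy pixel. This is a minimal sketch, not code from the module: the `compat` matrix is a hypothetical stand-in for the module's getProbability() function, and the class counts and probabilities are invented for illustration.

```python
import numpy as np

# Hypothetical class-compatibility matrix for 3 classes
# (plays the role of getProbability(a, b) in the module)
compat = np.array([[1.0, 0.5, 0.5],
                   [0.5, 1.0, 0.5],
                   [0.5, 0.5, 1.0]])

p_center = np.array([0.5, 0.3, 0.2])        # current class probabilities of the pixel
neighbours = np.array([[0.8, 0.1, 0.1],     # class probabilities of the 4 neighbours
                       [0.7, 0.2, 0.1],
                       [0.6, 0.3, 0.1],
                       [0.9, 0.05, 0.05]])

# neighbourhood support q(c): sum over all neighbours and all classes c'
# of compat(c, c') * p_neighbour(c')
q = np.array([(compat[c] * neighbours).sum() for c in range(3)])

# relaxation update: multiply by the support and renormalize
p_new = p_center * q
p_new /= p_new.sum()
```

Since the neighbours strongly favour class 1, its probability at the centre pixel increases after one update; iterating this step smooths out isolated misclassified pixels.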

See Documentation for details.

Example

Create a group and subgroup using Landsat 7 data from the North Carolina dataset

i.group lsat7_2002 in=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_61,lsat7_2002_62,lsat7_2002_70,lsat7_2002_80
i.group lsat7_2002 sub=lsat7_2002_multi in=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_61,lsat7_2002_62,lsat7_2002_70

Create a signature file with 10 classes via cluster analysis

i.cluster group=lsat7_2002 sub=lsat7_2002_multi sig=cluster_10 class=10

i.plr.py help page

Description:
 Probabilistic label relaxation, postclassification filter

Usage:
 i.plr.py group=string subgroup=string sigfile=string output=string
   [iterations=value] [ntype=value] [--verbose] [--quiet] 

Flags:
 --v   Verbose module output
 --q   Quiet module output

Parameters:
       group   image group to be used
    subgroup   image subgroup to be used
     sigfile   Path to sigfile
      output   Name for resulting raster file
  iterations   Number of iterations (1 by default)
       ntype   type of neighbourhood (4(default) or 8)

Example using 8-neighbourhood, 5 iterations

i.plr.py group=lsat7_2002 sub=lsat7_2002_multi sig=cluster_10 out=lsat7_2002_plr it=5 nt=8

Module output

These results were created with Landsat 7 data from the North Carolina dataset, using a signature file of 10 classes generated by i.cluster: a) i.maxlik; b) i.plr.py, 4-neighbourhood, 1 iteration; c) i.plr.py, 4-neighbourhood, 5 iterations; d) i.plr.py, 8-neighbourhood, 1 iteration; e) i.plr.py, 8-neighbourhood, 5 iterations

Known Issues

  • Very slow: the code uses several nested Python loops; performance would likely be much better if written in C.
  • Probabilities are extracted by running i.maxlik once for every class. This could be avoided by changing the i.maxlik module so that the probabilities used during classification can be saved directly.
  • At this point the implementation of the getProbability(x,y) function is only exemplary: it returns 1.0 if the two given classes are identical and 0.5 in every other case. Even though this approach already shows some effect on classification results, it is also the place to start optimization. One option would be an automatically generated lookup table, based on the total number of pixels in each class. A lookup table could also be filled from user input, but since N² pairwise entries are needed for a classification with N classes, this is only practicable for small numbers of classes. A default value could be used to restrict the user input to the important entries.
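A minimal sketch of such an automatically generated lookup table, assuming an existing classified map is available as a 2D array of class ids 1..N (all names are illustrative, not part of the module). Compatibilities are estimated from how often two classes occur next to each other as 4-neighbours, then normalized per class:

```python
import numpy as np

def build_compatibility_table(classified, n_classes):
    """Estimate pairwise class compatibilities from neighbour co-occurrence."""
    counts = np.zeros((n_classes, n_classes))
    # count vertical and horizontal 4-neighbour pairs
    for a, b in ((classified[:-1, :], classified[1:, :]),
                 (classified[:, :-1], classified[:, 1:])):
        for i, j in zip(a.ravel(), b.ravel()):
            counts[i - 1, j - 1] += 1
            counts[j - 1, i - 1] += 1
    # normalize each row so a class' compatibilities sum to 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1
    return counts / row_sums
```

getProbability(a, b) could then simply return table[a - 1, b - 1]; only the N² entries of the pairwise table are needed, which stays manageable even for larger class counts.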


Code

#!/usr/bin/env python
#
#############################################################################
#
# MODULE:      i.plr.py
# AUTHOR(S):   Georg Kaspar
# PURPOSE:     Probabilistic label relaxation, postclassification filter
# COPYRIGHT:   (C) 2009
#
#              This program is free software under the GNU General Public
#              License (>=v2). Read the file COPYING that comes with GRASS
#              for details.
#
#############################################################################

#%Module
#% description: Probabilistic label relaxation, postclassification filter
#%End
#%option
#% key: group
#% type: string
#% description: image group to be used
#% required : yes
#%end
#%option
#% key: subgroup
#% type: string
#% description: image subgroup to be used
#% required : yes
#%end
#%option
#% key: sigfile
#% type: string
#% description: Path to sigfile
#% required : yes
#%end
#%option
#% key: output
#% type: string
#% description: Name for resulting raster file
#% required : yes
#%end
#%option
#% key: iterations
#% type: integer
#% description: Number of iterations (1 by default)
#% required : no
#%end
#%option
#% key: ntype
#% type: integer
#% description: type of neighbourhood (4(default) or 8)
#% required : no
#%end

import sys
import os
import numpy
import grass.script as grass
from osgeo import gdal, gdalnumeric, gdal_array
from osgeo.gdalconst import GDT_Byte

def getMetadata():
    env = grass.gisenv()
    global GISDBASE 
    global LOCATION_NAME 
    global MAPSET 
    GISDBASE = env['GISDBASE']
    LOCATION_NAME = env['LOCATION_NAME']
    MAPSET = env['MAPSET']
        
def splitSignatures(path, sigfile):
    # split signature file into subfiles with 1 signature each
    stream_in = open(path + sigfile, "r")
    stream_in.next() # skip first line
    counter = 0
    stream_out = open(path + "plr_foo.sig", "w")
    for line in stream_in: 
        if line[0] == "#":
            stream_out.close()
            counter += 1     
            stream_out = open(path + "plr_" + str(counter) + ".sig", "w")
            stream_out.write("#produced by i.plr\n")
        stream_out.write(line)
    stream_out.close()
    stream_in.close()
    return counter
    
def normalizeProbabilities(counter):
    arg = ""
    for i in range(counter):
        arg = arg + "+plr_rej_" + str(i)
    arg = arg.strip('+')
    print "calculating multiplicands, arg=" + arg
    grass.run_command("r.mapcalc", multiplicand = "1./(" + arg + ")")
    for i in range(counter):
        print "normalizing probabilities for class " + str(i)
        grass.run_command("r.mapcalc", plr_rej_norm = "plr_rej_" + str(i) + "*multiplicand")
        grass.run_command("g.rename", rast="plr_rej_norm,plr_rej_norm_" + str(i))
        
def getProbability(a,b):
    # TODO: Implement this!!!
    if a == b:
        return 1
    else:
        return 0.5
    
def cleanUp(path):
    os.system("rm " + path + "/plr_*.*")
    os.system("rm /tmp/plr_*.*")
    grass.run_command("g.mremove", flags="f", quiet=True, rast="plr_*")
    
def plr_filter(probabilities, width, height, classes, type):
    # create an empty n-dimensional array for the results
    results = numpy.ones((classes,height,width))            
    print "starting label relaxation"
    progress = 0
    # for each pixel (except border)
    for y in range(height):
        p = int(float(y)/height * 100)
        if p > progress:
            print '\r' + str(p) + '%'
            progress = p
        for x in range(width):
            # for all classes create neighbourhood and extract probabilities
            for c in range(1, classes+1):
                if (x == 0) or (x == width-1) or (y == 0) or (y == height-1):
                    results[c-1,y,x] = probabilities[c-1,y,x]
                else:
                    if type == 8:
                        q = neighbourhoodFunction8(probabilities, x, y, c, classes)
                    else:
                        q = neighbourhoodFunction4(probabilities, x, y, c, classes)
                    p = probabilities[c-1, y, x]
                    # resulting cell contains the product of class probability and
                    # neighbourhood-function
                    results[c-1,y,x] = p * q         
    print ""
    return results
           
def neighbourhoodFunction4(probabilities, x, y, z, classes):           
    n = []
    neighbours = [[x-1,y],[x,y],[x+1,y],[x,y-1],[x,y+1]]
    for i in neighbours:
        l = []
        # for each possible class
        for c in range(1, classes+1):
            l.append(getProbability(z, c) * float(probabilities[c-1,i[1],i[0]]))
        n.append(sum(l))
    return sum(n)
        
def neighbourhoodFunction8(probabilities, x, y, z, classes):
    n = []    
    for j in range(y-1, y+2):
        for i in range(x-1, x+2):
            l = []
            # for each possible class
            for c in range(1, classes+1):
                l.append(getProbability(z, c) * float(probabilities[c-1,j,i]))
            n.append(sum(l))
    return sum(n)
    
def createMap(probabilities, width, height, classes):
    print "retrieving class labels"
    results = numpy.ones((height,width))
    progress = 0
    for y in range(height):
        p = int(float(y)/height * 100)
        if p > progress:
            print '\r' + str(p) + '%'
            progress = p
        for x in range(width):
            max_class = 1
            max_val = probabilities[0,y,x]
            for c in range(2, classes+1):
                current_val = probabilities[c-1,y,x]
                if current_val > max_val:
                    max_val = current_val
                    max_class = c
            results[y,x] = max_class
    #results = probabilities.max(0)
    return results
    
def export(array, trans, proj):    
    driver = gdal.GetDriverByName('GTiff')    
    out = driver.Create('/tmp/plr_results.tif', array.shape[1], array.shape[0], 1, GDT_Byte)
    out.SetGeoTransform(trans)
    out.SetProjection(proj)
    gdal_array.BandWriteArray(out.GetRasterBand(1), array)


def main():
    # fetch parameters
    group = options['group']
    subgroup = options['subgroup']
    sigfile = options['sigfile']
    output = options['output']
    iterations = options['iterations']
    ntype = options['ntype']
    
    if iterations == "":
        iterations = 1
    iterations = int(iterations)
        
    if ntype == "":
        ntype = 4
    ntype = int(ntype)
    
    # fetch Metadata
    getMetadata()
    
    # split sigfiles
    sigpath = GISDBASE + "/" + LOCATION_NAME + "/" + MAPSET + "/group/" + group + "/subgroup/" + subgroup + "/sig/"
    counter = splitSignatures(sigpath, sigfile)
    print "found " + str(counter) + " signatures"
    
    l = []
    for i in range(1, counter+1):
        # extract probabilities
        print "extracting probabilities for class " + str(i)
        grass.run_command("i.maxlik", group=group, subgroup=subgroup, sigfile="plr_" + str(i) + ".sig", clas="plr_class" + str(i), reject="plr_rej_" + str(i))
        # export from GRASS
        print "exporting probabilities for class " + str(i) + " to /tmp"
        grass.run_command("r.out.gdal", inp="plr_rej_" + str(i), out="/tmp/plr_rej_" + str(i) + ".tif")
        
        # import via gdal
        print "reading file"
        tif = gdal.Open("/tmp/plr_rej_" + str(i) + ".tif")    
        l.append(tif.ReadAsArray())
        
        if i == 1:
            width = l[0].shape[1]
            height = l[0].shape[0]
            trans = tif.GetGeoTransform()
            proj = tif.GetProjection()
    
    # create n-dimensional array
    print "creating 3D-array"
    probabilities = numpy.array(l)
    
    print "Image size: " + str(width) + "x" + str(height)
    print "using " + str(ntype) + "-neighbourhood"
    # invoke relaxation process
    results = probabilities.copy()
    for i in range(int(iterations)):
        print str(int(iterations)-i) + " iteration(s) to go..."
        results = plr_filter(results, width, height, counter, ntype)
    labels = createMap(results, width, height, counter)
    
    # exporting results
    print "exporting results to /tmp"
    export(labels, trans, proj)
    
    # import via gdal into GRASS
    print "reading results"
    grass.run_command("r.in.gdal", inp="/tmp/plr_results.tif", out=output)
    grass.run_command("r.colors", map=output, color="rainbow")
        
    # clean up
    print "removing temporary files"
    cleanUp(sigpath)
    
if __name__ == "__main__":
    options, flags = grass.parser()
    main()