Difference between revisions of "GSoC Network Analysis"
|Line 1:||Line 1:|
== report #8 ==
== report #8 ==
Revision as of 10:14, 31 July 2009
This week cleaned the code, fixed several undocumented features and documented the other ones. In more detail: There were still minor issues with the code and several module had nonstandard behaviour, which should be fixed now. I also spent some time documenting most of the modules. As users are probably not familiar with graph theory, I tried to explain the general notions in the description. Hope that it is not too little or not too much.....
There are no blocking issues.
I worked mostly on module for finding fastest paths using timetables this week and I made several improvements. Firstly, the output is stored in two attributes tables. One for stops/points and another one for routes/lines. All information necessary to reconstruct the path and times is in these tables in the format very similar to the one used by v.net.path module. Secondly, the module can now output the route itself, not only stops joined by straight line segments. So instead of http://people.ksp.sk/~dano/grass/tt.png, you can get http://people.ksp.sk/~dano/grass/ttp.png. Although it is quite hard to see, note that the line joining two points on the left is a straight line. This is not a bug but a feature --this corresponds to walking between the stops. Thirdly, it is possible to give the coordinates of two points and the module finds the closest stop to each of them and hence the fastest path between them. Finally, I remember doing some minor bug fixes earlier this week, but I have already forgotten what it was...
There are no blocking issues. I plan to, finally, polish and finish all modules next week and, if time permits, begin documenting modules and/or code.
I worked on the module similar to v.net.path this week. The difference is that my module uses timetables to find the fastest path between two stops at/after the given time. The timetables are stored in an attribute table. For each route, the table contains the stops and times. Also, walking connections between the stops can be specified in a different table. Routing is done according to several options. Apart from the obvious ones such as: "from stop", "to stop" and "time", other options are: maximum number of changes (this can be any non negative integer or infinity), minimum time required for a change and whether walking is considered as a change (this is important when maximum number of changes is finite). Using these options, the modules finds the fastest path and produces textual output in the form (this is still only temporary and the output will likely change to something easier to process):
Route 5, from 100 leaving at 0 arriving to 130 at 10 Route 15, from 130 leaving at 15 arriving to 250 at 22 Walk from 250 leaving at 22 arriving to 300 at 24
And produces the following output map: http://people.ksp.sk/~dano/grass/tt.png. Where Route x, is the identificator of a route and from y to z are identificators/categories of stops.
In the tradition of v.net.path, the module reads data from the standard input, so the map may contain more than one single path.
For the next week, I plan some changes to the module. Especially, I want to improve the output, possibly outputting to an attribute table. Also, a nice feature of v.net.path is that it can take two coordinate points and finds the shortest path between the nodes in the network closest to these points. I think it would be nice to have something like this in this module.
I have finished v.net.distance module this week. The module is a mix of v.distance and v.net.path modules. It takes two sets of vector features and a network and for each feature in the first set computes the shortest path to a feature in the second set. It computes the length as well as the path itself. Here is a couple of nice pictures: http://people.ksp.sk/~dano/grass/nd.png, http://people.ksp.sk/~dano/grass/ndt.png, http://people.ksp.sk/~dano/grass/ndtd.png. Blue and orange lines show the shortest length and time paths respectively. I spent the rest of the week by bug fixing the modules and fixing any non standard behaviour. There was more work to do, especially in the latter case, than I had expected..... Hopefully, everything is fixed;)
For the next week, I am planning on integrating timetables into routing algorithms. Fortunately, there are no blocking issues this time.
I have finished v.net.connectivity module this week. The module computes the vertex connectivity of a graph (minimum cardinality set of vertices separating two given sets of vertices of the graph). The module also supports "node capacities". More precisely, as the edge capacities can be specified in the min-cut problem, node capacities are used to determine vertex connectivity. To have a better idea, please read my last week report that also has a couple of nice images. 
Also, I have done v.net.centrality module that computes several vertex centrality measures. Namely, degree, closeness, betweenness and eigenvector measure. Similar functionality is already present in v.generalize module. However, this module is more general. Firstly, it stores the measures in the database and secondly, it also takes edge lengths/costs into the account. (v.generalize assumes edge costs to be 1). The main difference is that v.generalize computes edge centrality measures. However, with code computing vertex measures already available, it is easy to extend the module to compute edge measures as well. Here is a couple of fancy pictures: http://people.ksp.sk/~dano/grass/close.png, http://people.ksp.sk/~dano/grass/betw.png. The pictures show closeness and betweenness measures respectively. The meaning of colours should be obvious: red-"nodes in the centre according to the measure", green-"middle", blue-"boundary".
Also, I have begun to work on v.net.distance module that will be a hybrid of v.distance and v.net.path modules. The module will compute the shortest path(length as well as the path itself) along the network between a set of "to" features and every "from" feature. I hope to finish it (early) next week, as I already know about one guy awaiting the module.
For the blocking issues: uploading centrality measures into the database takes an awful lot of time as I am doing it in a very naive way... I will look more into this next week.
Finally, I apologise that this report is too late. It is already 13 minutes past midnight in my timezone...., Daniel
After a break I took because of my exams, I have started working on my GSoC project. I worked mostly on the (v.net.) flow module and related issues. I have extended the module so that it is possible to specify more than one point as the source and sink. Also, the module finds a minimum cut (edges with minimum capacities separating source(s) from sink(s)) in the network. Here is a picture of this: http://people.ksp.sk/~dano/grass/mfmult.png. Red crosses are sources and green ones are sinks (maybe, it is the other way round...) Blue edges correspond to small flow, green to medium flow and red to high flow. And the yellow edges are a minimum cut.
If the capacities of all edges are the same, minimum cut corresponds to the smallest number of edges separating two sets of vertices. By constructing an appropriate graph, it is also possible to find the smallest set of vertices separating sources and sinks. I have written code that does exactly this. In the next picture, orange points separate red points from green points: http://people.ksp.sk/~dano/grass/large.png. The next two pictures show details of the previous one: http://people.ksp.sk/~dano/grass/detail1.png, http://people.ksp.sk/~dano/grass/detail2.png. For the next week, I plan to turn this code into a module as the current version is more useful for debugging than using. Also, I will extend it to handle "node capacities". And I hope I will have time to start working on another new module.
This week, I implemented a module that computes the shortest path between all pairs of nodes in the graph. You can also specify which pairs you are interested in, so that the module does not have to produce the entire NxN matrix. Also, I started on network flow modules. So far, I have implemented a flow algorithm and a simple module that finds the maximum flow between two given vertices(Well, for debugging purposes, it can handle only the flow between nodes 215 and 219, so far....) Anyway, here is couple of pictures: http://people.ksp.sk/~dano/grass/mfu.png, http://people.ksp.sk/~dano/grass/mfu2.png. One cross is source, the other one is sink. Blue edges correspond to low flow, green to medium and red to high flow. I used speed limit as edge capacities. Although it is not the case in the two pictures I posted, the forward and backward capacities of an edge can be different.
For the next weeks, I plan to revise for my exams. As you may be aware, I have started two weeks earlier and so I will take a short break now.... Currently, I have no blocking issues.
See you in July!
(What an embarrasing mistake. By July, I mean June....)
While working on the module for computing articulation points, I found a bug in the code calculating bridges. So the first real thing I did this week was some bug fixing. Also, I have updated the module for connected components so that it outputs the computed values in a more useful way (Read: now it can be used for more than just producing nice pictures to the first weak report....). Anyway, everything is kept in a nice little database table. As the first sentence suggests, I have done a module for computing articulation points in the networks. Moreover, I have implemented a module that finds a minimum spanning tree in the graph. This pictures shows both new modules in work: red crosses are articulation points, green edges form a minimum spanning tree (all edges have equal weight) http://people.ksp.sk/~dano/grass/mstap.png. In the following picture, the green edges also form a minimum spanning tree, but this time, the speed limit is used as the weight of an edge. Note that the highway in the centre is not included in the spanning tree: http://people.ksp.sk/~dano/grass/mst.png. The latest version of the code will appear shortly in the add-ons repository.
Currently, I cannot persuade the compiler to compile the library correctly. So far, it does make cause any problems as no code is used by two modules. I will look into this later next week and I will probably start working on the family of modules using max-flows algorithm as well.
I have agreed with my mentor, Wolf, to start working on my GSoC project bit earlier so that I can concentrate more on the exams later. Well, earlier means last Saturday and so this is my first week report. Just a short introduction: I am Daniel, 3rd year at Oxford doing Mathematics and Computer Science and some of you may remember me as the student working on GRASS v.generalize module two years ago. My project this year is about extending GRASS network/vector functionality by writing a couple of modules. I have created a wiki page: http://grass.osgeo.org/wiki/GSoC_Network_Analysis, which contains some useful information such as this email and a link to the abstract.
Anyway, after doing boring, but essential, tasks such as downloading and compiling GRASS (I use 6.4 developer branch) I have done more interesting stuff. I wrote a module for computing weakly and strongly connected components. Using this module, I found errors in the standard North Caroline datset. At least, I think that these are errors. For example, if you zoom to [north: 232051.61663403 south: 230521.9965903 east: 640369.10332306 west: 638669.65919416] and let GRASS draw roadsmajor map for you, you get something like this: http://people.ksp.sk/~dano/grass/wcc1.jpg. (colours correspond to different components, which I obtained using my module). Or, the same map, but different location [north: 239329.55684214 south: 238882.80682937 east: 630466.77182694 west: 629997.32746966], and you have another disconnected lines: http://people.ksp.sk/~dano/grass/wcc2.jpg
Then, I have also written a module for computing bridges(edges whose removal disconnects the graph) in the network and found another, probably error. This time, it is in map "railroads". The location is [north: 195851.98992388 south: 187604.09950596 east: 669691.16599023 west: 663129.724274] and the picture is: http://people.ksp.sk/~dano/grass/railroads.jpg. Note that the lines are not connected at the point the red arrow points at. Also, it seems, but I am not completely sure, that there is something wrong with the red segment the green arrow points at. I think, that the segment is not topologically connected to the network at both ends, although, visually, it seems that it is.
Finally, here is a random picture from a module for computing bridges: http://people.ksp.sk/~dano/grass/bridges.jpg. (The data is from streets_wake map). The red edges are bridges. Strictly speaking, the blue edges are bridges as well, but they are not the type of bridge you usually look for/consider important. So I wrote another module that can identify such chains hanging on the network and remove them.
The code should appear shortly in GRASS addons repository.
That is for the last week. I am not currently stuck at anything and I plan to develop another modules over the next week. For exampe, module for computing articulation points as the algorithm is almost the same as for bridges.