PDA

View Full Version : spec for vector data container - devs only



deniska
November 23rd, 2007, 18:00
[FONT=Fixedsys][FONT=Arial]This is for devs only
Attached is a proposed spec for vector data container.

The archive below contains the source code for the toolset to parse shape files, create and troubleshoot the geoline.dat file..
http://deniska.dcemu.co.uk/dev/geoline_src.tar.gz

http://deniska.dcemu.co.uk/dev/geoline.dat - sample vector data file for NY state

Edit: google occasionally changes the api version to higher numbers:
http://mt3.google.com/mt?n=404&v=w2.63&x=1206&y=1541&zoom=5

Nieko
November 23rd, 2007, 19:35
I would rename sx,sy,ex,ey to srcx, srcy, dstx, dsty, as they are more self-explaining;
What exactly is left side of the street? Is this seen from src looking towards dst, or the other way around?
Does stype also indicate the maximum speed allowed on that road? That would be interesting when calculating shortest route (in time it takes to drive there);
In the struct you can't indicate a road is one-way only;
Nitpicky, but still, I'd swap "size n" and "offset n" in geoline.dat;
Could you also include the format for a single chunk? Don't feel like deducing it from geoline.dat :-P.


Other than that, looks very good :).

deniska
November 23rd, 2007, 21:04
I would rename sx,sy,ex,ey to srcx, srcy, dstx, dsty, as they are more self-explaining;
What exactly is left side of the street? Is this seen from src looking towards dst, or the other way around?
Does stype also indicate the maximum speed allowed on that road? That would be interesting when calculating shortest route (in time it takes to drive there);
In the struct you can't indicate a road is one-way only;
Nitpicky, but still, I'd swap "size n" and "offset n" in geoline.dat;
Could you also include the format for a single chunk? Don't feel like deducing it from geoline.dat :-P.Other than that, looks very good :).

- s & e indicate the start and the end of the line.. it's not much of the src/dst thing
- if you align yourself looking from start toward the end of the line the left side would be on your left, of course ;-).. this whole start/end/left/right thing is borrowed from tiger line specs
-stype indicates line type. The roads/streets (tiger line type A lines) are interpreted according this table:

A1 - Primary Highway With Limited Access Interstate highways and some toll highways
A2 - Primary Road Without Limited Access This category
A3 - Secondary and Connecting Road This category
A4 - Local,Neighborhood, and Rural Road A road in this category
A5 - Vehicular Trail A road in this category
A6 - Road with Special Characteristics This category
A7 - Road as Other Thoroughfare A road in this category
Census Feature Class Code (2 digits)
A11 - Primary road with limited access or interstate highway, unseparated
A12 - Primary road with limited access or interstate highway, unseparated,in tunnel
A13 - Primary road with limited access or interstate highway, unseparated, underpassing
A14 - Primary road with limited access or interstate highway, unseparated, with rail line
in center
A15 - Primary road with limited access or interstate highway, separated
A16 - Primary road with limited access or interstate highway, separated, in tunnel
A17 - Primary road with limited access or interstate highway, separated, underpassing
A18 - Primary road with limited access or interstate highway, separated, with rail line in
center
A21 - Primary road without limited access, US highways, unseparated
A22 - Primary road without limited access, US highways, unseparated,in tunnel
A23 - Primary road without limited access, US highways, unseparated, underpassing
A24 - Primary road without limited access, US highways, unseparated, with rail line in
center
A25 - Primary road without limited access, US highways, separated
A26 - Primary road without limited access, US highways, separated,in tunnel
A27 - Primary road without limited access, US highways, separated, underpassing
A28 - Primary road without limited access, US highways, separated, with rail line in center
A31 - Secondary and connecting road, state highways, unseparated
A32 - Secondary and connecting road, state highways, unseparated,in tunnel
A33 - Secondary and connecting road, state highways, unseparated, underpassing
A34 - Secondary and connecting road, state highways, unseparated, with rail line in center
A35 - Secondary and connecting road, state highways, separated
A36 - Secondary and connecting road, state highways, separated,in tunnel
A37 - Secondary and connecting road, state and county highways, separated, underpassing
A38 - Secondary and connecting road, state and county highway, separated, with rail line in
center
A41 - Local, neighborhood, and rural road, city street, unseparated
A42 - Local, neighborhood, and rural road, city street, unseparated, in tunnel
A43 - Local, neighborhood, and rural road, city street, unseparated, underpassing
A44 - Local, neighborhood, and rural road, city street, unseparated, with rail line in center
A45 - Local, neighborhood, and rural road, city street, separated
A46 - Local, neighborhood, and rural road, city street, separated, in tunnel
A47 - Local, neighborhood, and rural road, city street, separated, underpassing
A48 - Local, neighborhood, and rural road, city street, separated, with rail line in center
A51 - Vehicular trail, road passable only by 4WD vehicle, unseparated
A52 - Vehicular trail, road passable only by 4WD vehicle, unseparated, in tunnel
A53 - Vehicular trail, road passable only by 4WD vehicle, unseparated, underpassing
A60 - Special road feature, major category used when the minor category could not be determined
A61 - Cul-de-sac, the closed end of a road that forms a loop or turn-around
A62 - Traffic circle, the portion of a road or intersection of roads forming a roundabout
A63 - Access ramp, the portion of a road that forms a cloverleaf or limited access interchange
A64 - Service drive, the road or portion of a road that provides access to businesses,
facilities, and rest areas along a limited-access highway; this frontage road may intersect
other roads and be named
A65 - Ferry crossing, the representation of a route over water that
connects roads on opposite
shores; used by ships carrying automobiles or people
A70 - Other thoroughfare, major category used when the minor category could not be determined
A71 - Walkway or trail for pedestrians, usually unnamed
A72 - Stairway, stepped road for pedestrians, usually unnamed
A73 - Alley, road for service vehicles, usually unnamed, located at the rear of buildings and
property
A74 - Driveway or service road, usually privately owned and unnamed, used as access
to residences, trailer parks, and apartment complexes, or as access
to logging areas, oil

So for streets A41 would be displayed as 41 in stype.

You may see some negative values in the data chunks - they represent terrain/hydro polygons
tiger defines them with a letter following 2 digits to stype value is created by multiplying ketter number by 100, adding the number part and negating the resulting value.
Example: D71 -> -371
So there is no exact speed rating per line - but you can assume certain speeds according the type of the road: 65 on high ways, 30 on streets, etc..
- tiger line does not provide street directions and this sucks.. I guess whenever we find a source for directional info, stype can be overloaded to carry this info too (like adding 10000 to all one-way streets, going from start to end and 20000 to all one-way streets going the oposite way...
-if size is 0 - there is no point continuing reading the offset (just a tyny possible performance gain)
-format is simple:
[l1][l2][l3]...[ln]
where [l*] is element of type LINE, defined by the structure in the spec..
check the decompr.c in the source code archive for info on how to navigate through the geoline file...

Nieko
November 23rd, 2007, 22:09
- if you align yourself looking from start toward the end of the line the left side would be on your left, of course ;-).. this whole start/end/left/right thing is borrowed from tiger line specs


True, but what if you draw the start at the top of the screen and the end at the bottom? Then left is the other side of what is actually meant. I just asked to be sure :).



-if size is 0 - there is no point continuing reading the offset (just a tyny possible performance gain)


Hmm, you could also leave out chunks of size 0 and instead have a list of all the chunks that do contain data (or prepend the chunk id). You save on not storing empty chunks, you lose on storing a list of chunks. If on a map there are fewer chunks that do have data, than those that don't, this might be an idea.

in7ane
November 23rd, 2007, 23:51
Looks good, I will have a go at converting the Canadian road network (which will take a while as always).

For segments that span two chunks, is it fine to include them in both?

- will they get displayed or are there min/max checks
- and if they do get displayed will they draw on top of each other without problems

or should those segments be split into two?

deniska
November 24th, 2007, 02:25
Looks good, I will have a go at converting the Canadian road network (which will take a while as always).

For segments that span two chunks, is it fine to include them in both?

- will they get displayed or are there min/max checks
- and if they do get displayed will they draw on top of each other without problems

or should those segments be split into two?
In my implementation I only check the sx/sy coordinates to see if it belongs to a chunk... this way there should not be any duplicates... even if there are duplicates - it should not create any major rendering problems...

There is one minor problem with my current shape-file parsing code - if a street segment is not a straight line - it get's divided in to a set of straight line pieces. Each peiece still gets the to/from address numbers from the parent segment. I guess, a better implementation would be to calculate from/to adrr numbers for each piece based on the length of it..

deniska
November 24th, 2007, 02:32
Hmm, you could also leave out chunks of size 0 and instead have a list of all the chunks that do contain data (or prepend the chunk id). You save on not storing empty chunks, you lose on storing a list of chunks. If on a map there are fewer chunks that do have data, than those that don't, this might be an idea.
The proposed design allows quick access to any chunk in the container. Basically, you just need to do 2 seek() calls - no searching required... same design seems to work well with gpsfs - current imgage map container format...

deniska
November 24th, 2007, 02:42
BTW, I am glad that people actually express interest in this.. Perhaps someone can put together routing routine, based on this...

zerodays
November 24th, 2007, 04:32
deniska,

1. I could try in Java, not C... sorry;; would you be interested in java...? answers are mostly no...?

2. I need to understand how the points work, for example, how does the point "changes" if maps are zoomed in/out ?
etc...

3. I breifly read the txt, etc, but can not really read your NY geoline.dat in text editor.
so, maybe, if you provide some very very 'simple' examples of say,,, 20 streets (of blocks) in simple txt format,
and say, im in this xxx st, and i want to go to xxx st (all within the 20 street), then, perhaps, i can apply dijsktra's SP algorithm,,
then again, i guess i need the 'Turn' Points, which means, whenever there is an "intersections". either T shape or +, etc one needs to mark them as Node.
Sorry, i think at least thats how SP algorithm needs to know,
and again, when you refer to 'chunk' im kinda lost, probably because i still don't know how the map zoom in/out coordinates changes, etc.. sorry.

but then again, I believe that you have the most knoweldge yourself, and probably you can just implment any psuedo-codes of algorithms into C,,,,so i now doubt whether i can even help;;

Cheers. :)

deniska
November 24th, 2007, 07:31
thanks for the offer..
I don't have any problems understanding and implementing dijsktra's or any other routing algorithms (there is plenty of pseudo code and explanations available on the web).. My biggest problem is lack of time to actually do all the things that I want to do with this project. This is why I was hopping that someone with spare time & knowlege of "C" / PSP programming would take over some of more or less self-contained tasks like route calculation, geo-coding / reverse geo-coding, etc

Nieko
November 24th, 2007, 10:28
In my implementation I only check the sx/sy coordinates to see if it belongs to a chunk... this way there should not be any duplicates... even if there are duplicates - it should not create any major rendering problems...


How would the routing algorithm work in that case? Or: when do you start taking roads from other chunks into account? It might be the case that the start of the road that's best suitable in the routing algorithm lies in another chunk.

I wonder if you would then always have to check the 8 surrounding chunks (assuming roads don't span more than one chunk).

Codaz
November 24th, 2007, 11:16
Nice Deniska,

i was wondering if you coun't lend some of the geocoding of opentom.

www.opentom.org

It's Tomtom, but then written in Linux.

The toolchain is in the wiki.

Maybe then the programming process would speed up.

Nieko i will talk to you in dutch on this.

deniska
November 24th, 2007, 20:19
Nice Deniska,

i was wondering if you coun't lend some of the geocoding of opentom.

www.opentom.org (http://www.opentom.org)

It's Tomtom, but then written in Linux.

The toolchain is in the wiki.

Maybe then the programming process would speed up.

Nieko i will talk to you in dutch on this.
Although, I did not spend much time reading all the stuff that they have, it seems that most of their code is based on closed-source tomtom api. Which means it cannot be easily ported to any other platform... Correct me if I am wrong..

deniska
November 24th, 2007, 20:25
How would the routing algorithm work in that case? Or: when do you start taking roads from other chunks into account? It might be the case that the start of the road that's best suitable in the routing algorithm lies in another chunk.

I wonder if you would then always have to check the 8 surrounding chunks (assuming roads don't span more than one chunk).
The easiest (perhaps not most practical) solution would be to load all the data chunks that make up a square, spanning over both source and destination points in to a memory and then run one of the famous algorithms on that set..