Saturday, October 3, 2015

C program to generate .osm[XML] with efficient memory management

The requirement was to generate a new .osm file containing selectively extracted OpenStreetMap data from a big file [e.g. a country / state .osm file].

A further requirement was to use no pre-built library, only plain C, and, very importantly, the program should avoid -

1. memory leaks
2. buffer overflows
3. changes of pointer address
4. extra memory allocation, and
5. inefficiency.

The final output has to look something like this -

     <node id="73900462" lat="54.3884" lon="10.38285">  
         <tag k="name" v="Lidl"/>  
         <tag k="shop" v="supermarket"/>  
         <tag k="addr:city" v="Schönberg"/>  
         <tag k="addr:street" v="Große Mühlenstraße"/>  
         <tag k="addr:housenumber" v="51"/>  

The output is written to a file, where node_id, lat, lon, the tag values etc. are dynamic.

As usual, the first choice was strcpy() and strcat(). But both functions are dangerous when working with strings of dynamic length. In fact the whole family of unbounded string functions in C [including strcmp(), which can read past the end of an unterminated buffer] shares the same problem. They do not check the buffer length, so they often cause a buffer overflow, which typically shows up at run time (not at compile time) as a "Segmentation fault".
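To make the danger concrete, here is a minimal sketch of a length-checked copy. The helper name copy_checked() is hypothetical (it is not part of the program below). It refuses to copy when the source does not fit, which is exactly the check strcpy() does not perform.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for dst_size bytes.
 * Returns 0 on success, or -1 if src (plus its NUL) would not fit;
 * in that case dst is set to the empty string instead of being overrun. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t needed = strlen(src) + 1;   /* +1 for the terminating NUL */
    if (needed > dst_size) {
        dst[0] = '\0';
        return -1;                     /* strcpy() would have overflowed here */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

With an 8-byte buffer, copy_checked(buf, sizeof buf, "Lidl") succeeds, while copy_checked(buf, sizeof buf, "supermarket") returns -1 and leaves the buffer empty.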

As an immediate fix, it is also possible to use the "n" variants of these functions [strncpy(), strncat(), strncmp()].
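The "n" variants have a trap of their own: strncpy() does not write a terminating NUL when the source is as long as, or longer than, the limit. A small sketch of how that can be handled follows. The helper copy_truncating() is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: strncpy() plus explicit termination.
 * Returns 1 if src had to be truncated, 0 if it fit completely. */
int copy_truncating(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    strncpy(dst, src, dst_size - 1);   /* copies at most dst_size - 1 bytes */
    dst[dst_size - 1] = '\0';          /* strncpy() does not add this when src is too long */
    return strlen(src) >= dst_size;
}
```

Note that truncation works on bytes, so with UTF-8 values such as "Große Mühlenstraße" the cut can land in the middle of a multi-byte character. That is one more reason to size the buffer correctly rather than rely on truncation.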

In the above scenario, using strcpy() and strcat() would also require malloc() together with realloc(), because of the dynamic values [e.g. node_id, lat, lon, tag_value/s].

But using realloc() can cause many problems in a program. Here are some of the problems with using realloc()
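Two well-known pitfalls, given here as my own illustration rather than a complete list: writing p = realloc(p, n) leaks the old block when realloc() fails and returns NULL, and a successful realloc() may move the block, so every older copy of the pointer becomes invalid. The sketch below shows one way to guard against both. The helper append_str() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append src to the heap string *buf, growing it as needed.
 * *buf may be NULL for an empty start. Returns 0 on success; on failure
 * returns -1 and leaves *buf valid and unchanged (so nothing leaks). */
int append_str(char **buf, const char *src)
{
    assert(buf != NULL && src != NULL);

    size_t old_len = (*buf != NULL) ? strlen(*buf) : 0;
    size_t add_len = strlen(src);

    /* Keep the result in a temporary: if realloc() fails, *buf is still ours. */
    char *tmp = realloc(*buf, old_len + add_len + 1);
    if (tmp == NULL)
        return -1;

    memcpy(tmp + old_len, src, add_len + 1);   /* copies the NUL as well */
    *buf = tmp;   /* the block may have moved; the old address must not be used */
    return 0;
}
```

Even with this pattern, every append is a possible reallocation and copy, which works against the "no extra memory allocation" goal above.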

After considering all possible options, and sticking to basic C facilities, I finally decided that sprintf() alone is good enough to finish my task successfully.
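A related technique, shown here only as a sketch and not as the method used in my program: since C99, snprintf() called with a NULL buffer and size 0 returns the number of characters the output would need, so the buffer can be allocated at exactly the right size in a single malloc(). The names format_tag() and TAG_FMT are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TAG_FMT " <tag k=\"%s\" v=\"%s\"/>\n"

/* Hypothetical helper: build one <tag> line in a freshly allocated buffer.
 * Returns NULL on failure. The caller must free() the result.
 * Assumption: key and value contain no characters that need XML escaping. */
char *format_tag(const char *key, const char *value)
{
    assert(key != NULL && value != NULL);

    /* First pass: measure only, nothing is written. */
    int len = snprintf(NULL, 0, TAG_FMT, key, value);
    if (len < 0)
        return NULL;

    char *out = malloc((size_t)len + 1);   /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;

    /* Second pass: write into a buffer that is known to be large enough. */
    snprintf(out, (size_t)len + 1, TAG_FMT, key, value);
    return out;
}
```

For example, format_tag("shop", "supermarket") yields the line ` <tag k="shop" v="supermarket"/>` followed by a newline. The cost is formatting twice, in exchange for not having to count characters by hand.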

So, in the end, my program looks like this ->

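 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
 int main(void)
 {
   // Templates with empty values, used only to measure the fixed part of the output.
   const char *file_header = "<osm version=\"0.6\" generator=\"osmconvert 0.8.3\">\n";
   const char *first_line = "<node id=\"\" lat=\"\" lon=\"\">\n";
   const char *second_line = "<tag k=\"name\" v=\"\"/>\n";
   const char *third_line = "<tag k=\"addr:city\" v=\"\"/>\n";
   const char *fourth_line = "<tag k=\"addr:street\" v=\"\"/>\n";
   const char *fifth_line = "<tag k=\"addr:housenumber\" v=\"\"/>\n";
   const char *node_close = "</node>\n";
   const char *last_line = "</osm>\n";
   // Sample values; in practice these are dynamic.
   const char *node_id = "11111";
   const char *tagvl = "nah und frisch";
   const char *cityvl = "Schönberg";
   const char *add_street = "Große Mühlenstraße";
   double lat = 54.3884, lon = 10.38285;
   const char *housenr = "100";
   char *spnt_temp_all;
   size_t size;
   int written;
   FILE *indexfile;
   // 17 = lat and lon at 8 characters each (16) + 1 for the terminating NUL.
   // 6 = one leading space on each of the six indented lines in the format below.
   size = 17 + 6 + strlen(file_header) +
                   strlen(first_line) +
                   strlen(node_id) +
                   strlen(second_line) +
                   strlen(tagvl) +
                   strlen(third_line) +
                   strlen(cityvl) +
                   strlen(fourth_line) +
                   strlen(add_street) +
                   strlen(fifth_line) +
                   strlen(housenr) +
                   strlen(node_close) +
                   strlen(last_line);
   spnt_temp_all = malloc(size);
   if (spnt_temp_all == NULL)
     return 1;
   // snprintf() is sprintf() with a size limit: it never writes more than size bytes,
   // so a lat/lon longer than 8 characters is reported below instead of overflowing.
   written = snprintf(spnt_temp_all, size,
   "<osm version=\"0.6\" generator=\"osmconvert 0.8.3\">\n"
   " <node id=\"%s\" lat=\"%1.5f\" lon=\"%1.5f\">\n"
   " <tag k=\"name\" v=\"%s\"/>\n"
   " <tag k=\"addr:city\" v=\"%s\"/>\n"
   " <tag k=\"addr:street\" v=\"%s\"/>\n"
   " <tag k=\"addr:housenumber\" v=\"%s\"/>\n"
   " </node>\n"
   "</osm>\n",
   node_id, lat, lon, tagvl, cityvl, add_street, housenr);
   if (written < 0 || (size_t)written >= size) {
     free(spnt_temp_all);   // output did not fit; do not write a truncated file
     return 1;
   }
   indexfile = fopen("supermarket.osm", "a");
   if (indexfile == NULL) {
     free(spnt_temp_all);
     return 1;
   }
   fputs(spnt_temp_all, indexfile);
   fclose(indexfile);
   free(spnt_temp_all);     // single malloc(), single free(): no leak
   return 0;
 }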
