[Community] Run time comparisons between C/GEOS and Python/Shapely
pascal leroux
pa.leroux at gmail.com
Wed Jul 9 15:47:34 EEST 2008
Hi Sean
2008/7/8 Sean Gillies <sgillies at frii.com>:
> Hi Pascal,
>
> A Python solution shouldn't be 100 times slower than C, especially since
> both OGR and Shapely use the same GEOS library to compute relationships
> between geometries. Is it possible that your other solution uses
> "intersects" instead of "not disjoint" (could be faster), or is more
> approximate?
>
No, my C code uses GEOSDisjoint.
And, when the function returns false/0, I call GEOSIntersection and write,
in WKT format, the result (to display and check the calculated intersections
vs my input shapefile). So the C program does more things than the Python
version.
>
> The Python script above also uses a great deal of memory since it never
> frees memory allocated by OGR's GetNextFeature(), and then copies the
> feature to a Shapely geom (and a GEOS geom). If your computer's free
> memory is low, performance can be poor. Is this a possibility?
>
I have no idea. Do you mean that the "process" that frees memory (when the
reference counter of an object equals 0) could be slow and explain the low
speed ? Or a more general problem (operating system level) ?
I added an explicit "del objet" statement in my script after the
list.append() but the results are the same.
>
> I recommend that you first spatially index your data for this kind of
> application -- maybe a quadtree (.qix) for your shapefile, or use an
> Rtree alongside your list of Shapely geoms. You can thereby efficiently
> select features that approximately intersect with your feature of
> interest (using the bounding box of "ref"), and then test only that
> smaller set for exact intersection. Often the cost of indexing is much
> smaller than the cost of performing unnecessary exact intersections.
>
Sure ! It is not a smart way to process such big shapefiles. Even if I code
it in C, I'll not look for intersections without using spatial indexes. The
purpose of my script was only to test Python/Shapely speed (interpreted vs
compiled, a lot of software layers, ...) with a big (but real) shapefile.
I'll make other tests using the OGRLayer::SetSpatialFilter[Rect] method and
I don't give up using Python and Shapely (not yet !)
> Using xrange() instead of range() when your lists are large is a good
> idea, but I don't think that's the issue here.
>
> Cheers,
> Sean
>
>
cheers
Pascal
ps: my script ran till the end : about 270 minutes
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gispython.org/pipermail/community/attachments/20080709/f38f2184/attachment.htm>
More information about the Community
mailing list