GeoServer and number of threads
We're using GeoServer, and we have performance problems in production with a large number of users.
We ran load tests with 250, 150, and 20 threads. We noticed that GeoServer works better with 20 threads than with 150, and that as the thread count increases (150 or 250), performance decreases.
Is this normal? How does GeoServer manage user requests? Does GeoServer use an asynchronous strategy to manage them?
Thanks in advance.
Sounds pretty normal. Threads (and CPU context switches) aren't free, and at some point you will spend more time thrashing between threads than actually doing anything useful. It is often better to have a much smaller number of threads (number of cores × 2 is often reasonable) combined with some sort of front-end queue that accepts a connection and holds it until a worker is free.
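As a minimal sketch of that pattern (not GeoServer's actual internals), here is a Java thread pool sized to roughly twice the core count, with a bounded queue that holds excess requests until a worker frees up; the queue capacity of 1000 is an arbitrary assumption:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    public static void main(String[] args) throws InterruptedException {
        // Size the worker pool to ~2x the CPU cores, as suggested above.
        int workers = Runtime.getRuntime().availableProcessors() * 2;

        // A bounded queue holds incoming requests until a worker is free,
        // instead of spawning one thread per request.
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1000);
        ExecutorService pool = new ThreadPoolExecutor(
                workers, workers, 60L, TimeUnit.SECONDS, queue);

        // Submit more tasks than workers; the excess simply waits in the queue
        // rather than adding context-switch overhead.
        for (int i = 0; i < workers * 4; i++) {
            final int id = i;
            pool.submit(() -> System.out.println("handled request " + id));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

With this shape, load spikes make the queue longer instead of making every request slower, which matches the behaviour you want from a front end in front of a render-heavy server.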
Here are some real-world use-case statistics for you. In production, serving 'Google Maps'-style mobile/web users in the outdoor market, my company has tested various configurations, several of them discussed by theonlysandman (a contributor to this question), and our results also support the observation by Tyler Evans (another contributor to this question).
We need loads of greater than 5000 requests per second (qps), and as our GeoServer instances ubiquitously topped out at nearly 100 qps each, we would need to scale horizontally and vertically to over 50 GeoServer instances.
Parameters: mostly vector sources; local PostGIS databases, each under 2 TB, with no table over 1M records (or, if greater than 1M, geometry simplified to nodes > 1 m apart); 60%-40%-10% WMS/WMTS/WFS requests; Google Cloud-hosted servers, each with 32 cores and an SSD drive cluster up to 4 TB.
The qps bottleneck appears to be GeoServer itself (styling, reprojection, and all the niceties that come with it). I'm not claiming it is poorly written, but the heavier a car gets, the slower it drives.
If we replicate the WFS requests using Go or Python (with or without GDAL) to access the PostGIS data directly, we get faster throughput than GeoServer (up to 1000 qps or more per instance, at which point PostGIS becomes the bottleneck).
The same goes for our homemade Java microservice that creates pbf/MVT tiles from PostGIS; it, too, was very quick, at about 1000 qps.
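The source doesn't show that microservice, but a sketch of its core idea is easy to reconstruct with PostGIS's own `ST_AsMVT`/`ST_AsMVTGeom`/`ST_TileEnvelope` functions (PostGIS 3.0+). The table name `roads`, its columns, and the connection are assumptions for illustration:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class MvtTileService {
    // Build the ST_AsMVT query for one slippy-map tile (z/x/y).
    // Table "roads" and geometry column "geom" are hypothetical names.
    static String tileQuery(int z, int x, int y) {
        String env = String.format("ST_TileEnvelope(%d, %d, %d)", z, x, y);
        return "SELECT ST_AsMVT(q, 'roads', 4096, 'geom') FROM ("
             + " SELECT id, ST_AsMVTGeom(geom, " + env + ", 4096, 64, true) AS geom"
             + " FROM roads WHERE geom && " + env
             + ") q";
    }

    // Fetch one tile as Mapbox Vector Tile (protobuf) bytes.
    static byte[] fetchTile(Connection conn, int z, int x, int y) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(tileQuery(z, x, y))) {
            return rs.next() ? rs.getBytes(1) : new byte[0];
        }
    }
}
```

Pushing tile assembly into PostGIS like this skips GeoServer's styling/reprojection layer entirely, which is consistent with the ~10x throughput difference described above.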
Nginx for us performed slightly better than Apache with PHP (~110 qps vs ~89 qps), but this could be a result of our Apache configuration.
Where do we go from here? In all of our production use cases, serving miniature sharded SQLite/MBTiles databases (vector or raster) to our users, and maintaining them with custom code, was far more performant and scalable.
We may write a Java plugin for GeoServer that pushes GeoWebCache TMS tiles into a Google Storage bucket designed for slippy z/x/y calls; that way we could more easily maintain a tile pyramid, with updates etc., using GeoServer tools.
The more threads, the heavier the load on the server. See the Wikipedia article on thrashing.
GeoServer performance is affected by many things. My advice is to look at each one and see where the bottleneck is occurring.
Here is a list of questions to set you on the correct path:
What are the specs of your machine? It should have an SSD.
Are you generating your tiles on the fly, or are they pre-seeded?
- If you are pre-seeding, is that running? NOTE: pre-seeding helps but hammers the system, so it is best done outside production.
What is the source of your data? If PostGIS, are you using spatial indexes?
- Is PostgreSQL/PostGIS on the same machine?
How many types of tiles are you generating? NOTE: you could be generating extra tiles which you don't need/use.
Do you use GeoWebCache?
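On the spatial-index question above: PostGIS spatial indexes are GiST indexes, and you can check for one by inspecting `pg_indexes`. Here is a hedged sketch; the helper that classifies an index definition is pure and testable, while the table name passed in would be your own:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SpatialIndexCheck {
    // PostGIS spatial indexes are created "USING gist"; a plain btree
    // index on the geometry column will not speed up spatial queries.
    static boolean looksLikeSpatialIndex(String indexdef) {
        return indexdef.toLowerCase().contains("using gist");
    }

    // Scan pg_indexes for the given table (name is a placeholder).
    static boolean hasSpatialIndex(Connection conn, String table) throws SQLException {
        String sql = "SELECT indexdef FROM pg_indexes WHERE tablename = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, table);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    if (looksLikeSpatialIndex(rs.getString(1))) return true;
                }
            }
        }
        return false;
    }
}
```

If the check fails, `CREATE INDEX my_table_geom_idx ON my_table USING GIST (geom);` is the usual fix (names here are illustrative).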
With some more details, I can help you out.