performance - How to quickly retrieve a number of rows from data store? -
In the Python GAE application I'm working on, I need to retrieve n rows from the data store, and I'm running into performance issues for n > 100. I expect n to be less than 10000 in most cases.
So let's consider a simple model:
```python
class MyEntity(ndb.Model):
    field1 = ndb.StringProperty()
    field2 = ndb.StringProperty()
    # ...
    fieldM = ndb.StringProperty()
    # M is quite large, maybe ~30. The stored strings are short,
    # on the order of 30 characters or less.
```
I've populated the data store with some data, and got bad performance using a plain `fetch()`. I've since removed all filters, and just trying to get a number of entities still shows very bad performance (as compared to what I would expect from, say, any common SQL deployment; I know I shouldn't compare GAE to SQL, but just getting flat rows down, I would expect it to be more performant, not less). Here's what I've tried:
- The simplest approach, `MyEntity.all().fetch(n)`, scales linearly with n, which is expected, although I didn't expect it to take 7 s for n = 1000.
- Trying to coerce `fetch()` with any reasonable `batch_size` degrades performance further. I've tried values ranging from 1 to 1000.
- Doing a `keys_only` query gives an order-of-magnitude improvement.
- Doing the query manually (through `ndb.Query`) and getting out only a single field gives a small improvement, on the order of 1.2x.
- Doing `fetch_async(n)` and waiting gives the same performance.
- Splitting the job into p parts, doing `fetch_async(n/p, offset=...)`, then waiting on and joining all the futures gives at best the same performance, and at worst, worse performance.
- A similar story with `fetch_page()`.
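For reference, the splitting approach described above can be sketched as follows. `chunk_offsets` is a hypothetical helper name; the actual datastore calls are shown only in a comment, since they require the GAE runtime.

```python
def chunk_offsets(n, p):
    """Split a fetch of n entities into at most p (offset, count) chunks,
    distributing any remainder over the first chunks."""
    base, rem = divmod(n, p)
    chunks = []
    offset = 0
    for i in range(p):
        count = base + (1 if i < rem else 0)
        if count == 0:
            break
        chunks.append((offset, count))
        offset += count
    return chunks

# Inside the GAE runtime, each chunk would drive one asynchronous fetch:
#   futures = [MyEntity.query().fetch_async(count, offset=offset)
#              for offset, count in chunk_offsets(n, p)]
#   results = [e for f in futures for e in f.get_result()]
```

Note that offset-based splitting like this is known to help little on the datastore, because an offset still makes the backend skip over the preceding rows rather than seek past them.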
I've also tried using `db` instead of `ndb`, and the results are pretty much the same. So, I'm not sure what to do. Is there a way to get half-decent performance for n on the order of 10000? Even simplifying the entities down to single fields, the performance is poor. I expect the entire payload, uncompressed, to be about 1 MB. Downloading 1 MB in over a minute is obviously unacceptable.
I am seeing this issue in the live application, but for performance testing I'm using the remote API. My question is similar to this question on SO: Best practice to query large number of ndb entities from datastore. They didn't seem to find a solution, but it was asked 4 years ago, so maybe there is one now.
If you only need a subset of the fields from the model, use projection queries.
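A minimal sketch of a projection query against the model from the question (this only runs inside the GAE runtime, so treat it as illustrative; `field1` and `field2` are the example fields from the model above):

```python
from google.appengine.ext import ndb

class MyEntity(ndb.Model):
    field1 = ndb.StringProperty()
    field2 = ndb.StringProperty()

# Ask the datastore to return only field1 and field2. The results are
# partial entities, which avoids deserializing all ~30 properties per row.
rows = MyEntity.query().fetch(1000, projection=[MyEntity.field1, MyEntity.field2])
for row in rows:
    print(row.field1, row.field2)
```

Keep in mind that projected properties must be indexed, so this works for the short string fields here but not for unindexed or large properties.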