summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorAndrew Jeffery <andrew@aj.id.au>2018-05-09 12:52:23 +0930
committerAndrew Jeffery <andrew@aj.id.au>2018-05-16 14:57:29 +0930
commitc31ae3b0b8cbe39f37ec1d14c2420b4fa3e4b43e (patch)
treeb6e1aaaff9844c4dc17f267c92f8a8759cf86c44
parentce2a6f0354ed450ada2171e239ac1056b3df3126 (diff)
downloadpyphosphor-c31ae3b0b8cbe39f37ec1d14c2420b4fa3e4b43e.tar.gz
pyphosphor-c31ae3b0b8cbe39f37ec1d14c2420b4fa3e4b43e.zip
pathtree: Make dataitems() use the cache dict under restricted args
This achieves a 100x improvement in iteration time when the subtree rooted at the root of the full tree and there is no depth limit (i.e. the caller has requested all "populated" items). Whilst it sounds like a restricted case, it is a very common query by way of phosphor-objmgr's obmc/mapper/server.py:ObjectMapper.process_old_owner() Note this is a quasi API break, as some keys that were previously not present in the results may now appear: Explicitly storing 'None' into the data structure will have the key with 'None' appear in the full-tree dataitems() case. However, given this was filtered previously, no existing callers should be storing 'None' into the tree as they would not have been able to retrieve it via dataitems(). Before: $ python -m obmc.utils.testpathtree ... Iteration tests: depth=1, width=1, n=1000: 0.135853052139 depth=1, width=2, n=1000: 0.203811883926 depth=1, width=3, n=1000: 0.26814198494 depth=1, width=4, n=1000: 0.333888053894 depth=2, width=1, n=1000: 0.193987131119 depth=2, width=2, n=1000: 0.264018058777 depth=2, width=3, n=1000: 0.327262878418 depth=2, width=4, n=1000: 0.38805603981 depth=3, width=1, n=1000: 0.253651857376 depth=3, width=2, n=1000: 0.317117929459 depth=3, width=3, n=1000: 0.385557889938 depth=3, width=4, n=1000: 0.452265024185 depth=4, width=1, n=1000: 0.327889919281 depth=4, width=2, n=1000: 0.390358924866 depth=4, width=3, n=1000: 0.459683895111 depth=4, width=4, n=1000: 0.530153989792 After: $ python -m obmc.utils.testpathtree ... Iteration tests: depth=1, width=1, n=1000: 0.0012412071228 depth=1, width=2, n=1000: 0.00455403327942 depth=1, width=3, n=1000: 0.00307989120483 depth=1, width=4, n=1000: 0.00356507301331 depth=2, width=1, n=1000: 0.00118088722229 depth=2, width=2, n=1000: 0.00169396400452 depth=2, width=3, n=1000: 0.00234699249268 depth=2, width=4, n=1000: 0.00300288200378 depth=3, width=1, n=1000: 0.00100708007812 depth=3, width=2, n=1000: 0.00161695480347 depth=3, width=3, n=1000: 0.00234794616699 depth=3, width=4, n=1000: 0.00315403938293 depth=4, width=1, n=1000: 0.00101804733276 depth=4, width=2, n=1000: 0.00204801559448 depth=4, width=3, n=1000: 0.00281095504761 depth=4, width=4, n=1000: 0.0070219039917 Change-Id: Ice3afd12e2b112227735f0f1dedb6a8ea594740c Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
-rw-r--r--obmc/utils/pathtree.py13
1 files changed, 13 insertions, 0 deletions
diff --git a/obmc/utils/pathtree.py b/obmc/utils/pathtree.py
index 9896bfe..b24c162 100644
--- a/obmc/utils/pathtree.py
+++ b/obmc/utils/pathtree.py
@@ -194,6 +194,19 @@ class PathTree:
return [x for x in self.iteritems(subtree, depth)]
def dataitems(self, subtree='/', depth=None):
+ # dataitems() must return an iterable object containing all of the
+ # items explicitly inserted into the tree, rooted at subtree with
+ # depth number of path elements from the subtree root.
+ #
+ # Calling dataitems() to request all of the populated entries is
+ # unfortunately common, and it's (also) unfortunately expensive to
+ # generate (done by iterating over the entire tree). However, given we
+ # have a flat dict whose job is to cache the explicitly inserted values
+ # we can leverage that to provide a cheap-to-calculate answer to
+ # requests for the entire set of populated entries
+ if subtree == '/' and not depth:
+ return self.cache.items()
+
return [x for x in self.iteritems(subtree, depth)
if x[1] is not None]
OpenPOWER on IntegriCloud