If you haven't noticed, we've made some serious updates to gun.io!
As our search results are now rendered on the client side rather than on the server, we had to write a whole new API to get the data into the client. Once we had written it, we found it was far slower than acceptable: responses were averaging 1.7s, and we shoot for sub-200ms response times. Not good! But what was going wrong? Why was Django being so slow? It wasn't easy to see immediately, and the normally wonderful Django Debug Toolbar was no use, as it doesn't work on AJAX requests.
Using a Profiler
What we needed was a way to profile our code through the browser. Much digging around on the internet turned up a very old snippet of Django profiling middleware which seemed perfect for the task. Unfortunately, it didn't work with modern versions of Django, but I quickly fixed it up and am including it below.
# Original version taken from http://www.djangosnippets.org/snippets/186/
# Original author: udfalkso
# Modified by: Shwagroo Team and Gun.io

import sys
import os
import re
import hotshot, hotshot.stats
import tempfile
import StringIO

from django.conf import settings

words_re = re.compile(r'\s+')

group_prefix_re = [
    re.compile("^.*/django/[^/]+"),
    re.compile("^(.*)/[^/]+$"),  # extract module path
    re.compile(".*"),            # catch strange entries
]

class ProfileMiddleware(object):
    """
    Displays hotshot profiling for any view.
    http://yoursite.com/yourview/?prof

    Add the "prof" key to the query string by appending ?prof (or &prof=)
    and you'll see the profiling results in your browser.
    It's set up to be available only in Django's debug mode, or to
    superusers otherwise, but you really shouldn't add this middleware
    to any production configuration.

    WARNING: It uses the hotshot profiler, which is not thread safe.
    """
    def process_request(self, request):
        if (settings.DEBUG or request.user.is_superuser) and 'prof' in request.GET:
            self.tmpfile = tempfile.mktemp()
            self.prof = hotshot.Profile(self.tmpfile)

    def process_view(self, request, callback, callback_args, callback_kwargs):
        if (settings.DEBUG or request.user.is_superuser) and 'prof' in request.GET:
            # Run the view under the profiler instead of calling it directly.
            return self.prof.runcall(callback, request, *callback_args, **callback_kwargs)

    def get_group(self, file):
        for g in group_prefix_re:
            name = g.findall(file)
            if name:
                return name[0]

    def get_summary(self, results_dict, sum):
        list = [(item[1], item[0]) for item in results_dict.items()]
        list.sort(reverse=True)
        list = list[:40]

        res = "      tottime\n"
        for item in list:
            res += "%4.1f%% %7.3f %s\n" % (100 * item[0] / sum if sum else 0, item[0], item[1])

        return res

    def summary_for_files(self, stats_str):
        stats_str = stats_str.split("\n")[5:]

        mystats = {}
        mygroups = {}

        sum = 0

        for s in stats_str:
            fields = words_re.split(s)
            if len(fields) == 7:
                time = float(fields[2])
                sum += time
                file = fields[6].split(":")[0]

                if not file in mystats:
                    mystats[file] = 0
                mystats[file] += time

                group = self.get_group(file)
                if not group in mygroups:
                    mygroups[group] = 0
                mygroups[group] += time

        return "<pre>" + \
               " ---- By file ----\n\n" + self.get_summary(mystats, sum) + "\n" + \
               " ---- By group ---\n\n" + self.get_summary(mygroups, sum) + \
               "</pre>"

    def process_response(self, request, response):
        if (settings.DEBUG or request.user.is_superuser) and 'prof' in request.GET:
            self.prof.close()

            # Capture the printed stats by temporarily redirecting stdout.
            out = StringIO.StringIO()
            old_stdout = sys.stdout
            sys.stdout = out

            stats = hotshot.stats.load(self.tmpfile)
            stats.sort_stats('time', 'calls')
            stats.print_stats()

            sys.stdout = old_stdout
            stats_str = out.getvalue()

            if response and response.content and stats_str:
                # Replace the page with the top 40 lines of profiler
                # output, plus per-file and per-group summaries.
                response.content = "<pre>" + stats_str + "</pre>"
                response.content = "\n".join(response.content.split("\n")[:40])
                response.content += self.summary_for_files(stats_str)

            os.unlink(self.tmpfile)

        return response
To use this middleware, save this file in your application as middleware.py. Then, in the MIDDLEWARE_CLASSES tuple in your settings.py, include the line
'yourproject.yourapp.middleware.ProfileMiddleware',
(with 'yourproject' and 'yourapp' changed to your appropriate values, of course.) You won't need to install anything extra for the profiler itself: hotshot ships with Python 2's standard library. (It was removed in Python 3, where cProfile is the usual replacement.)
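For reference, the relevant chunk of settings.py ends up looking something like this (a sketch, with typical default entries; 'yourproject.yourapp' is a placeholder). Note that AuthenticationMiddleware has to come before the profiler, because the profiler checks request.user:

# settings.py -- illustrative sketch; 'yourproject.yourapp' is a placeholder.
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    # AuthenticationMiddleware must run before ProfileMiddleware,
    # since ProfileMiddleware looks at request.user.is_superuser.
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'yourproject.yourapp.middleware.ProfileMiddleware',
)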
Reading the Output
Now, to use the profiler, simply add the string '?prof' to the end of your URL (or '&prof' if the URL already has a query string), and you'll see the profile of that page. Awesome! But... what does this wall of voodoo mean?
The profiling output is not very intuitive if you haven't seen it before, so let's dive in!
The columns it's showing you are the number of times a function was called ('ncalls'), the total time spent in the function itself, excluding sub-calls ('tottime'), the time per call, the cumulative time spent in the function and everything it calls ('cumtime'), and the cumulative time per primitive call. We're interested in 'cumtime', the fourth column. The results are sorted by internal time rather than cumulative time, so you'll have to look through the list and see which functions are taking the most time overall. Anything greater than 0.100 or even 0.050 seconds could be worth investigating.
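Here's a mocked-up example of what the top of the output looks like (the numbers and file paths are invented for illustration, not taken from our actual profile):

         52734 function calls (50123 primitive calls) in 1.712 seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2041    0.843    0.000    0.912    0.000 /site-packages/django/db/models/query.py:42(__iter__)
        1    0.201    0.201    1.698    1.698 /yourproject/yourapp/views.py:18(search)
      318    0.114    0.000    0.387    0.001 /site-packages/django/template/base.py:71(render)

In this made-up example, the view itself ('search') has a huge 'cumtime' but a small 'tottime', which tells you the real work is happening in the things it calls, in this case the ORM.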
If a function has a low per-call time but a high cumulative time, your code is probably spending too much time in a loop: see if you can find a more efficient algorithm, use fewer nested loops, or find a way to break out earlier! A toy example of this pattern follows below. The string associated with your problem line may not be immediately helpful, but hopefully it will serve as a good starting point for you: if you see many expensive calls into 'sql' files, it's probably your database code; if you see many expensive calls into 'template' files, it's probably your templating code; and so on.
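Here's that toy example (invented for illustration, nothing to do with our codebase): the helper is cheap on each call, but calling it from inside a nested loop gives it a tiny 'percall' and a big 'cumtime' in the profile. The fix is just to hoist the repeated work out of the loop.

# Toy illustration of the low-percall / high-cumtime pattern.
def slugify(title):
    # Cheap on its own...
    return title.lower().replace(' ', '-')

def build_index(posts, tags):
    index = {}
    for post in posts:
        for tag in tags:
            # ...but invoked len(posts) * len(tags) times, so it
            # dominates the profile's cumulative time.
            index.setdefault(slugify(tag), []).append(post)
    return index

def build_index_fast(posts, tags):
    # Hoist the invariant work out of the loop: slugify each tag once.
    slugs = [slugify(tag) for tag in tags]
    index = {}
    for post in posts:
        for slug in slugs:
            index.setdefault(slug, []).append(post)
    return index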
In our case, we had forgotten to use select_related() on one of our database queries. Now our API is super fast, hooray!
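For the curious, the pattern looks roughly like this (the models here are hypothetical stand-ins, not our actual schema): without select_related(), iterating over a queryset and touching a foreign key fires one extra query per row; with it, Django fetches the related rows in a single JOIN.

from django.db import models

# Hypothetical models, for illustration only.
class Author(models.Model):
    username = models.CharField(max_length=100)

class Gig(models.Model):
    author = models.ForeignKey(Author)
    title = models.CharField(max_length=200)

# Without select_related(): one query for the gigs, plus one extra
# query per gig the first time gig.author is touched -- the classic
# "N+1 queries" problem, and exactly what a pile of slow 'sql' calls
# in the profile points at.
names = [gig.author.username for gig in Gig.objects.all()]

# With select_related(): a single query with a JOIN pulls in the
# related Author rows up front.
names = [gig.author.username
         for gig in Gig.objects.all().select_related('author')]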
Once you're done, remember to remove the middleware from your settings!