Hi! I'm currently reworking my site, so although this content is still online things may change or look odd. Please get in touch to find out how I can help you!

Reduce Database Load in PHP with Array Filtering

If you’re developing a web app that queries a database, it’s very likely that at some point you’ll want to display one big set of results, split into subsets.

For example a time management app will usually offer users a view of tasks grouped by their status - overdue, in progress, completed and so on.

Our first reaction might be to query the database once for each subset. That works fine, but it doesn’t scale well. What if our app gets really popular and suddenly has millions of users? That’s multiple queries hitting our database, every single time the “tasks by status” page is loaded - assuming for a moment that we ignore any other load-reducing techniques like caching.

The Solution

With the help of a handy little function, we can reduce those many queries to one. OK, so that one query will be bigger but we’ll save overheads and reduce our overall database load.

That handy little function is array_filter(). The idea is to get an array of all the results we’re interested in (tasks for the day, perhaps) and then split our big array of results into smaller arrays (grouped by status) so that we can handle them individually:

// set up the functions that will actually do the filtering
function filter_overdue_tasks($task)
{
	return $task['status'] == 'overdue';
}

function filter_in_progress_tasks($task)
{
	return $task['status'] == 'in_progress';
}

function filter_completed_tasks($task)
{
	return $task['status'] == 'completed';
}

// get our big result set
$all_tasks = get_todays_tasks();

// do the actual filtering
$tasks['overdue'] = array_filter($all_tasks, 'filter_overdue_tasks');
$tasks['in_progress'] = array_filter($all_tasks, 'filter_in_progress_tasks');
$tasks['completed'] = array_filter($all_tasks, 'filter_completed_tasks');

The above code gives us an array — $tasks — containing three arrays, one for each task status. Each of those arrays contains the task results with that status. You could also have three separate arrays but I think wrapping them up like this is neater and easier to deal with later.

How Does It Work?

What array_filter() does is actually quite simple. It takes as its arguments the array you want to filter and a callback function that will determine what gets into the returned array. Think of it as an eagle-eyed, jobsworth nightclub bouncer. If you don’t match the criteria, you ain’t gettin’ in.

What happens is that the callback will get called for each element in the array, so you can apply a logical test to it. In the above example, we’re checking that the task’s status matches the one we’re currently interested in. If the logical test returns TRUE, the task makes it into the exclusive filtered array. If the test returns FALSE, the task has been turned away and must take the walk of shame back past the others in the queue. Probably.

Easy! Bear in mind that if you’re calling array_filter() from within a class you will need to pass the callback function in a slightly different way, like so:

array_filter($all_tasks, array($this, 'filter_overdue_tasks'));

The code in this example assumes that you have your results returned as arrays, but modifying the code to work with objects should be straightforward.

One Problem…

While array_filter() is great, it does have one major snag. Because we can’t pass any parameters to the callback, we’re limited to checking against pre-set properties. This is fine for some uses, but not so great for others. For example, it’s easy enough (although not necessarily elegant) to write twelve functions for filtering by month, but it would be impossible to filter tasks by user, for example.

Fortunately, all is not lost, we can still work on the big array and save precious resources; we just have to do a tiny bit more work ourselves. As an example, if we wanted to group tasks by month, we could do so like this:

foreach($last_years_tasks as $task)
{
	$month = date('F', $task['created_date'];
	$tasks[$month][] = $task;
}

The above code would give us an array — $tasks — containing an array of tasks for each month. A month would only be included if it had tasks created, which may or may not be a desired behaviour. If you don’t want that, you can just prepopulate your array with all possible values of the field you want to group on. So if you wanted to group by user, you could just grab an array containing the user IDs of every user in your system, and let the foreach() loop do it’s thing.

In Conclusion

While not the only way to reduce our database load, using the single query method can help us to dramatically reduce the number of times we query the database in our app. Each query saved means fewer overheads, which means happier servers as our app scales.

Tags: Databases Development Optimization PHP

Feedback

Sorry, feedback is now closed on this post, but please feel free to get in touch if you would like to talk about it!

Or find posts by tag →

Friends & Influences

  1. Aching Brain
  2. Andy Budd
  3. Anthony Killeen
  4. Ben Everard
  5. Cameron Moll
  6. Dan Cederholm
  7. Dan Mall
  8. Dave Shea
  9. Elliot Jay Stocks
  10. Jamie Knight
  11. Jamie Rumbelow
  12. Jason Santa Maria
  13. Jeff Croft
  14. Jeffrey Zeldman
  15. Jeremy Keith
  16. Jon Hicks
  17. Khoi Vinh
  18. Mark Boulton
  19. Matt Croucher
  20. Nocturnal Monkey
  21. Sarah Parmenter
  22. Shaun Inman
  23. Simon Collison
  24. Tim Van Damme
  25. Toby Howarth