Memcached tuning: "object too large for cache"

mshmsh5000's picture

We have a Pressflow 6 app with Cache Router routing to multiple memcache bins. The bins have plenty of memory allocated, but other than that are running with default options.

In local testing, I've found that some basic cache-setting fails because of object size. Examples:

<20 set cache_views-views_data%3Aen 1 0 2721377
>20 SERVER_ERROR object too large for cache

<20 set cache_menu-links%3Aadmin_menu%3Atree-data%3A788912d6fd7adf98c8c86af7c57d00fe 1 0 1670355
>20 SERVER_ERROR object too large for cache

These fail because the objects exceed the default 1 MB maximum object size. Overriding by passing memcached a larger value, e.g., -I 3m, leads to this warning:

WARNING: Setting item max size above 1MB is not recommended! Raising this limit increases the minimum memory requirements and will decrease your memory efficiency.

...But the SERVER_ERRORs go away, because all objects are storable.

I've noticed that (a) high-performance distros such as Pantheon run memcached with the default object size, and (b) people don't seem to encounter this error -- ever, if the silence on this and other forums is indicative.

Have you run into this before on sites with, e.g., very large views results? Have you had to tune memcached in this or other ways to resolve?

Comments

Happens on more than one site

mshmsh5000's picture

After some more internal auditing, I've found that at least two of our sites are failing to store objects in memcached because of the default size limitation.

Both sites have large CCK content types and many Views. Site 1 has more complex CCK definitions than does Site 2. Site 1 has much more content than does Site 2.

Examples of oversized objects failing to get stored in memcached:

cache_views
* Site 1: 1.2 MB
* Site 2: 2.6 MB

cache_menu
* Site 1: 1.2 MB, 1 more object over 1 MB
* Site 2: 1.5 MB, 13 more objects over 1 MB

cache_content
* Site 1: 1.1 MB
* Site 2: 1.9 MB

Under the default memcached config, none of these objects will be cached. I see this as the responsibility of the PHP memcached client layer itself: objects larger than the 1 MB default slab size must be broken up to be cacheable, and subsequently reconstructed from their parts.
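
For illustration, here's a rough sketch of what such a split/reassemble wrapper could look like on top of the Drupal 6 cache API. The function names and the 900 KB threshold are made up, and neither Cache Router nor the memcache module does this out of the box -- it's just the shape of the idea:

<?php
// Hypothetical sketch only: split oversized items into sub-1 MB chunks on
// write and reassemble them on read.
define('CHUNKED_CACHE_MAX', 900 * 1024); // stay safely under memcached's 1 MB default

function chunked_cache_set($cid, $data, $bin = 'cache') {
  $payload = serialize($data);
  if (strlen($payload) <= CHUNKED_CACHE_MAX) {
    cache_set($cid, $data, $bin);
    return;
  }
  $chunks = str_split($payload, CHUNKED_CACHE_MAX);
  // A small "manifest" item records how many chunks to fetch back.
  cache_set($cid, array('chunked' => TRUE, 'count' => count($chunks)), $bin);
  foreach ($chunks as $i => $chunk) {
    cache_set($cid . ':chunk:' . $i, $chunk, $bin);
  }
}

function chunked_cache_get($cid, $bin = 'cache') {
  $item = cache_get($cid, $bin);
  if (!$item || !is_array($item->data) || empty($item->data['chunked'])) {
    return $item; // ordinary item, or a miss
  }
  $payload = '';
  for ($i = 0; $i < $item->data['count']; $i++) {
    $chunk = cache_get($cid . ':chunk:' . $i, $bin);
    if (!$chunk) {
      return FALSE; // any missing chunk invalidates the whole item
    }
    $payload .= $chunk->data;
  }
  $item->data = unserialize($payload);
  return $item;
}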

To verify, try running memcached straight from the command line with verbose output. In its simplest form:

memcached -vv -p 11211

Then run your app against this instance and look out for SERVER_ERROR messages. It's possible your app is failing to store some of its big objects in memcached, which means a significant loss of efficiency -- each Drupal process that needs this info has to reconstruct this object from the DB, then will again attempt (and fail) to store the object in memcached.
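
If you only want to see the failures, one way is to filter the verbose output, for example:

memcached -vv -p 11211 2>&1 | grep SERVER_ERROR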

PECL punt

mshmsh5000's picture

A quick search of the PECL memcache project's bug list indicates that they consider this a tuning issue:

http://pecl.php.net/bugs/bug.php?id=4110

This is actually a limitation

slantview's picture

This is actually a limitation of the memcached daemon and covered by their FAQ.

http://code.google.com/p/memcached/wiki/FAQ#What_is_the_maximum_data_size_you_can_store?_(1_megabyte)

Missing info

mshmsh5000's picture

That explains the default limit, but not that you can actually change it without recompiling, simply by passing a larger value via the -I flag.

From memcached -h:

-I Override the size of each slab page. Adjusts max item size (default: 1mb, min: 1k, max: 128m)

We've found much higher efficiency and reliability so far by overriding the max item size than by allowing memcached to fail to store these large objects. The cost is RAM.
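
On Debian/Ubuntu-style installs the daemon options usually live in /etc/memcached.conf, so the override can go there rather than on an ad-hoc command line. The values below are only an example, and -I needs memcached 1.4.2 or later:

# /etc/memcached.conf (excerpt -- adjust paths and sizes for your setup)
-m 256      # memory for this instance, in MB
-p 11211    # listening port
-I 3m       # raise the max item / slab page size from the 1 MB default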

That's great news! I had no

slantview's picture

That's great news! I had no idea you could pass a flag. Good find. What size are you running it at?

-I 3m

mshmsh5000's picture

After auditing the cache data, we started running these three bins (cache_content, cache_menu, and cache_views) with -I 3m. Everything gets stored in memcached now.
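
The general shape of that setup is one memcached instance per bin, with only the oversized bins getting the larger item size. The ports and memory sizes below are arbitrary examples, not our production values:

memcached -d -m 64  -p 11211          # default bin, standard 1 MB items
memcached -d -m 128 -p 11212 -I 3m    # cache_content
memcached -d -m 128 -p 11213 -I 3m    # cache_menu
memcached -d -m 256 -p 11214 -I 3m    # cache_views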

I'd be interested to hear from our Pantheon friends whether they've looked at tuning memcached for these or other bins. Running with the defaults, the most expensive objects to construct from scratch are the ones not getting cached.

I got this prob once on

mkalbere's picture

I hit this problem once on "cache_menu". My solution was to gzencode / gzdecode the data being cached if it is bigger than XXkb.
This requires changing the structure that contains the cached data to add compression "detection".
If anybody is interested, I can try to isolate and publish my "hack".
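
A rough sketch of that kind of compress-on-write wrapper (not the actual patch described above; gzcompress()/gzuncompress() are used here because gzdecode() only exists in PHP 5.4+, and the threshold and wrapper array are purely illustrative):

<?php
// Illustrative only: compress large cache payloads and flag them so reads
// can detect and decompress them.
define('CACHE_GZIP_THRESHOLD', 100 * 1024); // compress anything over ~100 KB

function gz_cache_set($cid, $data, $bin = 'cache') {
  $payload = serialize($data);
  if (strlen($payload) > CACHE_GZIP_THRESHOLD) {
    $data = array('gz' => TRUE, 'data' => gzcompress($payload, 6));
  }
  cache_set($cid, $data, $bin);
}

function gz_cache_get($cid, $bin = 'cache') {
  $item = cache_get($cid, $bin);
  if ($item && is_array($item->data) && !empty($item->data['gz'])) {
    $item->data = unserialize(gzuncompress($item->data['data']));
  }
  return $item;
}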

I think fixing this at the

catch's picture

I think fixing this at the memcache layer is the wrong approach. As a stop-gap it might be better than nothing, but it just defers the root cause, which is misbehaving application code.

Both core and contrib have several bad caching patterns, the main one being "put all possible configuration into a massive array, cache it as a single item, then load it into a static on every request". This isn't just about objects not getting stored; it also bloats PHP memory usage, etc.
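
To make the anti-pattern concrete, here is a schematic contrast; the module and function names are invented for illustration, not real core or contrib code:

<?php
// The anti-pattern: everything in one giant cached array, loaded into a
// static on every request whether or not the page needs most of it.
function mymodule_info_all() {
  static $info;
  if (!isset($info)) {
    $cache = cache_get('mymodule:info');
    if ($cache) {
      $info = $cache->data;
    }
    else {
      $info = mymodule_rebuild_all(); // hypothetical: returns a multi-megabyte array
      cache_set('mymodule:info', $info);
    }
  }
  return $info;
}

// Finer-grained alternative: cache per item, so each entry stays small and
// only what the current request needs gets fetched and unserialized.
function mymodule_info($type) {
  $cache = cache_get("mymodule:info:$type");
  if ($cache) {
    return $cache->data;
  }
  $info = mymodule_rebuild_one($type); // hypothetical per-item rebuild
  cache_set("mymodule:info:$type", $info);
  return $info;
}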

These issues are not unfixable; see http://drupal.org/node/1009596, which drastically reduces the size of the CCK content info cache, for example, and other issues tagged with "memory": http://drupal.org/project/issues/search?projects=&issue_tags=memory

I couldn't agree more. I

Jamie Holly's picture

I couldn't agree more. I really think we need some sort of best practices/guideline documentation for caching. Things like:

  • A lot of the time, querying the database is just as quick as, if not quicker than, using the cache. Optimize all queries as far as you can before deciding to cache.

  • A high query count isn't always an indicator of poor performance. In many cases, running a couple of queries and assembling your object in PHP can be quicker than one big join that can't be optimized. (Not really limited to caching; more of a general rule.)

  • Account for the cost of serializing/unserializing your cache object when deciding if/when/what to cache. If you're caching an object simply to eliminate redundant processing, compare the execution time, CPU, and memory needed to fetch and unserialize it against just running the computation again (see the sketch after this list).

Just some quick ideas, but you can see where I'm going here.
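
For instance, a rough measurement along the lines below can settle the last point case by case. The views_data:en cid comes from the log earlier in this thread, and views_fetch_data() stands in for whatever expensive computation you're caching; swap in your own item and rebuild call:

<?php
// Crude comparison: fetching + unserializing a large cached item vs. rebuilding it.
$t0 = microtime(TRUE);
$cached = cache_get('views_data:en', 'cache_views');
$fetch_ms = (microtime(TRUE) - $t0) * 1000;

$t0 = microtime(TRUE);
$rebuilt = views_fetch_data(); // the expensive rebuild the cache is meant to avoid
$rebuild_ms = (microtime(TRUE) - $t0) * 1000;

drupal_set_message(t('cache_get: @fetch ms, rebuild: @rebuild ms', array(
  '@fetch' => round($fetch_ms, 2),
  '@rebuild' => round($rebuild_ms, 2),
)));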


HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.

Agreed

mshmsh5000's picture

Fixing the root cause is the better long-term solution, but not an option for those addressing a race condition on a production site. As the memcached docs themselves say, tweaking the item/slab size isn't the recommended memory configuration, as it leads to inefficient caching.

On the other hand, the most recent instance of this I saw was in the global variables cache -- it crept above 1 MB after yet another module was enabled, the straw that broke the camel's back. I don't see this as fixable in D6 at this point; that would be a major reworking of the global variable cache.

See

catch's picture

See http://drupal.org/node/987768 - I have this backported as a patch for Drupal 6/Pressflow already.

The Drupal memcache project compresses most objects, so the variables cache reaching 1 MB uncompressed should not necessarily lead to it not being stored. It's usually the CCK and theme registry caches that get close to that limit first.

Reading your post...

catch's picture

... I see you're using Cache Router; it might be worth trying http://drupal.org/project/memcache instead.
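
For reference, the memcache module is wired up in settings.php roughly like this on D6/Pressflow (per its README). The path, host, ports, and bin mapping below are only an example, with cache_views pointed at an instance started with -I 3m:

<?php
// Example settings.php snippet -- adjust the path to memcache.inc and the
// server list for your own install.
$conf['cache_inc'] = './sites/all/modules/memcache/memcache.inc';
$conf['memcache_servers'] = array(
  '127.0.0.1:11211' => 'default',
  '127.0.0.1:11214' => 'views',   // instance started with -I 3m
);
$conf['memcache_bins'] = array(
  'cache'       => 'default',
  'cache_views' => 'views',
);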

I'm having a hard time

Alex Andrascu's picture

I'm having a hard time finding the D6 / Pressflow patch you're mentioning. Please help.

See Pressflow Launchpad code for 'catch' code

Peter Bowey's picture

Go to: https://code.launchpad.net/~catch-drupal

Catch [Nat] has 5 cache 'releases' for Pressflow there....

Otherwise, we assume that the patch at http://drupal.org/files/variable_cache_0.patch
is what could be used for Pressflow / Drupal. Though I note that the last comment from catch there, #98, was:

I'm probably going to backport this to Pressflow next and profile on a real install - that should give better numbers when more variables are actually in use. The other thing to look at is how many variables tend to be cache misses when visiting multiple different pages.

Beyond comment #98, nothing else is mentioned about Pressflow at http://drupal.org/node/987768.

--
Linux: Web Developer
Peter Bowey Computer Solutions
Australia: GMT+9:30
(¯`·..·[ Peter ]·..·´¯)

Hey Pete, thanks a lot for the

Alex Andrascu's picture

Hey Pete,

Thanks a lot for the quick reply. I will take a look at the patch you mentioned. My problem is precisely with the theme_registry:[theme_name] entry in the cache table, as mentioned here: http://drupal.org/node/1011614
It's currently around 2.8 MB, which is bigger than memcached's 1 MB slab size.
I know there are some patches around, and catch has probably dealt with this in the past, but I still have a hard time finding them :)

Alex

Thank you

mshmsh5000's picture

Catch, thanks for the patch link and the suggestion to use the Memcache module. Rodale (my employer) has a Drush makefile that's the basis of all of our high-traffic sites. It includes Cache Router now, but I'm working to make Memcache the module of choice instead.

Re: the patch, that's much-improved logic for the cache. I'm still finding way too many contrib modules that balloon, say, menu cache objects, and this can quickly bring down a site.

Thanks again for all your performance work.

FWIW:

Vacilando's picture

FWIW: https://drupal.org/node/435694 (not really a solution but better than nothing).


---
Tomáš J. Fülöpp
http://twitter.com/vacilandois

Alternatively, use

pingers's picture

Alternatively, use redis. It doesn't have the same object size limitations as memcache (See http://redis.io/topics/data-types 512M). You could continue to use memcache for most cache bins, but the known problematic, large ones like views_data could be moved to redis.