I am a big fan of Drupal and have used Drupal on more than 50 websites so far. My team has noticed many issues with performance for a website which was serving high traffic. We used modules like Boost for static file caching.
Some programmers I consulted told me that we should look at database optimization, so I hired a person for DB optimization. He made a profiling tool that records all the queries our Drupal site makes.
Now, the site serves around 10k visitors a day, and the number of queries for the last 12 hours is 6.1 million.
Some Facts important to be mentioned here
We are using very limited Modules
Views >> We recently stopped using the "most popular" and "recent popular" views due to performance issues, but Views is still enabled.
AcidFree Galleries >> Showing last three images post in galleries
We do not log access files due to performance reasons.
Server has 6 GB RAM, Intel Quad Core, Fast Hard Drive
Now, I want to ask: why does Drupal make so many queries? Even though we have not installed many modules (in our case, around 3-4 modules in addition to the Drupal core modules), this number is too big. 12 million in a day for 10k visitors?
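To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch. It assumes the second 12 hours of the day look like the measured 12 hours, which the post does not confirm, and it says nothing about how many pages each visitor loads.

```python
# Figures quoted in the post above.
queries_in_12_hours = 6_100_000
visitors_per_day = 10_000

# Assumption: the unmeasured half of the day has the same query volume.
queries_per_day = queries_in_12_hours * 2
queries_per_visitor = queries_per_day / visitors_per_day
queries_per_second = queries_per_day / (24 * 60 * 60)

print(f"{queries_per_day:,} queries per day")
print(f"{queries_per_visitor:,.0f} queries per visitor")
print(f"{queries_per_second:,.0f} queries per second on average")
```

The per-second average is the figure to compare against what the database server can sustain; the per-visitor figure only becomes a per-page figure once the number of pages per visit is known.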
Drupal may work fine for small websites, but for websites that start getting some traffic, this CMS doesn't seem to work as well. Does the Drupal team need to look at the performance side?
Regards

Comments
That seems absurdly high. I
That seems absurdly high. I don't know about stats for database queries, but we do anywhere from 500k - 1m unique visitors per day and we are running on one database server (with a hot spare doing replication), 4 web servers and 2 memcache servers, with most of the machines sitting idle.
I would suggest you look at cache router / memcache / boost.
http://drupal.org/project/cacherouter
http://drupal.org/project/memcache
http://drupal.org/project/boost
Steve
Slantview Media http://www.slantviewmedia.com/ | Blog http://www.slantview.com/
hi thanks for your reply. We
Hi,
Thanks for your reply.
We are already using Boost.
I will try Cacherouter now.
We tried memcache, but for some reason it didn't work well for us.
I saw your case study for performance about divx.com, hope to see more results.
BTW, we are running Drupal 5.14.
Acid free galaxies
For your Acidfree gallery block, you might have a look at the blockcache module, which gives you a cached version of the block: http://drupal.org/project/blockcache
A couple of comments about caching and queries
First, who cares how many queries Drupal has? If your database can handle it, why worry how many? Find your slowest queries, using MySQL's slow log and make sure the slowest queries in aggregate aren't slowing down total database performance time. If you believe the individual page load times are being hampered by excessive queries then use the devel module to identify which queries are performing slowly and tune those specific queries. Tune your database accordingly.
Second, when you say you were having problems with most viewed and most popular, think about what this means. For every page load, you are recalculating all your views and the most popular list. The more content and the more viewers you have, the less this will scale. The solution is to use block caching so that these dynamic features are recalculated every 5 minutes instead of a hundred times a second.
Third, do you know you have a database problem? Did you use the devel module to compare total page generation time (PHP) with total database query time? I frequently see people worrying about their database performance when in fact the database accounts for only 0.3s of a page that is measured at 3-5s total. Page generation and load times frequently have to do with optimizing other parts of the LAMP stack.
For detailed analysis of how to tune your Drupal site start here: http://tag1consulting.com/performance_checklist
Cheers,
Kieran
Going by number of queries
Going by number of queries doesn't really help. As an example, a site I managed used to be on Wordpress. The most queries needed for any page was 17. If we had more than about 15,000 pi in an hour it would crash our database. I moved the site to Drupal and our pages average between 80-100 queries. Now we have had periods with over 50,000 pi in an hour and the database server sat near idle.
What you need to do is look at the slow queries. I would enable the query log in the Devel module and select to sort it by duration. Take your longer queries and run an explain on them in MySQL and see if they are doing anything bad like full table scans or filesorts.
You do seem to have an abnormally high number of queries (averaging about 600 per page). Using Devel you can narrow that number down further - such as if one page is generating a lot more queries than another.
Another thing that generates a lot of queries, depending on the site's layout, is the path module. There are various patches around that help eliminate many of those queries, either by ignoring lookups for certain paths (ie: admin), or by utilizing another caching backend for them like memcache.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
Totally agree - not the
Totally agree - not the amount of queries, but the quality of database design matters. I once reviewed the database of a (non-Drupal) site that had ~1 million pageviews/day, averaging ~40-100 queries/second, built on a custom PHP script, and the database had NO indexes at all!
Of course, Drupal is very smartly designed, but the slow query log should be very high on any performance checklist.
---
naslenas.com. Something not interresting about Drupal.
drupal+me: jeweler portfolio
On a heavy site
On a heavy site, you can see up to 1000 questions per second, as measured by
show status like 'Questions';
So that is around 8.6 million a day. Even with memcache enabled for anonymous users.
However, it may be easier just to enable devel and see what it reports as the number of queries for some representative pages, and see if it is high.
Drupal performance tuning, development, customization and consulting: 2bits.com, Inc..
Personal blog: Baheyeldin.com.
By 1000 questions per second
By 1000 questions per second do you mean db queries?
Anyway 1000 per sec = 86.4 mil queries per day. Now isn't that an amazing number...
We are running such a website, and using devel I have measured over 1000 per sec many times. We have experienced MySQL overload and crashes a few times. Boost helped a lot, but we still don't have a real fix.
Litenode
Here's an interesting take on reducing load time.
http://www.developmentseed.org/blog/2009/feb/4/litenode
Kieran
Optimize code?
Have you written any template code that's running unnecessary queries, or module code that could be improved? Yesterday I found some code I wrote ages ago which was very sloppy as I wasn't aware of how Drupal could automate some things for me, so half an hour rewriting it and updating related code dropped about twenty queries off my site's homepage.
200-300 queries per typical page
That's what I'm seeing on one client site right now, at least according to the Performance logs & Devel module. That's on a node page, viewed by an anonymous user, displaying 7 blocks (search, Primary Nav, another menu block, one CCK block, a Print/Email block, and two text blocks), with page caching in "normal" mode. The node on that page has 2 CCK fields, both filtered textareas.
"Tune your stack" is really not a very useful answer for the vast majority of sites. I realize this is a "high performance" group, but when you are running 300 queries per typical anonymous page, it seems to me that there's a problem right out of the gate.
Misbehaving Module
A single misbehaving module could explode the number of queries.
If you can narrow it down to what module / block / etc. contributes the most queries, that will go a long way towards fixing it.
Ken Winters
I wasn't trying to hijack the
I wasn't trying to hijack the thread w/ a support request, FWIW. But since you're asking:
reptag is the single biggest offender. It runs 14 queries, and each of those 13 times (182 queries). (I'm getting these details from a logged-in user, but the math makes it unlikely that reptag is doing much less for anons.) So, yes, that's pretty bad. I made heavier use of reptag on that site than I have since, which could help explain why the newer sites are so much faster. The real mystery is why it's showing up on this page at all, since I don't use any replacement tags there.
The most expensive queries are ones that I suspect aren't getting run for anons. (Curiously, though, the perceived load speed is about the same for anon and admin.) I'm doing some work on this site tonight, so maybe I'll enable log visibility for anons for a little while and get some #s from that to be sure what I'm really looking at.
What's especially interesting to me is that there are still so many unique queries. That's about 180 queries, most of them unique. (Based on the anon figure of 368, I think I said. Actual Admin-login page I'm getting details off of has about 398 queries.)
A cached page should only be
A cached page should only be generating a handful of queries (mainly to load the session, check that the user isn't blocked, and load the cache). If you are getting 300-400 for an anonymous user requesting a page that is in the cache then there is something seriously wrong with your site.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Looks as though that's very difficult to verify
I think you must be talking about Aggressive Mode caching. I've got normal mode caching (not aggressive mode), no block caching.
In that scenario, I'm seeing 314 queries. Some highlights: cache_get, 43 (at least 15 of those are non-unique queries); drupal_lookup_path, 31; menu_get_item, 13.
If I switch to aggressive mode caching, I see 332 queries on the first anon load of that page after the change in cache settings, and then I don't see that page show up on the performance log again. I don't see any because the page loads don't get recorded in the Performance Log.
So, I switch back to normal caching and reload the page. Again, nothing shows up in Performance Log. Clear browser cache; try a different browser; still no page load.
Only when I clear the site caches do I see an anon page load recorded in the Performance log. (This is with normal caching, not aggressive.) Then I don't see another until the next time I clear the site caches.
If I make Devel data visible to anon users, I can see the queries, but again, I see 312 on the page (311 in Performance Log).
So what I'm seeing is that Devel at least doesn't provide a means to ascertain how many queries are being run when any caching is switched on, and that if the devel data is exposed to anon users, caching is effectively switched off.
Are there any tools that would give a view onto the number of queries that are actually run per page load with caching on?
Sounds a lot like something
Sounds a lot like something odd is going on with a module or something. With devel output enabled for Anon and the options to display query output enabled, I do see the queries on the anon page view. First load after cache dump has all the queries and is the standard formatted table. Once the cache is primed then I get a standard print_r output of the global $queries. On a site with about 60 modules enabled, on a standard cached anon page it only has 7 queries (session and module load). Technically that is 10 queries though as there are 3 queries that execute before queries have the option to be saved (load variables from cache, access rules checking and session). That's because Drupal won't save queries unless the variable dev_query is set to 1.
With aggressive caching you won't see this information. That's because of how aggressive cache works. Aggressive cache won't load any modules, even if they are set to bootstrap. Check out _drupal_bootstrap in bootstrap.inc and you'll see the logic under the DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE section, but I can tell you that on the site I have, Drupal does 4 queries total. Here's a trick to find this out.
Disable all the query stuff in the devel module. In your settings.php override the dev_query variable:
$conf['dev_query'] = 1;
In _drupal_bootstrap in bootstrap.inc, make $queries a global. Look for:
// We are done.
exit;
under the DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE case of _drupal_bootstrap. Before the exit add:
echo '<pre>';
print_r($queries);
echo '</pre>';
Now you get a dump of every single query that runs, including the 3 that are missed during bootstrap. Just be warned that everyone will see these queries on every single page, so if you are on a production site I really wouldn't do it.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
I think I'll build your debug
I think I'll build your debug suggestions into my themes henceforth. I can easily enable them as-needed and do a couple minutes of testing. Not a good idea for high-traffic sites, but we don't generally build those. Plus, if it's a server-load issue, I'm thinking I should be able to get representative performance by creating a second website on the same server, in a subdomain that's got HTTP auth to prevent spidering. So, not such a bad idea, if handled with caution.
You might want to review the
You might want to review the bootstrap process or use a debugger to get a better idea of how this works. For normal-mode caching there are seriously just a handful of queries when there is a cache hit (unless you have a contrib module that is making excessive use of hook_boot(), hook_exit(), or shutdown functions). Aggressive mode caching happens before hook_boot() is called so that handful of queries gets reduced to just a couple.
I'm not quite sure why Devel is mis-reporting information during normal caching. In theory it should still work because it registers its shutdown function during hook_boot().
And of course, if the page is not in the cache, it will require those 312 queries to build it.
Also keep in mind, as mentioned elsewhere in this thread, that the number of queries per page is a fairly useless metric.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Thanks for the feedback. But
Thanks for the feedback.
But w.r.t. the number of queries as a useless metric: Doesn't each query introduce new overhead? So, forget about the total number of queries, and consider total cost of queries, which is going to be the sum of the execution times. Which is going to be more with more queries. So yes, raw # of queries is not as helpful as you might think, but the # of queries translates into a higher execution time than if queries are omitted. (E.g. by caching.)
And if you have high latency, doesn't a very large number of queries compound that problem?
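The latency question can be made concrete with a small sketch. The round-trip times below are hypothetical, not measurements from any site in this thread, and the calculation deliberately ignores query execution time so that only the per-query network cost is visible.

```python
def added_latency_ms(query_count: int, round_trip_ms: float) -> float:
    """Time spent purely on web-to-database round trips, ignoring execution."""
    return query_count * round_trip_ms


# Query counts mentioned in the thread: a cached page vs. an uncached page.
for queries in (7, 300):
    # Hypothetical round-trip times: same host vs. a separate database server.
    for round_trip_ms in (0.1, 1.0):
        total = added_latency_ms(queries, round_trip_ms)
        print(f"{queries:3d} queries x {round_trip_ms} ms = {total:.1f} ms")
```

Under these assumptions the per-query overhead is negligible for a handful of queries and becomes a visible share of the page time only when both the query count and the round-trip time are high.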
If your database resides on a
If your database resides on a separate server, and you have optimized all your slow queries, then yes the latency between web server and database server does come into play. But I find that by enabling block caching, plus a bit of manual caching if necessary, the database time becomes manageable. YMMV.
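As an illustration of the "manual caching" idea, and of the five-minute block cache suggested earlier in the thread, here is a minimal language-neutral sketch in Python. It is not Drupal code: `TimedCache` and `build_popular_block` are hypothetical names, and a real site would use Drupal's own cache functions or a block caching module instead.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TimedCache:
    """Keep a computed value for ttl_seconds, then rebuild it on the next request."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, build: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]          # still fresh: no rebuild, no queries
        value = build()              # expired or missing: pay the cost once
        self._store[key] = (now, value)
        return value


build_calls = 0


def build_popular_block() -> str:
    """Hypothetical stand-in for an expensive 'most popular' query."""
    global build_calls
    build_calls += 1
    return "<ul>...</ul>"


cache = TimedCache(ttl_seconds=300)  # five minutes, as suggested in the thread
for _ in range(100):                 # one hundred page requests in quick succession
    cache.get("most_popular", build_popular_block)
print(build_calls)
```

The trade-off is staleness: the block can be up to five minutes out of date, which is usually acceptable for "most popular" style listings.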
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Core Patch
Here's a core patch that should start to reduce the # of queries needed. This doesn't target cache hits though.
http://drupal.org/node/512962#comment-2463820
I have a list of functions that could benefit from the db_multi functionality in the next comment. Thought on this would be appreciated.
Insane number of Queries: ~2500...can it be???
I'm quite new to Drupal and I'm usually more interested in PHP frameworks such as Kohana, Yii, Symfony and CI. However, I felt I needed to know more about Drupal because the concept of the "node" and its extension using CCK is very useful for building complex data type management and presentation. The modularity and granularity of this CMF is astonishing (I've been experimenting for a couple of months, reading everything I could find on improving site performance and patching the modules I needed accordingly).
Nevertheless some facts made me think about performance...
::Number of queries per page (250~400) for authenticated users with about 20 modules installed (tagadelic and taxonomy_redirect have been patched to cache results in static variables). 20 may seem a lot, but they just give the site common CMS functionality such as WYSIWYG, multilanguage and image galleries.
::Number of queries for deactivation of a module (in this case "taxonomy_redirect" module) operation is 2513 ????????
..reporting Devel module output...
Page execution time was 16451.83 ms. Executed 2513 queries in 2452 milliseconds.
Memory usage:
Memory used at: devel_init()=2.01 MB, devel_shutdown()=35.21 MB.
I'm used to improving web applications by reducing the number of queries from 13 to 9... of course that's nothing compared to Drupal's capabilities... but 2500 queries to deactivate something seems a bit too much to me...
So, am I doing something wrong? Can anybody explain what I am doing wrong? Is anybody else having the same problem?
::Reactivating the module:
Page execution time was 14881.73 ms. Executed 2632 queries in 3087.63 milliseconds.
Memory usage:
Memory used at: devel_init()=2.01 MB, devel_shutdown()=35.49 MB.
Yes...I'm definitely doing something wrong....
::Trying to deactivate another module (tagadelic)...
Page execution time was 17027.04 ms. Executed 2469 queries in 2830.52 milliseconds.
Memory usage:
Memory used at: devel_init()=2.01 MB, devel_shutdown()=35.19 MB.
????
loose-coupledness / related question re. D7
1: Isn't a lot of this query proliferation due to the loose-coupling philosophy -- i.e., build an implementation by integration of many small independent parts? When you build one big app, you can consolidate the queries; but when you build an application by integrating many small, independent apps, you don't have an easy way to do that. (This doesn't explain the extraordinary number of queries required to enable/disable modules, though.)
2: Aren't there some query consolidation features in D7? How would they affect these ##s? My understanding is that D7 will actually degrade performance for an implementation like maksfeltrin's*; are those reasons related to PHP or SQL?
--
maksfeltrin's execution times seem to me to clearly indicate he's on a shared server -- that his site doesn't have many resources available to it. Either that or it's a local dev site (I see similar execution times on MAMP).
loose-coupling philosophy
Thanks for reply (escoles and dalin)
Yes... I wanted to test it on a shared server (a popular Italian hosting provider used by many of my customers). I still get poor performance on my web server (FreeBSD 4.9... yes, still 4.9... very stable and lightweight, server apps compiled from source excluding MySQL, ~40 websites) and on my laptop test server (FreeBSD 8.0).
I agree that the loose-coupling philosophy inevitably leads to performance degradation. In my opinion, database-related code (DAO, ORM) should have an internal caching mechanism (as Doctrine does, for example). This is even more important in a project where thousands of (good) people are known to contribute. I also think that many configuration options, like module activation and deactivation, which are admin-related, should be put in configuration files. In my experience database caching is faster than filesystem caching, so I agree with Drupal's choice. So... I hope that the next releases of Drupal (>=8) will focus more on performance improvements than on admin interface enhancements. Once you separate admin roles from editors, the D6 interface may appear simpler than Joomla to the average end user who just needs to input content, pictures and tags/categories.
...just to make it clear: my point is not to criticize anything, just to outline the problems i encountered. I always consider problem reporting as the first step for improvement.
Hi, you mentioned: " In my
Hi,
you mentioned: " In my opinion database related stuff (dao,orm) should have an internal caching mechanism (like Doctrine does as an example)" - but didn't you just experience that? There is a lot of caching and because you effectively flushed the cache (by enabling/disabling functionality), you see a lot of queries.
Well (de)activating modules
Well (de)activating modules isn't something that you do on a regular basis. Many caches get cleared at this point in time (though there are a few issues open to reduce that. Search for them). And here is where a lot of caches and registries get rebuilt. The expensive stuff gets offloaded here so that it doesn't happen on every page load.
--
Dave Hansen-Lange
Director of Technical Strategy, Advomatic.com
Pronouns: he/him/his
Cut that number in half with a patch
The patch right above your first message will greatly reduce the number of queries; I estimate by at least 1/3, maybe close to 1/2.
http://drupal.org/node/512962#comment-2463820 - First one
http://drupal.org/node/512962#comment-2819410 - Latest one
I plan on making this patch less aggressive (only push 1MB of data at a time), so it will require fewer functions.
Thanks, i will try your
Thanks, I will try your solution. I think I will also start again from the basic core installation and look into the code of the modules as I activate them, one by one, to see what happens and whether things can be done differently when performance is affected. It will take time, but I think it's worth it.
Reducing the number of queries
I was able to reduce the number of queries from 224 to 139.
In my case the problem was in the tagadelic module's theme function. I have two tag cloud blocks with ~40 terms in total across two vocabularies (per-language terms). After a deep analysis of the tagadelic and taxonomy_redirect modules, I found that each call to Drupal's l(..) function issued drupal_lookup_path(..) twice per term.
Overriding or patching (as shown below) the theme_tagadelic_weighted(..) function does the trick.
In my case vocabulary terms are either specific to a language or have no language, and they link to a view which filters nodes by term (language and no-language); my tagadelic module displays only the tags for the current language or with no language.
Replaced code is commented out....
<?php
function theme_tagadelic_weighted($terms)
{
  global $language;
  $output = '';
  if (module_exists("i18ntaxonomy")) {
    $terms = i18ntaxonomy_localize_terms($terms);
  }
  foreach ($terms as $term) {
    // Original code, replaced because l() triggers path alias lookups:
    //$weight = $term->weight;
    //$output .= l($term->name, taxonomy_term_path($term), array('attributes' => array('class' => 'tagadelic level'.$weight, 'rel' => 'tag'))) ." \n";
    // Build the link by hand. check_plain() escapes the term name, as l() did.
    $output .= '<a href="'.base_path().$language->language.'/'.taxonomy_term_path($term).'" class="tagadelic level'.$term->weight.'" rel="tag">'.check_plain($term->name).'</a> ';
  }
  return $output;
}
?>
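The effect of repeated path lookups described above can be illustrated with a small language-neutral sketch in Python. It is not the poster's fix (which bypasses l() entirely) but shows a different technique, memoization, applied to the same problem. `lookup_path` is a hypothetical stand-in for an alias lookup that costs one database query, and the 40 term paths are made-up examples.

```python
from functools import lru_cache

lookup_count = 0


def lookup_path(path: str) -> str:
    """Hypothetical stand-in for an alias lookup that costs one database query."""
    global lookup_count
    lookup_count += 1
    return "alias/" + path


@lru_cache(maxsize=None)
def cached_lookup_path(path: str) -> str:
    """Same lookup, but each distinct path is only resolved once per process."""
    return lookup_path(path)


terms = [f"taxonomy/term/{n}" for n in range(40)]

# Uncached: two lookups per term, as described for l() above.
for path in terms:
    lookup_path(path)
    lookup_path(path)
uncached = lookup_count

# Memoized: the second lookup of each path is served from memory.
lookup_count = 0
for path in terms:
    cached_lookup_path(path)
    cached_lookup_path(path)
cached = lookup_count

print(uncached, cached)
```

Memoization halves the lookups in this sketch, while building the URL by hand, as in the PHP above, removes them completely at the cost of skipping alias resolution.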
MySQL Database Engineering Suggestions....
The number of redundant queries is the issue here.
Why beat around the bush?
Drupal is VERY database-intensive.
Eliminating data and process redundancy is what transactional relational database technology is for...
So why not look at how to eliminate read query redundancies per page??!
Memcached and varnish can really help, but why not max out what the database can do?
Several MySQL architectures can confront database bottlenecks.
Gather, Collect and Parse MySQL Slow Query Logs
** => Parse for cumulative impact of small waits ('misdemeanors') and large hog queries ('felonious offenders').
** => Watch MySQL Processlist Count for Sleeping Connections
Provision Dedicated Physical Database Server(s)
MySQL Replication Between Master and Slave Database Servers,
Network Data Source Connection Management & Streamlined Default and Configurable Caching Strategies
=> Avoids sleeping DB connections and supports effective caching,
evidenced by cache performance ratios (request:hit).
Stored Routines and UDF's
Note that there is now the option of encapsulating the functionality of memcached
http://www.bluegecko.net/mysql/memcached-functions-for-mysql-1-1-released/
After properly implementing MySQL, you will find that all of your caching strategies
will become easier to implement and to manage.
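The point about the cumulative impact of small waits versus large hog queries can be sketched as follows. The samples are hypothetical and not taken from any real slow query log; a real workflow would first parse and normalize the log entries, a step this sketch skips.

```python
from collections import defaultdict

# Hypothetical samples: (normalized query, seconds). Not from a real log.
samples = (
    [("SELECT dst FROM url_alias WHERE src = ?", 0.004)] * 1000
    + [("SELECT nid FROM node ORDER BY created DESC", 1.5)] * 2
)


def cumulative_impact(samples):
    """Return (query, count, total_seconds) rows, largest total first."""
    totals = defaultdict(lambda: [0, 0.0])
    for query, seconds in samples:
        totals[query][0] += 1
        totals[query][1] += seconds
    rows = [(query, count, total) for query, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for query, count, total in cumulative_impact(samples):
    print(f"{total:6.1f}s  {count:5d}x  {query}")
```

In this made-up data the fast but frequent query costs more in total than the slow but rare one, which is why ranking by summed time rather than by single-query duration is worth doing.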
Jeremy Donson
http://www.urbanspectra.com/resume
jjdonson@gmail.com
Jeremy Donson
Database and Systems Engineer
New York City
Drupal is not intended for that.
Drupal is a CMS. You can put together a good set of modules and build a nice node constructor, but if you have more than 50 users online at once you can't run the site without the page cache enabled.
So your site should be static for the most part. After you create a node, the entire node page goes into the page cache and is then served only from the cache with minimal database queries. There will still be several queries for access rules, some bootstrap calls, flood control and throttle.
With good database optimization, when your whole database fits into RAM, you will get good performance, but there is one catch. Drupal uses a lot of JOINs for content output, and JOINs require a lot of memory. The bigger your tables get, the more memory they take. At some point your free RAM will run out and your database will get stuck.
I saw a Drupal installation where a single page made 10k+ database requests. It was on a dedicated server and could not handle more than 10 users online. It's a terrible practice.
So it's not really Drupal's fault; it's more a bad choice of engine/tool for a site with a large number of visitors. That is true for all versions of Drupal.
Right now I'm working on Drupal optimization. The goal is to make caching part of the core and part of the Drupal architecture, so that a site can be dynamic, without aggressive page caching, but with as few database calls as possible. In the best case it shouldn't make any database queries at all and should serve all content from cache.
That avoids opening a database connection, and the speed in that case will be almost the same as when the entire page is returned from cache, but the page will be dynamic, not static!
The main idea is to precache all cacheable data only when the data changes, e.g. when a node is created or updated. Almost all data on the site is cacheable. You can build a node object with node_load(), put it into memcache, and from then on get it only from the cache without database calls. AFAIK that is already implemented in D8 as "EntityCache", but I'm working with D6 and don't really know a lot about D8 internals.
There would also be a cache warmup. When your memcache (and/or web server) starts, you can run a PHP script that fills the cache, and only then start serving HTTP requests.
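The precache-on-write and warmup ideas described above can be sketched in a few lines. This is a language-neutral illustration in Python under stated assumptions, not the poster's unpublished implementation: `NodeStore` stands in for the database, a plain dict stands in for memcache, and all class and method names are hypothetical.

```python
class NodeStore:
    """Stand-in for the database; counts reads so the effect is visible."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def save(self, nid, data):
        self.rows[nid] = data

    def load(self, nid):
        self.reads += 1
        return self.rows[nid]


class WriteThroughCache:
    """Refresh the cache whenever a node is saved, so reads never hit the store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # stand-in for memcache

    def save_node(self, nid, data):
        self.store.save(nid, data)
        self.cache[nid] = data  # precache on write

    def get_node(self, nid):
        if nid not in self.cache:  # fallback for entries that were never cached
            self.cache[nid] = self.store.load(nid)
        return self.cache[nid]

    def warm_up(self):
        """Fill the cache from the store before serving any requests."""
        for nid in list(self.store.rows):
            self.cache[nid] = self.store.load(nid)


store = NodeStore()
site = WriteThroughCache(store)
site.save_node(1, {"title": "Hello"})
for _ in range(1000):
    site.get_node(1)
reads_after_serving = store.reads  # no database reads while serving

site.cache.clear()                 # simulate a cache restart
site.warm_up()
reads_after_warmup = store.reads   # one read per node, paid before serving
print(reads_after_serving, reads_after_warmup)
```

The cost moves from read time to write time and startup time, which matches the trade-off the post describes: slower content updates in exchange for pages that need few or no database calls.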
All of the above makes a geo-distributed cluster of Drupal web servers possible. The database can be installed on one server with a fast HDD (without a lot of RAM), with many web frontends with local memcache in different DCs. Yes, content updates will be quite slow, but it's worth it, and it's much cheaper than one big expensive server or several expensive medium servers.
Most of the work is almost done, but it's a pity that I can't share it because it's forbidden by the Drupal trademark restrictions :-(
The most part of the work is
wat
The Boise Drupal Guy!
wat https://www.drupal.com/t
https://www.drupal.com/trademark
Thank you for reply. Andy clarified the meaning of the trademark rule below. No need to answer.
Drupal is for Sharing!
Drupal trademark restrictions have never stopped people sharing good ideas or suggested patches with the Drupal community. We even have a history of patched versions of core for performance improvements (most notably PressFlow).
The trademark protection simply allows Dries to control how the name Drupal is used - typically to prevent it being used in a commercial way that claims some official backing of the community when there is none.
So how would you like to share your work? Let's start a discussion. And if we need Dries to permit some use of the Drupal trademark (unlikely) we can ask Dries and he will advise on whether it's needed or what the best route is make this work available.
The community has always been open to proposed patches to core to improve anything.
What is new?
Riki, what has prompted you to write your post on this thread? I've just read it in more detail and your technical points are good, but you are posting on a thread started in 2009/10 and only updated once more in 2012. You say that all versions of Drupal are the same, but then acknowledge that D8 has significant improvements in EntityCache (based on the render_cache module in Drupal 7).
The technical issues you mention are well understood and have been worked on in many ways over the last 6 years. I assume you noticed that Drupal D8 was released last week and that Drupal 6 (which was very current when this thread started) is now in end-of-life support (which ends in Feb '16) - it really should not be the basis of future work.
You seem to criticise technology choices that were made many years ago for Drupal 6 (when they were probably the best balance of many competing factors) but then proudly announce that right now you're working on Drupal optimization, but only for Drupal 6. That is probably one of the most out-of-date choices you will ever make in your career.
Yes. Everything is correct.
Yes. Everything is correct. Sorry for necroposting.
Here is the story.
I started working with D6 5-6 years ago when D7 was in alpha. My company has a medium-loaded web site (15-20k visitors a day, approx. 10 views per visitor, and 1.5m+ hits overall from background processing, search engine crawlers, etc.). After 6 years of hard work I now have a heavily modified engine which is capable of doing all the work on one cheap server and can scale horizontally with minimal effort. But I was using Drupal not as a CMS to publish nodes, but more as an engine with a robust API for creating my own modules. It's something like an e-store with a lot of custom-written modules (~19 of our own modules, 8 core modules and 45 contrib modules). The site works without aggressive cache turned on because the pages are dynamic.
IMO Drupal has been moving for the past 6 years towards being more of a CMS than a CMF. The Drupal API is changing significantly, the complexity of Drupal core and core modules is rising accordingly, and there is no "LTS Drupal API" which developers can use without being afraid that they will have to rewrite everything from scratch every 2-3 years. I understand that Drupal is trying to stay on the "sharp edge" of technology and that creating an LTS API is easier said than done. But unfortunately I can't do anything about it.
So Drupal is moving on and we're left on our own island now. Yes, you're right, looks like I'm a little bit upset about D8 release and ending of D6 support.
I should probably try to port some features to D8.1, but the problem is that some of them are too radical, e.g. keeping the entire bootstrap registry in cache or not loading all the modules for a single page request. Another problem is that some of the changes are probably suitable only for my installation and cannot be used on common Drupal sites due to technical issues that I don't see yet. Also, I'm not familiar with D8, and I'm not sure that I will be able to handle the task.
A migration plan will also be a very hard task for us. My company has its own problems which the development department has to solve every day, and making a good upgrade plan would take a lot of resources.
So the easiest way will be to do nothing and continue to work on our own fork of D6.
Thank you for reply, Andy.
Drupal 8 is the way forward
Riki, I sympathise with your situation. There are many who have invested a lot in D6 and I still have a couple of small client sites running it. But D6 has had 5 years of support since D7 was released and I think that's about as long term support as we can reasonably expect, especially from an open source project which has traditionally been completely reliant on volunteer time (although that is beginning to change in good ways, with safeguards against commercial monopolies).
The more you describe the improvements you've made though, the more I feel that D8 is the way forward - it has the same ideas built in of only loading the modules needed, but it does this with the widely accepted PHP standard autoload mechanism (PSR-4) and in general is based on widely supported Symfony 2 framework components. Drupal has always been both a CMS and a CMF and many of the D8 changes are geared towards using it for services, including headless, not just traditional web front-ends.
You have probably achieved great things on your own over the last 6 years, but if you've done it your own way by modifying core, then you really can't expect any support from the Drupal community. There is nothing to stop you continuing to use D6 and others will too - for a while. There will probably be companies specialising in supporting older D6 sites that can't be migrated, but they will struggle to support a site where the core has been heavily modified. I've heard this story from developers numerous times over the years: "We took on a project to support an existing site (D6 or D7) but the core has been modified so it's impossible to maintain".
What the Drupal community has heavily committed to though, is to support migration from older versions of Drupal to D8, making migration from D6 a priority. You'll see lots more of this over the coming weeks and I'm still getting up to speed with it, but there is the migrate upgrade module (https://www.drupal.org/node/2257723), which does a lot of the legwork, plus Drupal Console (https://www.drupal.org/project/console) which has tools to create skeleton custom modules. There is also the Module Upgrader module (https://www.drupal.org/project/drupalmoduleupgrader) but this currently seems to be focussed on upgrading from D7, not D6 (yet).
All of those migration helpers
All of those migration helpers look promising, and I assume that migrating from D6 to D8 is not as big a deal as I thought.
But I'm sure that my problem is not unique. Using Drupal as a CMS and using it as a CMF are two completely different things. You can't satisfy both webmasters (who build sites for their customers using only a web browser) and developers (who create large custom environments for small and medium business needs).
Here is how I see it (DCT - Drupal Core Team):
I'm not sure that it's true, because I can speak only from my own side (as a developer), but that's how I see it. How many work-hours were spent on the "exporting and importing settings" feature? Yes, it's a killer feature for webmasters, but it's absolutely useless when you have a server farm and automated deployment practices.
Another problem that bothers me is performance. Once all our modules are upgraded to D8 coding standards, everything works fine, all tests are green and we release everything to production, I'm sure that our database will handle the drastically increased number of SELECTs, but page load time will increase by 100-200 milliseconds (30 core database requests + connection, of course). You may say that it's not important, but I believe that it may significantly hurt our SEO page ranking in the long term.
So at this point I can see four options for the upgrade:
As you can see, the choice is not obvious. Perhaps someone from the D8 core team could assist with option 1 or 2?
Riki, Thanks for your ideas
Riki,
Thanks for your ideas and input.
This is exactly what we are trying to do with D8:
Cache everything and make D8 behave as if it was a static page.
All pages in Drupal 8 are by default cached in dynamic_page_cache (authenticated and anonymous).
All blocks are themselves render cached. Only when something changes, the cache is invalidated. (cache tags)
It is perfectly possible to serve outdated information while re-generating the cache in the background (Even contrib could do this with around 20-30 loc).
Drupal 8 knows the difference between dynamic and static parts of the page and splits it up accordingly.
All information is cached on each server (APCu) and can even be cached nearer to the user in Varnish or CDNs like Fastly.
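To make the cache tags point above a bit more concrete, here is a minimal sketch in Python (not PHP, and not Drupal's actual code). The class name, method names and cache IDs are all made up for illustration. The assumption is a checksum-style scheme: every tag has an invalidation counter, each cached item remembers the sum of its tags' counters at write time, and a read counts as a miss as soon as that sum has changed.

```python
class TaggedCache:
    """Toy in-memory cache with tag-based invalidation (hypothetical API)."""

    def __init__(self):
        self._items = {}          # cache id -> (value, tags, checksum at write time)
        self._invalidations = {}  # tag -> how many times it has been invalidated

    def _checksum(self, tags):
        # Sum of the invalidation counters of all tags on an item.
        return sum(self._invalidations.get(tag, 0) for tag in tags)

    def set(self, cid, value, tags=()):
        tags = tuple(tags)
        self._items[cid] = (value, tags, self._checksum(tags))

    def get(self, cid):
        item = self._items.get(cid)
        if item is None:
            return None
        value, tags, checksum = item
        # If any tag was invalidated after the write, the sums differ: miss.
        if checksum != self._checksum(tags):
            del self._items[cid]
            return None
        return value

    def invalidate_tags(self, tags):
        # Nothing is deleted here; only the counters move.
        for tag in tags:
            self._invalidations[tag] = self._invalidations.get(tag, 0) + 1


cache = TaggedCache()
cache.set("block:recent_posts", "<ul>...</ul>", tags=["node:1", "node:2"])
cache.set("block:footer", "<footer>...</footer>", tags=["config:site"])

# Node 2 was edited: only items tagged with node:2 become stale.
cache.invalidate_tags(["node:2"])
```

After the invalidation, `cache.get("block:recent_posts")` is a miss while `cache.get("block:footer")` is still served from cache. The counters are the part that has to live in persistent storage; the cached items themselves can sit in any volatile backend.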
For more information please see:
https://events.drupal.org/barcelona2015/sessions/making-drupal-fly-faste...
All new ideas are obviously appreciated, so if you want to share something, please open a new topic with ideas here in the High Performance group.
There is no such thing as a "too radical" idea. All you need to do is: Share it. (or even open a core issue and ping me (Fabianx, WimLeers or catch) in IRC)
Thanks,
Fabian (D8 Core Developer)
Thank you Fabian, at last I
Thank you Fabian, at last I got something certain about D8.
Everything that I got before was something like "Don't ask. Just upgrade. D8 is better and will solve all your problems." (Sorry, Andy) :-)
How soon do you plan to make it happen? What is the state of 8.0 and 8.1? That's exactly what I need. It was so indispensable for me 6 years ago that I had no other choice but to hack D6's core.
It's already in D8 release -
It's already in D8 release - that's what I tried to say earlier - end of story :-)
Do watch Fabian's Barcelona presentation - I was there and it covers this whole subject of how we can cache everything and know that it's always cached and cleared as expected.
In general D8 is very well documented on d.o. I've been googling a lot of D8 info over the last few weeks and much is out of date, but the d.o. documentation is by far the most reliable and quite comprehensive at this stage: https://www.drupal.org/8
Ok. Today I've spent a whole
Ok. Today I've spent a whole working day trying to inspect D8 capabilities and gotchas.
Here is the list of cons that I've found:
The pros for me:
So the only thing I can do is conclude that D8 is not ready for production yet. And our team is not ready for the upgrade either.
I think we should stick with our modified D6 core for some time, until all the problems are resolved, and I'm afraid that will be far beyond February 2016.
I don't want to offend anyone, but D8 looks like a good product of Architecture Astronautics (there is an article about that from Joel on Software).
You have flown too high into space :-(
I've discovered some
Because a SQL database is a simpler requirement to fulfill than a key-value store. A $5/mo web host will give you a MySQL database, but probably not a Redis… um… instance. So it makes sense for Drupal to just use the database for key-value storage until it is told otherwise.
Incidentally, you can do this in D7 too; as the cache and variable storage systems are amenable to being used with key-value store systems, it's possible to use them with such instead of using the SQL database.
The Boise Drupal Guy!
MySQL Performance
Not to mention that with MySQL 5.6+ InnoDB is really fast (5.7 is GA and even quicker); we ended up abandoning memcache due to high availability issues. If you can write non locking queries then MySQL is quick; 100,000 queries per second is no longer unattainable.
Yes, that is correct. With
Yes, that is correct. With all of those HA issues I always forget about $5/mo hosting. KeyValueStore is not a big deal. One of the memcache storage modules may implement KeyValueFactoryInterface. The big deal is that all that stuff is not ready for production (for my production) yet :-(
I think it's obvious that Drupal core for HA sites should be a little bit different from Drupal for $5/mo hosting. I believe that community members can resolve this dilemma if they want to.
There is NO native memcache
Would you expect this to be in Core directly?
The performance of the extension depends mainly on which PHP extension is used.
Both could be stable by now - it is more a matter of releasing a new version.
The key_value is the abstraction. Drupal 8 does not need anything other than a DB out of the box. Why should D8 force users to install a KV store?
However, because it is abstracted you can store it wherever you want.
As KV is meant for persistent data however, memcache would not be a good match.
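As a rough illustration of what "abstracted" means here, below is a small Python sketch (again not Drupal code). The interface, the two backends and the `system.cron_last` key are all invented for the example, and SQLite only stands in for "the database you already have". The calling code talks to the interface and never learns which backend sits behind it.

```python
import sqlite3
from abc import ABC, abstractmethod


class KeyValueStore(ABC):
    """Hypothetical key-value interface; callers depend only on this."""

    @abstractmethod
    def get(self, key, default=None): ...

    @abstractmethod
    def set(self, key, value): ...

    @abstractmethod
    def delete(self, key): ...


class SqlKeyValueStore(KeyValueStore):
    """Default backend: one table in the SQL database, nothing extra to install."""

    def __init__(self, connection, collection):
        self._db = connection
        self._collection = collection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS key_value ("
            "collection TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL, "
            "PRIMARY KEY (collection, name))"
        )

    def get(self, key, default=None):
        row = self._db.execute(
            "SELECT value FROM key_value WHERE collection = ? AND name = ?",
            (self._collection, key),
        ).fetchone()
        return row[0] if row else default

    def set(self, key, value):
        self._db.execute(
            "INSERT OR REPLACE INTO key_value (collection, name, value) "
            "VALUES (?, ?, ?)",
            (self._collection, key, value),
        )
        self._db.commit()

    def delete(self, key):
        self._db.execute(
            "DELETE FROM key_value WHERE collection = ? AND name = ?",
            (self._collection, key),
        )
        self._db.commit()


class MemoryKeyValueStore(KeyValueStore):
    """Stand-in for an external store; a plain dict, so NOT persistent."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def remember_last_cron(store, timestamp):
    # Calling code is identical for every backend.
    store.set("system.cron_last", str(timestamp))
    return store.get("system.cron_last")


sql_store = SqlKeyValueStore(sqlite3.connect(":memory:"), "state")
memory_store = MemoryKeyValueStore()
```

Both `remember_last_cron(sql_store, ...)` and `remember_last_cron(memory_store, ...)` behave the same from the caller's side. The difference is durability, which is why a volatile backend is a poor match for data that must survive a restart.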
cachetags by definition need a persistent storage, which memcache cannot provide. For redis the internal cache tags mechanism can be used.
There is a redis project in contrib - though probably only on GitHub right now.
There is a learning curve, but overall the framework used does not matter as much as the abstraction of services.
Or extremely easy. Your mileage may vary.
"You have flown too high into space :-("
lol, could say that, but overall the code has become cleaner and more abstracted.
Now for D9 we need to remove the unused layers and simplify where possible.
precache data
I'm starting to do a lot of this in D7.
https://www.drupal.org/project/apdqc fixes all the database deadlocking issues I've encountered so that MySQL is just as fast as Memcache and can handle 200+ concurrent users (open connections). It will also prefetch cache data so that cache reads are "free"; writes are async so that they are "free" as well.
In terms of precaching the dev version of https://drupal.org/project/httprl/ has an awesome "function cache" function; we use it for semi-delayed constantly updated, always available stats that take over a minute to generate. httprl_call_user_func_array_cache() is similar to call_user_func_array() but it returns the cached value of the function and then re-generates the cache value in the background. This gives you a 100% cache hit rate with minimal lag in terms of how old the cache is.
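For anyone who wants to see the "return the cached value, then regenerate in the background" idea in isolation, here is a minimal Python sketch. It is not how httprl implements it. The class name and the `slow_stats` function are made up, and a thread stands in for whatever background mechanism is really used.

```python
import threading


class StaleWhileRevalidate:
    """Wraps a slow function: serve the last result, refresh in the background."""

    def __init__(self, func):
        self._func = func
        self._lock = threading.Lock()
        self._has_value = False
        self._value = None
        self._refreshing = False
        self._worker = None

    def _refresh(self):
        try:
            result = self._func()
            with self._lock:
                self._value = result
                self._has_value = True
        finally:
            with self._lock:
                self._refreshing = False

    def __call__(self):
        with self._lock:
            if self._has_value:
                cached = self._value
                # Start at most one background refresh at a time.
                if not self._refreshing:
                    self._refreshing = True
                    self._worker = threading.Thread(
                        target=self._refresh, daemon=True
                    )
                    self._worker.start()
                return cached
        # Cold cache: the very first caller has to wait for a real result.
        result = self._func()
        with self._lock:
            self._value = result
            self._has_value = True
        return result

    def wait(self):
        # Helper for demos and tests: block until the current refresh is done.
        worker = self._worker
        if worker is not None:
            worker.join()


calls = []


def slow_stats():
    # Pretend this takes over a minute; here it just counts its own calls.
    calls.append(1)
    return len(calls)


cached_stats = StaleWhileRevalidate(slow_stats)
first = cached_stats()   # cold cache, computed synchronously
second = cached_stats()  # served from cache, refresh starts in background
cached_stats.wait()
third = cached_stats()   # sees the refreshed value, triggers another refresh
cached_stats.wait()
```

Every call after the first returns immediately with a value that is at most one refresh old, which is the trade-off described above: a 100% hit rate in exchange for slightly stale data.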
Thank you for the reply. Very
Thank you for the reply.
Very interesting approach with MEMORY tables.
It seems that such a setup is not very HA when you use Master+Slave with load balancing between MySQL instances. MySQL becomes a single point of failure. Here are the cons that I can see:
I'm going to use a scheme based on a "reversed memcache cluster", where your frontends rely only on a local memcache instance and can even work without the database, in "read-only mode", if the database goes down.
When some data changes (e.g. a node is created), the caching engine should spread the modified cache items across all online instances. This can be done synchronously on the client POST, or asynchronously by using a queue in the database.
When one of the frontends has to be rebooted for maintenance, its local cache is warmed up on system start, and only after that is it ready to handle clients. One of the cons is that such a scheme is limited by the memcache storage size, but for my needs 1G on each frontend is quite enough. Another con (of course) is that Drupal core has to be modified to implement the cache warm-up mechanics too.
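To show what I mean, here is a toy Python model of the scheme. It is only a sketch: plain dicts stand in for the local memcache instances and for the database, all names are made up, and only the synchronous variant is shown (no queue).

```python
class Frontend:
    """One web frontend with its own local cache (a dict stands in for memcache)."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.local_cache = {}


class Cluster:
    """Writes go to the database and are pushed to every online frontend."""

    def __init__(self, frontends):
        self.database = {}       # source of truth
        self.database_up = True
        self.frontends = {f.name: f for f in frontends}

    def write(self, key, value):
        if not self.database_up:
            raise RuntimeError("read-only mode: database is down")
        self.database[key] = value
        # Synchronous write-through to all online local caches.
        for frontend in self.frontends.values():
            if frontend.online:
                frontend.local_cache[key] = value

    def read(self, frontend_name, key):
        # Reads never touch the database, only the local cache.
        return self.frontends[frontend_name].local_cache.get(key)

    def take_offline(self, frontend_name):
        frontend = self.frontends[frontend_name]
        frontend.online = False
        frontend.local_cache.clear()  # a reboot loses the in-memory cache

    def warm_up(self, frontend_name):
        # On start, copy everything before accepting clients again.
        frontend = self.frontends[frontend_name]
        frontend.local_cache.update(self.database)
        frontend.online = True


cluster = Cluster([Frontend("web1"), Frontend("web2")])
cluster.write("node:1", "First post")

cluster.take_offline("web2")           # maintenance reboot
cluster.write("node:2", "Second post")  # web2 misses this push
cluster.warm_up("web2")                 # ...and catches up on start

cluster.database_up = False             # database goes down
still_readable = cluster.read("web2", "node:2")
```

With the database down, reads on every frontend still work from the local caches, and only writes fail. The warm-up step is what makes the whole data set fit-in-memory limitation matter: everything has to be copied to every frontend.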
Can you describe those issues?
Master will fail then all
This is only for the semaphore table and only if you're using MySQL 5.5 or lower. InnoDB performance in 5.6+ makes memory tables not that useful. Even still I don't see the big downside because it's only used for that table. Usually when things go bad you want to clear out all previous locks because your site is now waiting for something to be done and it's not going to happen until the lock times out (30s).
Not 100% sure what you're saying here. We use multiple webheads that then point to a master/slave MySQL setup. Network delay is inevitable in a high availability setup. I'm not worried about the one MEMORY table that holds locks losing its data when the master crashes; from what I've seen this will actually speed up recovery time as locks will be acquired again instead of waiting.
This can be true. Same problem usually happens with most other key value stores if it's in a high availability cluster.
Not a post by me, but it talks about some of the issues we've had. It mainly has to do with cache clears and how the whole memcache cluster gets rebalanced when one node drops out (Couchbase), resulting in oddities in the cached data. Since going with APDQC we've had zero DB issues.
http://dev.mlsdigital.net/posts/Cloud-Native-Drupal/#remote-data-center-...
Thank You mikeytown2 I love
Thank You mikeytown2
I love your awesome https://www.drupal.org/project/apdqc
It saved me from a slow Drupal 7.
Now my authcache and apc modules are uninstalled, and my Drupal 7 site runs much faster.
Thanks.
BR,
D8 + Redis
I've been playing with the D8 Redis module (it's on GitHub). Normal page loads execute only 6 queries for authenticated users. 3 are to load the user (session, roles and fields) and 3 are for states.
Haven't gone through real hard testing yet, but so far this seems like a pretty nice boost.
One thing I'm wondering about is how memcache will handle the tag checksums for cache bins. The only way I can see that working is to keep the tags in the database, since you want those to persist. One thing is for sure: with Drupal 8 we have opened a whole new world of performance enhancements.
HollyIT - Grab the Netbeans Drupal Development Tool at GitHub.
Write through is a must for
But then there are 2 possibilities:
But yes, it is still a 'hot topic'.