Hey guys,
I added the United Arab Emirates to our site that lists the top Drupal websites per country.
http://top-websites.burtronix.co.za/drupal/united-arab-emirates/2013-08-05
I've not been to Dubai in a while, but we still work for some agencies there and I'll be sure to pop in again some time this year.
For the technically inclined:
I'm now running the statistics-gathering scripts as PHP CLI scripts, which are much faster than the previous Drupal-based solution. I'm also storing the data in YAML format, which is the new configuration file storage format for Drupal 8. The previous statistics-gathering scripts ran for several days per chart; the new scripts finished the job for all the countries and CMSes in just a couple of hours.
Essentially, I obtain the list of top websites per country from Amazon through their web services, then strip out all domains not in the country's ccTLD and proceed with the analysis. For the analysis I take the detection logic from the Wappalyzer project and programmatically turn it into PHP logic. I then fetch each top domain's http://, http://www, https:// and https://www versions and analyse the first one that returns a 200 HTTP status code. Finally, I run every website detected as running one of the CMSes through PhantomJS to get a screenshot of each.
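For anyone who wants to see the filtering and probing steps spelled out, here is a minimal sketch in Python rather than the PHP the scripts actually use. The function names and the injected `fetch_status` callback are hypothetical, not taken from the real scripts, and no real HTTP client is wired in.

```python
from typing import Callable, Iterable, List, Optional


def in_cctld(domain: str, cctld: str) -> bool:
    """True if the domain ends in the given country-code TLD, e.g. 'ae'."""
    suffix = "." + cctld.lower().lstrip(".")
    return domain.lower().rstrip(".").endswith(suffix)


def candidates(domains: Iterable[str], cctld: str) -> List[str]:
    """Keep only the domains inside the country's ccTLD."""
    return [d for d in domains if in_cctld(d, cctld)]


def url_variants(domain: str) -> List[str]:
    """The four variants tried per domain, in the order described above."""
    return [
        f"http://{domain}",
        f"http://www.{domain}",
        f"https://{domain}",
        f"https://www.{domain}",
    ]


def first_ok(
    domain: str, fetch_status: Callable[[str], Optional[int]]
) -> Optional[str]:
    """Return the first variant that answers with HTTP 200, else None.

    fetch_status is an assumed callback: it takes a URL and returns the
    HTTP status code, or None when the request fails.
    """
    for url in url_variants(domain):
        if fetch_status(url) == 200:
            return url
    return None
```

In practice `fetch_status` would wrap whichever HTTP client is in use; passing it in keeps the selection logic easy to try out with a fake.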
The scripts aren't perfect, but since this was a weekend project, they can be made better in time.
Finally, the YAML files those scripts spit out are imported into a single Drupal text field with a custom formatter that turns them into top charts. In time we'll gittify the source scripts, use Composer to get supporting libraries, base a lot more on Symfony (which supports newer YAML), make the scripts handle sites more robustly (right now a site is simply skipped if the first connection attempt fails) and automate much more (we still have a workflow around generating new charts right now).
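One possible shape for the "more robust handling" mentioned above is a retry with a growing delay instead of skipping a site on the first failed connection. This is only a sketch in Python under my own assumptions: `fetch_with_retries`, its parameters and the `fetch_status` callback are hypothetical and not part of the current scripts.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    url: str,
    fetch_status: Callable[[str], int],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Call fetch_status(url) up to `attempts` times.

    Returns the status code from the first successful call, or None if
    every attempt raised a connection-level error.
    """
    for attempt in range(attempts):
        try:
            return fetch_status(url)
        except OSError:
            # ConnectionError and TimeoutError are OSError subclasses.
            if attempt < attempts - 1:
                # Double the wait after each failure: base, 2*base, 4*base...
                sleep(base_delay * 2 ** attempt)
    return None
```

The `sleep` parameter is injectable only so the behaviour can be checked without actually waiting.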
The Amazon top list is based on usage of the Alexa toolbar, so if your site is not in the top list and you think it should be, simply install the Alexa toolbar for your browser.
Suggestions are very welcome!
Kind regards,
Riaan Burger

Comments
nice work riaan. i noticed the other day that mplus.ae has been rebuilt in drupal.
how many sites were in your amazon list for dxb? looks like over 10k... that's a lot of sites to crawl.
would be nice to know the stories, companies and people behind some of the drupal work on that list.
TopDrops.org
That was a lot... at the time ;-)
I had to first move out of Drupal to CLI, then even drop the overhead of Symfony that I used and finally even OOP to get this new one to perform well. It forks 100 processes in PHP to be able to parse a million sites in about a day.
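To put numbers to that: a million sites in about a day across 100 processes leaves each process roughly 8.64 seconds per site on average. The sketch below (Python, not the PHP I actually run) shows one way to split a domain list evenly between workers and to work out that budget. The forking itself is left out, and both function names are made up for illustration.

```python
from typing import List, Sequence


def split_for_workers(items: Sequence[str], workers: int) -> List[List[str]]:
    """Split items into `workers` slices whose sizes differ by at most one."""
    size, extra = divmod(len(items), workers)
    slices: List[List[str]] = []
    start = 0
    for i in range(workers):
        # The first `extra` workers each take one additional item.
        end = start + size + (1 if i < extra else 0)
        slices.append(list(items[start:end]))
        start = end
    return slices


def seconds_per_site(total_sites: int, workers: int, wall_seconds: float) -> float:
    """Average time one worker may spend per site to finish in wall_seconds."""
    return wall_seconds * workers / total_sites


if __name__ == "__main__":
    # 1,000,000 sites, 100 workers, one day of wall-clock time.
    print(seconds_per_site(1_000_000, 100, 24 * 60 * 60))
```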
TopDrops.org (http://topdrops.org) is a new site that is now without branding and more focussed on supporting the Drupal community. It parses the top 1 million websites (weekly, if all goes well) to find the top Drupal websites, ranks them and notes the people who built them (as well as their Drupal Association membership).
Retweet TopDrops.org:
https://twitter.com/intent/retweet?tweet_id=386829076364677121
Share or +1 TopDrops.org on Google:
https://plus.google.com/105124241461962087866/posts/WL3Ux4FaFpD
List for .ae
Hey murraybiscuit ;-)
The list from Amazon/Alexa had 9,700 domains in it, so hitting the four variations made that 38,800 hits to crawl for the analysis run for .ae domains.
I used https://github.com/petewarden/ParallelCurl, which means it's pretty fast. But then I needed to run PhantomJS and learned to fork my PHP into multiple processes, so I will probably rewrite the analysis script one day as a forking PHP solution.
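The general idea of keeping many requests in flight at once can be sketched as follows. Note that this swaps in Python's standard `ThreadPoolExecutor`; it is not ParallelCurl's API and not my PHP code. `check_all`, `max_in_flight` and the `fetch_status` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def check_all(
    urls: Iterable[str],
    fetch_status: Callable[[str], int],
    max_in_flight: int = 10,
) -> Dict[str, Optional[int]]:
    """Fetch the status of every URL with up to max_in_flight at a time.

    A URL whose fetch raises a connection-level error maps to None.
    """
    url_list = list(urls)

    def safe_fetch(url: str) -> Optional[int]:
        try:
            return fetch_status(url)
        except OSError:
            return None

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # pool.map returns results in the same order as url_list.
        statuses = list(pool.map(safe_fetch, url_list))
    return dict(zip(url_list, statuses))
```

Threads suit this step because the work is almost entirely waiting on the network; the PhantomJS screenshots are a different kind of load, which is where separate processes come in.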
Still can't believe the temperatures I see there in Dubai on my weather list here. I came past in the winter, so was fine. Sure hope I always pop in when the weather is fine.
Over here in SA we tried to track down the companies and teams in the top list to help with building our community and highlighting talent. It may be a good idea to do the same over there.
Could you please share the GitHub link
Hi
Could you please share the GitHub link?
Unfortunately not on GitHub
Well, not my own code bits at least, which are littered with things like my Amazon Web Services key. The code base is, unfortunately, not clean enough to share. But if you need help with the component parts, I can link you to most of those; what I wrote isn't much more than some glue and the theming of the website. What do you need?