Data Mining

This group analyzes Drupal-related data to better understand how Drupal evolves across different dimensions and over time. The work includes automated and manual analysis as well as chart and graphic generation, and may eventually lead to modules for drupal.org and its subdomains.

MicroStrategy Analyst | Butler America

Employment type: Full time, Contract
Telecommute: Allowed

Location: Washington, D.C.
Duration: 6-month contract (full-time)
Rate: $35-$50/hr W2 only

Requirements:
MicroStrategy experience
Writing strong SQL queries (reporting)
Working in a fast-paced environment
Good writing skills
Customer acceptance testing
Writing requirements
Experience in an enterprise data warehouse (EDW)/MicroStrategy environment

Job Description:

Read more

Need Data for Infograph

I am trying to create an infograph about Drupal that showcases its strengths and advantages, especially how it has grown since its inception: numbers of modules and themes by year, possibly, or any other ideas I may not have thought of. I am not a Drupal user myself (yet), so if anyone has any ideas, I would welcome them. If anyone has good links to data, has the data itself, or knows someone who would have this information, it would be a great help.

Gathering Drupal user and usage statistics... a plea for help

We are compiling a historical overview of Drupal for a book project, The Definitive Guide to Drupal 7 by Apress. As the main research monkey on the project, I (Kasey Qynn Dolin, a complete Drupal newbie) am trying to gather Drupal user and usage statistics, which we promise to make easily available to the community as we compile them.

Read more
danithaca's picture

Recommender Bundles: Update

I've finished these things so far:

  • Updated Recommender API to v2.0beta: improved performance; added Batch API, Drush, and SimpleTest support; migrated to the PHP 5 OO paradigm for extensibility.
  • Released Browsing History Recommender, which provides two blocks: "Users who browsed this node also browsed" and the personalized "Recommended for you".

Read more
    David Strauss's picture

    Announcing CiviCluster and CiviConference

    I've created initial snapshots for CiviCluster and CiviConference for
    users of Drupal 5. It may take up to 24 hours for them to appear in the
    Drupal.org release system, but they are already in the DRUPAL-5 CVS branch.

    CiviCluster will rapidly identify duplicate contacts and walk users
    through merging them. CiviCluster supports CiviCRM 1.6 and 1.7 (except
    for CiviEvent). CiviCluster will be updated shortly to support CiviEvent
    schema changes.

    http://drupal.org/project/civicluster

    CiviConference allows online conference management and ticket sales by

    Read more
    David Strauss's picture

    Convenient SQL transactions with PressFlow Transaction

    I've released an in-development version of PressFlow Transaction for developers interested in convenient encapsulation of SQL transactions. The key features are intelligent use of scope for COMMITs and ROLLBACKs as well as safe, intelligent nesting of transactions to get exception-like semantics.

    Usage details are on the project page. Requires PHP 5.

    (I posted this to the High-Performance group because encapsulating updates in transactions can dramatically improve performance.)
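The scoping idea described above can be illustrated outside Drupal. The following is a minimal sketch in Python with SQLite, not the PressFlow Transaction API: the class name `TransactionScopes`, its `scope()` method, and the `demo()` function are hypothetical, and emulating nested transactions with savepoints is an assumption about one reasonable way to get exception-like semantics.

```python
import sqlite3
from contextlib import contextmanager


class TransactionScopes:
    """Hypothetical helper: nested transaction scopes backed by savepoints."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.depth = 0  # number of scopes currently open

    @contextmanager
    def scope(self):
        name = f"scope_{self.depth}"
        outermost = self.depth == 0
        # The outermost scope opens a real transaction; inner ones use savepoints.
        self.conn.execute("BEGIN" if outermost else f"SAVEPOINT {name}")
        self.depth += 1
        try:
            yield
        except Exception:
            # Leaving a scope through an exception undoes only that scope's work.
            self.depth -= 1
            if outermost:
                self.conn.execute("ROLLBACK")
            else:
                self.conn.execute(f"ROLLBACK TO {name}")
                self.conn.execute(f"RELEASE {name}")
            raise
        else:
            # Leaving a scope normally commits it (or releases its savepoint).
            self.depth -= 1
            self.conn.execute("COMMIT" if outermost else f"RELEASE {name}")


def demo() -> list:
    # isolation_level=None stops sqlite3 from opening transactions implicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE log (msg TEXT)")
    tx = TransactionScopes(conn)
    with tx.scope():
        conn.execute("INSERT INTO log VALUES ('outer')")
        try:
            with tx.scope():
                conn.execute("INSERT INTO log VALUES ('inner')")
                raise ValueError("inner step failed")
        except ValueError:
            pass  # the inner insert is rolled back, the outer one survives
    return [row[0] for row in conn.execute("SELECT msg FROM log")]
```

In this sketch `demo()` returns `['outer']`: the failed inner scope is undone on its own, while the enclosing scope still commits. Batching many updates inside one outer scope is also where the performance benefit mentioned above would come from, since the database commits once rather than per statement.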

    ChrisKennedy's picture

    October Download Statistics

    On November 15th Gerhard released the download statistics for all packages on Drupal.org (with formatting by Earl). Here are two charts that summarize the data, along with the accompanying Excel file. Suggestions are welcome on how to improve them or on other ways to analyze and display the data.

    1. Top 30 Packages

    These top packages comprise 3 versions of Drupal, 20 modules, 5 themes, and 2 videos.

    2. Overall Distribution

    When looking at the distribution of downloads, we see noticeable break points at 16, 36, and about 590, which segment packages into four classes: Tier 1 (critical), Tier 2 (very popular), Tier 3 (moderately popular), and Tier 4 (unpopular).
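The tiering can be sketched in a few lines of Python. The post does not say whether 16, 36, and 590 are rank positions or download counts; this sketch assumes they are 1-based ranks by download count and that a break point itself belongs to the higher tier. The names `tier_for_rank` and `assign_tiers` are hypothetical.

```python
from bisect import bisect_left

# Break points quoted in the post; reading them as rank positions is an assumption.
BREAKS = [16, 36, 590]
TIERS = [
    "Tier 1 (critical)",
    "Tier 2 (very popular)",
    "Tier 3 (moderately popular)",
    "Tier 4 (unpopular)",
]


def tier_for_rank(rank: int) -> str:
    """Map a 1-based download rank to its tier (the break point stays in the higher tier)."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return TIERS[bisect_left(BREAKS, rank)]


def assign_tiers(downloads: dict) -> dict:
    """Rank packages by download count, highest first, and label each with a tier."""
    ranked = sorted(downloads, key=downloads.get, reverse=True)
    return {name: tier_for_rank(i) for i, name in enumerate(ranked, start=1)}
```

Under these assumptions, ranks 1 to 16 fall in Tier 1, 17 to 36 in Tier 2, 37 to 590 in Tier 3, and everything after that in Tier 4.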

    ChrisKennedy's picture

    Group activity data analysis

    There has been some recent work analyzing the growth on drupal.org, and I think we should do something similar for groups.drupal.org.

    I would be interested in charts/histograms showing the distribution of:

    1. Groups by number of subscribers
    2. Groups by posts/week in the past three months
    3. Users by number of subscriptions (without identifying information)
    4. Users by number of posts (without identifying information)
    5. Total posts over time
    6. Total posts per week over time
    7. Total groups over time
    8. New groups per week over time
    9. Median subscriptions per user over time

    Did I forget anything, or should some of these be removed/tweaked? I am willing to generate the charts if someone can run the queries, and I can figure out the exact SQL queries if needed.
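A few of the distributions listed above can be computed from a plain export of subscriptions. This is a rough sketch, assuming the data arrives as (user ID, group ID) pairs; that input shape and the function name `subscription_stats` are hypothetical, not the groups.drupal.org schema. User IDs are used only for counting, so the output carries no identifying information.

```python
from collections import Counter
from statistics import median


def subscription_stats(subscriptions):
    """Summarize (user_id, group_id) pairs into anonymous distributions."""
    pairs = set(subscriptions)  # ignore duplicate rows
    per_group = Counter(group for _, group in pairs)  # subscribers per group
    per_user = Counter(user for user, _ in pairs)     # subscriptions per user
    return {
        # {subscriber count: number of groups with that many subscribers}
        "groups_by_subscribers": dict(Counter(per_group.values())),
        # {subscription count: number of users with that many subscriptions}
        "users_by_subscriptions": dict(Counter(per_user.values())),
        "median_subscriptions_per_user": median(per_user.values()),
    }


example = [(1, "a"), (1, "b"), (2, "a"), (3, "a")]
stats = subscription_stats(example)
```

For the example rows, one group has three subscribers and one has a single subscriber, and the median user holds one subscription. The "over time" charts would need a timestamp per row, which this sketch leaves out.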

    joshk's picture

    Growth Graphs

    In preparation for starting the 5.0 drumbeat, I was able to get killes (thx Gerhard!) to run some analysis on drupal.org. I think if we can keep up this kind of growth (and with the spiking numbers of developers, projects, and activity on the site, I think we can), 2007 could be a sort of tipping point for Drupal!

    UPDATE: here's my blog post on the subject.

    The source XLS file is also attached for your own viewing pleasure.

    Read more