Class extension / Mixins via Code Generation?

donquixote's picture

[UPDATE]
The technique outlined in this post is now published as a module,
http://drupal.org/project/stapler
[/UPDATE]

As chx said in another discussion in this group:

I would love to see more OO used in Drupal, as it simplifies the syntax, makes it easier to comprehend, but without sacrifices to the modularity which we have now.

You can't do that because PHP does not let you extend a class once it's defined. This is the fundamental problem and going around it requires a drastically different design to what we have now.

Is this a problem for us? Would we gain a lot if we had extensible classes?
I'm not going to answer this question here, just brainstorm different ways to achieve extensible classes in PHP.
Maybe we really don't need it in the end.

--------

One way to get around this is magic methods. These have a performance cost compared to direct method calls, but might still be faster than other ways of gaining the same flexibility.

--------

Another way is code generation.
Is this scary? Yes, a bit.

<?php
class somemodule_mixin_SomeClass {
  function someMethod() {}
}

class anothermodule_mixin_SomeClass {
  function anotherMethod() {}
}
?>

By code generation, this becomes:

<?php
class SomeClass {
  function someMethod() {}
  function anotherMethod() {}
}
?>

-----------

There are different ways this code generation can work:
1. Use reflection or parsing on the mixin classes provided by different modules, then assemble their methods into a new class, and store the code in a file somewhere.
2. The same, but store the code in the database to execute with eval(). Not sure why we would do that, but it is a technical possibility.
3. Let the mixin classes indirectly inherit from each other, in a big chain. The code generation would only declare the joints.

The third version requires a slightly different syntax in the source files:

<?php
class somemodule_mixin_SomeClass extends somemodule_mixinBase_SomeClass {
  function someMethod() {}
}

class anothermodule_mixin_SomeClass extends anothermodule_mixinBase_SomeClass {
  function anotherMethod() {}
}
?>

Generated code:

<?php
class somemodule_mixinBase_SomeClass {}
class anothermodule_mixinBase_SomeClass extends somemodule_mixin_SomeClass {}
class SomeClass extends anothermodule_mixin_SomeClass {}
?>

The code can be generated ad-hoc and then eval()'d, or it can be stored in a file.
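
A minimal sketch of the third variant, assuming the joints are eval()'d one at a time so that each mixin class finds its parent already defined when it is declared. The helper name mixin_joint_code() is made up for illustration, and the return values in the mixin methods are only there to make the result visible:

```php
<?php
// Hypothetical helper (not part of any Drupal API): returns the source of one
// generated "joint" class, optionally extending the previous link of the chain.
function mixin_joint_code($name, $parent = NULL) {
  return 'class ' . $name . ($parent ? ' extends ' . $parent : '') . ' {}';
}

// Joint 1: the empty base class for the first module's mixin.
eval(mixin_joint_code('somemodule_mixinBase_SomeClass'));

// Hand-written mixin, as it would appear in somemodule's source file.
class somemodule_mixin_SomeClass extends somemodule_mixinBase_SomeClass {
  function someMethod() {
    return 'some';
  }
}

// Joint 2: hooks the second module's base onto the first module's mixin.
eval(mixin_joint_code('anothermodule_mixinBase_SomeClass', 'somemodule_mixin_SomeClass'));

// Hand-written mixin, as it would appear in anothermodule's source file.
class anothermodule_mixin_SomeClass extends anothermodule_mixinBase_SomeClass {
  function anotherMethod() {
    return 'another';
  }
}

// Final joint: the class that the rest of the code instantiates.
eval(mixin_joint_code('SomeClass', 'anothermodule_mixin_SomeClass'));

$object = new SomeClass();
```

In this sketch the order matters: each joint has to exist before the mixin that extends it is declared, which is why the eval() calls and the class declarations alternate.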

-------

Anything that involves code generation into a file needs some extra thinking about when this code generation needs to happen. In every request? When a new module is enabled / disabled?
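
One possible answer, sketched under assumptions: name the generated file after a hash of the ordered module list, so that a new file is only written when that list changes. The function name, the file naming scheme and the use of the temp directory are all hypothetical:

```php
<?php
// Hypothetical helper: returns the path of the generated file for this exact
// (ordered) module list, writing the file only if it does not exist yet.
function generated_class_file(array $modules, $code, $dir) {
  $file = $dir . '/mixins_' . md5(implode(',', $modules)) . '.php';
  if (!file_exists($file)) {
    file_put_contents($file, "<?php\n" . $code);
  }
  return $file;
}

$dir = sys_get_temp_dir();
$code = "class somemodule_mixinBase_SomeClass {}\n";

// Same module list: the existing file is reused on the second call.
$first = generated_class_file(array('somemodule', 'anothermodule'), $code, $dir);
$second = generated_class_file(array('somemodule', 'anothermodule'), $code, $dir);

// Different module list (a module was disabled): a different file is used.
$other = generated_class_file(array('somemodule'), $code, $dir);
```

With this scheme, enabling or disabling a module changes the hash and therefore the file. Stale files from older module lists stay around until something removes them.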

Comments

I'm trying to accomplish

wapnik's picture

I'm trying to accomplish something similar with Views in #872780 (an updated version of the module is in #829096), with a slightly different approach inspired by the hook system. I would also like some comments on that, if anyone is interested. Thanks.

I know too little about views

donquixote's picture

I know too little about views to give any useful comment on this one. But it looks quite different, technically.

This is not only about views,

wapnik's picture

This is not only about Views; it may apply to extending handlers/plugins in general. It seems Kohana was using this too, but they probably switched to something else. I'm interested in the reason why.
http://forum.kohanaframework.org/discussion/212/can-developers-remove-ev...

Cool. But I don't see how

donquixote's picture

Cool.
But I don't see how your above code example has anything to do with that. I don't even find any "class" in there.

"they probably switched to something else"
-> where did you find that information?

Here they're saying that eval

wapnik's picture

Here they're saying that eval is not used.
http://forum.kohanaframework.org/discussion/4778/so-hiphop-huh/p1

But looking at these naming conventions reminds me of my own code there.
http://docs.kohanaphp.com/general/libraries
I didn't actually study Kohana's codebase. Maybe it's time to start.

What "code example" do you mean? That Views issue? The "class" is in the module's include file. Well, let's rather call it a "code template".

grep for "eval(" in Kohana v3.0.8 (stable)

donquixote's picture

...:~/sites/kohana$ grep -ro "eval(" .
./modules/userguide/classes/kohana/kodoc/missing.php:eval(
./modules/userguide/media/js/shCore.js:eval(

And these two hits have nothing to do with the mixin trick.
Why did they remove it? Good question.

Check the version

wapnik's picture

Check version 2:
http://dev.kohanaframework.org/projects/kohana2/repository/revisions/482...
But they're creating only empty classes there using eval.

Some things to consider about

donquixote's picture

Some things to consider about the technique above:
- The technique can result in very big classes, if you count all inheritance layers. Usually we like to split things into smaller classes.
- In many cases there are better ways to achieve the same, with composition instead of inheritance.
- Once defined, a class can not be changed. Code generation does not change that. Thus, all mixin layers need to be defined before the class itself. Thus, the class definition can not depend on runtime information, unless that runtime information is available before the class is defined - such as, weights for the mixin layers.
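
To make the composition point concrete, here is a small sketch. All class names are hypothetical and this is not an existing Drupal API; it only shows behavior being handed in as a collaborator object instead of being inherited through a mixin layer:

```php
<?php
// All names here are hypothetical, for illustration only.
class TitleFormatter {
  public function format($title) {
    return strtoupper($title);
  }
}

class Teaser {
  protected $formatter;

  // The collaborator is passed in, so another module can supply its own
  // formatter object without touching the Teaser class.
  public function __construct($formatter) {
    $this->formatter = $formatter;
  }

  public function render($title) {
    return '<h2>' . $this->formatter->format($title) . '</h2>';
  }
}

$teaser = new Teaser(new TitleFormatter());
$html = $teaser->render('hello');
```

Unlike a mixin chain, the collaborator can be chosen at runtime, after the classes are defined.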

A pattern for magic methods

mbutcher's picture

When I wrote QueryPath, I wanted to essentially make it possible to add "methods" to the QueryPath class on-the-fly (at runtime). I tried several things, including anonymous methods and closures (each of which has its place). But what I decided upon was using the __call() magic method and building a simple interface that implementors could use to register their own "methods".

Here's the somewhat simplified code (a method on the QueryPath class):

<?php
public function __call($name, $arguments) {

  // Lazily load the extension registry.
  if (empty($this->ext)) {
    // Load the registry
    $this->ext = QueryPathExtensionRegistry::getExtensions($this);
  }

  // See if the method name matches anything registered. If so, use reflection to load and execute.
  if (QueryPathExtensionRegistry::hasMethod($name)) {
    $owner = QueryPathExtensionRegistry::getMethodClass($name);
    $method = new ReflectionMethod($owner, $name);
    return $method->invokeArgs($this->ext[$owner], $arguments);
  }
}
?>

Here's how it works:

  • __call() is only executed when the current object doesn't already have a method matching $name.

  • First, the method checks to see if the registry is loaded. If not, it loads the registry. (This greatly helps on performance, as extensions aren't scanned until they absolutely must be.)

  • Next, the registry is checked to see if any of the registered extensions have a method with the name $name.

  • If a matching method is found, __call() creates a new ReflectionMethod object and uses that object to execute the matching method on the extension instance.

  • Finally, the results of the method's invocation are returned.

From the API user's standpoint, extension methods are called in exactly the same way as regular methods. Example:

<?php
// Load base QueryPath
require 'QueryPath/QueryPath.php';

// Load QueryPath's XML-extras extension
require 'QueryPath/Extension/QPXML.php';

// Execute a "standard" non-extension method.
qp('some.xml')->top();

// Execute an extension method.
qp('some.xml')->cdata();
?>

So what does an extension look like? Here's a simple one:

<?php
class MyExtension implements QueryPathExtension {

  // This is required by interface contract:
  public function __construct(QueryPath $qp) {
    $this->qp = $qp;
  }

  public function myFunction() {
    // Do something useful here.
  }
}
QueryPathExtensionRegistry::extend('MyExtension');
?>

The above implements the QueryPathExtension interface, which dictates the form of the constructor. (We do this so that we can ensure that every QueryPath extension has access to the current QueryPath instance.)

We define one method (myFunction()) that will be accessed as qp()->myFunction().

On the final line, we include this:

<?php
QueryPathExtensionRegistry::extend('MyExtension');
?>

This is how an extension is registered with the Extension Registry: It merely tells the registry that class MyExtension is to be treated as an extension of QueryPath.
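
For readers who want to try the pattern without QueryPath, here is a self-contained sketch of the same idea. ExtensionRegistry, Host and GreeterExtension are hypothetical stand-ins, not the real QueryPath classes, and the registry internals shown here are an assumption about one way to do it:

```php
<?php
// Hypothetical stand-ins; these are NOT the real QueryPath classes.
class ExtensionRegistry {
  // Maps method name => name of the extension class that provides it.
  protected static $methods = array();

  // Registers every public method of $class, except the constructor.
  public static function extend($class) {
    foreach (get_class_methods($class) as $method) {
      if ($method !== '__construct') {
        self::$methods[$method] = $class;
      }
    }
  }

  public static function hasMethod($name) {
    return isset(self::$methods[$name]);
  }

  public static function getMethodClass($name) {
    return self::$methods[$name];
  }
}

class Host {
  // Extension instances, created lazily and keyed by class name.
  protected $ext = array();

  // Only called when Host itself has no method named $name.
  public function __call($name, $arguments) {
    if (!ExtensionRegistry::hasMethod($name)) {
      throw new BadMethodCallException("No method named $name.");
    }
    $owner = ExtensionRegistry::getMethodClass($name);
    if (!isset($this->ext[$owner])) {
      $this->ext[$owner] = new $owner($this);
    }
    return call_user_func_array(array($this->ext[$owner], $name), $arguments);
  }
}

class GreeterExtension {
  protected $host;

  // Every extension receives the host object it extends.
  public function __construct(Host $host) {
    $this->host = $host;
  }

  public function greet($who) {
    return "Hello, $who";
  }
}
ExtensionRegistry::extend('GreeterExtension');

$host = new Host();
$greeting = $host->greet('world');
```

This sketch uses call_user_func_array() rather than ReflectionMethod, and throws an exception for unknown method names. Both are choices made for the sketch, not claims about what QueryPath does.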

That's pretty much all there is to it. (If you really want to read the real code, you can go to http://github.com/technosophos/querypath and take a look at the rest of the code).

Now, chx has expressed concern over the matter of accessing private and protected methods. The method above most definitely does not expose internals (private/protected) to extended methods. Only the QueryPath object passed in the constructor and the arguments passed to the method are ever accessible. (In essence, we treat an extension as a closure with access to the QueryPath object and any explicit arguments.)

However, we could do so if we wanted by explicitly passing variable references to the extension method.

My preference is to not burden the developer with the unnecessary weight that all her/his object-level variables be somehow accessible to extensions. Developers should merely design APIs that make it possible to mutate values that should be modifiable, while continuing to protect or hide variables that should not be modifiable. Class- and object-level variables require no more "inherent hackability" than function-scoped variables, and should be treated with the same regard.

This is no longer a problem

chx's picture

One, PHP 5.3 has reflection hacks. Two, this was a problem because (aside from code generation, yes) you could not access the protected / private properties. In Drupal 8, I very strongly recommend not having them; see http://drupal4hu.com/node/269. Read also the article linked there on why protected is unnecessary.
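
The "reflection hacks" presumably refer to ReflectionProperty::setAccessible(), which was added in PHP 5.3. A minimal sketch, with a made-up Counter class, of reading a protected property from outside:

```php
<?php
// Hypothetical class with a protected property.
class Counter {
  protected $count = 3;
}

$property = new ReflectionProperty('Counter', 'count');
// Before PHP 8.1 the property must be made accessible explicitly; newer
// versions make reflected properties accessible by default.
if (PHP_VERSION_ID < 80100) {
  $property->setAccessible(TRUE);
}
$value = $property->getValue(new Counter());
```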

Short version: "People will respect the underscore"

donquixote's picture

So, let's assume that people will respect the underscore. This can be debated, but let's just assume for one moment.

Where can you use the underscored property, and where can you not?
* methods of the same object -> yes, they can.
* methods of another object of the same class (or a subclass) -> in the traditional model, yes.
* methods and functions from outside, that have a pointer to the object in question -> in general, NO.
* debugging stuff -> yes.
* objects that act as "class extensions" via magic methods -> yes, I guess, if we want to have this kind of thing.

But wait, you don't like magic methods, do you?
Well then, we do the same with composition. In your book, can these composed objects use each other's underscored attributes, or not?
If yes, then is the underscore not quite useless?
If no, then why is "protected" a problem?

For me the only little problem with "protected" is that it makes a var_dump look very boring. But I guess this could be worked around.

in general, no

chx's picture

but then the point is that sometimes you might need to. I have tried to explain this several times but it seems I am failing every time. You design an API. The user might find something that can't be done with the API. Now what? All I am saying is that Drupal has a history for hackability and that's all I ask for.

And yes, debug too. This is a total no brainer.

As for respecting, it's more like a contract: if you mess with the underscore properties then all bets are off. Your application might break, and you should not bother filing a bug report.

I think I see your point. You

donquixote's picture

I think I see your point. You can break the underscore rule, but it will be an exception.
This can be used as an argument against private/protected, but it does not explain why these would be a fundamental problem for an everyday use case.

The post was about class extension, and this was also the context of your comment quoted in the topic starter. The message I got from your comment was "private/protected prevents class extension". Turned around, this means that your idea of class extension does involve a lot of "breaking the underscore rule". Not just as an exception.

"protected" = will not survive next version

donquixote's picture

And here is my own idea about protected attributes and methods.

Public is the stuff that other modules can use as an API. I know it will hurt a lot of people if I change the name, signature or behavior of a public method or attribute. As soon as something public sees the light of day, it petrifies. Or if not, there will be tears.

For anything that I imagine I will change in the future, I would rather use an underscore or a "protected" keyword. No tears.
Restricting the public API gives me more space for restructuring / refactoring.

When people kindly ask for it, they get some more stones to play with.
This is why I think that the introduction of new hooks in stable Drupal releases should be allowed and encouraged, btw.

The solution (for the "AOP vs encapsulation" tradeoff) is not to abandon the "protected" keyword altogether and open up everything, but to cut classes and objects in a smart way, improve the structure over time, and add new doors where things get stable enough.

How does this apply to the

donquixote's picture

How does this apply to the mixins idea from above?
As soon as inheritance (and mixins) become part of the game, the above argument has a problem. If I don't want to make my attributes private (and I usually don't), any renaming or restructuring can break code in subclasses. The subclass could be in some custom module that no one knows about, except the person who wrote it.
Now, at least there is now only one place to look at, not many. So there is still some benefit.

(I'm not at all saying that future compatibility is the only reason why one would use "protected")

All in-function variables are private

mbutcher's picture

It seems to me that the private/protected variables argument is essentially indistinguishable from this argument:

  1. All of the variables declared with function-level scope are inaccessible outside of the function.
  2. Many times I could more conveniently solve my programming problems if I could change those values from outside of the function
  3. If I want to change those values, I should be able to change those values
  4. Therefore, no variables should be declared within functions.
  5. Global variables are accessible and mutable outside of functions
  6. By 3 and 5, all variables should be global variables

The primary argument against this is that not all variables should be accessible outside of the function (essentially, denial of #3). The exact same argument holds for private and protected variables. A well-constructed API can make use of private and protected variables without making it impossible for others to accomplish what they need.

True. But there is a little

donquixote's picture

True. But there is a little difference.
Global variables and in-function variables differ not only in how they can be accessed, but also in scope and lifetime.
But in general I agree with you.

"Stapler" module

donquixote's picture

There is now a module for this kind of thing (eval'd mixin chains).
http://drupal.org/project/stapler

Have fun playing.

Interesting. Inspired by

wapnik's picture

Interesting. Inspired by kohana?

Anyway, see my new rewrite of the Views issue #872780. I introduced something you proposed, a file cache.

Talking about the Views issue

wapnik's picture

Talking about the previously mentioned Views issue #872780: it has evolved into something that lives completely outside of Views, and has become this set of modules:

  • The Decorator module is where file generation and caching take place.
  • The Views Decorator module uses the Views API to determine which classes are going to be generated, uses Decorator to generate them, and sends them back to Views.
  • The Views Field Rewrite module implements hook_views_decorators() to register these "decorator" template classes with Views Decorator.

The generated files are purged when Views itself purges its cache. There is no eval(). Memcache can be used too, I think. Follow-up modules can be made for integration with other modules that use handlers, plugins, etc. The only requirement is that the target module (like Views here) must have an API for registering new classes that it should use.

Maybe not very clearly formulated, sorry for that...
Does it make any sense?

The whole point about doing

pounard's picture

The whole point of writing OOP code is to make the design really clear and let the complex code live in specific implementations. Do not use eval(): it will kill performance AND make your design really unclear. PHP was never made for reflection. Eval'd code is not optimized by opcode caches, and you make the interpreter do twice the work it should.

Avoid dynamism and use well-known design patterns instead. They exist for a reason: because they really do work! Using existing patterns does not mean that you are not intelligent or creative. It just means that you put the right words on the right problem, and leave clean and understandable code for newcomers.

@chx: protected is not that bad. Private maybe (except for the singleton pattern implementation), but protected won't kill you.
Using hackish reflection methods is bad. Whatever you are doing with them, you are doing it wrong! Too much dynamism will prevent any compiler/opcode/interpreter optimization that could have been done if the code remained static.

From the quote in the original post I gather that OOP usage would force a different design. I'd say it depends. Most hooks already implement, in their own way, existing design patterns, but all under one single technical solution. Just try to analyse each of those hooks, one by one, and you'll see that for most of them, elegant and performant solutions exist, without having to play with code reflection or generation.

Code generation is really bad. If you let core generate code and put PHP files everywhere, you'll break other opcode cache optimizations (like the stat() function toggle on files), and you will make life harder for sysadmins who have to determine which of these files come from an external hack and which are "legal generated files".

Pierre.

We should really look at

jmccaffrey's picture

We should really look at using the Spring model more here. Spring is a very powerful IoC/AOP framework for Java. In my mind Drupal is similarly an IoC/AOP framework, but for something like PHP 4. Updating the software to the modern power of PHP should include updating to the more modern patterns that other technology stacks are using.

Many people see these things like Mixins and even the use of deep inheritance as anti-patterns, we should instead be looking more for shallow POPOs (lol) and leveraging dependency injection to bring power to our objects.

Let's get a few real examples so that we can flesh out both approaches in this thread and come to the best solution possible.

By the way, let me introduce

jmccaffrey's picture

By the way, let me introduce myself.

My name is Jonathan McCaffrey. I am really interested in the Butler project and its implications for Drupal 8. I am an experienced software architect and really want to help this effort in whatever way I can.

Web Services and Context Core Initiative
