Entity API update and summary

fago's picture

Here is a short summary of what we discussed at DrupalCon London.

Peter Wolanin and I gave a core conversation talk; you can find the video here and the slides here.

Roadmap

The further roadmap is to:

  • define the API + do a test entity type
  • port a core entity
  • do performance testing
  • implement revisions + port node and other entity types
  • refactor field storage

Performance testing should be repeated after every step.

Status

  • There is a first patch that moves the entity API into its own module, which we need to get in first.
    http://drupal.org/node/1018602 -> Needs review.
  • There is an issue for implementing basic CRUD and porting a first entity type: comment
    http://drupal.org/node/1184944 - work in progress
  • We've worked on defining the basics of the API at the Drupalcon codesprint (see below).

The basic API

  • There is an EntityInterface and a class implementing the interface (Entity) for entities.
  • Entity types usually provide their own entity class (e.g. Node) that extends the Entity class, but we allow using Entity directly too.
  • We are not using type-hinting on entity-type specific classes like Node, but only on the EntityInterface.
  • As entities represent our data in Drupal, we only want to have data-related methods on the entity classes.
  • Properties/fields of entities continue to use underscores, not CamelCase. Discussion at http://drupal.org/node/1233394
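
The class layout described above could be sketched roughly as follows. This is only an illustration: the interface is trimmed to two methods, the protected $values array is an assumed storage detail, and example_entity_summary() is a hypothetical helper that is not part of the proposal.

```php
<?php
// Trimmed-down stand-in for the proposed interface (only two methods kept).
interface EntityInterface {
  public function id();
  public function label();
}

// Generic class; usable directly by entity types without a class of their own.
class Entity implements EntityInterface {
  // Assumption for this sketch: values live in a protected array.
  protected $values;

  public function __construct(array $values = array()) {
    $this->values = $values;
  }

  public function id() {
    return isset($this->values['id']) ? $this->values['id'] : NULL;
  }

  public function label() {
    return isset($this->values['label']) ? $this->values['label'] : NULL;
  }
}

// Entity-type specific class; only overrides what differs.
class Node extends Entity {
  public function label() {
    return isset($this->values['title']) ? $this->values['title'] : NULL;
  }
}

// Hypothetical helper: type-hinted on the interface, never on Node.
function example_entity_summary(EntityInterface $entity) {
  return $entity->id() . ': ' . $entity->label();
}
```

Because the helper only relies on the interface, it accepts a Node and a plain Entity alike.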

Example usage:

// Create a node with an array of values.
$node = entity_create('node', $values);
$node->save();
$node->delete();

// Update the node title of an existing node.
$node = entity_load('node', 1);
$node->set('title', 'foo');
$node->save();

Then we also worked on defining the EntityInterface and some important functions. Here is what we came up with:

<?php
/**
 * Interface for all entity objects.
 */
interface EntityInterface {

  /**
   * Returns the entity identifier, i.e. the entity's machine name or numeric
   * id.
   *
   * @return
   *   The identifier of the entity. In case the entity has no identifier yet,
   *   it returns NULL.
   */
  public function id();

  /**
   * Returns whether the entity is new; i.e. whether it has not been saved yet.
   */
  public function isNew();

  /**
   * Returns the type of the entity.
   *
   * @return
   *   The type of the entity.
   */
  public function entityType();

  /**
   * Returns the bundle of the entity.
   *
   * @return
   *   The bundle of the entity. Defaults to the entity type if the entity type
   *   does not make use of different bundles.
   */
  public function bundle();

  /**
   * Returns the UUID of the entity.
   *
   * @return
   *   The UUID of the entity, or NULL if the entity type does not make use of
   *   UUIDs.
   */
  public function uuid();

  /**
   * Returns the UUID of the entity's revision.
   *
   * @return
   *   The UUID of the entity's revision, or NULL if the entity type does not
   *   make use of revisions.
   */
  public function revisionUuid();

  /**
   * Returns the label of the entity.
   *
   * @return
   *   The label of the entity, or NULL if there is no label defined.
   */
  public function label();

  /**
   * Returns the uri elements of the entity.
   *
   * @return
   *   An array containing the 'path' and 'options' keys used to build the uri
   *   of the entity, and matching the signature of url(). NULL if the entity
   *   has no uri of its own.
   */
  public function uri();

  /**
   * Returns the value of an entity property.
   *
   * @param $property_name
   *   The name of the property to return; e.g., 'title'.
   * @param $language
   *   (optional) In case the property is translatable, the language object of
   *   the language that should be used for getting the property. If set to
   *   NULL, the default language is used.
   * @todo
   *   Which default language should be used.
   *
   * @return
   *   The property value, or NULL in case it is not defined.
   */
  public function get($property_name, $language = NULL);

  /**
   * Sets the value of an entity property.
   *
   * @param $property_name
   *   The name of the property to set; e.g., 'title'.
   * @param $value
   *   The value to set, or NULL to unset the property.
   * @param $language
   *   (optional) In case the property is translatable, the language object of
   *   the language that should be used for setting the property. If set to
   *   NULL, the default language is used.
   * @todo
   *   Which default language should be used.
   *
   * @return
   *   The property value, or NULL in case it is not defined.
   */
  public function set($property_name, $value, $language = NULL);

  /**
   * Saves an entity permanently.
   *
   * @throws EntityStorageException
   *   In case of failures an exception is thrown.
   *
   * @return
   *   Either SAVED_NEW or SAVED_UPDATED is returned, depending on the operation
   *   performed.
   */
  public function save();

  /**
   * Deletes an entity permanently.
   *
   * @throws EntityStorageException
   *   In case of failures an exception is thrown.
   */
  public function delete();

  /**
   * Creates a duplicate of the entity.
   *
   * @return EntityInterface
   *   A clone of the current entity with all identifiers unset, so saving
   *   it inserts a new entity into the storage system.
   */
  public function createDuplicate();

}

/**
 * Exception thrown when storage operations fail.
 */
class EntityStorageException extends Exception { }

/**
 * Creates a new entity object.
 *
 * @param $entity_type
 *   The type of the entity.
 * @param $values
 *   An array of values to set, keyed by property name. If the entity type has
 *   bundles the bundle key has to be specified.
 *
 * @return Entity
 *   A new entity object.
 */
function entity_create($entity_type, array $values) { }

/**
 * Exports an entity.
 *
 * @param $entity
 *   The entity to export.
 * @param $prefix
 *   (optional) A prefix for each line.
 *
 * @return
 *   The exported entity as a serialized string.
 */
function entity_export($entity, $prefix = '') { }

/**
 * Imports an entity.
 *
 * For persisting a newly imported entity use entity_save(). In case of
 * failures, an exception is thrown.
 *
 * @param $entity_type
 *   The type of the entity.
 * @param string $export
 *   The string containing the exported entity as produced by entity_export().
 *
 * @return Entity
 *   The imported entity object.
 */
function entity_import($entity_type, $export) {}


// @todo: entity_load_by_uuid()
?>
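
To make the intended semantics of isNew(), save(), delete() and createDuplicate() a bit more concrete, here is a minimal in-memory sketch. It is not the proposed implementation: MemoryEntity, its static storage array and the auto-incrementing numeric id are assumptions made purely for illustration, and the language parameter of get()/set() is left out. The SAVED_NEW and SAVED_UPDATED values mirror the constants Drupal core already defines.

```php
<?php
// Same values as in Drupal core; defined here so the sketch runs standalone.
defined('SAVED_NEW') or define('SAVED_NEW', 1);
defined('SAVED_UPDATED') or define('SAVED_UPDATED', 2);

class EntityStorageException extends Exception {}

// Hypothetical in-memory entity covering a subset of the interface.
class MemoryEntity {
  // Fake "storage" shared by all instances, keyed by entity id.
  protected static $storage = array();
  protected static $nextId = 1;
  protected $values = array();

  public function __construct(array $values = array()) {
    $this->values = $values;
  }

  public function id() {
    return isset($this->values['id']) ? $this->values['id'] : NULL;
  }

  // Assumption: an entity without an id has never been saved.
  public function isNew() {
    return $this->id() === NULL;
  }

  public function get($property_name) {
    return isset($this->values[$property_name]) ? $this->values[$property_name] : NULL;
  }

  // Passing NULL unsets the property, as documented for set().
  public function set($property_name, $value) {
    if ($value === NULL) {
      unset($this->values[$property_name]);
    }
    else {
      $this->values[$property_name] = $value;
    }
  }

  public function save() {
    if ($this->isNew()) {
      $this->values['id'] = self::$nextId++;
      self::$storage[$this->values['id']] = $this->values;
      return SAVED_NEW;
    }
    if (!isset(self::$storage[$this->id()])) {
      throw new EntityStorageException('Cannot update an entity that is not in storage.');
    }
    self::$storage[$this->id()] = $this->values;
    return SAVED_UPDATED;
  }

  public function delete() {
    if ($this->isNew() || !isset(self::$storage[$this->id()])) {
      throw new EntityStorageException('Cannot delete an entity that is not in storage.');
    }
    unset(self::$storage[$this->id()]);
  }

  // Clone with the identifier unset, so saving inserts a new record.
  public function createDuplicate() {
    $duplicate = clone $this;
    unset($duplicate->values['id']);
    return $duplicate;
  }
}
```

Saving a fresh object returns SAVED_NEW, saving it again returns SAVED_UPDATED, and a duplicate starts out new again because its identifier has been unset.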

Feedback on the high-level view is wanted! (= What methods or functions with which parameters do we need?)

Comments

I'm going off procedural

catch's picture

I'm going off procedural wrappers a fair bit. Could entity_create() etc. also be methods on a class (not sure what that class should be called though).

I think we should consider standardizing on uuid, id etc. properties on the entities and accessing them directly, instead of wrapper functions.

I will try to write up an interface for the storage class, and some pseudo code for some of the ideas at http://drupal.org/node/1237636.

Interested in the idea of not having non-data-related methods on the entity class - where do forms and rendering go? EntityForm and EntityRender classes? Something else?

Why we use both uuid and id,

Sylvain Lecoy's picture

Why do we use both uuid and id? Is this for legacy support?

if you use uuid as primary

attiks's picture

If you use uuids as primary keys you run into serious performance problems on MySQL, so the id is still the primary key.
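
As a rough illustration of that layout, a table definition in the style of Drupal's Schema API might keep the serial id as the primary key and put the uuid under a unique key. The function name and the exact column definitions here are hypothetical.

```php
<?php
// Hypothetical schema sketch: numeric id as primary key, uuid only unique.
function example_entity_schema() {
  return array(
    'description' => 'Base table of a hypothetical entity type.',
    'fields' => array(
      'id' => array(
        'type' => 'serial',
        'unsigned' => TRUE,
        'not null' => TRUE,
      ),
      'uuid' => array(
        'type' => 'char',
        'length' => 36,
        'not null' => TRUE,
      ),
    ),
    // Lookups and joins keep using the compact numeric key.
    'primary key' => array('id'),
    // The uuid stays unique and indexed without being the primary key.
    'unique keys' => array(
      'uuid' => array('uuid'),
    ),
  );
}
```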

Step by step

Stalski's picture

Kinda yes.
First get the UUID in and launch alongside the normal incrementing nid, uid, etc. Once that's stable, the incrementing ids can be ditched (though I heard that will take some time); at least that's what I heard in the core conversations at DrupalCon London.

I'm going off procedural

fago's picture

I'm going off procedural wrappers a fair bit. Could entity_create() etc. also be methods on a class (not sure what that class should be called though).

It'd not be really a wrapper, but a factory. We shouldn't hardcode the entity classes anywhere, so we gain more flexibility - thus we need a factory.
Or do you mean doing static methods like Entity::create() instead so we lazy-load them?
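
A factory along those lines could look roughly like this. It is a sketch under assumptions: example_entity_info() stands in for whatever registry ends up describing entity types, the 'entity class' key is borrowed from the Drupal 7 entity API module, and example_entity_create() is deliberately not named entity_create() because it is not the real function.

```php
<?php
class Entity {
  protected $entityType;
  protected $values;

  public function __construct(array $values, $entity_type) {
    $this->values = $values;
    $this->entityType = $entity_type;
  }

  public function entityType() {
    return $this->entityType;
  }
}

class Node extends Entity {}

// Hypothetical registry: the only place that maps entity types to classes.
function example_entity_info() {
  return array(
    'node' => array('entity class' => 'Node'),
    // No class of its own; the factory falls back to Entity.
    'test_entity' => array(),
  );
}

// Hypothetical factory: callers never name a concrete class.
function example_entity_create($entity_type, array $values) {
  $info = example_entity_info();
  if (!isset($info[$entity_type])) {
    throw new InvalidArgumentException("Unknown entity type: $entity_type");
  }
  $class = isset($info[$entity_type]['entity class']) ? $info[$entity_type]['entity class'] : 'Entity';
  return new $class($values, $entity_type);
}
```

Swapping the class for an entity type then only means changing the registry entry; calling code stays untouched.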

I think we should consider standardizing on uuid, id etc. properties on the entities and accessing them directly, instead of wrapper functions.

Yep, we discussed that too. We'd like to standardize on 'id' and 'uuid' too, but thought it's better to work on that iteratively; i.e. do id() first and then start fixing the internal implementation. However, probably we don't want to enforce something like $entity->id and $entity->uuid, as it would make incorporating remote data that doesn't fit that scheme harder, and it cannot be done in an interface. Still, we could standardize those for DX without strictly enforcing them.

Interested in the idea of not having non-data-related methods on the entity class - where do forms and rendering go? EntityForm and EntityRender classes? Something else?

We have not discussed that much, but we talked about doing multiple controllers (storage, cache, display, form, ..). That way the form- and display-related code could also live in a controller, and modules can provide further controllers, which entity-type providing modules may use and customize. E.g. there could be a "views controller" for providing views integration for the entity. We cannot really use the entity class for customizing stuff like that anyway, as it is not extendable by modules.

However, probably we don't

catch's picture

However, probably we don't want to enforce something like $entity->id and $entity->uuid, as it would make incorporating remote data that doesn't fit that scheme harder, and it cannot be done in an interface. Still, we could standardize those for DX without strictly enforcing them.

Remote entities could do this in __construct() but yes it can't be enforced in an interface. There must be a decent way to both document and require particular object properties though...

I'm strongly in favour of multiple controllers, this is mainly details but I see a couple of things to sort out:

  1. The entity class only holds entity data and methods to access it.

  2. The entity class holds that data, methods to access it, methods to manipulate it, and also factory methods for controllers for forms, rendering etc.

  3. The controllers are separate classes which have public facing APIs outside the entity class.

It sounds like the discussion is moving towards three - so that contrib can add completely new controllers as opposed to just swapping them in. That means defining exactly what $entity contains then I think.

Remote entities could do this

fago's picture

Remote entities could do this in __construct() but yes it can't be enforced in an interface. There must be a decent way to both document and require particular object properties though...

We could do so, but that means we'd have to fix up the property names regardless of whether we actually use them later on. Also, from a DX point of view it makes sense to me to be able to integrate remote data as is, so developers who know the data can rely on its usual properties and don't have to adapt to how the data is represented in Drupal. Having to translate back and forth between two different property naming schemes seems a bit weird to me.

So, I'd propose the following:
* Entities handled by Drupal modules should use Drupal's convention for property names: $entity->id, $entity->uuid, .. If developers know which entity type they are dealing with, they can make direct use of those properties.
* Entities originating from remote systems should not mess with the data to match the convention, but integrate the data as is, so developers used to those data structures get the data as they know it. Developers dealing with entities generically can still use the methods defined by the interface, i.e. $entity->id(), $entity->uuid().
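
The two cases in this proposal could be sketched as follows. The interface is trimmed to id() and uuid(), and LocalEntity, RemoteBook, its 'isbn' property and example_collect_ids() are all hypothetical names used only for illustration.

```php
<?php
// Trimmed-down stand-in for the proposed interface.
interface EntityInterface {
  public function id();
  public function uuid();
}

// Drupal-managed entity: follows the convention, properties usable directly.
class LocalEntity implements EntityInterface {
  public $id;
  public $uuid;

  public function __construct($id, $uuid) {
    $this->id = $id;
    $this->uuid = $uuid;
  }

  public function id() {
    return $this->id;
  }

  public function uuid() {
    return $this->uuid;
  }
}

// Remote record: keeps its native property names, maps them in the methods.
class RemoteBook implements EntityInterface {
  public $isbn;
  public $title;

  public function __construct($isbn, $title) {
    $this->isbn = $isbn;
    $this->title = $title;
  }

  public function id() {
    return $this->isbn;
  }

  // This hypothetical remote type does not make use of UUIDs.
  public function uuid() {
    return NULL;
  }
}

// Generic code relies on the interface methods only.
function example_collect_ids(array $entities) {
  $ids = array();
  foreach ($entities as $entity) {
    $ids[] = $entity->id();
  }
  return $ids;
}
```

Code that knows it is handling a LocalEntity can read $entity->id directly, while generic code calls id() and works with both classes.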

Update: Posted the proposal also to https://drupal.org/node/1233394#comment-4973560 - let's have the discussion over there.

multiple controllers

fago's picture

I'm strongly in favour of multiple controllers, this is mainly details but I see a couple of things to sort out:

1. The entity class only holds entity data and methods to access it.

2. The entity class holds that data, methods to access it, methods to manipulate it, and also factory methods for controllers for forms, rendering etc.

3. The controllers are separate classes which have public facing APIs outside the entity class.

It sounds like the discussion is moving towards three - so that contrib can add completely new controllers as opposed to just swapping them in. That means defining exactly what $entity contains then I think.

Yes, I really think we should support contrib by defining a proper way to add new functionality around entities, with an easy way to enable or customize it for certain entity types. Thus, modules can create their own APIs and provide generic implementation(s) for entities, e.g. Views could use it for providing entity-based views integration that entity types then may customize by overriding the controller. Still, the alter hook is "free" for other modules to use.

So when one provides a new entity type, one can easily control how it integrates with the system by enabling, disabling or customizing the controllers. (Which or whether controllers should be enabled by default is something that probably depends on the controller.)

So I'd say: the controllers are separate classes that integrate entities with other system components, which in turn have public-facing APIs.

So, this is how I could imagine the possible core controllers (Form, Render, Access) working:

Forms:
- Let's do an "entity-form" subsystem, which cares about providing the add/edit-form of an entity. It should have an API that allows embedding the form somewhere, maybe even inside other forms. The controller could be a default implementation integrating fields. With an entity property system in place, it could even provide a complete form as default, which then the developer can customize.

Render:
- The same way, I think we need a subsystem for displaying entities. It should have a public API for rendering the entity in various ways (view modes?), as well as an API for rendering parts related to the entity, i.e. fields, certain properties, links or whatever. The controller could do a default implementation of that API to provide display components for fields.

Access:
- A public-facing API to determine access for an entity type, optionally for a certain entity (i.e. entity_access()). We can provide a controller implementing a reasonable default by providing basic permissions and implementing the access API.

Then, for contrib we can do it analogously. I'm doing it already that way with the entity API module for d7, e.g. it has a 'view controller' taking the base-table + entity property info + schema information to generate some default views integration; or a 'rules controller' exposing events to Rules, ..

Note: These are just examples of how I imagine the entity controllers could work. Each system component definitely needs its own discussion.
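
To illustrate the enable / disable / customize idea, here is a rough sketch of a per-entity-type controller lookup. Everything in it is hypothetical (ExampleControllerRegistry, the 'views' controller kind and both controller classes); it only shows one way a contrib module could register a new kind of controller with a default that entity types then override or switch off.

```php
<?php
// Hypothetical registry resolving controllers per entity type and kind.
class ExampleControllerRegistry {
  protected $defaults = array();   // kind => default class
  protected $overrides = array();  // entity type => kind => class or FALSE
  protected $instances = array();  // entity type => kind => object

  // A module (core or contrib) announces a new kind of controller.
  public function registerKind($kind, $default_class) {
    $this->defaults[$kind] = $default_class;
  }

  // An entity type customizes a controller, or disables it with FALSE.
  public function override($entity_type, $kind, $class) {
    $this->overrides[$entity_type][$kind] = $class;
  }

  public function get($entity_type, $kind) {
    if (!isset($this->defaults[$kind])) {
      throw new InvalidArgumentException("Unknown controller kind: $kind");
    }
    $class = $this->defaults[$kind];
    if (isset($this->overrides[$entity_type]) && array_key_exists($kind, $this->overrides[$entity_type])) {
      $class = $this->overrides[$entity_type][$kind];
    }
    if ($class === FALSE) {
      // Controller disabled for this entity type.
      return NULL;
    }
    // Instantiate lazily, once per entity type and kind.
    if (!isset($this->instances[$entity_type][$kind])) {
      $this->instances[$entity_type][$kind] = new $class($entity_type);
    }
    return $this->instances[$entity_type][$kind];
  }
}

// Generic default provided by the module that owns the controller kind.
class ExampleViewsController {
  protected $entityType;

  public function __construct($entity_type) {
    $this->entityType = $entity_type;
  }

  public function describe() {
    return 'default views integration for ' . $this->entityType;
  }
}

// Customization provided by the module that owns the entity type.
class ExampleNodeViewsController extends ExampleViewsController {
  public function describe() {
    return 'customized views integration for ' . $this->entityType;
  }
}
```

The controller objects are created only when first requested, so entity types that never use a given integration pay nothing for it.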

It'd not be really a wrapper,

catch's picture

It'd not be really a wrapper, but a factory. We shouldn't hardcode the entity classes anywhere, so we gain more flexibility - thus we need a factory.
Or do you mean doing static methods like Entity::create() instead so we lazy-load them?

I'm fine with either, although I think core needs to pick one and standardize (and probably on class factories for lazy loading even though they look ugly to me).

By procedural wrappers I mean things like entity_load(), which don't currently return a class at all but use classes internally. I really, really want to eliminate these in Drupal 8.

I'd really prefer if we could

dixon_'s picture

I'd really prefer if we could wrap entity_create(), entity_load() etc. in a class. In the worst case it could be a singleton class or whatever.

The huge gain would be the lazy-loading ability. This would play very nicely together with the context system that's lazy-loading everything by nature. We really need this to strip down our footprint, as we've talked about for a long time.

We already talked about this, and implemented something simple for the UUID stuff. Maybe we should pause a bit and think harder around this, before we proceed.. Because I see the same patterns coming back everywhere. Maybe we should focus on some discussions here:

Define basic conventions for converting things into classes - http://drupal.org/node/1239644

agreed

fago's picture

Yep, we should follow the outcome of that general discussion and apply it to the entity api too.

We have agreement in this

pwolanin's picture

We have agreement in this issue to keep all entity properties named with underscores: https://drupal.org/node/1233394