Let's pivot

chx's picture

Currently the process values are a list of plugins and their configurations. This is causing a lot of headaches because of ordering. Also, no one thinks in terms of plugins. The Drupal 7 version of addFieldMapping also took the destination as its first argument. Reading the current documentation at https://drupal.org/node/2127611 first might be helpful.

Let's change the process. Instead of that list of plugins+configurations, make it an associative array keyed by destination fields and the values describing how they come to be. These values can be of three kinds: a string, an associative array or a numeric array (list).

  1. If it's a string: it's simply copied from the source. So in the system site migration instead of
    process:
        -
            plugin: copy_from_source
            mail: site_mail
            'page:front': site_frontpage

    we would write:

    process:
      mail: site_mail
      'page:front': site_frontpage

    etc. Very simple.

  2. If it's an associative array, it must have a plugin key and that plugin's configuration. In the example below, the name is mapped. So, instead of
    process:
      -
        plugin: map
        map:
          filter:
            0: filter_html
            1: filter_autop
          php:
            0: php_code
        source:
          - module
          - delta
        destination: name

    we would write:
    process:
      name:
        plugin: map
        map:
          filter:
            0: filter_html
            1: filter_autop
          php:
            0: php_code
        source:
          - module
          - delta

    Item 1 above is really just a degenerate case of this, because
    process:
      mail: site_mail

    can be written as
    process:
      mail:
        plugin: get
        source: site_mail
  3. And finally, it can be a list (numeric keyed array). Both notations (the complete associative array and the simplified string for copy) are available as list items. For example, let's copy the author value from the source to the destination property uid, then look up the value in a previous user migration and then finally set a default. So the old version was:
    process:
      -
        plugin: copy_from_source
        uid: author
      -
        plugin: migration
        property: uid
        id: user
      -
        plugin: default
        property: uid
        value: 0

    and the new version would be:
    process:
      uid:
        - author
        -
          plugin: migration
          id: user
        -
          plugin: default
          value: 0

    In fact, 2. is a degenerate case of this as well: we allow an array containing a single element to drop the wrapping array.
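
The three notations above all reduce to the list form. Here is a minimal sketch of that reduction, written in Python for illustration only (the real code would be PHP inside Migration). The function name `normalize_process` is hypothetical, and the `get` plugin name is taken from the example in item 2.

```python
def normalize_process(process):
    """Expand every destination's value into a list of plugin configurations."""
    normalized = {}
    for destination, value in process.items():
        # Forms 1 and 2 (a string or a single plugin configuration) are
        # shorthand for a one-element list, which is form 3.
        steps = value if isinstance(value, list) else [value]
        normalized[destination] = [
            # A bare string is shorthand for the "get" plugin.
            {"plugin": "get", "source": step} if isinstance(step, str) else dict(step)
            for step in steps
        ]
    return normalized


process = {
    "mail": "site_mail",
    "uid": [
        "author",
        {"plugin": "migration", "id": "user"},
        {"plugin": "default", "value": 0},
    ],
}
normalized = normalize_process(process)
```

After this pass, every destination property maps to a plain ordered list, so nothing downstream needs to know which notation was used.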

To facilitate this, the code will look like this:

  1. I am retiring the ProcessPluginBag in favor of a simple list of process plugins. Keying per instance ID doesn't work well for us: I can't be sure we will never need to apply the same plugin twice. Keying per instance ID also somewhat implies that the order doesn't matter, and it matters a lot. In Migration, before calling the process plugin manager's getInstance, we will convert form 1 into form 2 and then form 2 into form 3, so every destination property will have a simple list of process plugins. The 1-2-3-ness of the entity will not be visible from the outside.
  2. The ProcessInterface::apply first argument changes from $row to a single value: in the first step, this value is the source property value(s) if one is present. For every other step, it is what the previous apply returned. No typehint on either @param or @return -- oh well.
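
To illustrate the value-passing described in point 2, here is a rough Python sketch (not the actual PHP interface). The helpers `get`, `migration_lookup`, `default` and `process_destination` are hypothetical stand-ins, and the lookup table is made-up sample data. Each step receives what the previous step returned, plus the row.

```python
def get(source):
    """Stand-in for the 'get' plugin: read a property from the source row."""
    def apply(value, row):
        return row[source]
    return apply


def migration_lookup(id_map):
    """Stand-in for the 'migration' plugin: map an old ID to a new one."""
    def apply(value, row):
        return id_map.get(value)
    return apply


def default(fallback):
    """Stand-in for the 'default' plugin: replace a missing value."""
    def apply(value, row):
        return fallback if value is None else value
    return apply


def process_destination(steps, row):
    """Run the steps in order, feeding each one the previous result."""
    value = None
    for step in steps:
        value = step(value, row)
    return value


user_map = {"alice": 7}  # hypothetical result of an earlier user migration
steps = [get("author"), migration_lookup(user_map), default(0)]
known = process_destination(steps, {"author": "alice"})
unknown = process_destination(steps, {"author": "bob"})
```

Because the list is ordered and each step only sees the running value, the same plugin can appear twice in one list without any instance ID.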

This way we do not need to explicitly document and enforce that you must copy first from the source; it just happens in a very natural way. There is no need for plugins to decide whether they read from the source or from the destination. Currently only copy_from_source reads from the source, but that's super awkward, as you might want to run the migration or map plugin as the first step.

There's a slight problem here: what if you want to run map, but not as the first step? Say you do a migration lookup first and then want to map using the current value and the already migrated value of another destination. I think:

process:
  bar: # whatever here
  foo:
    -
      plugin: migration
      id: some_migration
    -
      plugin: map
      # map: omitted for brevity
      source:
        -
        - '@bar'

where the null value means the current value and @bar means that it's actually the already migrated value. We will tell people to double their @ signs in the source and destination fields as an escape; I am sure no one will ever actually need that. An even more extreme edge case would be this:
  1. bar step1
  2. foo step1
  3. bar step2 depending on @foo
  4. foo step2 depending on @bar

This can't be encoded using the above syntax as you key by destination. However, as we pass on $row to the process plugins, you can encode this interdependency in a custom plugin.
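
The source notation proposed above (null for the current value, @name for an already migrated destination value, a doubled @ as an escape) could be resolved roughly like this. This is only a Python sketch under my reading of the proposal: `resolve_source` is a hypothetical helper, and it only handles escaping for source names that start with a literal @.

```python
def resolve_source(spec, current, source_row, destination_row):
    """Resolve one entry of a plugin's 'source' list to a value."""
    if spec is None:
        # A null entry means "the value returned by the previous step".
        return current
    if spec.startswith("@@"):
        # Escaped: a source property whose name really starts with "@".
        return source_row[spec[1:]]
    if spec.startswith("@"):
        # A reference to an already migrated destination property.
        return destination_row[spec[1:]]
    # Anything else is a plain source property.
    return source_row[spec]


source_row = {"module": "filter", "@odd": "x"}
destination_row = {"bar": 42}

values = [
    resolve_source(spec, "current", source_row, destination_row)
    for spec in [None, "@bar", "@@odd", "module"]
]
```

Note that this only works if bar has been processed before foo, which is why the bar step1 / foo step1 / bar step2 / foo step2 case still needs a custom plugin.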

During implementation it dawned on me that source is just a convenient notation of prepending a get plugin to the current step.
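
That observation can be sketched as a small rewrite: a step carrying a source key becomes a get step followed by the same step without it. Again this is an illustrative Python sketch, and `expand_source` is a hypothetical name, not part of the proposed API.

```python
def expand_source(step):
    """Turn {'plugin': X, 'source': S, ...} into [get(S), {'plugin': X, ...}]."""
    if "source" not in step or step["plugin"] == "get":
        # Nothing to expand: no source given, or this already is a get step.
        return [step]
    rest = {key: value for key, value in step.items() if key != "source"}
    return [{"plugin": "get", "source": step["source"]}, rest]


map_step = {
    "plugin": "map",
    "map": {"php": {0: "php_code"}},
    "source": ["module", "delta"],
}
expanded = expand_source(map_step)
```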

Thoughts?

Comments


YesCT's picture

talked with chx in irc.

this does look better. it solves the order problem, and adds simplicity.
and if I'm understanding correctly, we don't lose anything functionally because it's still plugins.

Cathy Theys


nithinkolekar's picture

+1
No Following in group :(

IMP
