Currently the process values are a list of plugins and their configurations. This causes a lot of headaches because of ordering. Also, no one thinks in terms of plugins. The Drupal 7 version of addFieldMapping also took the destination as its first argument. Reading the current documentation at https://drupal.org/node/2127611 first might be helpful.
Let's change process. Instead of that list of plugins+configurations, make it an associative array keyed by destination fields, with the values describing how they come to be. These values can be of three kinds: a string, an associative array or a numeric array (list).
- If it's a string: it's simply copied from the source. So in the system site migration instead of
    process:
      -
        plugin: copy_from_source
        mail: site_mail
        'page:front': site_frontpage
  we would write:
    process:
      mail: site_mail
      'page:front': site_frontpage
  etc. Very simple.
- If it's an associative array, it must have a plugin key and its configuration. In the example below, name is mapped. So, instead of
    process:
      -
        plugin: map
        map:
          filter:
            0: filter_html
            1: filter_autop
          php:
            0: php_code
        source:
          - module
          - delta
        destination: name
  we would write:
    process:
      name:
        plugin: map
        map:
          filter:
            0: filter_html
            1: filter_autop
          php:
            0: php_code
        source:
          - module
          - delta
  Item 1 above (the string form) is really just a degenerate case of this, because
    process:
      mail: site_mail
  can be written as
    process:
      mail:
        plugin: get
        source: site_mail
- And finally, it can be a list (numerically keyed array). Both notations (the complete associative array and the simplified string for copy) are available as list items. For example, let's copy the author value from the source to the destination property uid, then look up the value in a previous user migration and then finally set a default. So the old version was:
    process:
      -
        plugin: copy_from_source
        uid: author
      -
        plugin: migration
        property: uid
        id: user
      -
        plugin: default
        property: uid
        value: 0
  and the new version is:
    process:
      uid:
        - author
        -
          plugin: migration
          id: user
        -
          plugin: default
          value: 0
  In fact, item 2 (the associative array) is a degenerate case of this as well: we allow an array containing a single element to drop the wrapping array.
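The way the three notations collapse into one can be sketched in a few lines. This is only an illustration in Python, not Drupal's PHP: the function name normalize_process is hypothetical, and the only plugin names used are the ones from the examples above (get, map, migration, default).

```python
def normalize_process(process):
    """Turn every destination's value into a plain list of plugin configs."""
    normalized = {}
    for destination, value in process.items():
        # A lone string or a lone plugin config is treated as a one-element list.
        steps = value if isinstance(value, list) else [value]
        normalized[destination] = [
            # A bare string is shorthand for a `get` plugin reading that source property.
            {"plugin": "get", "source": step} if isinstance(step, str) else dict(step)
            for step in steps
        ]
    return normalized


process = {
    "mail": "site_mail",
    "name": {"plugin": "map", "source": ["module", "delta"]},
    "uid": [
        "author",
        {"plugin": "migration", "id": "user"},
        {"plugin": "default", "value": 0},
    ],
}
normalized = normalize_process(process)
```

After this step every destination has the same shape, so the rest of the code never needs to know which notation was used in the configuration.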
To facilitate this, the code will look like this:
- I am retiring the ProcessPluginBag in favor of a simple list of process plugins. The key-per-instance-id approach doesn't work well for us -- I can't be sure we will never need to apply the same plugin twice. And keying per instance id somewhat implies that the order doesn't matter, when it matters a lot. In Migration, before calling the process plugin manager's getInstance, we will convert form 1 into form 2 and then form 2 into form 3: every destination property will have a simple list of process plugins. The 1-2-3-ness of the entity will not be visible from the outside.
- The first argument of ProcessInterface::apply changes from $row to a single value: in the first step, this value is the source property value(s) if one is present. In every later step, it is whatever the previous apply returned. No typehint on either @param or @return -- oh well.
This way we do not need to explicitly document and enforce that you must copy first from the source; it just happens in a very natural way. There is no need for plugins to decide whether they read from the source or from the destination -- currently only copy_from_source reads from the source, but that's super awkward as you might want to run migration or map as the first step.
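The chaining described above can be sketched as follows. This is a Python illustration, not the actual ProcessInterface: the plugins are reduced to plain functions, the lookup table is made up, and the sketch assumes a failed migration lookup yields None.

```python
def run_pipeline(steps, source_value):
    """Feed the source value to the first step, then chain the return values."""
    value = source_value
    for step in steps:
        value = step(value)
    return value


# Hypothetical stand-ins for the migration lookup and default plugins.
USER_ID_MAP = {"alice": 7}  # made-up table of old ids to migrated ids


def migration_lookup(value):
    # Assumption: an unknown id yields None.
    return USER_ID_MAP.get(value)


def default_zero(value):
    # Fall back to 0 when the previous step produced nothing.
    return 0 if value is None else value


result_known = run_pipeline([migration_lookup, default_zero], "alice")
result_unknown = run_pipeline([migration_lookup, default_zero], "bob")
```

No step has to know whether its input came from the source row or from an earlier plugin, which is the point of passing a single value.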
There's a slight problem here: what if you want to run map, but not as the first step? Say you do a migration lookup first and then want to map with the current value and the migrated value of bar. I think:
    process:
      bar: # whatever here
      foo:
        -
          plugin: migration
          id: some_migration
        -
          plugin: map
          # map: omitted for brevity
          source:
            -
            - '@bar'
where the null value means the current value and @bar means that it's actually the already migrated value. We will tell people to double their @ signs in the source and destination fields as an escape; I am sure no one will ever actually need that.
An even more edge case would be this:
- bar step1
- foo step1
- bar step2 depending on @foo
- foo step2 depending on @bar
This can't be encoded using the above syntax as you key by destination. However, as we pass on $row to the process plugins, you can encode this interdependency in a custom plugin.
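For the simpler case, the null and @ convention for source entries could be resolved roughly like this. It is a Python sketch under assumptions: the function name resolve_source is hypothetical, rows are modelled as plain dicts, and a doubled @@ is read as an escape for a source property whose name really starts with @.

```python
def resolve_source(keys, current, source_row, destination_row):
    """Resolve each source entry to a concrete value."""
    values = []
    for key in keys:
        if key is None:
            # An empty entry means "the value handed over by the previous step".
            values.append(current)
        elif key.startswith("@@"):
            # Assumed escape: '@@name' is the source property literally named '@name'.
            values.append(source_row[key[1:]])
        elif key.startswith("@"):
            # '@bar' means the already migrated destination value of bar.
            values.append(destination_row[key[1:]])
        else:
            values.append(source_row[key])
    return values


source_row = {"module": "filter", "@odd": "x"}
destination_row = {"bar": 42}
mixed = resolve_source([None, "@bar"], 5, source_row, destination_row)
escaped = resolve_source(["module", "@@odd"], 5, source_row, destination_row)
```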
During implementation it dawned on me that source is just a convenient notation for prepending a get plugin to the current step.
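That equivalence can be shown as a small rewrite of the step list. This is again a Python sketch and not the real implementation: expand_source is a hypothetical name, and the step configs are plain dicts.

```python
def expand_source(steps):
    """Rewrite every step carrying a source key as a get step plus the step itself."""
    expanded = []
    for step in steps:
        if "source" in step and step.get("plugin") != "get":
            # Prepend an explicit get step that reads the same source.
            expanded.append({"plugin": "get", "source": step["source"]})
            # The original step then runs on whatever get returned.
            step = {k: v for k, v in step.items() if k != "source"}
        expanded.append(step)
    return expanded


steps = [
    {
        "plugin": "map",
        "source": ["module", "delta"],
        "map": {"php": {0: "php_code"}},
    }
]
expanded = expand_source(steps)
```

A step that is already a get plugin is left untouched, since its source key is its actual configuration.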
Thoughts?
Comments
talked with chx in irc.
this does look better. it solves the order problem, and adds simplicity.
and if I'm understanding correctly, we don't lose anything functionally because it's still plugins.
Cathy Theys
+1
No Following in group :(