In this article, we start implementing content model changes. Namely, we’ll migrate Drupal 7 nodes as Drupal 10 user and taxonomy term entities. After covering entity ID and high water mark considerations, we will explain how to map data between two different entity types. We’ll also show how to introspect the Drupal 10 site to determine which properties and fields are available in target entity types. Then, we’ll write the two migrations by hand to practice what we have learned so far in the series. Finally, we’ll share all properties and fields for users and taxonomy terms in our example to serve as reference in similar projects.
Entity ID and high water mark considerations
Let’s begin with migrating nodes into users and taxonomy terms. In the previous article, we discussed the need to account for node IDs and revision IDs when deciding on the new AUTO_INCREMENT values for the two entities. No need to repeat that today. Just remember that when incorporating content model changes, multiple Drupal 7 entities and tables should be considered.
Also note that when changing entity types, it will not be possible to reuse the same identifier from the original entity in most cases. Imagine that a node with nid 3 in Drupal 7 belongs to a content type that will be converted to the user entity in Drupal 10. Suppose Drupal 7 also has a user with uid 3. If we preserve user IDs, trying to migrate nid 3 on top of the already migrated uid 3 will cause data loss. This is extra problematic when there are entity references to preserve. This was discussed in article 4 and later expanded in article 5.
Two possible ways to avoid problems in this type of situation are:
- Do not migrate the identifier of the Drupal 7 entity that will be migrated into a different Drupal 10 entity. In the scenario described above, we would not migrate the nid property of the nodes that will be converted to users. This is the approach we will follow in our example project.
- Apply an offset to the identifier of the Drupal 7 entity that will be migrated into a different Drupal 10 entity. The offset should be high enough to avoid conflicts between the two entities at play. In the scenario described above, we could apply an offset of 1,000,000 so that nid 3 will be imported as uid 1,000,003. This would work under the assumption that, by the time we run the final migration prior to launch, there are fewer than 1,000,000 users in Drupal 7. When migrating revisionable entities, a different offset value might be required for the revision identifier. In the case of nodes, that is the version id (vid) property.
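For comparison, the offset approach could be sketched as shown below. This is a minimal, untested sketch and not part of the example project. The uid_offset constant name is hypothetical, and the snippet assumes that the core callback process plugin hands the two-element source array to PHP's array_sum function as a single argument.

```yaml
source:
  key: migrate
  plugin: d7_node
  node_type: speaker
  constants:
    # Hypothetical constant. Pick a value above the highest uid expected in Drupal 7.
    uid_offset: 1000000
process:
  uid:
    -
      # Computes array_sum([nid, uid_offset]), so nid 3 would become uid 1000003.
      plugin: callback
      callable: array_sum
      source:
        - nid
        - constants/uid_offset
```

If entity references to these nodes also need to be preserved, the same offset would have to be applied wherever those references are migrated.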
When we have a migration that involves an entity type change, the high_water_property setting should be selected based on the original entity type from Drupal 7, not the new one in Drupal 10. This is because the high water mark is a configuration option of the source plugin. Our source is Drupal 7 nodes; therefore, we need to choose a node property to use as the high water mark. We will use the node revision ID (vid) as follows:
source:
  key: migrate
  plugin: d7_node
  high_water_property:
    name: vid
    alias: nr
Technical note: Per our migration plan, we do not need to migrate past node revisions. Because of this, we used the node classic migration when generating the migrations. This is being highlighted because the node classic migration and the node complete migration use different source plugins: d7_node and d7_node_complete, respectively. Different source plugins will expose different sets of fields. Not all of them are listed by the drush migrate:fields-source command. Debugging the migration is the best way to find all available fields.
Destination entity properties and fields
When implementing content model changes, it is important to have a good understanding of Drupal entities. As part of the process pipeline, we need to set properties and fields in the Drupal 10 destination entity using data from the Drupal 7 entity retrieved by the source plugin. In today's example, we will set properties and fields for the Drupal 10 user and taxonomy_term entities using data from the Drupal 7 node entities.
But how do you know what properties and fields need to be set? There are multiple ways:
- Use the generated migrations (in the ref_migrations folder) to get a sense of what properties are expected based on the destination plugin of each migration.
- Consult online references. This article has a list of properties for various Drupal core content entities. This other article has a list of properties for various Drupal Commerce content entities.
- Ask the Drupal API. The commands below will extract data for your current Drupal 10 installation based on the modules that are enabled:
# Get all entity types in the current Drupal 10 installation.
ddev drush php:eval "print_r(array_keys(\Drupal::entityTypeManager()->getDefinitions()));"
# List only content entities.
# The use of single quotes to pass the code snippet to php:eval is important.
ddev drush php:eval 'print_r(array_keys(array_filter(\Drupal::entityTypeManager()->getDefinitions(), fn($entity) => $entity->getGroup() === "content")));'
# List all properties and fields for an entity. Replace ENTITY_ID with the machine name of the entity. Example: 'node'.
# Only content entities are supported.
ddev drush php:eval "print_r(array_keys(\Drupal::service('entity_field.manager')->getFieldStorageDefinitions('ENTITY_ID')));"
# List all properties and fields for a bundle of an entity. Replace ENTITY_ID and BUNDLE with the machine name of the entity and bundle, respectively. Example: 'node' and 'article'.
# Only entity types that implement \Drupal\Core\Entity\FieldableEntityInterface are supported.
ddev drush php:eval "print_r(array_keys(\Drupal::service('entity_field.manager')->getFieldDefinitions('ENTITY_ID', 'BUNDLE')));"
# List fields across bundles of an entity. Replace ENTITY_ID with the machine name of the entity. Example: 'node'.
ddev drush php:eval "print_r(\Drupal::service('entity_field.manager')->getFieldMap()['ENTITY_ID']);"
Knowing the names of an entity's properties and fields is the first step. You also need to know what type of data is expected. Is it a scalar value like an integer, a string, or a boolean? Is it an array? If so, what is the structure of the array? When assigning values to fields, it is important to know their cardinality and what sub-fields are available. This article includes a list of sub-fields per field type. It is also useful to know what is the default sub-field, if one exists, for the field type.
Again, looking at the generated migrations and consulting online references can help with this. When in doubt, ask the Drupal API:
# Get a list of sub-fields. Replace ENTITY_ID and PROPERTY_OR_FIELD with the machine name of the entity and machine name of the property/field, respectively. Example: 'node' and 'nid'/'body'.
ddev drush php:eval "print_r(array_keys(\Drupal::service('entity_field.manager')->getFieldStorageDefinitions('ENTITY_ID')['PROPERTY_OR_FIELD']->getColumns()));"
# Get details about the sub-fields, including database schema information.
ddev drush php:eval "print_r(\Drupal::service('entity_field.manager')->getFieldStorageDefinitions('ENTITY_ID')['PROPERTY_OR_FIELD']->getColumns());"
# Get the default sub-field, if one exists for the field type.
ddev drush php:eval "var_dump(\Drupal::service('entity_field.manager')->getFieldStorageDefinitions('ENTITY_ID')['PROPERTY_OR_FIELD']->getMainPropertyName());"
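Once the sub-fields are known, they can be targeted individually in the process section by using the FIELD/SUBFIELD syntax in the destination property name. The snippet below is a minimal sketch for the taxonomy term description, which has value and format sub-fields. The field_description source field and the restricted_html format are taken from the example later in this article; reading only the first delta of the source field is an assumption made for brevity.

```yaml
process:
  # Set only the 'value' sub-field, reading the first delta of the source field.
  'description/value':
    -
      plugin: get
      source: field_description/0/value
  # Set the 'format' sub-field explicitly.
  'description/format':
    -
      plugin: default_value
      default_value: restricted_html
```

When a scalar value is assigned directly to the field name instead, it is stored in the default sub-field, which is the one reported by getMainPropertyName().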
Properties and fields for user and taxonomy term entities
When creating Drupal 10 entities, it is not necessary to provide a value for every property and field. In fact, in some cases we will not have suitable Drupal 7 data to populate some Drupal 10 properties and fields. In other cases, we will intentionally not set a destination property. In our example, we will not set the primary identifiers of the user and taxonomy term entities (uid and tid) to avoid potential ID conflicts as explained above.
To get a list of available destination properties and fields for the user and taxonomy_term entities in your current installation, execute the following commands inside the drupal10 folder:
ddev drush php:eval "print_r(array_keys(\Drupal::service('entity_field.manager')->getFieldStorageDefinitions('user')));"
ddev drush php:eval "print_r(array_keys(\Drupal::service('entity_field.manager')->getFieldStorageDefinitions('taxonomy_term')));"
If you want to obtain even more details, open an interactive PHP shell by executing ddev drush php:cli in the Drupal 10 folder. Then, run the following code:
// Get all property and field definitions for an entity. Pick one of the following.
$field_storage_definitions = \Drupal::service('entity_field.manager')->getFieldStorageDefinitions('user');
$field_storage_definitions = \Drupal::service('entity_field.manager')->getFieldStorageDefinitions('taxonomy_term');
// Find more details about each property and field.
$field_storage_data = array_map(function ($field_definition) {
  $module = ($field_definition instanceof \Drupal\Core\Field\BaseFieldDefinition) ? $field_definition->getProvider() : $field_definition->get('module');
  $label = $field_definition->getLabel();
  $description = $field_definition->getDescription();
  if ($field_definition instanceof \Drupal\field\Entity\FieldStorageConfig) {
    $label = 'Field ' . $field_definition->getName();
    $description = 'Attached to bundle(s): ' . implode(', ', $field_definition->getBundles()) . '.';
  }
  return [
    'module' => $module,
    'type' => $field_definition->getType(),
    'label' => ($label instanceof \Drupal\Core\StringTranslation\TranslatableMarkup) ? $label->render() : $label,
    'description' => ($description instanceof \Drupal\Core\StringTranslation\TranslatableMarkup) ? $description->render() : $description,
    'cardinality' => $field_definition->getCardinality(),
    'default_subfield' => $field_definition->getMainPropertyName(),
    // Fall back to the property names when the field type defines no schema columns.
    'subfields' => array_keys($field_definition->getColumns()) ?: $field_definition->getPropertyNames(),
  ];
}, $field_storage_definitions);
// Print the names of the entity's properties and fields.
array_keys($field_storage_definitions)
// Print details about the entity's properties and fields.
$field_storage_data
// Print details for a single property or field in the entity.
$field_storage_data['user_picture']
// The output of $field_storage_data['user_picture'] is:
[
  "module" => "image",
  "type" => "image",
  "label" => "Field user_picture",
  "description" => "Attached to bundle(s): user.",
  "cardinality" => 1,
  "default_subfield" => "target_id",
  "subfields" => [
    "target_id",
    "alt",
    "title",
    "width",
    "height",
  ],
]
Migrating nodes as taxonomy terms
We could use the migrate_plus.migration.upgrade_d7_node_sponsor.yml and migrate_plus.migration.upgrade_d7_taxonomy_term_tags.yml files in the ref_migrations folder as a reference to migrate Drupal 7 nodes into Drupal 10 taxonomy terms. We could also review the upgrade_d7_taxonomy_term migration we created in the previous article.
Instead, we will create the migration file from scratch. We have seen and customized many migrations already in the series. At this point, you should have gained familiarity with the structure of migration files.
As a reminder, we want to migrate nodes of type sponsor as taxonomy term entities. A corresponding sponsor Drupal 10 vocabulary was created back in article 15. Below is a summary of how Drupal 7 data will be migrated into Drupal 10:
- The node title will be migrated as the taxonomy term name.
- The description field will be migrated as the taxonomy term description.
- The logo field will be migrated into a newly created image field attached to the sponsor taxonomy vocabulary.
Create an upgrade_d7_node_sponsor_to_taxonomy_term.yml file in the web/modules/custom/tag1_migration/migrations folder of our Drupal 10 project. We will use the file name, without the extension, as the migration ID. You can come up with any name as long as it is unique. When creating migrations that involve entity type conversions, I recommend including the source and destination entity machine names in the migration ID.
Below is the content of the file:
id: upgrade_d7_node_sponsor_to_taxonomy_term
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
  - taxonomy_term
  - tag1_content
label: 'Nodes (Sponsor) to taxonomy terms'
source:
  key: migrate
  plugin: d7_node
  node_type: sponsor
  high_water_property:
    name: vid
    alias: nr
process:
  name:
    -
      plugin: get
      source: title
  description:
    -
      plugin: sub_process
      source: field_description
      process:
        value: value
        format:
          -
            plugin: static_map
            source: format
            map:
              filtered_html: restricted_html
            bypass: TRUE
  changed:
    -
      plugin: get
      source: changed
  langcode:
    -
      plugin: default_value
      source: language
      default_value: und
  field_logo:
    -
      plugin: sub_process
      source: field_logo
      process:
        target_id: fid
        alt: alt
        title: title
        width: width
        height: height
destination:
  plugin: 'entity:taxonomy_term'
  default_bundle: 'sponsors'
migration_dependencies:
  required:
    - upgrade_d7_file
    - upgrade_d7_taxonomy_term
  optional: { }
Note that the source plugin fetches nodes and the destination plugin is configured to create taxonomy terms. Then, in the process section, we map available Drupal 7 node data to Drupal 10 taxonomy term data.
Drupal 7 uses a rich text field for the description. This means that the field_description field stores information about the text format used. As noted in article 22, in our example project text formats between Drupal 7 and 10 do not match verbatim. In the migration above, if the Drupal 7 filtered_html format is used, it will be migrated as the restricted_html format in Drupal 10.
Text formats play an important role in the security of a Drupal website. They can filter out malicious markup that could be used to breach a website and compromise its users. Security hardening is outside the scope of this series. Yet, we want to make sure you consider how text formats have an impact on security and incorporate that in your migration plan.
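One way to take a more defensive stance is to replace bypass: TRUE, which passes unmapped format names through unchanged, with a default_value so that any unmapped format falls back to a restrictive one. The snippet below is a sketch under that assumption rather than the configuration used in the example project. The full_html to basic_html mapping is hypothetical and depends on the text formats that exist in your Drupal 7 and Drupal 10 sites.

```yaml
process:
  description:
    -
      plugin: sub_process
      source: field_description
      process:
        value: value
        format:
          -
            plugin: static_map
            source: format
            map:
              filtered_html: restricted_html
              # Hypothetical extra mapping. Adjust to the formats in your sites.
              full_html: basic_html
            # Any format not listed in the map falls back to a restrictive one.
            default_value: restricted_html
```

Note that the static_map plugin does not accept bypass and default_value at the same time, so only one of the two strategies can be used per mapping.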
To specify the vocabulary a term belongs to, you can either set the vid property in the process section, as we did in the upgrade_d7_taxonomy_term migration, or specify the default_bundle property in the entity:taxonomy_term destination plugin. If both are set, the vid property will take precedence.
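As an illustration of the first option, the vocabulary could be assigned in the process section instead of the destination plugin. This is a sketch of an alternative, not the configuration used in this article. It assumes sponsors is the vocabulary machine name, matching the default_bundle value in the migration above.

```yaml
process:
  vid:
    -
      plugin: default_value
      default_value: sponsors
destination:
  plugin: 'entity:taxonomy_term'
```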
Before executing the upgrade_d7_node_sponsor_to_taxonomy_term migration, make sure to account for potential entity ID conflicts. This is particularly important when a migration performs content model changes. We covered this in great detail in article 23. You can use the AUTO_INCREMENT Alter module with the configuration from the start of this article.
Now, rebuild caches for our new migration to be detected and execute it. Run migrate:status to make sure we can connect to Drupal 7. Then, run migrate:import to perform the import operations.
ddev drush cache:rebuild
ddev drush migrate:status upgrade_d7_node_sponsor_to_taxonomy_term
ddev drush migrate:import upgrade_d7_node_sponsor_to_taxonomy_term
If things are properly configured, you should not get any errors. Go to https://migration-drupal10.ddev.site/admin/structure/taxonomy/manage/sponsors/overview and look at the list of migrated taxonomy terms.
Migrating nodes as users
We could use the migrate_plus.migration.upgrade_d7_node_speaker.yml and migrate_plus.migration.upgrade_d7_user.yml files in the ref_migrations folder as a reference to migrate Drupal 7 nodes into Drupal 10 users. We could also review the upgrade_d7_user migration we created in the previous article. However, for this example, we will create the migration file from scratch again. It is good practice and will help us get a better understanding of the system.
As a reminder, we want to migrate nodes of type speaker as user entities. In Drupal 7, this content type has many fields attached to it. The corresponding Drupal 10 fields were added to the user entity in article 22 by applying recipes and manually creating some. Below is a summary of how Drupal 7 data will be migrated into Drupal 10:
- The node title will be migrated as the username (name property).
- The field_email field will be migrated as the user email (mail property).
- The field_profile_picture field will be migrated into the user_picture image field.
- The field_website and field_biography fields will be migrated into corresponding Drupal 10 fields with the same name.
- The field_drupal_org_profile, field_linkedin_profile, and field_x_twitter_profile fields will be combined into a single field_social_media_links field of type social links.
- The field_favorite_quote field collection will be migrated into the field_favorite_quote paragraph.
The last two elements of the list above will be covered in the next article. The rest will be addressed below. Additionally, all users migrated from speaker nodes will get the Speaker user role we created in article 22.
Unpublished nodes of type speaker will not be migrated. Also note that even though we are creating user entities, we do not have suitable password information in Drupal 7. After the migration, individual users can trigger a password reset operation or administrators can force this operation in bulk.
Create an upgrade_d7_node_speaker_to_user.yml file in the web/modules/custom/tag1_migration/migrations folder of our Drupal 10 project. We will use the file name, without the extension, as the migration ID. Below is the content of the file:
id: upgrade_d7_node_speaker_to_user
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
  - user
  - tag1_content
label: 'Nodes (Speaker) to user accounts'
source:
  key: migrate
  plugin: d7_node
  node_type: speaker
  high_water_property:
    name: vid
    alias: nr
process:
  name:
    -
      plugin: get
      source: title
  mail:
    -
      plugin: sub_process
      source: field_email
      process:
        value: email
  created:
    -
      plugin: get
      source: created
  changed:
    -
      plugin: get
      source: changed
  status:
    -
      plugin: skip_on_empty
      source: status
      method: row
      message: 'Node was not migrated because it is unpublished.'
  init:
    -
      plugin: get
      source: '@mail'
  roles:
    -
      plugin: default_value
      default_value: [speaker]
  user_picture:
    -
      plugin: sub_process
      source: field_profile_picture
      process:
        target_id: fid
        alt: alt
        title: title
        width: width
        height: height
  field_biography:
    -
      plugin: get
      source: field_biography
  field_website:
    -
      plugin: sub_process
      source: field_website
      process:
        uri: value
        title: title
        options: attributes
destination:
  plugin: 'entity:user'
migration_dependencies:
  required:
    - upgrade_d7_file
  optional: { }
Note that the source plugin fetches nodes and the destination plugin is configured to create users. Then, in the process section, we map available Drupal 7 node data to Drupal 10 user data.
This example uses different concepts and techniques we have covered throughout the series. We do not want to repeat ourselves too much, but here is a list of highlights from our example:
- Some Drupal 7 fields use different property names to hold data compared to their Drupal 10 counterparts. In such cases, the sub_process process plugin can be used to account for the changes in property names. This was needed to migrate Drupal 7's Email and URL fields into Drupal 10's mail property and Link field, respectively.
- The source and destination entity might share a property with the same name but different meanings. In this example, the status property exists both in Drupal 7 nodes and in Drupal 10 users. For nodes, status indicates the publication status: published or unpublished. For users, status indicates whether the user account is active or blocked. In our case, we use the status as retrieved from the node to determine whether the record should be migrated, using the skip_on_empty process plugin. When the node is published, it stores a value of 1, which is sent to the destination status property, meaning the account is active. Reusing a property with the same name will not always be possible. It will depend on the meaning of the property in each of the two entities and the process plugin chain used to assign the destination property.
- The init user property stores the email address used for initial account creation. We are reusing the value already assigned to the mail destination property to set the init property.
- The roles user property expects a flat array with the machine names of the roles the user will be assigned. We use the default_value process plugin to assign the Speaker role.
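When the meanings of a shared property name diverge more than they do here, the skip logic and the destination value can be decoupled. The snippet below is a sketch of one way to do that, not the approach used in the migration above. It relies on a pseudo-field, a process key that does not correspond to any property of the destination entity; the _skip_unpublished name is hypothetical.

```yaml
process:
  # Pseudo-field: used only to skip rows for unpublished nodes.
  _skip_unpublished:
    -
      plugin: skip_on_empty
      source: status
      method: row
      message: 'Node was not migrated because it is unpublished.'
  # The user account status is set independently of the node status.
  status:
    -
      plugin: default_value
      default_value: 1
```

With this arrangement, changing the skip condition later does not affect the value written to the user status property, and vice versa.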
Before executing the upgrade_d7_node_speaker_to_user migration, make sure to account for potential entity ID conflicts as mentioned above. Now, rebuild caches for our new migration to be detected and execute it. Run migrate:status to make sure we can connect to Drupal 7. Then, run migrate:import to perform the import operations.
ddev drush cache:rebuild
ddev drush migrate:status upgrade_d7_node_speaker_to_user
ddev drush migrate:import upgrade_d7_node_speaker_to_user
If things are properly configured, you should not get any errors. Go to https://migration-drupal10.ddev.site/admin/people?role=speaker and look at the list of migrated users.
Entity properties and fields for user and taxonomy terms entities
For reference, below is a list of the entity properties and fields attached to user and taxonomy terms based on the modules enabled in our example project.
The following are properties and fields in the user entity:
- uid (type: integer): User ID. The user ID.
- uuid (type: uuid): UUID. The user UUID.
- langcode (type: language): Language code. The user language code.
- preferred_langcode (type: language): Preferred language code. The user's preferred language code for receiving emails and viewing the site.
- preferred_admin_langcode (type: language): Preferred admin language code. The user's preferred language code for viewing administration pages.
- name (type: string): Name. The name of this user.
- pass (type: password): Password. The password of this user (hashed).
- mail (type: email): Email. The email of this user.
- timezone (type: string): Timezone. The timezone of this user.
- status (type: boolean): User status. Whether the user is active or blocked.
- created (type: created): Created. The time that the user was created.
- changed (type: changed): Changed. The time that the user was last edited.
- access (type: timestamp): Last access. The time that the user last accessed the site.
- login (type: timestamp): Last login. The time that the user last logged in.
- init (type: email): Initial email. The email address used for initial account creation.
- roles (type: entity_reference): Roles. The roles the user has.
- default_langcode (type: boolean): Default translation. A flag indicating whether this is the default translation.
- field_biography (type: string_long): Field field_biography. Attached to bundle(s): user.
- field_favorite_quote (type: entity_reference_revisions): Field field_favorite_quote. Attached to bundle(s): user.
- field_social_media_links (type: social_links): Field field_social_media_links. Attached to bundle(s): user.
- field_website (type: link): Field field_website. Attached to bundle(s): user.
- user_picture (type: image): Field user_picture. Attached to bundle(s): user.
The following are properties and fields in the taxonomy_term entity:
- tid (type: integer): Term ID. The term ID.
- uuid (type: uuid): UUID. The term UUID.
- revision_id (type: integer): Revision ID.
- langcode (type: language): Language. The term language code.
- vid (type: entity_reference): Vocabulary. The vocabulary to which the term is assigned.
- revision_created (type: created): Revision create time. The time that the current revision was created.
- revision_user (type: entity_reference): Revision user. The user ID of the author of the current revision.
- revision_log_message (type: string_long): Revision log message. Briefly describe the changes you have made.
- status (type: boolean): Published.
- name (type: string): Name.
- description (type: text_long): Description.
- weight (type: integer): Weight. The weight of this term in relation to other terms.
- parent (type: entity_reference): Term Parents. The parents of this term.
- changed (type: changed): Changed. The time that the term was last edited.
- default_langcode (type: boolean): Default translation. A flag indicating whether this is the default translation.
- revision_default (type: boolean): Default revision. A flag indicating whether this was a default revision when it was saved.
- revision_translation_affected (type: boolean): Revision translation affected. Indicates if the last edit of a translation belongs to current revision.
- field_logo (type: image): Field field_logo. Attached to bundle(s): sponsors.
Next time, we’ll update this node-to-user migration to populate the field_favorite_quote and field_social_media_links fields. This will require migrating paragraphs and creating a custom process plugin. Stay tuned.
Image by Anne and Saturnino Miranda from Pixabay