In the previous article, we learned how to migrate paragraphs and create custom process plugins. Good exercise for the brain. Today, we will do some exercises for the body. Get ready for a strength training session — Drupal style — where we will learn about creating custom source plugins, extending existing ones, and writing media migrations from scratch.
Grab a bottle of water and a towel. Let's go!
Entity ID and high water mark considerations
To warm up, we will do some shoulder rolls. While doing so let's revisit entity IDs and high water mark considerations. We covered file migrations back in article 24. Much of what was presented there applies today. The Tag1 Team Talks podcast on file and media migrations also contains lots of useful information for today's topic. To avoid repeating ourselves, we’ll summarize what you need to take into account to prevent entity ID conflicts and choose a high water mark.
You need to know which modules were used in the source site to provide media related functionality. Our example project uses Core's file
and image
modules plus the YouTube Field module to store references to external videos. Other Drupal 7 projects might use the File Entity module, which makes the file entity fieldable and allows it to have bundles similar to Drupal 10 media entities. Still other projects might use the D7 Media module and its vast ecosystem of related modules to provide a rich media management experience.
We bring this up because, depending on how media was implemented in Drupal 7 and the content model you want to pursue in Drupal 10, you will have to look at different tables to determine the new AUTO_INCREMENT value for the media entity. Our Drupal 7 example project is relatively simple, so checking the file_managed table will suffice. We showed how to do this in article 24.
For brevity, we are only going to show how to configure the auto_increment_alter_content_entities setting of the AUTO_INCREMENT Alter module to apply the new AUTO_INCREMENT value. Refer to article 23 for more information on how this module works or how to perform the operation in custom code.
$settings['auto_increment_alter_content_entities'] = [
  // Alter the tables for the media content entity.
  'media' => [350],
];
Now execute the command provided by the AUTO_INCREMENT Alter module to trigger the alter operation in the Drupal 10 project.
ddev drush auto-increment-alter:content-entities
As for the high water property, it will depend on where your data comes from. Today's example includes two different sources:
- Field API tables. As we are going to discuss later, Drupal 7 fields use two tables: one for current revision data and another for past revision data. In theory, the revision_id column could be used as the high water mark. That said, the table for the current revision allows NULL values for the revision_id column, making it less than ideal for our purposes. Many times there will be a value, but its presence is not enforced at the database level. For simplicity, we will not define a high water mark in the migrations that read data directly from field API tables.
- File entity tables. For this, we can use the same high water mark configuration used in the upgrade_d7_file and upgrade_d7_file_private migrations:

source:
  key: migrate
  plugin: tag1_media_image
  high_water_property:
    name: fid
    alias: f
Migrate source plugin
Take some lightweight dumbbells. I’m using 10 pounds, approximately 4.5 kilos. To start, we will do three rounds of bicep curls. In between each round, we will casually talk about migrate source plugins.
Migrate source plugins are responsible for fetching data from one of many supported data repositories. In our example project, all migrations retrieve data from a Drupal 7 database. Other supported sources are: JSON and XML files, CSV files, Excel and LibreOffice files, etc. From a technical point of view, they leverage Drupal's Plugins API.
Below are some highlights regarding their implementation:
- Migrate source plugins are classes in the Drupal\[module]\Plugin\migrate\source namespace that implement the MigrateSourceInterface interface.
- The SourcePluginBase base class is available for convenience. It implements common methods invoked for source plugins.
- Many source plugins implement a prepareRow method that can add, edit, or delete data retrieved from the source. It is possible to use the retrieved data to further query the source and fetch extra information. For example, fieldable entities use this method to attach Drupal 7 Field API data to the entity being fetched. The prepareRow method can also signal that a record should not be processed by returning FALSE.
- If you need to inject a service, the plugin needs to implement the ContainerFactoryPluginInterface interface and its create method. Examples include SqlBase, d7_node, and spreadsheet. Notice that while d7_node does not explicitly implement the interface in its class declaration, it can still override the interface's methods because one of its parent classes implements it.
- For discovery, they still use Doctrine annotations at the time of publishing this article. When this Drupal core issue is committed, they will use PHP attributes for discovery of migrate source plugins.
- Many source plugins related to Drupal 7 to 10 migrations extend the SqlBase class and implement the query method. This lets you alter the SQL query used to retrieve data from the source database. You can add conditions, join tables, expand the list of fields to fetch, and much more. A common use case is limiting the number of records to retrieve. For example, you can decide only published nodes will be migrated to Drupal 10, leaving any unpublished content behind. This will be presented in the next article.
- The Migrate Drupal module offers a lot of functionality related to migrating data from Drupal 6 and 7. It even provides a way to introspect the current Drupal 10 installation to fetch content and configuration data via the content_entity and config source plugins respectively.
- The plugin.manager.migrate.source service is the plugin manager for migrate source plugins. You can use it to get a list of available source plugins and obtain more details about their definition.
# List of migrate source plugin ids.
ddev drush php:eval "print_r(array_keys(\Drupal::service('plugin.manager.migrate.source')->getDefinitions()));"
# Details on a specific migrate source plugin id.
ddev drush php:eval "print_r(\Drupal::service('plugin.manager.migrate.source')->getDefinitions()['SOURCE_PLUGIN_ID']);"
Creating custom source plugins from scratch
Put those dumbbells aside for a moment, because we are going to do bodyweight exercise. Let's do three rounds of push-ups, 45 seconds each. Go at your own pace and feel free to kneel if necessary. When done, come back to learn how to create custom source plugins from scratch.
Per our migration plan, we want to migrate data stored in YouTube fields in Drupal 7 as remote video media entities in Drupal 10. In practice, there is only one field that would undergo this transformation: field_video_recording
. The YouTube field module is available for Drupal 10 and provides an automated upgrade path via a field migration plugin. That said, reading field data to create media entities is a great example to learn how to create a custom source plugin and incorporate content model changes as part of the upgrade process.
Back in article 16, we had a deep dive into understanding how Drupal fields work, both from the perspective of PHP code and database tables. I also recommend watching this presentation that covers multiple examples of how to perform content model changes when migrating from Drupal 7.
In Drupal 7, YouTube fields have two sub-fields: input and video_id. Input stores the URL as submitted by the user. Example values are:
- https://www.youtube.com/watch?v=HMYpxm-2o4c
- https://youtu.be/HMYpxm-2o4c
Notice that both URLs link to the same video, but follow a different URL pattern. The video_id sub-field stores the output of the youtube_get_video_id function, which extracts the video ID from multiple accepted URL formats as defined in the module's README.txt file. For these two video URLs, the video_id value would be the same: HMYpxm-2o4c. When migrating data into Drupal 10 we will deduplicate records like these. Namely, we will only migrate unique video ID values.
In Drupal 10, a remote video media entity uses a plain text field to store the URL to the video: field_media_oembed_video
. Plain text fields have a single value
sub-field. Our custom source plugin will fetch the video ID and create a valid YouTube URL that can be assigned to the text field in the remote video media entity.
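To make the deduplication idea concrete, here is a minimal sketch in plain PHP that runs outside of Drupal. The helper names tag1_example_extract_video_id and tag1_example_build_url are hypothetical, and the regular expression only handles the two URL patterns shown above. It is not the logic of the youtube_get_video_id function, which accepts more formats.

```php
<?php

/**
 * Hypothetical helper: extracts the video ID from the two URL patterns shown
 * in this article. The real youtube_get_video_id() accepts more formats.
 */
function tag1_example_extract_video_id(string $url): ?string {
  $pattern = '~(?:youtube\.com/watch\?v=|youtu\.be/)([A-Za-z0-9_-]+)~';
  return preg_match($pattern, $url, $matches) === 1 ? $matches[1] : NULL;
}

/**
 * Hypothetical helper: builds a watch URL by prepending a prefix to the ID.
 */
function tag1_example_build_url(string $video_id): string {
  return 'https://www.youtube.com/watch?v=' . $video_id;
}

$inputs = [
  'https://www.youtube.com/watch?v=HMYpxm-2o4c',
  'https://youtu.be/HMYpxm-2o4c',
];

// Both inputs resolve to the same ID, so only one unique value remains.
$video_ids = array_values(array_unique(array_filter(array_map('tag1_example_extract_video_id', $inputs))));
$video_urls = array_map('tag1_example_build_url', $video_ids);

print_r($video_urls);
```

In the actual migration, Drupal 7 already stores the extracted ID in the video_id column, so the source plugin only needs the deduplication and URL-building steps.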
Create a PHP file named Tag1YouTubeField.php
inside our tag1_migration
custom module's /src/Plugin/migrate/source
folder. The path relative to the Drupal 10's project docroot is web/modules/custom/tag1_migration/src/Plugin/migrate/source/Tag1YouTubeField.php
. The content of the file should be:
<?php

namespace Drupal\tag1_migration\Plugin\migrate\source;

use Drupal\Component\Plugin\Exception\InvalidPluginDefinitionException;
use Drupal\Core\State\StateInterface;
use Drupal\migrate\Plugin\migrate\source\SqlBase;
use Drupal\migrate\Plugin\MigrationInterface;
use Drupal\migrate\Row;

/**
 * Drupal 7 YouTube Field data.
 *
 * Available configuration keys:
 * - name: The name of the YouTube field.
 * - revisions: (optional) If TRUE, retrieve field revisions. Defaults to FALSE.
 *
 * For additional configuration keys, refer to the parent classes.
 *
 * @see https://udrupal.com/migrate-source-plugins
 *
 * Example:
 *
 * @code
 * source:
 *   plugin: tag1_youtube_field
 *   name: field_video
 *   revisions: TRUE
 * @endcode
 *
 * @see \Drupal\migrate\Plugin\migrate\source\SqlBase
 * @see \Drupal\migrate_drupal\Plugin\migrate\source\DrupalSqlBase
 * @see \Drupal\migrate_plus\Plugin\migrate\source\Table
 * @see \Drupal\migrate_source_csv\Plugin\migrate\source\CSV
 * @see \Drupal\media_migration\Plugin\migrate\source\VideoEmbedField
 *
 * @MigrateSource(
 *   id = "tag1_youtube_field",
 *   source_module = "youtube"
 * )
 */
class Tag1YouTubeField extends SqlBase {

  /**
   * YouTube field table.
   *
   * @var string
   */
  protected string $tableName;

  /**
   * Column name storing the YouTube video ID.
   *
   * @var string
   */
  protected string $videoIdColumnName;

  /**
   * YouTube URL prefix.
   *
   * @var string
   */
  protected string $urlPrefix = 'https://www.youtube.com/watch?v=';

  /**
   * {@inheritdoc}
   */
  public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration, StateInterface $state) {
    if (empty($configuration['name'])) {
      throw new \InvalidArgumentException("The tag1_youtube_field source plugin is missing the 'name' configuration.");
    }

    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration, $state);

    // The 'revisions' key is optional, so avoid reading an undefined key.
    // @see \Drupal\migrate_drupal\Plugin\migrate\source\d7\FieldableEntity::getFieldValues()
    $this->tableName = (!empty($configuration['revisions']) ? 'field_revision_' : 'field_data_') . $configuration['name'];

    // Retrieve the 'video_id' sub-field.
    // @see youtube_field_schema() in Drupal 7's youtube.install file.
    // @see youtube_get_video_id() in Drupal 7's youtube.inc file.
    $this->videoIdColumnName = $configuration['name'] . '_video_id';

    if (!$this->getDatabase()->schema()->tableExists($this->tableName)) {
      throw new InvalidPluginDefinitionException($plugin_id, "Source database table '{$this->tableName}' does not exist.");
    }
  }

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = $this->select($this->tableName, 'yt')
      ->distinct();
    $query->addField('yt', $this->videoIdColumnName, 'video_id');

    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function prepareRow(Row $row) {
    $video_id = $row->getSourceProperty('video_id');
    $row->setSourceProperty('video_url', $this->urlPrefix . $video_id);

    // Let the parent class run its own row preparation and report skips.
    return parent::prepareRow($row);
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    return [
      'video_id' => $this->t('YouTube video id.'),
      'video_url' => $this->t('YouTube video URL.'),
    ];
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    $ids['video_id']['type'] = 'string';
    $ids['video_id']['alias'] = 'yt';

    return $ids;
  }

}
Technical note: When the Core's Migrate Drupal
module is enabled, it reads the value of source_module
in the annotation and checks if the listed module is enabled in the source site. Because our custom source plugin reads data from the YouTube field module, we use its machine name youtube
. If the module specified in source_module
is not enabled, any migration using that source plugin will be filtered out and will not appear in the list of available migrations that can be executed. This is done with a combination of the enforce_source_module_tags
configuration in migrate_drupal.settings.yml
and tags being added to the migration_tags
section of a migration definition file. By default, Drupal 7
is one of the migration tags to enforce and it is included in the generated migrations.
A detailed review of the PHP code for the custom source plugin will be left as an exercise to the curious reader. What we want to highlight is that knowing the structure of field API tables is key to accomplishing our goal. First, we need to determine if we want to retrieve data from the current revision only or include all past revisions data. In Drupal 7, every field creates two tables based on its machine name: field_data_FIELD_NAME and field_revision_FIELD_NAME.
Our source plugin exposes two settings:
- name: Required. A string indicating the name of the YouTube field in Drupal 7 to fetch data from. In our example that would be: field_video_recording.
- revisions: Optional. A boolean indicating whether revision data should be retrieved or not. Defaults to FALSE. In our example, we will not migrate revisions data.
With this information, the source plugin can figure out which Drupal 7 table to query: field_data_field_video_recording. But what about its structure? We need to find out the column name that stores the video ID data. From the root of the Drupal 7 project, execute ddev mysql to open the MySQL command-line client. Then execute the following statement at the SQL prompt: DESCRIBE field_data_field_video_recording;
You will get an output similar to the following:
+--------------------------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------------+------------------+------+-----+---------+-------+
| entity_type | varchar(128) | NO | PRI | | |
| bundle | varchar(128) | NO | MUL | | |
| deleted | tinyint(4) | NO | PRI | 0 | |
| entity_id | int(10) unsigned | NO | PRI | NULL | |
| revision_id | int(10) unsigned | YES | MUL | NULL | |
| language | varchar(32) | NO | PRI | | |
| delta | int(10) unsigned | NO | PRI | NULL | |
| field_video_recording_input | varchar(1024) | YES | | NULL | |
| field_video_recording_video_id | varchar(15) | YES | MUL | NULL | |
+--------------------------------+------------------+------+-----+---------+-------+
Drupal 7 field API creates tables with a common structure. The following columns are present for all fields no matter their type: entity_type
, bundle
, deleted
, entity_id
, revision_id
, language
, and delta
. Then, there will be one column for each sub-field as defined by the field type. In this case, we have field_video_recording_input
and field_video_recording_video_id
. Notice that the machine name of the field is prepended to the sub-field name with an underscore in between. The columns are the same for the field_data_FIELD_NAME
and field_revision_FIELD_NAME
tables. They differ in which columns are nullable and which act as primary keys.
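The naming convention can be sketched in a few lines of plain PHP. The function tag1_example_field_storage_names is a hypothetical helper written for this article, not part of Drupal. It only assumes the convention described above: a field_data_ or field_revision_ table prefix, and column names built from the field machine name, an underscore, and the sub-field name.

```php
<?php

/**
 * Hypothetical helper mirroring the Drupal 7 field storage naming convention.
 */
function tag1_example_field_storage_names(string $field_name, array $sub_fields, bool $revisions = FALSE): array {
  $table = ($revisions ? 'field_revision_' : 'field_data_') . $field_name;
  $columns = [];
  foreach ($sub_fields as $sub_field) {
    // The field machine name is prepended to the sub-field name.
    $columns[$sub_field] = $field_name . '_' . $sub_field;
  }
  return ['table' => $table, 'columns' => $columns];
}

$names = tag1_example_field_storage_names('field_video_recording', ['input', 'video_id']);
print_r($names);
```

Our custom source plugin applies the same two rules in its constructor to compute the table to query and the column holding the video ID.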
In its query method, our custom source plugin builds a query to retrieve unique video ID values from current revision data of the field_video_recording field:
SELECT DISTINCT field_video_recording_video_id FROM field_data_field_video_recording;
Then, in its prepareRow
method, the custom source plugin reads the retrieved video_id
value and creates a new source property named video_url
with a valid YouTube URL. This can later be used in our migration to populate the field_media_oembed_video
field for the remote video media entity.
Speaking of which, let's write that migration.
From Drupal 7 YouTube fields to Drupal 10 remote video media entities
Remember to stay hydrated during today's routine. Next, let's pick up those dumbbells for three rounds of triceps kickbacks. When completed, we will learn how to write a migration to create remote media entities from scratch.
Before doing so, it is important to understand what entities will be created, their base field definitions, and the fields attached to each bundle. We covered this in great detail in article 26. Additionally, this article has a reference of base field definitions for Drupal 10 media entities. In short, we will create entities of type media
of the remote_video
bundle and populate its field_media_oembed_video
plain text field with the video_url
column retrieved by our custom source plugin.
Create an upgrade_d7_media_remote_video.yml
file in the web/modules/custom/tag1_migration/migrations
folder of our Drupal 10 project. We will use that same file name, without the file extension, as the migration ID. You can come up with any name as long as it is unique. Below is the content of the file:
id: upgrade_d7_media_remote_video
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
  - media
  - tag1_content
label: 'Media (Remote video)'
source:
  key: migrate
  plugin: tag1_youtube_field
  name: field_video_recording
  revisions: FALSE
process:
  field_media_oembed_video: video_url
  status:
    plugin: default_value
    default_value: 1
  uid:
    plugin: default_value
    default_value: 1
  langcode:
    plugin: default_value
    default_value: en
destination:
  plugin: 'entity:media'
  default_bundle: remote_video
migration_dependencies:
  required: { }
  optional: { }
Now, rebuild caches for our new migration to be detected and trigger an import operation.
ddev drush cache:rebuild
ddev drush migrate:status upgrade_d7_media_remote_video
ddev drush migrate:import upgrade_d7_media_remote_video
Note: The first time you execute the upgrade_d7_media_remote_video
migration, you might perceive that it runs slower than other migrations we have executed recently. Out of the box, when Drupal creates media entities, it tries to create thumbnails that serve as a preview when displaying media entities in listing pages. The thumbnail generation could be a slow process. Fortunately, you can configure media types to queue the thumbnail generation at a later time, speeding up the execution of the migration. This is configured on a per bundle basis. To enable this feature for remote videos go to https://migration-drupal10.ddev.site/admin/structure/media/manage/remote_video
and enable the Queue thumbnail downloads
option inside the Publishing options
section. Thumbnail generation will now happen via a queue worker on cron execution.
If things are properly configured, you should not get any errors. Go to https://migration-drupal10.ddev.site/admin/content/media?type=remote_video
and you shall be presented with multiple videos from our Tag1 Team Talks podcast series.
Note that in our upgrade_d7_media_remote_video
migration, we are calling the tag1_youtube_field
. This should match the id
of the MigrateSource
PHP annotation used in our custom source plugin. Then, we set the name
configuration to the Drupal 7 field name to fetch data from field_video_recording
. Finally, we set the revisions
configuration to FALSE
— meaning we only want to migrate data from current revisions. We could have opted for not including the revisions
configuration (because the plugin assumes a FALSE
default), but we decided to include it for completeness.
As it stands, our custom source plugin can read from one Drupal 7 field at a time. It is possible to write more advanced queries to fetch data from multiple fields in one go. The Media Migration module provides many source plugins with advanced queries and logic in their prepareRow implementations. For reference, take a look at the YoutubeFieldSource and VideoEmbedField source plugins which retrieve data from the YouTube field and Video Embed Field Drupal 7 modules respectively. These source plugins build dynamic queries that leverage UNION clauses to fetch data from multiple tables at a time.
The rest of our migration is relatively straightforward. We use the entity:media
destination plugin to create media entities of type remote_video
. In the process
section, we assign the video_url
fetched by our source plugin to the field_media_oembed_video
plain text field. The remaining assignments in the process section are providing default values for the status
, uid
, and langcode
base field definitions.
Extending existing source plugins
Catch your breath, and let's go down to the mat again. Use your dumbbells for three rounds of chest presses. When finished, I will teach you how to extend existing source plugins.
Remember that per our migration plan, we also want to replace some of the image fields attached to content types for media reference fields. To accomplish this, we need to migrate file entities from Drupal 7 into media entities in Drupal 10. As covered in article 24, image fields in Drupal 7 are an extension of file fields. The data for both is stored in the file_managed
table. In a stock Drupal 7 installation, an example record for this table would be:
*************************** 1. row ***************************
fid: 1
uid: 1
filename: druplicon.jpg
uri: public://article/druplicon.jpg
filemime: image/jpeg
filesize: 2890
status: 1
timestamp: 280299600
Technical note: If you are using the 7.x-2.x
branch of the File Entity module in Drupal 7, it alters the file_managed
table to add an extra column: type
. For the record above, the value for the type
column would be image
, which is one of the default file types provided by the module. The module also uses a queue triggered on cron to update the value of the new column based on the MIME type of the file.
We already used the d7_file plugin to migrate public and private files in article 24. Today, we are going to extend this core plugin and alter its query to only retrieve images. Namely, we will use the filemime
column in the file_managed
table to limit the records to retrieve. If our project used the File Entity module, we could use the extra type
column for filtering purposes.
Create a PHP file named Tag1MediaImage.php
inside our tag1_migration
custom module's /src/Plugin/migrate/source
folder. The path relative to the Drupal 10's project docroot is web/modules/custom/tag1_migration/src/Plugin/migrate/source/Tag1MediaImage.php
. The content of the file should be:
<?php

namespace Drupal\tag1_migration\Plugin\migrate\source;

use Drupal\file\Plugin\migrate\source\d7\File;

/**
 * Retrieve permanent images.
 *
 * @see \Drupal\file\Plugin\migrate\source\d7\File
 *
 * @MigrateSource(
 *   id = "tag1_media_image",
 *   source_module = "file"
 * )
 */
class Tag1MediaImage extends File {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = parent::query();
    // Only retrieve files whose MIME type starts with 'image/'.
    $query->condition('f.filemime', 'image/%', 'LIKE');
    // Only retrieve permanent files.
    $query->condition('f.status', 1);

    return $query;
  }

}
Note: Source plugins that only alter the parent's query might no longer be necessary once this core issue lands.
A detailed review of the PHP code for the custom source plugin will be left as an exercise to the curious reader. You do not need to do much to create a custom source plugin that extends an existing one. In this case, we are only overriding the query method to add two conditions:
- That the filemime matches the pattern image/%. In SQL, the % character serves as a wildcard for pattern matching when paired with the LIKE operator.
- That the status is 1, which means the file is permanent.
Please note that we are adding conditions to the parent query provided by d7_file
. So, you need to review that source plugin's query
implementation to see how the initial query is built. At the risk of stating the obvious, by extending the Drupal\file\Plugin\migrate\source\d7\File
class, we also inherit all the other methods defined in that class and up in its parent class hierarchy. The importance of this will be evident when writing the migration for creating image media entities.
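To visualize which records pass the two added conditions, here is a small sketch in plain PHP that imitates them on an in-memory array. Only the first sample row comes from this article. The other two are made up for illustration. The sketch does not model the conditions inherited from the parent query, and str_starts_with is case-sensitive while LIKE may not be, depending on the database collation.

```php
<?php

// Sample rows shaped like Drupal 7 file_managed records. Rows 2 and 3 are
// hypothetical and exist only to show records that get filtered out.
$rows = [
  ['fid' => 1, 'filename' => 'druplicon.jpg', 'filemime' => 'image/jpeg', 'status' => 1],
  ['fid' => 2, 'filename' => 'example.pdf', 'filemime' => 'application/pdf', 'status' => 1],
  ['fid' => 3, 'filename' => 'example.png', 'filemime' => 'image/png', 'status' => 0],
];

// Rough equivalent of: filemime LIKE 'image/%' AND status = 1.
$images = array_filter($rows, function (array $row): bool {
  return str_starts_with($row['filemime'], 'image/') && (int) $row['status'] === 1;
});

print_r(array_column($images, 'filename'));
```

The PDF is excluded by the MIME type condition and the PNG by the status condition, leaving only the permanent JPEG.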
Before switching focus to writing a media migration, consider how a custom source plugin that extends d7_file would look if the Drupal 7 site used the File Entity module:
<?php

namespace Drupal\tag1_migration\Plugin\migrate\source;

use Drupal\file\Plugin\migrate\source\d7\File;

/**
 * Retrieve permanent files optionally filtered by file type bundles.
 *
 * Available configuration keys:
 * - type: Only retrieve files matching the specified file type bundles. Can be
 *   set to a string or an array. If not declared then files of all file types
 *   bundles will be retrieved.
 *
 * @see \Drupal\file\Plugin\migrate\source\d7\File
 * @see file_entity_file_default_types() in Drupal 7's file_entity.module file.
 *
 * @MigrateSource(
 *   id = "tag1_d7_file",
 *   source_module = "file_entity"
 * )
 */
class Tag1File extends File {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = parent::query();

    // Only migrate permanent files.
    $query->condition('f.status', 1);

    // Filter by file type bundle, if configured.
    if (isset($this->configuration['type'])) {
      $query->condition('f.type', (array) $this->configuration['type'], 'IN');
    }

    return $query;
  }

}
The tag1_d7_file source plugin above can accept an optional type configuration to only retrieve files matching the specified file types. Then, you can use the plugin as follows:

source:
  key: migrate
  plugin: tag1_d7_file
  type: image
Note to my future self: Thanks for providing the code for a source plugin I can copy/paste in the next migration project!
From Drupal 7 image fields to Drupal 10 image media entities
Don’t give up just yet. We have one more exercise to go. Take your dumbbells for three rounds of overhead presses. Afterward we’ll write another migration by hand.
Time to import Drupal 7 images as media entities in Drupal 10. Before doing so, you need to understand what entities will be created, their base field definitions, and the fields attached to each bundle. We covered this in great detail in article 26. Additionally, this article has a reference of base field definitions for Drupal 10 media entities. In short, we will create entities of type media
of the image
bundle and populate its field_media_image
image field with a reference from previously migrated files.
Create an upgrade_d7_media_image.yml
file in the web/modules/custom/tag1_migration/migrations
folder of our Drupal 10 project. We will use that same file name, without the file extension, as the migration ID. You can come up with any name as long as it is unique. Below is the content of the file:
id: upgrade_d7_media_image
class: Drupal\migrate\Plugin\Migration
migration_tags:
  - 'Drupal 7'
  - Content
  - media
  - tag1_content
label: 'Media (Image)'
source:
  key: migrate
  plugin: tag1_media_image
  constants:
    source_base_path: NULL
  high_water_property:
    name: fid
    alias: f
process:
  name: filename
  status: status
  created: timestamp
  changed: timestamp
  field_media_image/alt: filename
  field_media_image/target_id:
    - plugin: migration_lookup
      source: fid
      migration: upgrade_d7_file
      no_stub: true
    - plugin: skip_on_empty
      method: row
      message: 'The file was not found.'
  uid:
    - plugin: migration_lookup
      source: uid
      migration: upgrade_d7_user
      no_stub: true
    - plugin: default_value
      default_value: 1
  langcode:
    plugin: default_value
    default_value: en
destination:
  plugin: 'entity:media'
  default_bundle: image
migration_dependencies:
  required:
    - upgrade_d7_file
    - upgrade_d7_user
  optional: { }
Now, rebuild caches for our new migration to be detected and trigger an import operation.
ddev drush cache:rebuild
ddev drush migrate:status upgrade_d7_media_image
ddev drush migrate:import upgrade_d7_media_image
If things are properly configured, you should not get any errors. Go to https://migration-drupal10.ddev.site/admin/content/media?type=image
and you will be presented with beautiful images created by Drupal 7's Devel Generate module.
Note that in our upgrade_d7_media_image
migration, we are calling the tag1_media_image
. This should match the id
of the MigrateSource
PHP annotation used in our custom source plugin. We recycled the configuration for high_water_property
from the upgrade_d7_file
.
What is that strange source_base_path
constant with a value of NULL
? If you do not include it, you will get the following warning for each record retrieved by our custom source plugin: Undefined array key "constants" File.php:105
. Remember what we said earlier? By extending a source plugin, you inherit all its methods. Well, that warning comes from implementation of the prepareRow
method in the d7_file
source plugin. That plugin assumes a source constant named source_base_path exists for setting the filepath
of files to import. Our media migration makes no use of such source property, but the expectation that the constant exists is very much present. So, to suppress the warning, we can provide the source constant and set it to a NULL
value.
We use the entity:media
destination plugin to create media entities of type image
. In the migration_dependencies
section, we add upgrade_d7_file
and upgrade_d7_user
as required dependencies, because we will perform migration lookups against them. The process section contains mapping for multiple base field definitions, which we hope will be straightforward to understand now. The key part of this migration is populating the field_media_image
field.
In Drupal 10, image fields are entity reference fields. As noted in this article, this field type has five sub-fields:
- target_id: An integer representing the ID of the target entity.
- alt: Alternative image text, for the image's alt attribute.
- title: Image title text, for the image's title attribute.
- width: The width of the image in pixels.
- height: The height of the image in pixels.
Our source data does not contain information for width
and height
. When not specified, the ImageItem class responsible for providing image fields loads the file to determine its dimensions. If available, setting the value for these sub-fields will yield a performance boost, because there will be no need to calculate the image's dimensions.
While not ideal, we are using the filename as the alt
attribute. For title
, we are not even going to pretend to have a sensible value to use. Maybe a good compromise would be generating image descriptions and alt-text with AI like Drupal's project founder Dries Buytaert considered doing for his website.
Finally, the most important sub-field to set for any entity reference field: target_id
. Let's revisit the process pipeline to set it:
process:
  field_media_image/target_id:
    - plugin: migration_lookup
      source: fid
      migration: upgrade_d7_file
      no_stub: true
    - plugin: skip_on_empty
      method: row
      message: 'The file was not found.'
For image fields, target_id should be set to the referenced file ID. Our source plugin provides the fid source property that can be used for this purpose. It might be tempting to copy fid directly into field_media_image/target_id. But that neglects the fact that records can exist in Drupal 7's file_managed table while the corresponding files are missing on disk. We will not speculate about what would cause such a situation, but I have seen this many times in real-life projects. So, to safeguard us from creating invalid media entities that point to non-existent files, we perform a migration_lookup
against the upgrade_d7_file
migration. If that migration was not able to retrieve the file from Drupal 7 for any reason, we leverage the skip_on_empty
process plugin to bail out from creating the media entity.
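The effect of that two-step pipeline can be imitated in plain PHP. This is only a sketch of the idea, not how the Migrate API implements it: the $file_id_map array and the tag1_example_resolve_target_id function are hypothetical stand-ins for the ID map of the file migration and for the migration_lookup and skip_on_empty plugins.

```php
<?php

// Hypothetical ID map of the file migration: Drupal 7 fid => Drupal 10 fid.
// File 3 has no entry because it could not be copied from the source site.
$file_id_map = [1 => 1, 2 => 2];

/**
 * Imitates the pipeline: look up the file, signal a skip when nothing is found.
 */
function tag1_example_resolve_target_id(int $source_fid, array $id_map): ?int {
  // Step 1 (migration_lookup with no_stub): NULL when there is no match.
  $target_id = $id_map[$source_fid] ?? NULL;
  // Step 2 (skip_on_empty, method: row): an empty value means skip the row.
  return empty($target_id) ? NULL : $target_id;
}

$results = [];
foreach ([1, 2, 3] as $fid) {
  $target_id = tag1_example_resolve_target_id($fid, $file_id_map);
  $results[$fid] = $target_id === NULL
    ? 'skipped: The file was not found.'
    : "media entity references file $target_id";
}

print_r($results);
```

Copying fid directly would have produced a media entity for file 3 as well, pointing at a file that does not exist in Drupal 10.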
If the above sounds familiar, it is because a similar technique is used to connect paragraphs to their host entities as explained in the previous article. The difference is that paragraph fields are of type entity reference revisions and require setting the target_revision_id
sub-field in addition to target_id
. Fields of type entity reference — like files, images, taxonomy term references, user references, etc. — only require setting the target_id
sub-field.
Time to cool down after an intense upper body exercise routine. Those migration muscles might be aching, but hey... no pain no gain!
Now you know how to create custom source plugins, extend existing ones, and use them to create media entities. In the next article, we will work our lower body... I mean, we will work on migrating yet another entity: nodes. We might or might not create one more source plugin. Only time will tell.
Image by PayPal.me/FelixMittermeier from Pixabay