DOCS File migration changes for 4.4.0 (#8910)

* DOCS File migration changes for 4.4.0

See https://github.com/silverstripe/silverstripe-versioned/issues/177

* Update docs/en/02_Developer_Guides/14_Files/03_File_Security.md

Co-Authored-By: chillu <ingo@silverstripe.com>

* Corrected statements on archived/versioned files

* Corrected statement on filesystem paths of protected vs. public

* Update docs/en/02_Developer_Guides/14_Files/03_File_Security.md

Co-Authored-By: chillu <ingo@silverstripe.com>

* Clarify redirect behaviour
Ingo Schommer 2019-04-24 14:00:48 +12:00 committed by Aaron Carlino
parent b5b4014280
commit 15396116e5
8 changed files with 266 additions and 83 deletions


@ -66,7 +66,7 @@ of a page with a shortcode image:
```html
<p>Welcome to SilverStripe! This is the default homepage.</p>
<p>[image src="/assets/e43fb87dc0/12824172.jpeg" id="27" width="400" height="400" class="leftAlone ss-htmleditorfield-file image" title="My Image"]</p>
<p>[image src="/assets/12824172.jpeg" id="27" width="400" height="400" class="leftAlone ss-htmleditorfield-file image" title="My Image"]</p>
```
File shortcodes have the following properties:
@ -137,7 +137,7 @@ SilverStripe\Assets\File:
## Modifying files
In order to move or rename a file you can simply update the Name property, or assign the ParentID to a new
In order to move or rename a file you can simply update the `Name` property, or assign the `ParentID` to a new
folder. Please note that these modifications are made simply on the draft stage, and will not be copied
to live until a publish is made (either on this object, or cascading from a parent).


@ -210,6 +210,12 @@ SilverStripe\Core\Injector\Injector:
- { driver: imagick }
```
## Storage
Manipulated images are stored as "file variants" in the same
folder structure as the original image. The storage mechanism is described
in the ["File Storage" guide](file_storage).
## API Documentation
* [File](api:SilverStripe\Assets\File)


@ -9,8 +9,8 @@ in your system. As pages and dataobjects can be either versioned, or restricted
members, it is necessary at times to apply similar logic to any files which are attached to these objects
in the same way.
Out of the box SilverStripe Framework comes with an asset storage mechanism with two stores, a public
store and a protected one. Most operations which act on assets work independently of this mechanism,
Out of the box, SilverStripe comes with two asset stores: a public and a protected one.
Most operations which act on assets work independently of this mechanism,
without having to consider whether any specific file is protected or public, but can normally be
instructed to favour private or protected stores in some cases.
@ -20,7 +20,7 @@ config option:
```php
$store = singleton(AssetStore::class);
$store->setFromString('My protected content', 'Documents/Mydocument.txt', null, null, [
$store->setFromString('My protected content', 'my-folder/my-file.jpg', null, null, [
'visibility' => AssetStore::VISIBILITY_PROTECTED
]);
```
@ -55,11 +55,9 @@ authorised by this code.
In order to ensure protected assets are not leaked publicly, but are properly whitelisted for
authorised users, the following should be considered:
* Caching mechanisms which prevent `$URL` being invoked for the user's request (such as `$URL` within a
partial cache block) will not whitelist those files automatically. You can manually whitelist a
file via PHP for the current user instead, by using the following code to grant access.
Caching mechanisms which prevent `$URL` being invoked for the user's request (such as `$URL` within a
partial cache block) will not whitelist those files automatically. You can manually whitelist a
file via PHP for the current user instead, by using the following code to grant access.
```php
use SilverStripe\CMS\Controllers\ContentController;
@ -79,9 +77,9 @@ class PageController extends ContentController
}
```
* If a user does not have access to a file, you can still generate the URL but suppress the default
permission whitelist by invoking the getter as a method, but pass in a falsey value as a parameter.
(or '0' in template as a workaround for all parameters being cast as string)
If a user does not have access to a file, you can still generate the URL but suppress the default
permission whitelist by invoking the getter as a method and passing in a falsey value as a parameter
(or '0' in a template, as a workaround for all parameters being cast as strings).
```php
@ -91,8 +89,8 @@ class PageController extends ContentController
<% else %>
```
* Alternatively, if a user has already been granted access, you can explicitly revoke their access using
the `revokeFile` method.
Alternatively, if a user has already been granted access, you can explicitly revoke their access using
the `revokeFile` method.
```php
use SilverStripe\CMS\Controllers\ContentController;
@ -151,8 +149,9 @@ a single entity for access control, so specific variants cannot be individually
## How file access is protected
Public urls to files do not change, regardless of whether the file is protected or public. Similarly,
operations which modify files do not normally need to be told whether the file is protected or public
Filesystem paths can change depending on whether the file is protected or public,
but its public URL stays the same. You just need to use SilverStripe's APIs to generate URLs to those files.
Similarly, operations which modify files do not normally need to be told whether the file is protected or public
either. This provides a consistent method for interacting with files.
In day to day operation, moving assets to or between either of these stores does not normally
@ -168,25 +167,23 @@ Internally your folder structure would look something like:
```
assets/
.htaccess
OldCompanyLogo.gif
.protected/
.htaccess
a870de278b/
NewCompanyLogo.gif
33be1b95cb/
OldCompanyLogo.gif
```
The urls for these two files, however, do not reflect the physical structure directly.
* `http://www.example.com/assets/33be1b95cb/OldCompanyLogo.gif` will be served directly from the web server,
and will not invoke a php request.
* `http://www.example.com/assets/a870de278b/NewCompanyLogo.gif` will be routed via a 404 handler to PHP,
* The public file at `http://www.example.com/assets/OldCompanyLogo.gif` will be served directly from the web server,
and will not invoke a PHP request.
* The protected file at `http://www.example.com/assets/a870de278b/NewCompanyLogo.gif` will be routed via a 404 handler to PHP,
which will be passed to the [ProtectedFileController](api:SilverStripe\Assets\Storage\ProtectedFileController) controller, which will serve
up the content of the hidden file, conditional on a permission check.
When the file `NewCompanyLogo.gif` is made public, the url will not change, but the file location
will be moved to `assets/a870de278b/NewCompanyLogo.gif`, and will be served directly via
When the file `NewCompanyLogo.gif` is made public, the file
will be moved to `assets/NewCompanyLogo.gif`, and will be served directly via
the web server, bypassing the need for additional PHP requests.
```php
@ -200,13 +197,11 @@ After this the filesystem will now look like below:
```
assets/
.htaccess
NewCompanyLogo.gif
.protected/
.htaccess
33be1b95cb/
OldCompanyLogo.gif
a870de278b/
NewCompanyLogo.gif
33be1b95cb/
OldCompanyLogo.gif
```
## Performance considerations
@ -346,9 +341,9 @@ RewriteRule .* ../index.php [QSA]
```
You will need to ensure that your core apache configuration has the necessary `AllowOverride`
settings to support the local .htaccess file.
settings to support the local `.htaccess` file.
Although assets have a 404 handler which routes to a PHP handler, .php files within assets itself
Although assets have a 404 handler which routes to a PHP handler, `.php` files within assets itself
should not be allowed to be marked as executable.
When securing your server you should ensure that you protect against both files that can be uploaded as


@ -53,32 +53,114 @@ storage backend to determine how variants are managed.
Note that the storage backend used will not be automatically synchronised with the database. Only files which
are loaded into the backend through the asset API will be available for use within a site.
## File paths and url mapping
## Public file paths
The hash, name, and filename are combined in order to build the physical location on disk.
For instance, this is a typical disk content:
Public files are published either directly through the "Assets" CMS UI,
or indirectly as part of a [versioned ownership structure](/developer_guides/model/versioning).
They are stored as you'd expect on the filesystem: In their folder, by their file name.
```
assets/
Uploads/
b63923d8d4/
BannerHeader.jpg
BannerHeader__FitWzYwLDYwXQ.jpg
my-public-folder/
my-public-file.jpg
```
The URL for this file will match the physical location on disk:
`http://www.example.com/assets/my-public-folder/my-public-file.jpg`.
## Variant file paths (e.g. resized images)
Each file can have variants, most commonly resized versions of an image.
These can be generated by resizing an image in the CMS rich text editor,
through template logic, or programmatically with PHP.
They are stored in the same folder alongside the original file,
but contain a special variant suffix.
```
assets/
my-public-folder/
my-public-file.jpg
my-public-file__FitWzYwLDYwXQ.jpg
```
The URL for this file will match the physical location on disk:
`http://www.example.com/assets/my-public-folder/my-public-file__FitWzYwLDYwXQ.jpg`.
Note: Before 4.4.0, public files were stored with a content hash by default
(see [Protected file paths](#protected-file-paths)).
## Protected file paths {#protected-file-paths}
Uploaded files are protected by default, which puts them in a "draft" mode
that requires permissions to view them. Files can also be published
with restricted access, in which case they stay protected. In either case, they're stored in a special `assets/.protected` folder,
inside a subfolder named after the truncated hash of the file's content.
```
assets/
my-public-folder/
my-public-file.jpg
my-public-file__FitWzYwLDYwXQ.jpg
.protected/
my-protected-folder/
b63923d8d4/
my-protected-file.jpg
my-protected-file__FitWzYwLDYwXQ.jpg
```
This corresponds to a file with the following properties:
- Filename: Uploads/BannerHeader.jpg
- Filename: my-protected-folder/my-protected-file-hash/my-protected-file.jpg
- Hash: b63923d8d4089c9da16fbcbcdfef3e1b24806334 (trimmed to first 10 chars)
- Variant: FitWzYwLDYwXQ (corresponds to Fit[60,60])
- Variant: FitWzYwLDYwXQ (corresponds to `Fit[60,60]`)
The URL for this file will match the physical location on disk:
`http://www.example.com/assets/Uploads/b63923d8d4/BannerHeader__FitWzYwLDYwXQ.jpg`.
The URL for this file will not match the physical location on disk.
It leaves out the `.protected/` folder, and leaves that to SilverStripe's integrated routing:
`http://www.example.com/assets/my-protected-folder/b63923d8d4/my-protected-file.jpg`.
For more information on how protected files are stored see the [file security](/developer_guides/files/file_security)
section.
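As a rough illustration of the layout above, the following sketch rebuilds the on-disk path from a filename, a content hash and an optional variant. `sketchAssetPath()` is a hypothetical helper written for this example, not part of the SilverStripe API, and it assumes the hash folder is always the first 10 characters of the content hash.

```php
// Hypothetical helper for illustration only; not a SilverStripe API.
// Assumes protected files live under ".protected/<folder>/<first 10 hash chars>/"
// and that variants are appended to the file name as "__<variant>".
function sketchAssetPath(string $filename, string $hash, ?string $variant, bool $protected): string
{
    $info = pathinfo($filename);

    // pathinfo() reports "." for files in the root of the assets folder
    $folder = $info['dirname'] === '.' ? '' : $info['dirname'] . '/';

    $name = $info['filename'];
    if ($variant !== null && $variant !== '') {
        $name .= '__' . $variant;
    }
    if (isset($info['extension'])) {
        $name .= '.' . $info['extension'];
    }

    if (!$protected) {
        // Public files (4.4.0 and later): folder and file name only
        return $folder . $name;
    }

    // Protected files: truncated content hash becomes an extra folder
    return '.protected/' . $folder . substr($hash, 0, 10) . '/' . $name;
}

$hash = 'b63923d8d4089c9da16fbcbcdfef3e1b24806334';

echo sketchAssetPath('my-protected-folder/my-protected-file.jpg', $hash, 'FitWzYwLDYwXQ', true), PHP_EOL;
// .protected/my-protected-folder/b63923d8d4/my-protected-file__FitWzYwLDYwXQ.jpg

echo sketchAssetPath('my-public-folder/my-public-file.jpg', $hash, null, false), PHP_EOL;
// my-public-folder/my-public-file.jpg
```

Both results are relative to the `assets/` folder. The public URL of the protected file omits the `.protected/` prefix, as described above.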
## Versioned file paths
Older versions of file contents are kept in the `.protected` folder,
following the same rules as [protected file paths](#protected-file-paths).
```
assets/
my-file.jpg <- current published file
.protected/
dec83f348d/ <- old content hash of replaced file version
my-file.jpg
b63923d8d4/ <- old content hash of replaced file version
my-file.jpg
```
## Versioned and archived files
By default, when files are replaced or removed, their original file contents
aren't retained in order to avoid bloat on the filesystem.
Changes are only tracked for file metadata (e.g. the `Title` attribute).
You can opt-in to retaining the file content for replaced or removed files.
```yml
SilverStripe\Assets\Flysystem\FlysystemAssetStore:
keep_archived_assets: true
```
The filesystem structure follows the same rules as [protected file paths](#protected-file-paths):
```
assets/
my-file.jpg <- current published file
.protected/
dec83f348d/ <- old content hash of replaced file version
my-file.jpg
b63923d8d4/ <- old content hash of replaced file version
my-file.jpg
```
## Loading content into `DBFile`
A file can be written to the backend from a file which exists on the local filesystem (but not necessarily
@ -98,5 +180,5 @@ class Banner extends DataObject
// Image could be assigned in other parts of the code using the below
$banner = new Banner();
$banner->Image->setFromLocalFile($tempfile['path'], 'uploads/banner-file.jpg');
$banner->Image->setFromLocalFile($tempfile['path'], 'my-folder/my-file.jpg');
```


@ -7,7 +7,7 @@ This section describes how to upgrade existing filesystems from earlier versions
## Running migration
Since the structure of `File` dataobjects has changed between 3.0 and 4.0, a new task `MigrateFileTask`
Since the structure of `File` dataobjects has changed between 3.x and 4.x, a new task `MigrateFileTask`
has been added to assist in migration of legacy files.
You can run this task on the command line:
@ -16,16 +16,13 @@ You can run this task on the command line:
$ ./vendor/bin/sake dev/tasks/MigrateFileTask
```
This task will also support migration of existing File DataObjects to file versioning. Any
pre-existing File DataObjects will be automatically published to the live stage, to ensure
This task will also support migration of existing File objects to file versioning. Any
pre-existing File objects will be automatically published to the live stage, to ensure
that previously visible assets remain visible to the public site.
If additional security or visibility rules should be applied to File dataobjects, then
make sure to correctly extend `canView` via extensions.
*IMPORTANT*: There is a [known bug](https://github.com/silverstripe/silverstripe-versioned/issues/177)
which breaks existing direct links to asset URLs unless `legacy_filenames` is set to `true` (see below).
## Automatic migration
Migration can be invoked by either this task, or can be configured to automatically run during dev build
@ -38,6 +35,8 @@ SilverStripe\Assets\File:
migrate_legacy_file: true
```
You can also run this task without CLI access through the [queuedjobs](https://github.com/symbiote/silverstripe-queuedjobs) module.
## Migration of thumbnails
If you have the [asset admin](https://github.com/silverstripe/silverstripe-asset-admin) module installed
@ -49,7 +48,7 @@ within the file edit details form.
## Discarded files during migration
Note that any File dataobject which is not in the `File.allowed_extensions` config will be deleted
Note that any File object which is not in the `File.allowed_extensions` config will be deleted
from the database during migration. Any invalid file on the filesystem will not be deleted,
but will no longer be attached to a dataobject, and should be cleaned up manually.
@ -60,30 +59,22 @@ SilverStripe\Assets\FileMigrationHelper:
delete_invalid_files: false
```
Note that pre-existing security solutions for 3.x (such as
Pre-existing file security solutions for 3.x (such as
[secure assets module](https://github.com/silverstripe/silverstripe-secureassets))
are incompatible with core file security.
are likely incompatible with core file security. You should check the module README for potential upgrade paths.
## Support existing paths
## Keeping archived assets
Because the filesystem now uses the hash of file contents in order to version multiple versions under the same
filename, the default storage paths in 4.0 will not be the same as in 3.
Although it is not recommended, it is possible to configure the backend to omit this hash url segment,
meaning that file paths and urls will not be modified during the upgrade.
This configuration needs to be chosen before starting the file migration,
and can't be changed after migration.
By default, "archived" assets (deleted from draft and live stage) retain their
historical database entries with the file metadata, but the actual file contents are removed from the filesystem
in order to avoid bloat. If you need to retain file contents (e.g. for auditing purposes),
you can opt-in to this behaviour:
```yaml
SilverStripe\Assets\Flysystem\FlysystemAssetStore:
legacy_filenames: true
keep_archived_assets: true
```
This setting will still allow creation of protected (draft) files before publishing them.
It'll also keep track of changes to file metadata (e.g. title and description).
But it won't keep track of replaced file contents (not compatible with `keep_archived_assets=true`).
When replacing an already published file, the new file will be public right away (no draft stage).
## Migrating a substantial number of files
The time it takes to run the file migration will depend on the number of files and their size. The generation of thumbnails will depend on the number and dimension of your images.


@ -1229,7 +1229,7 @@ This should migrate your existing data (non-destructively) to the new SilverStri
#### Migrating files
Since the structure of the `File` DataObject has changed, a new task `MigrateFileTask`
has been added to assist in migration of legacy files (see [file migration documentation](/developer_guides/files/file_migration)).
has been added to assist in migration of existing files (see [file migration documentation](/developer_guides/files/file_migration)).
```bash
./vendor/bin/sake dev/tasks/MigrateFileTask


@ -741,21 +741,14 @@ the [IntlDateFormatter defaults](http://php.net/manual/en/class.intldateformatte
The file system has been abstracted behind an interface. By default, the out of the box filesystem
uses [Flysystem](http://flysystem.thephpleague.com/) with a local storage mechanism (under the assets directory).
In order to fully use this mechanism, we need to adjust files and their database entries,
through a [file migration task](/developer_guides/files/file_migration).
Because the filesystem now uses the sha1 of file contents in order to version multiple versions under the same
filename, the default storage paths in 4.0 will not be the same as in 3.
In order to retain existing file paths in line with framework version 3 you should set the
`\SilverStripe\Filesystem\Flysystem\FlysystemAssetStore.legacy_filenames` config to true.
Note that this will not allow you to utilise certain file versioning features in 4.0.
```yml
SilverStripe\Filesystem\Flysystem\FlysystemAssetStore:
legacy_filenames: true
```
Note: In order to allow draft files and multiple versions of file contents,
we're adding a "content hash" to the file paths by default.
Before 4.4.0, this was in effect for both public and protected URLs.
Starting with 4.4.0, only protected URLs receive those "content hashes".
Public files have a "canonical URL" which doesn't change when file contents are replaced ([4.4.0](details)).
See our ["File Management" guide](/developer_guides/files/file_management) for more information.
Depending on your server configuration, it may also be necessary to adjust your assets folder


@ -11,8 +11,42 @@
## Upgrading {#upgrading}
### Optional migration to hash-less public asset URLs {#hashless-urls}
SilverStripe 4.x introduced an [asset abstraction](https://docs.silverstripe.org/en/4/developer_guides/files/file_storage/)
system which required a [file migration](https://docs.silverstripe.org/en/4/developer_guides/files/file_migration/) task.
It allowed files to be access protected, have a separate draft stage,
and track changes to file metadata and their contents through the `Versioned` system.
This change defaulted to adding a content "hash" to file paths,
unless the migration was performed with `legacy_filenames=true`.
The hash would be updated when file contents change, and any link generated
through SilverStripe (e.g. through `HTMLText` or `$Image` template placeholders)
would automatically adjust. However, any direct links from search engines, bookmarks, etc. would break.
This limitation was pointed out in the [upgrading advice](4.0.0#asset-storage) to developers,
but the impact wasn't sufficiently highlighted.
SilverStripe 4.3.2 introduced a redirect to fix those broken links.
Dynamic redirects are more resource intensive than serving static files.
With SilverStripe 4.4.0, we're providing an optional migration script
to move public files into their "hash-less" locations, removing the need for most redirects.
The original "hashed" file paths continue to work through redirects.
In order to opt-in to moving these files, run the following command:
```
TODO File migration task command
```
Note that you can run this task without CLI access by installing and configuring
the [queuedjobs](https://github.com/symbiote/silverstripe-queuedjobs) module.
Further information is provided in the [Hash-less Public Asset URLs FAQ](#hashless-faq) below.
### Adapting to the new `_resources` directory
The name of the directory where vendor module resources are exposed can now be configured by defining an `extra.resources-dir` key in your `composer.json` file. If the key is not set, it will automatically default to `resources`. New projects will be preset to `_resources`.
This will avoid potential conflict with SiteTree URL Segments.
1. Update your `.gitignore` file to ignore the new `_resources` directory. This file is typically located in the root of your project or in the `public` folder.
2. Add a new `extra.resources-dir` key to your composer file.
```js
@ -29,6 +63,88 @@
You may also need to update your server configuration if you have applied special conditions to the `resources` path.
### Hash-less Public Asset URLs FAQ {#hashless-faq}
#### How are files named and renamed?
Here's an example of how file names changed during an initial 4.x migration, and when upgrading to 4.4:
* Original file created under SS 3.x: `assets/myfile.pdf`
* File migrated under SS 4.x with `legacy_filenames=false`: `assets/[content-hash]/myfile.pdf` (Regression: links to `assets/myfile.pdf` no longer work)
* File migrated under SS 4.x with `legacy_filenames=true`: `assets/myfile.pdf`
* File with updated file content under SS 4.x: `assets/[new-content-hash]/myfile.pdf` (Regression: links to `assets/[content-hash]/myfile.pdf` no longer work)
* File with hotfix applied (SS 4.2.3): `assets/[new-content-hash]/myfile.pdf` (redirects to `assets/myfile.pdf`)
* File with full fix applied (SS 4.3.x): `assets/myfile.pdf` (no redirect required)
* Newly uploaded file with full fix applied: `assets/my-other-file.pdf` (no redirect required)
More details on how files are stored can be found in the
["File Storage" guide](/developer_guides/files/file_storage).
#### How are redirects handled?
Redirects from (now legacy) hashed public URLs default to `301 Permanent Redirect`.
By opting into the [file migration to hash-less public URLs](#hashless-urls),
you can minimise these redirects. SilverStripe will automatically generate hash-less public URLs regardless,
but external links might still point to legacy hashed public URLs.
If you have a high traffic site with lots of direct references to asset URLs
(e.g. search engines indexing popular PDF documents),
we recommend that you configure HTTP caching for these redirects
in your webserver (e.g. through `.htaccess`).
The redirect code can be configured via `FlysystemAssetStore.redirect_response_code`.
#### Do I need to regenerate HTML with existing links to assets?
Pages and other views can contain links to asset locations,
e.g. as HTML content rendered with `<img>` and `<a>` tags.
These might point to old locations of files (prior to running the optional migration).
While SilverStripe will automatically fix these references the next time the view is rendered,
this content is often cached (e.g. in a CDN).
We have implemented an automatic redirect for URLs pointing to asset
locations prior to running the optional migration script,
so there is no need to regenerate any content.
#### Is there any data loss or data integrity issue?
There are no known issues around data integrity.
All files that were available on a SS 3.x site are still available after upgrading to SS 4.x.
#### Will this change affect my search engine ranking?
As long as your files are still linked on your website,
search engines will pick up the new links on any projects which have already been migrated.
The 4.3.2 bugfix will redirect links, which passes on any SEO rankings to the new link location.
Since links to files should be permanent after the bugfix has been applied,
this can lead to improved search engine rankings (since existing files under new links don't need to be re-ranked by search engines).
#### Can I migrate away from the legacy_filenames=true option?
Technically yes, but there's no official migration script for it.
#### Can I still choose legacy_filenames=true when starting new upgrades?
Technically yes, but you need to be aware of the tradeoffs.
Once the regression has been fixed, we don't see the need for people to choose this option any more.
#### How do I test that the fix has worked on my site?
Before applying the upgrade and migration script:
Find an existing published and draft file. Get the link by clicking on the preview in `admin/assets`.
It should link to `assets/[hash]/myfile.pdf`.
After applying the upgrade and migration script, you can test that it applied correctly.
Please perform these tests on a test or UAT environment, since your local environment might have different routing conditions.
* Check that the file still routes correctly with the hash in place (should redirect)
* Remove `[hash]/`, and check that it still routes correctly (should not redirect)
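To prepare the pair of URLs for this check, the hash-less form can be derived from the legacy one. This is a minimal sketch, assuming the hash segment is always 10 lowercase hexadecimal characters directly in front of the file name (as in the examples in this document). `stripContentHash()` is a hypothetical helper for testing purposes, not a SilverStripe API.

```php
// Hypothetical test helper; not part of SilverStripe.
// Removes a 10-character hex folder that sits directly before the file name.
// Caveat: a real folder that happens to be named with 10 hex characters
// would be stripped as well.
function stripContentHash(string $path): string
{
    return preg_replace('#/[0-9a-f]{10}/(?=[^/]+$)#', '/', $path) ?? $path;
}

echo stripContentHash('assets/a870de278b/NewCompanyLogo.gif'), PHP_EOL;
// assets/NewCompanyLogo.gif

echo stripContentHash('assets/NewCompanyLogo.gif'), PHP_EOL;
// assets/NewCompanyLogo.gif (already hash-less, left unchanged)
```

Requesting the first form should then answer with a redirect to the second, while the second should be served directly.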
#### Will this patch redirect URLs for previous file versions?
Yes, it will attempt to find the most recent public "hash-less" URL
for this file and redirect to it.
## Changes to internal APIs
- `PDOQuery::__construct()` now has a 2nd argument. If you have subclassed PDOQuery and overridden __construct()