Fixes album export by reducing memory usage of the scripts. #18

Closed
Roflin wants to merge 1 commits from fix-album-export into master
Owner

Chunks the output buffer to 4kB chunks. This means the Download.php
script will no longer run out of memory when trying to stream a
tar file the output buffer.

Chunks the output buffer to 4kB chunks. This means the Download.php script will no longer run out of memory when trying to stream a tar file the output buffer.
minnozz reviewed 2020-02-26 22:15:26 +01:00
@ -63,4 +56,0 @@
);
while($asset = $iterator->Next())
{
$files[] = join(DIRECTORY_SEPARATOR, array('.', $asset->getSubdir(), $asset->getFilename()));
Member

Is this change still useful? It seems like an array with only paths ($files) and not the file contents should not get so large as to hit a memory limit.

Is this change still useful? It seems like an array with only paths (`$files`) and not the file contents should not get so large as to hit a memory limit.
Author
Owner

In the old situation the export would fail on the first chunk leaving behind a .tar of ~0kb, This means that the list of files of an album of around 270 pictures already consumed a considerable amount of memory. So I would still like to keep this more "streaming" approach as a optimization.

In the old situation the export would fail on the first chunk leaving behind a .tar of ~0kb, This means that the list of files of an album of around 270 pictures already consumed a considerable amount of memory. So I would still like to keep this more "streaming" approach as a optimization.
Member

I don't think that's possible, because if the memory limit was hit while building the array of file names, there would be no tar file at all, since tar is invoked after $files has been built.

While I don't think the streaming approach is wrong, this feels like a premature optimization (and certainly one that should not be necessary).

I don't think that's possible, because if the memory limit was hit while building the array of file names, there would be no tar file at all, since `tar` is invoked after `$files` has been built. While I don't think the streaming approach is wrong, this feels like a premature optimization (and certainly one that should not be necessary).
@ -67,4 +56,1 @@
}
}
$descriptorspec = array(
Member

The current code uses proc_open, but you might want to look into passthru.

The current code uses [`proc_open`](https://www.php.net/manual/en/function.proc-open.php), but you might want to look into [`passthru`](https://www.php.net/manual/en/function.passthru.php).
Author
Owner

I'm currently using ther tar's -T functionality to read a list of files in a file seperated by a "\0" byte. and I skipping the file all together by using - as the argument to -T and feeding it the files on STDIN. After a quick look at the passthru i don't think such a thing is possible. and I think i will run out of "command line" space to execute the tar command with passthru.

I'm currently using ther tar's -T functionality to read a list of files in a file seperated by a "\0" byte. and I skipping the file all together by using - as the argument to -T and feeding it the files on STDIN. After a quick look at the passthru i don't think such a thing is possible. and I think i will run out of "command line" space to execute the tar command with passthru.
Member

Agreed that passthru is probably the better solution, especially if you're not expecting to handle anything on STDOUT or STDERR.

I do not think the proposed solution to use STDIN to manually feed filenames will provide much in the way of performance benefit. If anything, I think tar can enumerate files more quickly than a PHP script.

Agreed that `passthru` is probably the better solution, especially if you're not expecting to handle anything on `STDOUT` or `STDERR`. I do not think the proposed solution to use `STDIN` to manually feed filenames will provide much in the way of performance benefit. If anything, I think `tar` can enumerate files more quickly than a PHP script.
minnozz reviewed 2020-02-26 22:17:37 +01:00
@ -28,3 +28,3 @@
// The real magic starts here!
ob_start();
ob_start(null, 32768);
Member

I haven't looked at the rest of the framework, but you might want to consider disabling the output buffer before calling tar. This code could then check whether an output buffer is still active with ob_get_level.

I haven't looked at the rest of the framework, but you might want to consider disabling the output buffer before calling `tar`. This code could then check whether an output buffer is still active with [`ob_get_level`](https://www.php.net/manual/en/function.ob-get-level.php).
Author
Owner

@aaron what are your thoughts on this? I'm not familiar enough to comment on this.

@aaron what are your thoughts on this? I'm not familiar enough to comment on this.
Member

I agree with @minnozz; much better to disable it locally (i.e. in the download controller) with ob_end_flush. In general, output buffering is quite useful.

I agree with @minnozz; much better to disable it locally (i.e. in the download controller) with `ob_end_flush`. In general, output buffering is quite useful.
Aaron reviewed 2020-02-26 23:45:21 +01:00
@ -98,0 +83,4 @@
foreach($album_ids as $album_id)
{
$iterator = AssetIterator::getByOptions(
array(
Member

This is an aside, but I'd like to stick to new-style (bracketed) arrays, that is [] instead of array().

This is an aside, but I'd like to stick to new-style (bracketed) arrays, that is `[]` instead of `array()`.
Author
Owner

This is being superseded by PR !19

This is being superseded by PR !19
Roflin closed this pull request 2020-03-11 17:48:35 +01:00

Pull request closed

Sign in to join this conversation.
No reviewers
No Label
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: Public/pics#18
No description provided.