So I'm writing code that takes images downloaded by the Instaloader Python library and puts them into a gallery on my website. I've managed to collect and display these without any problems, however I've now started implementing captions for each post and I'm running into issues.
The way the library downloads photos is that if a post contains more than one photo (a collection), it adds _1, _2, etc. suffixes to the file names based on the position of the image in the collection, and it provides the caption as a .txt file.
Sample folder contents for collection:
    2022-12-26_14-14-01_UTC.txt
    2022-12-26_14-14-01_UTC_1.jpg
    2022-12-26_14-14-01_UTC_2.jpg
    2022-12-26_14-14-01_UTC_3.jpg

Single-image posts work well. Example:

    2022-12-31_18-13-43_UTC.txt
    2022-12-31_18-13-43_UTC.jpg
Main code block:
    $array = [];
    $account_name = "everton";
    $file_directory = "images/instagram";
    $count = 0;
    $hasvideo = 0;
    $hasCaption = 0;
    $handle = opendir(dirname(realpath(__DIR__)).'/'.$file_directory.'/');
    while($file = readdir($handle)){
        $date = substr($file, 0, strpos($file, "_UTC"));
        $ext = strtolower(pathinfo($file, PATHINFO_EXTENSION)); // Using strtolower to overcome case sensitive
        if($ext === 'jpg'){
            $count++;
            $collectionSize = (int)str_replace("_", "", str_replace(".jpg", "", explode("UTC",$file)[1]));
            if(!is_numeric($collectionSize)){
                $collectionSize = 0;
            }
            $arrayKey = array_search($date, array_column($array, 'date'));
            if($arrayKey){
                $amount = intval($array[$arrayKey]['collection-size']);
                if($collectionSize > $amount){
                    $array[$arrayKey]['collection-size'] = (int)$collectionSize;
                }
            }else{
                array_push($array, array ("date" => $date, "collection-size" => (int)$collectionSize, "has-video" => false));
            }
        }
        if ($ext === "txt"){
            $file_location = dirname(realpath(__DIR__)).'/'.$file_directory.'/'. $file;
            $myfile = fopen( $file_location, "r") or die("Unable to open file!");
            $caption = fread( $myfile, filesize($file_location));
            $arrayKey = array_search($date, array_column($array, 'date'));
            //$arrayKey returns false when there is a collection.
            if($array[$arrayKey]){
                $array[$arrayKey]['caption'] = $caption;
            }else{
                array_push($array, array ("date" => $date, "caption" => $caption));
            }
            fclose($myfile);
        }
    }
$arrayKey comes back as false when the post is a collection; for a regular single-image post it works.
I believe this has to do with the file order in which the script reads these files, as I'm assuming it reads (date)_(collectionposition).jpg before it reads (date).txt
If the array entry has already been created, the caption is added to the array data as expected. If not (such as when _1, _2, etc. files are present), the array is not updated and no error is raised.
edit: Further troubleshooting shows that the way I'm updating/checking the array keys based on the "date" value is wrong, hopefully the correct way to handle these operations can be found
Any guidance on what I can fix to make this work as expected would be appreciated, thank you!
P粉739942405 2024-04-01 00:29:12
Let's look at your code first. The problem you mentioned is that the following line:

    $arrayKey = array_search($date, array_column($array, 'date'));

...returns false, because the $array entry with that date has not yet been created when the .txt file is processed. (The logic that creates array members with array_push comes further down in the code.)

A simple fix, so that execution reaches the right branch of the if/else when the entry has not been defined yet:

    if($arrayKey !== false && $array[$arrayKey]){ ...

That is, if $arrayKey is not false, add the value to the existing array member. Otherwise, create an array member.
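To see why the strict comparison matters, here is a minimal sketch with made-up entries. The $posts data and the findPostKey helper are hypothetical and not part of your code. array_search returns false when nothing matches, but it returns 0 when the match is the first element, so a loose if($arrayKey) check treats both cases the same way:

```php
<?php
// Hypothetical sample data standing in for $array.
$posts = [
    ['date' => '2022-12-26_14-14-01', 'collection-size' => 3],
    ['date' => '2022-12-31_18-13-43', 'collection-size' => 0],
];

/**
 * Hypothetical helper: returns the index of the entry for $date,
 * or false when no entry has that date.
 */
function findPostKey(array $posts, string $date)
{
    return array_search($date, array_column($posts, 'date'), true);
}

$first   = findPostKey($posts, '2022-12-26_14-14-01'); // int(0): found at index 0
$missing = findPostKey($posts, '2023-01-01_00-00-00'); // false: not found

var_dump($first !== false);   // bool(true)
var_dump($missing !== false); // bool(false)
var_dump((bool) $first);      // bool(false): a loose check misses index 0
```

The same applies to the if($arrayKey) check in the jpg branch: it should also compare against false with !==.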
Additionally, there is an issue when processing images, which generates a warning the first time it occurs:
$amount = intval($array[$arrayKey]['collection-size']);
This triggers an "Undefined array key 'collection-size'" warning because the collection-size key does not exist yet. One fix is to use the null coalescing assignment operator to set a default of zero before operating on the key:
$array[$arrayKey]['collection-size'] ??= 0;
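As a small illustration of what ??= does, assuming an entry that was created by the .txt branch and therefore has no collection-size yet (the $entry variable and its values are made up for this sketch):

```php
<?php
// Hypothetical entry created from a .txt file: no 'collection-size' key yet.
$entry = ['date' => '2022-12-26_14-14-01', 'caption' => 'Example caption'];

// Assigns 0 only if the key is missing or null; an existing value is kept.
$entry['collection-size'] ??= 0;

// Safe to read now, no "Undefined array key" warning.
$amount = intval($entry['collection-size']);

$entry['collection-size'] = 3;
$entry['collection-size'] ??= 0; // no effect, the key already holds 3
```

Note that ??= requires PHP 7.4 or later.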
These changes fix the bugs, but it would be better to separate out the "entry creation": on the first occurrence of either a txt or a jpg file, create an array member with all the expected keys before performing any txt/jpg-specific logic. I would simply use $date itself as the grouping key, so you can get rid of array_search too. For example, after extracting the date, use:
    $array[$date] ??= [
        'date' => $date,
        'caption' => '',
        'collection-size' => 0,
        'has-video' => false,
    ];
Then modify the rest of the code to match. Your code should not depend on the order in which files are read. The order is not guaranteed. Otherwise, you can always read the list of files into a regular array first, then sort them, and iterate again when applying specific logic.
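If you do want a fixed order, a rough sketch of the "read first, then sort" approach could look like this. The listFilesSorted helper is hypothetical, and it assumes the folder holds only the downloaded files:

```php
<?php
/**
 * Hypothetical helper: returns the names of the regular files in $dir,
 * sorted as plain strings so the order is the same on every run.
 */
function listFilesSorted(string $dir): array
{
    $entries = scandir($dir);
    if ($entries === false) {
        return []; // directory could not be read
    }

    $names = [];
    foreach ($entries as $name) {
        if (is_file($dir . '/' . $name)) { // skips ".", ".." and subfolders
            $names[] = $name;
        }
    }

    sort($names, SORT_STRING);
    return $names;
}
```

Sorting only makes the order predictable. It does not always put the .txt file first: for a single-image post, "..._UTC.jpg" still sorts before "..._UTC.txt". That is one more reason to keep the grouping logic independent of the order.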
The actual amount of code required is much less than what you have. Here I have trimmed it for you. I don't have your file, so here's some dummy data:
    $files = <<

You can also glob the files into an array (= list of file paths):

    $file_directory = "images/instagram";
    $files = glob(dirname(realpath(__DIR__)).'/'.$file_directory.'/*');

Then iterate as follows:
    foreach($files as $filepath) {
        $filename = basename($filepath);
        $date = strstr($filename, '_UTC', true);
        $array[$date] ??= [
            'date' => $date,
            'caption' => '',
            'collection-size' => 0,
            'has-video' => false,
        ];
        // Use $filename here; there is no $file variable in this loop.
        $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
        if($ext === 'jpg'){
            // Each JPG increments collection size:
            $array[$date]['collection-size']++;
        } elseif ($ext === "txt"){
            // We use a dummy here:
            $caption = '---';
            // $caption = file_get_contents($filepath);
            $array[$date]['caption'] = $caption;
        }
    }

Notice how much it shrank. What happened?
- We use $date as the grouping index of the array. No more array_search!
- We initialize a default entry for each date. No further checks or conditions required!
- We ignore the "collection size" suffix such as _3 in the file name: just add 1 for each JPG.
- We use glob and file_get_contents instead of readdir and fopen.
- The order of the files does not matter. (Feel free to test with shuffle($files)!)

Result:
    array(3) {
      ["2022-12-26_14-14-01"]=>
      array(4) {
        ["date"]=>
        string(19) "2022-12-26_14-14-01"
        ["caption"]=>
        string(3) "---"
        ["collection-size"]=>
        int(3)
        ["has-video"]=>
        bool(false)
      }
      ["2022-12-27_14-14-01"]=>
      array(4) {
        ["date"]=>
        string(19) "2022-12-27_14-14-01"
        ["caption"]=>
        string(3) "---"
        ["collection-size"]=>
        int(2)
        ["has-video"]=>
        bool(false)
      }
      ["2022-12-31_18-13-43"]=>
      array(4) {
        ["date"]=>
        string(19) "2022-12-31_18-13-43"
        ["caption"]=>
        string(3) "---"
        ["collection-size"]=>
        int(1)
        ["has-video"]=>
        bool(false)
      }
    }
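If you want to try this end to end, here is a self-contained sketch. The file names are assumed dummy data, chosen to follow the Instaloader naming pattern and to match the counts in the result above. groupPosts is a hypothetical wrapper around the loop, and the caption is still the '---' dummy instead of the real file contents:

```php
<?php
// Assumed dummy file names (not real files on disk).
$files = [
    '2022-12-26_14-14-01_UTC.txt',
    '2022-12-26_14-14-01_UTC_1.jpg',
    '2022-12-26_14-14-01_UTC_2.jpg',
    '2022-12-26_14-14-01_UTC_3.jpg',
    '2022-12-27_14-14-01_UTC.txt',
    '2022-12-27_14-14-01_UTC_1.jpg',
    '2022-12-27_14-14-01_UTC_2.jpg',
    '2022-12-31_18-13-43_UTC.txt',
    '2022-12-31_18-13-43_UTC.jpg',
];

/** Hypothetical wrapper: groups file paths into one entry per post date. */
function groupPosts(array $files): array
{
    $posts = [];
    foreach ($files as $filepath) {
        $filename = basename($filepath);
        $date = strstr($filename, '_UTC', true);
        if ($date === false) {
            continue; // name does not contain "_UTC", skip it
        }

        // Create the entry with all expected keys on first sight of this date.
        $posts[$date] ??= [
            'date' => $date,
            'caption' => '',
            'collection-size' => 0,
            'has-video' => false,
        ];

        $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
        if ($ext === 'jpg') {
            $posts[$date]['collection-size']++; // one more image in this post
        } elseif ($ext === 'txt') {
            $posts[$date]['caption'] = '---';   // dummy caption
        }
    }
    return $posts;
}

// The grouping should not depend on the order of the input.
$shuffled = $files;
shuffle($shuffled);

$inOrder  = groupPosts($files);
$anyOrder = groupPosts($shuffled);
ksort($inOrder);
ksort($anyOrder);

var_dump($inOrder === $anyOrder); // bool(true)
```

With real files you would pass the glob() result instead of the dummy list and read each caption with file_get_contents($filepath).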