Possible solutions to slow left joins in complex doctrine queries
<p>I have a symfony repository method that fetches a fairly complex dataset which is then placed in a CSV file by an export manager class. I don't want to put the entire code that handles the export job, but I managed to pinpoint the point at which the query slows down, so my question is about any other alternatives to making the query faster, rather than the code itself.
So the data fetched is some "site" data which has multiple "Memberships" and then has "Users".So the problem is that when my query tries to connect the user information to the site, it slows down the execution. It looks like this:</p>
<pre class="brush:php;toolbar:false;">$qb->leftJoin('s.memberships', 'ex_sm', 'WITH', 'ex_sm.revokedAt IS NULL');
$qb->leftJoin('ex_sm.user', 'ex_jappr', 'WITH', 'ex_sm.approverJobReactiveWeight IS NOT NULL');</pre>
<p>A few things to mention (that I tried or thought might help): </p>
<ul>
<li>I checked the table and all linked columns have an index and they are all the same int data type. </li>
<li>I red an article about DQL performance issues which mentioned that overuse of DQL Left Join calls can degrade performance because they remap the same entity object over and over again. One possible solution mentioned is to get the main dataset and then loop through the collection adding appends (connected data fields) to each element directly from the field's entity class. This might work (not sure how much impact it would have), the problem is that what I have is very complex legacy code and I don't want to touch the export manager logic as that would require too much testing. The export manager requires a query builder class, so I have to find the solution in the query itself. </li>
<li>The problem is definitely caused by the join, not the WITH clause or additional conditions. I tried calling the query using a normal leftJoin call, with the same result. </li>
<li>I know that the leftJoin method calls can be chained to each other, the code looks like this because some of the calls are used in if statements. </li>
<li>I spent 2 days trying everything I found here and on other sites. </li>
</ul>
<p>There are 6 different user types, now I just call the script to get the user type above and it took 33 minutes to return the data. We're talking about 512 sites, which is not a huge collection of data.So my question is: Is there another DQL or any Doctrine way to simplify or reduce the number of calls to leftJoins in such a complex query and somehow improve the performance? </p>
<p>Update:
I thought the problem was with the index, so I gave some details about the relationship:
The "memberships" entity comes from a table named "access" and the relationship to users in its model is as follows: </p>
<pre class="brush:php;toolbar:false;">/*** The user this membership encapsulates.
*
* @ORM\ManyToOne(targetEntity="User", inversedBy="siteMemberships", cascade={"persist"})
* @ORM\JoinColumn(name="security_identity_id", referencedColumnName="id")
*
* @var User*/
protected $user;</pre>
<p>This is a screenshot of the index assigned to the "security_identity_id" column
</p>
<p>The related users come from the Users table with a relationship pointing to membership</p>
<pre class="brush:php;toolbar:false;">/*** @ORM\OneToMany(targetEntity="SiteMembership", mappedBy="user", cascade={"persist"}, fetch="EXTRA_LAZY")*/
protected $siteMemberships;</pre>
<p>The primary key is the "id" in the entity.Hope this puts things into better perspective. I'm no sql expert but have tried everything I've found and can figure it out so far. </p>
<p>Update:
This is the query executed: </p>
<pre class="brush:php;toolbar:false;">SELECT s0_.name AS name_0, s0_.id AS id_1, GROUP_CONCAT(DISTINCT u1_.name SEPARATOR ', ') AS sclr_2 FROM site s0_
LEFT JOIN access a2_ ON s0_.id = a2_.entity_id
AND a2_.type IN ('site_member')
AND (a2_.revoked_at IS NULL)
LEFT JOIN user u1_ ON a2_.security_identity_id = u1_.id
AND (a2_.approver_job_reactive_weight IS NOT NULL)</pre>
<p>This will return the first site record along with its membership and user permissions. But even this row takes over 2 minutes. </p>
<p>Here is the table creation information for accessing the (member entity) table</p>
<pre class="brush:php;toolbar:false;">'CREATE TABLE `access` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`buddy_id` int(11) DEFAULT NULL,
`security_identity_id` int(11) DEFAULT NULL,
`revoked_at` datetime DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`type` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`approver_job_reactive_weight` int(11) DEFAULT NULL,
`entity_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `access_idx` (`type`,`security_identity_id`,`entity_id`,`buddy_id`),
KEY `IDX_6692B54395CE8D6` (`buddy_id`),
KEY `IDX_6692B54DF9183C9` (`security_identity_id`),
KEY `IDX_6692B5481257D5D` (`entity_id`),
KEY `idx_revoked_id_approver_type` (`revoked_at`, `entity_id`, `approver_job_reactive_weight`, `approver_job_planned_weight`, `type`),
KEY `idx_user_site_access` (`revoked_at`, `security_identity_id`, `buddy_id`, `type`),
KEY `idx_user` (`security_identity_id`),
KEY `idx_user_id` (`security_identity_id`),
CONSTRAINT `FK_6692B54DF9183C9` FOREIGN KEY (`security_identity_id`) REFERENCES `user` (`id`)
)
ENGINE=InnoDB AUTO_INCREMENT=262441 DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci'</pre>
<p>I deleted some irrelevant columns. </p>