Some notes on slow queues and Redis over TLS

When playing around with a load testing tool and some Laravel queues I started noticing a weird pattern: after every 22 dispatched jobs, there would be a 5-second delay on all queues before the next set of jobs was dispatched. This is what my log file looked like after adding some log statements:

[Image: Laravel slow dispatching]

Jump to the TL;DR

I narrowed it down to the connection to Redis, which took up to 5 seconds to establish. I didn't think it was related at first, but this only happened when testing on a DigitalOcean Droplet with their managed Redis servers. I've run the same test on a local app with a managed Redis server and some other combinations. The issue only seems to occur on the DigitalOcean Droplet combined with the DigitalOcean managed Redis server.

Cause & solution

After a lot of Googling I finally came across this thread: "Redis->auth() sometimes freezes after upgrade to php 7.4". There's also Simon Benett's excellent article that mentions this issue.

As it turns out, these hosted Redis servers exclusively use TLS to communicate. Adding the tls:// prefix to my REDIS_HOST environment variable was an earlier hurdle I had to cross. When given the tls:// protocol, PHP likes to choose TLS 1.3 for you if you haven't specified a version. Switching to TLS 1.2, by using the tlsv1.2:// protocol in the REDIS_HOST string, fixed some of my problems! Dispatching jobs from 5 concurrent PHP-FPM processes now only hangs on the Redis connection every couple of minutes.
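To compare the two protocols yourself, a rough timing sketch like the one below may help. It assumes the PhpRedis extension is installed; timeCall and measureRedisConnect are hypothetical helper names made up for this example, not part of Laravel or PhpRedis.

```php
<?php
// Hypothetical helper: runs $fn and returns the elapsed wall-clock time in seconds.
function timeCall(callable $fn): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    $fn();
    return (hrtime(true) - $start) / 1e9;
}

// Hypothetical helper: times a single PhpRedis connection attempt.
// $scheme is a PHP stream transport such as 'tls' or 'tlsv1.2'.
function measureRedisConnect(string $scheme, string $host, int $port): float
{
    return timeCall(function () use ($scheme, $host, $port) {
        $redis = new Redis();
        try {
            // Third argument is the connect timeout in seconds.
            $redis->connect("{$scheme}://{$host}", $port, 10.0);
            $redis->close();
        } catch (RedisException $e) {
            // A failed attempt still counts toward the elapsed time.
        }
    });
}
```

Calling measureRedisConnect('tls', $host, $port) and measureRedisConnect('tlsv1.2', $host, $port) in a loop, with your own host and port, should show whether the occasional multi-second connect depends on the protocol. The sketch only opens the connection; if the delay shows up during authentication instead, add an auth() call inside the closure.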

Solution #2

Sadly, under high load I still saw the same issue recur, and the higher server load and increased number of PHP-FPM processes seemed to make it even worse. For now, as a "temporary" fix you can enable persistent connections to Redis in the config/database.php file:

'default' => [
    'host' => env('REDIS_HOST', '127.0.0.1'), // use tlsv1.2:// protocol!
    'password' => env('REDIS_PASSWORD', null),
    'port' => env('REDIS_PORT', '6379'),
    'database' => env('REDIS_DB', '0'),
    'persistent' => true, // keep Redis connection open
],

This config value tells the PhpRedis extension to keep its connection to Redis open for as long as the PHP-FPM process lives. With pm.max_requests = 500 (the value shown in PHP-FPM's sample pool configuration), each process is kept alive for 500 requests. This means that every 500 requests PHP-FPM respawns one of its processes, and the new process has to open a fresh connection that may hit the slow Redis connect. So instead of risking the delay on every request, each process risks it once per 500 requests, which makes it roughly 500x less frequent. This seemed to fix most of my issues, hooray!
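The 500x figure is back-of-the-envelope arithmetic, which can be sketched as below. redisConnects is a hypothetical function for illustration only. It assumes one fresh connection per request without persistence and one connection per process lifetime with it, and it ignores the number of parallel PHP-FPM processes.

```php
<?php
// Hypothetical model: number of new Redis connections opened while one
// PHP-FPM worker slot serves $requests requests. Assumes $maxRequests > 0.
function redisConnects(int $requests, int $maxRequests, bool $persistent): int
{
    if (!$persistent) {
        return $requests; // assumption: one fresh connection per request
    }
    // assumption: one connection per process lifetime of $maxRequests requests
    return (int) ceil($requests / $maxRequests);
}

$without = redisConnects(10000, 500, false); // 10000
$with    = redisConnects(10000, 500, true);  // 20
printf("%d vs %d connects (%dx fewer)\n", $without, $with, intdiv($without, $with));
```

Every one of those connects is a chance to hit the slow connection, so fewer connects means proportionally fewer delays, even though a single connect is just as likely to be slow as before.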

TL;DR

A bug in either PhpRedis, PHP or Ubuntu seems to cause random delays when connecting to Redis over TLS (v1.3 and sometimes v1.2). Here's a related issue and a related article that mentions the issue.

I've run into this only when using DigitalOcean Droplets with their managed Redis servers.

Some changes that might fix or mitigate the problem: