PHP JIT in Depth

Published On 03 Dec 2020

One of the most important new features in PHP 8.0 is the Just-In-Time (JIT) compiler. JIT can improve performance by compiling either the full application or its frequently called parts into CPU machine code, storing that code, and executing it directly, which bypasses the Zend VM and its processing overhead.

JIT is a hybrid of traditional interpreters and Ahead-Of-Time (AOT) compilers. The hybrid model brings the pros and cons of both approaches, and in a finely tuned application the benefits can outweigh the drawbacks.

PHP's JIT implementation is the result of Dmitry Stogov's remarkable efforts across several years of discussions, implementations, and tests.


PHP JIT: The Basics
For PHP 8.0's JIT overview and configuration options, see PHP 8.0: JIT. This post is about benchmarks, how JIT works, and ideal configuration options.


Most PHP applications accept HTTP requests, retrieve and process data from a database, and return a result. More often than not, the significant performance bottlenecks are in IO: reading from and writing to disk, and network requests.

PHP 8.0 introduces JIT as the next step in improving the performance of PHP applications, but it also adds a significant barrier to debugging, because some parts of the application might be cached as CPU machine code, which standard PHP debuggers cannot work with. The PHP 8.0 JIT pull request adds well over 50,000 new lines to the PHP code base, code that PHP core developers, apart from those working on JIT, might not be well-versed in.


PHP VM

PHP code, once processed (tokenized, parsed into an AST, and compiled to opcodes), runs on the Zend Virtual Machine. Similar to Java and JavaScript, the virtual machine abstracts away the hardware, which makes it possible to "run" PHP source code without compiling it ahead of time.

The Opcache extension can store the opcodes in shared memory, which skips the repetitive tokenize/parse/compile steps on subsequent requests.

PHP already includes several optimizations, such as dead code elimination at the opcode level, but it was not possible to optimize beyond the virtual machine level, because at that point the code is interpreted by the virtual machine rather than compiled.

Handing Off to Other Applications

PHP already has several integrations through which it invokes code that is already compiled.

The GD extension might be the most familiar example. If PHP were to manipulate images at the vector or bitmap level in PHP code, it would be very slow because of the additional virtual machine layer. The GD extension, which invokes compiled binaries, can make use of advanced CPU instructions to perform the same operations.

PHP 7.4 introduced the Foreign Function Interface (FFI), also the work of Dmitry Stogov, which provides a unified interface to invoke code in native libraries without having to develop a PHP extension. Thanks to FFI, it is possible to integrate traditionally compiled languages such as C and Rust with PHP.


Compiling PHP Code

The natural next step of reaching as close as possible to the CPU is skipping the virtual machine, and that is what JIT is.

Just-In-Time compilation is a feature that JavaScript successfully adopted many years ago with the V8 engine. Other languages implement a JIT in one form or another too. The biggest advantage is that the source code still does not need to be pre-compiled; with a shared cache of compiled machine code, the runtime can execute a piece of code from compiled machine code, compile it for later use, or execute it without JIT.

LLVM

LLVM is a popular compiler toolkit that is used to build compilers for many AOT-compiled languages today.

LLVM's targets include x86, x86-64, and several others, including graphics processors, WebAssembly, ARM, etc.

PHP considered using LLVM, but the effort was not very fruitful because its compilation speed was not favorable for a JIT.

DynASM

DynASM, from the LuaJIT project, was much faster for PHP's JIT. It supports fewer target CPU instruction sets than LLVM, but it covers x86 and x86-64, the most common ones for a server-side programming language such as PHP.

PHP 8.0's JIT implementation uses DynASM for its code generation. PHP's JIT is bound by the limitations of DynASM for target processor architectures.


How PHP JIT Works

PHP JIT is implemented as a part of Opcache. This keeps JIT separated from the PHP engine.

JIT has three components, which together store, inspect, and seamlessly invoke code, either through the virtual machine or directly from the machine code stored in the buffer.

Buffer

The JIT buffer is where the compiled CPU machine code is stored. PHP provides a configuration option (the opcache.jit_buffer_size INI setting) to control how much memory is allocated for the JIT buffer.

Triggers

Triggers in Opcache are responsible for invoking the compiled machine code when execution encounters a matching code structure. These triggers can be a function entry, a loop, etc.

Tracer

The JIT tracer inspects the code before, after, or during its execution, and determines which code is "hot", that is, which structures are worth compiling with JIT.

The tracer can compile code as it is being run, once a code structure reaches a threshold that is also configurable.
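
The threshold idea can be pictured with a small counter. This is only a conceptual sketch in PHP: the real tracer is implemented in C inside Opcache and its counting scheme is more involved. The HotCounter class, the threshold of 64, and the structure label are all made up for illustration.

```php
<?php
declare(strict_types=1);

// Conceptual model only; HotCounter is a hypothetical name and does not
// exist in PHP or Opcache.
final class HotCounter
{
    /** @var array<string, int> executions seen per code structure */
    private array $counts = [];

    public function __construct(private int $threshold) {}

    // Record one execution of a code structure. Returns true exactly once,
    // at the moment the structure reaches the threshold and becomes "hot".
    public function hit(string $structure): bool
    {
        $this->counts[$structure] = ($this->counts[$structure] ?? 0) + 1;
        return $this->counts[$structure] === $this->threshold;
    }
}

$counter = new HotCounter(threshold: 64);
$compiledAt = null;
for ($i = 1; $i <= 100; $i++) {
    if ($counter->hit('loop@example.php:10')) {
        $compiledAt = $i; // a real JIT would compile the structure here
    }
}
echo "became hot at iteration {$compiledAt}\n";
```

Iterations before the threshold run through the virtual machine as usual; only after the structure is flagged does compiled code come into play.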

[Figure: PHP JIT chart]


Tracing JIT and Function JIT

PHP 8.0 adds two modes of JIT operation. The behavior is further customizable, but the two most prominent modes are available under the aliases function and tracing.

Function JIT

Function JIT mode is the simpler of the two. It JIT-compiles a whole function, without tracing frequently used code structures such as loops inside the function. It still supports profiling for frequently used functions, and can trigger JIT compilation, or execution of the compiled machine code, before, after, or during the execution of an application request.

Tracing JIT

Tracing JIT, which is selected by default in PHP 8.0, tries to identify the frequently used parts of the code, and selectively compiles those structures for the best balance of compilation time and memory usage. Not all programming languages with a JIT offer a tracing JIT compiler, but PHP supports one right from its first JIT release.

There are several configuration options that allow further tweaking of how a hot code structure is determined, such as the number of function calls, the number of iterations of a loop structure, etc.

Profiling and Optimizing

JIT can inspect, profile, and optimize the code as it is being run. PHP JIT offers granular control over the thresholds and triggers, such as how many invocations make a structure a worthy candidate to JIT-compile into machine code, after which it can use the newly compiled code. Subsequent requests can also make use of the compiled code if it is present in the buffer.


[Figure: JIT compiling as the code is executed]


PHP's JIT implementation allows fine-tuning of when JIT should be used (when the script is loaded, after the first run, or during the execution), what should be compiled (the whole function, or individual code structures), and how the optimizations are made (use of AVX instructions, use of CPU registers, etc.).

JIT-friendly code

JIT benefits heavily when it can offload as much as possible to native CPU registers and instructions. PHP is a weakly typed language, which makes it difficult to infer the type of a variable, and requires more analysis of the variable life-cycle because the type of a variable might change at a later point in the same code structure.

Strictly typed code, and functions with scalar types can help JIT to infer types and make use of CPU registers and specialized instructions where possible. For example, a pure function (that has no side-effects), with strict types enabled and with parameter and return types might make a perfect candidate:

declare(strict_types=1);

function sum(float $a, float $b): float {
    return $a + $b;
}

When PHP cannot infer the types, it might not be able to make the best use of the JIT optimizations.

Some of the improvements in PHP 7, in fact, come from similar type-based optimizations, which allow PHP to eliminate dead code and improve reference counting. This means more strictly typed code gives PHP more opportunities to optimize at the Opcache level, and also at the JIT level.


Applications that are IO-bound, such as the ones that extensively use a database, DNS queries, file read/write operations, FTP, sockets, etc., might not see a noticeable difference because, more often than not, the IO operations themselves are the bottleneck of such applications.

Basic JIT Configuration

By default, JIT is enabled, but it is effectively turned off because the default buffer size (opcache.jit_buffer_size) is 0.

The simplest setup is to set a buffer size for JIT; JIT then uses the sensible defaults it comes with.

opcache.enable=1
opcache.enable_cli=1
opcache.jit_buffer_size=256M

This allocates 256 MB for the JIT buffer, and enables JIT on CLI applications as well.
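
To confirm that the settings took effect, the JIT section of opcache_get_status() can be inspected at run-time. The following is a minimal sketch; jitStatus() is a hypothetical helper name, and the script reports "unavailable" when Opcache is not loaded or not enabled for the current SAPI.

```php
<?php
declare(strict_types=1);

// jitStatus() is a hypothetical helper. It returns the 'jit' section of
// opcache_get_status(), or null when Opcache or JIT status is unavailable.
function jitStatus(): ?array
{
    if (!function_exists('opcache_get_status')) {
        return null; // Opcache extension is not loaded
    }
    $status = opcache_get_status(false); // false: skip per-script details
    if (!is_array($status) || !isset($status['jit'])) {
        return null; // Opcache disabled, or a PHP version without JIT
    }
    return $status['jit'];
}

$jit = jitStatus();
if ($jit === null) {
    echo "Opcache/JIT status unavailable\n";
} else {
    printf(
        "JIT enabled: %s, buffer size: %d bytes, free: %d bytes\n",
        $jit['enabled'] ? 'yes' : 'no',
        $jit['buffer_size'],
        $jit['buffer_free']
    );
}
```

On the CLI, this only reports an enabled JIT when opcache.enable_cli=1 is set, as in the configuration above.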


The opcache.jit directive allows fine-tuning of the JIT functionality.

opcache.jit=tracing

opcache.jit is a somewhat complicated configuration value. It accepts disable, on, off, tracing, function, and a 4-digit value (not a bit-mask) made up of 4 different flags in a fixed order.

  • disable: Completely disables the JIT feature at start-up time; it cannot be enabled at run-time.
  • off: Disabled, but it's possible to enable JIT at run-time.
  • on: Enables tracing mode.
  • tracing: An alias to the granular configuration 1254.
  • function: An alias to the granular configuration 1205.

The tracing and function values are convenient aliases that each represent a combination of flags.

In addition to the tracing and function aliases, the opcache.jit directive accepts a 4-digit configuration value, which can further configure the JIT behavior.

The 4-digit configuration value is in the form of CRTO, where each position allows a single digit value for the flag designated by the letter.

JIT Flags

The opcache.jit directive accepts a 4-digit value in the form of CRTO to control the JIT behavior, with the following values for the C, R, T, and O positions.

CPU-specific Optimization Flags

  • 0: Disable CPU-specific optimization.
  • 1: Enable use of AVX, if the CPU supports it.

Register Allocation

  • 0: Don't perform register allocation.
  • 1: Perform block-local register allocation.
  • 2: Perform global register allocation.

Trigger

  • 0: Compile all functions on script load.
  • 1: Compile all functions on first execution.
  • 2: Profile first request and compile the hottest functions afterwards.
  • 3: Profile on the fly and compile hot functions.
  • 4: Currently unused.
  • 5: Use tracing JIT. Profile on the fly and compile traces for hot code segments.

Optimization Level

  • 0: No JIT.
  • 1: Minimal JIT (call standard VM handlers).
  • 2: Inline VM handlers.
  • 3: Use type inference.
  • 4: Use call graph.
  • 5: Optimize whole script.

The option 4 under Trigger (T=4) did not make it to the final version of the JIT implementation. It was meant to trigger JIT on functions declared with the @jit DocBlock comment attribute. It is now unused.
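
To make the CRTO layout concrete, here is a small sketch that splits a value into its four positions. decodeJitFlags() is a hypothetical helper written for this article, not a PHP function; it only knows the tracing (1254) and function (1205) aliases and the digit ranges listed above.

```php
<?php
declare(strict_types=1);

// decodeJitFlags() is a hypothetical helper, not part of PHP or Opcache.
// It expands the tracing/function aliases and splits a CRTO value into
// its four single-digit flags.
function decodeJitFlags(string $value): array
{
    $aliases = ['tracing' => '1254', 'function' => '1205'];
    $value = $aliases[$value] ?? $value;

    // Digit ranges follow the flag lists above: C 0-1, R 0-2, T 0-5, O 0-5.
    if (preg_match('/^[01][0-2][0-5][0-5]$/', $value) !== 1) {
        throw new InvalidArgumentException("Not a valid CRTO value: {$value}");
    }

    return [
        'C' => (int) $value[0], // CPU-specific optimization flags
        'R' => (int) $value[1], // register allocation
        'T' => (int) $value[2], // trigger
        'O' => (int) $value[3], // optimization level
    ];
}

print_r(decodeJitFlags('tracing'));
print_r(decodeJitFlags('1205'));
```

Reading the value digit by digit, rather than as a number or a bit-mask, is the point: 1254 and 1205 differ only in the T and O positions.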


Both function and tracing JIT configurations make use of CPU instruction sets and CPU register allocation to make the most of CPU capabilities (C=1, R=2).

opcache.jit=function

function is an alias to C=1, R=2, T=0, O=5.

The distinguishing trait of the function configuration is that it is eager: it compiles the whole script as soon as possible. It is a more presumptuous and bold approach, akin to preloading PHP files into Opcache with the preloading feature in PHP 7.4.

opcache.jit=tracing

tracing is an alias to C=1, R=2, T=5, O=4.

With tracing enabled, JIT can be more granular and pick code segments within a function to compile. Ideal candidates would be looping structures, and functions that are called frequently.

This is the default configuration, as it provides a better balance between performance benefits and compilation overhead.


JIT tracing functionality (T=2, 3, or 5) allows further tuning as to how many invocations it takes for a function to be marked as hot, and then eventually JIT compiled.

Directive                   Description                                           Default value
opcache.jit_hot_loop        After how many iterations a loop is considered hot.   64
opcache.jit_hot_func        After how many calls a function is considered hot.    127
opcache.jit_hot_return      After how many returns a return is considered hot.    8
opcache.jit_hot_side_exit   After how many exits a side exit is considered hot.   8

The default values are likely suitable for almost all applications. Lowering them reduces the thresholds, so more code structures get compiled.
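
As an illustration, a configuration that spells out these thresholds explicitly might look like the fragment below. The values are simply the defaults from the table above, and the 256M buffer size is carried over from the earlier example; neither is a tuning recommendation.

```ini
opcache.enable=1
opcache.jit=tracing
opcache.jit_buffer_size=256M

; Hot-code thresholds, written out with their default values
opcache.jit_hot_loop=64
opcache.jit_hot_func=127
opcache.jit_hot_return=8
opcache.jit_hot_side_exit=8
```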

Ideal JIT Configuration

More JIT-compiled code does not necessarily mean a faster application (as seen in the web application benchmarks below). The compilation overhead, coupled with a small buffer, can make applications slower because of the time spent on JIT compilation steps.

The opcache.jit value is better left untouched (the default is tracing), as it already provides a good balance of CPU utilization, memory, and keeping track of which code structures are compiled.

JIT will not bring any meaningful performance benefits to heavily IO-bound applications. The majority of web applications today are in fact IO-heavy, where JIT will hardly make a difference, let alone a positive one.

For the buffer size, take care not to allocate too little memory, which can cause JIT-compiled code to be discarded and result in frequent re-compilations. Too much memory can be overkill too. A value of 50-100% of the current Opcache shared memory used for opcodes might be the ideal value for opcache.jit_buffer_size.
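
The 50-100% rule of thumb is simple arithmetic, sketched below. suggestJitBufferRange() is a hypothetical helper, and treating the configured opcache.memory_consumption (in megabytes) as "the current Opcache shared memory" is an assumption; the memory actually in use may be a better basis.

```php
<?php
declare(strict_types=1);

// suggestJitBufferRange() is a hypothetical helper that applies the
// 50-100% rule of thumb to an Opcache memory size given in megabytes.
function suggestJitBufferRange(int $opcacheMemoryMb): array
{
    if ($opcacheMemoryMb <= 0) {
        throw new InvalidArgumentException('Memory size must be positive');
    }
    return [
        'min' => intdiv($opcacheMemoryMb, 2) . 'M', // 50%
        'max' => $opcacheMemoryMb . 'M',            // 100%
    ];
}

// opcache.memory_consumption is expressed in megabytes. Fall back to its
// documented default of 128 when the setting cannot be read.
$configured = (int) ini_get('opcache.memory_consumption');
$range = suggestJitBufferRange($configured > 0 ? $configured : 128);
echo "opcache.jit_buffer_size between {$range['min']} and {$range['max']}\n";
```

With the default 128 MB of Opcache memory, this suggests a JIT buffer between 64M and 128M.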


JIT Benchmarks

All the tests below were done on an 8-core, 16-thread x86-64 system. The tests, however, never use integers that require 64-bit registers, to keep them relevant to x86 CPUs as well.

PHP Script Benchmark

PHP source includes two benchmark scripts that test various PHP functionality. The micro_bench.php and bench.php files were put to the test on the PHP 8.0 branch (which contains a few bug fixes since the PHP 8.0.0 release).

[Figure: PHP script benchmark]

The first test was done with Opcache completely turned off, and the second with Opcache enabled but JIT turned off.

Both JIT modes bring substantial performance gains, with tracing mode being a little ahead.

This benchmark hardly represents a real-life PHP application. The repetitive calls to the same function, and the simple, side-effect-free nature of the tested code, give JIT an advantage.

PHP Fibonacci Benchmark

A simple Fibonacci function that calculates the 42nd number in the Fibonacci sequence.
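
The exact benchmark script is not reproduced in this article. A naive recursive implementation along the following lines is the usual shape of such a test, and is offered here only as an assumed stand-in. It uses a smaller input than 42 so that it finishes quickly without JIT.

```php
<?php
declare(strict_types=1);

// Naive doubly recursive Fibonacci. It is deliberately inefficient, so the
// run time is dominated by function calls and integer arithmetic.
function fib(int $n): int
{
    return $n < 2 ? $n : fib($n - 1) + fib($n - 2);
}

$n = 30; // the benchmark in this article uses 42, which takes far longer
$start = hrtime(true);                       // nanoseconds, monotonic clock
$result = fib($n);
$elapsedMs = (hrtime(true) - $start) / 1e6;  // convert to milliseconds

printf("fib(%d) = %d in %.1f ms\n", $n, $result, $elapsedMs);
```

Typed, side-effect-free, and call-heavy, this function is close to a best case for JIT.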

[Figure: PHP Fibonacci benchmark]

Fibonacci calculations are all about recursive function calls, and do not tell the full story of a real-life PHP application either, unless it is of course a Fibonacci calculator application.

Fibonacci: PHP vs Other Languages

The same Fibonacci(42) test was also run against compiled languages (such as Go, Rust, and C) and Node.js, which has a JIT too.

[Figure: PHP vs Other Languages with JIT]

PHP 8.0's JIT does not attempt all the optimizations that AOT-compiled languages can perform. It does, however, bring a substantial performance boost, with still more room to improve.

Web Application Benchmark

It is difficult to predict the impact of JIT because it depends highly on the underlying workload. Most of the examples below are the hello-world examples of web frameworks, which do not necessarily represent real-world usage, where various plugins and caching systems are involved.

Applications that make use of a database connection will likely have their biggest bottleneck at the database queries. In a web server test where requests per second are measured, the TLS, HTTP, and FPM overheads might far outweigh the performance difference JIT makes.

Laravel (8.4.4) and Symfony (demo 1.6.3, using 5.1.8 components) with their skeleton applications were tested on the same hardware, in the same scenarios as the previous benchmarks. Both applications were served by the built-in PHP web server, and benchmarked using Apache Bench (ab), with a concurrency of 5 and 100 requests. The results are an average of 5 tests.

[Figure: Laravel and Symfony with JIT]

Neither application received a noticeable benefit, and in Laravel the performance was ~2% worse with JIT, likely because the gains did not outweigh the compilation overhead.


Benchmark Each Application

For real performance benefits, each application needs to be benchmarked to measure whether using JIT brings a noticeable benefit.
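
For CLI code, a rough starting point is to time a representative workload several times and compare the median with JIT on and off. The sketch below is one way to do that; medianRuntimeMs() and the sample workload are hypothetical and should be replaced with real application code.

```php
<?php
declare(strict_types=1);

// medianRuntimeMs() is a hypothetical helper. It runs $task $runs times
// and returns the median wall-clock time in milliseconds.
function medianRuntimeMs(callable $task, int $runs = 5): float
{
    if ($runs < 1) {
        throw new InvalidArgumentException('At least one run is required');
    }
    $samples = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                        // nanoseconds
        $task();
        $samples[] = (hrtime(true) - $start) / 1e6;   // milliseconds
    }
    sort($samples);
    $mid = intdiv($runs, 2);
    return $runs % 2 === 1
        ? $samples[$mid]
        : ($samples[$mid - 1] + $samples[$mid]) / 2;
}

// Placeholder CPU-bound workload; substitute the code under test.
$workload = static function (): void {
    $sum = 0.0;
    for ($i = 1; $i <= 200_000; $i++) {
        $sum += sqrt($i);
    }
};

printf("median: %.2f ms\n", medianRuntimeMs($workload));
```

Running the same script once with opcache.jit_buffer_size set to a non-zero value and once with opcache.jit=off (with opcache.enable_cli=1 in both cases) gives a first impression; web applications need a request-level benchmark instead.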

CLI applications, especially CPU-intensive ones, will likely gain substantial performance improvements.

Network and file-intensive applications such as Composer and PHPUnit will likely not see a performance gain, as they do not benefit much from the machine code improvements. Throw in more SSD/RAM capacity and bandwidth for better results.


Closing Thoughts

JIT is a great step in making PHP perform faster and make use of the capabilities of the underlying hardware. It is the result of many years of effort, and it already shows substantial improvements in computationally intensive workloads.

There is still room for PHP's JIT to improve, and it will likely only get better from this point forward.

A huge thanks to Dmitry Stogov and Nikita Popov for their amazing work on JIT. Nikita also kindly and quickly reviewed the first portion of this article, the part on how JIT works. Thank you ❤🙏🏼.
