JEP draft: Virtual Threads (Preview)
source link: https://openjdk.java.net/jeps/8277131
Summary
Drastically reduce the effort of writing, maintaining, and observing high-throughput concurrent applications that make the best use of available hardware through virtual threads, a lightweight user-mode thread implementation with dramatically reduced costs. This is a Preview Feature.
Goals
- Add an additional implementation of java.lang.Thread, designated virtual threads, that will scale to millions of active instances on heaps of a few gigabytes and exhibit nearly the same behavior as the existing threads (designated platform threads).
- Support the troubleshooting, debugging, and profiling of virtual threads through the existing JDK tools and tooling interfaces, in a manner that is as similar to platform threads as possible.
Non-Goals
- It is not a goal to change the existing implementation of platform threads, which represent operating system (OS) threads.
- It is not a goal to automatically convert existing thread construction to virtual threads.
- It is not a goal to change the Java Memory Model.
- It is not a goal to add new inter-thread communication mechanisms.
- It is not a goal to offer a new data-parallelism construct in addition to parallel streams.
Motivation
Developers have been using Java widely for the past couple of decades to write concurrent applications such as servers, and threads, specifically java.lang.Thread, have served as their core building block. Threads work well to represent an application unit of concurrency, such as a transaction, because the platform and its tooling know about them and track them. The platform attaches troubleshooting context to exceptions in the form of a thread's stack trace; thread dumps give us a snapshot of what the program is doing by grabbing all the threads' stacks; the debugger allows us to step through the execution of a thread; and Java Flight Recorder (JFR) emits events, grouped by thread, for analysis by profilers. These capabilities give us invaluable insight into the program, but only as long as the thread -- the platform's view of the program -- corresponds to the developer's logical view of the application as, say, a collection of concurrent transactions.
Unfortunately, the current implementation of Thread consumes an OS thread for each Java thread, and OS threads are scarce and costly, much more so than, say, sockets. This means that a modern server can handle orders of magnitude more concurrent transactions than it has OS threads. Developers writing high-throughput server software have had to make effective use of hardware so as not to waste it, and so have had to share threads among transactions. They first used thread pools, which loan a thread to a transaction so as to save the cost of creating a new thread for each one. When that was not enough, because the OS simply cannot support as many concurrent threads as there are transactions, developers began returning threads to the pool even in the middle of a transaction, while it waits on I/O. The result is the asynchronous style of programming, which not only requires a separate and incompatible set of APIs but also breaks the connection between the logical application unit (the transaction) and the platform's unit (the thread), making the platform unaware of the application's logical units. As a result, troubleshooting, observation, debugging, and profiling become very difficult, because the platform's context -- the thread -- no longer represents a transaction and so is not very useful. Better hardware utilization is bought at the price of much more difficult development and maintenance, which is itself a form of waste. Developers, then, are forced to choose between a natural style that models logical units of concurrency directly as threads but wastes considerable throughput that their hardware could support, and an asynchronous style that uses the hardware well but is at odds with the platform and its tooling.
Virtual threads -- user-mode implementations of java.lang.Thread -- give us the best of both worlds. When the same synchronous APIs are used on virtual threads, those cheap threads block without blocking any precious OS threads. Hardware utilization is close to optimal, allowing a high level of concurrency and, as a result, high throughput, while the program remains harmonious with the thread-based design of the Java platform and its tooling. Virtual threads are to platform threads what virtual memory is to physical RAM: a mechanism that gives the illusion of a plentiful "virtual" resource through an automatic and efficient mapping to the underlying "physical" resource.
Because virtual threads are cheap and plentiful, patterns of thread use are expected to change. For example, a server that, in the course of serving a request, consults two remote services, awaiting their responses concurrently, might today either submit two blocking HTTP client tasks to some thread pool, or initiate two asynchronous HTTP client tasks that notify some callback upon completion. Instead, it could spawn two virtual threads, each doing nothing other than performing an HTTP client call on behalf of the transaction. This would be as efficient as the asynchronous option -- which, unlike the thread pool option, does not hold on to two precious OS threads for the duration of the requests -- and the code is not only as familiar and simple as the thread pool option, but also safer, because threads are not shared among multiple tasks, which would risk thread-local pollution.
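The two-service scenario described above might look roughly like the following sketch. It assumes a JDK in which virtual threads are available (for example JDK 21); callService is a hypothetical stand-in for a blocking HTTP client call, and the service names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: callService is a hypothetical stand-in for a blocking HTTP client call.
class TwoServiceCalls {

    static String callService(String name) {
        try {
            Thread.sleep(50); // simulate waiting for a remote response
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return name + ":interrupted";
        }
        return name + ":ok";
    }

    // Serves one request by consulting two services concurrently,
    // using one virtual thread per outgoing call.
    static String handleRequest() {
        AtomicReference<String> first = new AtomicReference<>();
        AtomicReference<String> second = new AtomicReference<>();

        Thread t1 = Thread.startVirtualThread(() -> first.set(callService("inventory")));
        Thread t2 = Thread.startVirtualThread(() -> second.set(callService("pricing")));

        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for responses", e);
        }
        return first.get() + "," + second.get();
    }
}
```

Each call runs on its own thread, so nothing is shared with unrelated tasks, and the request-handling code reads as ordinary blocking code.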
There is no need to learn a new programming model to use virtual threads. Anyone who uses Java to write concurrent applications today already knows this model, which is pretty much the same one as Java's original programming model. But we will need to unlearn old habits -- complicated, suboptimal ones -- that arose out of necessity, not elegance, because of threads' high cost; in particular, the use of thread pools that, like all pools, are only useful when the resource they're pooling is scarce and/or costly to create.
Description
Virtual threads are instances of java.lang.Thread implemented by the JDK in a manner that allows a great many active instances to coexist in the same process. They can be created with the java.lang.Thread.Builder interface like so:
Thread thread = Thread.ofVirtual().name("duke").unstarted(runnable);
Whether a thread is virtual can be queried with the Thread::isVirtual method.
In practice, as is the case today, developers will rarely construct virtual threads directly using the builder. Instead, they will use constructs that abstract the creation of threads, possibly taking an instance of a ThreadFactory created with the builder, like so:
ThreadFactory factory = Thread.ofVirtual().factory();
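As a minimal sketch of how the builder and the factory might be exercised together, assuming a JDK with virtual threads available (for example JDK 21). The thread names and the name(prefix, start) overload used for numbering are choices made for this illustration, not something the text above prescribes.

```java
import java.util.concurrent.ThreadFactory;

class BuilderAndFactory {

    // Creates an unstarted virtual thread with the builder and reports what it is.
    static String builderThreadInfo() {
        Thread thread = Thread.ofVirtual().name("duke").unstarted(() -> {});
        return thread.getName() + " virtual=" + thread.isVirtual();
    }

    // Creates a factory whose threads are named with a prefix and a counter.
    static String factoryThreadNames() {
        ThreadFactory factory = Thread.ofVirtual().name("worker-", 0).factory();
        Thread first = factory.newThread(() -> {});
        Thread second = factory.newThread(() -> {});
        return first.getName() + "," + second.getName();
    }
}
```

A factory like this can be handed to any existing API that accepts a ThreadFactory.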
As far as Java code is concerned, the semantics of virtual threads are identical to those of platform threads, except that they all belong to a single ThreadGroup and cannot be enumerated. However, native code called on such threads may observe different behavior; for example, when called multiple times on the same virtual thread, it may observe a different OS thread ID each time. In addition, OS-level monitoring will observe that the process uses fewer OS threads than the number of virtual threads created. Virtual threads are invisible to OS-level monitoring, as the OS is unaware of their existence.
The JDK implements virtual threads by storing their state, including the stack, on the Java heap. Virtual threads are scheduled by a scheduler in the Java class libraries, whose worker threads mount virtual threads on their backs when the virtual threads are executing, thus becoming their carriers. When a virtual thread parks -- say, when it blocks on some I/O operation or a java.util.concurrent synchronization construct -- it suspends, and the virtual thread's carrier is free to run any other task. When a virtual thread is unparked -- say, by an I/O operation completing -- it is submitted to the scheduler, which, when a carrier becomes available, will mount and resume the virtual thread on that carrier, not necessarily the same one it ran on previously. In this way, when a virtual thread performs a blocking operation, instead of parking an OS thread, it is suspended by the JVM and another virtual thread is scheduled in its place, all without blocking any OS threads (but see the Limitations section).
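The effect of parking can be observed with a small experiment: start far more virtual threads than the machine has processors and let each of them block. The sketch below assumes a JDK with virtual threads available (for example JDK 21); the thread count and sleep time are arbitrary values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ParkingSketch {

    // Starts `count` virtual threads that each block in sleep, then waits for all of them.
    // Returns how many ran to completion.
    static int runSleepers(int count) {
        AtomicInteger done = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            threads.add(Thread.startVirtualThread(() -> {
                try {
                    Thread.sleep(100); // the virtual thread parks here
                    done.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return done.get();
    }
}
```

Calling runSleepers(10_000) blocks ten thousand threads at once, a number that would be expensive with one OS thread per Java thread.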
While the carrier thread shares its corresponding OS thread with the virtual thread it mounts, from the perspective of Java code, the carrier and virtual threads are completely separate. The identity of the carrier is not known to the virtual threads, and the two threads’ stack traces are independent.
The JVM Tool Interface (JVM TI) can observe and manipulate virtual threads as it does platform threads, but some operations are not supported, as summarised below and detailed in the JVM TI spec. In particular, JVM TI cannot enumerate all virtual threads. Similarly, the debugger interface JDI supports most operations on virtual threads, but cannot enumerate them. JFR associates events occurring on a virtual thread with the virtual thread. Ordinary thread dumps will show all running platform threads and mounted virtual threads, but a new kind of thread dump is added, and will be described later.
The java.lang.Thread API is updated as follows:
- Thread.Builder, Thread.ofVirtual, and Thread.ofPlatform are added as a new API to create virtual and platform threads.
- Thread.Builder can also be used to create a ThreadFactory.
- Thread.startVirtualThread(Runnable) is added as a convenient way to start a virtual thread.
- Thread::isVirtual is added to test if a thread is a virtual thread.
- Overloads of Thread.join and Thread.sleep are added to allow the wait/sleep time to be provided as a java.time.Duration.
- Thread.getAllStackTraces() is re-specified to return a map of all platform threads rather than all threads.
The java.lang.Thread API is otherwise unchanged. The constructors defined by java.lang.Thread create platform threads as before. No new public constructors have been added.
The API differences between virtual and platform threads are:
- The public constructors cannot be used to create virtual threads.
- Virtual threads are daemon threads. The Thread::setDaemon method cannot be used to change a virtual thread to be a non-daemon thread.
- Virtual threads have a fixed priority, Thread.NORM_PRIORITY, that cannot be changed with the Thread::setPriority method. This limitation may be revisited in a future release.
- Virtual threads are not active members of a thread group. Thread::getThreadGroup returns a placeholder "VirtualThreads" thread group that is empty. The Thread.Builder API cannot be used to set the thread group for a virtual thread.
- Virtual threads have no permissions when running with the SecurityManager set.
- Virtual threads do not support the stop, suspend, or resume methods. These methods are specified to throw an exception if invoked on a virtual thread.
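Some of the differences listed above can be checked directly. The following is a minimal sketch, assuming a JDK with virtual threads available (for example JDK 21); it only queries properties and does not exercise the methods that are specified to throw.

```java
class VirtualThreadTraits {

    // Builds an unstarted virtual thread and reports its daemon status and priority.
    static String describe() {
        Thread thread = Thread.ofVirtual().unstarted(() -> {});
        // Asking for a different priority is accepted but has no effect on a virtual thread.
        thread.setPriority(Thread.MAX_PRIORITY);
        return "virtual=" + thread.isVirtual()
                + " daemon=" + thread.isDaemon()
                + " priority=" + thread.getPriority();
    }
}
```

On such a JDK the reported priority stays at Thread.NORM_PRIORITY (5) and the daemon status is true.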
Virtual threads support thread locals and inheritable thread-locals, just like platform threads, so they can run existing code that uses thread locals.
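As a small illustration of thread locals on a virtual thread, consider the sketch below. It assumes a JDK with virtual threads available (for example JDK 21); the CONTEXT variable and its values are hypothetical.

```java
class ThreadLocalSketch {

    // Hypothetical per-thread context, as existing code might define it.
    private static final ThreadLocal<String> CONTEXT = ThreadLocal.withInitial(() -> "unset");

    // Sets the thread local inside a virtual thread and reports what each thread sees.
    static String valuesSeen() {
        String[] seenInVirtualThread = new String[1];
        Thread thread = Thread.startVirtualThread(() -> {
            CONTEXT.set("request-42");
            seenInVirtualThread[0] = CONTEXT.get();
        });
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        // The calling thread never set a value, so it still sees the initial one.
        return seenInVirtualThread[0] + "/" + CONTEXT.get();
    }
}
```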
In preparation for virtual threads, many usages of thread locals have been eliminated from the java.base module. This should reduce some of the concerns with memory footprint when running with millions of virtual threads.
The Thread.Builder API defines a method to opt out of thread locals when creating a thread. It also defines a method to opt out of inheriting the initial value of inheritable thread locals. When invoked from a thread that does not support thread locals, the ThreadLocal::get method returns the initial value, and the ThreadLocal::set method throws an exception.
The legacy context ClassLoader is re-specified to work like an inheritable thread local. If Thread::setContextClassLoader is invoked on a thread that does not support thread locals then an exception is thrown.
JEP: Scope Locals (Preview) proposes the addition of Scope Locals as a better alternative to thread locals for some use-cases.
LockSupport, the primitive API to support locking, has been updated to support virtual threads. If a virtual thread parks then it releases the underlying carrier thread to do other work if possible. Unparking a virtual thread submits it to the scheduler so that it is scheduled to continue. The update to LockSupport enables all APIs that use it (Semaphores, blocking queues, ...) to park gracefully when used in virtual threads.
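For instance, a Semaphore can be used from virtual threads exactly as from platform threads. The sketch below assumes a JDK with virtual threads available (for example JDK 21); the task count, permit count, and sleep time are arbitrary illustration values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreSketch {

    // Runs `tasks` virtual threads that all contend for `permits` permits.
    // Returns {tasks completed, highest number of tasks seen inside the guarded section}.
    static int[] run(int tasks, int permits) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            threads.add(Thread.startVirtualThread(() -> {
                try {
                    semaphore.acquire(); // parks the virtual thread if no permit is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    peak.accumulateAndGet(inside.incrementAndGet(), Math::max);
                    Thread.sleep(10); // simulated work while holding a permit
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    semaphore.release();
                }
            }));
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new int[] { completed.get(), peak.get() };
    }
}
```

The peak never exceeds the number of permits, which also shows how concurrency can be bounded without pooling threads.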
A small number of APIs are added:
- New methods are added to Future to obtain the result or exception of a completed task. It has also been updated with a new method to obtain the task state as an enum value. Combined, these additions make it easy to use Future objects as elements of streams (filter can test the state, and map can be used to obtain a stream of results). These methods will also be used with the API additions proposed for Structured Concurrency.
- Executors.newThreadPerTaskExecutor and Executors.newVirtualThreadPerTaskExecutor are added to return an ExecutorService that creates a new thread for each task. These can be used for migration and interoperability with existing code that uses thread pools and ExecutorService.
- ExecutorService is retrofitted to extend AutoCloseable, thus allowing this API to be used with the try-with-resources construct.
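These additions can be combined as in the following sketch. It assumes the method names that shipped in later JDK releases (Future.state, Future.resultNow, and ExecutorService.close, as found in JDK 21); the draft text above does not name them, so treat the names as an assumption. The three tasks are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureStateSketch {

    // Submits three tasks, one of which fails, and collects only the successful results.
    static List<Integer> successfulResults() {
        Callable<Integer> one = () -> 1;
        Callable<Integer> failing = () -> { throw new IllegalStateException("boom"); };
        Callable<Integer> three = () -> 3;

        List<Future<Integer>> futures = new ArrayList<>();
        // try-with-resources: close() waits for the submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            futures.add(executor.submit(one));
            futures.add(executor.submit(failing));
            futures.add(executor.submit(three));
        }

        return futures.stream()
                .filter(future -> future.state() == Future.State.SUCCESS) // test the task state
                .map(Future::resultNow)                                   // no checked exceptions
                .toList();
    }
}
```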
The implementations of the networking APIs defined in the java.net and java.nio.channels API packages have been updated to work with virtual threads. An operation that blocks, e.g. establishing a network connection or reading from a socket, will release the underlying carrier thread to do other work.
To allow for interruption and cancellation, the blocking I/O methods defined by java.net.DatagramSocket have been re-specified to be interruptible when invoked in the context of a virtual thread. Interrupting a virtual thread blocked on a socket will unpark the thread and close the socket.
The java.io package provides APIs for streams of bytes and characters. The implementations of these APIs are heavily synchronized and require changes to avoid pinning when they are used from virtual threads.
As background, the byte-oriented input/output streams are not specified to be thread-safe and do not specify the expected behavior when close is invoked while a thread is blocked in a read or write method. In most scenarios it doesn't make sense to use an input or output stream from concurrent threads. The character-oriented readers/writers are also not specified to be thread-safe but they do expose a lock object for sub-classes. Aside from pinning, the synchronization is problematic and inconsistent, e.g. the stream encoder/decoders used by OutputStreamWriter synchronize on the stream rather than the lock object.
As a workaround, to avoid pinning, the implementations are changed as follows:
- PrintWriter are changed to use an explicit lock rather than a monitor when used directly. These classes will synchronize as before when they are sub-classed.
- The stream encoder/decoders used by OutputStreamWriter are changed to use the same lock as the enclosing reader or writer.
- PushbackInputStream::close is changed to not hold a lock when closing the underlying input stream.
Going further and eliminating the locking is beyond the scope of this JEP. A future JEP may re-examine all the locking in this area.
In addition to the changes to locking, the initial sizes of the buffers used by BufferedWriter, and by the underlying stream encoder for OutputStreamWriter implementations, are changed to reduce memory usage when there are many output streams or writers in the heap (as might arise if there are 1M virtual threads, each with a buffered stream on a socket connection).
The scheduler for virtual threads is a work-stealing ForkJoinPool that works in first-in-first-out (async) mode, with parallelism set to the number of available processors.
Some blocking APIs temporarily pin the carrier thread, e.g. most file I/O operations. The implementations of these APIs will compensate for the pinning by temporarily expanding parallelism by means of the ForkJoinPool "managed blocker" mechanism. Consequently, the number of carrier threads may temporarily exceed the number of available processors.
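The managed blocker mechanism is part of the public ForkJoinPool API, so its general shape can be sketched independently of virtual threads. The example below is only an illustration of that public mechanism, not of how the JDK wires it into its blocking APIs; SleepBlocker and the pool size of one are hypothetical choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

class ManagedBlockerSketch {

    // A blocker that sleeps once; the pool may add a spare worker while block() runs.
    static final class SleepBlocker implements ForkJoinPool.ManagedBlocker {
        private final long millis;
        private boolean finished;

        SleepBlocker(long millis) {
            this.millis = millis;
        }

        @Override
        public boolean block() throws InterruptedException {
            Thread.sleep(millis);
            finished = true;
            return true; // no further blocking is necessary
        }

        @Override
        public boolean isReleasable() {
            return finished;
        }
    }

    // Runs blocking tasks in a pool with parallelism 1 and returns how many completed.
    static int runBlockingTasks(int tasks) {
        ForkJoinPool pool = new ForkJoinPool(1);
        try {
            Callable<Boolean> blockingTask = () -> {
                ForkJoinPool.managedBlock(new SleepBlocker(20));
                return true;
            };
            List<ForkJoinTask<Boolean>> submitted = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                submitted.add(pool.submit(blockingTask));
            }
            int completed = 0;
            for (ForkJoinTask<Boolean> task : submitted) {
                if (task.join()) {
                    completed++;
                }
            }
            return completed;
        } finally {
            pool.shutdown();
        }
    }
}
```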
The scheduler may be configured, for tuning purposes, with two system properties:
- jdk.defaultScheduler.parallelism to set the parallelism. It defaults to the number of available processors.
- jdk.defaultScheduler.maxPoolSize to limit the number of carrier threads when parallelism is expanded. It defaults to 256.
Java Native Interface (JNI)
JNI has been updated to define one new function, IsVirtualThread, to test if an object is a virtual thread. The JNI specification is otherwise unchanged.
The debugger architecture consists of three interfaces, namely the JVM Tool Interface (JVM TI), the Java Debug Wire Protocol (JDWP), and the Java Debug Interface (JDI). All three interfaces have been updated to support virtual threads.
JVM TI has been significantly updated as follows:
- Most functions that are called with a jthread (a JNI reference to a Thread object) can be called with a reference to a Thread object for a virtual thread. A small number of functions, namely GetThreadCpuTime, are not supported on virtual threads. The SetLocalXXX functions are only supported on virtual threads in limited cases.
- The GetAllThreads and GetAllStackTraces functions have been re-specified to return all platform threads rather than all threads.
- All events, with the exception of those posted during early VM startup or during heap iteration, may have event callbacks invoked in the context of a virtual thread.
- A new capability, can_support_virtual_threads, is added for agents that are developed, or upgraded, to support virtual threads. This capability allows agents to have finer control over the thread start and end events for virtual threads.
- The suspend/resume implementation has been significantly changed so that virtual threads can be suspended and resumed by debuggers. It also allows carrier threads to be suspended when there is a virtual thread mounted.
- New functions to support bulk suspend/resume of virtual threads are added. The new functions require the can_support_virtual_threads capability.
Existing JVM TI agents will mostly work as before but may encounter errors if they invoke functions that are not supported on virtual threads. This will arise when a "virtual thread unaware" agent is used with an application that uses virtual threads. The change to GetAllThreads to return an array containing only the platform threads may also be an issue for some agents. There may also be performance issues for existing agents that enable the ThreadStart and ThreadEnd events, as they lack the ability to limit the events to only platform threads.
JDWP is updated as follows:
- A new command is added to the protocol to allow debuggers to test if a thread is a virtual thread.
- A new modifier is added to the EventRequest command to allow debuggers to restrict thread start/end events to platform threads.
JDI is updated as follows:
- A new method is added to com.sun.jdi.ThreadReference to test if a thread is a virtual thread.
- A new method is added to com.sun.jdi.request.ThreadDeathRequest to restrict the event generated for the request to only platform threads.
As noted above, virtual threads are not considered to be active threads in a thread group. Consequently, the JVM TI function GetThreadGroupChildren, the JDWP command ThreadGroupReference/Children, and the JDI method com.sun.jdi.ThreadGroupReference::threads return a list of platform threads in the thread group; they do not return a list of virtual threads.
Java Flight Recorder (JFR)
JFR is updated to support virtual threads. A number of new events are added:
- jdk.VirtualThreadStart and jdk.VirtualThreadEnd for virtual thread start and end. These events are disabled by default.
- jdk.VirtualThreadPinned to indicate that a virtual thread parked while pinned (see the Limitations section). This event is enabled by default with a threshold of 20ms.
- jdk.VirtualThreadSubmitFailed to indicate that starting or unparking a virtual thread failed, probably due to a resource issue. This event is enabled by default.
Troubleshooting and Diagnosability
A new thread dump implementation is added that supports virtual threads in addition to the platform threads. Virtual threads that are blocked in network I/O operations, or created by the "new thread per task" ExecutorService listed above, are included in the thread dump. The new thread dump does not include object addresses, locks, JNI stats, heap stats, and other information that appears in a regular HotSpot VM thread dump. The new thread dump is written in JSON format to make it easy to parse. The JSON output has an array of "thread containers", with one for each thread pool (ForkJoinPool) and thread-per-task executor.
A new method/operation is added to com.sun.management.HotSpotDiagnosticMXBean to generate thread dumps with the new implementation. This can be used directly, or indirectly via the platform MBeanServer from a local or remote JMX tool.
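Direct use of the management interface might look like the sketch below. The text above does not name the new method; the sketch assumes the dumpThreads method and ThreadDumpFormat enum found on com.sun.management.HotSpotDiagnosticMXBean in JDK 21, which may differ from what this draft proposed. The temporary directory and file name are arbitrary.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class ThreadDumpSketch {

    // Writes a JSON thread dump to a fresh file and returns the size of that file in bytes.
    static long dumpToJson() {
        try {
            Path dir = Files.createTempDirectory("thread-dump");
            Path file = dir.resolve("threads.json"); // the target file must not exist yet
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            // Assumed JDK 21 signature: dumpThreads(absolute path, format).
            bean.dumpThreads(file.toAbsolutePath().toString(), ThreadDumpFormat.JSON);
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```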
A new command is added to jcmd to use the new thread dump implementation:
jcmd <pid> JavaThread.dump -format=json <file>
As listed in the Java Flight Recorder section, a JFR event is emitted when a thread is pinned while attempting to park, either with a native frame on the stack or while holding a monitor. A development-time system property, jdk.tracePinnedThreads, is added to print a stack trace to System.out when a thread is pinned. Running with -Djdk.tracePinnedThreads=full prints a complete stack trace when a thread is pinned, with the native frames and frames holding monitors highlighted. Running with -Djdk.tracePinnedThreads=short limits the output to just the problematic frames.
Degrade java.lang.ThreadGroup API
java.lang.ThreadGroup is a legacy API for grouping threads that is rarely used in modern applications and is not the right API for grouping virtual threads. It is deprecated and significantly degraded to "make space" for a new construct for organizing threads in the future (see JEP: Structured Concurrency (Preview)).
As background, the ThreadGroup API dates from JDK 1.0 and was intended as a form of job control for threads, e.g. "stop all threads". Modern code is more likely to use the thread pool APIs provided by the java.util.concurrent API since Java 5.
ThreadGroup supported the isolation of applets in early JDK releases. The Java security architecture evolved significantly in Java 1.2, with thread groups no longer having a significant role.
ThreadGroup was also intended to be useful for diagnostic purposes, but that aspect has been superseded by the monitoring and management support and the java.lang.management API since Java 5. Aside from relevance, the ThreadGroup API and implementation have a number of significant problems, including:
- The API and mechanism to destroy thread groups is flawed.
- The API requires the implementation to have a reference to all live threads in the group. This adds synchronization and contention overhead to thread creation, thread start, and thread termination.
- The enumerate methods are inherently racy and flawed.
- The stop methods are inherently deadlock prone and unsafe.
ThreadGroup is re-specified, deprecated, and degraded as follows:
- The public constructors are deprecated.
- The ability to explicitly destroy a thread group is removed. The terminally deprecated destroy method is degraded to be a no-op.
- The notion of daemon thread group is removed. The terminally deprecated setDaemon method is degraded to be a no-op.
- The implementation no longer keeps a reference to the threads in the group, meaning the internal "threads" array is removed. The enumerate methods are re-implemented to use the VM thread list. The corresponding methods on
- The implementation no longer keeps a strong reference to sub-groups. Thread groups are now eligible to be GC'ed when there are no live threads in the group and there is nothing else keeping the thread group alive.
- The terminally deprecated stop methods are degraded to throw an exception.
Limitations
There are situations when the VM cannot suspend a virtual thread, in which case it is said to be pinned. Currently, there are two:
- When a native method is currently executing in the virtual thread (even if it is calling back into Java).
- When a native monitor is held by the virtual thread, meaning it is currently executing inside a synchronized block or method.
The first limitation is here to stay, while the second might be removed in the future.
When a virtual thread tries to park while pinned, say, by performing a blocking I/O operation, its underlying OS thread is not released but remains blocked for the duration of the operation. For this reason, very frequent pinning for long durations might harm the scalability of virtual threads.
Therefore, to gain the most out of virtual threads, synchronized blocks or methods that are run frequently and guard potentially long I/O operations should be replaced with java.util.concurrent.locks.ReentrantLock. There is no need to replace synchronized blocks and methods that are infrequent (say, only performed at startup) or guard in-memory operations, although it's always a good idea to consider java.util.concurrent.locks.StampedLock for the latter case. As always, keeping a locking policy simple and clear should be a priority.
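A replacement of this kind is mostly mechanical. The following is a minimal sketch in which GuardedResource and its write method are hypothetical names; slowIo stands in for a potentially long I/O operation that previously ran inside a synchronized method.

```java
import java.util.concurrent.locks.ReentrantLock;

class GuardedResource {

    private final ReentrantLock lock = new ReentrantLock();
    private int writes;

    // Previously: synchronized int write(Runnable slowIo) { ... }
    // With a ReentrantLock, a virtual thread that blocks inside slowIo is not pinned
    // by the lock and can release its carrier.
    int write(Runnable slowIo) {
        lock.lock();
        try {
            slowIo.run();
            return ++writes;
        } finally {
            lock.unlock(); // always release, even if slowIo throws
        }
    }
}
```

The lock/try/finally shape keeps the same mutual exclusion as the synchronized version.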
To assist in migration and help assess whether a particular use of synchronized should be considered for replacement with a j.u.c lock, the JFR event jdk.VirtualThreadPinned will be emitted when a virtual thread attempts to park while pinned (with a default threshold of 20ms). See also the Troubleshooting section for further diagnostics of pinning.
Alternatives
- Continue to rely on asynchronous APIs. Asynchronous APIs are hard to integrate with synchronous ones, create a split world of two representations of the same I/O operations, and provide no unified concept of a sequence of operations that can be used by the platform to provide a context for such a unit for troubleshooting, monitoring, debugging, and profiling purposes.
- Introduce a new API to represent lightweight threads. A break from the past provides an opportunity to get away from the baggage that java.lang.Thread has accumulated over 25 years. Several alternatives were explored and prototyped. The explorations into completely new APIs grappled with the issue of how to run existing code. The use of Thread.currentThread, directly or indirectly, is pervasive, e.g. for lock ownership. The use of thread locals is pervasive. In order to run existing code it is necessary for Thread.currentThread to return something that represents a Thread object for the current thread of execution. It would be confusing to have two objects to represent the current thread of execution.
Testing
- The existing tests in the OpenJDK repository will be used to ensure that the changes do not cause any unexpected regressions in the multitude of configurations and execution modes in which they are run.
- New tests will exercise all new and changed APIs.
- New tests will exercise the areas changed to support virtual threads.
- The jtreg test harness is modified to allow existing jtreg tests run in the context of a virtual thread. This avoids needing to have two versions of many tests.
- New stress tests will target the areas that are critical to reliability and performance.
- New micro benchmarks will target the performance critical areas.
- A number of existing servers, including Helidon and Jetty, will be used for larger scale testing.
Risks and Assumptions
The primary risks of this proposal are ones of compatibility due to changes in existing APIs and their implementation:
- The internal (and undocumented) locking used by several APIs in the java.io package has changed. More specifically, the locking in PrintWriter has changed. This may impact code that assumes that I/O operations synchronize on the stream. This change has no impact on code that extends these classes and assumes locking by the super class. It also does not impact code that extends java.io.Reader or java.io.Writer and uses the lock object exposed by those APIs.
- java.lang.ThreadGroup is significantly changed. The ability to explicitly destroy a ThreadGroup has been removed. The notion of daemon ThreadGroup has been removed. The stop methods have been changed to throw an exception.
In addition, there are several behavioural differences between platform and virtual threads that may be observed when using existing code with newer code that takes advantage of virtual threads or the new APIs:
- The stop, suspend, and resume methods are specified to throw UnsupportedOperationException when invoked on a virtual thread.
- The setPriority method is a no-op when invoked on a virtual thread (as the priority of virtual threads is always Thread.NORM_PRIORITY).
- The setDaemon method throws UnsupportedOperationException if invoked to change a virtual thread to be a non-daemon thread.
- The Thread API has been updated to support the creation of threads that do not support thread locals.
- Thread::setContextClassLoader has been changed to throw UnsupportedOperationException if invoked in the context of a thread that does not support thread locals.
- Thread.getAllStackTraces has been re-specified to return a map of all platform threads rather than all threads.
- The blocking I/O methods defined by java.net.DatagramSocket have been re-specified to be interruptible when invoked in the context of a virtual thread. It may be surprising to existing code that interrupting a thread blocked on a socket will cause the thread to wake up and the socket to be closed.
- Virtual threads are not active members of a thread group. Invoking Thread::getThreadGroup on a virtual thread will return a dummy "VirtualThreads" group that is empty.
- Virtual threads have no permissions when running with the SecurityManager set.
- The java.lang.management.ThreadMXBean API has been re-specified to support the monitoring and management of platform threads. It does not support virtual threads.
- JVM Tool Interface (JVM TI)
  - As listed above, a number of functions are not supported on virtual threads. The JVM TI spec has more details.
  - GetAllThreads and GetAllStackTraces have been re-specified to return all platform threads rather than all threads.
  - GetThreadGroupChildren has been re-specified to only return platform threads.
- Java Debug Wire Protocol (JDWP)
  - The VirtualMachine/AllThreads command has been re-specified to return all platform threads rather than all threads.
  - The ThreadGroupReference/Children command has been re-specified to only return platform threads.
Dependencies
- JEP 416: Reimplement Core Reflection with Method Handles in JDK 18 removes the use of the VM native reflection implementation. This allows virtual threads to park gracefully when operations are invoked reflectively.
- JEP 353: Reimplement the Legacy Socket API in JDK 13, and JEP 373: Reimplement the Legacy DatagramSocket API in JDK 15, replace the underlying implementations of java.net.Socket, java.net.ServerSocket, and java.net.DatagramSocket. The new implementations are designed for use with virtual threads.
- JEP 418: Internet-Address Resolution SPI in JDK 18 defines a service-provider interface for host name and address lookup. This will allow an alternative resolver for java.net.InetAddress to be deployed that does not pin threads during host lookup.