JEP 358: Helpful NullPointerExceptions

source link: https://openjdk.java.net/jeps/358
Authors: Goetz Lindenmaier, Ralf Schmelter
Owner: Goetz Lindenmaier
Type: Feature
Scope: JDK
Status: Closed / Delivered
Release: 14
Component: hotspot / runtime
Discussion: hotspot dash runtime dash dev at openjdk dot java dot net, core dash libs dash dev at openjdk dot java dot net
Effort: S
Duration: S
Reviewed by: Alex Buckley, Coleen Phillimore
Endorsed by: Mikael Vidstedt
Created: 2019/03/15 10:27
Updated: 2020/03/19 19:03
Issue: 8220715

Summary

Improve the usability of NullPointerExceptions generated by the JVM by describing precisely which variable was null.

Goals

  • Offer helpful information to developers and support staff about the premature termination of a program.

  • Improve program understanding by more clearly associating a dynamic exception with static program code.

  • Reduce the confusion and concern that new developers often have about NullPointerExceptions.

Non-Goals

  • It is not a goal to track down the ultimate producer of a null reference, only the unlucky consumer.

  • It is not a goal to throw more NullPointerExceptions, or to throw them at a different point in time.

Motivation

Every Java developer has encountered NullPointerExceptions (NPEs). Since NPEs can occur almost anywhere in a program, it is generally impractical to attempt to catch and recover from them. As a result, developers rely on the JVM to pinpoint the source of an NPE when it actually occurs. For example, suppose an NPE occurs in this code:

a.i = 99;

The JVM will print out the method, filename, and line number that caused the NPE:

Exception in thread "main" java.lang.NullPointerException
    at Prog.main(Prog.java:5)

Using the message, which is typically included in a bug report, the developer can locate a.i = 99; and infer that a must have been null. However, for more complex code, it is impossible to decide which variable was null without using a debugger. Suppose an NPE occurs in this code:

a.b.c.i = 99;

The filename and line number do not pinpoint exactly which variable was null. Was it a or b or c?

A similar problem occurs with array access and assignment. Suppose an NPE occurs in this code:

a[i][j][k] = 99;

The filename and line number do not pinpoint exactly which array component was null. Was it a or a[i] or a[i][j]?

A single line of code may contain several access paths, each one potentially the source of an NPE. Suppose an NPE occurs in this code:

a.i = b.j;

The filename and line number do not pinpoint the offending access path. Was a null, or b?

Finally, an NPE could stem from a method call. Suppose an NPE occurs in this code:

x().y().i = 99;

The filename and line number do not pinpoint which method call returned null. Was it x() or y()?

Various strategies can mitigate the lack of accurate pinpointing by the JVM. For example, a developer faced with an NPE can break up the access paths by assigning to intermediate local variables. (The var keyword may be helpful here.) The result will be a more accurate report of the null variable in the JVM's message, but reformatting code to track down an exception is undesirable. In any case, most NPEs occur in production environments, where the support engineer who observes the NPE is many steps removed from the developer whose code caused it.
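As a rough illustration of the workaround just described, the access path a.b.c.i = 99 can be split into one dereference per line, so that the line number alone identifies the null link. The class names SplitPath, A, B and C and the helper failingLine are hypothetical, chosen only for this sketch.

```java
public class SplitPath {
    static class C { int i; }
    static class B { C c; }
    static class A { B b; }

    // One dereference per line, so the line number in the stack trace
    // identifies which link of the chain was null.
    static void assign(A a) {
        var b = a.b;   // an NPE on this line means a was null
        var c = b.c;   // an NPE on this line means a.b was null
        c.i = 99;      // an NPE on this line means a.b.c was null
    }

    // Returns the source line of the failing statement, or -1 if nothing failed.
    static int failingLine(A a) {
        try {
            assign(a);
            return -1;
        } catch (NullPointerException e) {
            return e.getStackTrace()[0].getLineNumber();
        }
    }

    public static void main(String[] args) {
        A onlyA = new A();   // onlyA.b is left null
        System.out.println("a null:   line " + failingLine(null));
        System.out.println("a.b null: line " + failingLine(onlyA));
    }
}
```

The cost is visible in the sketch: three statements and two extra locals replace one assignment, purely to make the stack trace more informative.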

The entire Java ecosystem would benefit if the JVM could give the information needed to pinpoint the source of an NPE and then identify its root cause, without using extra tooling or shuffling code around. SAP's commercial JVM has done this since 2006, to great acclaim from developers and support engineers.

Description

The JVM throws a NullPointerException (NPE) at the point in a program where code tries to dereference a null reference. By analyzing the program's bytecode instructions, the JVM will determine precisely which variable was null, and describe the variable (in terms of source code) with a null-detail message in the NPE. The null-detail message will then be shown in the JVM's message, alongside the method, filename, and line number.

Note: The JVM displays an exception message on the same line as the exception type, which can result in long lines. For readability in a web browser, this JEP shows the null-detail message on a second line, after the exception type.

For example, an NPE from the assignment statement a.i = 99; would generate this message:

Exception in thread "main" java.lang.NullPointerException: 
        Cannot assign field "i" because "a" is null
    at Prog.main(Prog.java:5)

If the more complex statement a.b.c.i = 99; throws an NPE, the message would dissect the statement and pinpoint the cause by showing the full access path which led up to the null:

Exception in thread "main" java.lang.NullPointerException: 
        Cannot read field "c" because "a.b" is null
    at Prog.main(Prog.java:5)

Giving the full access path is more helpful than giving just the name of the null field because it helps the developer to navigate a line of complex source code, especially if the line of code uses the same name multiple times.

Similarly, if the array access and assignment statement a[i][j][k] = 99; throws an NPE:

Exception in thread "main" java.lang.NullPointerException:
        Cannot load from object array because "a[i][j]" is null
    at Prog.main(Prog.java:5)

Similarly, if a.i = b.j; throws an NPE:

Exception in thread "main" java.lang.NullPointerException:
        Cannot read field "j" because "b" is null
    at Prog.main(Prog.java:5)

In every example, the null-detail message in conjunction with the line number is sufficient to spot the expression that is null in the source code. Ideally, the null-detail message would show the actual source code, but this is difficult to do given the nature of the correspondence between source code and bytecode instructions (see below). In addition, when the expression involves an array access, the null-detail message is unable to show the actual array indices which led to a null element, such as the run-time values of i and j when a[i][j] is null. This is because the array indices were stored on the method's operand stack, which was lost when the NPE was thrown.
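The messages above can be observed with a small program such as the following sketch. The class HelpfulNpeDemo, its nested types and its helper methods are hypothetical names. Static fields are used so that the access path does not depend on whether the class file contains a local variable table. If the null-detail message is not enabled on the running JVM, the program simply prints null.

```java
public class HelpfulNpeDemo {
    static class C { int i; }
    static class B { C c; }
    static class A { B b; }

    static A a = new A();                    // a.b is deliberately left null
    static int[][][] arr = new int[1][][];   // arr[0] is deliberately left null

    // Runs the action and returns the message of the NPE it throws, as text.
    static String messageOf(Runnable action) {
        try {
            action.run();
            return "(no exception)";
        } catch (NullPointerException e) {
            return String.valueOf(e.getMessage());
        }
    }

    // Fails while reading field c, because a.b is null.
    static String fieldChain() {
        return messageOf(() -> a.b.c.i = 99);
    }

    // Fails while loading arr[0][0], because arr[0] is null.
    static String arrayChain() {
        return messageOf(() -> arr[0][0][0] = 99);
    }

    public static void main(String[] args) {
        System.out.println(fieldChain());
        System.out.println(arrayChain());
    }
}
```

With the feature enabled, the two lines should have the same shape as the field and array examples above, with the static fields rendered as class-qualified names.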

Only NPEs that are created and thrown directly by the JVM will include the null-detail message. NPEs that are explicitly created and/or explicitly thrown by programs running on the JVM are not subject to the bytecode analysis and null-detail message creation described below. In addition, the null-detail message is not reported for NPEs caused by code in hidden methods, which are special-purpose low-level methods generated and called by the JVM to, e.g., optimize string concatenation. A hidden method has no filename or line number that could help to pinpoint the source of an NPE, so printing a null-detail message would be futile.
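The distinction between JVM-thrown and program-created NPEs can be seen in a sketch like this one. ExplicitNpe and its method names are hypothetical. The third method's result depends on whether the null-detail message is enabled on the running JVM.

```java
import java.util.Objects;

public class ExplicitNpe {
    // Created and thrown by program code, without a message: nothing is added.
    static String createdByProgram() {
        try {
            throw new NullPointerException();
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    // Created by library code with an explicit message: that message is kept.
    static String createdWithMessage() {
        try {
            Objects.requireNonNull(null, "config must not be null");
            return "(no exception)";
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    // Thrown by the JVM itself when it dereferences null: eligible for a
    // null-detail message if the feature is enabled, otherwise null.
    static String thrownByJvm() {
        String s = null;
        try {
            return "length " + s.length();
        } catch (NullPointerException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(createdByProgram());
        System.out.println(createdWithMessage());
        System.out.println(thrownByJvm());
    }
}
```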

Computing the null-detail message

Source code such as a.b.c.i = 99; is compiled to several bytecode instructions. When an NPE is thrown, the JVM knows exactly which bytecode instruction in which method is responsible, and uses this information to compute the null-detail message. The message has two parts:

  • The first part -- Cannot read field "c" -- is the consequence of the NPE. It says which action could not be performed because a bytecode instruction popped a null reference from the operand stack.

  • The second part -- because "a.b" is null -- is the reason for the NPE. It recreates the part of the source code that pushed the null reference on to the operand stack.

The first part of the null-detail message is computed from the bytecode instruction that popped null, as detailed here in Table 1:

bytecode | 1st part
array loads (aaload, baload, caload, daload, faload, iaload, laload, saload) | "Cannot load from <element type> array"
arraylength | "Cannot read the array length"
array stores (aastore, bastore, castore, dastore, fastore, iastore, lastore, sastore) | "Cannot store to <element type> array"
athrow | "Cannot throw exception"
getfield | "Cannot read field "<field name>""
invokeinterface, invokespecial, invokevirtual | "Cannot invoke "<method>""
monitorenter | "Cannot enter synchronized block"
monitorexit | "Cannot exit synchronized block"
putfield | "Cannot assign field "<field name>""
Any other bytecode | No NPE possible, no message

<method> breaks down to <class name>.<method name>(<parameter types>)
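Table 1 amounts to a lookup from the failing instruction to a fixed phrase. The following is a minimal sketch of that lookup, not the JVM's implementation. FirstPart, firstPart and elementType are hypothetical names. The sketch reads the table's load and store rows as the array instruction families (aaload, iastore and so on), which is how the worked example later in this document uses them. Only the element type names "int" and "object" appear in this document, so the others are assumptions.

```java
public class FirstPart {
    // Element type wording for an array instruction, keyed by its first letter.
    // Only "int" and "object" appear in this document; the rest are assumed.
    static String elementType(char prefix) {
        switch (prefix) {
            case 'a': return "object";
            case 'b': return "byte/boolean";
            case 'c': return "char";
            case 'd': return "double";
            case 'f': return "float";
            case 'i': return "int";
            case 'l': return "long";
            case 's': return "short";
            default:  throw new IllegalArgumentException("not an array instruction: " + prefix);
        }
    }

    // Returns the first part of the message, or null if the instruction
    // cannot raise an NPE. The operand is a field name or a method description.
    static String firstPart(String mnemonic, String operand) {
        if (mnemonic.length() == 6 && mnemonic.endsWith("aload")) {
            return "Cannot load from " + elementType(mnemonic.charAt(0)) + " array";
        }
        if (mnemonic.length() == 7 && mnemonic.endsWith("astore")) {
            return "Cannot store to " + elementType(mnemonic.charAt(0)) + " array";
        }
        switch (mnemonic) {
            case "arraylength":  return "Cannot read the array length";
            case "athrow":       return "Cannot throw exception";
            case "getfield":     return "Cannot read field \"" + operand + "\"";
            case "putfield":     return "Cannot assign field \"" + operand + "\"";
            case "invokeinterface":
            case "invokespecial":
            case "invokevirtual":
                return "Cannot invoke \"" + operand + "\"";
            case "monitorenter": return "Cannot enter synchronized block";
            case "monitorexit":  return "Cannot exit synchronized block";
            default:             return null;   // no NPE possible, no message
        }
    }
}
```

The length checks keep the local variable instructions aload and astore out of the array cases, since those instructions never dereference a reference.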

The second part of the null-detail message is more complex. It identifies the access path that led to a null reference on the operand stack, but complex access paths involve several bytecode instructions. Given a sequence of instructions in a method, it is not obvious which previous instruction pushed the null reference. Accordingly, a simple data flow analysis is performed on all the method's instructions. It computes which instruction pushes to which operand stack slot, and propagates this information to the instruction which pops the slot. (The analysis is linear in the number of instructions.) Given the analysis, it is possible to step back through the instructions which make up an access path in source code. The second part of the message is assembled step-by-step, given the bytecode instruction at each step as detailed here in Table 2:

bytecode | 2nd part
aconst_null | "null"
aaload | compute the 2nd part for the instruction which pushed the array reference, then append "[", then compute the 2nd part for the instruction that pushed the index, then append "]"
iconst_*, bipush, sipush | the constant value
getfield | compute the 2nd part for the instruction which pushed the reference that is accessed by this getfield, then append ".<field name>"
getstatic | "<class name>.<field name>"
invokeinterface, invokevirtual, invokespecial, invokestatic | If in the first step, "the return value of <method>", else "<method>"
iload*, aload* | For local variable 0, "this". For other local variables and parameters, the variable name if a local variable table is available, otherwise "<parameterN>" or "<localN>", where N is a number identifying the parameter or local variable.
Any other bytecode | Not applicable to the second part.
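The data flow analysis that precedes the use of Table 2 can be pictured, for straight-line code, as a simulation of the operand stack that records which instruction pushed each slot. The sketch below makes strong simplifying assumptions: no branches, no two-slot values (long, double), and no stack-shuffling instructions such as dup. ProducerAnalysis, Insn and producers are hypothetical names, and the instruction sequence is the one used in the worked example later in this document.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ProducerAnalysis {
    // Hypothetical instruction model: bytecode index, mnemonic,
    // number of stack slots popped, number of stack slots pushed.
    static final class Insn {
        final int bci;
        final String name;
        final int pops;
        final int pushes;

        Insn(int bci, String name, int pops, int pushes) {
            this.bci = bci;
            this.name = name;
            this.pops = pops;
            this.pushes = pushes;
        }
    }

    // The bytecode of a().b[i][j] = 99;
    static final Insn[] EXAMPLE = {
        new Insn(5,  "invokestatic", 0, 1),
        new Insn(8,  "getfield",     1, 1),
        new Insn(11, "iload_1",      0, 1),
        new Insn(12, "aaload",       2, 1),
        new Insn(13, "iload_2",      0, 1),
        new Insn(14, "bipush",       0, 1),
        new Insn(16, "iastore",      3, 0),
    };

    // For each instruction, returns the bytecode indices of the instructions
    // that pushed the slots it pops, deepest slot first. One pass over the
    // code, so the work is linear in the number of instructions.
    static int[][] producers(Insn[] code) {
        Deque<Integer> stack = new ArrayDeque<>();   // holds producer indices
        int[][] result = new int[code.length][];
        for (int n = 0; n < code.length; n++) {
            Insn insn = code[n];
            int[] popped = new int[insn.pops];
            for (int k = insn.pops - 1; k >= 0; k--) {
                popped[k] = stack.pop();
            }
            result[n] = popped;
            for (int k = 0; k < insn.pushes; k++) {
                stack.push(insn.bci);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] inputs = producers(EXAMPLE)[6];   // inputs of 16: iastore
        System.out.println("array reference pushed by bytecode " + inputs[0]);
    }
}
```

For the iastore at index 16, the deepest popped slot is the array reference, and the simulation attributes it to bytecode 12, the aaload.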

Access paths can be made up of an arbitrary number of bytecode instructions. The null-detail message does not necessarily cover all of these. The algorithm takes only a limited number of steps back through the instructions in order to limit the complexity of the output. If the maximum number of steps is reached, placeholders such as "..." are emitted. In rare cases, stepping back over instructions is not possible, and then the null-detail message will contain only the first part ("Cannot ...", with no "because ..." explanation).

The null-detail message -- Cannot read field "c" because "a.b" is null -- is computed on demand, when the JVM calls Throwable::getMessage while printing its message. Usually, a message carried by an exception must be supplied when the exception object is created, but the computation is expensive and may not always be needed, since many NPEs are caught and discarded by programs. The computation requires the bytecode instructions of the method which caused the NPE, and the index of the instruction which popped null; fortunately, the implementation of Throwable includes this information about the origin of the exception.
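The on-demand approach can be illustrated in plain Java with an exception that defers an expensive message computation until getMessage is first called. This is only an analogy for the design choice, not how the JVM implements it. LazyMessageException is a hypothetical class.

```java
import java.util.function.Supplier;

public class LazyMessageException extends RuntimeException {
    private static final long serialVersionUID = 1L;

    // The expensive computation, run at most once and only if asked for.
    private final transient Supplier<String> compute;
    private transient String cached;
    private transient boolean computed;

    public LazyMessageException(Supplier<String> compute) {
        this.compute = compute;
    }

    @Override
    public synchronized String getMessage() {
        if (!computed) {
            cached = (compute == null) ? null : compute.get();
            computed = true;
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyMessageException e = new LazyMessageException(() -> {
            System.out.println("computing message...");
            return "detail computed on demand";
        });
        // Nothing has been computed yet; the supplier runs on the next line.
        System.out.println(e.getMessage());
    }
}
```

An exception that is caught and discarded never pays for the computation, which is the trade-off the paragraph above describes.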

The feature can be toggled with the new boolean command-line option -XX:{+|-}ShowCodeDetailsInExceptionMessages. The option initially defaults to false, so the message is not printed. The intent is to enable code details in exception messages by default in a later release.
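Whether the option is in effect can be checked from inside a program by provoking an NPE and looking for a message. The sketch below does only that; NpeDetailProbe and detailsEnabled are hypothetical names. It could be run once with -XX:+ShowCodeDetailsInExceptionMessages and once with -XX:-ShowCodeDetailsInExceptionMessages to compare the results.

```java
public class NpeDetailProbe {
    // Dereferences null on purpose and reports whether the JVM attached
    // a message to the resulting NPE.
    static boolean detailsEnabled() {
        Object o = null;
        try {
            o.hashCode();
            return false;   // not reached: the call above always throws
        } catch (NullPointerException e) {
            return e.getMessage() != null;
        }
    }

    public static void main(String[] args) {
        System.out.println("null-detail messages enabled: " + detailsEnabled());
    }
}
```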

Example of computing the null-detail message

Here is an example based on the following snippet of source code:

a().b[i][j] = 99;

The source code has the following representation in bytecode:

   5: invokestatic  #7    // Method a:()LA;
   8: getfield      #13   // Field A.b, an array
  11: iload_1             // Load local variable i, an array index
  12: aaload              // Load b[i], another array
  13: iload_2             // Load local variable j, another array index
  14: bipush        99
  16: iastore             // Store to b[i][j]

Suppose a().b[i] is null. This will cause an NPE to be thrown when storing to b[i][j]. The JVM will execute bytecode 16: iastore and throw an NPE because bytecode 12: aaload pushed null on to the operand stack. The null-detail message will be computed as follows:

Cannot store to int array because "Test.a().b[i]" is null

The computation starts with the method containing the bytecode instructions, and the bytecode index 16. Since the instruction at index 16 is iastore, the first part of the message is "Cannot store to int array", per Table 1.

For the second part of the message, the algorithm steps back to the instruction that pushed the null which iastore was unfortunate enough to pop. Data flow analysis reveals this is 12: aaload, an array load. Per Table 2, when an array load is responsible for a null array reference, we step back to the instruction which pushed the array reference (rather than the array index) on to the operand stack, 8: getfield. Then again per Table 2, when a getfield is part of the access path, we step back to the instruction that pushed the reference used by getfield, 5: invokestatic. We can now assemble the second part of the message:

  • For 5: invokestatic, emit "Test.a()"
  • For 8: getfield, emit ".b"
  • For 12: aaload, emit "[" and step back to the instruction that pushed the index, 11: iload_1. Emit "i", the name of local variable #1, then "]".

The algorithm never steps to 13: iload_2 which pushes the index j, or to 14: bipush which pushes 99, because they are not related to the cause of the NPE.
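The walk-through above can be replayed with a small recursive function over a hand-built model of the instructions involved. This is a sketch of the Table 2 rules under simplifying assumptions, not the JVM's code. SecondPart, Insn, describe and message are hypothetical names, each instruction is given direct links to the instructions that produced its inputs (in the JVM these links come from the data flow analysis), and the placement of quotes around a method's return value is an assumption.

```java
public class SecondPart {
    // Hypothetical node: a bytecode plus the instructions that produced its inputs.
    static final class Insn {
        final String op;     // mnemonic
        final String text;   // method, field or variable name, or constant value
        final Insn ref;      // producer of the object or array reference, if any
        final Insn index;    // producer of the array index, if any

        Insn(String op, String text, Insn ref, Insn index) {
            this.op = op;
            this.text = text;
            this.ref = ref;
            this.index = index;
        }
    }

    // Table 2, applied recursively starting from the instruction that pushed null.
    static String describe(Insn insn) {
        switch (insn.op) {
            case "aconst_null":
                return "null";
            case "aaload":
                return describe(insn.ref) + "[" + describe(insn.index) + "]";
            case "getfield":
                return describe(insn.ref) + "." + insn.text;
            default:
                // getstatic, invoke*, constants and local variables: text as given
                return insn.text;
        }
    }

    // Joins the two parts. The "return value of" wording applies only when the
    // instruction that pushed null is itself a method call (the first step).
    static String message(String firstPart, Insn pushedNull) {
        String path = "\"" + describe(pushedNull) + "\"";
        if (pushedNull.op.startsWith("invoke")) {
            path = "the return value of " + path;
        }
        return firstPart + " because " + path + " is null";
    }

    public static void main(String[] args) {
        Insn call  = new Insn("invokestatic", "Test.a()", null, null);   //  5
        Insn field = new Insn("getfield", "b", call, null);              //  8
        Insn i     = new Insn("iload", "i", null, null);                 // 11
        Insn load  = new Insn("aaload", null, field, i);                 // 12
        System.out.println(message("Cannot store to int array", load));
    }
}
```

The instructions at 13 and 14 never appear in the model, mirroring the observation that the algorithm does not visit them.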

Files with many examples of null-detail messages are attached to this JEP: output_with_debug_info.txt lists messages when class files contain a local variable table, and output_no_debug_info.txt lists messages when class files do not contain a local variable table.

Alternatives

The presence of the null-detail message

The JVM could use other means to supply null-detail information, such as writing to stdout or using a tracing or logging facility. However, exceptions are the standard way to report problems on the JVM, and NPE already gives information about where the exception was raised by including the stack trace with line number information. As this information is insufficient to locate the cause, it is natural to enhance NPE by adding the missing information.

The null-detail message is switched off by default and can be enabled with the command-line option -XX:+ShowCodeDetailsInExceptionMessages. There is no way to specify that only some NPE-raising bytecodes are of interest. For the following reasons the null-detail message might not be wanted in all circumstances:

  1. Performance. The algorithm adds some overhead to the production of a stack trace. However, this is comparable to the stack walking done when raising the exception. If an application throws and prints exceptions so frequently that the printing affects performance, then throwing the exceptions already imposes an overhead that should be avoided.

  2. Security. The null-detail message gives insight into source code that is otherwise not easy to obtain. The message could be switched off to avoid this, but exception messages are supposed to carry information about the cause of an exception so that a problem can be fixed. If exposing this information is not acceptable, the message should not be printed by an application, but caught and discarded. This should not be handled by configuration of the JVM.

  3. Compatibility. The JVM has not traditionally included a message for an NPE, and including a message now might cause problems for tools that parse stack traces in overly sensitive ways. However, Java programs have always been able to throw NPEs with messages, so tools are expected to adapt to messages on NPEs from the JVM. A related risk is that tools might depend on the precise format of the null-detail message.

We intend to enable the null-detail message by default in a future release.

The computation of the null-detail message

Computing the null-detail message on demand has consequences for the message's availability in advanced scenarios:

  1. When executing remote code via RMI, any exception thrown by the remote code is delivered to the caller via serialization. Serializing an exception object does not preserve its internal data structures, so if remote code throws and thus serializes NPE, the eventual deserialization will produce an NPE for which no null-detail message can be computed on demand.

  2. If the bytecode instructions of a method change while a program is running, such as due to redefinition of the method by a Java agent using JVMTI, then the original instructions are preserved for a while but can be discarded during a GC cycle. As the original instructions are required to compute the null-detail message, the null-detail message will not be computed on demand if this happens.

The choice not to support serialization was made in order to minimize changes in the NullPointerException class itself. If persisting the null-detail message for serialization became desirable, then writeReplace could be implemented in that class. Alternatively, the null-detail message could be computed when the exception object is created, and this would persist the null-detail message across both serialization and method redefinition.
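The serialization scenario can be tried out with an in-memory round trip, as in this sketch. NpeSerialization, provoke and roundTrip are hypothetical names. Per the description above, the deserialized copy would be expected to report no null-detail message even when the original does, but the sketch only prints both values and does not assume either outcome.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class NpeSerialization {
    // Lets the JVM throw an NPE and hands it back to the caller.
    static NullPointerException provoke() {
        Object o = null;
        try {
            o.hashCode();
        } catch (NullPointerException e) {
            return e;
        }
        throw new IllegalStateException("expected an NPE");
    }

    // Serializes the exception to a byte array and reads it back.
    static NullPointerException roundTrip(NullPointerException e) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(e);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (NullPointerException) in.readObject();
            }
        } catch (IOException | ClassNotFoundException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        NullPointerException original = provoke();
        System.out.println("before: " + original.getMessage());
        System.out.println("after:  " + roundTrip(original).getMessage());
    }
}
```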

The format of the null-detail message

The null-detail message is constructed of two parts: the first part describes an action that could not be performed (the consequence of the NPE) while the second part describes the expression that earlier pushed a null reference on to the operand stack (the reason for the NPE). In some cases, this results in verbose text where only a fraction of the message is really needed to pinpoint the null expression in source code. For example, it could be helpful to shorten the message in these two scenarios:

  1. In a failed array access -- Cannot load from object array because "a[i][j]" is null. -- the second part "a[i][j]" is null suffices to pinpoint the null expression in source code a[i][j][k] = 99;.

  2. In a failed method invocation -- Cannot invoke "NullPointerExceptionTest.callWithTypes(String[][], int[][][], float, long, short, boolean, byte, double, char)" because... -- the method's declaring type and parameter types are often bulky, and can be omitted without seriously harming the developer's ability to pinpoint the null expression.

Nevertheless, the null-detail message does not leave out this information. The algorithm computing the message deals with arbitrary sequences of bytecode instructions, so it does not always succeed in assembling a useful message. For example, for a failed array access, it might be unable to compute the second part altogether, so that no message would be printed at all if the first part was left out; in this case, the first part alone may be sufficient to pinpoint the null expression in source code. In general, due to assembling the message from individual building blocks for each instruction visited, it is not feasible to decide algorithmically whether enough information has been gathered at some point to leave out further parts without harming the usefulness of the message. Thus, the choice was made to print all the information to make the message helpful in as many situations as possible.

Risks and Assumptions

In a helpful NPE, the null-detail message may contain variable names from the source code. Specifically, if debug information is included in the class file (via javac -g), then local variable names are printed. These names were not previously exposed by reflection APIs directly; a program would have had to obtain them via the indirect route of inspecting a class file via ClassLoader::getResourceAsStream(). Exposing these names in NPEs might be considered a security risk, but leaving them out would limit the benefit of the null-detail message.

It is assumed that computation of the null-detail message will be extended if new bytecodes are added to the JVM Specification.

Testing

A prototype of this feature is implemented by JDK-8218628. The prototype contains a unit test that exercises every message part. A predecessor implementation has been in SAP's commercial JVM since 2006 and has proven to be stable.

To avoid regressions, a larger body of code should be run. The jtreg tests should be run to detect other tests that handle the message and need to be adapted.

