Java Bytecode Simplified: Journey to the Wonderland (Part 1)

March 08, 2022

There are two ways to see a thing: we can take it as it appears, or we can look closer and appreciate it. Take a lightbulb. We flip the switch, the light comes on, and we get on with our lives. Pretty simple, but boring. If, on the other hand, we know that the electricity travels through wires from a power grid far from our home, heats the filament, and makes it emit the photons we see as light, then we can appreciate the blessing.

In the same way, when we write a piece of code and know the mechanism behind it, we can better appreciate the engineering effort that went into it and that makes our lives so much easier.

Today I will make an attempt to appreciate how unique the JVM is. So, let’s begin the journey to how Java works!

We all know that the Java compiler takes Java source and compiles it to bytecode. That part is pretty straightforward: it takes a source file and transforms it into bytecode. The JVM then starts by interpreting the bytecode. While interpreting, it collects data, for example, how frequently a particular method is called. When a specific portion of code (in HotSpot terminology, "hot code") reaches a certain threshold, the just-in-time (JIT) compiler compiles it to optimized machine code so that it performs better. This is an oversimplification, and the full story deserves a more extensive explanation.
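To make the idea of hot code a little more concrete, here is a minimal sketch of a program with one small method that is called a million times. The class name HotLoop and the loop count are made up for illustration, and whether and when the method gets compiled depends on the JVM and its settings.

```java
// Hypothetical demo class; the name and the loop count are illustrative only.
final class HotLoop {

  // A tiny method that is called often enough to become a candidate for JIT compilation.
  static int add(int a, int b) {
    return a + b;
  }

  // Calls add() n times and accumulates the results.
  static long sumUpTo(int n) {
    long total = 0;
    for (int i = 1; i <= n; i++) {
      total += add(i, 1);
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sumUpTo(1_000_000));
  }
}
```

On a HotSpot JVM, running it with java -XX:+PrintCompilation HotLoop should print a line for each method as it gets compiled. The exact output varies between JVM versions, so treat it as something to explore rather than a fixed result.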

In this article, we will keep it short and look at only one part of the story: the bytecode itself, what it is and how it looks inside. It’s definitely a fun journey.

So the first question is: what is bytecode?

Put simply, bytecode is the set of instructions that the Java compiler emits and the JVM executes.

Each instruction starts with an operation code, or simply "opcode", that is 1 byte long, and that’s why it is called bytecode. There are 8 bits in a byte, so there are only 2^8 = 256 possible opcodes. Interestingly, Java doesn’t even need all of them: today around 205 are in use. Some opcodes are followed by operand bytes, so a full instruction can be longer than 1 byte.

First, we will write a simple Java program and then compile it to see what the Java compiler emits:

public class Calculator {

  public int add(int a, int b) {
    return a + b;
  }
}

That’s about the simplest Java program we could write: a class with one public method, "add", which takes two integer arguments and returns their sum. That’s it.

Let’s compile it.

javac Calculator.java

The above command will produce a class file named "Calculator.class". This file contains a series of bytes and is not human-readable. You won’t be able to read it in a text editor.

However, an excellent Java command-line tool called "javap" allows us to read the bytecode from a class file. Let’s read it, as follows:

javap -c Calculator

If we run the above command in our terminal, we will get the following output.

Compiled from "Calculator.java"
public class Calculator {
  public Calculator();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public int add(int, int);
    Code:
       0: iload_1
       1: iload_2
       2: iadd
       3: ireturn
}

Look, we can see a constructor here, even though we haven’t written one in our Java source code. That’s the default constructor: when a class declares no constructor, the Java compiler adds one for us.
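The same thing can be checked from Java itself with reflection. The following is a small sketch; the wrapper class DefaultConstructorDemo is a made-up name, and the nested Calculator simply mirrors the class from this article.

```java
import java.lang.reflect.Constructor;

// Hypothetical wrapper class used only for this illustration.
final class DefaultConstructorDemo {

  // Same shape as the article's Calculator: no constructor in the source.
  static class Calculator {
    public int add(int a, int b) {
      return a + b;
    }
  }

  // Counts the constructors that ended up in the compiled class.
  static int constructorCount() {
    return Calculator.class.getDeclaredConstructors().length;
  }

  public static void main(String[] args) {
    for (Constructor<?> c : Calculator.class.getDeclaredConstructors()) {
      System.out.println(c + " takes " + c.getParameterCount() + " parameter(s)");
    }
  }
}
```

Even though the source declares no constructor, reflection reports one with no parameters, which is the one javap showed us.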

Let’s change gears a bit.

We know the JVM interprets bytecode, and it is a stack machine: it has an operand stack and works like Reverse Polish notation (RPN). For example, if we have an expression as follows:

1 + 2, then the RPN would be 1 2 +

To evaluate this with a stack, we first push 1 and 2 onto the stack. Then we pop these two, add them, and push the result back onto the stack. The JVM does the same thing with two kinds of instructions: iconst_<n> and iadd.

The opcodes iconst_1 and iconst_2 push 1 and 2 onto the stack, and iadd pops them, adds them, and pushes the result back. iconst_1 and iconst_2 are special opcodes for loading the constants 1 and 2.
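The push, pop, and add steps can be sketched in a few lines of Java. This is only a toy evaluator for RPN strings with "+", not how the JVM is implemented; the class name RpnEvaluator is made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy evaluator: supports integers and "+" only.
final class RpnEvaluator {

  static int evaluate(String expression) {
    Deque<Integer> stack = new ArrayDeque<>();
    for (String token : expression.trim().split("\\s+")) {
      if (token.equals("+")) {
        // Pop the two operands, add them, and push the result back.
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
      } else {
        // Anything else is treated as an integer and pushed onto the stack.
        stack.push(Integer.parseInt(token));
      }
    }
    return stack.pop();
  }

  public static void main(String[] args) {
    System.out.println(evaluate("1 2 +")); // prints 3
  }
}
```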

Let’s get back to the bytecode again.

Bytecode is nothing but a list of instructions. For example, to return from a method that returns nothing, the opcode is ‘return’.

The name "return" is just a mnemonic so that we can read bytecode and reason about it. In the class file, it is just a byte. The hex value of return is B1, which is 1011 0001 in binary.

The JVM can understand these series of bytes and then convert them to the appropriate machine code.
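As a quick check of the hex and binary values, here is a small sketch that prints an opcode byte both ways. The class OpcodeBits and its helper toBits are hypothetical names used only for this illustration.

```java
// Hypothetical helper for printing an opcode byte as hex and as binary.
final class OpcodeBits {

  // Formats the low 8 bits of a value as two groups of four binary digits.
  static String toBits(int opcode) {
    String bits = String.format("%8s", Integer.toBinaryString(opcode & 0xFF))
        .replace(' ', '0');
    return bits.substring(0, 4) + " " + bits.substring(4);
  }

  public static void main(String[] args) {
    int ret = 0xB1; // the return opcode
    System.out.println(Integer.toHexString(ret).toUpperCase() + " -> " + toBits(ret));
    // prints B1 -> 1011 0001
  }
}
```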

Let’s see another example:

Java source code:

int add(int a, int b) {
    return a + b;
}

opcode     hex   binary
iload_1    1B    0001 1011
iload_2    1C    0001 1100
iadd       60    0110 0000
ireturn    AC    1010 1100

The table above shows a method that takes two integer arguments and adds them. Here a and b are not constants; that’s why the opcodes iload_1 and iload_2 are used. The generic format of this opcode for loading integers is iload_<n>, where <n> is an index into the array of local variables. The parameters are, in fact, local variables. Since add is an instance method, index 0 holds the reference to this, so iload_1 loads a and iload_2 loads b.
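To see how those four bytes fit together, here is a toy interpreter that understands only iload_1, iload_2, iadd, and ireturn. It is a sketch for illustration, not how a real JVM works: the class name TinyInterpreter is made up, and the local variable array is simplified to an int array in which slot 0 is just a placeholder for this.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter for the four opcodes in the table above.
final class TinyInterpreter {

  static final int ILOAD_1 = 0x1B;
  static final int ILOAD_2 = 0x1C;
  static final int IADD = 0x60;
  static final int IRETURN = 0xAC;

  // Executes the code against the given local variables and returns the result.
  static int run(byte[] code, int[] locals) {
    Deque<Integer> stack = new ArrayDeque<>();
    for (byte b : code) {
      int opcode = b & 0xFF; // read the byte as an unsigned value
      switch (opcode) {
        case ILOAD_1: stack.push(locals[1]); break;
        case ILOAD_2: stack.push(locals[2]); break;
        case IADD: {
          int right = stack.pop();
          int left = stack.pop();
          stack.push(left + right);
          break;
        }
        case IRETURN: return stack.pop();
        default: throw new IllegalArgumentException("Unsupported opcode: " + opcode);
      }
    }
    throw new IllegalStateException("Code ended without ireturn");
  }

  public static void main(String[] args) {
    byte[] addBody = {0x1B, 0x1C, 0x60, (byte) 0xAC};
    int[] locals = {0, 5, 7}; // slot 0 stands in for this, then a = 5 and b = 7
    System.out.println(run(addBody, locals)); // prints 12
  }
}
```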

Now, look at the following method:

public int add() {
    return 1 + 2;
}

However, here is a caveat. If you write the above method, compile it, and read it with javap, you will find something like this:

0: iconst_3
1: ireturn

The reason is that the Java compiler does a bit of optimization: when it sees that we are just adding the constants 1 and 2 and returning the result, it loads 3 onto the stack with one instruction rather than using three instructions. We will learn much more about these sorts of optimizations later.

For this article, let’s assume the Java compiler doesn’t do this little optimization. Then the opcodes iconst_1 and iconst_2 would push 1 and 2 onto the stack, iadd would pop the two, add them, and push the result back, and ireturn would return it.

That’s a brief introduction to Java bytecode, how it looks, and how it works. We will go a bit further in the next article.

But before closing the article, I can't resist sharing a simple piece of Java code that can read a class file and convert it into a series of bytes and print it out for you so you can see it the way the JVM sees it.

package ca.bazlur;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BytecodeReader {

  public static void main(String[] args) throws IOException {
    Path classfile = Path.of("src/Calculator.class");// put path of your classfile here
    byte[] bytes = Files.readAllBytes(classfile);
    for (byte aByte : bytes) {
      //ref https://stackoverflow.com/a/12310078/893197
      String byteString = String.format("%8s", Integer.toBinaryString(aByte & 0xFF))
          .replace(' ', '0');
      System.out.println(byteString);
    }
  }
}

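If you run this program, you will get a series of 1s and 0s. Those are bits, and every 8 bits make a byte. Not every byte is an opcode, though: the class file also holds a header, the constant pool, and other metadata, and the opcodes sit inside the code of each method. The list of all opcodes can be found here: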

https://en.wikipedia.org/wiki/List_of_Java_bytecode_instructions
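Before hunting for opcodes in that output, it helps to know that a class file starts with a short header: a 4-byte magic number (CAFEBABE), then the minor and major version, each 2 bytes. Here is a minimal sketch that decodes just those first 8 bytes. The class name ClassFileHeader is made up, and the sample bytes in main are hand-written for illustration rather than read from a real file.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that decodes only the first 8 bytes of a class file.
final class ClassFileHeader {

  static String describe(byte[] classBytes) {
    // ByteBuffer is big-endian by default, which matches the class file format.
    ByteBuffer in = ByteBuffer.wrap(classBytes);
    int magic = in.getInt();
    int minor = in.getShort() & 0xFFFF;
    int major = in.getShort() & 0xFFFF;
    return String.format("magic=%08X minor=%d major=%d", magic, minor, major);
  }

  public static void main(String[] args) {
    // Hand-written sample header; major version 61 corresponds to Java 17.
    byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 61};
    System.out.println(describe(header)); // prints magic=CAFEBABE minor=0 major=61
  }
}
```

To try it on a real file, you could pass the array returned by Files.readAllBytes from the program above to describe.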

That's all for today!

Happy coding!

A N M Bazlur Rahman

Author

A N M Bazlur Rahman is a Software Engineer with over a decade of specialized experience in Java and related technologies. His expertise has been formally recognized through the prestigious title of Java Champion. Beyond his professional commitments, Mr. Rahman is deeply involved in community outreach and education. He is the founder and current moderator of the Java User Group in Bangladesh, where he has organized educational meetups and conferences since 2013. He was named Most Valuable Blogger (MVP) at DZone, one of the most recognized technology publishers in the world. Besides DZone, he is an editor for the Java Queue at InfoQ, another leading technology content publisher and conference organizer, and an editor at Foojay.io, a place for friends of OpenJDK. In addition, he has published five books about the Java programming language in Bengali; they were bestsellers in Bangladesh. He earned his bachelor's degree from the Institute of Information Technology, University of Dhaka, Bangladesh, in Information Technology, majoring in Software Engineering. He currently lives in Toronto, Canada.

Comments (6)

Alex

3 years ago

Great! Look forward the remaining part.

Vincz

It's seems iload and iconst are mixed in the body of the article and the illustrating images. Also if somebody's interested in the full specification of the the opcodes, here is the link from Oracle (Java 17): https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-6.html#jvms-6.5

A N M Bazlur Rahman

That's correct; I made the change; please let me know what you think now.

Mehdi Raza

Isn’t there a mistake in java code when you mentioned the optimisation. The jvm can only directly add 3 in the stack if the code is ‘return 1+2’ instead of ‘return a+b’

A N M Bazlur Rahman

Yes, I updated the article; please let me know what you think now. Thank you for pointing out the mistake.

Abhishek

3 months ago

Could you please share next part link?