Execution Process in Java


Daniel García Solla
Oct 4, 2022

When we write a program in a high-level language such as Java, we must keep in mind that its instructions, along with every manipulation or process we perform on objects, will at some point be decomposed into a sequence of elementary operations supported by the processor in charge of execution. To better understand why we first encode a program in a restricted, natural-like language and then "translate" it into another language that a CPU can process, it is worth understanding how the processor works, even if its implementation complexity only allows us to grasp the fundamentals of its full potential.

The processor (CPU)

The schematic above shows a (simplified) Universal Turing Machine, the basic theoretical model behind classical computing from the 20th century onwards. As can be seen, its fundamental components are the memory and the processor. On the one hand, memory is represented as a list of cells, each containing a single bit of information. In theory this list can be as long as needed, which gives rise to the postulate that any process computable by a Turing machine can be solved given sufficient memory. In practice, however, material and physical limitations prevent us from having infinite bits of memory.

On the other hand, since the memory necessarily has to be manipulated in some way to complete the corresponding processes, the Turing machine has a head (processor) that points to a specific bit and whose only function is to move along the list of bits, following a sequence of instructions supplied by a person, in order to change the logic state of the respective cells. To carry out this task, the CPU has a series of trivial actions supported by the physical implementation of its own circuitry. These actions turn out to be very simple, since they are of the type "move", "read", or "overwrite a logical value at memory location x", especially when compared to high-level programming, which deals with classes, objects, rich data structures, and so on.

But, despite their simplicity, a sufficiently long and well-ordered sequence of these actions is capable of generalizing; that is, of carrying out any required high-level computation, provided it is correctly represented as a valid set of elementary instructions.
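
As a minimal, purely illustrative Java sketch, the head's three elementary actions (read, overwrite, move) can be mimicked over a finite array standing in for the infinite tape:

    // A toy Turing-machine-style head; the "rules" here are made up for illustration.
    import java.util.Arrays;

    public class TuringHead {
        public static void main(String[] args) {
            boolean[] tape = new boolean[8]; // finite tape standing in for the infinite one
            int head = 0;                    // position of the read/write head

            tape[head] = true;               // "overwrite a logical value at memory location x"
            head++;                          // "move" one cell to the right
            boolean bit = tape[head];        // "read" the current cell
            tape[head] = !bit;               // flip the bit we just read

            System.out.println(Arrays.toString(tape)); // [true, true, false, ...]
        }
    }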

In practice, processors bear similarities to the previous model. First of all, they all contain an Instruction Set similar to the one shown in the image above, which encapsulates all the possible instructions the microprocessor can carry out. In the example above you can see certain similarities with the Turing model, such as value modifications in registers (increments and decrements), movement through memory, "jumps" between positions and subroutines, and so on.

Likewise, various additional operations can also be distinguished, such as AND, OR, XOR, logical comparison, Push, or Pop. Most of these represent primary logical processes and operations implemented directly in the so-called Arithmetic Logic Unit (ALU), which uses the properties and advantages of Boolean algebra to ease the treatment of input and output data generically encoded in natural binary. Despite the real complexity of a full processor with an extensive ALU, its function can be better understood with a simplification such as the one in the image below, which shows a 1-bit ALU. On one side, we can see an area grouping certain logic gates, intended to perform logical operations on the respective input data. Next to it is another section that, unlike the previous one, performs the corresponding arithmetic operations on the inputs. Note that the only arithmetic operation it can perform is addition, since it is the most basic of all and can be combined to build more complex computations; although, depending on the structure of the processor, it may have circuits specialized in other kinds of operations, such as sign changes or bit rotations.
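
To make the idea concrete, here is a minimal Java sketch of such a 1-bit ALU; the opcode numbering is made up purely for illustration:

    // A 1-bit ALU sketch: logic gates plus a full adder, selected by a made-up opcode.
    public class OneBitAlu {
        // Returns {result, carryOut} for inputs a, b and carryIn under the given op.
        static boolean[] compute(int op, boolean a, boolean b, boolean carryIn) {
            switch (op) {
                case 0: return new boolean[] {a & b, false};   // AND
                case 1: return new boolean[] {a | b, false};   // OR
                case 2: return new boolean[] {a ^ b, false};   // XOR
                case 3: {                                      // ADD (full adder)
                    boolean sum = a ^ b ^ carryIn;
                    boolean carry = (a & b) | (carryIn & (a ^ b));
                    return new boolean[] {sum, carry};
                }
                default: throw new IllegalArgumentException("unknown op");
            }
        }

        public static void main(String[] args) {
            boolean[] r = compute(3, true, true, false);       // 1 + 1 = 10 in binary
            System.out.println("sum=" + r[0] + ", carry=" + r[1]); // sum=false, carry=true
        }
    }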


In addition to the ALU, processors also integrate a Control Unit (CU), which fetches instructions stored in memory, decodes them, and executes them with the help of the other units and of registers for data, addresses, accumulators, etc. In turn, this unit is in charge of staying in contact with the RAM memory, making use of previously seen instructions such as Push and Pop, which in a nutshell add or remove content from a data structure called the Stack, where encoded instructions, data, or whatever needs to be queued for processing/execution can be stored.
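
The push/pop discipline can be sketched in Java with a LIFO structure (java.util.ArrayDeque here, purely as an analogy to the hardware stack):

    import java.util.ArrayDeque;
    import java.util.Deque;

    public class StackDemo {
        public static void main(String[] args) {
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(7);             // Push: add content on top of the stack
            stack.push(42);
            int top = stack.pop();     // Pop: remove the most recently added content
            System.out.println(top);   // 42, last in, first out
        }
    }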

At this point, we have the fundamentals needed to understand the place the processor occupies when executing a program. However, the field of digital electronics, which thoroughly studies the construction, maintenance, and optimization of these devices, is much broader than what has been explained here, so it is worth continuing to research the topic to get a better idea of what a processor really is.

How does a program run?

After having coded an algorithm in a programming language, the next step, if we are in a moderately modern code editor, is to press the RUN button and see the subsequent result, which may vary depending on what we are programming. However, these editors haven’t always been around, and not all of them offer the option to run code with a simple button. For this reason, it is convenient to be clear about the basic concepts necessary to be able to carry out the execution of a program without depending on a specific editor, alternative program, automation, or similar.

Fundamentally, the execution process of a program depends on the language in which it is written. Some are interpreted; that is, during execution, a program called an interpreter runs through the file where the code is stored and transforms each high-level instruction, in real time, into a sequence of "low-level" instructions belonging to the instruction set of the processor in question. Other languages, on the other hand, are compiled. In this case, a program known as a compiler, unlike an interpreter, does not directly execute the written code: it goes through the file analogously to an interpreter, converts all high-level instructions to low-level ones, and stores them in an executable file, so that the program is subsequently run from the new file created by the compiler. Some examples of both types of languages are:

  • Interpreted: Python, MATLAB, JavaScript, Java
  • Compiled: C, C++, Java, Rust

Note that Java deliberately appears in both lists; the reason is explained below. Interpretation and compilation each have their respective advantages. As a general rule, interpreted languages tend to be slower than compiled ones, since having a file with all the instructions ready and optimized to be recognized by the processor yields an improvement in the final execution time. However, a language being compiled splits the whole process in two, so that "execute" becomes a combination of two phases:

  • Compile: The compiler generates an executable file (e.g., .exe on Windows) with "binary" instructions from the initially written code.
  • Run: Start the executable file generated by the compiler and view the results of the algorithm.
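
With Java's toolchain, which the practical section below walks through, these two phases map onto two commands (assuming a file Main.java declaring a class Main):

    javac Main.java    # compile: produces the bytecode file Main.class
    java Main          # run: the JVM loads and executes the compiled class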

Portability problem

There are numerous differences and points of comparison between the two types of language, not only in execution speed or space but also in memory-management efficiency and portability. Although the variance of these characteristics gives each language a characteristic utility for solving certain problems, portability, especially in compiled languages, plays an important role when selecting the appropriate one to start a large-scale project, design an algorithm expected to be used on multiple platforms, or simply create a product that is going to run on different operating systems.

Portability, in this context, refers to the ability of a piece of software (or a language) to be executed on various platforms without having to be recompiled and without its execution producing errors.

As shown in the diagram above, if we program in a compiled language and need the program to run on more than one operating system (in this case Windows, Linux, and macOS), we must take into account the system specifications of each platform, in addition to the characteristics of the processor, such as its structure, the socket on which it is mounted, or the interaction between its processing cores. Thus, from the code we originally wrote, corresponding versions must be created, specifically adapted to each platform on which the program is going to be used. This per-platform specialization can be better understood with the systems mentioned in the example: if we were programming something directly related to the organization of files and storage units on each platform, we would notice differences, both subtle and substantial, that must be considered so that the code does not return errors or, worse, cause critical failures in the system. Because of this portability problem, a versioning step is attached to the compilation phase to cover the needs of each platform, which in large projects can become a long task depending on the number of operating systems targeted.

Java solution

Faced with this problem, the Sun Microsystems development team created Java, a compiled language, with the aim that after writing a program only once, it could be compiled and executed on any platform, regardless of the operating system, hardware specifications, or similar. These properties were also fostered by Java's intention to be a unified language oriented to the Internet; that is, any machine connected to the Internet should be able to work with Java code.

As shown in the image above, the basis of Java's solution to the portability problem lies in the result of compiling the original high-level code written only once (Write Once, Run Everywhere), without any platform- or hardware-specific versioning, since what makes this language different is the versatility with which it can be integrated into any device (from toasters to parking meters). During the compilation process, Java transforms the high-level code into another kind of "low-level" language known as Java bytecode, which is nothing more than an intermediary between Java and the instructions contained in the instruction set of each processor. The nature of the bytecode itself is not what matters here; what is important is that, once created by the compiler, it can be interpreted by the Java Virtual Machine (JVM) on whatever platform the JVM is properly installed on.

Although Java may seem to be a compiled language thanks to the phase in which the original code is converted to bytecode, it cannot be fully affirmed that it is compiled, since it also has an interpretation phase in which the JVM interprets, at execution time, the bytecode file generated by the compiler. In this way, Java takes advantage of both kinds of language: on the one hand, the versatility of interpreted ones, and on the other, the execution-speed gains of compiled ones, although in Java this trade-off has many nuances, mainly due to the complexity of the process.

How does the compiler work?

In this section, we will try to better understand the structure that Java follows as a language in order to compile a program and transform its classes to Bytecode.

First of all, after installing Java on the device where you want to develop a program, there are various components, each of which stores tools needed to take a program from coding to final execution. The largest set represented in the previous image is the Java Development Kit (JDK), which, as its name indicates, encapsulates a series of generic development tools such as the debugger (jdb), the application launchers (java, javaw), and the compiler (javac). Remember that the latter carries the main task of creating the bytecode and storing it in .class files that will later be processed by the respective virtual machine.

Next, the question may arise: how does the javac compiler work internally? It is worth mentioning that the compiler itself is a very complex piece of software that has been in development for a considerable number of years and has undergone important changes and improvements in the way it operates. Explaining its operation in great detail would therefore be too long and cumbersome for the purpose of knowing how to compile a Java program. Here we will only see the fundamental steps it follows to correctly generate the bytecode, although for a complete explanation of all the following steps there are resources such as Modern Compiler Implementation in Java, among many others.

1-Lexical analysis:

In this first step, the compiler takes as input the .java file with the algorithm encoded in the high-level Java language. Then javac decomposes the code written inside the file into "tokens". In short, it iterates through the code and separates the keywords, operators, separators, comments, identifiers, etc., analogously to the tokenization process in an NLP model, except that here the structure of the language is taken very much into account, as well as the symbols and keywords intended to perform specific functions.
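
As a rough illustration (this is a toy lexer, not javac's real one), even a single regular-expression pass can split a simple statement into tokens:

    // A toy lexer sketch separating a Java statement into keyword,
    // identifier, operator, literal and separator tokens.
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class ToyLexer {
        // Hypothetical, greatly simplified token grammar for illustration.
        private static final Pattern TOKEN = Pattern.compile(
            "\\bint\\b|[A-Za-z_][A-Za-z0-9_]*|\\d+|[=+;]");

        public static void main(String[] args) {
            Matcher m = TOKEN.matcher("int sum = a + 42;");
            while (m.find()) {
                System.out.println(m.group()); // int, sum, =, a, +, 42, ;
            }
        }
    }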

As a curiosity, one of the most widely used programs for this kind of lexical-analysis task is flex, which can be supplied with a "definition" of the structure a certain language follows in order to generate an analyzer that can then be reused in a compiler.

2-Syntactic analysis:

This step is also known as "Parsing" and consists of checking that the syntax of the written code is correct; that is, that all semicolons are correctly placed at the end of each instruction, that there are no unclosed braces or parentheses, and that there are no typos in the keywords.

To do this, and in line with the previous step, a revision is made over a data structure built between these first two steps called the Abstract Syntax Tree (AST), whose function is to represent the high-level code "abstractly", so that in later steps the tree, or part of it, can be traversed in order to attach the low-level instructions corresponding to each node of the graph.

Example of a program that writes Hello World! on screen, represented as AST.
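
A simplified textual sketch of what such a tree could look like for that program (node names are illustrative, not javac's internal ones):

    CompilationUnit
    └── ClassDeclaration: HelloWorld
        └── MethodDeclaration: main(String[] args)
            └── Block
                └── ExpressionStatement
                    └── MethodInvocation: System.out.println
                        └── StringLiteral: "Hello World!"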

3-Semantic analysis:

In any program that goes beyond a simple HelloWorld, it is quite likely that variables, identifiers, methods, and other constructs requiring a name will be used. That is why this step gives meaning to the identifiers extracted in the first step: it determines their meaning and properties across the whole code, relates their use to other variables or expressions, and checks type compatibility.

Likewise, each variable is also registered in a symbol table, with its type, identifier, dimension, address in memory, or associated value.
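
For instance, for a statement such as int sum = a + 42;, a simplified (and purely hypothetical) symbol-table fragment could record:

    identifier | type | scope       | notes
    sum        | int  | method main | value assigned at run time
    a          | int  | method main | declared earlier, used in expression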

4-Generation of intermediate representation trees:

The main tree generated directly after the tokenization and analysis process results in a data structure that, in very large or poorly optimized programs, occupies a large amount of memory, while its width and depth along its branches make it expensive to traverse. Thus, in this step, which really encompasses many others aimed at very specific transformation and optimization tasks, intermediate representation (IR) trees are generated, whose main purpose is to bring the representation of expressions and branches of the tree closer to the possible instructions of the target machine language (bytecode in this case). During this process, or set of processes, the execution order of some nodes is optimized, improving the data flow within the graph.

5-Choice of instructions:

Intermediate representation trees are grouped into clusters that must later be transformed into instructions supported by the final target language, which in most compiled languages would be the instruction set of the processor; in Java, however, bytecode is the target, for the portability advantages discussed in previous sections.

6-Flow analysis:

This step groups two processes, called control-flow analysis and data-flow analysis. In each of these "substeps", tree-traversal algorithms compute the possible control paths the program can follow when executed, and these are stored in a control-flow graph.

Likewise, the use of variables throughout the execution of the program is also analyzed: the data flow that passes through them, the expressions in which they are used, and the references to their memory space. All this analysis is aimed at optimizing the memory usage of the final code. For this reason, the algorithms in charge of this task mainly detect when a variable stops being useful within the execution cycle, freeing its occupied space by modifying the respective nodes of the tree.

As a consequence of this data-flow analysis, more sophisticated optimization techniques such as Constant Folding or Constant Propagation can be applied, which determine whether the value of an expression can be calculated during compilation rather than at execution time: by precalculating the value of a long expression (when possible), the calculation time the processor would have to spend at run time is saved, improving the efficiency of the program.
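
A classic constant-folding example; since 60 * 60 * 24 is a compile-time constant expression, javac emits the precomputed value instead of the multiplications:

    public class FoldingDemo {
        public static void main(String[] args) {
            // javac folds this constant expression at compile time: the
            // bytecode simply pushes 86400 and performs no multiplications.
            int secondsPerDay = 60 * 60 * 24;
            System.out.println(secondsPerDay); // 86400
        }
    }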

7-Emission of Java bytecode:

In this last step, each variable is assigned a location so that it can be held in the processor registers without any value being erroneously overwritten. Finally, the resulting tree, after the numerous optimizations carried out in the previous steps, is traversed, and the instructions obtained while reading the data contained in each tree node are appended to the final bytecode file.

At the end of the entire process, the javac compiler will have generated as many .class files as classes we have implemented in the program, so that each of these files will have the name of the identifier of the respective class it contains.
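
For example, compiling a single source file that defines two classes (the names here are made up) yields two .class files:

    // Greeting.java: one source file, two classes
    class Helper {
        static String message() { return "Hello from Helper"; }
    }

    public class Greeting {
        public static void main(String[] args) {
            System.out.println(Helper.message());
        }
    }
    // javac Greeting.java produces both Greeting.class and Helper.class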

How does the interpreter work?

As mentioned above, the execution of a .class file containing bytecode is performed by the Java Virtual Machine (JVM) installed on the operating system. Its architecture is represented in the image above, although, as with the explanation of the compiler, many details not essential to a correct understanding of the complete process the JVM follows to interpret bytecode are omitted.

1-Class Loader Subsystem:

In the first block we find the Class Loader Subsystem, which is basically the area grouping the 3 fundamental processes with which the interpretation of a .class file begins.

First, the classes that are going to be executed must be loaded. A delegation process is followed in which different class loaders participate: the Bootstrap ClassLoader, the Extension ClassLoader, and the Application ClassLoader, in that order. Each of these loaders looks for classes in different places in the system, so if one of them fails to find the required class, loading is delegated to the next, lower-priority loader, and a ClassNotFoundException results if the class has not been loaded by any of them. Finally, after being correctly processed, the class is transformed into binary-encoded data and saved in the Method Area, which belongs to the Runtime Data Area.
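
You can observe this hierarchy from Java itself with a small sketch (the printed loader names vary between JDK versions):

    public class LoaderDemo {
        public static void main(String[] args) {
            // The application class loader loaded this class...
            ClassLoader app = LoaderDemo.class.getClassLoader();
            System.out.println(app);             // e.g. ...$AppClassLoader

            // ...its parent is the extension/platform loader...
            System.out.println(app.getParent()); // e.g. ...$PlatformClassLoader

            // ...whose parent, the bootstrap loader, is represented as null.
            System.out.println(app.getParent().getParent()); // null
        }
    }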

Second, the JVM verifies that the loaded class is valid; that is, that the compiler has not made any kind of error and that the file itself is not corrupted. If the verification fails, java.lang.VerifyError is returned as an exception to the user. If all goes well, a class-type object is created and stored in the memory heap. Finally, to summarize, the resolution and initialization tasks carry out the replacement of symbolic references and the management of static variables.

For more information, consult the documentation.

2-Runtime Data Area:

This area is responsible for providing the memory needed to store all the required objects, data, and values: bytecode and class objects, intermediate computations, and Threads alike.

Broadly speaking, this area contains the Method Area, a memory space commonly reserved for class objects with their corresponding attributes, methods, and static variables; the Heap, reserved for all kinds of objects and global variables; and the Stack, which in turn is divided into per-Thread stacks holding method-execution frames, along with native-method information.
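
A small sketch of where things conceptually live at run time (a simplification of the real layout):

    public class MemoryDemo {
        static int counter = 0;          // static variable: Method Area

        public static void main(String[] args) {
            int local = 5;               // primitive local: main's stack frame
            StringBuilder sb =           // reference held on the stack...
                new StringBuilder("hi"); // ...object itself lives on the Heap
            counter += local;
            System.out.println(sb.append(counter)); // hi5
        }
    }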

3-Execution Engine:

We come to one of the most important areas of all, since the interpretation of the Java Bytecode is carried out here, in addition to a series of optimizations that speed up the entire execution process.

In the first instance, the Interpreter takes care of the interpretation process, which consists of going through the bytecode and executing each contained instruction line by line. However, this has certain disadvantages, for example when a method is executed several times throughout the control flow of the program: the interpreter will go through and process all the code contained in the method over and over again, unnecessarily, during the multiple calls, consuming memory and resources vital for other processes.

To solve this problem, the Execution Engine contains the Just-In-Time (JIT) compiler, which transforms frequently executed instructions and methods from bytecode into native code at run time, stores them in the corresponding native memory space, and makes this native code available to the Interpreter. That way, each time a method call is detected, a complete reinterpretation of its content is not required; the Interpreter uses the native code directly to execute the method efficiently.
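
You can watch HotSpot's JIT act on a hot method with the -XX:+PrintCompilation flag (the output format varies between JVM builds):

    public class JitDemo {
        static long square(long x) { return x * x; }  // becomes "hot" quickly

        public static void main(String[] args) {
            long acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += square(i);  // repeated calls trigger JIT compilation
            }
            System.out.println(acc);
        }
    }
    // Run with: java -XX:+PrintCompilation JitDemo
    // Lines mentioning JitDemo::square show it being compiled to native code.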

In addition, there is a Garbage Collector, which reclaims all the memory spaces reserved for variables or objects that are no longer referenced, optimizing the use of available resources.

4/5-Java Native Interface (JNI) and Native Method Libraries:

These two areas make up a set of tools that enable the creation and correct operation of native code on specific hardware, connecting Java code with libraries and resources from other languages such as C or C++.
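
The Java side of that bridge looks like this: a hedged sketch declaring a native method (the library name and method are hypothetical, and the matching C implementation is omitted):

    public class NativeDemo {
        // Implemented in C/C++ and compiled into a native library.
        public native long fastSum(long a, long b);

        static {
            // Loads libfastmath.so / fastmath.dll from java.library.path.
            System.loadLibrary("fastmath");
        }

        public static void main(String[] args) {
            System.out.println(new NativeDemo().fastSum(2, 3)); // 5, computed natively
        }
    }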

Practical execution (Linux)

This section details all the steps to install Java on a device with a Linux operating system (Debian 11.5.0 distribution), create a program, compile it and run it with the help of the JDK.

1-Installation phase

This series of steps can be skipped if the system already has a properly installed JDK and JRE (Java Runtime Environment). However, always check that the compiler and interpreter versions match; otherwise errors may occur.

First, it is convenient to update the apt package manager (Advanced Packaging Tool) with its update command, shown below, so that the versions of what we install are the most recent. It is recommended to prefix the line with the sudo command to avoid errors caused by the lack of privileges of the user running the terminal; sudo allows us to act temporarily with administrator (root) permissions.

After having made sure that apt is up to date, the Java Development Kit is installed with administrator permissions using the install command and the default-jdk argument, which tells apt what to install.

The same is then done for the Java Runtime Environment.
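
Put together, the three installation steps look like this (package names as found in Debian's repositories):

    sudo apt update                  # refresh the package lists
    sudo apt install default-jdk     # development kit: compiler, debugger, etc.
    sudo apt install default-jre     # runtime environment (the JVM)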

2-Create a program

In Linux you can create and edit programs with any installed editor, the most used being Vim, Nano, or Visual Studio Code. In this case, Nano has been used to write a program that displays "Hello World!!" and to save it on the desktop as HelloWorld.java (matching the name of the main class).
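
The program itself can be as small as this, using "Hello World!!" as the text to match the file name chosen above:

    // HelloWorld.java: the file name must match the public class name
    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello World!!");
        }
    }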

3-Compile

Once the code is created, the javac command is run with, as its main argument, the path of the .java file we want to compile; if the file is in the terminal's current directory, only the file name is needed.
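
From the directory where HelloWorld.java is saved, that is simply:

    javac HelloWorld.java    # produces HelloWorld.class on success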

When the compilation process finishes (without any error), all the .class files corresponding to the classes contained in the code we have written will have been created in the same path as the .java file. In this case there is only one class, called HelloWorld, which turns out to be the main one; that is, it will be the one we use as an argument when executing, since it contains the main method.

If we try to look inside the .class file, we will come across the intermediary bytecode directly.
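
A human-readable, disassembled view of that bytecode can be obtained with the JDK's javap tool:

    javap -c HelloWorld    # prints mnemonics such as getstatic, ldc, invokevirtual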

4-Execute/(Interpret)

The last step, after having compiled without errors, is to run the program with the help of the Java Virtual Machine. To do this, and analogously to the previous step, the java command is used followed by its main argument, which this time is the name of the main class of the program we have written. In this command you do NOT have to write the .class extension of the file in which the class is stored, since the argument looks the class up by name; it is also essential that the command is executed from the same directory where the .class file is saved.
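
That is, simply:

    java HelloWorld    # class name only, with no .class extension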

As can be seen in the image, if we pass the file with its extension as an argument, the interpreter will search for the class “HelloWorld.class” instead of “HelloWorld”, which will produce an exception of the type java.lang.ClassNotFoundException.
