Anatomy of a Bazel Project

Setting up the simplest Bazel project is easy. All you need to do is add two empty files to a folder, and you have a valid Bazel project. You also need access to the Bazel executable. Let’s create a simple scenario where we, the programmers, set up a build action for a friend who is writing a book. The goal of this action is to take the chapters of the book as inputs and merge them into a single output file. Now that we understand the problem we need to solve (what the inputs are, what the action needs to do, and what it needs to output), we are ready to write our very first build action.

Figure 1.6: Inputs and output involved in compose_book

Let's imagine that the author structures their project in the following way. They have a root folder containing one text file per chapter. This is enough for us to get started:

├── root
│   ├── chap_1.txt
│   ├── chap_2.txt

The first step is to add a workspace file, named WORKSPACE in capital letters, to the root folder. For now, we can leave the file empty. Adding the file tells Bazel that this folder is the root of our project, and the files within this folder structure are the only ones that Bazel will know about when we run the build action. Just by adding the workspace file, we have created a hermetic environment where Bazel can build:

├── root
│   ├── chap_1.txt
│   ├── chap_2.txt
│   ├── WORKSPACE <----
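
To make the role of the marker file concrete, here is a minimal Python sketch of how a tool could locate a project root by walking upward until it finds a WORKSPACE file. It is an illustration of the idea, not Bazel's actual implementation; the function name, the drafts/notes subfolder, and the set of marker names are assumptions made for this example.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Marker file names treated as "this is the root" in this sketch (an assumption
# for illustration; WORKSPACE is the name used in this chapter).
MARKERS = ("WORKSPACE", "WORKSPACE.bazel")


def find_workspace_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` until a directory containing a marker file is found."""
    current = start.resolve()
    for directory in (current, *current.parents):
        if any((directory / marker).is_file() for marker in MARKERS):
            return directory
    return None  # no marker anywhere above `start`


def demo() -> bool:
    """Build a throwaway tree that mirrors the layout above and search it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "root"
        nested = root / "drafts" / "notes"  # hypothetical subfolder
        nested.mkdir(parents=True)
        (root / "chap_1.txt").write_text("Chapter 1\n")
        (root / "WORKSPACE").touch()  # an empty file is enough
        return find_workspace_root(nested) == root.resolve()
```

Calling demo() starts the search two folders below the root and still resolves to the folder that holds the empty WORKSPACE file.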

The workspace file is where we make sure that we have all the tools we need for the action that we are planning to run. The build file, on the other hand, is where we define the build action itself, which in our case is combining the chapters of the book. Let's add it next to the workspace file:

├── root
│   ├── chap_1.txt
│   ├── chap_2.txt
│   ├── WORKSPACE
│   ├── BUILD <----

The BUILD file (its name is also written in capital letters) lets us define the actions that we want to run within the workspace. We will create a build action called compose_book, which takes the two text files as input and produces a single output file:

# BUILD
genrule(
    name = "compose_book",
    srcs = ["//:chap_1.txt", "//:chap_2.txt"],
    outs = ["book.txt"],
    cmd = "cat $(location //:chap_1.txt) $(location //:chap_2.txt) > $(location book.txt)",
)

There is a lot to unpack here, so let's look at a brief overview of what's happening. In this example, we have created a generic rule (genrule), which runs a shell command and can therefore use the tools that are available in a shell environment. While there are a few things here that we’ll cover in more detail in later chapters, for now it is more important to understand the intent behind what we are doing than the details of what's happening behind the scenes.

The first thing to note is that to create this build action, we call the genrule function and pass in some arguments. The genrule function is a nice utility to make quick and easy build actions (such as the one that we are making now). However, we’ll eventually graduate to making more specific ones, such as py_binary() to build Python code or cc_binary() for C++ code. Calling the genrule function creates a build action with the given parameters and makes sure that it can be run.

The first argument that we pass into the function is the name parameter. This one is pretty self-explanatory since it defines a name for our build action. We give our build action the descriptive name “compose_book”:

name = "compose_book",

Next is an argument that allows us to pass in any source files that the build action will need to execute the build. Remember that if the build action is not informed about a file that it needs, even though it's sitting right next to all the others in the folder structure, Bazel won’t know about it when running the build action. This can either make the build action fail (if it tries to operate on a file that it doesn’t know about) or produce incorrect results (the action did exactly what it was told, but it was told about the wrong set of files). The srcs argument is a list, so it accepts one or more files:

srcs = ["//:chap_1.txt", "//:chap_2.txt"]
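
The effect of declaring inputs can be imitated in a few lines of Python. The sketch below copies only the declared sources into a scratch folder and then reports which workspace files are visible from there. It models the idea only; how Bazel actually isolates an action depends on the platform and configuration, and the function name and the notes.txt file are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List


def run_with_declared_inputs(workspace: Path, srcs: List[str]) -> Dict[str, bool]:
    """Copy only the declared sources into a scratch folder and report visibility."""
    with tempfile.TemporaryDirectory() as scratch:
        sandbox = Path(scratch)
        for name in srcs:
            shutil.copy(workspace / name, sandbox / name)
        # For every file in the workspace: can it be seen from the scratch folder?
        return {p.name: (sandbox / p.name).exists() for p in sorted(workspace.iterdir())}


def demo() -> Dict[str, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        (workspace / "chap_1.txt").write_text("Chapter 1\n")
        (workspace / "chap_2.txt").write_text("Chapter 2\n")
        (workspace / "notes.txt").write_text("not declared\n")  # hypothetical extra file
        return run_with_declared_inputs(workspace, ["chap_1.txt", "chap_2.txt"])
```

In this model, notes.txt sits right next to the chapters in the workspace but is absent from the scratch folder, because it was never listed as a source.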

You’ve probably noticed that the way we reference the files is a little different from what you are used to. When a path is written this way, Bazel treats it as a label rather than a plain file path. A label is a unique identifier used by Bazel to reference targets, and in this case, our two source files are targets. We’ll dive deeper into labels in a later chapter, but for now, all we need to know is that the two slashes at the front mean that the path starts from our workspace root, and that the colon separates the package (the folder that contains the build file, here the root itself) from the name of the file inside that package.
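
As a rough illustration of how a label breaks down into its parts, the following Python sketch splits a label into a repository, a package, and a target name. It is a simplified model written for this explanation, not Bazel's parser, and the Label type and parse_label function are names invented here. The optional @ prefix handles labels that point outside the current workspace, a form that shows up later in this section.

```python
from typing import NamedTuple


class Label(NamedTuple):
    repo: str     # "" means the current workspace
    package: str  # path from the workspace root to the folder with the build file
    name: str     # target name inside that package


def parse_label(text: str) -> Label:
    """Split a label of the form [@repo]//package[:name]."""
    repo, slashes, rest = text.partition("//")
    if not slashes:
        raise ValueError(f"not an absolute label: {text!r}")
    if repo and not repo.startswith("@"):
        raise ValueError(f"unexpected text before '//': {text!r}")
    package, colon, name = rest.partition(":")
    if not colon:
        # Shorthand: //some/pkg is the same as //some/pkg:pkg
        name = package.rsplit("/", 1)[-1]
    return Label(repo.lstrip("@"), package, name)
```

With this model, parse_label("//:chap_1.txt") yields an empty package (the workspace root) and the name chap_1.txt, which matches the description above.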

The outs argument defines a list of outputs and tells Bazel which files will be output:

outs = ["book.txt"]

And lastly, let’s look at the build command itself. The way genrule works is that it has a cmd argument where we can pass in a command that gets executed in a shell. In this example, we use cat (short for concatenate), a frequently used Linux command that combines files. We pass it a list of files to concatenate, and it outputs the result:

cmd = "cat $(location //:chap_1.txt) $(location //:chap_2.txt) > $(location book.txt)"

Notice that we are using a special predefined expansion called location to construct the paths to the input files. Bazel replaces each $(location ...) expression before the command runs, which results in valid paths for chap_1.txt, chap_2.txt, and the output file book.txt.
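
To show what this substitution amounts to, here is a small Python sketch that replaces every $(location ...) expression in a command string with a path looked up from a table. The table and its paths are hypothetical; the real paths Bazel produces depend on the platform and build configuration, and this is a model of the idea rather than Bazel's own code.

```python
import re
from typing import Dict

# Matches $(location <label>) and captures the label.
LOCATION = re.compile(r"\$\(location\s+([^)\s]+)\)")


def expand_locations(cmd: str, paths: Dict[str, str]) -> str:
    """Replace every $(location <label>) with the path registered for that label."""
    def lookup(match):
        label = match.group(1)
        if label not in paths:
            # Only labels the action declared can be expanded in this sketch.
            raise KeyError(f"label not declared in srcs or outs: {label}")
        return paths[label]

    return LOCATION.sub(lookup, cmd)


# Hypothetical paths, for illustration only.
PATHS = {
    "//:chap_1.txt": "chap_1.txt",
    "//:chap_2.txt": "chap_2.txt",
    "book.txt": "out/book.txt",
}

CMD = "cat $(location //:chap_1.txt) $(location //:chap_2.txt) > $(location book.txt)"
EXPANDED = expand_locations(CMD, PATHS)
```

After expansion, the shell only ever sees ordinary paths; the labels exist purely on the Bazel side.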

With this, we have created our first build action, and all we have to do now is run it. Make sure that you have the Bazel executable installed on your computer, preferably added to your PATH. To run our build, all we need to do is call:

bazel build //:compose_book

If everything went as planned, you should see the following message: 

INFO: Analyzed target //:compose_book (5 packages loaded, 9 targets configured).
INFO: Found 1 target...
Target //:compose_book up-to-date:
  bazel-bin/book.txt
INFO: Elapsed time: 7.333s, Critical Path: 7.11s
INFO: 2 processes: 1 internal, 1 local.
INFO: Build completed successfully, 2 total actions

As you can see in the log, we found one target that needed to be built (//:compose_book), and this build target outputs a file called book.txt. The build took 7.333 seconds on my humble workstation and completed successfully. (I should note that it's not the cat command itself that is this slow; the first time we run a Bazel build, it does some initialization that takes the bulk of the time.)

Now, let’s run the same command again and see what happens:

INFO: Analyzed target //:compose_book (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:compose_book up-to-date:
  bazel-bin/book.txt
INFO: Elapsed time: 0.078s, Critical Path: 0.01s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

This time, we can see that after analyzing our target (checking the inputs and the environment), Bazel determined that it doesn’t need to do anything. It already has a cached result from a build with these exact inputs and environment, so it just returns that result. As a consequence, the build took only 0.078 seconds.
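
The reasoning behind the second, much faster run can be sketched in Python as a cache keyed by a digest of the command and the content of its inputs. This is a deliberately small model: the ActionCache class and the action_key function are invented for this example, and a real build system takes more into account (for instance the environment) than the two ingredients hashed here.

```python
import hashlib
from typing import Callable, Dict


def action_key(cmd: str, inputs: Dict[str, bytes]) -> str:
    """Digest of the command plus the name and content of every input, in a stable order."""
    digest = hashlib.sha256()
    digest.update(cmd.encode())
    for name in sorted(inputs):
        digest.update(b"\0" + name.encode() + b"\0")
        digest.update(hashlib.sha256(inputs[name]).digest())
    return digest.hexdigest()


class ActionCache:
    def __init__(self) -> None:
        self._results: Dict[str, bytes] = {}
        self.executions = 0  # how many times the action really ran

    def build(self, cmd: str, inputs: Dict[str, bytes],
              run: Callable[[Dict[str, bytes]], bytes]) -> bytes:
        key = action_key(cmd, inputs)
        if key not in self._results:
            self.executions += 1
            self._results[key] = run(inputs)
        return self._results[key]


def concatenate(inputs: Dict[str, bytes]) -> bytes:
    """Stand-in for the cat command: join the inputs in name order."""
    return b"".join(inputs[name] for name in sorted(inputs))


cache = ActionCache()
chapters = {"chap_1.txt": b"Chapter 1\n", "chap_2.txt": b"Chapter 2\n"}
first = cache.build("cat", chapters, concatenate)
second = cache.build("cat", chapters, concatenate)  # same key: reused, not re-run
runs_after_repeat = cache.executions
chapters["chap_2.txt"] = b"Chapter 2, revised\n"
third = cache.build("cat", chapters, concatenate)   # an input changed: runs again
```

The repeated build returns the stored result without executing anything, while editing a chapter changes the key and forces a fresh run.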

Now, you are ready to take a look at the contents of your composed book. However, book.txt is nowhere to be found among the files we created in the workspace, and there is a very good reason for this. Before we go into that, let's find the file. You should see a few new folders that have appeared in the workspace, one of which is called bazel-bin. This folder should contain a text file called book.txt, which is the output of our build action:

├── root
│   ├── bazel-bin <----
│   ├── chap_1.txt
│   ├── chap_2.txt
│   ├── WORKSPACE
│   ├── BUILD

But why didn’t the file get written to the root of the project? This is one of the key concepts within Bazel. Build actions do not modify the workspace, because doing so would break hermeticity and you would run the risk of inconsistent builds. This is why all outputs are written to a separate location and then linked into the workspace using symlinks.
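
The arrangement can be imitated with a short Python sketch: the output is written to a folder outside the workspace, and a symlink named bazel-bin inside the workspace points at it. The output_base/bin folder and the function name are hypothetical and chosen only for this illustration; where Bazel really stores outputs differs per machine.

```python
import os
import tempfile
from pathlib import Path
from typing import Tuple


def build_outside_workspace(workspace: Path, output_root: Path) -> Path:
    """Write book.txt under output_root and expose it through a workspace symlink."""
    output_root.mkdir(parents=True, exist_ok=True)
    chapters = sorted(workspace.glob("chap_*.txt"))
    (output_root / "book.txt").write_text("".join(c.read_text() for c in chapters))
    link = workspace / "bazel-bin"
    if link.is_symlink():
        link.unlink()  # refresh a link left over from an earlier run
    os.symlink(output_root, link, target_is_directory=True)
    return link


def demo() -> Tuple[bool, str]:
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "root"
        workspace.mkdir()
        (workspace / "chap_1.txt").write_text("Chapter 1\n")
        (workspace / "chap_2.txt").write_text("Chapter 2\n")
        link = build_outside_workspace(workspace, Path(tmp) / "output_base" / "bin")
        return link.is_symlink(), (link / "book.txt").read_text()
```

The source files are never touched, yet the result is still reachable from inside the workspace through the link.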

Expanding the example across the Workspace

Now, let's expand our example. The author was so happy with our build action that they asked us to extend it. The new requirement is that the build additionally downloads an image from the internet and places it next to the book. The important thing here is that the image is not a part of the workspace and needs to be downloaded. Furthermore, we need some way of making sure that we can still do hermetic builds even though we are depending on external resources. Let's take a look at how we do this.

First, we define and load a workspace rule that allows us to run an action within the workspace. Workspace rules are there to prepare the workspace for the build action that we are about to run. This can be anything from downloading a compiler to fetching any other tool. However, for our example, we are just going to download a single file from the internet and use it as the cover of our book.

# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")

http_file(
  name = "book_cover",
  url = "https://upload.wikimedia.org/wikipedia/commons/c/c5/Big_buck_bunny_poster_big.jpg",
)

At the top of the file, we start by loading the rule that will perform the workspace action. The function we call is load(), which tells our workspace to fetch the http_file workspace rule from a definition that is built into Bazel.

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")

The second part works almost exactly like the genrule action we did earlier. We call the http_file function and give it a name. Let’s call this action book_cover.

name = "book_cover"

The next argument we give our workspace function is the url argument. This is where we point the rule to the URL of the file that we want to fetch. In our case, we have an image from the internet that we’ll use as our cover:

url = "https://upload.wikimedia.org/wikipedia/commons/c/c5/Big_buck_bunny_poster_big.jpg",
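
A file fetched from the internet can change between builds, which is why downloads are commonly pinned to a checksum; http_file accepts a sha256 attribute for this purpose. The Python sketch below shows the kind of check this implies, using stand-in bytes instead of the real image. The function name and the sample data are assumptions for illustration, and no real digest for the cover image is given here.

```python
import hashlib


def verify_download(data: bytes, expected_sha256: str) -> bytes:
    """Return the data unchanged if its SHA-256 digest matches, otherwise raise."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch: expected {expected_sha256}, got {actual}")
    return data


# Stand-in bytes; a real check would use the downloaded image.
cover = b"pretend these are image bytes"
# Recorded once, at the moment the dependency is added to the project.
pinned = hashlib.sha256(cover).hexdigest()
```

If the remote file is ever replaced, the digest no longer matches and the fetch is rejected instead of silently feeding different bytes into the build.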

With this rule, you have now created an external dependency on a file that can be referenced inside your workspace. Now, let's take a look at how we can extend the compose_book build action to use this newly added dependency:

genrule(
    name = "compose_book",
    srcs = ["//:chap_1.txt",  "//:chap_2.txt", "@book_cover//file"],
    outs = ["book.txt", "cover.png"],
    cmd = "cat $(location //:chap_1.txt) $(location //:chap_2.txt) > $(location book.txt) && cp $(location @book_cover//file) $(location cover.png)"
)

The first step is to include the external file as a source input. Since this file is not a part of the local workspace, it needs to be referenced through the workspace file. The @ symbol at the start indicates that we are referencing an external repository that is defined in the workspace file. When we called the http_file function in the workspace file, we gave it the name book_cover. When the file gets downloaded, it is made available under that name, and to reference it, we use the @ symbol followed by the name. The http_file rule downloads the target file and makes it available through a filegroup. We’ll look at filegroups in more detail in a later chapter, but for this example, all we need to know is that to reference the actual file, we do it using //file:

@book_cover//file

The next step is to define the new output. We want the downloaded file to be placed next to the composed book and be called cover.png. All we have to do is add the name to the outs array, and then the build action knows that, in addition to the book.txt file, we also expect there to be a cover.png file created:

outs = ["book.txt", "cover.png"],

Lastly, we extend our build command to also copy the file from the book cover package into the output folder. The cp command is similar to the cat command that we were already using. cp copies a file from one location to another, and in this case, we pass it the book_cover file and output it as cover.png:

cp $(location @book_cover//file) $(location cover.png)

Running the build action now outputs both the combined chapters (book.txt) and the downloaded cover (cover.png) in the bazel-bin folder.
