Building C: gcc and make

As discussed in Basic principles of C, code written in C needs to be built (compiled and linked) to produce executables. What follows is a short discussion of this process.

Building with GCC

The GNU Compiler Collection (GCC) is a suite of tools for the compilation process. The GCC is free software. It is pre-installed on most GNU/Linux distributions, and is also available on OS X and Windows. (On Windows you will need to install MinGW, which provides a Windows port of the GCC, usually alongside a minimal Unix-like shell environment.)

Due to its popularity and its free and open-source nature, the GCC has become, officially or unofficially, the standard compilation system on most Unix-like systems.

The GCC is designed so that the interface is the same on all systems. To compile the C source file foo.c, the basic command is
$ gcc foo.c
This will compile and link foo.c and produce the executable file a.out. You can specify a different name for the output file using the -o flag:
$ gcc -o foo foo.c

Multiple source files

If your program is written across several different source files, then the linking process can be taken care of automatically:
$ gcc -o output foo.c bar.c baz.c

Options

There are many options that can be passed to gcc (and to its sisters, described below). Here are some of the more important ones:

  • -I/path/to/dir: Adds the specified directory to the list of places where #include commands will search for files. On Unix-like systems, directories such as /usr/include and /usr/local/include are usually searched automatically, but if you need the header file of a library that is not installed there, then you will need to specify the path.

    For example, on my laptop (running OS X), the GSL is installed in /usr/local/Cellar/gsl/1.16. To use the GSL headers, I would need to pass the flag -I/usr/local/Cellar/gsl/1.16/include/.
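  • -llibrary: Indicates that library should be linked. For example, if I am using the GSL, then I would need to link using -lgsl (in practice, GSL programs usually also need -lgslcblas and -lm). If the library is not installed in a standard location, the flag -L/path/to/dir adds a directory to the list of places searched for libraries.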
  • -Wall: Tells the compiler to enable a large set of commonly useful warnings. (Despite the name, this is not every warning: -Wextra enables more.) Warnings can be useful for revealing those places where your code compiles, but is suspicious or not written as cleanly as it could be (for example, if you have declared variables which you do not actually use).
  • -g: Tells the compiler to include debugging information in the produced object code or executable, such as symbol names and references to lines of the source files. This is useful because it allows a debugger such as gdb, or a memory checker such as valgrind, to tell you not just that a crash or a memory leak has happened, but also where in the source it took place.
  • -O0: Tells the compiler not to optimise the compiled code (this is gcc's default if no -O flag is given). When optimising, compilers do not simply translate source code directly into machine code; they may, for example, change the order in which some statements are executed. Optimisation usually improves the speed of the program and may reduce the size of the executable. Turning off optimisation may be desirable if you want a faster compilation, and is also useful in testing because the produced executable will better resemble the source in its structure.

Other tools in the GCC

The GCC also provides a suite of compilers for other languages, including g++ for C++ and gfortran for Fortran. All of these generate object code files (files with the .o extension) which are compatible: they may be linked together. Hence different parts of a program may be written in different languages, compiled into object code and then linked into one executable.

GNU Make

If you are working on a project with many source files and you have to compile many times during development, then typing out the above commands over and over again, and trying to remember what flags to pass to gcc, can be tedious. GNU Make is a very useful tool for automating the build process.

The GNU Make utility is not part of the C language, but it is an important part of its ‘culture’: Software written in C (or C++, or Fortran, or any other compiled language) is often built using GNU Make. Like GCC, GNU Make usually comes with GNU/Linux distributions and is available on OS X and Windows (with MinGW).

GNU Make will be described in more detail in a future post.

Other tools

I mainly stick to using GNU Make, but a few other tools are worth mentioning: they are also not parts of the C language, but common parts of the culture.

Git

Git is a version control system that allows you to keep track of changes to source files in a project. You edit your files and then commit your changes. The history of these commits is recorded, and you can revert to previous versions.

Git is designed for multiple developers working on different parts of a project. Alice, having changed a few files, commits and then pushes her commits. Bob can then pull these changes: Git will try to update Bob’s copy of the code to apply the changes that Alice has made, without overwriting any changes that Bob might have made.

Alice and Bob may also create branches of the code if they want to work on different features of the project whilst keeping the master branch unchanged.

GNU Autotools

If you are working on a large project, then building can be tedious even with the aid of GNU Make: Keeping track of your source files and updating your makefiles to reflect their changes is difficult. The GNU Autotools suite is designed to help automate that process.

Integrated development environments (IDEs)

Alternatively, you may use an IDE such as Eclipse (cross-platform), Xcode (on OS X), or Microsoft Visual Studio (on Windows) to keep track of files and build processes. Most IDEs provide a compilation and linking system as well as a debugger. Their interfaces tend to be daunting, however.

Basic principles of C

History of C

C has existed since 1972. Its current standard is C11 (published in 2011), but only a few details have changed since the 1989 standard (C89, also known as ANSI C). It is an extremely standardised language; different operating systems provide different libraries with different functions, but the syntax of the language is very simple. At its core, C is a very small language; there are only a handful of keywords offering basic notions such as basic types (e.g. int and float) and control structures (e.g. for and switch). The basic language is supplemented with the standard library which provides functions for basic operations, such as input and output, string handling and mathematical functions.

C is the precursor of many modern programming languages, and has influenced many others. Hence, if you are familiar with any of C++, C#, Java, Perl, PHP or Python, then you will be familiar with many aspects of C’s syntax.

Compilation and linking

C is a compiled language. Raw C code is stored in source files or header files. In order to produce an executable application, the source files must be compiled into object files, and then the object files must be linked. (Traditionally, source files have extension .c, header files have extension .h, and the object file produced by compiling foo.c is called foo.o.) The combined process of compiling and linking is called building. Sometimes, the term compiling is used to mean both compiling and linking.

Often, the same functions will need to be used by many different applications: in particular, the standard library functions. It would be wasteful for each C programmer to implement their own exp function, so exp (and other basic mathematical functions) are provided in the standard library. When writing your own program foo, all you need to do before you can use exp is to tell the compiler that exp is a function that takes in a floating-point number and returns another floating-point number. This is a declaration. For a standard library function such as exp, the declaration is included in a standard header file, here math.h, which you include into your source file foo.c:

#include <math.h>

Somewhere in math.h is a line that looks like

double exp(double);

which is the declaration of exp. You then compile foo.c and produce an object file foo.o. At this point, foo.o does not contain any details about what exp actually does; all it knows is what arguments exp takes in and what it returns. Hence you cannot run foo.o; you must link it with the compiled code where the actual implementation of exp is given. On many Unix-like systems this lives in the maths library libm, which is linked using the flag -lm.

There are two advantages to doing things this way. One is that the implementation of a function like exp can be used by millions of different programmers. The other is that your usage of a function is independent of its implementation; different systems or machines will often provide different implementations, but your part of the code will work with any of them.

In practice

The build processes require a number of commands. When building a large project, it is common to automate the build processes using a number of tools that are not actually part of the C language, but are important parts of its ‘culture’. GNU Make, which will be the subject of a future post, is one such tool; many projects distribute their source code along with a ‘makefile’ which contains instructions for automating the build process.

An introduction to an introduction to Mathematica

This is the first post in a series on Mathematica. The series is meant to complement and supplement a five-day course in basic Mathematica that I shall be giving at Cambridge in June 2016. I will post after each lesson if I feel that there is something that needs to be clarified, or if some example code will be useful, but these might not be complete course notes.

About this course

I plan to begin with a basic introduction, where I will give an overview of some of the mathematical features that Mathematica offers. In particular, we will cover random number generation, Fourier transforms and NDSolve. We will also use Plot, ListPlot and related functions.

We will introduce Import, which is used for importing data and including code from source files.

I will also introduce the basic concepts of functional programming. Mathematica is a functional programming language; while it provides control structures such as If, For and While, there are often much neater ways of doing things using the likes of Map (or /@), Select and Nest. In fact, If, For and While are themselves treated as functions. We will learn about anonymous functions and the /. and // operators.

My aim will be to focus on concepts and style, not fluency: Mathematica functions tend to have a complicated and difficult-to-remember syntax, but the inbuilt help is very useful for looking up the details.

An introduction to an introduction to C

This is the first post in a series on C. The series is meant to complement and supplement a five-day course in basic C that I shall be giving at Cambridge in June 2016. I will post after each lesson if I feel that there is something that needs to be clarified, or if some example code will be useful, but these might not be complete course notes.

About this course

The course is intended for people with some prior experience of programming in an imperative language, although I shall assume no experience with C. I will not dwell on details of the language but will try to move quickly into applications in maths, particularly in numerical computing.

Here is the provisional course structure:

  • Language features. Hello world. Compiling and linking. Declarations and definitions. The types int and double. Functions and return values. The if and else, for, while and switch structures. The break and continue keywords.
  • Memory, pointers and arrays. The concept of memory and why variables must be declared. Declaration and definition of arrays. Declaration of pointers and the use of malloc and free. Arrays as pointers. Passing arrays to functions. Functions with multiple or array outputs.
  • (*) Input and output. The char type. Strings as char arrays, the string.h library. The printf and snprintf functions. Working with files.
  • Mathematics. Variations on the int type. Floating-point calculations. The math.h header. Random number generation with rand.
  • Applications. Euler’s explicit method. A Monte-Carlo integrator and pi calculator. A heat equation solver (Crank-Nicolson).
  • (*) Structures. Application: A simple three-body problem solver.
  • The GNU scientific library. Brief overview of features. Fourier transforms. (*) A KdV solver (split-step pseudospectral).

Before starting

You will need to have a building system set up on your computer. Most Linux distributions come with such a system. If you are on OS X, then two good options are clang and gcc. If you are on Windows, then Microsoft Visual Studio Express is probably the easiest to set up, although clang and gcc are also available.

GNU Make may be useful if you are using clang or gcc. GNU Make is a system for automating the building process, and is widely used, not just in C.

About this website

This is the first post on this website. At the moment, it consists of just a few pages describing me and my work. Occasionally, I will post things about maths, science, computing, Chinese history, or anything else that interests me.

I am starting this blog as part of a migration away from Facebook. There are several reasons for this, many of which are privacy-related. Richard Stallman and the Free Software Foundation have a more detailed list of Facebook’s transgressions. Below I explain some of the main points.

Privacy

When people talk about ‘privacy’ on a social network such as Facebook, they often think about controls that keep co-workers, bosses or students from seeing posts that they make in their personal lives. This is an important aspect of privacy, and while Facebook does not offer complete protection, it has made improvements, and a savvy user can achieve these controls quite easily.

But the true danger to privacy that Facebook presents is that Facebook themselves may read posts or things said in supposedly private conversations between users. The danger is not that Mark Zuckerberg will personally read your conversations and use them for blackmailing or shaming you. Rather, it is your usage patterns, writing style or unconscious behaviour which give away the most interesting information about you. Facebook is also capable of tracking your browsing habits on other sites. Logging out doesn’t protect you from this tracking.

The upshot: Even if you never write a message or post a status explicitly stating anything, and even if you give a false name, age or gender, it is easy to build a detailed profile of you, by linking together all of the information that is collected.

Targeted advertising is not a huge worry for me; I never pay attention to adverts anyway. I am most concerned by the prospect of medical information being collected or deduced: an insurance company could use this to set my premiums, or a prospective employer could discriminate against me based on my medical conditions. (The latter may be illegal, but that wouldn’t necessarily stop them.) This is not an unfounded concern: one of my friends noted that she was getting adverts targeted towards one of her conditions.

Ownership, openness and censorship

Centralised, proprietary systems such as Facebook, but also other networks such as Tumblr or WordPress.com, are not a sensible medium for storing or publishing media such as articles or photos. The danger comes from (a) the possibility that the service could be terminated with little or no warning, causing your media to be lost, and (b) the possibility of the host censoring your media.

I don’t know anything about copyright law or fair use, but the prospect of Facebook using my photos as their own (perhaps selling them off as stock photos, for example) is actually a fairly minor concern for me.

Facebook can censor posts arbitrarily. In 2014, it removed a photo of breastfeeding. In practice, its censorship seems to be motivated not by its own morality, but its desire to keep itself unblocked in countries such as Russia and Turkey. It does this by censoring pages of dissent, essentially to appease the Russian and Turkish governments.

Although there is no evidence of WordPress.com doing the same, one has no guarantee against it.

This website is hosted on an independent server in Cambridge (but independent of the University Computing Service), and is far less vulnerable to this sort of censorship. If I posted something illegal, libellous or extremely controversial, then the service provider might order the shutting down of this site or the government might order my arrest, but these powers are subject to public oversight, and are harder to abuse.

(Note that WordPress.com refers to the blog hosting service; this website is powered by the software WordPress but is hosted independently.)

Facebook as a walled garden

While Facebook can be useful for sharing things amongst immediate friends, the audience of such posts is in most cases ultimately limited to other users of Facebook. Hence Facebook is not really such a public platform. (Contrast that against this post, for example, which can be read by anybody on the Internet.)

Student unions often use Facebook to make announcements, rather than university email. This means that announcements, including important announcements such as upcoming committee elections, never reach students who are not on Facebook or not connected to the rest of the student body. This is undemocratic, and particularly affects mainland Chinese students.

Epilogue

Writing this has taken much longer than I had expected, and I need to go and do some work now, but hopefully it will be useful for persuading some other people to leave, perhaps reverting to email (or even face-to-face contact!) for communication.