The Bond Language
When Bond was created, the goal was not to invent yet another language; it was more of an endeavour to create a minimalistic runtime environment for a bytecode interpreted language. So, where possible, Bond borrowed its syntax and semantics from C. This document begins by covering some notable differences between Bond and C that exist in the spirit of keeping things clean and simple. Then the document moves on to cover the Bond language in detail.
- Notable Differences With C
- No preprocessor
- No header files or forward declarations
- Include directives
- Ambiguities in the grammar
- Type declarations
- Primitive type names
- Numeric types have a fixed width
- No function pointers
- No variadic functions
- Namespaces
- Structs can have member functions
- No global or static variables
- Scope-based resource management
- Native blocks
- Language Reference
- Basics of Bond
- Structure of a Program
- Identifiers
- Keywords
- Primitive Types
- Boolean Type
- Integer Types
- Floating-Point Types
- Literal Constants
- Boolean Constants
- Integer Constants
- Floating-Point Constants
- Character Constants
- String Literals
- Variables
- Operators
- Assignment Operator
- Arithmetic Operators
- Increment and Decrement Operators
- Relational and Comparison Operators
- Logical Operators
- Bitwise Logical Operators
- Bit Shifting Operators
- Compound Assignment Operators
- Ternary Operator
- Comma Operator
- Type Casting Operator
- sizeof and alignof Operators
- Operator Precedence
- Statements
- Expression Statements
- Declarative Statements
- Compound Statements
- if Statements
- switch Statements
- while Statements
- for Statements
- Jump Statements
- Functions
- Function Definitions
- Function Calls
- Data Types
- Enumerated Types
- Array Types
- Structure Types
- Pointer Types
- Namespaces
- Scope-Based Resource Management
- Native Blocks
Notable Differences With C
No preprocessor
There is no preprocessor, so there are no macros or other mechanisms to manipulate the source code before it is scanned by the compiler.
No header files or forward declarations
All code is placed in .bond files with no need for separate header files. Header files are not needed since type, function and constant declarations do not need to occur before they are referenced in source code; the compiler will resolve those references during semantic analysis after all files have been parsed. This important difference with C also means that structs do not need to be forward declared and functions do not need separate declarations and definitions.
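As a minimal sketch of what this allows, the snippet below calls a function before its definition appears in the file. The function name Helper is hypothetical and used only for illustration.

```bond
// Helper is referenced here before it is defined. No forward
// declaration is needed, since references are resolved during
// semantic analysis after all files have been parsed.
int main()
{
    return Helper();
}

int Helper()
{
    return 0;
}
```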
Include directives
Despite the lack of header files, the Bond language supports include directives so that the compiler may be informed about where referenced code is defined so that it may be scanned and parsed prior to semantic analysis. In contrast to C's include directive, Bond's is a compiler directive, not a preprocessor directive. The text of the included file is not substituted in place into the including file. Instead, the referenced file is added to the list of files to be scanned by the compiler if it is not already in the list. If files are multiply or circularly included, the compiler will only scan the included file once. The syntax is as follows. Note that include is a reserved keyword.
include "file/to/include.bond";
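The handling of circular includes described above can be sketched with two files that include each other. The file names a.bond and b.bond and the function ValueFromB are hypothetical; the two files are shown in one listing for brevity.

```bond
// File: a.bond (hypothetical)
include "b.bond";

int main()
{
    return ValueFromB();
}

// File: b.bond (hypothetical)
// This include is circular. Since a.bond is already in the list of
// files to be scanned, it is not scanned a second time.
include "a.bond";

int ValueFromB()
{
    return 0;
}
```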
Ambiguities in the grammar
C's grammar is ambiguous and Bond inherited those ambiguities. Some statements in C can be parsed in more than one way, such as the following:
A * B;
This line of code can be parsed as the declarative statement: B is a pointer to type A. Or it can be parsed as the expression statement: the value of variable A times the value of variable B. C compilers resolve the ambiguity using forward declarations and by giving the parser access to the symbol table so that it can look up whether A is a type. This approach introduces a burden on the programmer to add forward declarations to the source code and pushes elements of semantic analysis (e.g. maintenance of the symbol table) into the parser.
Fortunately, such ambiguities result in statements with no effect when parsed as expression statements, so it is not necessary to support such statements. Bond's approach to resolving the ambiguity is to favour parsing statements as declarative statements whenever possible. If it turns out that A is not a type, the semantic analyzer will raise an error to that effect during a later stage of compilation.
In hindsight, it may have been worth considering eliminating such ambiguities by modifying the syntax for declarative statements.
For clarity and to avoid other possible ambiguities, the syntax for casting and the sizeof operator were both slightly modified from C. The syntax for casting is:
cast<type>(expression)
The sizeof operator comes in two forms: one for types and one for expressions. The notation is the following:
sizeof<type>
sizeof expression
The expression in the second form can optionally be wrapped in parentheses as is the case in C.
Type declarations
To simplify reading and parsing type declarations, all elements pertaining to type declarations (e.g. type names, [], *, const) appear prior to variable names. The most obvious effect of this change is with array declarations, but there are some other subtleties as well.
int[3][5] array; // C equivalent is: int array[3][5];
int[3] a, b;     // Both a and b are of type int[3].
int* a, b;       // Both a and b are of type int*. In C, a is int* and b is int.
Primitive type names
All primitive type names consist of a single reserved keyword. Where C has the type unsigned int, Bond has uint.
Numeric types have a fixed width
The types char, short, int and long (as well as their unsigned counterparts) have 8, 16, 32, and 64 bits respectively. The types float and double have 32 and 64 bits respectively. See the section on Primitive Types for further details.
No function pointers
Indeed, the Bond language makes no provisions for function pointers.
No variadic functions
Adding support for variadic functions involves work that has not been tackled. Bond may have variadic functions someday, but for the time being, that work is not planned.
Namespaces
Bond has namespaces that serve the same purpose as C++ namespaces, which is to prevent name collisions. All declarations of the Bond standard library are in the Bond namespace.
namespace Foo
{
    const int BAR = 42;
}

// Identifiers within namespaces are qualified with ::
const int BAZ = Foo::BAR;
See the section on Namespaces for further details.
Structs can have member functions
Another aspect borrowed from C++ is struct member functions. See the section on Structure Types for further details.
No global or static variables
Global constants are permitted; global variables are not. (TODO: see the documentation on the Virtual Machine Architecture for the justification.)
const int[2][3] GLOBAL_CONSTANT = {{1, 2, 3}, {4, 5, 6}}; // Permitted
int[2][3] GLOBAL_VARIABLE = {{1, 2, 3}, {4, 5, 6}};       // Error
There are also no static variables, so all local variables have a lifetime that does not surpass that of the scope in which they are declared.
Scope-based resource management
C has explicit resource management, meaning that the acquisition of a resource must be explicitly paired with the release of that resource in code. For example, memory allocated with malloc() must be deallocated with free() and files opened with fopen() must be closed with fclose(). Bond supports explicit resource management as well, but it also has a form of scope-based resource management that allows resources to be automatically released when a function returns. See the section on Scope-Based Resource Management for further details.
Native blocks
Declarations for functions and structs implemented in C++ can be placed in native blocks, allowing those functions to be called and those types to be manipulated from Bond code with full type checking and semantic analysis. See the section on Native Blocks for further details.
Language Reference
Basics of Bond
Structure of a Program
For a quick overview of several of the fundamental concepts of Bond, let us take a quick look at the ubiquitous Hello World program. The source code for it follows.
// This is a Hello World program written in Bond.
include "io.bond";

int main()
{
    Bond::StdOut()->PrintStr("Hello World!");
    return 0;
}
When this program runs, it prints the following output to the screen.
Hello World!
Here is a line-by-line breakdown of the program.
- Line 1: // This is a Hello World program written in Bond.
- Two consecutive forward slashes indicate the beginning of a comment. Comments are useful for documenting code, but do not influence the program. They are ignored by the compiler.
- Line 2: include "io.bond";
- An include directive tells the compiler that the programmer wishes to reference code written in another file. In this case, the referenced file is io.bond. The file io.bond is part of the Standard Bond Library and defines input and output operations.
- Line 3: A blank line.
- Blank lines have no effect on the program, but do help to organize code and improve readability.
- Line 4: int main()
- This line begins the declaration of a function whose name is main. A function encapsulates the code that performs a specific task or operation. Functions are covered in greater detail in the Functions section later.
A function declaration specifies a return type (i.e. what kind of result the function produces; in this case int), a name (main), and a parameter list enclosed in parentheses (i.e. the operands of the operation performed by the function). This example does not require any parameters.
The main function is a special function for standalone Bond programs. It is the entry point of the program (i.e. where execution of the program begins). When Bond is embedded in a larger application written in C++, any function may be an entry point.
- Lines 5 and 8: { and }
- A function has an opening brace and a closing brace to enclose its body, the code that performs the function's task.
- Line 6: Bond::StdOut()->PrintStr("Hello World!");
- This line of code is a statement. Statements perform actions. This particular line of code is quite loaded, since it calls two functions.
The first function is called Bond::StdOut and it is defined in io.bond, which was included on line 2. The function is actually just called StdOut and it is defined in the namespace called Bond. We will see more about namespaces in the Namespaces section. The pair of parentheses that follow indicate that we are calling the function. The function returns an OutputStream, which allows the program to print some output.
The second function is called PrintStr. The -> symbol indicates that the function belongs to the OutputStream returned by the first function. Again the pair of parentheses indicate that we are calling the function. However, this time the parentheses enclose the string "Hello World!". That string is given to the PrintStr function so that the function may print it to the screen.
- Line 7: return 0;
- This line of code is a statement that specifies what value is returned by the main function. As we saw on line 4, the main function returns an int. In Bond, as with other programming languages, such as C and C++, the return value of the main function indicates the exit status of the program. By convention, a return value of 0 indicates that the program terminated normally, whereas any other value indicates an application specific error code.
Identifiers
Identifiers are names for symbols such as variables, functions and types defined in code. Identifiers consist of any sequence of letters, digits and underscores ('_'), with the exception that the first character cannot be a digit. Also, keywords are reserved and cannot be used as identifiers. Identifiers are case sensitive.
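The rules above can be illustrated with a few declarations, assumed to appear inside a function body. The variable names are hypothetical, and the invalid forms are commented out.

```bond
int count;     // Letters only.
int _count;    // A leading underscore is allowed.
int count2;    // Digits are allowed after the first character.
int Count;     // Distinct from count, since identifiers are case sensitive.

// int 2count; // Invalid: the first character cannot be a digit.
// int while;  // Invalid: while is a reserved keyword.
```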
Keywords
Keywords are reserved by the language and have special meaning for the compiler. The following is a list of all Bond keywords.
alignof bool break case cast char const continue default do double else enum false float for if include int long namespace native null return short sizeof struct switch this true uchar uint ulong ushort void while
Primitive Types
Bond is a statically typed language, so every variable and expression has a type known at compile time. The type of a value describes how much space it occupies in memory, how that memory is interpreted and what operations and range of values are valid. At the core of the type system are the primitive types. Primitive types are predefined in the language and are named by their reserved keyword. The type system extends well beyond the primitive types; the remaining types are covered in the Data Types section.
Boolean Type
There is one type to represent Boolean values named bool. It has two values: true and false. The bool type is implemented in the Bond Runtime Library using the C++ uint8_t type.
Integer Types
The integer data types are used for storing values which are whole numbers. There are several types that vary in size and range of values that they can represent. Note that the char and uchar types are used to store characters.
The following table provides the details for all of the integer types, including the underlying C++ types used to implement them in the Bond Runtime Library.
Type | C++ Type | Storage Size | Value Range |
---|---|---|---|
char | int8_t | 8 bits | -128 to 127 |
uchar | uint8_t | 8 bits | 0 to 255 |
short | int16_t | 16 bits | -32,768 to 32,767 |
ushort | uint16_t | 16 bits | 0 to 65,535 |
int | int32_t | 32 bits | -2,147,483,648 to 2,147,483,647 |
uint | uint32_t | 32 bits | 0 to 4,294,967,295 |
long | int64_t | 64 bits | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
ulong | uint64_t | 64 bits | 0 to 18,446,744,073,709,551,615 |
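Since the storage sizes are fixed, the sizeof operator (covered in the sizeof and alignof Operators section) should report the same values on every platform. The following sketch derives the expected sizes in bytes from the table above; the variable names are hypothetical.

```bond
uint sizeofChar = sizeof<char>;    // 1 byte (8 bits).
uint sizeofShort = sizeof<short>;  // 2 bytes (16 bits).
uint sizeofInt = sizeof<int>;      // 4 bytes (32 bits).
uint sizeofLong = sizeof<long>;    // 8 bytes (64 bits).
uint sizeofUlong = sizeof<ulong>;  // 8 bytes, the same as its signed counterpart.
```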
Floating-Point Types
Bond supports two floating-point types which are used for storing values that approximate real numbers: float and double. These two types are implemented in the Bond Runtime Library using the C++ float and double types, which are assumed to conform to the IEEE 754 binary32 and binary64 formats respectively.
Literal Constants
Literal constants represent immutable values in source code. They can be booleans, numbers, characters, strings of characters or the null pointer.
Boolean Constants
The type bool has two values represented by the keywords true and false.
Integer Constants
Integer constants represent integral values (i.e. whole numbers). They can be expressed in decimal (base 10), octal (base 8) or hexadecimal (base 16).
Hexadecimal integers consist of 0x and a sequence of hexadecimal digits which are the digits 0 through 9 and the letters a through f and A through F. For example, 0x2f represents decimal value 47.
Octal integers consist of 0 and a sequence of octal digits which are the digits 0 through 7. For example, 063 represents the decimal value 51.
Decimal integers consist of a sequence of the digits 0 through 9 and cannot begin with a 0.
An optional suffix u or U can be appended to force the value to be of type uint and an optional suffix l or L can be appended to force the value to be of type long. The two can be combined to create values of type ulong. For example, 27u is of type uint and 27ul is of type ulong.
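The notations and suffixes described above can be summarized in a short sketch. The variable names are hypothetical, and the first three constants all denote the same value.

```bond
int a = 47;      // Decimal.
int b = 0x2f;    // Hexadecimal: 2 * 16 + 15 is 47.
int c = 057;     // Octal: 5 * 8 + 7 is 47.
uint d = 27u;    // The u suffix makes the constant a uint.
long e = 27l;    // The l suffix makes the constant a long.
ulong f = 27ul;  // Both suffixes combined make the constant a ulong.
```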
Floating-Point Constants
Floating-point constants represent an approximation of real numbers. They are composed of an integral part, a decimal point, a fractional part and an exponent part. The integral part or the fractional part may be omitted, but not both and the exponent may always be omitted. The integral and fractional parts consist of a sequence of digits, while the exponent consists of e or E followed by an optional sign and a sequence of digits.
An optional suffix f or F can be appended to force the value to be of type float, otherwise the value will be of type double.
The following are examples of valid floating-point constants.
3.        // Integral part only. Without a '.' the value would be an int.
.5        // Fractional part only.
3.14159   // Both integral and fractional parts.
314159e-5 // Same as above expressed with an exponent.
3.14159f  // This value is of type float.
Character Constants
Character constants represent values of type char and are typically expressed as a single character between single quotes. For example, 'a' represents the letter a.
Some values cannot be expressed with a single character and are expressed as an escape sequence between single quotes. Escape sequences begin with a backslash and are depicted in the table below.
String Literals
String literals represent a sequence of characters of type const char[]. They consist of a sequence of character constants enclosed in double quotes and are implicitly terminated with a null character (\0).
Variables
A variable is a named storage area for a value of a particular type. A variable can be used to store the result of a computation and recall it later. The syntax to declare a variable is a type name followed by a name (an identifier). The following snippet shows the declaration of a variable named x that stores an int.
int x;
Several variables of the same type can be declared at once by using a comma separated list of names.
int x, y, z;
However, it is often easier to read and understand code that uses separate declarations.
int x;
int y;
int z;
A variable can be initialized when it is declared by following its name with an equal sign and an expression that produces the value that is assigned to the variable.
int x = 7;
The value of a variable can be accessed by placing its name in an expression.
int x = 6;
int y = 8;
// Add the value of variables x and y.
int sum = x + y;
A variable can be assigned a new value over the course of a program's execution by placing it on the left hand side of an equal sign.
// Initialize x to 0.
int x = 0;
// Add 1 to the current value of x and assign the result to x.
x = x + 1;
// The value of x is now 1.
The value assigned to a variable must be of an appropriate type, meaning that it has exactly the same type as the variable or that converting a numerical value from one type to another would not lead to a loss of data. For example, assigning a value of type double to a variable of type int would truncate the fractional part of the value, a value of type int cannot fit into a short since the latter's range is smaller, and an int would lose its sign when assigned to a uint. Conversely, assigning an int to a double and a short to an int are both acceptable since they are lossless operations. Assigning a uint to an int remains unsafe since both types can represent values that cannot be represented by the other.
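The following sketch restates these rules in code. It assumes that the cast operator (covered in the Type Casting Operator section) is the way to request a lossy conversion explicitly and that it truncates toward zero; the variable names are hypothetical, and the assignments that the rule above excludes are commented out.

```bond
short s = cast<short>(3);
int i = s;                // Appropriate: every short value fits in an int.
double d = i;             // Appropriate: every int value fits in a double.
int n = cast<int>(2.75);  // An explicit cast states the intent; n is 2.

// Not appropriate under the rule above, since each could lose data:
// int j = 2.75;  // The fractional part would be truncated.
// short t = i;   // The range of short is smaller than that of int.
// uint u = i;    // The sign would be lost.
```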
A variable declared with the keyword const cannot have its value changed after it has been initialized.
const double PI = 3.14159;
PI = 4; // Error. Cannot assign a value to a const variable.
There are at least two benefits to using constant variables. The first is that it enables the compiler to perform optimizations. If a constant variable is initialized with a constant expression that can be evaluated at compile time, writing and reading the variable's value to and from memory becomes a superfluous activity. The compiler can instead substitute the variable's value, which it already knows, directly into the generated code wherever that variable is referenced.
The second benefit to constant variables is that they improve the readability of code. Even if the value of a constant variable is not known at compile time, it provides the reader with a guarantee that there is no code that alters the value of the variable after it has been initialized.
const int result = CalculateResult();
// Skip many lines of complicated code.
// Without understanding all of the above code, we are certain
// that the value of result has been determined on the first line.
int resultPlusOne = result + 1;
Operators
Assignment Operator
As we have seen in the Variables section, the assignment operator, represented by an equal sign =, assigns a value to a variable. Even though the assignment operator is represented by an equal sign, it is not used for determining equality.
The expression producing the value that is assigned is on the right side of the assignment operator, and the variable receiving the value is on the left. In the following example, the value of the variable y is assigned to the variable x.
x = y;
As with all other expressions, the assignment expression evaluates to a value, which is the value assigned. The value of the assignment expression can then be used in another expression. This approach should be used carefully, since it can lead to code which is difficult to understand. In the following example, the assignment of the value 7 to the variable y produces the value 7, which is then added to the value 8 and assigned to the variable x.
x = 8 + (y = 7); // x is 15 and y is 7.
Arithmetic Operators
The five binary arithmetic operators are +, -, *, / and %, and they perform addition, subtraction, multiplication, division and modulo division respectively. The syntax for using these operators is straightforward and resembles the notation for basic arithmetic (i.e. the two operands appear on either side of the operator). The operands can be any expression that evaluates to a numerical type, including constants, variables, other arithmetic expressions, function calls and so forth.
double a = 2.0;
double b = a + 1.0;       // Addition: b is 3.0.
double c = b - 7.0;       // Subtraction: c is -4.0.
double d = b * c;         // Multiplication: d is -12.0.
double e = d / a;         // Division: e is -6.0.
int remainder = 23 % 10;  // Modulo division: remainder is 3.
Modulo division, which is applicable only to integers, produces the remainder of a division.
The + and - operators can also be used as unary operators and appear before their lone operand. The - operator negates its operand, meaning that it is equivalent to multiplying the operand by -1. The + operator is equivalent to multiplying its operand by 1, which leaves it intact. It is not particularly useful and is only included in the language for completeness.
double a = 3.0;
double b = -a; // b is -3.0.
double c = -b; // c is 3.0.
double d = +a; // d is 3.0.
double e = +b; // e is -3.0.
Putting it all together:
double a = -2.0 - -((3.0 * 12.0) / (3.0 + 6.0)); // a is 2.0.
Increment and Decrement Operators
Since adding and subtracting 1 from variables is such a common thing to do when programming, special shorthand operators exist specifically for that task denoted with ++ and --. The former adds 1 to its operand and the latter subtracts 1.
The following two lines of code are equivalent methods of adding 1 to the variable a.
++a;
a = a + 1;
Likewise, the following two lines of code are equivalent methods of subtracting 1 from the variable a.
--a;
a = a - 1;
Furthermore, the increment and decrement operators can both be used in prefix or suffix forms. Both forms perform the same operation; however, they differ in their order of evaluation. In the prefix form, the addition or subtraction is performed first, then the value of the expression is evaluated. In the suffix form, the value of the expression is evaluated first, then the addition or subtraction is performed. The following examples show the difference.
int a = 7;
int b = ++a; // a is 8 and b is also 8. a is incremented before its value is captured.
a = 7;
b = a++;     // a is 8, but b is 7. The value of a is captured before it is incremented.
Relational and Comparison Operators
The six relational operators are ==, !=, <, >, <= and >=, and they are used to compare the value of two expressions. The result of a relational operator is a bool (i.e. either the value true or false) indicating whether the comparison holds.
The == and != operators test whether two values are equal or not equal respectively.
int a = 7;
int b = 7;
int c = 9;
a == b; // Result is true since both operands are 7.
a == c; // Result is false since one operand is 7 and one is 9.
a != b; // Result is false since both operands are 7.
a != c; // Result is true since one operand is 7 and one is 9.
The <, >, <= and >= operators test whether the first operand is less than, greater than, less than or equal to and greater than or equal to the second operand respectively.
int a = 7;
int b = 7;
int c = 9;
a < b;  // Result is false since 7 is not less than 7.
a < c;  // Result is true since 7 is less than 9.
c < a;  // Result is false since 9 is not less than 7.
a > b;  // Result is false since 7 is not greater than 7.
a > c;  // Result is false since 7 is not greater than 9.
c > a;  // Result is true since 9 is greater than 7.
a <= b; // Result is true since 7 is equal to 7.
a <= c; // Result is true since 7 is less than 9.
c <= a; // Result is false since 9 is not less than or equal to 7.
a >= b; // Result is true since 7 is equal to 7.
a >= c; // Result is false since 7 is not greater than or equal to 9.
c >= a; // Result is true since 9 is greater than 7.
Logical Operators
The two binary logical operators && and || are used for combining the truth values of two Boolean expressions.
The conjunction operator && evaluates to true only if its operands are both true, otherwise it evaluates to false. If the first operand is false, the second operand is not evaluated at all, since the expression will evaluate to false regardless of the second operand's value.
The disjunction operator || evaluates to false only if its operands are both false, otherwise it evaluates to true. If the first operand is true, the second operand is not evaluated at all, since the expression will evaluate to true regardless of the second operand's value.
bool a = true;
bool b = true;
bool c = false;
bool d = false;
a && b; // Result is true since both operands are true.
a && c; // Result is false since second operand is false.
c && a; // Result is false since first operand is false.
c && d; // Result is false since both operands are false.
a || b; // Result is true since both operands are true.
a || c; // Result is true since first operand is true.
c || a; // Result is true since second operand is true.
c || d; // Result is false since both operands are false.
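The fact that the second operand is sometimes skipped only becomes visible when that operand has a side effect. The following sketch uses a hypothetical function named Noisy that prints a message each time it is called; it relies only on the StdOut and PrintStr functions shown in the Hello World program.

```bond
include "io.bond";

// Hypothetical helper that announces each call and returns true.
bool Noisy()
{
    Bond::StdOut()->PrintStr("Noisy was called");
    return true;
}

int main()
{
    bool a = false && Noisy(); // Noisy is not called; a is false.
    bool b = true || Noisy();  // Noisy is not called; b is true.
    bool c = true && Noisy();  // Noisy is called; c is true.
    return 0;
}
```

Under the rules described above, the message should be printed exactly once, by the third statement.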
The third logical operator is the unary not operator !. This operator negates the value of its Boolean operand, such that false becomes true and vice-versa.
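A short sketch of the not operator, in the same style as the examples above:

```bond
bool a = true;
bool b = false;
!a;        // Result is false since a is true.
!b;        // Result is true since b is false.
!(a && b); // Result is true since a && b is false.
```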
We can combine all of the operators that we have learned so far to formulate more complex conditions. The following snippet of code checks whether a variable named value falls in the range [5 - 10] if the range check is enabled. If the range check is not enabled, the condition always succeeds.
bool conditionValid = !rangeCheckEnabled || ((value >= 5) && (value <= 10));
Bitwise Logical Operators
There are a handful of operators for directly manipulating the bits of integer values. First are the bitwise logical operators, which are analogous to the Boolean logical operators seen above. With each of the binary bitwise logical operators, each bit of the left operand is paired with the corresponding bit of the right operand and the operation is performed between each pair of bits. A bit value of 0 corresponds to a truth value of false, whereas a bit value of 1 corresponds to a truth value of true.
The & operator performs a bitwise conjunction, while the | operator performs a bitwise disjunction. A third operator, ^, performs an exclusive disjunction. In the case of the exclusive disjunction operator, the result is true if exactly one of the operands is true.
int a = 12;
int b = 10;
a & b; // Result is 8, since 1100 & 1010 is 1000.
a | b; // Result is 14, since 1100 | 1010 is 1110.
a ^ b; // Result is 6, since 1100 ^ 1010 is 0110.
The ~ operator is analogous to the unary Boolean not operator. It flips each bit of its operand such that 0 becomes 1 and 1 becomes 0. Note that all bits including leading zeroes and the sign bit of signed integers are flipped.
int a = 12;
~a; // Result is -13, since ~1100 is 11111111111111111111111111110011.
Bit Shifting Operators
Another set of operators for manipulating the bits of integers are the bit shifting operators. The left shift operator << shifts the bits of the left operand to the left. The right operand specifies the number of positions by which the bits are shifted. Bits shifted out of the left side are discarded whereas bits inserted to the right are zero.
int a = 123; // a is 00000000000000000000000001111011.
a << 3;      // Result is 00000000000000000000001111011000.
Similarly, the >> operator shifts the bits of the left operand to the right by the number of positions specified by the right operand. Bits shifted out of the right side are discarded and bits inserted into the left side follow slightly more complicated rules. Inserted bits are usually zero, however, if the left operand is a signed integer, then the inserted bits are the same as the contents of the leftmost bit (i.e. the sign bit). This behaviour has the interesting property of preserving the sign of the left operand.
int a = 123; // a is 00000000000000000000000001111011.
a >> 3;      // Result is 00000000000000000000000000001111, zeros inserted on the left.
a = -123;    // a is 11111111111111111111111110000101.
a >> 3;      // Result is 11111111111111111111111111110000, ones inserted on the left.
Compound Assignment Operators
There are ten compound assignment operators that, in addition to performing an operation, assign the result to the variable or memory location (see the Array Types and Pointer Types sections) described by the left operand. These operators do not introduce any new operations; they just provide convenient shorthand notation. See the table below for an overview of all of the compound assignment operators.
Compound Operator | Equivalent Expression |
---|---|
a += b; | a = a + b; |
a -= b; | a = a - b; |
a *= b; | a = a * b; |
a /= b; | a = a / b; |
a %= b; | a = a % b; |
a &= b; | a = a & b; |
a |= b; | a = a | b; |
a ^= b; | a = a ^ b; |
a <<= b; | a = a << b; |
a >>= b; | a = a >> b; |
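The following sketch applies each compound assignment operator in turn to a single variable, with the value after each statement worked out from the equivalent expression in the table.

```bond
int a = 10;
a += 5;  // a is 15.
a -= 3;  // a is 12.
a *= 2;  // a is 24.
a /= 4;  // a is 6.
a %= 4;  // a is 2.
a <<= 3; // a is 16, since 00010 shifted left by 3 is 10000.
a |= 1;  // a is 17, since 10000 | 00001 is 10001.
a &= 3;  // a is 1, since 10001 & 00011 is 00001.
a ^= 5;  // a is 4, since 001 ^ 101 is 100.
a >>= 2; // a is 1, since 100 shifted right by 2 is 001.
```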
Ternary Operator
The ternary operator, also known as the conditional operator, allows for an expression to evaluate to one of two results based on the evaluation of a boolean condition. The notation for the ternary operator is as follows:
condition ? expression1 : expression2
The condition expression is evaluated first, which must evaluate to a value of type bool. If the result of the condition is true, then expression1 is evaluated and the result of the ternary operator is the value of expression1. Otherwise, if the condition is false, then expression2 is evaluated and the result of the ternary operator is the value of expression2. Note that only one of expression1 or expression2 is evaluated, and both expressions must have compatible types.
bool a = true;
bool b = false;
int c = 6;
int d = 8;
a ? c : d; // Result is 6 since a is true so the value of c is evaluated.
b ? c : d; // Result is 8 since b is false so the value of d is evaluated.
((c >= 1) && (c <= 10)) ? (d - 1) : (d + 1); // Result is 7 (d - 1) since c is in the range [1 - 10].
// Multiple possible outcomes based on the result of function calls.
Condition1() ? (Condition2() ? Result1() : Result2()) : (Condition3() ? Result3() : Result4());
Comma Operator
The comma operator is used to concatenate several expressions where only one expression is expected. All of the subexpressions are evaluated in order, however, the result of the enclosing expression is that of the last subexpression only.
int a = 0;
int b = ((a = 7), a * 2); // a is assigned the value 7, then b is assigned the value 14 (i.e. 7 * 2).
The example above would be much easier to understand if it had been written in two statements: one that assigns a value to a and one that assigns a value to b. This lack of clarity is almost always present when using the comma operator, so it is best to avoid it when possible. There are cases, as we will see in the for Statements section, where the comma operator is an appropriate tool.
Type Casting Operator
The type casting operator allows a value of one type to be converted to another. It is often used to convert between numeric types, such as from double to int, but it can also be used to convert pointers from one type to another. Pointers are covered in the Pointer Types section. The notation for a type cast is as follows:
cast<Type>(value)
The result of the expression is value converted to Type. The following code snippet shows an example.
    double a = 1.5;
    double b = 2.6;
    int c = cast<int>(a + b) + 5;
    // cast<int>(1.5 + 2.6) + 5
    // cast<int>(4.1) + 5
    // 4 + 5
    // 9
sizeof and alignof Operators
The sizeof operator is used to evaluate the size in bytes of a value of a given type. It can be expressed in two forms as follows:
    sizeof<type>
    sizeof expression
In the first form, the type whose size is evaluated is explicitly named. In the second form, the size of the type of an expression is evaluated. Note that the expression itself is not evaluated; it is only used to specify a type. The expression can also optionally be wrapped in parentheses.
The notation for the alignof operator is the same, but evaluates instead to the memory alignment of the specified type.
    uint sizeofDouble = sizeof<double>;   // doubles occupy 8 bytes of memory.
    sizeofDouble = sizeof(4.0 + 2.0);     // equivalent to sizeof<double>.
    uint alignofDouble = alignof<double>; // doubles align on 8-byte memory address boundaries.
    alignofDouble = alignof(4.0 + 2.0);   // equivalent to alignof<double>.
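The point that the operand of sizeof is only used for its type, and is not evaluated, can be illustrated with a short sketch. It is written in C rather than Bond, on the assumption that Bond follows C here as the text above describes; the function names Bump, CallsMadeBySizeof and ExpressionSizeMatchesType are hypothetical.

```c
#include <stddef.h>

/* Increments *counter and returns the new value as a double. */
double Bump(int *counter)
{
    ++*counter;
    return (double)*counter;
}

/* Reports how many times Bump ran while taking sizeof of a call to it. */
int CallsMadeBySizeof(void)
{
    int counter = 0;
    size_t size = sizeof(Bump(&counter)); /* only the type, double, is used */
    (void)size;
    return counter;
}

/* Non-zero when the size of an expression equals the size of its type. */
int ExpressionSizeMatchesType(void)
{
    return sizeof(4.0 + 2.0) == sizeof(double);
}
```

CallsMadeBySizeof returns 0 because the call inside sizeof never executes.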
Operator Precedence
Since expressions can contain several operators, there are rules that determine the order in which operators are evaluated. These rules are analogous to the way that multiplication takes precedence over addition in arithmetic. Likewise, parentheses can be used to make the evaluation order explicit. The following table enumerates the precedence and associativity of Bond operators. Note that it includes some operators that will be covered in later sections of this document.
Precedence | Operator | Description | Associativity |
---|---|---|---|
1 | ++ -- | Postfix increment and decrement | Left-to-right |
1 | () | Function call | Left-to-right |
1 | [] | Array subscript | Left-to-right |
1 | . -> | Member access | Left-to-right |
2 | ++ -- | Prefix increment and decrement | Right-to-left |
2 | + - | Unary plus and minus | Right-to-left |
2 | ! ~ | Logical and bitwise not | Right-to-left |
2 | cast<> | Type cast | Right-to-left |
2 | * | Pointer dereference | Right-to-left |
2 | & | Address of | Right-to-left |
2 | sizeof alignof | Size and alignment of type | Right-to-left |
3 | * / % | Multiplication, division and modulo division | Left-to-right |
4 | + - | Addition and subtraction | Left-to-right |
5 | << >> | Bit shifting operators | Left-to-right |
6 | < <= > >= | Relational operators | Left-to-right |
7 | == != | Equality and inequality | Left-to-right |
8 | & | Bitwise conjunction | Left-to-right |
9 | ^ | Bitwise exclusive disjunction | Left-to-right |
10 | \| | Bitwise disjunction | Left-to-right |
11 | && | Conjunction | Left-to-right |
12 | \|\| | Disjunction | Left-to-right |
13 | ?: | Ternary operator | Left-to-right |
14 | = | Assignment | Right-to-left |
14 | *= /= %= += -= <<= >>= &= ^= \|= | Compound assignment | Right-to-left |
15 | , | Comma | Left-to-right |
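A few of the rows above can be illustrated with a minimal sketch. It is written in C, which shares these particular rules (multiplication before addition, left-to-right division, right-to-left assignment); it is an analogue rather than Bond code, and the function names are hypothetical.

```c
/* Multiplication binds tighter than addition: 2 + (3 * 4). */
int MultiplyBeforeAdd(void)
{
    return 2 + 3 * 4;
}

/* Division associates left-to-right: (20 / 5) / 2. */
int LeftToRightDivision(void)
{
    return 20 / 5 / 2;
}

/* Assignment associates right-to-left: a = (b = 5). */
int RightToLeftAssignment(void)
{
    int a = 0;
    int b = 0;
    a = b = 5;
    return a + b;
}

/* Parentheses override the default grouping: (2 + 3) * 4. */
int ParenthesesFirst(void)
{
    return (2 + 3) * 4;
}
```

Had division associated right-to-left, LeftToRightDivision would compute 20 / (5 / 2), which is 10 rather than 2 in integer arithmetic.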
Statements
Expression Statements
Many of the code examples presented so far in this document have been examples of expression statements. An expression statement consists of a single expression terminated with a semicolon. Expression statements only do meaningful work if they have some kind of side effect, such as assigning a value to a variable, or calling a function that in turn has side effects.
    a + b;     // Calculates a result which is discarded. Does no meaningful work.
    a = b + c; // Performs an addition and assigns the result to the variable a.
Declarative Statements
A declarative statement is used to declare and optionally initialize a variable. In fact, we have already covered declarative statements in some detail in the Variables section, and have seen many examples in the sample code. The only aspect of declarative statements that we have yet to cover is the initialization of aggregate types; that will be deferred to the sections on Array Types and Structure Types.
Compound Statements
A compound statement, also known as a block, provides a means to collect several statements together that can be placed wherever a single statement is expected. A block consists of a list of zero or more statements enclosed in curly braces: {}. The statements contained within a block are all executed in order and can be any type of statement, including nested compound statements. Furthermore, any variables declared within a block are local to that block and cease to exist when program execution leaves the scope of the block.
    int a = 10;
    int b = 20;
    // The following is a compound statement that contains two statements.
    {
        int c = 30; // Declare a new variable that only exists between {}.
        b = a + c;  // Assign a new value (40) to b declared above.
    }
    int sum = a + b; // a and b are still 10 and 40 respectively.
    // The following would be an error because c no longer exists.
    // sum += c;
if Statements
An if statement allows part of a program to be executed conditionally based on the evaluation of a boolean expression. The basic syntax for an if statement is the following, where if and else are reserved keywords:
    if (condition_expression)
        true_statement
    else
        false_statement
The condition_expression is evaluated and if its result is true, the true_statement is then executed, otherwise the false_statement is executed. Both the true_statement and false_statement are single statements, but they can be compound statements to allow for an arbitrary amount of code to be conditionally executed.
    if (x > 10)
        Bond::StdOut()->PrintStr("X is greater than 10.");
    else
        Bond::StdOut()->PrintStr("X is less than or equal to 10.");
The else clause, which includes the false_statement, is optional. The following example omits it and also demonstrates the use of a compound statement.
    if (x > 10)
    {
        Bond::StdOut()->PrintStr("This is a compound statement.\n");
        Bond::StdOut()->PrintStr("X is greater than 10.");
    }
    Bond::StdOut()->PrintStr("This statement executes regardless.");
Multiple if statements can be daisy-chained to evaluate several conditions. This is known as a cascaded if statement.
    if (x < 10)
        Bond::StdOut()->PrintStr("X is less than 10.");
    else if (x < 20)
        Bond::StdOut()->PrintStr("X is greater than or equal to 10 but less than 20.");
    else if (x < 30)
        Bond::StdOut()->PrintStr("X is greater than or equal to 20 but less than 30.");
    else
        Bond::StdOut()->PrintStr("X is greater than or equal to 30.");
switch Statements
Another type of statement, called the switch statement, can be used to conditionally execute pieces of code. It selects a block of code to execute based on the value of an integral expression. The syntax is as follows, where switch, case, default and break are all reserved keywords:
    switch (control_expression)
    {
        case value_1:
            statements_1
            break;
        case value_2:
            statements_2
            break;
        ...
        case value_n:
            statements_n
            break;
        default:
            default_statements
            break;
    }
To put the switch statement into perspective, it is worth noting that the above switch statement is roughly equivalent in functionality to the following cascaded if statement.
    if (control_expression == value_1)
    {
        statements_1
    }
    else if (control_expression == value_2)
    {
        statements_2
    }
    ...
    else if (control_expression == value_n)
    {
        statements_n
    }
    else
    {
        default_statements
    }
The switch statement works by evaluating the control_expression once, which must be of type int or uint, then comparing the result against the values next to each case label, looking for an exact match. The values (value_1, value_2 and so forth) must be compile-time constants (either literals, constant variables, or enumerators). When a match is found, the statements following the matching case label are executed. When a break statement is encountered, execution resumes after the switch statement. If no exact value is matched, the statements following the default label are executed. If no exact match is found and there is no default label, then none of the statements contained within the switch statement are executed and execution resumes after the switch statement.
    const int FEET = 1;
    const int METERS = 2;
    switch (units)
    {
        case FEET:
            Bond::StdOut()->PrintStr("Units are in feet.");
            break;
        case METERS:
            Bond::StdOut()->PrintStr("Units are in meters.");
            break;
        default:
            Bond::StdOut()->PrintStr("Units are not recognized.");
            break;
    }
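To make the equivalence between a switch statement and a cascaded if statement concrete, here is a small sketch in C (an analogue, not Bond code) that writes the same selection both ways. The function names and the returned lengths, expressed in tenths of a millimetre, are hypothetical and chosen only for illustration.

```c
/* Hypothetical unit codes mirroring the FEET and METERS constants above. */
enum { FEET = 1, METERS = 2 };

/* Returns the unit length in tenths of a millimetre, or 0 if unrecognized. */
int UnitLengthWithSwitch(int units)
{
    int length;
    switch (units)
    {
        case FEET:
            length = 3048;
            break;
        case METERS:
            length = 10000;
            break;
        default:
            length = 0;
            break;
    }
    return length;
}

/* The same selection written as a cascaded if statement. */
int UnitLengthWithIf(int units)
{
    int length;
    if (units == FEET)
    {
        length = 3048;
    }
    else if (units == METERS)
    {
        length = 10000;
    }
    else
    {
        length = 0;
    }
    return length;
}
```

Both functions return the same result for every input; they differ only in how the match is found.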
So why is there a need for a switch statement when a cascaded if statement is just as capable? The fundamental difference between the two is one of efficiency; the cascaded if statement does a sequential search for a match, whereas the internal machinery of a switch statement can perform either a table lookup or a binary search. The cascaded if statement also requires the Bond compiler to generate a longer stream of instructions.
while Statements
A while statement, also known as a while loop, is a statement similar in structure to an if statement, but rather than conditionally executing a statement once, it repeatedly executes a statement as long as the condition remains true. The syntax is:
while (condition_expression) body_statement
The while loop first evaluates the condition_expression and, if its result is true, the body_statement is executed. In contrast to the if statement, execution then returns to the top of the loop, the condition is re-evaluated, and the cycle repeats until the condition_expression is false. For the loop to terminate, the body of the loop should typically have a side effect (e.g. increment a counter) to cause the condition to become false after some number of iterations.
    int i = 0;
    while (i < 3)
    {
        Bond::StdOut()->PrintStr("i is: ")->PrintI(i)->PrintStr("\n");
        ++i;
    }
    Bond::StdOut()->PrintStr("After the loop, i is: ")->PrintI(i)->PrintStr("\n");
The output from the above snippet of code would be:
    i is: 0
    i is: 1
    i is: 2
    After the loop, i is: 3
Another form of the while loop is the do-while loop. It exchanges the order of evaluation of the condition_expression and the body_statement, so that the body is always executed at least once. This form of loop is less common, but useful in cases where something required to evaluate the condition is determined within the body of the loop. The notation for the do-while loop is:
do body_statement while (condition_expression);
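As a sketch of a case where the body must run at least once, consider counting the decimal digits of a number: even 0 has one digit. The example is written in C, whose do-while loop has the same notation; it is an analogue rather than Bond code, and the function names are hypothetical.

```c
/* Counts decimal digits with a do-while loop: the body runs before the test. */
int CountDigits(int n)
{
    int digits = 0;
    do
    {
        ++digits;
        n /= 10;
    } while (n != 0);
    return digits;
}

/* The same logic with a plain while loop: the body is skipped when n is 0. */
int CountDigitsWithWhile(int n)
{
    int digits = 0;
    while (n != 0)
    {
        ++digits;
        n /= 10;
    }
    return digits;
}
```

For an input of 0, the do-while version reports one digit, whereas the while version never enters its body and reports zero.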
for Statements
Another form of iteration statement is the for statement, also called the for loop. The for loop provides a bit of additional syntax to manage the iteration, which is as follows:
for (init_expression; condition_expression; increment_expression) body_statement
The above for loop syntax is equivalent to the following while loop:
    init_expression;
    while (condition_expression)
    {
        body_statement
        increment_expression;
    }
All three expressions between the parentheses following the for keyword are optional. The init_expression executes first and once only. It can be used to declare and/or initialize a variable. Then, as with the while loop, the condition_expression is evaluated. If it evaluates to true, the body_statement executes. Subsequently, the increment_expression is evaluated; it is expected to have a side effect (e.g. increment the variable declared in the init_expression), since the result of the expression is not used otherwise. Finally, execution returns to the top of the loop and the condition_expression is evaluated again. Once the condition evaluates to false, execution proceeds to the statement following the body of the loop.
The example presented in the while Statements section can be written more concisely as:
    for (int i = 0; i < 3; ++i)
    {
        Bond::StdOut()->PrintStr("i is: ")->PrintI(i)->PrintStr("\n");
    }
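The Comma Operator section names the for loop as a place where the comma operator is appropriate. A minimal sketch of that use, written in C as an analogue rather than as Bond code, manages two loop variables at once. The function name Reverse is hypothetical, and the sketch relies on arrays and pointers, which are covered later in this document.

```c
/* Reverses the first count elements of values in place. */
void Reverse(int *values, int count)
{
    int i;
    int j;
    /* The comma operator initializes and advances both indices. */
    for (i = 0, j = count - 1; i < j; ++i, --j)
    {
        int temp = values[i];
        values[i] = values[j];
        values[j] = temp;
    }
}
```

Here the comma operator keeps the bookkeeping for both indices in the loop header rather than scattering it through the body.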
Jump Statements
There are two statements that can be nested in the body of a loop that allow for greater control over the execution of the loop. They are the break and continue statements.
The break statement allows a loop to be terminated before its exit condition is satisfied. It can be placed anywhere within the body of a loop and is usually contained within an if statement that tests for special conditions.
    int a = 0;
    int b = 0;
    for (int i = 0; i < 10; ++i)
    {
        ++a;
        if (i == 7)
        {
            // Exit loop immediately.
            break;
        }
        ++b;
    }
    Bond::StdOut()->PrintStr("a is: ")->PrintI(a)->PrintStr("\n");
    Bond::StdOut()->PrintStr("b is: ")->PrintI(b)->PrintStr("\n");
The output from the above snippet of code would be:
    a is: 8
    b is: 7
The continue statement allows the remainder of the body of a loop to be skipped. The loop continues to iterate as though execution had reached the end of the body.
    int a = 0;
    int b = 0;
    for (int i = 0; i < 10; ++i)
    {
        ++a;
        if ((i % 3) == 0)
        {
            // Skip numbers divisible by 3.
            continue;
        }
        ++b;
    }
    Bond::StdOut()->PrintStr("a is: ")->PrintI(a)->PrintStr("\n");
    Bond::StdOut()->PrintStr("b is: ")->PrintI(b)->PrintStr("\n");
Note that this code hits the continue statement four times, when i is equal to 0, 3, 6 and 9, so b is not incremented on those four iterations of the loop. The output is:
    a is: 10
    b is: 6
If a continue or break statement appears within nested loops, its effect applies only to the innermost loop. It is an error to use a continue statement outside the body of a loop, or to use a break statement outside the body of a loop or a switch statement.
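The rule that break affects only the innermost loop can be sketched as follows. The example is in C, which has the same rule, and is an analogue rather than Bond code; the function name CountInnerIterations is hypothetical.

```c
/* Counts how many times the inner body completes when the inner loop breaks early. */
int CountInnerIterations(void)
{
    int count = 0;
    for (int outer = 0; outer < 3; ++outer)
    {
        for (int inner = 0; inner < 5; ++inner)
        {
            if (inner == 2)
            {
                break; /* leaves the inner loop only */
            }
            ++count;
        }
        /* Execution resumes here and the outer loop keeps iterating. */
    }
    return count;
}
```

The inner body completes twice per outer iteration, and the outer loop still runs all three times, so the function returns 6.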
Functions
A function is a sequence of statements that perform a specific task or operation, which is defined once but can be called from multiple locations in a program. Functions therefore promote code reuse by not requiring the operation to be reimplemented in all of the locations where it needs to be performed. Functions also help to provide abstraction and to keep code concise. One does not need to be exposed to a function's implementation details when calling it; only its name, its inputs and the type of result it outputs need to be known. In Bond, all executable code is encapsulated in functions.
Function Definitions
Thus far we have only seen one example of a function definition in the Hello World program. Now we will take a detailed look at the general form of a function definition:
    return_type function_name ( param_type1 param_name1, param_type2 param_name2 … )
    {
        statements
    }
The first element of a function definition is its return type. When a function performs work and returns a result, the return type specifies the type of value that is returned. If the function does not return a value, then the return type should be specified using the void keyword.
The next element of a function definition is its name. As with variable names, the name must be a valid identifier that is not a reserved keyword. A function's name is used to call it when it is needed to perform work.
Following the name is a parameter list. The parameter list is a comma-separated list of variable declarations enclosed in parentheses. Each parameter holds a value passed to the function as one of its inputs. The parameter list may be empty if the function requires no inputs; however, the parentheses are not optional.
The final part of a function definition is the body. The body is a code block enclosed in curly braces, as we saw in the Compound Statements section, that contains all of the statements that perform the function's work.
Functions that return a value must end with a return statement. A return statement consists of the return keyword followed by an expression whose value is returned to the caller. The type of the expression must match the function's return type.
    // A trivial example of a function with no parameters.
    double GetThreePointNine()
    {
        return 3.9;
    }
    // A function with a single parameter.
    double Square(double x)
    {
        return x * x;
    }
    // A function with two parameters.
    double Multiply(double x, double y)
    {
        return x * y;
    }
    // A function with multiple return paths.
    double Max(double x, double y)
    {
        if (x > y)
        {
            return x;
        }
        return y;
    }
Function Calls
While a function definition specifies what a function does, a function call is an expression that invokes the function to perform work. When a function is called, program execution transfers from the call site to the top of the function and continues until the bottom of the function or a return statement is reached. When finished, program execution returns to the call site and the function's return value becomes the value of the function call expression.
In code, a function call consists of the function's name followed by an argument list in parentheses. The argument list is a list of comma-separated values that are assigned to the parameters in the function's parameter list. As with assigning a value to a variable, the type of a function argument must be compatible with the type of the parameter to which it is assigned. Even if the called function has no parameters, the parentheses are not optional.
    double value = GetThreePointNine();
    double valueTimesTwo = Multiply(value, 2.0);
Since the values of function arguments are copied into their respective parameters, which are variables local to a function, a function cannot modify the arguments, only its local copies of them.
    void AssignToParameter(int value)
    {
        // This only modifies our local copy of the argument assigned to the parameter named value.
        value = 3;
    }
    void TryToModifyAFunctionArgument()
    {
        int a = 1;
        AssignToParameter(a);
        // b is assigned the value 1 since a was not modified by the function call.
        int b = a;
    }
To give functions the ability to modify data, pointers can be passed as function arguments. This topic is covered in the Pointer Types section.
Function calls can appear anywhere that an expression is expected. Functions can call other functions, and can also call themselves in a pattern known as recursion. To end the recursion, recursive functions must test a base case.
    int Factorial(int n)
    {
        // Test for the base case to end the recursion.
        if (n <= 1)
        {
            return 1;
        }
        // Otherwise, recurse.
        return n * Factorial(n - 1);
    }
Data Types
Bond is a statically typed language, so every variable and expression has a type known at compile time. Bond is also strongly typed, meaning that values of one type will not be implicitly converted to another type (e.g. during a variable assignment or when passing an argument to a function) if it could lead to a loss of data. An example where a loss of data could occur is when a numeric value of one type is converted to another, such as the loss of the fractional part of a floating-point number in a conversion to an int, or the truncation of an int to a short. Such conversions are allowed; however, the programmer is required to use an explicit cast to demonstrate intent, otherwise the compiler will generate an error.
We have already seen the primitive types for storing simple values; Bond's type system supports several others as well.
Enumerated Types
An enumeration is a set of named integer constants. Values of an enumeration are represented using the type int, and behave as the int type in every respect (e.g. array indexing, arithmetic operations, relational operations, etc.).
An enumeration is declared using the enum keyword, followed by an identifier to name the type, an open brace, an enumerator list, a closing brace and a semicolon.
    enum AnimalSound
    {
        Moo,       // Value is 0 since this is the first enumerator.
        Oink,      // Value is 1 since Moo + 1 = 1.
        Baa = 42,  // Value is explicitly specified as 42.
        Quack,     // Value is 43 since Baa + 1 = 43.
        Meow = -1, // Values can be negative.
        Woof       // Values can be duplicated. This one is also 0.
    };
The enumerator list consists of a comma-separated list of enumerators. A comma following the last enumerator is allowed, but not required.
An enumerator can be a single identifier. In that case its value is 0 if it is the first enumerator, or it takes the value of the previous enumerator + 1. Alternatively, an enumerator can be an identifier followed by an equals sign and a constant expression. In that case it takes on the value of the constant expression.
We can declare and initialize a variable of an enumerated type as we would a variable of any other type. In this case, we declare the type as AnimalSound, and use the name of any one of the enumerators in the initializer. We could also have used any other expression, such as a function call, that resolves to an AnimalSound.
AnimalSound sound = Moo;
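The numbering rules described above can be checked mechanically. The following sketch repeats the AnimalSound enumeration in C, which numbers enumerators the same way, and uses C11 static assertions to confirm each value at compile time. It is a C analogue, not Bond code, and the assertion messages are only illustrative.

```c
enum AnimalSound
{
    Moo,       /* first enumerator */
    Oink,
    Baa = 42,
    Quack,
    Meow = -1,
    Woof
};

/* Each assertion fails to compile if the numbering rule did not hold. */
_Static_assert(Moo == 0, "the first enumerator is 0");
_Static_assert(Oink == 1, "previous enumerator + 1");
_Static_assert(Baa == 42, "explicit value");
_Static_assert(Quack == 43, "previous enumerator + 1");
_Static_assert(Meow == -1, "explicit negative value");
_Static_assert(Woof == 0, "previous enumerator + 1, duplicating Moo");
```

Note that Woof and Moo compare equal, so code that must distinguish them cannot rely on their values alone.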