CMSC216
C Syntax⌗
Preprocessor Directives⌗
Lines that start with a #, usually placed at the beginning of a file. They are processed before compilation.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_COLS 20 /* max # of columns to process */
#define MAX_INPUT 1000
`#define` defines a constant value, `#include` includes another file’s declarations.
Main Function⌗
- Returns an int, takes no arguments (so void)
int main(void){
/* Some code */
return 0;
}
Prototypes⌗
- The first function “declaration”
- No body
- Placed at the top to allow the compiler to do proper type checking
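As a minimal sketch of the points above (the function names `square` and `sum_of_squares` are made up for illustration), a prototype lets the compiler check a call before it has seen the function body:

```c
/* Prototype: return type, name, and parameter types, but no body. */
int square(int n);

/* The compiler can type-check these calls using only the prototype above. */
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

/* The definition can come later in the file (or in another file). */
int square(int n) {
    return n * n;
}
```

Calling `sum_of_squares(3, 4)` evaluates `square(3) + square(4)`.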
Basic Concepts⌗
Environments⌗
Translation Environment⌗
- Source code gets compiled into object code
- Which then gets linked with other object code by the linker to form a single executable
Execution Environment⌗
- The code is then executed by loading it into memory (an OS is not necessary but is usually used), setting it up, running it, and terminating it by returning main’s exit status.
Lexical Rules⌗
Trigraphs⌗
- A three-character sequence (starting with `??`) that stands for a single character that may not be easy to type
Comments⌗
- Begin with `/*` and end with `*/`
- Can span multiple lines, but cannot be nested
- If they span multiple lines, they cover everything between the delimiters on those lines
Declaration⌗
- A declaration causes memory to be reserved for the variable
- There is no extra instantiation
- If variables are uninitialized they’ll just have garbage values
Expression and assignment⌗
- Assignment returns the value the variable was assigned
- An expression can be a simple statement, it’s just not useful unless it has a side effect
Side Effect⌗
- A change in program state
Booleans in C⌗
- There is no boolean type, 0 is false, non-zero is true
Arithmetic Conversion⌗
- Both `char` and `short` types are promoted to `int` when doing math
- If arithmetic operations are performed on different integer types, the smaller type is converted to the larger type
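A small sketch of both rules, using the made-up helper names `add_chars` and `add_mixed`:

```c
/* Both operands are promoted to int before the addition, so the
   result does not wrap around at the top of the unsigned char range. */
int add_chars(unsigned char a, unsigned char b) {
    return a + b;
}

/* The int operand is converted to long before the addition,
   so the whole expression has type long. */
long add_mixed(int i, long l) {
    return i + l;
}
```

For example, `add_chars(200, 100)` gives 300 even though 300 does not fit in an `unsigned char`.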
Standard Library⌗
stdio⌗
printf⌗
- `printf` takes in a character string with 0 or more format specifiers; if any are present, it also takes in that many other values after the string
- The string can contain escape sequences, format specifiers, and normal characters
- Format specifiers include
- %d for integers
- %f for floats
- %c for characters
- %u for unsigned int decimal
- %x for unsigned int hex
- %o for unsigned int octal
- Padding can be done by putting the minimum field width before the specifier letter (`"%4d"` pads with spaces up to 4 characters)
- You can fill with 0s instead by adding a 0 before the width (`"%04d"`)
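The specifiers and padding rules above can be sketched with `sprintf`, which accepts the same format strings as `printf` but writes into a buffer. The helper name `format_demo` is made up for illustration, and the caller is assumed to supply a buffer of at least 21 bytes.

```c
#include <stdio.h>

/* Formats the value 42 four ways into buf and returns the number of
   characters written (not counting the terminating NUL).
   buf must hold at least 21 bytes. */
int format_demo(char *buf) {
    /* %4d pads with spaces, %04d pads with zeros,
       %x prints hex and %o prints octal (both take an unsigned int). */
    return sprintf(buf, "[%4d][%04d][%x][%o]", 42, 42, 42u, 42u);
}
```

With these arguments the buffer ends up holding `[  42][0042][2a][52]`.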
scanf⌗
- `scanf` takes in a character string with 0 or more format specifiers; if any are present, it also takes in that many pointers to variables after the string
- It reads input according to the format specifiers and stores each value into its corresponding variable
- If too few values are in the input, scanf will wait for more
- If too many values are in the input, the extras will remain in the input
- If the end of the input is reached before anything can be read, scanf returns the special `EOF` value
- Format specifiers include
- %d for integers
- %f for floats
- %c for characters
feof⌗
- `feof(stdin)` will be true if the previous attempt to read failed because there was no more data
- You must try to read something first for `feof` to work
assert⌗
- If the argument to `assert()` is false, it will kill the program
stddef⌗
Data⌗
Integers⌗
Values⌗
short <= int <= long
| Type | Signedness | Minimum | Maximum |
|---|---|---|---|
| int | signed | -32767 | 32767 |
| unsigned int | unsigned | 0 | 65535 |
| short int | signed | -32767 | 32767 |
| unsigned short int | unsigned | 0 | 65535 |
| long int | signed | -2147483647 | 2147483647 |
| unsigned long int | unsigned | 0 | 4294967295 |
| char | NA | 0 | 127 |
| signed char | signed | -127 | 127 |
| unsigned char | unsigned | 0 | 255 |
Literals⌗
- In decimal by default
- In octal when starting with a 0
- In hex when starting with a 0x
- Character literals go in single quotes
Promotion⌗
Enumerated Type⌗
Values⌗
- Enums are stored as ints
- The numbers after the name are optional
enum Jar_Type { CUP=8, PINT=16, QUART=32, HALF_GALLON=64, GALLON=128 };
Floating-Point⌗
- Have a decimal point: `3.141`
- Have an exponent: `1E10`
- Or both: `6.023e23`
Pointers⌗
- Every memory location has an address
- Pointer is another name for address
- A pointer variable is a variable whose value is the address of a memory location
- There are no pointer constants because we can’t predict where memory addresses will be
- To declare a pointer, use the indirection operator before the variable name
- The indirection operator gets the value at an address, so the declaration says the variable holds the address of a value of the given type
- `int *a` is an int pointer
- If you declare multiple pointers on the same line, you need a `*` for each one
- When initializing a pointer on the same line, the value you give goes to the pointer variable itself, not to what it points to
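A short sketch of declaration, initialization, and indirection, using the made-up helpers `swap` and `pointer_demo`:

```c
/* Swaps the two ints whose addresses are passed in. */
void swap(int *a, int *b) {
    int tmp = *a;   /* indirection: read the value a points to */
    *a = *b;        /* write through a */
    *b = tmp;
}

/* Returns 1 if x and y were swapped through the pointers, 0 otherwise. */
int pointer_demo(void) {
    int x = 1, y = 2;
    /* Each pointer needs its own '*'. The initializers set p and q
       (the addresses), not *p and *q (the values). */
    int *p = &x, *q = &y;

    swap(p, q);
    return x == 2 && y == 1;
}
```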
String⌗
- C has no string type, but there is a string literal
- A string is a sequence of characters terminated by a `NUL` byte
- A sequence of zero characters is valid
Arrays⌗
Declaration⌗
- Like declaring any variable, but with square brackets after the name with a capacity value: `int values[20]`
- Array sizes must be constants at compile time, so literals or symbolic constants
Initialization⌗
- You can’t assign arrays to each other or compare entire arrays
- You can initialize an array when you declare it by setting it equal to values in curly braces: `int values[20] = {1,2,3}`
- This sets the first 3 items to 1, 2, and 3 respectively and fills the others with 0
- If you pass fewer values than the capacity, it fills the rest with 0s, but you need at least 1
Behavior⌗
- Array parameters don’t take sizes (at least for one-dimensional arrays)
- Arrays are passed by value but act like they’re passed by reference, because what actually gets passed is a pointer to the first element (see Pointers)
- Arrays don’t keep track of their length
- Accessing an invalid index won’t be an error itself but can cause one by accessing or changing memory it’s not supposed to
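The behavior above can be sketched as follows; `sum`, `zero_first`, and `array_demo` are hypothetical names chosen for this example:

```c
#include <stddef.h>

/* The parameter is really a pointer to the first element, so the
   length has to be passed separately. */
int sum(const int values[], size_t count) {
    int total = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Writes through the parameter are visible to the caller. */
void zero_first(int values[]) {
    values[0] = 0;
}

/* Returns 10 * (sum before the change) + (sum after the change). */
int array_demo(void) {
    int values[20] = {1, 2, 3};   /* remaining 17 elements are 0 */
    /* sizeof gives the element count here only because values is a
       real array in this scope, not a parameter. */
    int before = sum(values, sizeof values / sizeof values[0]);

    zero_first(values);
    return before * 10 + sum(values, 20);
}
```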
Typedef⌗
Scope⌗
Block Scope⌗
- Exists within the block
- Have No Linkage
File Scope⌗
- Exists within the whole file
- Have External Linkage
Shadowing⌗
- The variable with the narrowest scope will shadow the other(s)
Linkage⌗
- Determines if multiple instances of the same identifier refer to the same thing or not
No Linkage⌗
- All occurrences are different
Internal Linkage⌗
- All occurrences in a given file are the same
External Linkage⌗
- All occurrences are the same
Changing Linkage⌗
- `extern` will change a Block Scope variable from no linkage to external
- `static` will change a File Scope variable from external linkage to internal
Storage⌗
- Determines the lifetime of the memory of the variable while the program is executing
- It’s still accessible only from within its Scope, but controls when the value’s memory gets destroyed
const⌗
Structures⌗
Unions⌗
Operators⌗
Unary⌗
Negation⌗
- `-`: flips the sign
Increment/Decrement⌗
- `++`/`--`: increments/decrements by 1
- The prefix form (`++x`) changes the variable before its value is substituted into the expression
- The postfix form (`x++`) changes the variable after substitution, so the existing value is used
Size⌗
- `sizeof`: gets the size in bytes, as a `size_t`, of a given value
Comma⌗
- `expr1, expr2`: evaluates its left operand and then the right one, producing the right’s value as its own
Indirection⌗
- `*`: gets the value at the given address
Address⌗
- `&`: gets the address a given value is stored at
Logical⌗
- Both `&&` and `||` use short-circuit evaluation
Bitshifts⌗
Left Shift⌗
- `value << n`: shifts `value`’s bits left by `n` positions
- Rightmost bits get 0, leftmost bits get discarded
Right Shift⌗
- `value >> n`: shifts `value`’s bits right by `n` positions
- If unsigned, the leftmost bits get 0
- If signed, whether the leftmost bits get 0 or 1 is implementation dependent
Bitwise⌗
- And (`&`): Ands each pair of corresponding bits
- Or (`|`): Ors each pair of corresponding bits
- Xor (`^`): Xors each pair of corresponding bits
- Complement (`~`): Flips every bit
Masking⌗
- We can construct masks to select only certain bits out of a value
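A few common masking idioms, sketched with made-up helper names (`low_nibble`, `get_bit`, `set_bit`, `clear_bit`). Unsigned operands are used so the shifts are fully defined.

```c
/* Returns the low 4 bits of value by ANDing with a mask of four 1 bits. */
unsigned int low_nibble(unsigned int value) {
    return value & 0xFu;
}

/* Returns bit n of value (0 or 1): shift it down to position 0, then mask. */
unsigned int get_bit(unsigned int value, unsigned int n) {
    return (value >> n) & 1u;
}

/* Returns value with bit n set to 1, using OR with a one-bit mask. */
unsigned int set_bit(unsigned int value, unsigned int n) {
    return value | (1u << n);
}

/* Returns value with bit n cleared, using AND with the complemented mask. */
unsigned int clear_bit(unsigned int value, unsigned int n) {
    return value & ~(1u << n);
}
```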
rvalues and lvalues⌗
- rvalues can appear on the right hand side of an assignment
- lvalues can appear on the left hand side of an assignment
Evaluation⌗
- Order of evaluation goes first by precedence, then associativity, and is otherwise unspecified, as long as the ordering guarantees of `&&`, `||`, `()`, `?:`, and `,` are respected
Precedence⌗
- Order of given operators
Associativity⌗
- Whether to operate left to right or right to left
Pointers⌗
- A pointer is a variable that holds the address of something in memory
- Pointers can be rvalues and lvalues but addresses can only be rvalues
- Pointers must be initialized before they are dereferenced
- Functions can return a pointer, but you must ensure the variable it points to will not be destroyed when the function returns
NULL Pointer⌗
- A special value that a pointer can have that doesn’t point anywhere
- Defined in `<stddef.h>`
Make⌗
- `make` autobuilds programs, using separate compilation to efficiently build only what’s needed according to a user-defined dependency hierarchy
- A Makefile has multiple rules
- A rule consists of a target, a dependency list, and a set of actions
- The target is the name/identifier of the rule and it can specifically be built with `make target`
- The dependency list follows the target and a colon, and an update to any of these dependency files will cause the target’s action(s) to run
- Each action line follows a target line and starts with a tab character. Each action line is run when the target needs to be rebuilt
- A dependency will cause the target to be rebuilt if it’s newer than the target
- An error in a given action line will cause the action lines following it to not run, unless the action line starts with a `-`, which allows the following action lines to run even if the current one fails
- Comments start with a `#` and lines can be continued onto the following one with a `\`
- Makefiles provide macros that represent a value defined at the start of the makefile and repeatedly used within it
  - They’re defined with `name=value` and used with `$(name)`
  - `CC` is the name for the C compiler, and `CFLAGS` is the name for options to use with the compiler
  - Use these to allow for ease of addition and reading
  - Also you don’t need `CFLAGS` when linking
- Makefiles often have targets that aren’t true files, like `all` and `clean`, which are phony targets
Dynamic Memory Allocation⌗
- Most data in C is stored on the stack, but sometimes we want to store dynamic variables whose size we can’t predict at compile time
- Dynamic memory allocation allows us to request memory to use while the program is running
malloc⌗
- malloc allocates space of the given size and returns a pointer to it, without initializing the memory
- returns NULL if an error occurs
free⌗
- free releases the memory at the pointer
- the pointer must point to the start of dynamically allocated memory
- you must free all memory dynamically allocated if you want to avoid leaks
- free does nothing other than release memory
- free(NULL) does nothing
calloc⌗
- calloc allocates space to store `num` things of `size` bytes each and initializes all the memory to 0
- returns NULL if an error occurs
realloc⌗
- reallocates the dynamically allocated memory at `ptr` to `new_size`
- Copies the contents over to a new block if it can’t be resized in its current space
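The four calls can be combined as in the sketch below. The helper names `make_range` and `make_zeros` and the starting capacity of 2 are arbitrary choices for illustration.

```c
#include <stdlib.h>

/* Builds a heap array holding 0, 1, ..., n-1, growing it with realloc.
   Returns NULL if n is 0 or an allocation fails; the caller must
   free() the result. */
int *make_range(size_t n) {
    size_t capacity = 2, i;
    int *data;

    if (n == 0) {
        return NULL;
    }
    data = malloc(capacity * sizeof *data);   /* contents are uninitialized */
    if (data == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        if (i == capacity) {
            /* Keep the old pointer until realloc succeeds, so the
               block is not leaked on failure. */
            int *bigger = realloc(data, capacity * 2 * sizeof *data);
            if (bigger == NULL) {
                free(data);
                return NULL;
            }
            data = bigger;
            capacity *= 2;
        }
        data[i] = (int)i;
    }
    return data;
}

/* calloc zero-fills, so every element starts at 0. */
int *make_zeros(size_t n) {
    return calloc(n, sizeof(int));
}
```

Every pointer these helpers return should eventually be passed to `free` exactly once.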
Structures & Pointers⌗
- Structures can be self-referential by containing a pointer to their own structure type
- Structures can refer to each other by using a partial struct declaration (just `struct name`), declaring the other struct, and then fully defining the partially declared struct
- Function pointers don’t need to be dereferenced to be called, nor do they need to be set to the address of the function name (the function name itself suffices)
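A sketch of a self-referential structure together with a function pointer. The names `node`, `sum_mapped`, `twice`, and `list_demo` are made up for this example.

```c
#include <stddef.h>

/* Self-referential structure: a node holds a pointer to its own type. */
struct node {
    int value;
    struct node *next;
};

/* Applies fn to every value in the list and returns the sum of the results.
   fn is a function pointer; it is called without explicit dereferencing. */
int sum_mapped(const struct node *head, int (*fn)(int)) {
    int total = 0;
    while (head != NULL) {
        total += fn(head->value);
        head = head->next;
    }
    return total;
}

int twice(int n) {
    return 2 * n;
}

/* Builds the list 1 -> 2 -> 3 on the stack and sums the doubled values. */
int list_demo(void) {
    struct node first, second, third;

    third.value = 3;
    third.next = NULL;
    second.value = 2;
    second.next = &third;
    first.value = 1;
    first.next = &second;

    /* The bare function name is enough; &twice would also work. */
    return sum_mapped(&first, twice);
}
```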
Strings⌗
- Strings are character arrays with a terminating `NUL` byte (which does not count toward the length)
- `strlen(char const *string)` returns the length of the string without its terminating `NUL` byte
  - it returns an unsigned type (`size_t`), so an expression such as `strlen(a) - strlen(b)` can never be negative
- Unrestricted functions
  - `strcpy(char *dst, char const *src)` copies the string from src into dst
  - `strcat(char *dst, char const *src)` copies the string in src to the end of dst
  - `strcmp(char const *s1, char const *s2)` compares two strings lexicographically
  - These functions assume a `NUL` byte exists and will run until it is found, which may not always be the case
- A length-restricted version of each of these functions exists (`strncpy`, `strncat`, `strncmp`) that takes an extra length parameter and processes at most that many characters
- `strchr(char const *str, int ch)` searches for a character from the left
- `strrchr(char const *str, int ch)` searches for a character from the right
- `strpbrk(char const *str, char const *group)` searches from the left for any character in a group of characters
- `strstr(char const *s1, char const *s2)` searches for a substring
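Two of the pitfalls above, sketched with the hypothetical helpers `copy_string` and `is_longer`. Note that `strncpy` does not write a `NUL` byte when the source is too long, so the sketch adds one itself.

```c
#include <string.h>

/* Copies src into dst, which holds dst_size bytes (dst_size must be > 0),
   always leaving dst NUL-terminated. Returns 1 if the whole string fit,
   0 if it was truncated. */
int copy_string(char *dst, size_t dst_size, char const *src) {
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not terminate on truncation */
    return strlen(src) < dst_size;
}

/* Returns 1 if a is longer than b. Comparing the lengths directly avoids
   the unsigned subtraction strlen(a) - strlen(b), which can never be
   negative. */
int is_longer(char const *a, char const *b) {
    return strlen(a) > strlen(b);
}
```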
Command Line Arguments⌗
- `main(int argc, char *argv[])` is a valid declaration for main
- `argc` is the number of arguments when called, including the program name
- `argv` is that list, with the first being the program name, and `argv[argc]` being NULL
- It’s also possible for main to take a third parameter `char *envp[]`, which is an array of strings in the form `KEYWORD=VALUE` for all environment variables
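Because `argv[argc]` is NULL, the argument list can be walked without `argc`. A small sketch, with `count_args` being a made-up helper that a `main` could call as `count_args(argv)`:

```c
#include <stddef.h>

/* Counts the arguments after the program name by walking argv until
   the terminating NULL entry. */
int count_args(char *argv[]) {
    int count = 0;
    while (argv[count] != NULL) {
        count++;
    }
    /* Do not count argv[0], the program name. */
    return count > 0 ? count - 1 : 0;
}
```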
The Preprocessor⌗
- `#define` defines a constant and the value to replace it with
  - Double underscored symbols are already predefined
- `#define` also defines a macro, which is the same thing but it takes parameters
  - When defining a macro, wrap the result and uses of the parameters in parentheses
  - Macros do complete textual replacement, so arguments with side effects may get evaluated more than once
  - Macro substitution is not done in function calls
  - Macros can invoke each other
- Conditional compilation
  - `#if cond` will only include the code if cond is true
    - you need an `#endif`, and optionally an `#elif` and a `#else` if wanted
  - `defined(NAME)` checks if the following symbol is defined
- Nested inclusion
  - Give each header file its own symbol, which will only include it once no matter how many times `#include` is done
- `#undef NAME` undefines name
- `-D` to the compiler will define a symbol
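The parenthesization and side-effect points can be sketched as below. The guard symbol `DEMO_MACROS_H`, the macros `SQUARE` and `BAD_SQUARE`, and the helper functions are all made up for illustration. In a real project the guarded part would live in its own header file.

```c
/* Include-guard pattern: the guarded body is processed only once. */
#ifndef DEMO_MACROS_H
#define DEMO_MACROS_H

/* Parenthesize each parameter use and the whole result. */
#define SQUARE(x) ((x) * (x))

/* Without parentheses, operator precedence changes the meaning:
   BAD_SQUARE(1 + 2) expands to 1 + 2 * 1 + 2. */
#define BAD_SQUARE(x) x * x

#endif /* DEMO_MACROS_H */

static int calls = 0;

/* A function with a side effect: it counts how often it is called. */
static int next_value(void) {
    calls++;
    return 3;
}

/* Returns how many times the argument was evaluated in one use of SQUARE. */
int square_evaluations(void) {
    calls = 0;
    (void)SQUARE(next_value());   /* expands to two calls of next_value() */
    return calls;
}
```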
Process Control⌗
- Kernel: The layer of the OS that is always running and serves as the hardware interface
- Not all programs have privileges, so if they want to do important things they must ask the OS to do it for them through a system call
- Process table: stores info about each process; page table: stores each process’s virtual-to-physical memory mappings; file table: stores info about files
- Process can be in the states: new, ready, waiting, running, terminated
- Signals: A message to a process to notify it of an event
  - The `kill()` system call sends signals, not always `SIGKILL`
- Including `<err.h>` gives you functions that properly print errors
  - `err()` prints a given error string and the library’s explanation of the error, then quits
  - `errx()` prints a given error string and quits
- `fork()`
  - Creates a new process and returns a `pid_t`, which is 0 for the child, >0 (child pid) for the parent, and -1 for an error
  - Most things are inherited
- `wait()`
  - Reaps the next finishing direct child process from the process table
  - Passing in a pointer to an int will give you a status
    - Using macros like `WIFEXITED` and `WEXITSTATUS`
- `exec()`
  - Runs the given program, which replaces the current process, so it never returns on success
  - If error, returns -1
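A minimal sketch of `fork()` and `wait()` working together, assuming a POSIX system and a caller with no other outstanding children (otherwise `wait()` might reap a different child). The helper name `run_child` is made up.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits immediately with the given code, then reaps it.
   Returns the child's exit status, or -1 if fork or wait fails or the
   child did not exit normally. */
int run_child(int code) {
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        return -1;                 /* fork failed */
    }
    if (pid == 0) {
        _exit(code);               /* child: terminate with the given code */
    }
    if (wait(&status) == -1) {     /* parent: block until the child finishes */
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```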
System-Level I/O⌗
- File Descriptors
- A process-level object that is associated with a given file
- STDIN is fd 0, STDOUT is fd 1, and STDERR is fd 2
- `open(const char *filename, int flags, int mode (optional))`: opens a file, returning a file descriptor
I/O⌗
Assembly⌗
Registers⌗
- 4 bytes
- Names start with a $
- General purpose registers are `$t0` through `$t9`
- Using a register in parentheses, like `($t0)`, means use the address that the register stores
- Putting a constant before the opening parenthesis offsets the address stored by the given amount
- Register spilling is when we have too many values to keep only in registers
System Calls⌗
- Asks the OS to do something
- Done in MIPS by loading a syscall code in `$v0`, arguments into `$a0` through `$a3`, and then executing the `syscall` instruction
Syscall Codes⌗
- exit
  - code: 10
  - arguments: none
  - return value: none
- print_int
  - code: 1
  - arguments: `$a0` - the integer to print
  - return value: none
- print_char
  - code: 11
  - arguments: `$a0` - the character to print
  - return value: none
- read_int
  - code: 5
  - arguments: none
  - return value: `$v0` - the integer read
- read_char
  - code: 12
  - arguments: none
  - return value: `$v0` - the character read
Directives⌗
- They tell the assembler something about how to assemble the program
- `.text` says that what comes after goes into the text segment
- `.data` says that what comes after goes into the data segment
- `.word` says to store what follows on the current line in successive memory locations as ints
- `.asciiz` says to store the contents of the following double-quoted string as a null-terminated string
Instructions⌗
- A command to run that consists of a 3-4 letter code followed by 2-3 operands, usually registers
- The first register is always where to put the result
- The third operand can be a literal
Labels⌗
- A human-readable alias for a memory address of an instruction/data item
Functions in Assembly⌗
- Store how to get back, jump there, jump back, reset
Stack Frame⌗
- A set of data that stores information about a called function
- `$fp` points to the first word in the current stack frame
- `$sp` points to the first free location just past the stack
- When growing the stack, make sure to leave space for local variables + 8 for `$ra` and `$fp`
- Make sure to reload registers that you need after a function call
Prologue
sub $sp, $sp, 8
sw $ra, 8($sp)
sw $fp, 4($sp)
add $fp, $sp, 8
Epilogue
lw $ra, 8($sp)
lw $fp, 4($sp)
add $sp, $sp, 8
jr $ra
Parameters + Return Value + Local Variables
main:
    li   $sp, 0x7ffffffc   # init sp
    li   $t0, 1
    sw   $t0, ($sp)        # put the argument on the stack
    sub  $sp, $sp, 4       # grow the stack
    jal  f                 # call the function
    add  $sp, $sp, 4       # pop the arg off
    move $t1, $v0          # get the return value
    move $a0, $t1          # print it
    li   $v0, 1
    syscall
    li   $v0, 10           # exit
    syscall
f:
    sub  $sp, $sp, 12      # grow the stack
    sw   $ra, 12($sp)      # store the ra
    sw   $fp, 8($sp)       # store the fp
    add  $fp, $sp, 12      # set the new fp
    lw   $t0, 4($fp)       # load the argument
    add  $t0, $t0, 1
    sw   $t0, 4($sp)       # store it as a local variable
    move $v0, $t0          # set the return value
    lw   $ra, 12($sp)      # get the return address
    lw   $fp, 8($sp)       # get the old frame pointer
    add  $sp, $sp, 12      # shrink the stack
    jr   $ra               # go back
Concurrency⌗
- If a program can be active at multiple places in its code at the same time, it’s concurrent
- Threads are part of the same program and can share data
- To use pthreads in C, link with `-lpthread` when using gcc
Threads⌗
- Threads share heap memory, global and static memory, files, and a virtual address space but have their own thread id, runtime stack, and other important registers (stack pointer, etc.)
- This means they share global variables, static local variables, and dynamically allocated data
int pthread_create(pthread_t *tid, pthread_attr_t *attr, void *(*func)(void *), void *arg)⌗
- `tid` is a pointer that will be filled in with an id
- `attr` is `NULL` (for default attributes)
- `func` is the function to execute
- `arg` passes an argument
- 0 on success, nonzero on error
pthread_t pthread_self(void)⌗
- Gets the calling thread’s own id
void pthread_exit(void *retval)⌗
- Terminates the calling thread, which returns the `retval`
int pthread_cancel(pthread_t tid)⌗
- Terminates the thread with the given `tid`
- Returns 0 on success, nonzero on error
int pthread_join(pthread_t tid, void **retval)⌗
- Reaps the given thread and frees its memory usage
- Blocks until termination (if already terminated, instant)
- Returns 0 on success, nonzero on error
  - If success, `*retval` gets the return value
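A minimal create-and-join sketch, assuming a POSIX system with the pthread library linked in. The names `double_it` and `double_in_thread` are made up for this example.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread function: doubles the int that arg points to, in place. */
static void *double_it(void *arg) {
    int *n = arg;
    *n *= 2;
    return n;   /* becomes the value pthread_join hands back */
}

/* Runs double_it in a new thread and waits for it.
   Returns the doubled value, or -1 on any pthread error. */
int double_in_thread(int value) {
    pthread_t tid;
    void *retval;

    /* value stays alive until this function returns, which is
       after the join below. */
    if (pthread_create(&tid, NULL, double_it, &value) != 0) {
        return -1;
    }
    if (pthread_join(tid, &retval) != 0) {
        return -1;
    }
    return *(int *)retval;
}
```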
Synchronization⌗
- Force concurrent operations to happen in some relative order
- Code that modifies a shared variable is a critical section
Atomicity⌗
- An operation is atomic when, once a thread starts running it, it is not interrupted until it finishes
Semaphore⌗
- A special integer counter
- If counter is 0, threads must wait
- If counter is nonzero, threads proceed and counter is decremented
- Once a thread is done, it tells the semaphore, which then increments the counter
- If a counter becomes nonzero while threads are waiting on it to do so, one of the threads will be picked to run
- `int sem_init(sem_t *sem, 0, unsigned int value)`
  - Initializes the given semaphore to `value`
  - 0 on success, nonzero on error
Wait
- The `P` operation
- Blocks if counter is 0
- Returns if counter is positive
- Atomic operation
- `int sem_wait(sem_t *sem)`
  - 0 on success, nonzero on error
Post
- The `V` operation
- Increments the counter
- Unblocks a random waiting thread if threads are waiting
- Atomic operation
- `int sem_post(sem_t *sem)`
  - 0 on success, nonzero on error
Mutual Exclusion⌗
- Only allow a certain number of threads (usually one) to execute a block of code at a time
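A sketch of mutual exclusion using a semaphore initialized to 1, so only one thread at a time can be in the critical section. It assumes a POSIX system that supports unnamed semaphores (Linux does) and the pthread library. The names `lock`, `shared_total`, `add_many`, `count_with_semaphore`, and `INCREMENTS` are made up for this example.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define INCREMENTS 10000

static sem_t lock;            /* counter starts at 1: one thread at a time */
static long shared_total;     /* shared variable updated by both threads */

static void *add_many(void *arg) {
    int i;
    (void)arg;
    for (i = 0; i < INCREMENTS; i++) {
        sem_wait(&lock);      /* P: blocks while the counter is 0 */
        shared_total++;       /* critical section */
        sem_post(&lock);      /* V: lets a waiting thread proceed */
    }
    return NULL;
}

/* Runs two threads that both update shared_total.
   Returns the final total, or -1 on any setup error. */
long count_with_semaphore(void) {
    pthread_t a, b;

    shared_total = 0;
    if (sem_init(&lock, 0, 1) != 0) {
        return -1;
    }
    if (pthread_create(&a, NULL, add_many, NULL) != 0) {
        sem_destroy(&lock);
        return -1;
    }
    if (pthread_create(&b, NULL, add_many, NULL) != 0) {
        pthread_join(a, NULL);
        sem_destroy(&lock);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&lock);
    return shared_total;
}
```

Without the `sem_wait`/`sem_post` pair around the increment, the two threads could interleave their updates and lose some of them.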
Condition Synchronization⌗
- Control the relative order of actions performed by different threads based on some condition
- `sem_wait` and `sem_post` are usually in different threads