IP University - POPL - Unit 3 Notes

20 Mar 2020

This post gives a quick review on the third unit of the syllabus for the Principles of Progamming Languages (POPL) subject in the IP University syllabus.

Storage Management

Static and Dynamic control

Stack vs Heap

Stack

follows a LIFO (Last in first out) strategy
block in memory is of fixed size
last entry in the stack is accessible at any moment
contiguous memory with pointers named as stack base, top of the stack
support function calls (with stack frames / activation record)

Heap

does not follow any definite approach
allows random assignment and deassignment of the memory
process accesses the allocated memory area through the pointer
deallocation is carried through the deallocation request
develops holes in the memory allocation when data structures are built and freed
used at the runtime.

Differences

In a stack, the allocation and deallocation is done by CPU and is automatic whereas, in heap, it needs to be done by the programmer manually.
Heap frame handling is costlier than stack frame handling.
Implementation of a stack is complex. As against, implementation of a heap is simple.
A function call in stack takes O(N) time. In contrast, it takes O(1) time in a heap.
Stack implementation mainly suffers from the memory shortage problem. On the contrary, the main issue in a heap is fragmentation.
Access to a stack frame is easier than the heap as the stack is confined to the small region of memory and it always hit the cache, but heap frames are dispersed throughout the memory so the memory accessing can cause more cache misses.
Stack is not flexible, the memory size allotted cannot be changed. On the other hand, a heap is flexible, and the allotted memory can be altered.
A heap takes more accessing time than a stack.

Sequence Control (source)

Control structure in a PL provides the basic framework within which operations and data are combined into a program and sets of programs

Sequence Control -> Control of the order of execution of the operations
Data Control -> Control of transmission of data among subprograms of program

Sequence Control may be categorized into four groups:

Expressions – They form the building blocks for statements
Statements – The statements (conditional & iterative) determine how control flows from one part of program to another
Declarative Programming – This is an execution model of program which is independent of the program statements
Subprograms – In structured programming, program is divided into small sections and each section is called subprogram

2 types:

Implicit SC - those defined by the programming language itself. These structures can be modified explicitly by the programmer
Explicit SC - those that programmer may optionally use to modify the implicit sequence of operations defined by the language. eg. (parentheses within expressions, or goto statements and labels)

Sequence Control within Expressions

Arithmetic Expressions
- evaluated from left to right and using the rules of precedence of operators
- if expression involves parentheses, the expression inside parentheses is evaluated first
Relational Expressions
- The relational operators <, >, <=, >= are given the first priority
- operators (== and != ) are given the second priority
- arithmetic operators have higher priority over relational operators
- resulting expression will be of integer type, true = 1, false = 0
Logical Expressions
- result of a logical expression is either true or false
- expression involving NOT is evaluated first
- then expression with AND
- finally the expression having OR is evaluated

Concepts:

Controlling the evaluation of expressions
1. Precedence - the operator at higher level of precedence is evaluated first
2. Associativity - operators of the same precedence are evaluated either from left to right or from right to left depending on the level
Expression tree
Syntax for expressions
1. Prefix or Polish notation
2. Postfix or reverse polish
3. Infix notation
Execution-Time representation Translators evaluate the expression using a method so as to get efficient result (optimum value at optimum time with optimum use of memory and processor) Two phases:
1. the basic tree control structure for expression is established
2. stage whole evaluation process takes place Following methods are used for translation of expression:
3. Machine code sequences
4. Tree structure
5. Prefix or postfix form

Problems with Evaluation of Expressions:

Uniform Evaluation Code
1. Eager Evaluation Rule
  - for each operation node, first evaluate each of the operands, then apply the operation to the evaluated operands
2. Lazy Evaluation Rule
  - never evaluate operands before applying the operation
  - pass the operands unevaluated and let the operation decide if evaluation is needed
  - impractical to implement the same in many cases as it requires substantial software simulation
Side effects
- c / func(y) + c
- if fun(y) has the side effect of modifying the value of c, the order of evaluation is critical
Short-circuit Boolean Expression
- the left expression in the boolean expression is evaluated first and second expression is evaluated only when needed

Sequential Control within Statement

The kinds of control flow statements supported by different languages vary, but can be categorized by their effect:

Continuation at a different statement (unconditional branch or jump)
Executing a set of statements only if some condition is met (choice - i.e., conditional branch)
Executing a set of statements zero or more times, until some condition is met (i.e., loop - the same as conditional branch)
Executing a set of distant statements, after which the flow of control usually returns (subroutines, coroutines, and continuations)
Stopping the program, preventing any further execution (unconditional halt)

Sequential Control Statements:

Basic statements
1. assignment statement
2. input and output statements
3. declaration
4. goto
5. break, continue
Statement level sequence control
1. implicit sequence control
  1. composition type - standard form of implicit sequence, statements placed in order of execution
  2. alternation type - there are two alternate statement sequence in the program, the program chooses any of the sequence but not both at same time
  3. iteration type - normal sequence is given to statements but the sequence repeats itself for more than one time
2. explicit sequence control goto, break etc.
Structured sequence control
1. compound statement
2. conditional statement
3. iteration statements

Subprogram Control

Suprogram sequence control

how one subprogram invokes another and called subprogram returns to the first.

Simple Call-Return Subprograms

program is composed of single main program
calls various subprograms which may call other subprograms and so on to any depth
each subprogram returns the control to the program/subprogram after execution
execution of calling program is temporarily stopped during execution of the subprogram
after the subprogram is completed, execution of the calling program resumes at the point immediately following the call
Copy Rule - effect of the call statement is the same as would be if the call statement is replaced by body of subprogram (with suitable substitution of parameters)

Assumptions:

Subprogram can not be recursive
Explicit call statements are required
Subprograms must execute completely at call
Immediate transfer of control at point of call or return
Single execution sequence for each subprogram

Implementation:

Difference:
- Subprogram definition – The written program which is translated into a template
- Subprogram activation – Created each time a subprogram is called using the template created from the definition
An activation is implemented as two parts
- Code Segment – contains executable code and constants
- Activation record – contains local data, parameters & other data items
The code segment is invariant during execution. It is created by translator and stored statically in memory. They are never modified. Each activation uses the same code segment.
A new activation record is created each time the subprogram is called and is destroyed when the subprogram returns. The contents keep on changing while subprogram is executing

Current Instruction Pointer (CIP)
- pointer which points to the instruction in the code segment that is currently being executed (or just about to be) by the hardware or software interpreter.
Current Environment Pointer (CEP)
- activation record contains its set of local variables
- it represents the “referencing environment” of the subprogram
- pointer to current activation record is CEP
Execution of Program
- activation for the main program is created and CEP is assigned to it
- CIP is assigned to a pointer to the first instruction of the code segment for the subprogram
- when a subprogram is called, new assignments are set to the CIP and CEP for the first instruction of the code segment of the subprogram and the activation of the subprogram
- To return correctly from the subprogram, values of CEP and CIP are stored before calling the subprogram
- when return instruction is reached, it terminates the activation of subprogram, the old values of CEP and CIP that were saved at the time of subprogram call are retrieved and reinstated
Recursive Subprograms
- In Recursive subprogram calls A subprogram may call any other subprogram including A itself, a subprogram B that calls A or so on
- difference between a recursive call and an ordinary call is that the recursive call creates a second activation of the subprogram during the lifetime of the first activation.
- If execution of program results in chain such that \(k\) recursive calls of subprogram occur before any return is made. Thus \(k+1\) activation of subprogram exist before the return from \(k^{th}\) recursive call
- both CIP and CEP are used to implement recursive subprogram

Data Control & Referencing Environments

The main elements of data control are referencing environments, visibility, dynamic scope, referencing operations, and reference locality.

Referencing environments may be local, nonlocal, global, or predefined. The reference environment of a statement is the collection of all names that are visible in the statement. Referencing environments are generally invariant during program execution, i.e., values may change but name associations do not.

local : associations created on entry to that subprogram (formal parameters, local variables, etc)
nonlocal : associations not created on entry to that subprogram
global : associations created at the start of execution of that main program
predefined : defined in the language definition

Visibility refers to the associations a subprogram has access to during its execution. Redefining a variable within a subprogram “hides” the original association during the execution of the subprogram.

There are 2 ways through which a data object can be made available as an operand for an operation:

direct transmission : object computed at one time during an operation may be transmitted directly to another operation as an operand
referencing through a named data object: named when created, and that name may be used to refer it in an operand for an operation

Static and Dynamic Scoping

In static scope, if a variable name’s scope is a certain function, then its scope is the program text of the function definition: within that text, the variable name exists, and is bound to the variable’s value, but outside that text, the variable name does not exist. A block defines a new scope. Variables can be declared in that scope, and aren’t visible from the outside. However, variables outside the scope – in enclosing scopes – are visible unless they are overridden.

(Algol, Ada, C, Pascal, Scheme, and Haskell)

In dynamic scope (or dynamic scoping), if a variable name’s scope is a certain function, then its scope is the time-period during which the function is executing: while the function is running, the variable name exists, and is bound to its value, but after the function returns, the variable name does not exist. Using this scoping rule, we first look for a local definition of a variable. If it isn’t found, we look up the calling stack for a definition.

(Lisp, older interpreted languages such as SNOBOL and APL)

This means that if function f invokes a separately defined function g, then under lexical scope, function g does not have access to f’s local variables (assuming the text of g is not inside the text of f), while under dynamic scope, function g does have access to f’s local variables (since g is invoked during the invocation of f).

Parameter Passing

The formal parameters are the parameters associated with the function, and the actual parameters are the parameters that are used to call the function.

Methods:

call by value: copy going into the procedure. This is the mechanism used by both C and Java. Note that this mechanism is used for passing objects, where a reference to the objected is passed by value
call by result: copy going out of the procedure, the formal parameters are copied back into the actual parameters. (Note that this only makes sense if the actual parameter is a variable, or has a l-value, such as an array element.)
call by value result: copy going in, and again going out
call by reference: The actual parameters and formal parameters are identified. The natural mechanism for this is to pass a pointer to the actual parameter, and indirect through the pointer
call by name: re-evaluate the actual parameter on every use. For actual parameters that are simple variables, this is the same as call by reference. For actual parameters that are expressions, the expression is re-evaluated on each access

Block Structure

a block or code block is a lexical structure of source code which is grouped together
consist of one or more declarations and statements
fundamental to structured programming, where control structures are formed from blocks
In a block-structured programming language, the objects named in outer blocks are visible inside inner blocks, unless they are masked by an object declared with the same name.

The semantic meaning of a block is twofold.

it provides the programmer with a way for creating arbitrarily large and complex structures that can be treated as units
it enables the programmer to limit the scope of variables and sometimes other objects that have been declared

Sharat Sachin Maximizing my potential