Fundamental Abstraction Facilities
There are two fundamental abstraction facilities: process abstraction and data abstraction. Process abstraction has been emphasized since the early days of programming, while data abstraction came to prominence in the 1980s. All subprograms are process abstractions because they let a program specify that some process is to be done without providing the details of how it is to be done. Object-oriented programming is an outgrowth of the use of data abstraction.
Fundamentals of Subprograms
Every subprogram has three fundamental characteristics: it has a single entry point; the calling program unit is suspended during the execution of the called subprogram, so only one subprogram is in execution at any given time; and control always returns to the caller when the called subprogram’s execution terminates.
Basic Definition
A subprogram definition describes the interface to and the actions of the subprogram abstraction. A subprogram call is the explicit request that a specific subprogram be executed. A subprogram is said to be active if, after having been called, it has begun execution but has not yet completed that execution. There are two fundamental kinds of subprograms: procedures and functions. A subprogram header, which is the first part of the definition, serves several purposes. First, it specifies that the following syntactic unit is a subprogram definition of some particular kind. Second, if the subprogram is not anonymous, it provides a name for the subprogram. Third, it may specify a list of parameters.
Parameters
The parameters in the subprogram header are called formal parameters. They are sometimes thought of as dummy variables because they are not variables in the usual sense: in most cases, they are bound to storage only when the subprogram is called. A subprogram call statement must include the name of the subprogram and a list of parameters to be bound to the formal parameters of the subprogram. These parameters are called actual parameters. In nearly all programming languages, the correspondence between actual and formal parameters (that is, the binding of actual parameters to formal parameters) is done by position: the first actual parameter is bound to the first formal parameter, and so forth. Such parameters are called positional parameters.
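The positional binding described above can be illustrated with a minimal Python sketch. The function name divide and the variable names are hypothetical, chosen only for this illustration.

```python
def divide(dividend, divisor):
    # dividend and divisor are the formal parameters
    return dividend / divisor

# 10 and 4 are actual parameters, bound by position:
# 10 is bound to dividend, 4 is bound to divisor.
quotient = divide(10, 4)

# Swapping the actual parameters changes the binding, and so the result.
swapped = divide(4, 10)
```

Because the binding depends only on position, the order of the actual parameters is part of the meaning of the call.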
Procedures and Functions
There are two distinct categories of subprograms—procedures and functions—both of which can be viewed as approaches to extending the language. All subprograms are collections of statements that define parameterized computations. Functions return values and procedures do not. Functions structurally resemble procedures but are semantically modeled on mathematical functions. Functions are called by appearances of their names in expressions, along with the required actual parameters. The value produced by a function’s execution is returned to the calling code, effectively replacing the call itself.
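As a rough illustration of the distinction, the sketch below shows a procedure-like subprogram that is called as a statement and a function whose calls appear inside an expression. Python does not separate the two syntactically, so this is only an approximation; log_message and square are hypothetical names.

```python
def log_message(log, text):
    """Procedure-like: called for its effect and returns no useful value."""
    log.append(text)

def square(x):
    """Function: produces a value that stands in for the call itself."""
    return x * x

log = []
log_message(log, "start")        # a call used as a statement
total = square(3) + square(4)    # calls appearing inside an expression
```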
Design Issues for Subprograms
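The main design issues for subprograms are: whether local variables are statically or dynamically allocated; whether subprogram definitions can appear inside other subprogram definitions; which parameter-passing method or methods are used; whether the types of the actual parameters are checked against the types of the formal parameters; what the referencing environment of a passed subprogram is, if subprograms can be passed as parameters and can be nested; whether functional side effects are allowed; what types of values, and how many, can be returned from functions; whether subprograms can be overloaded; whether subprograms can be generic; and, if nested subprograms are allowed, whether closures are supported.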
Local Referencing Environments
Local variables can be stack-dynamic or static. Stack-dynamic local variables offer flexibility: they support recursion, and their storage can be shared with the local variables of other subprograms that are not active at the same time. Their disadvantages are the time needed to allocate, initialize (when necessary), and deallocate them on every activation, the indirect addressing needed to access them, and the fact that subprograms using them cannot be history sensitive. Static local variables are slightly more efficient, because they require no run-time overhead for allocation or deallocation, and they allow subprograms to be history sensitive. Their disadvantages are the inability to support recursion and the fact that their storage cannot be shared with the local variables of other inactive subprograms.
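The trade-off can be sketched in Python, with one caveat: Python locals are always created per activation, and the language has no static locals, so a function attribute is used below as a stand-in for history-sensitive storage. The names factorial and next_ticket are hypothetical.

```python
def factorial(n):
    # 'result' is a fresh local in every activation, which is what
    # keeps the recursive activations independent of one another.
    if n <= 1:
        return 1
    result = n * factorial(n - 1)
    return result

def next_ticket():
    # Assumption: a function attribute emulates a static local,
    # that is, storage that survives between calls.
    next_ticket.count += 1
    return next_ticket.count

next_ticket.count = 0
```

Calling next_ticket() repeatedly yields 1, 2, 3, and so on, which is the history-sensitive behavior that purely stack-dynamic locals cannot provide.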
Semantic Models of Parameter Passing
Formal parameters are characterized by one of three semantic models: in mode, out mode, and in-out mode. An in-mode formal parameter can receive data from the corresponding actual parameter. An out-mode formal parameter can transmit data to the actual parameter. An in-out-mode formal parameter can do both. There are two conceptual models of how data transfers take place in parameter transmission: either an actual value is copied (to the caller, to the called subprogram, or both ways), or an access path is transmitted. Most commonly, the access path is a simple pointer or reference.
Pass by Value
When a parameter is passed by value, the value of the actual parameter is used to initialize the corresponding formal parameter, which then acts as a local variable in the subprogram, thus implementing in-mode semantics. Pass-by-value is normally implemented by copy, because accesses often are more efficient with this approach. It could be implemented by transmitting an access path to the value of the actual parameter in the caller, but that would require that the value be in a write-protected cell (one that can only be read). Enforcing the write protection is not always a simple matter.
The advantage of pass-by-value is that for scalars it is fast, in both linkage cost and access time. The main disadvantage of the pass-by-value method if copies are used is that additional storage is required for the formal parameter, either in the called subprogram or in some area outside both the caller and the called subprogram. In addition, the actual parameter must be copied to the storage area for the corresponding formal parameter. The storage and the copy operations can be costly if the parameter is large, such as an array with many elements.
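A minimal sketch of in-mode semantics implemented by copy follows. Python itself passes object references, so the copy is made explicitly here to emulate pass-by-value; zero_first and original are hypothetical names.

```python
import copy

def zero_first(values):
    values[0] = 0        # modifies whatever object 'values' is bound to
    return values

original = [1, 2, 3]

# Emulated pass-by-value: the callee receives its own copy, so the
# caller's list cannot be changed. The copy costs time and storage
# proportional to the size of the list.
result = zero_first(copy.copy(original))
```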
Pass by Result
Pass-by-result is an implementation model for out-mode parameters. When a parameter is passed by result, no value is transmitted to the subprogram. The corresponding formal parameter acts as a local variable, but just before control is transferred back to the caller, its value is transmitted back to the caller’s actual parameter, which obviously must be a variable.
The pass-by-result method has the advantages and disadvantages of pass-by-value, plus some additional disadvantages. If values are returned by copy (as opposed to access paths), as they typically are, pass-by-result also requires the extra storage and the copy operations that are required by pass-by-value. As with pass-by-value, the difficulty of implementing pass-by-result by transmitting an access path usually results in it being implemented by copy. In this case, the problem is in ensuring that the initial value of the actual parameter is not used in the called subprogram.
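One of the additional disadvantages is an actual-parameter collision: if the same variable is passed for two out-mode parameters, its final value depends on the order of the copy-back step. Python has no out-mode parameters, so the sketch below simulates them; Cell, fixer, and call_by_result are hypothetical helpers, and the left-to-right copy-back order is an assumption of this sketch.

```python
class Cell:
    """Hypothetical one-slot container standing in for a caller's variable."""
    def __init__(self, value=None):
        self.value = value

def fixer():
    # The formals x and y are plain locals; nothing comes in from the caller.
    x = 17
    y = 35
    return x, y          # values to be copied back at return

def call_by_result(subprogram, *actuals):
    results = subprogram()
    # Copy-back step, performed left to right in this sketch.
    for cell, value in zip(actuals, results):
        cell.value = value

a = Cell()
call_by_result(fixer, a, a)   # the same variable is passed twice
```

With left-to-right copy-back, a ends up holding 35; an implementation that copied right to left would leave 17 instead.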
Pass by Value Result
Pass-by-value-result is an implementation model for in-out-mode parameters in which actual values are copied. It is in effect a combination of pass-by-value and pass-by-result. The value of the actual parameter is used to initialize the corresponding formal parameter, which then acts as a local variable. In fact, pass-by-value-result formal parameters must have local storage associated with the called subprogram.
At subprogram termination, the value of the formal parameter is transmitted back to the actual parameter. Pass-by-value-result is sometimes called pass-by-copy, because the actual parameter is copied to the formal parameter at subprogram entry and then copied back at subprogram termination. Pass-by-value-result shares with pass-by-value and pass-by-result the disadvantages of requiring extra storage for parameters and time for copying values. It shares with pass-by-result the problems associated with the order in which actual parameters are assigned.
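The copy-in, copy-out behavior can be simulated as follows. This is a sketch under stated assumptions, not a language feature of Python: Cell, add_bonus, and peek_at_actual are hypothetical, and the copy steps are written out by hand.

```python
class Cell:
    """Hypothetical one-slot container standing in for a caller's variable."""
    def __init__(self, value):
        self.value = value

def add_bonus(formal, peek_at_actual):
    formal = formal + 10                   # changes only the local copy
    seen_during_call = peek_at_actual()    # caller's variable is still unchanged
    return formal, seen_during_call

counter = Cell(5)
local_copy = counter.value                                       # copy in at entry
local_copy, seen = add_bonus(local_copy, lambda: counter.value)
counter.value = local_copy                                       # copy out at termination
```

During the call the caller's variable still holds 5; it becomes 15 only when the value is copied back.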
Pass by Reference
Pass-by-reference is a second implementation model for in-out-mode parameters. Rather than copying data values back and forth, however, as in pass-by-value-result, the pass-by-reference method transmits an access path, usually just an address, to the called subprogram. This provides the access path to the cell storing the actual parameter. Thus, the called subprogram is allowed to access the actual parameter in the calling program unit. In effect, the actual parameter is shared with the called subprogram.
The advantage of pass-by-reference is that the passing process itself is efficient, in terms of both time and space. Duplicate space is not required, nor is any copying required. There are, however, several disadvantages to the pass-by-reference method. First, access to the formal parameters will be slower than access to pass-by-value parameters, because of the additional level of indirect addressing that is required. Second, if only one-way communication to the called subprogram is required, inadvertent and erroneous changes may be made to the actual parameter.
Another problem of pass-by-reference is that aliases can be created. This problem should be expected, because pass-by-reference makes access paths available to the called subprograms, thereby providing access to nonlocal variables. The problem with these kinds of aliasing is the same as in other circumstances: It is harmful to readability and thus to reliability. It also makes program verification more difficult.
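The aliasing problem can be sketched in Python. Python passes references to objects, so mutating a list through a formal parameter is visible to the caller, which is close enough to pass-by-reference for this illustration (rebinding the formal itself would not affect the caller). The function add_then_double is hypothetical.

```python
def add_then_double(first, second):
    first[0] += 1
    second[0] *= 2

total = [10]
add_then_double(total, total)   # both formals alias the same list
# total[0] is now (10 + 1) * 2 = 22

left, right = [10], [10]
add_then_double(left, right)    # distinct objects: 11 and 20
```

The body of the subprogram reads as if first and second were independent, yet the result of the first call depends on the fact that they name the same storage.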
Pass by Name
Pass-by-name is an in-out-mode parameter transmission method that does not correspond to a single implementation model. When parameters are passed by name, the actual parameter is, in effect, textually substituted for the corresponding formal parameter in all its occurrences in the subprogram. This method is quite different from those discussed thus far, in which formal parameters are bound to actual values or addresses at the time of the subprogram call.
A pass-by-name formal parameter is bound to an access method at the time of the subprogram call, but the actual binding to a value or an address is delayed until the formal parameter is assigned or referenced. Implementing a pass-by-name parameter requires a subprogram to be passed to the called subprogram to evaluate the address or value of the formal parameter. The referencing environment of the passed subprogram must also be passed. Pass-by-name parameters are both complex to implement and inefficient. They also add significant complexity to the program, thereby lowering its readability and reliability.
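The idea of passing a subprogram that evaluates the actual parameter on demand can be sketched with Python lambdas acting as thunks. This is an emulation, since Python has no pass-by-name; sum_by_name, set_i, env, and data are hypothetical names. The example follows the well-known pattern in which a by-name expression is re-evaluated each time an index variable changes.

```python
def sum_by_name(set_index, low, high, term):
    total = 0
    for k in range(low, high + 1):
        set_index(k)       # assign to the by-name index parameter
        total += term()    # re-evaluate the actual expression each time
    return total

env = {"i": 0}             # stands in for the caller's variable i
data = [2, 4, 6, 8]

def set_i(value):
    env["i"] = value

# The actual parameter data[i] * data[i] is passed unevaluated.
result = sum_by_name(set_i, 0, 3, lambda: data[env["i"]] * data[env["i"]])
```

Each reference to term() evaluates the expression in the caller's environment, which is why the thunk must carry that environment with it.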
Parameter-Passing Methods
In most contemporary languages, parameter communication takes place through the run-time stack. The run-time stack is initialized and maintained by the run-time system, which manages the execution of programs. The runtime stack is used extensively for subprogram control linkage and parameter passing. Pass-by-value parameters have their values copied into stack locations. Pass-by-result parameters are implemented as the opposite of pass-by-value. Pass-by-value-result parameters can be implemented directly from their semantics as a combination of pass-by-value and pass-by-result. Pass-by-reference parameters are perhaps the simplest to implement.
Design Considerations
Two important considerations are involved in choosing parameter-passing methods: efficiency and whether one-way or two-way data transfer is needed. Contemporary software-engineering principles dictate that access by subprogram code to data outside the subprogram should be minimized. With this goal in mind, in-mode parameters should be used whenever no data are to be returned through parameters to the caller.
Out-mode parameters should be used when no data are transferred to the called subprogram but the subprogram must transmit data back to the caller. Finally, in-out-mode parameters should be used only when data must move in both directions between the caller and the called subprogram. There is a practical consideration that is in conflict with this principle. Sometimes it is justifiable to pass access paths for one-way parameter transmission. For example, when a large array is passed to a subprogram that does not modify it, pass-by-value would require copying the entire array, so passing an access path is often preferred for efficiency.
Referencing Environment
Although the idea of passing subprograms as parameters is natural and seemingly simple, the details of how it works can be confusing. If only the transmission of the subprogram code were necessary, it could be done by passing a single pointer. However, two complications arise. First, there is the matter of type checking the parameters of the activations of the subprogram that was passed as a parameter. The second complication with parameters that are subprograms appears only with languages that allow nested subprograms.
The issue is what referencing environment should be used for executing the passed subprogram. There are three choices: shallow binding, deep binding, and ad hoc binding. Shallow binding uses the environment of the call statement that enacts the passed subprogram. Deep binding uses the environment of the definition of the passed subprogram. Ad hoc binding uses the environment of the call statement that passed the subprogram as an actual parameter.
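The three choices can be told apart with a small nested-subprogram example. The Python sketch below uses hypothetical names (sub1 through sub4). Python is statically scoped and uses deep binding, so the passed subprogram sees the x of its definition environment.

```python
def sub1():
    x = 1

    def sub2():
        return x              # which x is this?

    def sub3():
        x = 3                 # environment of the call that passes sub2
        return sub4(sub2)

    def sub4(subx):
        x = 4                 # environment of the call that enacts sub2
        return subx()

    return sub3()

observed = sub1()
```

Under deep binding, as in Python, observed is 1. Shallow binding would give 4 (the x of sub4), and ad hoc binding would give 3 (the x of sub3).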
Overloaded Subprograms
An overloaded subprogram is a subprogram that has the same name as another subprogram in the same referencing environment. Every version of an overloaded subprogram must have a unique protocol; that is, it must be different from the others in the number, order, or types of its parameters, and possibly in its return type if it is a function. The meaning of a call to an overloaded subprogram is determined by the actual parameter list (and/or possibly the type of the returned value, in the case of a function). Although it is not necessary, overloaded subprograms usually implement the same process.
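Python does not resolve overloaded definitions statically, so the sketch below only approximates the idea using functools.singledispatch from the standard library, which selects a version based on the run-time type of the first actual parameter. The function name describe is hypothetical.

```python
from functools import singledispatch

@singledispatch
def describe(value):
    # Fallback version, used when no more specific version is registered.
    return "object"

@describe.register
def _(value: int):
    return "integer"

@describe.register
def _(value: str):
    return "string"
```

A call such as describe(3) selects the int version and describe("a") the str version, so the meaning of the call is determined by the actual parameter, as the paragraph above describes.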
Generic Subprograms
A polymorphic subprogram takes parameters of different types on different activations. Overloaded subprograms provide a particular kind of polymorphism called ad hoc polymorphism. Overloaded subprograms need not behave similarly. Languages that support object-oriented programming usually support subtype polymorphism. Subtype polymorphism means that a variable of type T can access any object of type T or any type derived from T.
Parametric polymorphism is provided by a subprogram that takes generic parameters that are used in type expressions that describe the types of the parameters of the subprogram. Different instantiations of such subprograms can be given different generic parameters, producing subprograms that take different types of parameters. Parametric definitions of subprograms all behave the same. Parametrically polymorphic subprograms are often called generic subprograms.
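A generic subprogram can be sketched in Python with a type variable from the typing module. Python checks these annotations only through external tools, not at run time, so this is an illustration of the notation rather than of compile-time instantiation; last_item is a hypothetical name.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")   # generic parameter used in the type expressions below

def last_item(items: Sequence[T]) -> T:
    # The same definition behaves identically for every element type T.
    return items[-1]

last_number = last_item([1, 2, 3])      # T is int here
last_word = last_item(["a", "b"])       # T is str here
```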
Closures
A closure is a subprogram and the referencing environment where it was defined. The referencing environment is needed if the subprogram can be called from any arbitrary place in the program. Explaining a closure is not so simple. If a static-scoped programming language does not allow nested subprograms, closures are not useful, so such languages do not support them. All of the variables in the referencing environment of a subprogram in such a language (its local variables and the global variables) are accessible, regardless of the place in the program where the subprogram is called.
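A minimal closure sketch in Python follows; make_adder and adder are hypothetical names. The nested subprogram is returned from the subprogram that defines it, and it still has access to the variable x of that defining environment when it is called later from elsewhere.

```python
def make_adder(x):
    def adder(y):
        return x + y      # x comes from make_adder's environment
    return adder

add10 = make_adder(10)    # closure whose environment has x = 10
add5 = make_adder(5)      # a separate closure with x = 5
```

Each call to make_adder produces a distinct closure, so add10(20) and add5(20) give different results even though both run the same code.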
Coroutines
A coroutine is a special kind of subprogram. Rather than the master-slave relationship between a caller and a called subprogram that exists with conventional subprograms, caller and called coroutines are more equitable. In fact, the coroutine control mechanism is often called the symmetric unit control model.
Coroutines can have multiple entry points, which are controlled by the coroutines themselves. They also have the means to maintain their status between activations. This means that coroutines must be history sensitive and thus have static local variables. Secondary executions of a coroutine often begin at points other than its beginning. Because of this, the invocation of a coroutine is called a resume rather than a call.
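Python generators give a restricted form of this behavior (control always returns to the resumer, so they are not fully symmetric), but they are enough to sketch resumption and state kept between activations. The name running_average is hypothetical.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # suspend here; the next send() resumes here
        total += value
        count += 1
        average = total / count

averager = running_average()
next(averager)                  # first activation runs to the first yield
first = averager.send(10)       # resume: average of [10]
second = averager.send(20)      # resume: average of [10, 20]
```

Each send() resumes execution just after the yield rather than at the beginning, and total and count keep their values between activations.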