The UNIX Shell As a Fourth Generation Language

There are many database systems available for UNIX, but almost all are software prisons: once you get into them, you must leave the power of UNIX behind. Most were developed on operating systems other than UNIX. Consequently, their developers had very few software features to build upon and wrote the functionality they needed directly, without regard for the features provided by the operating system. The resulting database systems are large, complex programs that degrade total system performance, especially when they are run in a multi-user environment. UNIX provides hundreds of programs that can be piped together to easily perform almost any function imaginable. Nothing comes close to providing the functions that come standard with UNIX. Programs and philosophies carried over from other systems put walls between the user and UNIX, and the power of UNIX is thrown away. The shell, extended with a few relational operators, is the fourth generation language most appropriate to the UNIX environment.

1. Fourth Generation Systems

Recent years have seen a variety of developments in programming language design. Object-oriented languages are becoming common, and languages explicitly supporting multiple tasks and inter-task communication are also gaining popularity. Unfortunately, these efforts have resulted in productivity increases too small to offset the growth in the size and complexity of software systems. One response has been the development of fourth generation programming languages. Although not commonly thought of as such, the UNIX shell is one of the most powerful and flexible fourth generation languages available.

1.1 Attempts at a Definition

There is no consensus on what constitutes a third or fourth generation language. Mainstream third generation languages are typed, procedural languages. They are standardized and largely hardware independent. Operations in the language must be specified in a detailed, step-by-step algorithmic fashion.
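
To make the step-by-step style concrete, the following is a minimal sketch, not taken from any particular system. It assumes a hypothetical input of "name amount" records and finds the record with the largest amount twice: first with an explicit loop that spells out every comparison, then with standard UNIX programs piped together. The function names and the sample data are invented for illustration.

```shell
#!/bin/sh

# Hypothetical sample data: one "name amount" record per line.
make_sample() {
    printf '%s\n' 'smith 300' 'jones 120' 'adams 450' 'baker 120'
}

# Step-by-step style: every comparison and assignment is written out.
largest_by_loop() {
    best_name=
    best_amount=-1
    while read -r name amount; do
        if [ "$amount" -gt "$best_amount" ]; then
            best_name=$name
            best_amount=$amount
        fi
    done
    printf '%s %s\n' "$best_name" "$best_amount"
}

# Piped style: sort numerically on the second field, largest first,
# and keep only the first record.
largest_by_pipeline() {
    sort -k2,2nr | head -n 1
}

make_sample | largest_by_loop
make_sample | largest_by_pipeline
```

Both calls print the same record. The loop version has to say how to find it, while the piped version only states the ordering wanted and how many records to keep.
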
Third generation languages do very little implicit processing. They are general purpose, even most of those that were ostensibly designed as special purpose languages.

Fourth generation languages are usually intended as design tools for a particular application domain. They are usually free form in their use of variables, often not requiring type definitions and allowing dynamic typing of variables. They don’t emphasize a modular, procedure-based coding style. Instead, they contain a number of predefined procedures for performing various high-level operations. These high-level operations involve large amounts of implied processing. For example, a "sort" operator is usually available. The facilities of a fourth generation language are usually both more powerful and less flexible than the facilities available in a third generation language.

A fourth generation programming language (4GL) should make possible the simple statement of what you want, rather than a detailed procedure of how to produce it. Although many products call themselves 4GLs today, they are mostly rewrites of COBOL and report writers. They are too low level and tedious to be what a 4GL should be.

1.2 Previous Generations

The first generation of computer languages was the sequence of zeroes and ones that made up the machine instructions. In the beginning, people had to code in this way. The second generation was "assembly language", which has a one-to-one correspondence with machine instructions. Humans could write names and words to be converted into machine language. For example, this assembler code adds register 1 to register 2.

Figure 1. Second Generation Program