Plain English Programming

From OSDev.wiki
Revision as of 13:35, 3 February 2024

This page is a stub.
You can help the wiki by accurately adding more contents to it.

(Stub details: Search for FIXME and TBD.)

This page relates to Plain English Programming as developed by the Osmosian Order of Plain English Programmers. Notes primarily pertain to the CAL-4700 standalone IDE and compiler. There is at least one other version, Folds/english on Github, an authorized "dynamic fork". (The major differences are in UI, with Folds/english offering the standard Win32 UI elements.) Language differences, if any, should be noted below.

Español Llano is a Spanish-language equivalent.

The language is very much like pseudocode. It allows a small degree of arbitrary word choice, but the code understood by the compiler must precisely describe the program's operation. It's strongly typed with a tiny runtime of less than 2KB. It natively targets only 32-bit Windows, but is so simple that retargeting is likely no harder than any other aspect of OS development.

Far from "Hello, world," the first thing a user learns to do with CAL-4700 is recompile CAL itself. Then, the user is walked through making an app which takes a string, downloads an image from Google Image Search, and renders it in the style of famous painter Monet. It's a rather broad introduction to the language's features.

Note that a sense of humor will be an advantage when reading CAL's instructions.

Language

The language and its documentation are almost entirely free of jargon and symbols other than a subset of English punctuation. Only the period, comma, colon, and semicolon are commonly used. The minus sign functions as a unary minus for negating variables.

The Order's blog makes a case for the language being no more verbose than C or C++. This works if you're a good typist, finding whole words no more trouble to type than symbols. (Having a lot of practice in typing English, I (eekee) find it much less trouble to type than other languages. I find it hard to omit the spaces in both CamelCase and underscore-joined words, and I dislike symbols as I find using the shift key to be uncomfortable. CAL plain English only requires me to use 4 shifted symbols: the colon, the double quote, and the two parentheses.)

"Not" and the suffix "n't" are recognized and understood as you'd expect.

Some words are ignored. Development of the language started with the realization that human infants ignore a lot of words when they're starting to learn to understand language.

Specifying parameters is more verbose than in human English. Example: a number and another number and a third number and a fourth number. It could be shortened, but typically, you set up a structure and pass that to a routine. For instance, you set up a box and then "Draw the box with the black color." Routines with more than 2 parameters are rare in CAL-4700.

Wording tends to differ from English in some other ways too. For example, "Draw the box with the black color", rather than "Draw the box in black." FIXME: check if "in/into/to" is interpreted differently from "with". If not, the latter example is possible.

Definition order doesn't matter. (The Order likes to sort definitions by name and rely on incremental search to find them.)

Types

It has a strong type system and checks types strictly -- perhaps a little too strictly. "A buffer is a string", the compiler is told, and you can "Read the file into a buffer", but you can't "Read the file into a string." (The instructions are wrong on this point. eekee intends to raise the issue with the Order.) There are workarounds. For instance, a string is actually a structure with 2 pointers, so you can create a string, set its pointers to the buffer's pointers, and then use non-destructive string operations on it.

Routines vs. Functions

Like C functions, routines may take parameters and return values, but unlike C functions, they may not be part of an expression. You call one routine per line.

Functions may be part of expressions. Function calls may even look like references to structure elements. In that regard, they seem similar to the methods of a pure OO language, but the instructions recommend using them sparingly.

Compiling with CAL

The compiler, text editor, "finder", and a document editor are all integrated into a single binary. Compilation is triggered by the Run menu entry in the text editor (Ctrl-R). All the files without an extension in the current directory are combined and compiled. The resultant Windows executable is then run. To compile OS code, CAL must be modified. CAL's subsystems are, on the whole, simple and easy to modify, so this will likely be no harder than many other elements of OS development.

The compiler relies only on its own internals and the files in the single source directory. There is no include search path, so there is very little to go wrong. Of course, you may wish to add a path search for your own project, as the present arrangement makes it hard to maintain a consistent version of the library. Folds/english may have useful code for this; it searches subfolders if a certain folder name is present.
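
The selection rule and one possible search-path extension can be sketched outside CAL. This is an illustration in Python, not CAL's or Folds/english's actual code: the function names, the "first directory wins" rule, the case-insensitive name matching, and the sort order are all assumptions made for the example.

```python
from pathlib import Path

def extensionless_files(directory):
    """Regular files in `directory` whose names have no extension."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ""
    )

def collect_sources(search_path):
    """Gather source files from several directories (hypothetical search path).

    Directories are visited in order. If two directories hold a file with the
    same name, the one from the earlier directory is kept, so a project can
    override a library file while sharing the rest of the library.
    """
    chosen = {}
    for directory in search_path:
        for path in extensionless_files(directory):
            # Windows file names are case-insensitive, so compare in lower case.
            chosen.setdefault(path.name.lower(), path)
    return [chosen[name] for name in sorted(chosen)]
```

Calling collect_sources with a single directory reproduces the stock behaviour described above. Since definition order doesn't matter in the language, the order in which the files are combined should not matter either.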

The assembly language issue

CAL doesn't include an assembler, nor does it work with an external assembler. Rather, hex strings of machine code are entered in-place, together with comments showing the assembly language. It looks like this:

To add a number to a pointer;
To add a number to another number:
Intel $8B8508000000. \ mov eax,[ebp+8] \ the number
Intel $8B00. \ mov eax,[eax]
Intel $8B9D0C000000. \ mov ebx,[ebp+12] \ the other number
Intel $0103. \ add [ebx],eax

The good news is that there are only 447 "Intel" lines totalling 1841 bytes of machine code. They're all in the standard library, called the noodle.
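
If you want to check figures like these against your own copy of the noodle, or track progress while rewriting it, a short script can count the lines and bytes. The sketch below is in Python and assumes every machine-code line has the shape shown in the example above (the word "Intel", a dollar sign, hex digits, a period); the function name is made up for the illustration.

```python
import re

# Matches lines such as:  Intel $8B00. \ mov eax,[eax]
INTEL_LINE = re.compile(r"^\s*intel\s+\$([0-9a-f]+)\.", re.IGNORECASE)

def count_machine_code(source):
    """Return (number of Intel lines, total bytes of machine code)."""
    lines = 0
    total = 0
    for text in source.splitlines():
        match = INTEL_LINE.match(text)
        if match:
            lines += 1
            # bytes.fromhex raises ValueError on an odd number of digits,
            # which is a handy check for typos in hand-entered hex.
            total += len(bytes.fromhex(match.group(1)))
    return lines, total

SAMPLE = r"""To add a number to another number:
Intel $8B8508000000. \ mov eax,[ebp+8] \ the number
Intel $8B00. \ mov eax,[eax]
Intel $8B9D0C000000. \ mov ebx,[ebp+12] \ the other number
Intel $0103. \ add [ebx],eax
"""

print(count_machine_code(SAMPLE))  # (4, 16)
```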

The authors of CAL hand-assemble this code, but there are alternatives. You could write a script which passes each snippet to an external assembler with a raw binary output format and hexdumps the result. Or you could write an inline assembler. There's little need to cover the entire instruction set of your CPU; CAL-4700 uses only 36 different instructions with a small number of addressing modes. Using all the instructions is for full-complexity compiler projects, the sort which aren't likely to leave you with any time for your OS.
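
The script approach might look something like the following Python sketch. It assumes NASM is installed and on the path; the function names are hypothetical, and nothing here comes from the CAL distribution. Only the formatting of the output line follows the noodle example above.

```python
import subprocess
import tempfile
from pathlib import Path

def intel_line(machine_code, comment=""):
    """Format raw bytes as a CAL-style line: Intel $HEX. \\ comment"""
    line = f"Intel ${machine_code.hex().upper()}."
    if comment:
        line += f" \\ {comment}"
    return line

def assemble(instruction, nasm="nasm"):
    """Assemble one 32-bit instruction to raw bytes with an external NASM.

    Assumption: `nasm` is the NASM executable. "-f bin" selects flat binary
    output, so the output file holds nothing but the machine code.
    """
    with tempfile.TemporaryDirectory() as work:
        src = Path(work, "snippet.asm")
        out = Path(work, "snippet.bin")
        src.write_text(f"bits 32\n{instruction}\n")
        subprocess.run([nasm, "-f", "bin", "-o", str(out), str(src)], check=True)
        return out.read_bytes()

def assemble_to_intel(instruction):
    """Produce a ready-to-paste line with the assembly kept as a comment."""
    return intel_line(assemble(instruction), instruction)
```

Note that an assembler may pick a different encoding from the hand-assembled one. NASM, for example, will normally encode mov eax,[ebp+8] with a one-byte displacement, where the noodle line shown earlier uses a four-byte displacement. Both encodings do the same thing, but the byte counts will differ.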

Cross-compiling (binary format)

To target x86-32-ELF, such as for a bootloader, modify the compiler to output the required format, compile this with an unmodified CAL, and you have a cross-compiler.

Cross-compiling (ABI)

To target ABIs which pass arguments in registers, the minimum necessary change is to alter only the Call statement. Internal calls will still use the stack for argument passing.

Alternatively, you could modify the compiler so all calls pass arguments in registers. In CAL-4700, the compiler only emits 2 instructions on its own, to push and pop eax. Some work may be needed around those.

Cross-compiling (CPU architecture)

First, decide if you're changing the ABI. Calling convention is baked into machine code in the noodle.

If you're keeping the ABI, the compiler may only need 4 lines changed; 2 each to push and pop a register. You may also want to check over other binary code emitted by the compiler for endianness and arch-dependent ABI details.

Changing "intel" to another name can be done with a simple search-and-replace on the compiler, replacing 12 instances on 11 lines.

The remaining work is to rewrite the 447 machine-code instructions in the noodle.

Porting CAL

CAL looks like an OS with its own design and interface standards. Internally, it wraps Windows calls to present neat, simple interfaces for files, images, vector graphics, GUI, and HTTP. This all suggests porting it to "bare metal" to make it an OS. How much work would be required? Is it a desirable goal? This section attempts to answer these questions.

Low-level

Numbers FIXME: signed integers only, no floats, ratios, smaller integers available, characters are bytes.

Memory allocation evidently incorporates some clever ideas such as storing strings in their own region. How much complexity is required from the underlying OS and libraries is TBD.

Files TBD

Network TBD

Execution Model is perhaps the toughest point. CAL is a single executable which runs as a single task in a single memory space. Not very OS-like! On the other hand, when you switch tabs or open something from the finder, it appears to switch programs instantaneously. This design seems suitable for a limited-purpose system, but many OS developers will want to add program loading and other features.

Display

CAL is capable of rendering images, but draws everything of its own (its whole interface) with vector graphics. It relies on Windows library calls to render these vectors. It also has drawing routines which are essentially turtle graphics, but only straight lines are implemented well.

The coordinate system works entirely with integers. The basic unit is the "twip" which is 1/20th of a printer's point. Inches and other units are converted to twips, which in turn are multiplied by a number derived from the PPI (pixels per inch) setting to produce pixel coordinates. Rotation may be specified in degrees or fractions of a circle.
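
The arithmetic can be sketched as follows, in Python. A printer's point is 1/72 of an inch, so there are 1440 twips to the inch. The function names and the choice to round down with integer division are assumptions for the example; how CAL itself rounds has not been checked.

```python
TWIPS_PER_POINT = 20
POINTS_PER_INCH = 72
TWIPS_PER_INCH = TWIPS_PER_POINT * POINTS_PER_INCH  # 1440

def inches_to_twips(inches):
    """Convert whole inches to twips."""
    return inches * TWIPS_PER_INCH

def twips_to_pixels(twips, ppi):
    """Convert twips to pixels at the given PPI using integers only.

    Multiplying before dividing keeps as much precision as integer
    arithmetic allows; the result is rounded down (an assumption).
    """
    return twips * ppi // TWIPS_PER_INCH

print(twips_to_pixels(inches_to_twips(2), 96))  # 192
print(twips_to_pixels(TWIPS_PER_POINT, 96))     # 1
```

At 96 PPI one pixel covers 15 twips, so lengths which aren't a multiple of 15 twips lose a fraction of a pixel in the conversion.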

Font rendering deserves special mention. As supplied, CAL depends on Windows libraries for rendering fonts, even the one bundled in the binary. It also has routines for drawing all 96 ASCII characters with its own turtle graphics routines. It's a simple old-school plotter font. It looks better at some PPI values than others, but once you have line drawing implemented, you have ready-made text routines you can use for debug prints. ("Write" a string to draw it in the plotter font, "Draw" a string to use a Windows font.)

License

Links

  • The Osmosian Order's blog — Introduces the language, presents examples and answers many questions.
  • Download - search the blog for "http" to get the latest link.
  • osmosian.com has very little content, just a semi-humorous slideshow and a link to the manifesto. (This server also hosts downloads.)
  • Folds/english on Github — An older version with an open-source licence. Its GUI is more deeply integrated with Windows.
  • Instructions (PDF) for a deeper look at the language. It's a little out of date. The up to date instructions are included in the CAL download in CAL's native document format. (It can be exported as PDF.)

People and OSs using PEP

  • eekee says, "I'm very happy to find a language which circumvents my problems with jargon and symbols. I can just read CAL's code without getting confused. I'm also happy with its simplicity and extensibility."