JWasm: Difference between revisions

'''JWasm Macro Assembler''' is an x86 assembler that targets 16, 32 and 64 bit platforms. JWasm is designed as a MASM-compatible assembler using the historical Intel notation and is available under the Sybase Open Watcom Public License. It produces binaries for the DOS, Windows, Linux, OS/2 and FreeBSD operating systems. JWasm is an almost complete rewrite of the earlier Watcom assembler Wasm. It is written in portable C and has been successfully tested with the Open Watcom development environment, the Microsoft Visual Studio family of development tools, the GNU (GCC) compiler and others. It is currently being upgraded by Japheth.
 
==History==
JWasm is an upgrade of the earlier Open Watcom assembler Wasm. It has been extensively rewritten to modernize it, extend its capacity and add support for additional platforms. Among its design targets is a very high level of MASM compatibility. Its initial release, v1.7, is dated 05/20/2008. The current version as of 1/19/2010 is v2.02, adding 64 bit capabilities. It is actively being updated to support the latest operating systems.
 
===Copyright String===
JWasm v2.02, Jan 19 2010, Masm-compatible assembler.
Portions Copyright (c) 1992-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
 
==Usage==
JWasm conforms to the historical Intel x86 assembly notation commonly associated with the Microsoft Macro Assembler and uses the standard [[MASM]] documentation as its technical reference.
 
===Abbreviated Notation===
This notation is a fully specified format which occurs in the following form:
 
<syntaxhighlight lang="asm">
mov eax, DWORD PTR [edi]
</syntaxhighlight>
 
Over time, the parsers in assemblers have improved to the stage where, if the assembler can recognize the size of the data, the SIZE specifier may be omitted:
 
<syntaxhighlight lang="asm">
mov eax, [edi]
</syntaxhighlight>
 
This allows for clearer code that is easier to read. However, there are some contexts where the assembler cannot independently determine the data size, for example if the source operand of an instruction such as MOVZX is a memory operand. In this situation the historical data SIZE specifiers must be used. The following is an example of this situation.
 
<syntaxhighlight lang="asm">
movzx eax, [esi]           ; generates an error - data SIZE cannot be determined by the assembler
movzx eax, BYTE PTR [esi]  ; zero extend a BYTE into the 32 bit EAX register
</syntaxhighlight>
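
The same ambiguity arises when a memory operand is used on its own or together with an immediate value, since nothing in the instruction implies a size. The following is a minimal sketch of that case; the choice of EDI as the pointer register is arbitrary and only for illustration.

<syntaxhighlight lang="asm">
inc [edi]                 ; generates an error - no operand tells the assembler the data SIZE
inc DWORD PTR [edi]       ; increments the 32 bit value at the address held in EDI
mov BYTE PTR [edi], 0     ; stores a single zero byte at the address held in EDI
</syntaxhighlight>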
 
===OFFSET Operator===
JWasm's syntax makes a distinction between fixed and transient addressing using the '''OFFSET''' operator. Data written in either the initialised or uninitialised data sections has a known ADDRESS at assembly time, as do code labels, and all of these are referenced by the '''OFFSET''' operator. Transient addressing is performed with the normal Intel mnemonics for reading the stack within a procedure.
 
For a corresponding data entry in the initialised data section,
 
<syntaxhighlight lang="asm">
textitem db "This is a text item",0
</syntaxhighlight>
 
This data entry can be addressed in the following manner.
 
<syntaxhighlight lang="asm">
mov eax, OFFSET textitem
</syntaxhighlight>
 
===Transient Stack Addressing===
Operating systems provide the area of memory referred to as the stack. On x86 hardware, the stack is the main method of transferring arguments to procedures. Arguments are normally placed on the stack by the PUSH mnemonic in the following form. This example assumes the '''STDCALL''' calling convention and 32 bit data size.
 
<syntaxhighlight lang="asm">
push arg3
push arg2
push arg1
call FunctionName
</syntaxhighlight>
 
The CALL mnemonic pushes the return address onto the stack and then branches to the address of the named procedure. If the procedure has a stack frame where the stack pointer register '''ESP''' is stored in the base pointer register '''EBP''', the first argument for the procedure occurs at address [ebp+8]. While this form of mnemonic notation can be written by experienced assembler programmers, the assembler provides a naming method to remove an unnecessary level of abstraction from writing code of this type.
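
As an illustration of the direct notation, the following is a minimal sketch of a hand-written stack frame for a procedure taking three 32 bit '''STDCALL''' arguments. The label <code>FunctionName</code> and the registers chosen are only for illustration.

<syntaxhighlight lang="asm">
FunctionName:
    push ebp              ; save the caller's base pointer
    mov  ebp, esp         ; establish the stack frame
                          ; [ebp+0] = saved EBP, [ebp+4] = return address
    mov  eax, [ebp+8]     ; first argument  (arg1)
    mov  ecx, [ebp+12]    ; second argument (arg2)
    mov  edx, [ebp+16]    ; third argument  (arg3)
    pop  ebp              ; restore the caller's base pointer
    ret  12               ; STDCALL: the procedure removes 3 x 4 bytes of arguments
</syntaxhighlight>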
The programmer can use the '''name''' of the argument in place of the direct [EBP+displacement] notation to make the code more readable with no loss of performance. When the programmer needs to use the ADDRESS of a transient stack variable (normally referred to as a '''LOCAL''' variable) there are a number of methods. In a prototyped function call the '''ADDR''' operator can be used to obtain the address of a '''LOCAL''' variable. Alternatively the direct Intel mnemonic '''LEA''' can be used to load the effective address of the variable into a register:
 
<syntaxhighlight lang="asm">
lea eax, named_local_variable
</syntaxhighlight>
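
The following sketch puts named arguments, a '''LOCAL''' variable, '''ADDR''' and '''LEA''' together in one procedure. It is a fragment that assumes a 32 bit flat memory model has been set up elsewhere in the source file; <code>FillBuffer</code>, <code>Example</code>, <code>count</code> and <code>buffer</code> are hypothetical names used only for illustration.

<syntaxhighlight lang="asm">
FillBuffer PROTO STDCALL :DWORD       ; hypothetical helper that takes one pointer

Example PROC STDCALL count:DWORD
    LOCAL buffer[16]:BYTE             ; transient storage inside the stack frame

    mov ecx, count                    ; named argument, same as mov ecx, [ebp+8]
    invoke FillBuffer, ADDR buffer    ; ADDR passes the address of the LOCAL
    lea eax, buffer                   ; loads the same address into EAX
    ret
Example ENDP
</syntaxhighlight>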
 
===Square Brackets===
JWasm, like MASM, uses named variables to represent both fixed and transient addresses. Square brackets are used around complex addressing expressions to denote that the contents are a memory operand. Programmers coming from a different background where square brackets are used as general ADDRESS indicators have at times had problems with this notation difference, but JWasm tolerates the use of square brackets around named variables by simply ignoring them. Intel and compatible x86 processors do not have mnemonics to produce an extra level of indirection implied by the ambiguous usage of square brackets.
 
There is some flexibility in how square brackets can be used in historical Intel notation compatible assemblers.
 
<syntaxhighlight lang="asm">
mov eax, [ecx+edx]
mov eax, [ecx][edx]
</syntaxhighlight>
 
Both notations are correct here, and in the second example the extra pair of square brackets functions as an ADDITION operator.
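
The treatment of square brackets around named variables can be sketched as follows. The variable <code>value</code> is a hypothetical name used only for illustration.

<syntaxhighlight lang="asm">
.data
value dd 1234

.code
mov eax, value            ; loads the contents of the variable, 1234
mov eax, [value]          ; brackets around the name are ignored: also loads 1234
mov ecx, OFFSET value     ; loads the address of the variable
mov eax, [ecx]            ; brackets around a register: reads memory at that address
</syntaxhighlight>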
 
===Limited Type Checking===
JWasm supports a pseudo high level notation for creating procedures that perform argument size and count checking. It is part of a system using the '''PROC''', '''ENDP''', '''PROTO''' and '''INVOKE''' operators. The '''PROTO''' operator is used to define a function prototype that has a matching '''PROC''' that is terminated with the '''ENDP''' operator. The prototyped procedure can then be called with the '''INVOKE''' operator, which is protected by the limited size and argument count checking. There is additional notation at a more advanced level for turning off the automatically generated stack frame, for cases where the stack overhead of the procedure call may have an effect with very small procedures. JWasm code can also be written completely free of the pseudo high level notation using only bare Intel mnemonics.
 
Using an example prototype from the 32 bit Windows API function set,
 
<syntaxhighlight lang="asm">
SendMessageA PROTO STDCALL :DWORD,:DWORD,:DWORD,:DWORD
SendMessage equ <SendMessageA>
</syntaxhighlight>
 
The code to call this function using the '''INVOKE''' notation is as follows.
 
<syntaxhighlight lang="asm">
invoke SendMessage,hWin,WM_COMMAND,wParam,lParam
</syntaxhighlight>
 
This is translated exactly to:
 
<syntaxhighlight lang="asm">
push lParam
push wParam
push WM_COMMAND
push hWin
call SendMessage
</syntaxhighlight>
 
The advantage of the '''INVOKE''' method is that it tests the size of the data types and the argument count and generates an assembly time error if the arguments do not match the prototype.
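
The whole '''PROTO''', '''PROC''', '''ENDP''' and '''INVOKE''' system can be sketched with a small user-defined procedure. This is a fragment that assumes a 32 bit flat memory model set up elsewhere; <code>AddTwo</code>, <code>addend1</code> and <code>addend2</code> are hypothetical names, and the last two calls are included only to show what the argument count check rejects.

<syntaxhighlight lang="asm">
AddTwo PROTO STDCALL :DWORD, :DWORD

AddTwo PROC STDCALL addend1:DWORD, addend2:DWORD
    mov eax, addend1           ; arguments are referenced by name
    add eax, addend2           ; result is returned in EAX
    ret                        ; the assembler emits the stack frame cleanup
AddTwo ENDP

    invoke AddTwo, 1, 2        ; matches the prototype
    invoke AddTwo, 1           ; assembly time error: too few arguments
    invoke AddTwo, 1, 2, 3     ; assembly time error: too many arguments
</syntaxhighlight>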
 
===Pseudo High Level Emulation===
JWasm conforms to the historical MASM notation in terms of emulating high level control and loop structures.<br />
It supports the '''.IF''' block structure,
 
<syntaxhighlight lang="asm">
.if
    ; ...
.elseif
    ; ...
.else
    ; ...
.endif
</syntaxhighlight>
 
It also supports the '''.WHILE''' loop structure,
 
<syntaxhighlight lang="asm">
.while eax > 0
    sub eax, 1
.endw
</syntaxhighlight>
 
And the '''.REPEAT''' loop structure.
 
<syntaxhighlight lang="asm">
.repeat
    sub eax, 1
.until eax < 1
</syntaxhighlight>
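
These structures are assembled into ordinary compare and conditional jump mnemonics. As a rough sketch, the '''.REPEAT''' loop above is equivalent to the following hand-written code. The label name is hypothetical (the assembler generates its own internal labels), and the exact jump mnemonic emitted may differ.

<syntaxhighlight lang="asm">
repeat_top:
    sub eax, 1
    cmp eax, 1
    jae repeat_top        ; unsigned test: loop again while EAX is not below 1
</syntaxhighlight>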
 
The high level emulation also supports C style comparison operators that work according to the same rules as Intel mnemonic comparisons. For the .IF block notation the distinction between SIGNED and UNSIGNED data is handled with a minor data type notation variation where the storage size '''DWORD''', which is by default UNSIGNED, can also be specified as '''SDWORD''' for SIGNED comparison. This data type distinction is only appropriate for the pseudo high level notation, as it is unused at the mnemonic level of code where the distinction is determined by the range of conditional evaluation techniques available in the Intel mnemonics.
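
The effect of the '''SDWORD''' notation can be sketched with a register that holds -1. The values moved into ECX are arbitrary markers used only to show which block runs.

<syntaxhighlight lang="asm">
mov eax, -1

.if eax > 0                  ; UNSIGNED: EAX is read as 0FFFFFFFFh, so this block runs
    mov ecx, 1
.endif

.if SDWORD PTR eax > 0       ; SIGNED: EAX is read as -1, so this block is skipped
    mov ecx, 2
.endif
</syntaxhighlight>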
 
The combined pseudo high level emulation allows JWasm to more easily interface with current operating systems that use a C style application programming interface. Generally the pseudo high level interface is used for non-speed critical code where clarity and readability are the most important factors; speed critical code is usually written directly in mnemonics.
 
==Pre-processor==
The pre-processor in JWasm emulates the capacity of the one in the Microsoft Macro Assembler and for most practical purposes is near enough to identical. It is an old design dating back to about 1990, when Microsoft introduced the MASM 6.00 series of assemblers, and is known to experienced users as quirky and complicated to use for advanced macro designs. Notwithstanding its archaic format, it is a reasonably powerful pre-processor with loop techniques, conditional testing, text manipulation commands and the normal text substitution methods associated with arguments passed to the pre-processor.
 
At its simplest, a macro in JWasm is constructed as follows:
 
<syntaxhighlight lang="asm">
ItemName MACRO argument1, argument2, argument3:VARARG
mov argument1, argument2
mov argument3, argument1
ENDM
</syntaxhighlight>
 
This macro is called as follows,
 
<syntaxhighlight lang="asm">
ItemName eax, ecx, edx
</syntaxhighlight>
 
It is expanded by the pre-processor to,
 
<syntaxhighlight lang="asm">
mov eax, ecx
mov edx, eax
</syntaxhighlight>
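
The loop techniques and conditional testing of the pre-processor can be sketched with a macro that accepts any number of arguments. <code>PushAll</code>, <code>regs</code> and <code>reg</code> are hypothetical names used only for illustration.

<syntaxhighlight lang="asm">
PushAll MACRO regs:VARARG
    FOR reg, <regs>           ; loop once for each item in the argument list
        IFNB <reg>            ; conditional test: skip blank items
            push reg
        ENDIF
    ENDM
ENDM

    PushAll eax, ecx, edx     ; expands to: push eax / push ecx / push edx
</syntaxhighlight>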
 
==Licence==
JWasm is licenced under the '''Sybase Open Watcom Public License''' and is available for use in environments and projects that are excluded by the Microsoft EULA for MASM. JWasm has no restrictions on writing Open Source software or writing software for non-Microsoft operating systems.
*[http://www.japheth.de/JWasm/License.html JWasm License]
 
== External Links, Reference And Footnotes ==
*[http://www.japheth.de/JWasm.html JWasm Home (broken link)]
*[https://github.com/JWasm/JWasm JWasm on Github]
*[http://sourceforge.net/projects/jwasm/ JWasm project page on SourceForge]
*[http://www.masm32.com/board/index.php The MASM Forum]