DIS(6) DIS(6)
NAME
dis - Dis object file
DESCRIPTION
A Dis object file contains the executable form of a single
module, and conventionally uses the file suffix .dis.
The following names are used in the description of the file
encoding.
byte An unsigned 8-bit byte.
word A 32-bit integer value represented in exactly 4
bytes.
long A 64-bit integer value represented in exactly 8
bytes.
operand An integer stored in a compact variable-length
encoding selected by the two most significant bits
as follows:
0x signed 7 bits, 1 byte
10 signed 14 bits, 2 bytes
11 signed 30 bits, 4 bytes
string A variable length sequence of bytes terminated by
a zero byte. Names thus represented are in utf(6)
format.
All integers are encoded in two's complement format, most
significant byte first.
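The operand encoding can be sketched as follows. This is an illustrative decoder, not code from any Dis implementation; the name decode_operand and its (value, next position) return convention are assumptions made for this example.

```python
def decode_operand(buf, pos=0):
    """Decode one variable-length operand starting at buf[pos].

    Returns (value, next_pos). Hypothetical helper for illustration.
    """
    b0 = buf[pos]
    if b0 & 0x80 == 0:
        # Prefix 0x: one byte holding a signed 7-bit value.
        size, bits, raw = 1, 7, b0
    elif b0 & 0xC0 == 0x80:
        # Prefix 10: two bytes holding a signed 14-bit value.
        size, bits = 2, 14
        raw = int.from_bytes(buf[pos:pos + 2], "big") & 0x3FFF
    else:
        # Prefix 11: four bytes holding a signed 30-bit value.
        size, bits = 4, 30
        raw = int.from_bytes(buf[pos:pos + 4], "big") & 0x3FFFFFFF
    if raw & (1 << (bits - 1)):
        # Sign bit set: convert from two's complement.
        raw -= 1 << bits
    return raw, pos + size
```

For instance, the single byte 0x7F decodes to -1, and the four bytes C0 0C 80 30 decode to 819248.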
Every object file has a header followed by five sections
containing code, data, and several sorts of descriptors:
header code-section type-section data-section module-name link-section
Each section is described in turn below.
Header
The header contains a magic number, a digital signature, a
flag word, sizes of the other sections, and a description of
the entry point. It has the following format:
header:
magic signature(opt) runflag stack-extent
code-size data-size type-size link-size entry-pc entry-type
magic, runflag:
operand
stack-extent, code-size, data-size, type-size, link-size:
operand
entry-pc, entry-type:
operand
The magic number is defined as 819248 (symbolically XMAGIC),
for modules that have not been signed cryptographically, and
923426 (symbolically SMAGIC), for modules that contain a
signature. The symbolic names XMAGIC and SMAGIC are defined
by the C include file /include/isa.h and by the Limbo module
dis(2).
The signature is present only if the magic number is SMAGIC.
It has the form:
signature:
length signature-data
length:
operand
signature-data:
byte ...
A digital signature is defined by a length, followed by an
array of bytes of that length that contain the signature in
some unspecified format. Data within the signature should
identify the signing authority, algorithm, and data to be
signed.
Runflag is a bit mask that selects various execution options
for a Dis module. The flags currently defined are:
MUSTCOMPILE (1<<0)
The module must be compiled into native instruc-
tions for execution (using a just-in-time com-
piler); if the implementation cannot do that, the
load instruction should give an error.
DONTCOMPILE (1<<1)
The module should not be compiled into native
instructions, when that is the default for the
runtime environment, but should be interpreted.
This flag may be set to allow debugging or to save
memory.
SHAREMP (1<<2)
All instances of the module should share the same
module data. There
is no implicit synchronisation between threads
using the shared data.
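A minimal sketch of testing the runflag bit mask follows. The constant values come from the list above; the function describe_runflag is a hypothetical helper, not part of any Dis library.

```python
MUSTCOMPILE = 1 << 0
DONTCOMPILE = 1 << 1
SHAREMP = 1 << 2

RUNFLAG_NAMES = {
    MUSTCOMPILE: "MUSTCOMPILE",
    DONTCOMPILE: "DONTCOMPILE",
    SHAREMP: "SHAREMP",
}


def describe_runflag(runflag):
    """Return (names of the flags that are set, any bits not defined above)."""
    names = [name for bit, name in RUNFLAG_NAMES.items() if runflag & bit]
    unknown = runflag & ~(MUSTCOMPILE | DONTCOMPILE | SHAREMP)
    return names, unknown
```

Reporting undefined bits separately lets a reader of the file notice flags that this description does not cover.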
Stack-extent, if non-zero, gives the number of bytes by
which the thread stack of this module should be extended in
the event that procedure calls exhaust the allocated stack.
While stack extension is transparent to programs, increasing
this value may improve the efficiency of execution at the
expense of using more memory.
Code-size, type-size and link-size give the number of
entries (instructions, type descriptors, linkage directives)
in the corresponding sections.
Data-size is the size in bytes of the module's global data
area (not the number of items in data-section).
Entry-pc is an integer index into the instruction stream
that is the default entry point for this module. It should
point to the first instruction of a function. Instructions
are numbered from a program counter value of zero.
Entry-type is the index of the type descriptor in the type
section that corresponds to the function entry point set by
entry-pc.
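The header layout can be read with a short routine such as the one below. It is a sketch under stated assumptions: parse_header, decode_operand and the field names in the returned dictionary are invented for this example, and the signature bytes are returned uninterpreted because their format is unspecified.

```python
XMAGIC = 819248  # unsigned module
SMAGIC = 923426  # module carrying a signature


def decode_operand(buf, pos):
    """Decode one variable-length operand; returns (value, next_pos)."""
    b0 = buf[pos]
    if b0 & 0x80 == 0:
        size, bits, raw = 1, 7, b0
    elif b0 & 0xC0 == 0x80:
        size, bits = 2, 14
        raw = int.from_bytes(buf[pos:pos + 2], "big") & 0x3FFF
    else:
        size, bits = 4, 30
        raw = int.from_bytes(buf[pos:pos + 4], "big") & 0x3FFFFFFF
    if raw & (1 << (bits - 1)):
        raw -= 1 << bits
    return raw, pos + size


def parse_header(buf):
    """Return (magic, signature or None, header fields, next position)."""
    magic, pos = decode_operand(buf, 0)
    if magic not in (XMAGIC, SMAGIC):
        raise ValueError("not a Dis object file: magic %d" % magic)
    signature = None
    if magic == SMAGIC:
        # The signature appears only in signed modules.
        length, pos = decode_operand(buf, pos)
        signature = bytes(buf[pos:pos + length])
        pos += length
    fields = {}
    for name in ("runflag", "stack_extent", "code_size", "data_size",
                 "type_size", "link_size", "entry_pc", "entry_type"):
        fields[name], pos = decode_operand(buf, pos)
    return magic, signature, fields, pos
```

The returned position is where the code section begins, so a caller can continue decoding code_size instructions from there.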
Code Section
The code section describes a sequence of instructions for
the virtual machine. There are code-size instructions. An
instruction is encoded as follows:
instruction:
opcode address-mode middle-data(opt) source-data(opt) dest-data(opt)
opcode, address-mode:
byte
middle-data:
operand
source-data, dest-data:
operand operand(opt)
The one byte opcode specifies the instruction to execute;
opcodes are defined by the virtual machine specification.
The address-mode byte specifies the addressing mode of each
of the three operands: middle, source and destination. The
source and destination operands are encoded by three bits
and the middle operand by two bits. The bits are packed as
follows:
bit 7 6 5 4 3 2 1 0
m1 m0 s2 s1 s0 d2 d1 d0
The following definitions are used in the description of
addressing modes:
OP 30 bit integer operand
SO 16 bit unsigned small offset from register
SI 16 bit signed immediate value
LO 30 bit signed large offset from register
The middle operand is encoded as follows:
00 none no middle operand
01 $SI small immediate
10 SO(FP) small offset indirect from FP
11 SO(MP) small offset indirect from MP
The middle-data field is present only if the middle operand
specifier of the address-mode is not `none'. If the field
is present it is encoded as an operand.
The source and destination operands are encoded as follows:
000 LO(MP) offset indirect from MP
001 LO(FP) offset indirect from FP
010 $OP 30 bit immediate
011 none no operand
100 SO(SO(MP)) double indirect from MP
101 SO(SO(FP)) double indirect from FP
110 reserved
111 reserved
The source-data and dest-data fields are present only when
the corresponding address-mode field is not `none'. For
offset indirect and immediate modes the field contains a
single operand value. For double indirect modes the values
are encoded as two operands: the first is the register indi-
rect offset, and the second is the final indirect offset.
The offsets for double indirect addressing cannot be larger
than 16 bits.
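The instruction encoding described in this section can be sketched as follows. The function and key names are assumptions made for illustration; opcodes are returned as raw numbers because their meanings are defined elsewhere, and the 16-bit limit on double indirect offsets is not checked.

```python
def decode_operand(buf, pos):
    """Decode one variable-length operand; returns (value, next_pos)."""
    b0 = buf[pos]
    if b0 & 0x80 == 0:
        size, bits, raw = 1, 7, b0
    elif b0 & 0xC0 == 0x80:
        size, bits = 2, 14
        raw = int.from_bytes(buf[pos:pos + 2], "big") & 0x3FFF
    else:
        size, bits = 4, 30
        raw = int.from_bytes(buf[pos:pos + 4], "big") & 0x3FFFFFFF
    if raw & (1 << (bits - 1)):
        raw -= 1 << bits
    return raw, pos + size


def decode_address(mode, buf, pos):
    """Decode source-data or dest-data for a 3-bit addressing mode."""
    if mode == 0b011:
        return None, pos                      # no operand
    if mode in (0b000, 0b001, 0b010):
        value, pos = decode_operand(buf, pos)
        return (value,), pos                  # LO(MP), LO(FP) or $OP
    if mode in (0b100, 0b101):
        first, pos = decode_operand(buf, pos)
        second, pos = decode_operand(buf, pos)
        return (first, second), pos           # SO(SO(MP)) or SO(SO(FP))
    raise ValueError("reserved addressing mode %d" % mode)


def decode_instruction(buf, pos=0):
    """Decode one instruction; returns (fields, next_pos)."""
    opcode, amode = buf[pos], buf[pos + 1]
    pos += 2
    mid_mode = amode >> 6             # bits 7-6
    src_mode = (amode >> 3) & 0b111   # bits 5-3
    dst_mode = amode & 0b111          # bits 2-0
    middle = None
    if mid_mode != 0:
        middle, pos = decode_operand(buf, pos)
    src, pos = decode_address(src_mode, buf, pos)
    dst, pos = decode_address(dst_mode, buf, pos)
    fields = {"opcode": opcode, "mid_mode": mid_mode, "middle": middle,
              "src_mode": src_mode, "src": src,
              "dst_mode": dst_mode, "dst": dst}
    return fields, pos
```

With the address-mode byte 0x48 (binary 01 001 000), the decoder reads a small immediate middle operand, then an FP-relative source, then an MP-relative destination, in that order.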
Type Section
The type section contains type-size type descriptors
describing the layout of pointers within data types. The
format of each descriptor is:
type-descriptor:
desc-number memsize mapsize map
desc-number, memsize, mapsize:
operand
map:
byte ...
The desc-number is a small integer index used to identify
the descriptor to instructions such as new. Memsize is the
size in bytes of the memory described by this type.
The mapsize field gives the size in bytes of the following
map array. Map is an array of bytes representing a bit map
where each bit corresponds to a word in memory. The most
significant bit corresponds to the lowest address. For each
bit in the map, the word at the corresponding offset in the
type is a pointer iff the bit is set to 1.
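The pointer map can be interpreted as in the sketch below. The function name is hypothetical, and the word size of 4 bytes is an assumption taken from the definition of word at the start of this page; an implementation with a different machine word size would scale the offsets differently.

```python
def pointer_offsets(map_bytes, word_size=4):
    """Return the byte offsets of the words that the map marks as pointers.

    word_size=4 is an assumption for this example.
    """
    offsets = []
    for index, byte in enumerate(map_bytes):
        for bit in range(8):
            # The most significant bit of each byte is the lowest address.
            if byte & (0x80 >> bit):
                offsets.append((index * 8 + bit) * word_size)
    return offsets
```

A map consisting of the single byte 0xA0 (binary 10100000) marks the first and third words of the type as pointers.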
Data Section
The data section encodes the contents of the data segment
for the module, addressed by MP at run-time. The section
contains a sequence of items of the following form:
data-item:
code count(opt) offset data-value ...
code:
byte
count, offset:
operand
Each item contains an offset into the section, followed by
one or more data values in a machine-independent encoding.
As each value is placed in the data segment, the offset is
incremented by the size of the datum.
The code byte has two 4-bit fields. The bottom 4 bits of
code give the number of data-values if there are between 1
and 15; if there are more than 15, the low-order field is
zero, and a following operand gives the count.
The top 4 bits of code encode the type of each data-value in
the item, which determines its encoding. The defined values
are:
0001 8 bit bytes
0010 32 bit integers, one word each
0011 string value encoded by utf(6) in count bytes
0100 real values in IEEE754 canonical representation,
8 bytes each
0101 Array, represented by two words giving type and
length
0110 Set base for data items: one word giving an array
index
0111 Restore base for data items: no operands
1000 64 bit big, 8 bytes each
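The leading fields of a data-item can be decoded as sketched here. The helper names and the short labels in DATA_KINDS are invented for this example; decoding of the data-values themselves is left out.

```python
DATA_KINDS = {
    0b0001: "byte",
    0b0010: "word",
    0b0011: "string",
    0b0100: "real",
    0b0101: "array",
    0b0110: "set base",
    0b0111: "restore base",
    0b1000: "big",
}


def decode_operand(buf, pos):
    """Decode one variable-length operand; returns (value, next_pos)."""
    b0 = buf[pos]
    if b0 & 0x80 == 0:
        size, bits, raw = 1, 7, b0
    elif b0 & 0xC0 == 0x80:
        size, bits = 2, 14
        raw = int.from_bytes(buf[pos:pos + 2], "big") & 0x3FFF
    else:
        size, bits = 4, 30
        raw = int.from_bytes(buf[pos:pos + 4], "big") & 0x3FFFFFFF
    if raw & (1 << (bits - 1)):
        raw -= 1 << bits
    return raw, pos + size


def decode_item_header(buf, pos=0):
    """Return (kind, count, offset, next_pos) for one data-item."""
    code = buf[pos]
    pos += 1
    kind = code >> 4        # top 4 bits: type of each data-value
    count = code & 0x0F     # bottom 4 bits: number of data-values
    if count == 0:
        # More than 15 values: the count follows as an operand.
        count, pos = decode_operand(buf, pos)
    offset, pos = decode_operand(buf, pos)
    return kind, count, offset, pos
```

The code byte 0x23 therefore introduces three 32-bit integers, while 0x10 followed by the operand 0x14 introduces twenty bytes.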
The loader maintains a current base address and a stack of
addresses. Each item's value is stored at the address
formed by adding the current offset to the current base
address. That address initially is the base of the module's
data segment. The `set base' operation immediately follows
an `array' data-item. It stacks the current base address and
sets the current base address to the address of the array
element selected by its operand. The `restore base' opera-
tion sets the current base address to the address on the top
of the stack, and pops the stack.
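The base address bookkeeping can be modelled as below. This is only a sketch: the class BaseTracker is hypothetical, addresses are plain integers, and the caller is assumed to have already computed the address of the selected array element.

```python
class BaseTracker:
    """Tracks the current base address while data items are loaded."""

    def __init__(self, segment_base):
        # Initially the base is the start of the module's data segment.
        self.base = segment_base
        self.stack = []

    def address(self, offset):
        """Address at which an item with the given offset is stored."""
        return self.base + offset

    def set_base(self, element_address):
        """'Set base': stack the current base, then switch to the element."""
        self.stack.append(self.base)
        self.base = element_address

    def restore_base(self):
        """'Restore base': return to the most recently stacked base."""
        self.base = self.stack.pop()
```

Because the bases are stacked, `set base' operations can nest, for example when an array element itself contains an array.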
Module name
The module name immediately follows the data section. It
contains the name of the module implemented by the object
file, as a sequence of bytes in UTF encoding, terminated by
a zero byte.
Link Section
The final section contains an array of link-size external
linkage items, listing the functions exported by this mod-
ule. Each variable-length item contains the following:
link-item:
pc desc sig fn-name
pc, desc:
operand
sig:
word
fn-name:
string
Fn-name is the name of an exported function. Adt member
functions appear with their full names: the member name
qualified by the adt name, in the form adt-name.member-name,
for instance Iobuf.gets.
Pc is the instruction number of its entry point. Desc is an
index value that selects a type descriptor in the type sec-
tion, which gives the type of the function's stack frame.
Sig is an integer hash of the type signature of the func-
tion, used in type checking.
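A link-item can be read as sketched below. The function names are assumptions for this example; sig is read here as an unsigned 32-bit value, which is a choice made for display only, and the name is decoded as UTF-8.

```python
def decode_operand(buf, pos):
    """Decode one variable-length operand; returns (value, next_pos)."""
    b0 = buf[pos]
    if b0 & 0x80 == 0:
        size, bits, raw = 1, 7, b0
    elif b0 & 0xC0 == 0x80:
        size, bits = 2, 14
        raw = int.from_bytes(buf[pos:pos + 2], "big") & 0x3FFF
    else:
        size, bits = 4, 30
        raw = int.from_bytes(buf[pos:pos + 4], "big") & 0x3FFFFFFF
    if raw & (1 << (bits - 1)):
        raw -= 1 << bits
    return raw, pos + size


def parse_link_item(buf, pos=0):
    """Return (pc, desc, sig, fn_name, next_pos) for one link-item."""
    pc, pos = decode_operand(buf, pos)
    desc, pos = decode_operand(buf, pos)
    # sig is a word: exactly 4 bytes, most significant byte first.
    sig = int.from_bytes(buf[pos:pos + 4], "big")
    pos += 4
    # fn-name is a string: bytes up to a terminating zero byte.
    end = buf.index(0, pos)
    fn_name = bytes(buf[pos:end]).decode("utf-8")
    return pc, desc, sig, fn_name, end + 1
```

Calling parse_link_item repeatedly, link-size times, each time starting at the returned position, walks the whole section.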
SEE ALSO
asm(1), dis(2), sbl(6)
``The Dis Virtual Machine Specification'', Volume 2
Page 6 Plan 9 (printed 11/5/25)