DIS(6)                                                     DIS(6)

     NAME
          dis - Dis object file

     DESCRIPTION
          A Dis object file contains the executable form of a single
          module, and conventionally uses the file suffix .dis.

          The following names are used in the description of the file
          encoding.

          byte      An unsigned 8-bit byte.

          word      A 32-bit integer value represented in exactly 4
                    bytes.

          long      A 64-bit integer value represented in exactly 8
                    bytes.

          operand   An integer stored in a compact variable-length
                    encoding selected by the two most significant bits
                    as follows:

                    0x   signed 7 bits, 1 byte
                    10   signed 14 bits, 2 bytes
                    11   signed 30 bits, 4 bytes

          string    A variable length sequence of bytes terminated by
                    a zero byte.  Names thus represented are in utf(6)
                    format.

          All integers are encoded in two's complement format, most
          significant byte first.
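
          The operand encoding above can be sketched as a small
          decoder.  This is an illustration only: the name
          read_operand and the (value, next position) return
          convention are assumptions of this example, not part of
          any Dis library.

```python
def read_operand(buf, pos=0):
    """Decode one operand starting at buf[pos]; return (value, next_pos)."""
    first = buf[pos]
    if first & 0x80 == 0:        # 0x: signed 7 bits, 1 byte
        nbytes, bits = 1, 7
    elif first & 0x40 == 0:      # 10: signed 14 bits, 2 bytes
        nbytes, bits = 2, 14
    else:                        # 11: signed 30 bits, 4 bytes
        nbytes, bits = 4, 30
    if pos + nbytes > len(buf):
        raise ValueError("truncated operand")
    # Most significant byte first; mask off the two selector bits.
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value & (1 << (bits - 1)):    # sign bit set: two's complement
        value -= 1 << bits
    return value, pos + nbytes
```

          For instance, the four bytes C0 0C 80 30 (hexadecimal)
          decode to 819248, and the single byte 7F decodes to -1.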

          Every object file has a header followed by five sections
          containing code, data, and several sorts of descriptors:

               header code-section type-section data-section
               module-name link-section

          Each section is described in turn below.

        Header
          The header contains a magic number, an optional digital
          signature, a flag word, a stack extent, the sizes of the
          other sections, and a description of the entry point.  It
          has the following format:

               header:
                    magic signature_opt runflag stack-extent
                         code-size data-size type-size link-size entry-pc entry-type
               magic, runflag:
                    operand
               stack-extent, code-size, data-size, type-size, link-size:
                    operand
               entry-pc, entry-type:
                    operand

          The magic number is defined as 819248 (symbolically XMAGIC),
          for modules that have not been signed cryptographically, and
          923426 (symbolically SMAGIC), for modules that contain a
          signature.  The symbolic names XMAGIC and SMAGIC are defined
          by the C include file /include/isa.h and by the Limbo module
          dis(2).

          The signature is present only if the magic number is SMAGIC.
          It has the form:

               signature:
                    length signature-data
               length:
                    operand
               signature-data:
                    byte ...

          A digital signature is defined by a length, followed by an
          array of bytes of that length that contain the signature in
          some unspecified format.  Data within the signature should
          identify the signing authority, algorithm, and data to be
          signed.

          Runflag is a bit mask that selects various execution options
          for a Dis module. The flags currently defined are:

               MUSTCOMPILE (1<<0)
                    The module must be compiled into native
                    instructions for execution (using a just-in-time
                    compiler); if the implementation cannot do that,
                    the load instruction should give an error.

               DONTCOMPILE (1<<1)
                    The module should not be compiled into native
                    instructions, when that is the default for the
                    runtime environment, but should be interpreted.
                    This flag may be set to allow debugging or to save
                    memory.

               SHAREMP (1<<2)
                    All instances of the module should share the same
                    module data. There is no implicit synchronisation
                    between threads using the shared data.

               HASLDT (1<<4)
                    The dis file contains a separate import section.
                    On older versions of the system, this section was
                    within the data section.

               HASEXCEPT (1<<5)
                    The dis file contains an exception handler
                    section.
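
          As an illustration, the flag bits listed above can be
          tested with a simple mask.  The table and the helper name
          runflag_names are hypothetical; bits not described on this
          page are ignored by this sketch.

```python
# Flag values as listed in this manual page.
RUNFLAGS = {
    "MUSTCOMPILE": 1 << 0,
    "DONTCOMPILE": 1 << 1,
    "SHAREMP": 1 << 2,
    "HASLDT": 1 << 4,
    "HASEXCEPT": 1 << 5,
}

def runflag_names(runflag):
    """Return the names of the documented flags set in runflag."""
    return [name for name, bit in RUNFLAGS.items() if runflag & bit]
```

          A runflag of 0x12, for example, selects DONTCOMPILE and
          HASLDT.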

          Stack-extent, if non-zero, gives the number of bytes by
          which the thread stack of this module should be extended in
          the event that procedure calls exhaust the allocated stack.
          While stack extension is transparent to programs, increasing
          this value may improve the efficiency of execution at the
          expense of using more memory.

          Code-size, type-size and link-size give the number of
          entries (instructions, type descriptors, linkage directives)
          in the corresponding sections.

          Data-size is the size in bytes of the module's global data
          area (not the number of items in data-section).

          Entry-pc is an integer index into the instruction stream
          that is the default entry point for this module.  It should
          point to the first instruction of a function.  Instructions
          are numbered from a program counter value of zero.

          Entry-type is the index of the type descriptor in the type
          section that corresponds to the function entry point set by
          entry-pc.
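
          The header layout can be read field by field.  The
          following is a minimal sketch, assuming the whole file is
          already in memory; parse_header, read_operand and the
          dictionary keys are names invented for this example.

```python
XMAGIC = 819248   # module without a signature
SMAGIC = 923426   # module carrying a signature

FIELDS = ("runflag", "stack_extent", "code_size", "data_size",
          "type_size", "link_size", "entry_pc", "entry_type")

def read_operand(buf, pos):
    """Decode one variable-length operand; return (value, next_pos)."""
    first = buf[pos]
    nbytes, bits = (1, 7) if first < 0x80 else (2, 14) if first < 0xC0 else (4, 30)
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit set
        value -= 1 << bits
    return value, pos + nbytes

def parse_header(buf):
    """Return (header dict, offset of the first byte after the header)."""
    magic, pos = read_operand(buf, 0)
    if magic not in (XMAGIC, SMAGIC):
        raise ValueError("not a Dis object file")
    header = {"magic": magic, "signature": None}
    if magic == SMAGIC:              # signature only present for SMAGIC
        length, pos = read_operand(buf, pos)
        header["signature"] = bytes(buf[pos:pos + length])
        pos += length
    for name in FIELDS:              # remaining fields are all operands
        header[name], pos = read_operand(buf, pos)
    return header, pos
```

          The returned offset is where the code section begins, so
          a caller would continue decoding code-size instructions
          from there.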

        Code Section
          The code section describes a sequence of instructions for
          the virtual machine.  There are code-size instructions.  An
          instruction is encoded as follows:

               instruction:
                    opcode address-mode middle-data_opt source-data_opt dest-data_opt
               opcode, address-mode:
                    byte
               middle-data:
                    operand
               source-data, dest-data:
                    operand operand_opt

          The one byte opcode specifies the instruction to execute;
          opcodes are defined by the virtual machine specification.

          The address-mode byte specifies the addressing mode of each
          of the three operands: middle, source and destination. The
          source and destination operands are encoded by three bits
          and the middle operand by two bits. The bits are packed as
          follows:

               bit  7  6  5  4  3  2  1  0
                   m1 m0 s2 s1 s0 d2 d1 d0

          The following definitions are used in the description of
          addressing modes:

               OP   30 bit integer operand
               SO   16 bit unsigned small offset from register
               SI   16 bit signed immediate value
               LO   30 bit signed large offset from register

          The middle operand is encoded as follows:

               00    none    no middle operand
               01    $SI     small immediate
               10    SO(FP)  small offset indirect from FP
               11    SO(MP)  small offset indirect from MP

          The middle-data field is present only if the middle operand
          specifier of the address-mode is not `none'.  If the field
          is present it is encoded as an operand.

          The source and destination operands are encoded as follows:

               000   LO(MP)     offset indirect from MP
               001   LO(FP)     offset indirect from FP
               010   $OP        30 bit immediate
               011   none       no operand
               100   SO(SO(MP)) double indirect from MP
               101   SO(SO(FP)) double indirect from FP
               110              reserved
               111              reserved

          The source-data and dest-data fields are present only when
          the corresponding address-mode field is not `none'.  For
          offset indirect and immediate modes the field contains a
          single operand value.  For double indirect modes the values
          are encoded as two operands: the first is the register
          indirect offset, and the second is the final indirect offset.
          The offsets for double indirect addressing cannot be larger
          than 16 bits.
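
          The instruction encoding can be illustrated with a small
          decoder.  It is a sketch under the rules stated in this
          section: decode_instruction, read_address and the result
          layout are hypothetical names, and opcodes are returned as
          raw numbers because their meaning is defined elsewhere.

```python
MIDDLE_MODES = {1: "$SI", 2: "SO(FP)", 3: "SO(MP)"}
SINGLE_MODES = {0: "LO(MP)", 1: "LO(FP)", 2: "$OP"}
DOUBLE_MODES = {4: "SO(SO(MP))", 5: "SO(SO(FP))"}
NO_OPERAND = 3

def read_operand(buf, pos):
    """Decode one variable-length operand; return (value, next_pos)."""
    first = buf[pos]
    nbytes, bits = (1, 7) if first < 0x80 else (2, 14) if first < 0xC0 else (4, 30)
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit set
        value -= 1 << bits
    return value, pos + nbytes

def read_address(mode, buf, pos):
    """Read the data for one source or destination operand."""
    if mode == NO_OPERAND:
        return None, pos
    if mode in SINGLE_MODES:         # one operand value
        value, pos = read_operand(buf, pos)
        return (SINGLE_MODES[mode], value), pos
    if mode in DOUBLE_MODES:         # register offset, then final offset
        first, pos = read_operand(buf, pos)
        second, pos = read_operand(buf, pos)
        return (DOUBLE_MODES[mode], first, second), pos
    raise ValueError("reserved addressing mode %d" % mode)

def decode_instruction(buf, pos):
    """Return (instruction dict, next_pos)."""
    opcode, mode = buf[pos], buf[pos + 1]
    pos += 2
    m, s, d = mode >> 6, (mode >> 3) & 7, mode & 7   # m1 m0 s2 s1 s0 d2 d1 d0
    middle = None
    if m:                            # middle-data comes first when present
        value, pos = read_operand(buf, pos)
        middle = (MIDDLE_MODES[m], value)
    source, pos = read_address(s, buf, pos)
    dest, pos = read_address(d, buf, pos)
    return {"opcode": opcode, "middle": middle,
            "source": source, "dest": dest}, pos
```

          An address-mode byte of 0x11, for example, means no middle
          operand, a $OP source and an LO(FP) destination.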

        Type Section
          The type section contains type-size type descriptors
          describing the layout of pointers within data types. The
          format of each descriptor is:

               type-descriptor:
                    desc-number memsize mapsize map
               desc-number, memsize, mapsize:
                    operand
               map:
                    byte ...

          The desc-number is a small integer index used to identify
          the descriptor to instructions such as new.  Memsize is the
          size in bytes of the memory described by this type.

          The mapsize field gives the size in bytes of the following
          map array.  Map is an array of bytes representing a bit map
          where each bit corresponds to a word in memory.  The most
          significant bit corresponds to the lowest address.  For each
          bit in the map, the word at the corresponding offset in the
          type is a pointer iff the bit is set to 1.
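
          To illustrate the pointer map, the sketch below reads one
          descriptor and lists the byte offsets that hold pointers.
          The function names are invented for this example, and the
          word size of 4 bytes follows the 32-bit definition of word
          given at the start of this page.

```python
def read_operand(buf, pos):
    """Decode one variable-length operand; return (value, next_pos)."""
    first = buf[pos]
    nbytes, bits = (1, 7) if first < 0x80 else (2, 14) if first < 0xC0 else (4, 30)
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit set
        value -= 1 << bits
    return value, pos + nbytes

def parse_type_descriptor(buf, pos):
    """Return (descriptor dict, next_pos)."""
    number, pos = read_operand(buf, pos)
    memsize, pos = read_operand(buf, pos)
    mapsize, pos = read_operand(buf, pos)
    bitmap = bytes(buf[pos:pos + mapsize])
    return {"number": number, "memsize": memsize, "map": bitmap}, pos + mapsize

def pointer_offsets(bitmap, word_size=4):
    """Byte offsets of the words marked as pointers in the map."""
    offsets = []
    for index, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (0x80 >> bit):     # most significant bit = lowest address
                offsets.append((index * 8 + bit) * word_size)
    return offsets
```

          A one-byte map of A0 (binary 10100000) marks the first and
          third words, that is, byte offsets 0 and 8.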

        Data Section
          The data section encodes the contents of the data segment
          for the module, addressed by MP at run-time.  The section
          contains a sequence of items of the following form:

               data-item:
                    code count_opt offset data-value ...
               code:
                    byte
               count, offset:
                    operand

          Each item contains an offset into the section, followed by
          one or more data values in a machine-independent encoding.
          As each value is placed in the data segment, the offset is
          incremented by the size of the datum.

          The code byte has two 4-bit fields.  The bottom 4 bits of
          code give the number of data-values if there are between 1
          and 15; if there are more than 15, the low-order field is
          zero, and a following operand gives the count.

          The top 4 bits of code encode the type of each data-value in
          the item, which determines its encoding.  The defined values
          are:

               0001  8 bit bytes
               0010  32 bit integers, one word each
               0011  string value encoded by utf(6) in count bytes
               0100  real values in IEEE754 canonical representation,
                     8 bytes each
               0101  Array, represented by two words giving type and
                     length
               0110  Set base for data items: one word giving an array
                     index
               0111  Restore base for data items: no operands
               1000  64 bit big, 8 bytes each
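
          The item layout and the type codes above can be sketched
          as a decoder for a single data-item.  This is not the
          loader's implementation: decode_data_item and the type
          labels are hypothetical, real values are assumed to be
          stored most significant byte first like the integers, and
          the item is only decoded, not placed in a data segment.

```python
import struct

DATA_TYPES = {1: "byte", 2: "word", 3: "string", 4: "real",
              5: "array", 6: "set-base", 7: "restore-base", 8: "big"}
SCALAR_FORMATS = {2: "i", 4: "d", 8: "q"}    # word, real, big

def read_operand(buf, pos):
    """Decode one variable-length operand; return (value, next_pos)."""
    first = buf[pos]
    nbytes, bits = (1, 7) if first < 0x80 else (2, 14) if first < 0xC0 else (4, 30)
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit set
        value -= 1 << bits
    return value, pos + nbytes

def decode_data_item(buf, pos):
    """Return (item dict, next_pos) for the data-item at buf[pos]."""
    code = buf[pos]
    pos += 1
    kind, count = code >> 4, code & 0x0F     # top 4 bits: type, bottom 4: count
    if kind not in DATA_TYPES:
        raise ValueError("unknown data type %d" % kind)
    if count == 0:                           # more than 15 values: explicit count
        count, pos = read_operand(buf, pos)
    offset, pos = read_operand(buf, pos)
    if kind == 1:                            # 8 bit bytes
        size, values = count, list(buf[pos:pos + count])
    elif kind == 3:                          # string occupying count bytes
        size, values = count, bytes(buf[pos:pos + count]).decode("utf-8")
    elif kind == 7:                          # restore base: no operands
        size, values = 0, []
    else:
        if kind == 5:                        # array: type and length words
            fmt = ">2i"
        elif kind == 6:                      # set base: one index word
            fmt = ">1i"
        else:
            fmt = ">%d%s" % (count, SCALAR_FORMATS[kind])
        size = struct.calcsize(fmt)
        values = list(struct.unpack_from(fmt, buf, pos))
    item = {"type": DATA_TYPES[kind], "count": count,
            "offset": offset, "values": values}
    return item, pos + size
```

          For example, the bytes 21 04 00 00 00 2A describe one
          32-bit integer with value 42 at offset 4.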

          The loader maintains a current base address and a stack of
          addresses.  Each item's value is stored at the address
          formed by adding the current offset to the current base
          address.  That address initially is the base of the module's
          data segment.  The `set base' operation immediately follows
          an `array' data-item. It stacks the current base address and
          sets the current base address to the address of the array
          element selected by its operand.  The `restore base'
          operation sets the current base address to the address on
          the top of the stack, and pops the stack.

        Module name
          The module name immediately follows the data section.  It
          contains the name of the module implemented by the object
          file, as a sequence of bytes in UTF encoding, terminated by
          a zero byte.

        Link Section
          The link section contains an array of link-size external
          linkage items, listing the functions exported by this
          module.  Each variable-length item contains the following:

               link-item:
                    pc desc sig fn-name
               pc, desc:
                    operand
               sig:
                    word
               fn-name:
                    string

          Fn-name is the name of an exported function.  Adt member
          functions appear with their full names: the member name
          qualified by the adt name, in the form adt-name.member-name,
          for instance Iobuf.gets.

          Pc is the instruction number of its entry point.  Desc is an
          index value that selects a type descriptor in the type
          section, which gives the type of the function's stack frame.
          Sig is an integer hash of the type signature of the
          function, used in type checking.
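
          A link-item mixes operands, a fixed-size word and a
          zero-terminated string, which the sketch below reads in
          order.  The name parse_link_item is hypothetical, and
          showing sig as an unsigned value is a choice made for this
          example only.

```python
def read_operand(buf, pos):
    """Decode one variable-length operand; return (value, next_pos)."""
    first = buf[pos]
    nbytes, bits = (1, 7) if first < 0x80 else (2, 14) if first < 0xC0 else (4, 30)
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit set
        value -= 1 << bits
    return value, pos + nbytes

def parse_link_item(buf, pos):
    """Return (link item dict, next_pos)."""
    pc, pos = read_operand(buf, pos)
    desc, pos = read_operand(buf, pos)
    sig = int.from_bytes(buf[pos:pos + 4], "big")    # 4-byte word, MSB first
    pos += 4
    end = buf.index(0, pos)                          # terminating zero byte
    name = bytes(buf[pos:end]).decode("utf-8")
    return {"pc": pc, "desc": desc, "sig": sig, "name": name}, end + 1
```

          Calling parse_link_item link-size times, each time from
          the position returned by the previous call, walks the
          whole section.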

        Import Section
          The optional import section lists all those functions
          imported from other modules. This allows type checking at
          load time. The size of the section in bytes is given at the
          start in operand form. For each module imported there is a
          list of functions imported from that module. For each
          function, its type signature (a word) is followed by a
          zero-terminated list of bytes representing its name.

        Handler Section
          The final optional section lists all exception handlers
          declared in the module. The number of such handlers is given
          at the start of the section in operand form. For each one,
          its format is:

               handler:
                    offset pc1 pc2 desc nlab exc-tab
               offset, pc1, pc2, desc, nlab:
                    operand
               exc-tab:
                    exc-name pc ... exc-name pc pc
               exc-name:
                    string
               pc:
                    operand

          Each handler specifies the frame offset of its exception
          structure, the range of pc values it covers (from pc1 up to
          but not including pc2), the type descriptor of any memory
          that needs destroying by the handler (or -1 if none), the
          number of exceptions in the handler and then the exception
          table itself. The latter consists of a list of exception
          names and the corresponding pc to jump to when this
          exception is raised. This is then followed by the pc to jump
          to in any wildcard (*) case or -1 if this is not applicable.
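
          The handler layout can be sketched as follows.  This
          follows the description on this page literally, treating
          nlab as the number of named entries in the table;
          parse_handler and covers are names invented for the
          example.

```python
def read_operand(buf, pos):
    """Decode one variable-length operand; return (value, next_pos)."""
    first = buf[pos]
    nbytes, bits = (1, 7) if first < 0x80 else (2, 14) if first < 0xC0 else (4, 30)
    value = int.from_bytes(buf[pos:pos + nbytes], "big") & ((1 << bits) - 1)
    if value >> (bits - 1):          # sign bit set
        value -= 1 << bits
    return value, pos + nbytes

def read_string(buf, pos):
    """Read a zero-terminated UTF-8 string; return (text, next_pos)."""
    end = buf.index(0, pos)
    return bytes(buf[pos:end]).decode("utf-8"), end + 1

def parse_handler(buf, pos):
    """Return (handler dict, next_pos)."""
    handler = {}
    for field in ("offset", "pc1", "pc2", "desc", "nlab"):
        handler[field], pos = read_operand(buf, pos)
    table = []
    for _ in range(handler["nlab"]):         # exc-name pc pairs
        name, pos = read_string(buf, pos)
        target, pos = read_operand(buf, pos)
        table.append((name, target))
    handler["table"] = table
    handler["wildcard_pc"], pos = read_operand(buf, pos)   # -1 if none
    return handler, pos

def covers(handler, pc):
    """True if pc lies in the range pc1 up to but not including pc2."""
    return handler["pc1"] <= pc < handler["pc2"]
```

          A desc or wildcard pc of -1 is encoded as the single
          operand byte 7F.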

     SEE ALSO
          asm(1), dis(2), sbl(6)
          ``The Dis Virtual Machine Specification'', Volume 2

     Page 7                       Plan 9            (printed 12/21/24)