Two weeks ago, I briefly introduced a way of representing Assembly Language in Haskell code, in order to illustrate the programming paradigm to which it belongs. In a move that is sure to scare off both high-level and low-level programmers, I shall now explain how a Monad implementing 6502 Assembly can be created. In real code, this would use a lot more type-abstraction, but in this article I'll be as direct as possible, under the principle that abstraction makes concepts harder to understand at first.

Since Haskell is good at functions, we will represent a fragment of assembly as a function. This function takes a code location and returns some bytes, a new location, and an arbitrary return value. This return value is what makes it a monad. To avoid confusion, we shall say that the ASM6502 produces a bytestring and returns its monadic return value.

newtype ASM6502 a = ASM6502 (Word16 -> ([Word8], Word16, a)) -- Retrieve the inner function asmFunction (ASM6502 f) = f -- Evaluate the monad and get a list of bytes asm start (ASM6502 f) = bytes where (bytes, _, _) = f start

If you understand this definition, its implementation as a Monad is pretty straightforward.

instance Monad ASM6502 where return ret = ASM6502 $ \loc -> ([], loc, ret) a >>= b = ASM6502 $ \start -> let (left, mid, val) = asmFunction a start (right, end, ret) = asmFunction (b val) mid in (left ++ right, end, ret)

Got that? The implementation of return just inserts a value into the third slot, producing no bytes and leaving the location unchanged. The bind operation >>= is necessarily more complicated, but it's not too hard to grasp if you're already familiar with monads. It creates a function that evaluates the ASM6502 on the left, uses its return value to derive the ASM6502 on the right, evaluates that, and concatenates the bytes together, moving the location along as it goes.

Now that we have this monad, let's make some 6502 opcodes! The simplest ASM statement is just to produce one byte. It moves the location forward by 1 and returns nothing interesting.

byte :: Word8 -> ASM6502 () byte x = ASM6502 $ \loc -> ([x], loc + 1, ())

Now how about some actual instructions?

ldai x = byte 0xa9 >> byte x -- load immediate value into A register clc = byte 0x18 -- clear carry flag adci x = byte 0x69 >> byte x -- A += x, with carry cmpi x = byte 0xc9 >> byte x -- compare A to x, setting status flags

In genuine 6502 ASM, the first and third instructions would just be called lda and adc, sharing their mnemonics with instructions that do the same thing but with different addressing modes. The immediate addressing mode would be indicated by putting a # before the operand. Haskell has no comparable syntax, so we've put an i on the end of the opcode name instead.

Let's move on to labels.

here :: ASM6502 Word16 here = ASM6502 $ \loc -> ([], loc, loc)

Easy, we just return the current location as our monadic return value, so that we can bind it to a variable in a do-block. How do we use labels once we've declared them?

bne :: Word16 -> ASM6502 () -- branch if not equal bne label = ASM6502 $ \loc -> let rel = toInteger label - toInteger (loc + 2) rel8 = if rel > 127 || rel < -128 then error "label given to bne is too far away" else fromInteger rel in ([0xf0, rel8], loc + 2, ())

This turned out to be nontrivial. Because bne uses relative addressing, we have to examine the current location so we can subtract it from the given label. Then we make sure the resulting value can fit into 8 bits, and if it can't, the byte is an error. It's important to restrict the scope of the error message as much as possible; in particular, the error must not be triggered while evaluating the second value of the returned triple, or we may end up with infinite dependency loops. We'll see why later on. The absolute jump opcode takes an absolute address, so it's implementation is far simpler.

jmp :: Word16 -> ASM6502 () jmp label = do byte 0x4c byte (fromIntegral label) -- Little end first byte (fromIntegral (shiftR 8 label)) -- shiftR is defined in Data.Bits

We now have enough functionality to make useful-seeming (but useless) programs.

program :: [Word8] program = asm 0x8000 $ do ldai 0 clc incloop <- here adci 1 cmpi 0xf0 bne incloop forever <- here jmp forever

This adds 1 to 0 until it's 240 and then loops forever. Turing-completeness, here we come! This illustrates the use of monadic bindings as labels. But there's a problem: we can only refer to labels after they have been declared! A real assembly language lets you use a label above its declaration, so you can jump forward, not just backward.

program = asm 0x8000 $ mdo ldai 2 clc adci 2 cmpi 5 bne universe_consistent jmp shouldnt_happen universe_consistent <- here jmp do_stuff

That is why we use the mdo keyword, which we get from the Haskell language extension RecursiveDo. With mdo, we can refer to variables before binding to them, just like that. Sweet, huh? Well it turns out the compiler needs just a little bit of our help to know how to be magical like that. What we need to do is declare ASM6502 to be an instance of the MonadFix class. This class has just one member, mfix, which feeds a monad cyclically into itself (actually, it feeds a monadic function into itself).

class MonadFix m where -- declared in Control.Monad.Fix mfix :: (a -> m a) -> m a instance MonadFix ASM6502 where mfix f = ASM6502 $ \start -> let (bytes, end, ret) = asmFunction (f ret) start in (bytes, end, ret)

It looks like there's an infinite loop in there, because the value ret is being returned from a chain of functions that takes ret as an argument. How on earth could this ever work? If you're used to Haskell you should know the answer: laziness. As long as end does not depend on the value of ret, it can be plucked out of the tuple and later used to calculate ret. This kind of recursive chicanery may be difficult to wrap our human brains around, but the good news is that once this instance is declared, we don't have to worry about it at all, and we can just go on happily writing assembly code which uses labels declared in the future.

That is, provided we don't hit the one snag with this plan. That snag is this: although the bytes produced by an ASM6502 object can depend on future labels, the location it returns cannot. This is why we needed to restrict the scope of the error message in our definition of bne, but for a clearer example, take this code.

program = asm 0x8000 $ mdo start <- here if end - start == 2 then byte 0x00 else byte 0x00 >> byte 0x01 end <- here byte 0x02

Our MonadFix instance will cause an infinite loop if we try to evaluate this program. This is not a flaw in the implementation—there is in fact no way to make this code work. It is inherently paradoxical. If the distance between end and start is 2, then one byte will be allocated, making the distance between end and start 1, but then if that distance is 1, then 2 bytes will be allocated, making the distance 2...and so on. Fortunately, GHC appears to be smart enough to recognize most of the infinite dependency loops I've been able to produce in real code, even if the loop spans hundreds of instructions. It will just print <<loop>> and exit, instead of thinking until the end of the universe.

There's one last thing I'd like to say. Previously we mentioned that we have to use opcode suffixes to denote addressing modes instead of denoting it with the operand syntax. Here is a table comparing official 6502 syntax with Haskell syntax for all of the addressing modes of adc.

adc #$80 => adci 0x80 adc $80 => adcz 0x80 adc $0400 => adcm 0x0400 adc $80,X => adcxz 0x80 adc $0400,X => adcxm 0x0400 adc $0400,Y => adcym 0x0400 adc ($0400,X) => adcxp 0x0400 adc ($0400),Y => adcpy 0x0400

Actually, we can implement just a little bit of magic to reduce the size of this table: we can decide whether to use 8-bit (z) or 16-bit (m) addressing based on the type and value of the operand. Here is an adc opcode that automatically selects either adcz or adcm.

adc :: Integral a => a -> ASM6502 () adc x = case toInteger (0x010101 `asTypeOf` x) of 0x0001 -> adcz (fromIntegral x) -- two bytes, three cycles 0x0101 -> adcm (fromIntegral x) -- three bytes, four cycles _ -> if toInteger x <= 255 then adcz (fromIntegral x) else adcm (fromIntegral x)

This relies on the fact that integers are truncated when you store them into a smaller integral type. We cast a special number to the same type as the argument to see how many bits get truncated, and select the opcode accordingly. If the argument has a large type, we assume that the programmer typed it directly (as in "adc 0x0400" as opposed to "adc label"), and select the smallest opcode we can, and only in this case does the length of the ASM fragment depend on the argument. Most importantly, if we use a 16-bit label defined in the future as an argument, we always use 16-bit addressing, without examining the value of the label, thus avoiding an infinite dependency loop. But if we were to cast a future-label to a wider type, and give that to adc, then we would end up with a loop.

We could create a typeclass to get this information from the argument's type instead of relying on integer truncation. However, introducing a custom class would prevent Haskell's default numeric types system from kicking in, forcing the programmer to be explicit about the type of every operand. Instead of saying "adc 0" they'd have to say "adc (0::Integer)" and at that rate they're rather just go back to "adcz 0" instead. My previous workaround was to use the Bounded class (which does not prevent numeric defaulting); the drawback being I'd have to declare Bounded Integer, even though Integer is supposed to be arbitrary-precision. Let's just say I'm glad I figured out this new method instead.

In conclusion, we have just implemented an example assembler in Haskell, with Haskell's own syntax (plus one extension). A real implementation would have more abstraction, more instructions, statements for manipulating the location, a more efficient way of concatenating bytes, better debugging support, and perhaps a way to arbitrarily annotate the bytestream in order to let the programmer perform static safety checks. I am actively working on such an implementation, though it currently could use a bit of reorganizing, as my coding standards have changed since I started it.