|I've attempted several compilers which implement some of the ideas presented here. You can download some of them to try for yourself. You'll find the download links at the bottom of the document. But don't rush down there straight away. The philosophy behind them won't make much sense unless you at least skim through what I have to say.|
Introduction: Why bother? What's wrong?
Any program can be written in any reasonable computer language so why do we need a different language for robot-programming?
Robot software is so fascinating because it must be an integrated whole that operates at very many different levels. Other software applications aren't like that. It must operate at levels from the bare metal up to, say, neural-nets, vision or knowledge-based-reasoning algorithms. It must include parallelism because a robot must keep track of many things at once. And it should be able to describe a system which operates over several processors.
Robot progamming requires us to consider both Flow of Execution and Flow of Data and there's currently no programming language that adequately covers both.
We tend to use laguages adopted from other domains because there's no specialist robot language. Those languages are never ideal and we end up kludging together systems from disparate parts. This document is a long discussion, or perhaps rant, about why current languages are unsuitable and what sould be done to make a better one.
I start with the basics: a language is a Human-Computer-Interface. In language design, the prime consideration must be the difference and overlap of what humans find easy vs what computers find easy. The semantics of the language should match the way both people and computers think. The syntax should encourage them to write good programs. Should the syntax be a visual programming language or a text-based one?
A robot program could be very large. How do people successfully write large programs? How can the language help them.
Robot programs must operate at several different levels - "low" to "high". What are the best syntax and semantics for the different levels? The different levels are performing many different tasks. How should those tasks be managed using multi-threading or multiple processors? How should the tasks communicate. How can the language support that?
People rarely know what they're doing when they write robot software. The language and development environment must support "hacking" and interactive debugging on what may be remote processors.
Are mobile robots different from assembly robots? I think they currently are (although one day they will merge). How should the languages differ for the two tasks.
So overall, this document is decribing the Ultimate Robot Language, its syntax and semantics. (Well, perhaps not Ultimate but, at least, very good for small to medium-sized robots.) It should run on small very processors as well as large ones. How should it be implemented?
You can download this document as an RTF file if you'd prefer.
A completely different document dicusses the ideal language for small educational robots. No-one else seems to find it an interesting problem but I think it's fascinating. As far as I know, it's not a question that has been answered, or hardly even asked, in any book or website.
1. A language is an HCI Device
[You can safely skip this section]
Programming languages are made for people - not for computers (or, at least, good ones are). So a good programming language embodies the same "concepts" that people use. If the language doesn't do that then you have to build your house out of much smaller, simpler bricks and people aren't good at that. Imagine building a skyscraper out of clay, limestone and scrap iron. It would never get finished. But when the builder can go to a merchant and buy bricks, cement, girders and rivets, it's straightforward. If a language doesn't embody the same "concepts" that people use then that's a failure of the language, not of the people.
Robots require programming in several different domains or "levels". The semantics should be tuned to match the domains. In a well designed language, syntax reflects semantics. It should be obvious what the semantics of any (syntactic) statement is. It should be easy to do the things that are semantically appropriate (to that specific domain) and hard to do the things that are semantically inappropriate. The semantics should be tuned to match the domains; if the domains are different then the semantics is different so the syntax is different.
Of course, any sufficiently powerful syntax can be used to express any semantics but that isn't enough. The syntax should make it easy to do the sort of things you want to do or ought to be doing.
As I keep emphasising, programming is something that humans do. I've no doubt that when C3P0 writes a program, he finds it equally easy to look at it as an FSM or as a C program or as hex code. But real programmers are people and people don't find it equally easy.
1.1 Hierarchical Decomposition
[I you already believe in heirarchical decomposition and encapsulation then skip this section]
Every large project since we left the stone age has been completed only because it was managed using heirarchical decomposition. From the pyramids to Windows. (Allegedly, Windows is the largest "thing" ever produced. The person-hours certainly exceeds building the pyramids or the Great Wall of China.)
Heirarchical decomposition is the only way that large projects can be completed. And by "large" I don't just mean something the size of Windows. Any program over a thousand lines is "large".
Robot programs ought to be "large". I know it's difficult to get a robot to, say, balance on two wheels but any passer-by would say "that's neat but so what?". I'm impressed by the Darpa's Grand Challenge winners. How many lines of code did they contain? 100k? 1M? It was only possible to write those programs using modern languages. Without heirarchical decomposition it would be impossible for people to write them. Note the word "impossible" not just "a lot harder".
Some languages "scale well" and some don't. Forth, Finite-State-Machines and Lisp have been used to program robots but they don't scale well. C, Pascal and Java scale better.
What does that mean?
As the program gets larger and larger, we find it harder and harder to keep control of the project. Eventually, we find it impossible and the project fails. That failure will happen sooner if you view your program as a FSM or write it in Forth. That's one of the faults I'm talking about when I say that some languages don't scale well.
Saying "it doesn't scale well" is shorthand for "this approach to programming works well for small systems and for the examples you've chosen but, although it shows initial promise, I believe it won't work for larger 'real world' projects". Scaling is important. The history of computing in general and AI/robotics in particular is full of good ideas that worked well with small problems but whose concepts couldn't be applied to large "real" systems.
Heirarchical decomposition helps with scaling and so does encapsulation.
Encapsulation is about hiding information. You divide your program into boxes Most of what's inside the box is hidden. Only a few things are visible on the surface. If boxes can contain other boxes then that's heirarchical decomposition.
In Basic, all variables are global you can use a Goto to jump anywhere, even to inside a loop or function. In C a function is a box and local variables are not visible from outside the box. You cannot Goto the inside of a box. How the internals work is hidden. Only the function parameters are visible on the outside.
In C, functions are global. They cannot be heirarchically decomposed. In Pascal, functions can contain funtion definitions. In Delphi Pascal and C++, objects are a form of box. Some methods and variables are hidden, some are visible from outside.
Hierarchical decomposition and encapsulation needs more than just procedures that contain procedure calls. All languages have that. It's about lexical nesting which leads to information hiding. Encapsulation is also about information hiding. And information hiding helps us control complexity. Controlling complexity helps us complete large projects.
The human brain has limited workspace. You can only juggle about seven ideas at once. If what you're trying to understand doesn't fit onto a single screen tyhen you're in big trouble. Heirarchical decomposition lets you split your big problem into smaller parts each of which does fit onto a single screen. Being able to look at a program and quicly understand what it does is essential if the program is ever to be maintained. (Programs have to be maintained even as they're being written.)
So the Ultimate Robot Language must encourage heirarchical decomposition and encapsulation as an aid to scaling.
1.2 Object Oriented
[Object Oriented design is good for some problems but I don't think it's essential for robots. Skip this section if you're not interested in Object Orientation.]
Object Oriented design beats subroutines when you must have polymorphism and inheritance. That's all they're good for. They are used for encapsulation and heirarchical decomposition but nested funtion declarations can do the same. Do robots need polymorphism and inheritance?
I use them all the time in Delphi and they're great because 99% of the objects I use are associated with something that appears in the user interface. I drag a button into the window and an object appears in my program that I can talk to. How useful is that in a robot? Sure, I could have an object that corresponds to a pushbutton or Sonar but does that make programming easier? Easier than just a library of subroutines? I haven't found it so.
Objects are wonderful in their place. I just don't think their place is in small robots. If I'm writing a Windows business application (yuk) where, say, clients arrive, have transactions and depart then Objects would be the way to do it. But robots aren't like that.
I've tried both development paradigms and I can find no outstanding advantage to OO design in those domains where polymorphism and inheritance are irrelevant. There is no advantage to writing ComPort.Open vs Open(ComPort).
I write comms protocol code for RS232 then I decide to change it to work with ethernet then would Object Oriented design be useful? If the comms object for RS232 and the comms object for ethernet were both descendants of a generic comms object then the protocol code wouldn't need to know what it was talking to. So it seems obvious that Object Oriented design makes life easier.
But compare that with using non-Object Oriented C or Pascal. I have a record (struct) which holds the data required by an RS232 or ethernet port. The protocol code uses that record but never looks inside it. It just knows that it can pass it to functions called Open and Send. The code looks much the same as Object Orientedcode and is as easy to write. But it's more efficient to implement.
So what use is Object Oriented code? It's useful when you don't know until run-time whether you're using RS232 or ethernet. And that never happens in robotics.
1.3 Visual vs textual languages
[This is a more interesting section. it introduces the important distinction between Flow of Execution and Flow of Data.]
Sould you program by dragging and dropping pictures or by writing text?
Robot progamming requires us to consider both Flow of Execution and Flow of Data. I think those two was of looking at the program should be dealt with by separate, different syntax and semantics. The arguments for visual vs textual are different for those two domains.
Imperative programming languages concentrate on Flow of Execution. Analogue computers (and their simulations on digial hardware, e.g. for servoing) concentrate on Flow of Data. Much business software concerned with "transaction processing" is defined by Flow of Data. Subsumption neatly separates Flow of Execution and Flow of Data. Vision processing software is defined by both Flow of Execution and Flow of Data. Etc. etc.
1.3.1 Flow of Data
The Flow of Data part of the program is best described by a visual language. When people talk about Flow of Data, they immediately start drawing boxes and arrows. Data flows along the arrows. The boxes are processes that produce, store, process or consume data. It's natural to draw boxes and arrows and that's what the language should allow you to do.
When you're considering data, you want to ask two questions: Where is this data used? Where does this data come from.
It's possible to define data flow using a textual language. For instance, the procedure call ShowData(b,c) says that the data comes from b and c. ShowData could be the name of a box in a boxes-and-arrows data flow diagram. Unfortunately, a list of "connections" like ShowData(b,c) doesn't tell me where an item of data is used. I must search through all the "connection" definitions to find out.
So I believe that Flow of Data is best described graphically. Graphical Flow of Data boxes can be nested so graphics can do hierarchical decomposition.
1.3.1 Flow of Execution
Every time I see a graphical language, I'm convinced that it's the only way to program and why aren't all languages like that. Then I try to program using it and after 5 minutes I'm ranting about who thought of such a stupid idea.
A "visual" language is exactly equivalent to a textual language. There are statements and they are executed in sequence. "Visual" languages are very useful for beginners because it is much more obvious what you can and can't do. If the blocks (i.e. statements) are shaped like jigsaw pieces then it's obvious what is and isn't syntactically correct. With a good visual language you can enforce syntax and encourange good programming practice.
But, after a few minutes use, you find they're slow and clumsy. Typing if faster. Find and replace is easier. Cut and paste is faster. It's all faster and easier using text.
One of the robot-language compilers I've written while trying to solve this whole simple/complex dilemma has buttons that paste in statements for "Go Left", "If button pressed then begin...end", etc. so it looks like a graphical language interface. But all it's doing is pasting syntactically-correct program fragments together. At any time, the user can type in the textual window that's being written. (It also has the style of multi-tasking used in RIS.) The idea is that programmers of any ability can use it. I can't yet say how successful the idea is.
2. Levels of software
[There is no such thing as "robot programming". Very different kinds of programs are needed in a robot. Skip this section if you already agree.]
There is no such practice as "typical" robot programming. There are three rather different kinds of programmer:
"High Level" Type 3 People won't be using a small computer. They'll mount a PC on a big chassis and have as much processing power as they can afford. We can forget them. They need C or Lisp or Prolog or Haskell or whatever is the fashionable languge this month. They are well-served by conventianal programming languages and we needn't consider them further. I'm more concerned with the Type 1 and Type 2 people. The Type 3's "Big Program" will talk to the lower-level Type 1 and Type 2 programs.
So that leaves Type 1 and Type 2 people.
Both could use C or Java or Fortran or Pascal or Forth or whatever, but none are ideal. Why?
"Low Level" Type 1 people find those languages are a bit clumsy. A circuit-diagram on the back of an envelope is better. The user interface should look like that. C, Java, etc. are sequential languages and there's no real concept of sequence in an analogue circuit.
"Mid Level" Type 2 people need multi-threading because you're often juggling several small algorithms at the same time. C, Java, etc. can be made to be multi-threaded but it's a real kludge.
"Low level" and "high level" here don't correspond with low-level-languages and high-level-languages. Assembler is a low-level-language but it's not suitable for writing a PID controller. The syntax for a language to describe servoing corrseponds to a semantics that involves arithmetic expressions and concurrent processes that are continually running. A FSM might be the ideal way to describe a wheel encoder or a comms protocol. For,say, route-finding, I want a completely different semantics and therefore a completely different syntax.
If the semantics is different in the different layers then the syntax should be different too. Otherwise it's confusing. It would be better if there was one tool that did it all. We need fast, simple stuff at the bottom and more high-falutin' stuff at the top. Currently, people use different tools and languages and chips and computers for the different parts.
Low level robot control programs are often stream processing problems, similar to audio and video stream processing. Too many robot-control languages ignore flow-of-data and concentrate instead on flow-of-execution. Flow-of-data tends to be most important at the lower levels. If I wanted my robot to be able to plan its behaviour, a traditional flow-of-execution style of language would be appropriate. I am such a fan of subsumption because if separates flow-of-data from flow-of-execution and uses different syntax to describe each while keeping them both in the same "system".
The Ultimate Robot Langauge and IDE would have rich features for describing Flow of Data and Flow of Execution at all the different levels.
Different levels tend to operate at different speeds. The lower levels often need to operate quickly in real-time. A robot repairing a motorcycle engine needs fast servo loops to be able to fit the carburettor and do up the nuts but it can sit back and have a cup of coffee while it ponders whether it should really have bolted-on the cylinder head before fitting the gasket.
Low level code corresponds to "Foreground tasks" that are executed in a timely manner and have a guaranteed response-time. "Background tasks" use up the rest of the execution cycles. I suspect that's a flawed model. I would say that if you have a time-critical task then you dedicate a chip to it. Chips are cheap. But the language is describing a single system and so should be able to deal with multiple processors.
Robotics is different because a robot has to operate at all levels: low to high.
3. Boxes and Arrows
[Finally, we're getting to something interesting.]
Robot progamming requires us to consider both Flow of Execution and Flow of Data.
The best representation consists of Boxes and Arrows. The Arrows show where the data flows. The Boxes contain prosesses that massage the data.
Consider a walking robot. Data flows from touch sensors on the feet, from force sensors in the muscles, from ultrasonic obstacle detectors and from a camera. It is processed to servo the leg joint angles, to set the overall body attitude, to set the individual foot placements on rough ground and to navigate over a long distance.
All these things are going on at the same time so each box is a separate thread.
Data enters and leaves a boxes through inpu and output ports on its surface. A box could contain other boxes so, inside the box, the data could be routed to other, smaller boxes.
But usually, a box contains a process: a thread.
But how do these threads communicate and coordinate? And what sort of process? It might be low-level or high level. That's what this section is about.
[A description of subsumption and its two Big Ideas.]
Subsumption is different. It took me lots of re-reading all Brookes's articles to understand what he was talking about but I think I eventually sorted it out. (I don't find his style easy to follow - too philosophical for my taste.)
Brookes draws Boxes and Arrows and they all look very similar. It took me a while to realise that inside the "Behaviour" boxes, the arrows represent flow of control and the interior boxes are "states" making up a FSM; but outside the "Behaviour" boxes, the arrows represent flow of data. In some ways, the meanings of the arrows and boxes are exactly the opposite depending on whether you're looking at Behavious from th eoutside or inside.
I found it hard to understand his programs because the syntax of the two levels was very similar but the semantics was completely different. Brooke's architecture definitely has different semantics between layers and so ought to have different syntax. My choice would be: graphical arrows-and-boxes between Behaviours; textual "statements" within Behaviours.
For me, Brooke's breakthrough was that he showed how to keep the different levels separate but also integrate them into a single system.
3.1.1 How Behaviours Work
Each "Behaviour" runs independently of the other. All Behaviours run all the time. Each Behaviour runs as a single thread. There is no multprocessing within a Behaviour, only between Behaviours.
Inside a Behaviour is an ordinary small program with variables, if-statements and assignment. Brookes calls it an "augmented finite state machine", AFSM, and draws it as a finite state machine with boxes and arrows. But that's just his preference.
An AFSM is exactly equivalent to a standard procedural programming language. They are simply a different kind of representation. I dislike the FSM representation for the reasons given below so I would replace the FSMs with a conventional programming language.
If a Behaviour is servoing or implementing a PID control then you might want to implement it as an analogue computer (or, rather, as a digital simulation of an analogue computer). If a Behaviour is implementing a quadrature encoder or a comms protocol then you might want to represent it as a FSM. So the IDE could allow a different surface representation of a behaviour. But, underneath, a Behaviour is always just a program.
That programming can use the traditional tools of hierarchical decomposition to make it easier to build large systems.
3.1.2 How Arrows connect Behaviours
Inside a Behaviour is an ordinary small program. Outside Behaviours, arrows pass data between the boxes.
Brookes says that Behaviours pass messages to each other. So the arrows show the paths that messages take. Messages aren't queued - old ones are simply discarded (actually, that varies between implementations and papers and he never explains why he changes). Personally, I think you shouldn't use messages; just "publish" a numerical value on the outside of the behavious and let other Behaviours read it.
So far, that's perfectly standard and doesn't require a whole new name like "subsumption".
What was innovative was how you deal with two Behaviours which are both trying to control the same hardware. One Behaviour outranks the other. So if one it "active" the other's output is simply ignored. (How that's achieved varies between his papers.)
Let's say one Behaviour is "head towards the light" and the other is "when you hit a wall, back up and turn away". Both Behaviours control the wheels; both tell the wheels to do different things. Previous architectures said that the robot was in one "mode" or the other: one Behaviour or the other was running. Brookes says that both Behaviours are always running but the "back up and turn away" Behaviour outranks the "head towards the light" Behaviour.
So a message can take the special value "Ignore Me". The lower ranking message only gets through when the higher ranking message is "Ignore Me".
It seems like a pretty trivial difference and it took me a long time of examining his detailed descriptions of various implementations to see why he thought it was important.
I'm still not 100% convinced - because it IS such a trivial difference. But every time I've tried it as a design philosophy, it's worked very well indeed.
The same trick can be used to control what data goes into a Behaviour. A Behaviour may think that it's getting raw data from a sensor while, in fact, its input stream has been hijacked by another Behaviour. Or, of course, an input could be the sum of the raw data and a control signal: that's how feedback-loops have been controlled for many decades.
A box could contain other boxes, inside the box, data can be routed to other, smaller boxes. So, once again, hierarchical decomposition makes it easier to build large systems.
3.1.3 Can Subsumption do it all?
Brooke's philosopy deals very neatly with Type 1 and Type 2 programming but how does it square with Type 3 (high level) programming?
That has always been my difficulty with subsumption architecture. It's fine for Genghis who just bumbles around but if you've got a job to do that's more complex than collecting beer cans, how is that integrated with the subsumption stuff? Of course, that's the question that all the critics of subsumption have asked. I'm a fan - not a critic.
I think that the answer is that your high level program is simply another of the Behaviour boxes.
3.2 Finite State Machines
[I dislike FSMs for anything but trivial algorithms and here's why. Skip this section if you also dislike FSMs]
Brookes says that a Behaviour is an "augmented finite state machine", AFSM. He draws it as a finite state machine with boxes and arrows. For Brookes, subsumption is state machines with interprocess communications. Most designs were implemented in Lisp but they were conceived as FSMs: he explicitly refers to them as such.
I don't like FSMs for anything but the most trivial algorithms.
If all you want to do is code the bottom-most levels of robot software then FSMs are useful . (Although for anyting that falls under the broad heading of "servoing", you need a analogue computer simulated by a digital computer. Servoing code doesn't really have "states".)
Of course, in theory, a FSM is equivalent to a Turing machine but, to be a Turing machine, it must have an infinite number of states. More than a couple of dozen states is almost impossible to represent in any sort of human-readable way and leads to dreadful spaghetti code.
FSMs are for tiny algorithms. They aren't a good way of specifying large algorithms. And many robot projects need large algorithms.
State machines throw away all the structured programming, encapsulation and hierarchical decomposition that has been developed in the last fifty years and that are undoubtably part of your Favourite Programming Language. Hierarchical decomposition is the _only_ method of handling complexity that humans have developed in the last 10000 years. No-one is going to throw that away are they?
So: FSMs don't scale well. So FSMs are good for some tiny, very low-level algorithms.
FSMs are the equivalent of a language whose only control structure is a GOTO. Students can write tiny programs using GOTOs and get away with it. But no-one does that for large programs.
Is an FSM better or worse than BASIC for a tiny hundred-line program? I think they're much much the same. In fact, if the FSM has 100 states, I'd personally use BASIC-like syntax to describe it rather than a state diagram. For a thousand lines (or states) I'd be forced to use at least Basic. For 10,000 lines: Pascal or C. For 100,000 lines: C++ or Delphi.
My experience in robotics is that FSMs are a clumsy way of representing, for instance, sequences of actions. The "FSM" representation is particularly clumsy when there are FOR loops and sequences of act-delay, act-delay, act-delay. I have a walking robot with three motors controlling four legs. The walking sequences for forwards, backwards, forwards-turning-left and forwards-turning-right require subtle differences in the timing of the motors. If it's walking forward and the right IR obstacle sensor sees an obstacle, the response is
I can express that in Pascal very easily because I have subroutines which call other subroutines. That's hierarchical decomposition and it's very clear. It executes as a single thread. The equivalent as a single FSM is nowhere near as clear and the representation is huge. If I split it into several FSMs that trigger each other then the user's representation spans several pages and the sequential nature of the actions is not as clear. Plus the processor has the overhead of multitasking. Getting a robot to walk backwards is a trivially simple programming exercise using Pascal whereas FSMs over-complicate it by requiring multiple threads and message-passing. I usually want my robots to do something rather more complex than "back-up-and-turn when you see an obstacle".
4. Multi-threading / Multi-processors
[It's prbably worth reading this section]
I've chose a architecture like Subsumption.
Subsumption separates Flow of Execution and Flow of Data. The outer layer is Flow of Data. The data flows between Behaviours. Inside a Behaviour, all we're concerned with is Flow of Execution.
Each "Behaviour" runs independently of the other. All Behaviours run all the time. Each Behaviour runs as a single thread.
4.1 Multiple processors
Several Behaviours could be running on the same processor. It's fairly easy to get even a small processor to do that.
But processors are cheap. Realistically, a cheap processor can only do one time-critical thing. If there is some task that it must respond to within a short time then it cannot reasonably do another time-critical thing. Even an expensive processor, like the one in your Windows machine, has difficulty controlling a real-time process.
So my philosophy is one time-critical task, one processor.
Behaviours can be implemented as many threads running on the same processor. Or as many threads running on many processors. Threads are threads and processors are processors. Threads are logical things which are part of the software. They can be implemented any way you like in hardware.
It would be nice to have: one source, one compiler, several target processors.
The processors might be of different kinds. It would, of course, be hard to write a compiler with multiple different targets but the processors needn't be all that different: thay could all be PICs with the same instruction set, just different memory and internal peripherals.
Programming languages are usually designed for single-processor systems. A word-processor is the archetypical example. Multiple processors and multiple threads are ideas bolted-on as an afterthought. Robots just don't fit into that mould because the software must work on so many different level simultaneously.
Occam had the concept of multiple processors and multiple threads in a single processor built into the language. But it has been pretty thoroughly ignored in subsequent languge design. But even Occam assumed that the processors are all the same.
Mid and High-level routines could reside mostly in the big brain, but bits of code could be downloaded and executed on a micro-controller when necessary (say because of latency concerns). All of this should be as transparent to the programmer as possible.
4.2 Inter process comms
How do Behaviours (i.e. processes) communicate? Traditionally, the answer is either
For small robots, I don't like (1). Queues can overflow and crash the system. And what's the point of old messages? If the message is old, it's probably out of date.
I don't like (2). You don't want a fast Behaviours to wait for a slow one. What the fast Behaviour is doing may be critical. What if Behaviours run at different speeds so that one produces data at twice or half the speed that another consumes it?
So I prefer is that a Behaviour "publishes" data. A Behaviour is a closed box - other processes can't see inside - but the box has some values (the public data) exposed on it's surface. Other Behaviours can read the data but can't alter it.
In the system I implemented, public data was always a single byte. But if a piece of public data is several bytes long then reading and writing data must be "atomic" operations (i.e. a context switch can't happen in the middle of reading or writing them). Which is (3) of course.
5.0 The ultimate language
[So, finally, I describe my dream robot language.]
Could there be a language and IDE that does all that I've proposed? I think yes.
Lets assume we're making a moderately powerful mobile robot. Most of the low-level control is done by PICs (or AVRs if you prefer) but the high-level vision processing and path-planning is done by an on-board PC running Linux or Windows.
Development work is done on a separate development PC which runs the IDE.
The IDE allows the user to specify Data Flow in terms of Arrows and Boxes. Boxes can contain other boxes but, at the lowest level, a box is a process: a Behaviour. Some Behaviours run on the PICs, some run on the PC.
The IDE doesn't know much about the Behaviours running on the robot's PC. All it knows is that they send and receive data as internal messages. On the robot's PC, there is a standard "router" program; the IDE gives it a setup file that tells it what messages to send to which programs and to which COM ports. That's all the development PC needs to know about the robot's PC. So from now on, we'll forget about the robot's PC.
The IDE contains perhaps three programming languages:
The "analogue computer" is simply a graphical front-end for the textual language. It's a straightforward piece of syntactic sugar that we needn't discuss further.
The FSM editor and compiler are easy to write. They must accept messages in the same format as the textual language. Once again, it's straightforward to write such a compiler so we won't discuss it further.
The tricky one is the textual language. If you can get that right, the other two are easy.
5.1 Language Syntax
Firstly, I think C is not a well-designed language. It's hard to write a good compiler for C, especially for a microcomputer. Pascal is somewhat better and a good Pascal compiler will produce code that's perhaps 10% smaller and faster than a good C compiler. Java is Pascal with curly brackets. It has all the advantages of Pascal but doesn't frighten C programmers.
So the surace syntax of the language should be similar to Pascal or Java. From now on, I'll assume it's Delphi Pascal but you're free to imagine Java.
The Java language itself is compact but its run-time-library is a bloated monster. so I'm simply considering the syntax of the core language.
The language has all the normal advantages of Pascal. It has various ways of encouraging hierarchical decomposition. It discourages spaghetti code. I demands strong typing. It performs the majority of bounds checking at compile time. It ver strongly discourages free pointers or pointer arithmetic.
Syntactically, the language is single threaded. It will be multi-threaded so far as the Data Flow layer is concerned but that's not part of the textual language. I suppose that the language could contain PAR and ALT as Occam did. (In a PAR block, the statements are executed in parallel. An ALT block is like a switch statement; each alternative was guarded by a boolean expression. The process would wait until one of the alternatives became true; it would execute that statement then exit the block.) But I would say that they're not necessary. PAR and ALT can be done by the Data Flow layer.
Should the language contain Objects?
Objects allow inheritance and polymorphism. Does a robot need those?
I can see that inheritance is nice syntactic sugar. It's fairly easy to implement and not costly at run-time.
Polymorphism is needed when you don't know at compile-time what type of data you'll be dealing with at run-time. That doesn't happen in robots. Polymorphism requires late-binding (dynamic-binding) which has very high run-time costs. I don't believe that polymorphism is required or desireable.
Should the language include records and pointers and dynamically allocated Objects? I think not.
Records (structs) and pointers are useful for building linked-lists, trees, etc. Mid- and low-level code on a robot don't need those features.
Dynamic data has been an integral part of AI robotics for decades of course but most robots don't need it. Dynamic data is useful for constructing large search trees and 90% of AI is about searching. Mid- and low-level robot code doesn't do that kind of thing. As Brookes said: "Elephants don't play chess". Perhaps some of the very high level code in the on-board PC might do that kind of thing but that code can be written in standard C or Pascal.
Dynamic data requires a heap. Heaps can overflow. No robot can wait for garbage collection.
So the language will not contain any way of creating dynamic data. There will be no New or Create.
But records (structs) and objects can still exist. But they can only be declared as variables at compile time. In other words, you can create a tree once but you can't modify or destroy it.
Should the language allow recursion? No.
Programmers use recursion when they're searching a tree. Mid- and low-level robot code doesn't do that kind of thing. Recursion isn't needed.
If you don't have recursion then you don't need a stack. If you don't have a stack, it can't overflow. If all variables are static then it's (usually) easier to acces them on a PIC.
Var parameters are hard to implement efficiently. It needs a lot of code to Dereference a pointer on a small processor. It's far better to allow Value Result paremeters.
Macros are faster at run-time than called procedures (but, of course, they take more code space). The user should be able to define macros much as they define procedures. Any procedure that is called only once can automatically be treated as a macro.
(Some people ask why not just use Forth? No. Forth has lousy syntax, spaghetti jumps, lack of encapsulation and is so opaque it's almost a write-only language. Every modern language design book will disparage it for those reasons. Plus it's also not well suited to micros but neither are C, Pascal, Java, etc. etc. On the other hand, Forth has excellent interactivity for th eprogrammer. I'll return to that in a later section.)
5.2 Boxes and Arrows
The Data Flow layer uses a graphical syntax of Boxes and Arrows.
The Arrows show how data "moves" from one Behaviour (Box) to another. If the two behaviours are running on the same processor then nothing actually moves. The source Behaviour publishes the value of the data on its surface and the recipient Behaviour can read but not alter the data.
If the Behaviours are on different processors then the data must be transmitted from one to the other.
The graphical syntax of Boxes and Arrows must allow the user to specify which Behaviour runs on which processor. Perhaps you draw a dotted box around all the Behaviours that are on the same processor. Or perhaps when you nest Data Flow boxes, a container box can specify a processor.
The programmer must also specify how the processors are physically connected together. The physical topology might now correspond to the topology of the boxes and arrows. Each connection might use a different technology.
Robots often contain "Foreground" Behaviours that must be executed in a timely manner and must have a guaranteed response-time. The "Background" Behaviours use up the rest of the execution cycles. I haven't really thought about how that would be specified or implemented. Perhaps for just one Behaviour per processor, you can specify what is the minimum time cycle time for executions of a particular loop.
The Data Flow layer can define global types and constants. For instance, if I change my definitions for the message-format for inter-processor comms, I want the IDE to recompile all the code that's affected (and tell me which chips to reprogram).
There can be no Global variables but special Constant Behaviours can be created which simply emit a constant value.
So I believe that Flow of Data is best described graphically. Graphical Flow of Data boxes can be nested so graphics can do hierarchical decomposition. But can the boxes also have inheritance? I can't see why they should. Although, if a new box as the same external ports, it could simply be slotted into the space left by an old Behaviour.
I think, but I'm not sure, that all messages should be the same size: a byte, a word or a longint. That way, any Behaviour can be attached to any other behaviour.
I've not yet come across a situation where I needed a large message but perhaps it should be allowed in exceptional circumstances. In general, If you're dealing with complex data then it's all done within a single behaviour.
Perhaps there is some good reason why a process is better implemented by two parallel threads which pass complex data back on forth. So the prorgammer uses the multitasking that's built in to the system. But I can't think of an example.
[Skip this section unless you're the sort of person who likes writing compilers.]
As discussed above a language for low-level and mid-level robot programming does not need a stack or a heap. Recursion and dynamic allocation are not allowed.
Splitting a system into concurrent Behaviours and assigning them to processors is straightforward and simply involves getting the housekeeping right. Task swapping must be fast because it will happen a lot but it should already be more efficient thanks to the lack of stack and heap.
So the question is, how easy is it to write a compiler that creates efficient code with a small footprint?
The compiler must produce native code. A desire for interpreted i-code is reasonable. It's a virtual machine that can be emulated on any platform so the problems of cross-platform development are reduced. A virtual machine also makes it easier to implement parallel threads. But interpreted i-code is slow and many of the things a robot does are time critical.
6.1 Stacks and Variables
[A discussion of why stack and heaps are bad and should be discarded. If you already agree, skip this section.]
If you want a reasonably rich language and a small footprint, you should look at the languages of the 60s and 70s. Computers were smaller but people were building large systems. Look at Fortran, Algol, Coral, BCPL, Occam, etc. They were perfectly useable languages which had tiny run-time overheads.
Ideally, the language should run well on a range of platforms. Java was meant to be like that but it failed. When it first appeared, it was meant to run on everything from a mainframe to a toaster. But it now seems that your toaster should have a megabyte of RAM. Java and Pascal treat all objects as dynamic. Java has an automatic garbage collector, Pascal has explicit deallocation. They treats the stack as dynamic. That is a mistake in a small processor with limited memory.
Removing the stack and heap so that all allocation is static will reduce language's overhead to nil.
Static allocation does not imply poor allocation, it just means no recursion and no dynamic linked-lists or trees or whatever. I've never needed recursion or dynamic allocation in a mobile robot. Static allocation makes it trivially simple to implement has muti-processing (because the context blocks are minute).
Old-time Fortran treated all data allocation as static. A good Fortran compiler will analyse the subroutine calls and work out which local variables cannot be used concurrently. That's exactly what's wanted for a small processor with limited memory.
Stack frames slow down function calls.
Coral had a similar syntax to C but no stack frames. It is perfectly possible to write something that looks superficially like Java or Delphi Pascal but where everything is statically allocated. (Because Java is a better designed language than C++, it should compile to smaller code. Pascal is even better in that respect.)
if there is no stack, how do you call and return from a procedure? Every return address is stored in a local variable owned by the called routine. So a procedure call is: store pc in ret_addr; goto procedure. And a return is: load pc from ret_addr.
Every modern programming language uses a stack but that's not to say that every modern programming language _should_ use a stack. I've written lots of compilers and, by choice, I'd always use a stack. Stacks make life easier for compiler writers. Writing a compiler using efficient static allocation is very much harder.
The compiler could easily spot when recursion is used (usually never) and use a stack there. But my preference would be simply to ban recursion.
It is quicker to enter and leave an interrupt handler if the handler does not have to set up and destroy a stack frame.
Stacks can be useful for any nested calls, not just recursive ones. They automatically make the most efficient use of the available memory. If variables are statically allocated then it's possible to do almost as well or even better but it's much harder work at compile-time. The compiler can tell what variables are required when. By analysing the lifetimes of variables, it's possible to re-use memory and the overall memory use can be smaller than if you had a stack (because you don't need stack-frames). A compiler with a good register allocation algorithm ought to be producing better code than a human writing in assembler. Statically allocated variables run faster on most processors and it's much easier to implement parallelism.
And what of the heap?
I'm more of a Delphi expert so I'm prepared to assert that dynamic memory isn't a requirement for most Delphi programs. It's simply a detail of current implementations. In the vast majority of programs, all the allocation of objects on the heap is done at initialisation. (I believe that the same is true of Java.) So, in fact, it could be done statically at compile-time. (Delphi also has dynamic strings and arrays which use the heap but few robot programs need that sort of thing.)
That would be true of the vast majority of programs that run on robots. It has to be a very complex robot algorithm to need dynamic allocation: perhaps speech processing or playing chess.
So the compiler ought to work out at compile-time what objects are needed and allocate them statically with fixed addresses. The code would be smaller and run faster. The footprint would be vastly reduced.
6.2 Inter-Behaviour communication
Message passing within a single processor is easy. The sending Behaviour simply puts the "message" in a public variable and other Behaviours are free to read it.
All messages are (probably?) the same size: a byte, a word or a longint. Writing to and reading from a message must be "atomic" operations so that the receiving Behaviour never sees "half" a message.
Message passing between processors is done by sending the byte(s) along a wire to the other processor. In the receiving processor, the value is published for Behaviours to read. There must be a small array of such published values because several Arrows correspond with the same wire. So each message going down the wiire is preceded by a byte saying which array element to write it into.
The word "wire" includes any sort of medium: RS232, I2C, ethernet, IRDA, Zigbee, etc. You name it.
Messages are never queued. Why would you want to queue messages? What would you do if the queue overflows? Old messages can just be thrown away - if the receiving Behaviour hasn't responded to them yet then they're probably out of data already.
Behaviour X never waits for a "message" from Behaviour Y. (I realise that there are algorithms which cannot be implemented that way but I don't think they are neede by robots.)
A message can take the special value "Ignore Me". Where message Arrows join, the lower ranking message only gets through when the value of the higher ranking message is "Ignore Me".
6.3 Multitasking and swapping
In traditional multi-processing systems, context switching (swapping processes) is slow and complicated. It happens at regular intervals on a clock-tick. It may happen at any time so the context block has to hold register values, stack pointers, etc.
In the Ultimate Language, it's a lot easier. There is no stack or heap and all variables are static so all you need to do is change the change the program counter. The context-block for a process is simply the value of the program counter. But what about the calues of the processor's registers? Shouldn't they be saved?
That depends on when swapping occurs?
A context switch could happen on a timer interrupt but I prefer explicit calls to a SWAP routine. That has the advantage that the compiler can control when swaps can occur and which operations are atomic.
The compiler sprinkles calls to SWAP into the code. Because the compiler knows what the code is doing, it can make sure that SWAP was called only in sensible places. For instance, SWAP is only called at the end of a statement so the processors registers are always empty (the compiler shouldn't do register optimisation across statements). But putting a call to SWAP after every statement is excessive. So the compiler should only put them where it thinks there might be a loop (e.g. inside a FOR or WHILE loop or before any GOTO or before a procedure call).
The assumption was that a list of statments without some sort of loop is never all that long. If it starts to get long, the compiler can insert extra SWAPs.
The SWAP routine jumps to each of the behaviours in turn. Previously, I've implemented SWAP with a table of Behaviours which it cycles through. I suppose there could be several SWAP routines so Behaviour A always swaps to Behaviour B; Behaviour B always swaps to Behaviour C; and so on.
So, if there's no stack and no context switching during expressions, then a process "context block" is reduced to one word: the program counter. Context switching is very fast. In my 20MHz PIC implementation, it took a couple of microSec.
If one of the Behaviours is of the special high-priority kind then it could get more time slices. It calls the normal SWAP routine. But the other Behaviours call a special SWAPHP routine which always swaps to the high-priority Behaviour. The high-priority Behaviour might mark some regions of its code as non-swap.
7.0 The IDE
The IDE must include
If might also include
7.1 The editors and compilers
The editors and compilers are straightforward if rather tedius to write.
For very low-level work, the compiler should include an assembler. It must be possible to declare variables at absolute addresses.
My favourite "low-level" PIC compiler shows me how many instructions every source line compiles to. I can click on any source line and it will show me what code it generates.
7.2 The programmer
The IDE will contain a programmer for each kind of target chip. Larger chips may contain bootloaders so they can be re-programmed in situ.
7.3 The Debugger
The IDE will include a standard high-level single-step, examine-variables, change-variables, set-breakpoints debugging tool.
That implies that there is comms from the IDE to all the processors in the robot. The physical topology will already have been specified. The debugging commands must be routed through that physical network.
Single-stepping through one Behaviour should not stop the other Behaviours from executing.
The programmer should be able to inspect and modify what's going down any Arrow. It should be possible to make time-charts or tables of the values in an Arrow.
Single-step debugging is easy to implement, all my small robots have it. Most also have live feedback of data. But it does imply that the target processors each have the overhead of a debugging kernel.
7.4 Hacking and Interactivity
Robotics is most definitely about hacking. No-one knows in advance how to make a robot that will work. No-one can write their code and get it right first time. The lifetime of commercial code is measured in years or decades. The lifetime of robot code is measured in hours.
During debugging, you always find a huge problem that means you have to re-think your system design. Robotics is about research and the whole essence of research is that you don't know what you're doing. The "development cycle" assumes you know what you're doing from the start. In research, you hack it until it works then, finally at the end of the project, you understand what you should have been doing all along.
All that hacking requires a high degree of interactivity within the IDE.
For all its faults, Forth is a highly interactive language. Editing and running a program is instant - you don't have to reprogram the whole chip and restart it. It's easy to examine and change variables. You can call any function from the IDE.
Forth achieves that by having the whole editor on the target processor. Most Forths also interpret an i-code (some compile to native code). When a function is edited, the new version is simply appended to the program.
That's a huge overhead. The benefits are great but so are the costs.
Is it possible to get the benefits without the cost? I can't see how it can be done. It needs more research.
My compromise has been to simulate parts of the program on the IDE.
The target processor contains a kernel which can control the motors and sensors. Each behaviour can operate in two modes: being run on the target or being run on the IDE computer.
When a Behaviour is run on the IDE, all of the instant interactivity of Forth is available. However, the robot probably runs more slowly thanks to comms delays.
Once a Behaviour is running well, it is compiled and the (whole) program is downloaded to the target.
8.0 Are mobile robots different from assembly robots?
In theory, a robot is a robot. In practice, programming mobile robots is different from programming assembly robots.
Generally, a mobile robot just has to get from A to B. An assembly robot has to perform a tightly defined sequence of tasks. Mobile robots tend to be autonomous. If they meet an obstacle, they work out how to go around it. Assembly robots have already been told what to do when they meet each problem.
So far, I've been considering mobile robots but many of the same arguments apply to assembly robots. Programming assembly robots requires a good IDE, multiple processes, multiple processors, etc.
But is subsumption suitable for assembly robotics? No, not really.
Assembly robotics is about sequencing but is subsumption isn't good at sequencing.
I've written a separate paper describing how to program assembly robots.
I've attempted several compilers which implement some of the ideas presented here. You can download some of them to try for yourself. Of course, to run the programs, you'd need the right kind of robot but at least you can see how the languages work.
Brainprog is a complete subsumption IDE. It implements many of the ideas discussed above. However, it does not allow multiple processors and it uses a graphical rather than textual language inside behaviours.
Flowchart shows how a graphical language for "servoing" could work. It is used inside sim3d to program a simulated robot.
Parlang is a graphical language for programming assemby robots. It has never been connected to a robot but the principles it embodies have been used with a real assemby robot.
I'm currently trying to write a Pascal compiler that follows the zero-size footprint philosophy. I've written lots of compilers before but never a block-structured one that's completely static and I'm getting bogged down in tracking the lifetimes of variables. The "register allocation" algorithms are quite fiendish. Once I've got that working, I'll start to think about parallelism. Parallelism will be static as well (i.e. threads can't be launched at run-time). I haven't yet thought formally about how it should be done. I suspect that parallelism will be external to the Pascal - it will be done in an "outer" layer with boxes and arrows. The compiler isn't ready for download yet but watch this space.
Questions or comments ? E-mail me at firstname.lastname@example.org
No part of this web page may be reproduced in any form or by any means, including photocopying and recording, for any purpose without the express written permission of Analogue Information Systems Ltd.
This web page is copyright © 2008 Analogue Information Systems Ltd., 1 Warrender Park Crescent, Edinburgh EH9 1DX, Scotland, UK, Tel: +44 (0) 131 228 6008. All Rights Reserved