Programming With End-Of-Data

In the last post we introduced eod (end-of-data), a special sentinel value used to mark the end of a finite Lucid stream. Streams in Lucid are all formally infinite (non terminating) but we can use eod to represent finite streams as infinite ones filled with eod past a certain point. For example the finite stream of the first five primes is

2, 3, 5, 7, 9, eod, eod, eod, …

The input and output conventions are adjusted to interpret eod as termination. If the above stream is the output, the implementation will ‘print’ the first five values and terminate normally. If a user inputs the first five values, then terminates the input stream, this is not treated as an error. Instead, the ‘missing’ values are evaluated to eod if requested.

What makes it interesting is that when a (strict) data operation is evaluated, if any (or all) of the operands are eod , the result is eod. (Non strict operations like if-then-else-fi need special rules). Thus termination propagates through expressions, which is almost always what you want. A continuously running filter which computes, say, a running average of its input will terminate normally if its input is terminated. There is no need to repeatedly test for end of input.

Furthermore, eod allows us to write expressions and filters for problems that require constructs like while or for .

A simple example is the last filter. Suppose we define S

S = 0 fby S+I

to be a running sum of the stream I. Let’s say I is finite and we want the sum of its elements. Obviously, this is the last value of S; so we write

Sum = last S

And how is last defined? Easy

last X = X asa iseod next X

Here iseod is a special operator that can examine eod without being turned to eod . It returns true if its argument is eod , false otherwise.

I was browsing Hacker News the other day and learned about the “rainfall problem”, a coding exercise used as a solve-at-the-whiteboard interview question. You have a series (finite, of course) of numbers and you must calculate the average of the positive numbers that appear before the sentinel value -999. Let’s solve it in Lucid with eod .

The first step is to remove the ad hoc sentinel value and replace is with eod . Let R be the original data stream; we define T , the finite stream of temperatures, as

T = R until R eq -999

Here X until P is like the stream X except that once P is true, the output is eod . The operator until (which normally would be built in) has a simple definition:

X until P = if sofar not P then X else eod fi

where sofar Q is true at a timepoint iff Q has been true up to then. We can define sofar as

sofar Q = R where R = true fby R and Q end

Now that we have the temperatures as a proper finite stream we can define the stream P of the positive temperatures as

P = T whenever T>0

(For this to work our implementation of whenever must handle eod correctly. This will be the case, for example, if we base it on the recursive definition

X whenever P = if first P then first X fby (next X whenever next P)

else next X whenever next P fi

which gives sensible results if X and/or P are finite.)

Finally we define the stream A of averages as

A = S/N where S = first P fby S+next P; N = 1 fby N+1 end

and the number we want is the last one

answer = last A fby eod

Note that if we count until , sofar , and last as being built-in, we don’t use iseod .

What happens when there is more than the time dimension? What do we do? If there is also a space dimension then we can add another special value, eos (end of space). The value eos propagates like eod when combined with ordinary data. And we add an extra rule: when eos combines with eod the result is eod ; eod trumps eos . With this arrangement we have a simple output convention. If X is the 2D stream being output, we evaluate X at timepoint 0 and at successive spacepoints till we encounter eos . Then we move to the next line, increase the timepoint to 1, and output successive spacepoints till we again encounter eos . We then increase the timepoint to 2, output successive spacepoints etc.

If at any stage we encounter eod , we terminate normally. We could call this the ‘typewriter’ output convention. There is a corresponding input convention that requires an end-of-line input as well as an end-of-data input.

And what about three dimensions? For example, video in which frames vary in the time dimension and a frame varies in a horizontal ( h ) dimension and a vertical ( v ). We can generalize the typewriter convention using eoh (end of horizontal), eof (end of frame), and eod .

What’s the general situation, when there’s lots of dimensions? It seems we need a family of special objects, indexed by a set of dimensions. If eod(S1) and eod(S2) are the objects corresponding to the sets S1 and S2 of dimensions, then the result of combining them (say, adding them) is eod(S1∪S2) . Thus the bigger the index set, the more overpowering is the object. The value eos is revealed to be eod({s}) and what we call simply eod is eod({s,t}) . In the video context, eoh is eod({h}) , eof is eod({h,v}) and eod is eod({h,v,t}) .

EOD

Recommend

How Linux's Kernel Developers 'Make C Less Dangerous'

Omnisharp, what is going on here? : archlinux

贝莱德集团曾支持选独立董事长替代特斯拉CEO马斯克

自如熊林：没有意愿“对付”谁是自如的责任定承担

美团敲定从五大基石投资者处获得15亿美元融资

程维道歉之后，滴滴的客服究竟要怎么改？

阿里员工租房后得白血病去世自如微博回应并关评

女性不希望男友太聪明或者太帅而男性恰好相反

陕西宁强县，于海明的亲哥于海军知道弟弟被判定为正当防卫的消息，他说全家人现在特别...

【追砍电动车主遭反杀：关键证人讲述案发全过程】9月1日，江苏省昆山市公安局副局长张...

About Joyk