23

Programming With End-Of-Data

 5 years ago
source link: https://www.tuicool.com/articles/hit/FZ3UbiJ
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

In the last post we introduced eod (end-of-data), a special sentinel value used to mark the end of a finite Lucid stream. Streams in Lucid are all formally infinite (non terminating) but we can use eod to represent finite streams as infinite ones filled with eod past a certain point. For example  the finite stream of the first five  primes is

2, 3, 5, 7, 9, eod, eod, eod, …

The input and output conventions are adjusted to interpret eod as termination. If the above stream is the output, the implementation will ‘print’ the first five values and terminate normally. If a user inputs the first five values, then terminates the input stream, this is not treated as an error. Instead, the ‘missing’ values are evaluated to eod if requested.

What makes it interesting is that when a (strict) data operation is evaluated, if any (or all) of the operands are eod , the result is eod. (Non strict operations like if-then-else-fi need special rules). Thus termination propagates through expressions, which is almost always what you want. A continuously running filter which computes, say, a running average of its input will terminate normally if its input is terminated. There is no need to repeatedly test for end of input.

Furthermore, eod allows us to write expressions and filters for problems that require constructs like while or for .

A simple example is the last filter. Suppose we define S

S = 0 fby S+I

to be a running sum of the stream I. Let’s say I is finite and we want the sum of its elements. Obviously, this is the last value of S; so we write

Sum = last S

And how is last defined? Easy

last X = X asa iseod next X

Here iseod is a special operator that can examine eod without being turned to eod . It returns true if its argument is eod , false otherwise.

I was browsing Hacker News the other day and learned about the “rainfall problem”, a coding exercise used as a solve-at-the-whiteboard interview question. You have a series (finite, of course) of numbers and you must calculate the average of the positive numbers that appear before the sentinel value -999. Let’s solve it in Lucid with eod .

The first step is to remove the ad hoc sentinel value and replace is  with eod . Let R be the original data stream; we define T , the finite stream of temperatures, as

T = R until R eq -999

Here X until P is like the stream X except that once P is true, the output is eod . The operator until (which normally would be built in) has a simple definition:

X until P = if sofar not P then X else eod fi

where sofar Q is true at a timepoint iff Q has been true up to then. We can define sofar as

sofar Q = R where R = true fby R and Q end

Now that we have the temperatures as a proper finite stream we can define the stream P of the positive temperatures as

P = T whenever T>0

(For this to work our implementation of whenever must handle eod correctly. This will be the case, for example, if we base it on the recursive definition

X whenever P = if first P then first X fby (next X whenever next P)

                                   else next X whenever next P fi

which gives sensible results if X and/or P are finite.)

Finally we define the stream A of averages as

A = S/N where S = first P fby S+next P; N = 1 fby N+1 end

and the number we want is the last one

answer = last A fby eod

Note that if we count until , sofar , and last as being built-in, we don’t use iseod .

What happens when there is more than the time dimension? What do we do? If there is also a space dimension then we can add another special value, eos (end of space). The value eos propagates like eod when combined with ordinary data. And we add an extra rule: when eos combines with eod the result is eod ; eod trumps eos . With this arrangement we have a simple output convention. If X is the 2D stream being output, we evaluate X at timepoint 0 and at successive spacepoints till we encounter eos . Then we move to the next line, increase the timepoint to 1, and output successive spacepoints till we again encounter eos . We then increase the timepoint to 2, output successive spacepoints etc.

If at any stage we encounter eod , we terminate normally. We could call this the ‘typewriter’ output convention. There is a corresponding input convention that requires an end-of-line input as well as an end-of-data input.

And what about three dimensions? For example, video in which frames vary in the time dimension and a frame varies in a horizontal ( h ) dimension and a vertical ( v ). We can generalize the typewriter convention using eoh (end of horizontal), eof (end of frame), and eod .

What’s the general situation, when there’s lots of dimensions? It seems we need a family of special objects, indexed by a set of dimensions. If eod(S1) and eod(S2) are the objects corresponding to the sets S1 and S2 of dimensions, then the result of combining them (say, adding them) is eod(S1∪S2) . Thus the bigger the index set, the more overpowering is the object. The value eos is revealed to be eod({s}) and what we call simply eod is eod({s,t}) . In the video context, eoh is eod({h}) , eof is eod({h,v}) and eod is eod({h,v,t}) .

EOD


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK