28

Reading Chess (1990)

 5 years ago
source link: https://www.tuicool.com/articles/hit/MFfM32F
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
jMjIFf6.png

552

IEEE

TRANSACTIONS ON

PATTERN

ANALYSIS AND MACHINE INTELLIGENCE.

VOL.

12.

NO. 6.

JUNE

1990

Reading

Chess

Absrracr-In anapplicationofsemanticanalysistoimagesof ex-

tendedpassagesof text, severalvolumesofachessencyclopediahave

been readwith high accuracy. Althoughcarefullyproofread,the hooks

were poorlyprintedand poseda severechallengeto conventionalpage

layoutanalysisandcharacter-recognitionmethods.experimental

pagereadersystemcarriedoutstrictlytop-downlayoutanalysisfor

identificationof columns, lines,words,and characters. This proceeded

rapidlyreliablythankstoarecently-developedskew-estimation

technique.Resegmentation of broken,touching,anddirtycharacters

washandledan efficient and integrated mannera heuristicsearch

operating on isolated words. analyzingthesyntaxofgame descrip-

tions and applyingtherulesof chess, theerrorratewasreduceda

factor of

30

fromwhatwasachievablethroughshapeanalysis alone.

Ofthegameswith notypographical errors, 98% havebeen assigneda

legalinterpretation,foraneffectivesuccessrateof 99.995% on ap-

proximatelyonemillioncharacters (2850 games,945 pages).dis-

cuss severalcomputervisionsystems-integrationissuessuggested by

thisexperience.

Index

Terms-Character recognition,documentimage anal-

ysis,analysis,semantics.

I.

INTRODUCTION

Epresent anengineering case studyofa complete

computersystemexhibiting unusuallyhigh

competencya complex application, partthrough the

use of computablesemantics. The system is a page reader

and the application is the extraction of chess games from

the

Chess

Informant

[5], a series of volumes describing

games oftheoretical interest.

Theare poorly printed, anda severe chal-

lengeconventionallayout analysisand character rec-

ognition methods.Novelfeatures

of

our experimental

page reader include a strictlytop-downlayout analysis ap-

proachthatworksrapidly andreliably, largely owing to

arecently-developed technique for estimatingthealign-

ment angle of blockstext. Preliminaryclassification is

performed before attemptinglocate baselines or parti-

tion linesintowords.Broken,touching, anddirty char-

actersare handledanintegrated mannershape-

guidedheuristic operating on isolatedwords.

The games are extracted usingacombinationlayout

andsyntactic rules.Eachgamesegmented intomoves

(50

onaverage), and commentary recognizedand

stripped

off.

Eachmoveconsists of number and

two ply (half-moves), and eachdescribedthree

Manuscriptreceived May

23,

1988; revised October

12,

1989. Rec-

The authors are with

AT&T

Bell Laboratories,

600

Mountain Avenue,

IEEE

Log

Number9034827.

ommended

for

acceptance by

C.

Y.

Suen.

Murray Hill, NJ 07974.

108.’

(R

76/a)

A

62

Luzem (01) 1982

1.

d4

Q€6 2. c4 e6 3. 433 c5

4.

d5

e&

5.

cd5 d6 6. Qc3 g6 7. g3 pg7 8. Ag2

0-0

9.

0-0 Qa6!?

10.

h3

[lo.

e4 &g4=]

Qc7

[RR

10.

..

Ee8!?

11.

4.f4 Qc7 12.

a4

Qe4

13.

Rcl

b5!

14. Bel Rb8

15.

ad2

g5!

16. Qde4 g€417.ab5

f5

18. Qd2 fg3

19.

fg3

@g5

20. Qfl

Qb5F

Csom

-

Sub%

Biile Herculane1982; 11. Qd2!?]

11.

e4

Fig.

I.

A

sample

of

text

from

the

Clless

Infurinant

(shown

1.55

X

actual

size). This isa game’s headerand opening moves, including twopar-

enthetical comments movenumber

IO.

KORTCHNOI

-

TRlXGOV

charactersonaverage(seeFig.1).applying knowl-

edgethe rulesof chess (the “semantics”), each move

is checked for legalitydirectly in the contextbuilt up by

prior movesand indirectlythrough the consistency of later

moves.the interpretations of movessuggested by shape

analysis are inadequate,alternatives are generated, again

by invoking the rulesof chess. Even thoughthis semantic

analysisisfully-backtracking,intolerableruntimesare

preventedhighquality of theshape recognition

overall andaroughly uniformdistribution of serious mis-

takes.

Offirst 142 games, two were flawed typograph-

ical(editorial or typesetting mistakes), and the se-

mantic analysisfailed,assortedreasons,onthree

more. The resulting

98%

of games correct

impliesthat

99.99% of themoves,and 99.995% of thecharacters,

wereinterpreted correctly. Without semanticanalysis,

only

40%

of the gameshave hadlegal interpre-

tation,eventhecharacter recognitionrate due to

shapeanalysis alone is reasonablyhigh (99.5

%).

Prior approachesthe detection and correctionof er-

rorstext images have operated on

isolatedwords

from

natural languages, typically by means ofdictionarylook-

ups (e.g.,

[2]

[l

I]

1131) and character n-gram frequency

analysis (e.g., 1141

[I61

[7]). The lackeffectiveand

efficient algorithmstheanalysis

of

natural language

haslong frustratedthe natural desire to extendthis to sen-

tences.comparative tractabilityof semantics

of

chessoffered anopportunityexperimenton

extended

sequences of text,

oftenconsisting ofhundredsof char-

acters,makecomplete games.

The exercise had two cooperatingpurposes:the first au-

thor’sprincipal motive was to experiment with contextual

reasoning in imageanalysis andthesecond author wished

0162-8828/90/0600-0552$01

.OO

0

1990

IEEE

QjeAjyR.png!web

BAIRD AND

THOMPSON:

READING

CHESS

553

to bring

a

large fraction

Informant

onlinesup-

port

a

variety

of

studiescomputer chess, among them

theenrichmenttheopeningBelle,

a

chess-

playing machine [4].

Section I1 describes the booksthe

Chess

Informant

series.Pageanalysis is discussed Section 111.

Classification,word segmentation, andbaseline-finding

are discussed together

in

Section IV. Amethod cope

with broken,touching, anddirtyshapesdiscussed in

Section V. Extracting games andmovesusinglayout ge-

ometryandsyntax is described SectionUseofthe

rules of chess to detect and correct classification errors is

described in Section VII.Performance statistics are sum-

marized in Section VIII. Systems engineeringaspects are

discussed, finally, Section IX.

11. THE CHESS

INFORMANT

The

Chess

Informant

[5] is

a

series

of

volumes describ-

ing chessgamesselected by aninternationaleditorial

board fortheir theoretical interest. Over 40 volumes have

appeared,continue to appeartheabout two

a year.Eachvolumecontains a description ofabout 800

games, withdetailed commentary (along indexes,

rankings lists, etc). The gamedescriptions areavail-

ablecomputer-legible form.

An excerpt from

a

typical game descriptionshown in

Fig.1. Eachgame is introduceda headerthe

gamenumber, twocodesclassifyingthe opening play, the

names of the players, the location of the tournament, and

the year.

A game descriptionconsistsof

a

list of numberedmoves

with commentary.Somecommentsare briefremarks

usingspecial symbols,such

as

!?,

meaning

“a

move de-

serving attention,” or

f,

meaning“whitehas theupper

hand.” Othercommentstheformlistsalter-

native moves,enclosedsquarebrackets

[

-].

The series is published in Beograd,Yugoslavia. The

textwassetusing letterpresstechniques(handset lead

type)over

200

distinct letters(see Fig.

2),

of

10

point typesize.

These include complete alphanumeric fonts in both ro-

manand boldface,chesspiececharacters, letters modified

with diacriticaland

a

setspecial symbols used

for commentary. Lines of text are set solid: that is, spaced

verticallycloselypossiblethepoint size. The

column-breakisnarrow,lines are not horizontally

aligned across it.

The qualityof the paper andimpressionvariable

from volume to volumefrompage.Uneven

inkingproduced

a

wideof character densities

andslipping type has producedgross shape distortionsand

proximityeffects (Fig.

3).

Otherdefects, includingsurface dirt, ink smearing, and

nonrectilinear layout

of

columnslines, are discussed

later. Althoughthe

Informant’s

print quality is often worse

thanthatmostbooksand journals, it dif-

fers frommoredegree than in defects.

ABCDEFGHIJKLMNOPQRSTUV

WXYZ

abcdefghi]

klmnopqrstuvwxyr

0123456789

ABCDEFGHIJKLMNOPQRSTLVWXY

2

~bcdcl~hiJklmoopqr~tuvwxyz

0123466789

abcd

123

7”S()[]/-

--=

>

<

-

r

1,

A

ec

AC

e

1618160

Us

$

i

ACRS

ftbPH**

#t

+

f

F

f

f

=

=

I

x€l0*4-/0@3=3iI\~/o

tions

Flg

2

Most

of

the distinct letter shapes occurringthegame descrip-

Fig.

3.

Distortions due

to

uneven inking, poor impressions, and broken or

tilted type.

111.

LAYOUT

ANALYSIS

Each pagewasscanned on

a

Ricohdocument

scanner at

a

resolution

of

300 linesper inch (12

lines/”), producing

a

binary image

of

about

8.5

Mbits,

of

whichtypically 800

000

areblack. Maximal subsets of

%connectedblackpixels (‘‘blobs”)are typically

3000

per page. Blobsthat are manifestly too large to be a

10-point character(or

a

string

of

connectedcharacters)

are ignored.

We have found itnecessaryto examine blobs in

detail in orderanalyzepagelayoutintocolumns and

lines (forotherapproaches,see

[lo],

[9]). We simply view

eachblob

as

a boundingrectangle (Fig. 4). Throughout

this analysis, weproceedstrictly top-down,from coarser

partitionsof the page area tofiner ones (for

a

surveyand

analysisothermethods, see [15]). The motive for this

is thatat eachstep we canmakedecisions baseda

maximumofstatistical support andwithaminimum of

a

priori

assumptions.

The first step is to determinetheangle

of

the page

as

a

whole.This isby arecently-developedtech-

nique

[

11 that is fast and accurate to

a

resolutionof

2

min-

utes

of

arc. The pagecorrected for “pseudo-

rotating”theboxes; each is translated

so

thatset

of

theircenters rotate,the bitmap withineachbox

rotated.This workswell practicesinceordinary

care

a

personcan place

a

document on

a

scanner within

3”

of

vertical,,andasmallrotation haslittle effect

on characterbitmaps; however, it haslarge effects on page

layout analysis, as wewill see.

In

orderthelocation of columns, each

boxis widenedslightly by

a

fixed multiple to encourage

overlappingadjacent boxes, projected vertically

to giveone-dimensional distribution (Fig. 5); the wid-

ening is govemed the assumption that

a

column-break

is atleastwide

as

threeem-spaces.Thisdistribution

exhibitsnumber

of

prominentplateauswhich are

easilylocated by thresholding. The threshold is chosen to

ziuQB3Z.png!web

554

IEEE

TRANSACTIONS ON PATTERNANALYSIS AND MACHlNEINTELLIGENCE.

VOL.

12,

NO.

6.

JUNE

1990

$4

4

88'

0w.w

P.

JOOLT

"Wm

"onma

z"wm

"g=u

DPcmCsD

Fig.

6.

Schematic,exaggeratedillustration ofaffine sheardistortion of

column.layout.

&L%%

2==$%

waw

n--

Fig. 4. An image

page,beforecorrection.columns of

a

full page are on theleft,and part of

the

next page appears

on

the right.

Connected components are represented

by

their bounding rectangles.

/I

Fig.

5.

The vertical projection profile

Of

the

bounding

shown

in

Fig,

7,

Schematic, exaggeratedillustrations

of

laid

at

differ.

entskew angles.

Fig. 4, aftercorrection.

be8thpercentile value ofhistogramofvalues in

the distribution. Duetothe narrownessof thecolumn-

break,thistechnique wouldfail

if

the skewestimation

were in error by aslittle as 1.0" of arc. This problem is

compoundedaprinting defect:

it

is possible for the

columns to bedistortedby anshearasas

0.5"

(illustratedschematically in Fig. 6). Shearcorrec-

tionusing thebestalignmentangle discovered among

roughly-vertical angles effectivelysolvesthis problem.

Beforeproceeding to findin each column,

necessary to recomputetheangle for each, due

secondunexpectedprinting defect:eventhe col-

umnsshow no affine shear, it is possible for each to

adifferent angle,as much as

0.6"

apart (illustratedsche-

matically in Fig.

7).

After recorrecting eachcolumnfor

skew,ispossible to locatelines of textanalyzingthe

horizontalprojection (Fig. 8). Here athreshold is

used,theassumptionlineconsists

leasttwo characters. Breakdecisions are lesscriticalthan

for columns, andthis technique has workedwell on

varietyof text.However,where line-spacing isit

could fail if theestimate of skew angle were

off

by as little

as

0.3"

of arc.

Thecontentseachlinethen sortedleftto right,

and, still manipulatingonly theboxesblobs, characters

are tentativelyidentified,collectionsofmore

blobs. The ruleused to combinepair ofblobs (or,

recursively,characters)overlap verticallymore

than

70%

ofwidthof either.This test

is

performed

also ata12" slant,the typicalitalicinclination angle.

This works common fragmented characters and

diacriticalmarks in thetext, butnot perfect, since

it canoverruled later.

In

fact

it

succeeds onall but one

c

Fig.

8.

The horizontalprojection profile of

a

column

of

text,

after

skew

correction.

of the

172

special characters theLatinalphabet enum-

erated in the

Chicago

Manual

of

Style

[6].

IV.

CLASSIFICATION,

BASELINES,

AND

WORDS

Beforeattemptingto locate baselines or words, we per-

formapreliminaryclassificationof characters. The

methodused is templatematching.

To

matchaof

characters,thecenterstheir bounding boxes are

aligned,and abitwise exclusive-or computed. The num-

bernonzerothe result scaled by thetotal area

of

both pattemssubtracted from

1

.O,

givingamatch

scorerange [0,1], with 1.0 indicatingaperfect

match.Thissimple implementationofalong-estab-

lished technique

[

121.

Template-matchingselectedsince,fewer

than

250

distinct lettershapes,there wasrequirement

_-

I-

RVnmQjV.png!web

555

BAIRD AND

THOMPSON:

READING

CHESS

for amultiple-fontclassifier (but, see

[8]).

Also, many

characters suffered from fragmentation, to whichtemplate

matching

is

less vulnerable than topology-basedfeature-

analysis methods.

We manually collected over 1300charactersamples,

about

7

per class on average,each labeleda correct

baseline location. Classificationan unknownshapeoc-

curredessentially exhaustive matchingthis data-

base, yieldinga list of classesdecreasing order of match

score, whichwastruncatedwhen scores fellbelow

0.9

of

theAmanually-selectedthreshold

(0.75)

was used

to distinguishgood frommatches.

Eachdeterminesbaselinelocation,the

dominantbaseline-height of each textline waschosen as

themedianof theset baseline-heights foreachchar-

acter’s firstgood match.Thistechniquefastandhas

provento be accurate and insensitivecommonly-

occurring problemsincludingpoorcharactersegmenta-

tionand dirt. If skew-correctionhasless accurate, a

morecomplex two-dimensional fitting procedurewould

havebeen required.

Eachcharacter’sscore ismodifiedre-

ducingit to the extent that the height-above-baseline dif-

fersfromthe meanthat assuming aGaussian

distributionwithstandardoftraining sam-

ples. This final score thusdepends primarilyupon

similarity of shape, andsecondarily on verticalplacement

in the line. application, baseline displacements as

small asthe x-heightoffontcan becrucial,

sincedistinguishesbetweensquare brackets

‘[’

and

‘I’

and similarshapes such as ‘I’ and

‘J’.

The brackets

are importantlexicalbreak characters used to extract

games and ignore commentary.

Text lines are partitioned into wordsexamining the

statistics of intercharacterspacing withineach column. A

histogramof intercharacter distances usually possesses a

bimodal distributionwhose minimumagood

threshold for distinguishing word- from character-spaces.

In practice, thehistogramstend

to

besparse andbadly-

behaved. By quantizingcoarsely(by 1

/6

of anem-space),

andrequiringthresholdwithinanarrowrange

([0.2,

0.51

em), theresults are usually good.

Most residual word-segmentation problems arecaused

by characters with extremekerningproperties, suchas pe-

riods andwide chess-piececharacters.

No

attempt was

made to exploit special knowledgekerning. Results

were sufficiently good to servean importantfunction dur-

ingthe “clean-up” processing described next. The final

syntacticand semanticanalysisdoesdepend onword-

breaks at all.

Graphics such as board diagrams were eitherdiscarded

earlier as too large or appearasline dominated by

very bad matches; these lines are detected and ignored.

V.

FRAGMENTS,

SMEARING,

AND

DIRT

Therethreeways in the character-finding

heuristic describedabovecanfail: 1) a character canbe

fragmented;

2)

two or more characters cantouch;

(a)

(b)

(C)

Fig.

9.

Defects causing problemsthat are resolved after preliminary clas-

sification:

(a)

touching characters;

(b)

broken characters;and (c) dirt

fragments.

strayink marks,paperdefects, or dirt can be mistaken for

a character. Of course, combinationthese can oc-

cur together.Exampleseachshown in Fig.

9.

In

virtuallyevery instance,these problemsa bad

match.

We observed that it was never necessaryto reach across

a word-breaksuch a problem.averagelength

ofaword is three characters, andonly

7.4%

ofwords on

averagehave at least one character witha bad match.This

suggested astrategy for correcting thesedefects

ifiedway by a nearly-exhaustive examination of altema-

tives generatedwithin the boundsofindividual words. An

outline of the heuristic is as follows:

for

each badword

(7.4%)

{

repeat

{

try merging sets of adjacent characters (0.3

%)

for

eachbad character

{

try splittinginto two (or more)

(0.9%)

if

character is dirt,

then

delete it

(0.6%)

,

1

if

wordisolated dirt,

then

delete it

(0.2%)

1

until

word is deleted or not improved

1

Improvement in is defined as an increase in a com-

positeword-match score, computed

as

the average of the

character-matchscores, weighted by Abouthalfof

thewordsareimprovedheuristic. The per-

centagesshown for eachstatement are for triesthat suc-

ceedimprovingthecharacter or expressedas

fractionsthe total number, good

and

bad.

A

characterisconsidered to be

dirt

if

it

has a lowmatch

score

(

<0.30) anda small area(smaller than about two

periods). Aword is

isolateddirt

if itconsistsof one dirt

character. These rulescatch almost all cases, failingwhen

the dirt is attachedcharacter (Fig.

10)

or lies

so

close

to the baselineresemblesperiod. Deleting dirt is

delayedeverything else hastried,sincecan-

not be undone.

In anattempt to recombinefragmented characters, an

exhaustive listofstrings adjacentcharacters

is

gener-

ated. Eachisanchoredparticularcharacter,

and all strings are generated thatreach no furtherthan

0.5

em fromthe anchor, and are no more than

1.1

em wide.

When splitting characters, amodificationofKahan’s

method

[8]

is usedto purpose a smallnumber of candidate

splittingpoints.

Althoughruntimeofheuristicpotentially expo-

Ez6VRnI.png

556

IEEE

TRANSACTIONS ON PATTERN ANALYSIS MACHINE INTELLIGENCE.VOL.

12.

NO.

6.

JUNE

1990

s

Fig.

10.

Dirt fragmenttouching a character.

nential in thelength of the word,the frequency of

bad words and the short average lengthofwordsholds it

to about

1/7

ofthe costperformingthe preliminary

classification, onaverage (with a large variance).

VI.

EXTRACTING

GAMES

AND

MOVES

The result the preceding stages computationa

hierarchical analysis ofthepage into columns, lines,

words,characters,eachcharacter owning

of interpretations.sample of agalleyproofformat of

this datastructure

is

shown in

Fig.

11,

showing onlythe

topchoice interpretationof eachcharacter.

A

more de-

tailed format, including eachcharacter’sbounding box

andinterpretation list,shown in Fig.

12.

The results on

a sequence of(in detail) are concatenatedand passed

to the stages thatapply chess knowledge.

The text from thecharacter recognizerexamined for

chess semanticsthe goal of decoding the game played

alongwith players,event name and whowon. The anal-

ysis is performedsequentialphases asdescribedbelow.

There is no feedback between phases.

Isolating thegame.

Eachlineexaminedclassi-

fied as potentially beingone of thethreetitlelines of a

game. The first line (index) is leftandright justified with

a largespacethemiddle. The left partnumbers

while the rightpart contains parentheses. The second line

(players) is centered witha hyphen or dash. The third line

(event) is centered.Whenevertwo or threeconsecutive

linesare classified into potentialtitle lines, the previously

collected lines are analyzed as a complete game and sub-

sequent lines are collected for the next game.

Stripping Comments.

Commentsdelimited by

[

*

]

brackets. For analysis brackets are matched

in pairs.Whenpairbrackets are found thefollowing

possibilities are recognized.

[...

]

The

[

is accepted as real.

[

*

[

If thecharacterscores ofbothbrackets

areabove a threshold, the intervening

text isexamined by

reversetypesetting

(de-

scribedbelow) for the best potential

].

The

[

andinvented

]

areacceptedasreal.Oth-

erwise, the better

[

is acceptedand the other is

rejected as dirt.

I...

1

By analogywith

[

*

[.

The beginningof the game treated as ifit contains a

perfect

],

while theendcontains aperfect

[.

The above

COLUMN

1

1.1

lop

108:

(R

76ia)

A

62

1.2

1%

KORTCllNOl- TRINCOV

1.3

lop

Luzern

(01)

1982

1.4

lop

I.

d4 Qf6

2.

c4 e6

3

QO

c5 4. d5ed5

1.5

lop

5

cd5

d6 6.

&3

g6

7.

g3

Pg7

8.

Pg2

1.6

lop

1.7

lop

1.8

Io0

a4 Re4

13.

Hcl bS!

14

Eel

Hb8

15.

Rd2

W

9

0-4

&6!?

10.

h3

[IO.

e4 Pg4=]

&7

[RR

10.

. .

He8!?

11.

Qf4

Oc7

12.

1.9

16

@!

76. Qdeigf4

17.

abTf5

IF,

Qd2

fg3

1.10

lop

19.

fg3

eg5

20.

Qfl

Ob5F

Csom

-

Sub&

1.11

lop

B&le

Herculane

1982;

11.

bd2!”]

11.

e4

Fig.

I

I.

Galleyproofformat. used

for

manual proofreading,

for

the game

fragmentshown in Fig.

1,

Eachlabeledthepoint size.

1.06011.5531

1.1231

1.6511

1

9 0.90

1.13711.617i

1.16311.6471

1

per

0.95

1.2201 1.5401

1.50011.6771

0

\O

1.545n

1.2201 1.5501

1.28711.6431

1

0 0.93

1.29711.6001

1.42711.6201

1

dsh

0.94

1.4331

1.5501 1.5001

1.6431

1

0

0.93

1.56011.5401 1.9631

1.6771

0

\O

1.636”

1.56011.54011.6831

1.6671

1

chN

0.90

1.703i 1.5771

1.7701

1.6431 4

a

0.90

n

0.85

U

0.84

s

0.83

1.7801

1.54711.8431

1.6471

1

6

0.91

1.8701 1.54711.8871

1.6431 2

ex

0.90

1

0.81

1.9171

1.54711.9631

1.647i

1

qm

0.92

3.0331

1.5401a.2031

1.6771

o

\o

i.9ogn

2.0331

1.5431

2.08711.6431

4

1

0.91

I

0.83

1-sl

0.83 i 0.82

2.1001 1.547i

2.16311.6431

1

0

0.95

2.17311.6171

2.20311.6431

1

per

0.92

2.2601

1.54012.4131

1.677i

0

\O

1.545n

2.2601

1.54312.3431

1.6471 2

h

0.94

b

0.89

2.3471

1.5471

2.41311.6471

1

3

0.89

Fig.

12.

Detailedlayout descriptionmoves

9

and

10

of the game shown

in Figs.

1

and

11.

Thefour coordinates on theleft describe bounding

rectangles. The next number

is

a code

n

:

if

n

=

0,

thenthis is a layout

feature,as words(labeled

“\O”)

or

textlines(not shown);

if

n

>

0.

then itis a character possessing

n

interpretations.Eachinterpretation

consists of a letter-name and a match merit (e.g., “chN

0.90”

is

knight withmerit

0.9).

algorithm accepts, rejects, andinventsbrackets

SO

that

they are balanced.Accepted bracketsand theirenclosed

text are discardedascommentary.Thiseliminates about

50%

of the characters from further analysis.is goodat

locating brackets that were typesetbut are unrecognizable

due to printing defects, butnot at allgoodat supplying

bracketswereomittedtheprinter. The con-

tainslots of cluesfor realmissing brackets,suchas re-

peatedmove numbers,changetype face, commentary,

etc; noneofthis used.

Sketchingthe move sequence.

The text is examined for

move numberssequence. The rigorous enough

so

thatany numbers less than almost perfect are deferred

untila later phase.

Interpreting thegameresult.

The next afterthe last

move numberexaminedreverse typesetting for one

of: 1-0 (whitewins),

0-1

(black wins), or

l/2-1/2

(draw).

Completing the move sequence.

Missingnumbers in the

move sequence are filledby reverse typesetting. Pos-

siblymissing numbers betweenthelast backbone number

andtheresult are similarly filled in.

Preparsing the

moves.

The text between movenum-

bers is examined for twosyntactically-correct chess ply,

one for WhiteoneBlack.Candidateas-

sembled fromthealternatives offeredby theclassifier.

Correct formsinclude Piece-Letter-Digit (PLD),

LLD,PLLD,PDLD,or castle. Apiece one

of

I3

43

a

6,

a letter

abcdefgh,

12345678,

and

castleeither

0-0

or

0-0-0.

Commentarycharacters

may occuraftereachply,outsidebrackets;theseare

UNNn6rU.png

BAIRD AND

THOMPSON:

READING

CHESS

551

recognizedand discarded.one or morecorrectcandi-

dates are found for

both

ply,thenare recordedand

thetext is neverinterpretedWhenthere is more

than onecandidate, theysorted decreasingon the

productof the match scores (i.e., classifier’s top choices

first). These

preparsed

moves comprise about 98

%

of the

total moves.

Reverse typesetting

is theuseof the boundingrectangle

of characters to prune listsof matches. Both heightand

width ofboxes are compared, butnot height above base-

line.is never requiredwhen pruning listsfromthe clas-

sifier,buthelpfulwhen scanning for missedbrackets

or eliminatingwild choices made themove generator.

VJI.

CORRECTING

MOVES USING

SEMANTICS

Preparsedmovesinterpretedthe currentchess

context and, if legal,themove madeandmatch

continues. If thepreparsedmoveillegalthematch is

terminated.themovepreparsed,have

reached an

impasse,

andalllegalnextmoves fromthe

currentboardposition are generated andprunedusing re-

verse typesetting. The potentiallyexponentialruntimeof

this exhaustivestepmanaged takingproduct

of

thescores

of

the matchedmovesandpruningbelowa

threshold.matchesabove threshold aremade and

matching continues adepth-first,best-match-firstman-

ner. The longest matchtaken as the game score.

It isconceivable that somegame interpretationcould

be legalandstilldifferfromthe clear intentionof

the

ed-

itor-but suchforced interpretationshave not been ob-

served.example of the resultsof semantic analysis are

shown in Fig.

13.

The textthatrepresents theplayers and eventare

matched against (separate)dictionariesnames.larg-

estsubstring algorithm is usedto find the closest name in

the dictionary. The scoresformatches recorded

on an error file and examined by hand for additions to the

dictionaries. The dictionaries are initiallycreatedfrom the

indexthe

Informant.

This is highlyredundantbetween

issueswithonly about

5%

new entries pervolume after a

two volumestartup.

VIII.

RESULTS

Among the142 games,two wererejected the

systemfortypographicalerrors.Threeothers could not

be corrected thesemanticanalysis.rest, 98% of

the typo-free (correctly-printed) games, were accepted by

thesemanticanalysis,andhave beenproofreadhand

toconfirmthe interpretation.

No

forcedinterpretations

(substitution errors)werefound.game interpretations

were selected fromthose thatcould beassembledfrom

the top three classifier choices (withsyntactic nose-

mantic analysis) this fractionfallsto

76%.

Interpretation

by shapealone (usingonlyclassifier’s topchoices)

would haveyielded 40

%

of the games completely correct,

for an error factor of

30

worse thanthatachieved

by semantic analysis.

One of the games thatcouldfullycorrected had

/

person: TRINGOV (Trinqov

-0)

/

person:KORTCIINOI (KOrtchnOi

-2)

white:Kortchnoi

black: Tringov

/

event: Luzernlol)1982 (Lurcrn

(01)

1982

-0)

event: Lurern (01) 1982

result: 1-0

a4

Nf6e6 Nf3d5 e:d5

Nc3

q6

93

%37

%32

0-0

0-0

Nab

e4

Nd7Bf4Qe7

Re1

f6

Be3

b5

f4 b4 NA~ Nb5 Rc1

Re8

Nf3 Of8 Bf2

Bb7

BflNc7 Nd4

Kh8

Nc6 RA8 f5

e5

d:e5

Of3 N:b4 O:b? Nd3

B:d3

Q:g3r

-2

Q:d3Rcdl Ob5

b4

Race

Qd5

0.36

e6

Qd3

e?

A6

QC6

class/

004

108.*lR76/a)A62

N:CSN:CS

B:CS

of7

~d6

~:d5

0c4

Of6 f:e5

w5

Fig.

13.

Results ofthe syntactic and semantic analysis

for

the gameopen-

ingshown in Figs.

1,

11.

and

12.

Notethatmovenumbersand com-

mentaryhavebeenstripped

off

and a player’snamehasbeen corrected.

Theopeningclassification

“(R76/a)A62”

has been recomputed.

garbled characters atvery end, wheretherewas no

later context from which backtrack.Anothertoo

many consecutiveimpasses, atthe outset ofa game, lead-

ing tounmanageably largesearchspace.line of

text involved

is

illustrated in Fig.

14.

The top line the

bitmap(aftercleaning updirt), andthelinethe

top-choiceinterpretationsclassifier. Bad matches

are markedwith surrounding boxes.

For 99.4% of the atone candidate suggested

by the classifier was syntactically correct. Of these, the

topchoice wassemantically legal 98.7% time.

About

0.5%

of

thewere associatedwith impasses,

and thecorrect rate amongaftersemantic anal-

ysis was 99.99%. Since ply are madeabout threechar-

acterson average,the effective per-character success rate

is probably better than 99.995

%

.

We have no direct mea-

surementthe uncorrected character recognition rate,

and itisdifficult to it unbiased way since,

among other problems,about half

of

the characters are

ignored as commentary.

We havecontinued reading-withoutmanuallyproof-

readingevery game-and after four volumes(945 pages,

2850

games,

2

176

865

characters)performancehold-

ing steady, legal interpretationsfound for over 97%

of the gamesfree from typographical errors.Since about

amillion of thesecharacters in commentary,the

semantic analysisappliedabout amillion charac-

tersforan effectiverateof successful interpretationof

99.995%.

The system waswrittenentirely the

C

programming

languageunder variousresearch editionsthe

UNIX@operating system. Runtime for the image analysis

phases averaged

10

CPU minutespagea DEC

VAX@ 11/8550, of which

87%

was consumedexclu-

sive-or operationsduringtemplatematching.Connected

components analysisand layoutanalysis required about

10

CPU secondsapiece.Resegmentationdirty, etc.

shapesrequired anaverage ofoneCPU minute, witha

moderately large variancedue to variations in imagequal-

ity from pagepage.syntacticsemantic anal-

ysiswasperformeda Sequent;timeaveraged

2

WNIX

is a registeredtrademark

of

AT&T.

WAX

is

a

registeredtrademark

of

DigitalEquipmentCorporation

M7NrQjU.png!web

558

IEEE TRANSACTIONS

ON

PATTERNANALYSIS AND MACHINE INTELLIGENCE.

VOL.

12.

NO.

6.

JUNE

1990

1.

d4

Qf6

2.

cl

c3

3,

dS

b3

I.

a‘3

1.

d4

mf6

2.

cl

e3

3.

da

bl

1.

mrl3

Fig.A linetext containingenough impasses to inducefailure

semantic analysis. The toplineis original bitmap,after dirtfrag-

mentshavebeen deleted, sometimes erroneously

(as

in

“f”).

The bot-

tomline showstheclassifier’s topchoices, with badmatchesenclosed

in boxes.

seconds per butof course a largevariance

since several games requiredupto one CPU minuteand

afew(not averaged in) never terminated normally.

IX.

DISCUSS~ON

The

Chess

Informant

is anunusualof machine

visionthatitscontenteffectively-andefficiently-

computable semantics. The remarkablyhighcompetency

achieved by exploiting thisfactshould encouragefurther

attacks onimages withsimilarly conventional, unambig-

uous

consistency constraints.

In otherrespects,however,the

Informant

is represen-

tativeabroad class printed documents. Nonrectili-

near layout,column-line-spacing,broken,

touching, or dirty character-imagesalloccurotherdoc-

uments.thisreason,believeseveral of the

methods first applied here will be usefulachievinghigh

performance in the general case.

The layout analysis approach

is

unusual in followinga

strictlytop-down policy. Relyingaminimum of

a

priori

assumptions,ischaracterizedsequence of

refinement stepsordered

so

that the broadest statistical

support may be applied to the inference layoutparam-

eters.

So,

for example, lineorientationestimated

fromevidencedistributedthewholepage,word-

breakspacing over eachcolumn, andbaseline isdeter-

mined by consensus within entire text lines. The success

of thisstrategy,withoutrecourse to backtracking, in the

facemanifold layoutdistortions largelyto the

accuracyofskew-finding method.

We attempted to base as much

of

the error management

as possible on a single robust measure of merit. The clas-

sifier’s template matchoccasionally modified (by,

e.g., height above baseline) or combinedscore words

and ply), wastocontrolvirtuallyall later analysis.

The attempt to apply semantic interpretationto long se-

quences oftextranasubstantial intolerable run-

times due to combinatorial explosion. Withoutabasisof

consistentlygoodresults fromthe early states layout

analysis and character-shaperecognition, the effort would

have collapsed.

Astriking feature of theengineering designapro-

gression from fast heuristicsearly stages to slower

and more exhaustive algorithms at the end.For example,

itisnot untiltextbeen segmented into wordsthata

nearly-exhaustivesearchunleashedto fix broken,

touching, anddirty characters. Fullbacktracking

served for the last stagesemantic processing after sev-

eralkindsofglobalandlocal context hasbeenappliedto

pruneinterpretations to a manageable number.

ACKNOWLEDGMENT

An earlier versionof thepaperappearedas

[3].

The

character-splittingheuristicwasavariantof onedevel-

oped by

S.

Kahan. The UNIBUSinterface for the Ricoh

scanner was designed by

J.

Condonand built by W. Moy.

Its UNIXdriver waswritten by

L.

Cherry.

REFERENCES

[I]

H.

S.

Baird,skew angle

of

printed documents,” Proc. SPSE

40thSymp. HybridImaging Systems, Rochester, NY, May

121

W.

W.

BledsoeandBrowning, “Pattern recognitionreading

by machine,” in 1959Eastern JointComputer Conf., 1959.

131

H.

S.

Baird and K. Thompson,“Reading chess,” Proc.IEEE

Comput. Soc.Workshop Computer Vision. Miami Beach,Nov.

30-Dec. 2, 1987.

[4] J.

H.

Condon and K. Thompson, “Belle,” Chess Skillin Man and

Machine, P. Frey, Ed.

[5] Chess Informant. vols. 29, 30, 33. Beograd,Yugoslavia:Sahovski

Informator.1980- 1982.

[6] The Chicago Manual of

Style.

13th ed. Chicago,

1L:

University of

ChicagoPress, 1982,

[7]

J.

J.

Hull.

and

S.

N.

Srihai-i, “Experiments textrecognition with

1987,pp. 21-24.

New York: Springer-Verlag, 1982.-

binary n-gramand Viterbi analysis,”

IEEE

Trans.Pattern Anal. MA-

chineIntell.

,

vol.PAMI-4,520-530,Sept.1982.

S.

Kahan,

T.

Pavlidis,and

H.

S.

Baird,recognition

of

printed

characters

of

anyand size,” IEEE Trans.Pattern Anal. Machine

Intell.,

vol.PAMI-9,no. 2,

pp.

274-288,Mar.1987.

H.

0.

Kida,

0.

Iwaki. and

K.

Kawada. “Document recognitionsys-

tem for office automation.” in Proc. 8th

Inr.

Con$ Pattern Recog-

nition. Paris. France. 1986.

pp.

446-448.

E.

Meynieux,

S.

Seisen, and

K.

Tombre, “Bilevelinformation cod-

ing in office paper documents,” in Proc. 8th

Int.

Con$

PatternRec-

ognition, Paris, France, Oct.1986, pp. 442-445.

W.

S.

Rosenbaumand J. J. Hilliard, “Multifont OCRpostprocessing

system,”

IBM

J.

Res.

Develop.,

vol.

19,

pp.

398-421,July 1975.

H. F. Schantz, The History of OCR, Recognition Technologies Users

Assoc.,1982.

S.

N. Srihari,

J.

J. Hull,andR.Choudhari,“Integrating diverse

knowledge sources textrecognition.” ACM Trans.

Ofice

Inform.

S~sr.,

vol.

I,

pp.

68-87, Jan. 1983.

R. Shinghal, and G. T. Toussaint,“Experiments in textrecognition

with themodified Viterbi algorithm,” IEEE Trans. Pattern Anal. Ma-

chine lnte/l., vol. PAMI-I, pp.184-193, Apr.1979.

S.

N. Srihari and

G.

W.

Zack, “Documentimageanalysis,” in Proc.

8th

Int.

Con$

Pattern Recognition, Paris, France,Oct. 1986,

pp.

434-

436.

1.

R. Ullman, “A binaryn-gramtechnique for automaticcorrection

of substitution,deletion,insertion, andreversal errorswords,”

Comput.

J.,

vol. 20,

pp.

141-147, 1977.

7jaIZji.png!web

BAIRD AND

THOMPSON:

READING

CHESS

559

Henry

S.

Baird

(M’80)

received the Ph.D. de-

gree fromPrincetonUniversity. Princeton, NJ.

He is aMember

of

theTechnicalatthe

Computing Science Research Center,AT&T Bell

Laboratories,Hill,

NJ.

His research fo-

cuses on algorithms for machine vision.em-

phasisontheinterpretation

of

images of printed

documents.1976 he gave the first complete de-

scription

of

the sweep-linealgorithm,

a

funda-

mentaltechnique

in

computational geometryand

an essential engineering component

in

the analy-

sis

of

VLSI mask artwork.

Dr. Baird is anAssociate Editor of IEEE T-PAMI, and

a

member

EditorialBoard of

Patferri

Recognition

journal.serve

as

Program

Chair

of

the 1990IAPR Workshop on Syntactic andStructuralPatternRec-

ognition. Histhesisalgorithms for imagematching won

a

1984

ACMDistinguishedDissertationAwardandwaspublishedtheMIT

Press.a member

of

theAssociation for ComputingMachineryand is

active

in

the IAPR.

Ken

Thompson

wasborn in New Orleans, LA,

in 1943. Hereceivedthe

B.S.

anddegrees

in electrical engineering fromtheUniversity

of

Califomia, Berkeley.

In 1966 he joinedthe Computing Science Re-

search Center

of

Bell Laboratorieswhere he has

workeduntilthe present. Hewas involved in Bell

Laboratories’ participationthe Multicsproject.

In 1975, hereturned to Berkeley to teach Com-

puterSciencefor

a

year.He is one ofprincipal

designers

of

the UNIX time-sharingsystem.

also

one

of

the principal designers

of

the former World Computer Chess

Champion,“Belle.

Mr. Thompson

is

amember of theNationalAcademy of Science and

the NationalAcademy of Engineering.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK