Reading Chess (1990)
source link: https://www.tuicool.com/articles/hit/MFfM32F
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
552
IEEE
TRANSACTIONS ON
PATTERN
ANALYSIS AND MACHINE INTELLIGENCE.
VOL.
12.
NO. 6.
JUNE
1990
Reading
Chess
Absrracr-In anapplicationofsemanticanalysistoimagesof ex-
tendedpassagesof text, severalvolumesofachessencyclopediahave
been readwith high accuracy. Althoughcarefullyproofread,the hooks
were poorlyprintedand poseda severechallengeto conventionalpage
layoutanalysisandcharacter-recognitionmethods.experimental
pagereadersystemcarriedoutstrictlytop-downlayoutanalysisfor
identificationof columns, lines,words,and characters. This proceeded
rapidlyreliablythankstoarecently-developedskew-estimation
technique.Resegmentation of broken,touching,anddirtycharacters
washandledan efficient and integrated mannera heuristicsearch
operating on isolated words. analyzingthesyntaxofgame descrip-
tions and applyingtherulesof chess, theerrorratewasreduceda
factor of
30
fromwhatwasachievablethroughshapeanalysis alone.
Ofthegameswith notypographical errors, 98% havebeen assigneda
legalinterpretation,foraneffectivesuccessrateof 99.995% on ap-
proximatelyonemillioncharacters (2850 games,945 pages).dis-
cuss severalcomputervisionsystems-integrationissuessuggested by
thisexperience.
Index
Terms-Character recognition,documentimage anal-
ysis,analysis,semantics.
I.
INTRODUCTION
Epresent anengineering case studyofa complete
computersystemexhibiting unusuallyhigh
competencya complex application, partthrough the
use of computablesemantics. The system is a page reader
and the application is the extraction of chess games from
the
Chess
Informant
[5], a series of volumes describing
games oftheoretical interest.
Theare poorly printed, anda severe chal-
lengeconventionallayout analysisand character rec-
ognition methods.Novelfeatures
of
our experimental
page reader include a strictlytop-downlayout analysis ap-
proachthatworksrapidly andreliably, largely owing to
arecently-developed technique for estimatingthealign-
ment angle of blockstext. Preliminaryclassification is
performed before attemptinglocate baselines or parti-
tion linesintowords.Broken,touching, anddirty char-
actersare handledanintegrated mannershape-
guidedheuristic operating on isolatedwords.
The games are extracted usingacombinationlayout
andsyntactic rules.Eachgamesegmented intomoves
(50
onaverage), and commentary recognizedand
stripped
off.
Eachmoveconsists of number and
two ply (half-moves), and eachdescribedthree
Manuscriptreceived May
23,
1988; revised October
12,
1989. Rec-
The authors are with
AT&T
Bell Laboratories,
600
Mountain Avenue,
IEEE
Log
Number9034827.
ommended
for
acceptance by
C.
Y.
Suen.
Murray Hill, NJ 07974.
108.’
(R
76/a)
A
62
Luzem (01) 1982
1.
d4
Q€6 2. c4 e6 3. 433 c5
4.
d5
e&
5.
cd5 d6 6. Qc3 g6 7. g3 pg7 8. Ag2
0-0
9.
0-0 Qa6!?
10.
h3
[lo.
e4 &g4=]
Qc7
[RR
10.
..
Ee8!?
11.
4.f4 Qc7 12.
a4
Qe4
13.
Rcl
b5!
14. Bel Rb8
15.
ad2
g5!
16. Qde4 g€417.ab5
f5
18. Qd2 fg3
19.
fg3
@g5
20. Qfl
Qb5F
Csom
-
Sub%
Biile Herculane1982; 11. Qd2!?]
11.
e4
Fig.
I.
A
sample
of
text
from
the
Clless
Infurinant
(shown
1.55
X
actual
size). This isa game’s headerand opening moves, including twopar-
enthetical comments movenumber
IO.
KORTCHNOI
-
TRlXGOV
charactersonaverage(seeFig.1).applying knowl-
edgethe rulesof chess (the “semantics”), each move
is checked for legalitydirectly in the contextbuilt up by
prior movesand indirectlythrough the consistency of later
moves.the interpretations of movessuggested by shape
analysis are inadequate,alternatives are generated, again
by invoking the rulesof chess. Even thoughthis semantic
analysisisfully-backtracking,intolerableruntimesare
preventedhighquality of theshape recognition
overall andaroughly uniformdistribution of serious mis-
takes.
Offirst 142 games, two were flawed typograph-
ical(editorial or typesetting mistakes), and the se-
mantic analysisfailed,assortedreasons,onthree
more. The resulting
98%
of games correct
impliesthat
99.99% of themoves,and 99.995% of thecharacters,
wereinterpreted correctly. Without semanticanalysis,
only
40%
of the gameshave hadlegal interpre-
tation,eventhecharacter recognitionrate due to
shapeanalysis alone is reasonablyhigh (99.5
%).
Prior approachesthe detection and correctionof er-
rorstext images have operated on
isolatedwords
from
natural languages, typically by means ofdictionarylook-
ups (e.g.,
[2]
[l
I]
1131) and character n-gram frequency
analysis (e.g., 1141
[I61
[7]). The lackeffectiveand
efficient algorithmstheanalysis
of
natural language
haslong frustratedthe natural desire to extendthis to sen-
tences.comparative tractabilityof semantics
of
chessoffered anopportunityexperimenton
extended
sequences of text,
oftenconsisting ofhundredsof char-
acters,makecomplete games.
The exercise had two cooperatingpurposes:the first au-
thor’sprincipal motive was to experiment with contextual
reasoning in imageanalysis andthesecond author wished
0162-8828/90/0600-0552$01
.OO
0
1990
IEEE
BAIRD AND
THOMPSON:
READING
CHESS
553
to bring
a
large fraction
Informant
onlinesup-
port
a
variety
of
studiescomputer chess, among them
theenrichmenttheopeningBelle,
a
chess-
playing machine [4].
Section I1 describes the booksthe
Chess
Informant
series.Pageanalysis is discussed Section 111.
Classification,word segmentation, andbaseline-finding
are discussed together
in
Section IV. Amethod cope
with broken,touching, anddirtyshapesdiscussed in
Section V. Extracting games andmovesusinglayout ge-
ometryandsyntax is described SectionUseofthe
rules of chess to detect and correct classification errors is
described in Section VII.Performance statistics are sum-
marized in Section VIII. Systems engineeringaspects are
discussed, finally, Section IX.
11. THE CHESS
INFORMANT
The
Chess
Informant
[5] is
a
series
of
volumes describ-
ing chessgamesselected by aninternationaleditorial
board fortheir theoretical interest. Over 40 volumes have
appeared,continue to appeartheabout two
a year.Eachvolumecontains a description ofabout 800
games, withdetailed commentary (along indexes,
rankings lists, etc). The gamedescriptions areavail-
ablecomputer-legible form.
An excerpt from
a
typical game descriptionshown in
Fig.1. Eachgame is introduceda headerthe
gamenumber, twocodesclassifyingthe opening play, the
names of the players, the location of the tournament, and
the year.
A game descriptionconsistsof
a
list of numberedmoves
with commentary.Somecommentsare briefremarks
usingspecial symbols,such
as
!?,
meaning
“a
move de-
serving attention,” or
f,
meaning“whitehas theupper
hand.” Othercommentstheformlistsalter-
native moves,enclosedsquarebrackets
[
-].
The series is published in Beograd,Yugoslavia. The
textwassetusing letterpresstechniques(handset lead
type)over
200
distinct letters(see Fig.
2),
of
10
point typesize.
These include complete alphanumeric fonts in both ro-
manand boldface,chesspiececharacters, letters modified
with diacriticaland
a
setspecial symbols used
for commentary. Lines of text are set solid: that is, spaced
verticallycloselypossiblethepoint size. The
column-breakisnarrow,lines are not horizontally
aligned across it.
The qualityof the paper andimpressionvariable
from volume to volumefrompage.Uneven
inkingproduced
a
wideof character densities
andslipping type has producedgross shape distortionsand
proximityeffects (Fig.
3).
Otherdefects, includingsurface dirt, ink smearing, and
nonrectilinear layout
of
columnslines, are discussed
later. Althoughthe
Informant’s
print quality is often worse
thanthatmostbooksand journals, it dif-
fers frommoredegree than in defects.
ABCDEFGHIJKLMNOPQRSTUV
WXYZ
abcdefghi]
klmnopqrstuvwxyr
0123456789
ABCDEFGHIJKLMNOPQRSTLVWXY
2
~bcdcl~hiJklmoopqr~tuvwxyz
0123466789
abcd
123
’
7”S()[]/-
--=
>
<
-
r
1,
A
ec
AC
e
1618160
Us
$
i
ACRS
ftbPH**
#t
+
f
F
f
f
=
=
I
x€l0*4-/0@3=3iI\~/o
tions
Flg
2
Most
of
the distinct letter shapes occurringthegame descrip-
Fig.
3.
Distortions due
to
uneven inking, poor impressions, and broken or
tilted type.
111.
LAYOUT
ANALYSIS
Each pagewasscanned on
a
Ricohdocument
scanner at
a
resolution
of
300 linesper inch (12
lines/”), producing
a
binary image
of
about
8.5
Mbits,
of
whichtypically 800
000
areblack. Maximal subsets of
%connectedblackpixels (‘‘blobs”)are typically
3000
per page. Blobsthat are manifestly too large to be a
10-point character(or
a
string
of
connectedcharacters)
are ignored.
We have found itnecessaryto examine blobs in
detail in orderanalyzepagelayoutintocolumns and
lines (forotherapproaches,see
[lo],
[9]). We simply view
eachblob
as
a boundingrectangle (Fig. 4). Throughout
this analysis, weproceedstrictly top-down,from coarser
partitionsof the page area tofiner ones (for
a
surveyand
analysisothermethods, see [15]). The motive for this
is thatat eachstep we canmakedecisions baseda
maximumofstatistical support andwithaminimum of
a
priori
assumptions.
The first step is to determinetheangle
of
the page
as
a
whole.This isby arecently-developedtech-
nique
[
11 that is fast and accurate to
a
resolutionof
2
min-
utes
of
arc. The pagecorrected for “pseudo-
rotating”theboxes; each is translated
so
thatset
of
theircenters rotate,the bitmap withineachbox
rotated.This workswell practicesinceordinary
care
a
personcan place
a
document on
a
scanner within
3”
of
vertical,,andasmallrotation haslittle effect
on characterbitmaps; however, it haslarge effects on page
layout analysis, as wewill see.
In
orderthelocation of columns, each
boxis widenedslightly by
a
fixed multiple to encourage
overlappingadjacent boxes, projected vertically
to giveone-dimensional distribution (Fig. 5); the wid-
ening is govemed the assumption that
a
column-break
is atleastwide
as
threeem-spaces.Thisdistribution
exhibitsnumber
of
prominentplateauswhich are
easilylocated by thresholding. The threshold is chosen to
554
IEEE
TRANSACTIONS ON PATTERNANALYSIS AND MACHlNEINTELLIGENCE.
VOL.
12,
NO.
6.
JUNE
1990
$4
4
88'
0w.w
P.
JOOLT
"Wm
"onma
z"wm
"g=u
DPcmCsD
Fig.
6.
Schematic,exaggeratedillustration ofaffine sheardistortion of
column.layout.
&L%%
2==$%
waw
n--
Fig. 4. An image
page,beforecorrection.columns of
a
full page are on theleft,and part of
the
next page appears
on
the right.
Connected components are represented
by
their bounding rectangles.
/I
Fig.
5.
The vertical projection profile
Of
the
bounding
shown
in
Fig,
7,
Schematic, exaggeratedillustrations
of
laid
at
differ.
entskew angles.
Fig. 4, aftercorrection.
be8thpercentile value ofhistogramofvalues in
the distribution. Duetothe narrownessof thecolumn-
break,thistechnique wouldfail
if
the skewestimation
were in error by aslittle as 1.0" of arc. This problem is
compoundedaprinting defect:
it
is possible for the
columns to bedistortedby anshearasas
0.5"
(illustratedschematically in Fig. 6). Shearcorrec-
tionusing thebestalignmentangle discovered among
roughly-vertical angles effectivelysolvesthis problem.
Beforeproceeding to findin each column,
necessary to recomputetheangle for each, due
secondunexpectedprinting defect:eventhe col-
umnsshow no affine shear, it is possible for each to
adifferent angle,as much as
0.6"
apart (illustratedsche-
matically in Fig.
7).
After recorrecting eachcolumnfor
skew,ispossible to locatelines of textanalyzingthe
horizontalprojection (Fig. 8). Here athreshold is
used,theassumptionlineconsists
leasttwo characters. Breakdecisions are lesscriticalthan
for columns, andthis technique has workedwell on
varietyof text.However,where line-spacing isit
could fail if theestimate of skew angle were
off
by as little
as
0.3"
of arc.
Thecontentseachlinethen sortedleftto right,
and, still manipulatingonly theboxesblobs, characters
are tentativelyidentified,collectionsofmore
blobs. The ruleused to combinepair ofblobs (or,
recursively,characters)overlap verticallymore
than
70%
ofwidthof either.This test
is
performed
also ata12" slant,the typicalitalicinclination angle.
This works common fragmented characters and
diacriticalmarks in thetext, butnot perfect, since
it canoverruled later.
In
fact
it
succeeds onall but one
c
Fig.
8.
The horizontalprojection profile of
a
column
of
text,
after
skew
correction.
of the
172
special characters theLatinalphabet enum-
erated in the
Chicago
Manual
of
Style
[6].
IV.
CLASSIFICATION,
BASELINES,
AND
WORDS
Beforeattemptingto locate baselines or words, we per-
formapreliminaryclassificationof characters. The
methodused is templatematching.
To
matchaof
characters,thecenterstheir bounding boxes are
aligned,and abitwise exclusive-or computed. The num-
bernonzerothe result scaled by thetotal area
of
both pattemssubtracted from
1
.O,
givingamatch
scorerange [0,1], with 1.0 indicatingaperfect
match.Thissimple implementationofalong-estab-
lished technique
[
121.
Template-matchingselectedsince,fewer
than
250
distinct lettershapes,there wasrequirement
_-
I-
555
BAIRD AND
THOMPSON:
READING
CHESS
for amultiple-fontclassifier (but, see
[8]).
Also, many
characters suffered from fragmentation, to whichtemplate
matching
is
less vulnerable than topology-basedfeature-
analysis methods.
We manually collected over 1300charactersamples,
about
7
per class on average,each labeleda correct
baseline location. Classificationan unknownshapeoc-
curredessentially exhaustive matchingthis data-
base, yieldinga list of classesdecreasing order of match
score, whichwastruncatedwhen scores fellbelow
0.9
of
theAmanually-selectedthreshold
(0.75)
was used
to distinguishgood frommatches.
Eachdeterminesbaselinelocation,the
dominantbaseline-height of each textline waschosen as
themedianof theset baseline-heights foreachchar-
acter’s firstgood match.Thistechniquefastandhas
provento be accurate and insensitivecommonly-
occurring problemsincludingpoorcharactersegmenta-
tionand dirt. If skew-correctionhasless accurate, a
morecomplex two-dimensional fitting procedurewould
havebeen required.
Eachcharacter’sscore ismodifiedre-
ducingit to the extent that the height-above-baseline dif-
fersfromthe meanthat assuming aGaussian
distributionwithstandardoftraining sam-
ples. This final score thusdepends primarilyupon
similarity of shape, andsecondarily on verticalplacement
in the line. application, baseline displacements as
small asthe x-heightoffontcan becrucial,
sincedistinguishesbetweensquare brackets
‘[’
and
‘I’
and similarshapes such as ‘I’ and
‘J’.
The brackets
are importantlexicalbreak characters used to extract
games and ignore commentary.
Text lines are partitioned into wordsexamining the
statistics of intercharacterspacing withineach column. A
histogramof intercharacter distances usually possesses a
bimodal distributionwhose minimumagood
threshold for distinguishing word- from character-spaces.
In practice, thehistogramstend
to
besparse andbadly-
behaved. By quantizingcoarsely(by 1
/6
of anem-space),
andrequiringthresholdwithinanarrowrange
([0.2,
0.51
em), theresults are usually good.
Most residual word-segmentation problems arecaused
by characters with extremekerningproperties, suchas pe-
riods andwide chess-piececharacters.
No
attempt was
made to exploit special knowledgekerning. Results
were sufficiently good to servean importantfunction dur-
ingthe “clean-up” processing described next. The final
syntacticand semanticanalysisdoesdepend onword-
breaks at all.
Graphics such as board diagrams were eitherdiscarded
earlier as too large or appearasline dominated by
very bad matches; these lines are detected and ignored.
V.
FRAGMENTS,
SMEARING,
AND
DIRT
Therethreeways in the character-finding
heuristic describedabovecanfail: 1) a character canbe
fragmented;
2)
two or more characters cantouch;
(a)
(b)
(C)
Fig.
9.
Defects causing problemsthat are resolved after preliminary clas-
sification:
(a)
touching characters;
(b)
broken characters;and (c) dirt
fragments.
strayink marks,paperdefects, or dirt can be mistaken for
a character. Of course, combinationthese can oc-
cur together.Exampleseachshown in Fig.
9.
In
virtuallyevery instance,these problemsa bad
match.
We observed that it was never necessaryto reach across
a word-breaksuch a problem.averagelength
ofaword is three characters, andonly
7.4%
ofwords on
averagehave at least one character witha bad match.This
suggested astrategy for correcting thesedefects
ifiedway by a nearly-exhaustive examination of altema-
tives generatedwithin the boundsofindividual words. An
outline of the heuristic is as follows:
for
each badword
(7.4%)
{
repeat
{
try merging sets of adjacent characters (0.3
%)
for
eachbad character
{
try splittinginto two (or more)
(0.9%)
if
character is dirt,
then
delete it
(0.6%)
,
1
if
wordisolated dirt,
then
delete it
(0.2%)
1
until
word is deleted or not improved
1
Improvement in is defined as an increase in a com-
positeword-match score, computed
as
the average of the
character-matchscores, weighted by Abouthalfof
thewordsareimprovedheuristic. The per-
centagesshown for eachstatement are for triesthat suc-
ceedimprovingthecharacter or expressedas
fractionsthe total number, good
and
bad.
A
characterisconsidered to be
dirt
if
it
has a lowmatch
score
(
<0.30) anda small area(smaller than about two
periods). Aword is
isolateddirt
if itconsistsof one dirt
character. These rulescatch almost all cases, failingwhen
the dirt is attachedcharacter (Fig.
10)
or lies
so
close
to the baselineresemblesperiod. Deleting dirt is
delayedeverything else hastried,sincecan-
not be undone.
In anattempt to recombinefragmented characters, an
exhaustive listofstrings adjacentcharacters
is
gener-
ated. Eachisanchoredparticularcharacter,
and all strings are generated thatreach no furtherthan
0.5
em fromthe anchor, and are no more than
1.1
em wide.
When splitting characters, amodificationofKahan’s
method
[8]
is usedto purpose a smallnumber of candidate
splittingpoints.
Althoughruntimeofheuristicpotentially expo-
556
IEEE
TRANSACTIONS ON PATTERN ANALYSIS MACHINE INTELLIGENCE.VOL.
12.
NO.
6.
JUNE
1990
s
Fig.
10.
Dirt fragmenttouching a character.
nential in thelength of the word,the frequency of
bad words and the short average lengthofwordsholds it
to about
1/7
ofthe costperformingthe preliminary
classification, onaverage (with a large variance).
VI.
EXTRACTING
GAMES
AND
MOVES
The result the preceding stages computationa
hierarchical analysis ofthepage into columns, lines,
words,characters,eachcharacter owning
of interpretations.sample of agalleyproofformat of
this datastructure
is
shown in
Fig.
11,
showing onlythe
topchoice interpretationof eachcharacter.
A
more de-
tailed format, including eachcharacter’sbounding box
andinterpretation list,shown in Fig.
12.
The results on
a sequence of(in detail) are concatenatedand passed
to the stages thatapply chess knowledge.
The text from thecharacter recognizerexamined for
chess semanticsthe goal of decoding the game played
alongwith players,event name and whowon. The anal-
ysis is performedsequentialphases asdescribedbelow.
There is no feedback between phases.
Isolating thegame.
Eachlineexaminedclassi-
fied as potentially beingone of thethreetitlelines of a
game. The first line (index) is leftandright justified with
a largespacethemiddle. The left partnumbers
while the rightpart contains parentheses. The second line
(players) is centered witha hyphen or dash. The third line
(event) is centered.Whenevertwo or threeconsecutive
linesare classified into potentialtitle lines, the previously
collected lines are analyzed as a complete game and sub-
sequent lines are collected for the next game.
Stripping Comments.
Commentsdelimited by
[
*
]
brackets. For analysis brackets are matched
in pairs.Whenpairbrackets are found thefollowing
possibilities are recognized.
[...
]
The
[
is accepted as real.
[
*
[
If thecharacterscores ofbothbrackets
areabove a threshold, the intervening
text isexamined by
reversetypesetting
(de-
scribedbelow) for the best potential
].
The
[
andinvented
]
areacceptedasreal.Oth-
erwise, the better
[
is acceptedand the other is
rejected as dirt.
I...
1
By analogywith
[
*
[.
The beginningof the game treated as ifit contains a
perfect
],
while theendcontains aperfect
[.
The above
COLUMN
1
1.1
lop
108:
(R
76ia)
A
62
1.2
1%
KORTCllNOl- TRINCOV
1.3
lop
Luzern
(01)
1982
1.4
lop
I.
d4 Qf6
2.
c4 e6
3
QO
c5 4. d5ed5
1.5
lop
5
cd5
d6 6.
&3
g6
7.
g3
Pg7
8.
Pg2
1.6
lop
1.7
lop
1.8
Io0
a4 Re4
13.
Hcl bS!
14
Eel
Hb8
15.
Rd2
W
9
0-4
&6!?
10.
h3
[IO.
e4 Pg4=]
&7
[RR
10.
. .
He8!?
11.
Qf4
Oc7
12.
1.9
16
@!
76. Qdeigf4
17.
abTf5
IF,
Qd2
fg3
1.10
lop
19.
fg3
eg5
20.
Qfl
Ob5F
Csom
-
Sub&
1.11
lop
B&le
Herculane
1982;
11.
bd2!”]
11.
e4
Fig.
I
I.
Galleyproofformat. used
for
manual proofreading,
for
the game
fragmentshown in Fig.
1,
Eachlabeledthepoint size.
1.06011.5531
1.1231
1.6511
1
9 0.90
1.13711.617i
1.16311.6471
1
per
0.95
1.2201 1.5401
1.50011.6771
0
\O
1.545n
1.2201 1.5501
1.28711.6431
1
0 0.93
1.29711.6001
1.42711.6201
1
dsh
0.94
1.4331
1.5501 1.5001
1.6431
1
0
0.93
1.56011.5401 1.9631
1.6771
0
\O
1.636”
1.56011.54011.6831
1.6671
1
chN
0.90
1.703i 1.5771
1.7701
1.6431 4
a
0.90
n
0.85
U
0.84
s
0.83
1.7801
1.54711.8431
1.6471
1
6
0.91
1.8701 1.54711.8871
1.6431 2
ex
0.90
1
0.81
1.9171
1.54711.9631
1.647i
1
qm
0.92
3.0331
1.5401a.2031
1.6771
o
\o
i.9ogn
2.0331
1.5431
2.08711.6431
4
1
0.91
I
0.83
1-sl
0.83 i 0.82
2.1001 1.547i
2.16311.6431
1
0
0.95
2.17311.6171
2.20311.6431
1
per
0.92
2.2601
1.54012.4131
1.677i
0
\O
1.545n
2.2601
1.54312.3431
1.6471 2
h
0.94
b
0.89
2.3471
1.5471
2.41311.6471
1
3
0.89
Fig.
12.
Detailedlayout descriptionmoves
9
and
10
of the game shown
in Figs.
1
and
11.
Thefour coordinates on theleft describe bounding
rectangles. The next number
is
a code
n
:
if
n
=
0,
thenthis is a layout
feature,as words(labeled
“\O”)
or
textlines(not shown);
if
n
>
0.
then itis a character possessing
n
interpretations.Eachinterpretation
consists of a letter-name and a match merit (e.g., “chN
0.90”
is
knight withmerit
0.9).
algorithm accepts, rejects, andinventsbrackets
SO
that
they are balanced.Accepted bracketsand theirenclosed
text are discardedascommentary.Thiseliminates about
50%
of the characters from further analysis.is goodat
locating brackets that were typesetbut are unrecognizable
due to printing defects, butnot at allgoodat supplying
bracketswereomittedtheprinter. The con-
tainslots of cluesfor realmissing brackets,suchas re-
peatedmove numbers,changetype face, commentary,
etc; noneofthis used.
Sketchingthe move sequence.
The text is examined for
move numberssequence. The rigorous enough
so
thatany numbers less than almost perfect are deferred
untila later phase.
Interpreting thegameresult.
The next afterthe last
move numberexaminedreverse typesetting for one
of: 1-0 (whitewins),
0-1
(black wins), or
l/2-1/2
(draw).
Completing the move sequence.
Missingnumbers in the
move sequence are filledby reverse typesetting. Pos-
siblymissing numbers betweenthelast backbone number
andtheresult are similarly filled in.
Preparsing the
moves.
The text between movenum-
bers is examined for twosyntactically-correct chess ply,
one for WhiteoneBlack.Candidateas-
sembled fromthealternatives offeredby theclassifier.
Correct formsinclude Piece-Letter-Digit (PLD),
LLD,PLLD,PDLD,or castle. Apiece one
of
I3
43
a
6,
a letter
abcdefgh,
12345678,
and
castleeither
0-0
or
0-0-0.
Commentarycharacters
may occuraftereachply,outsidebrackets;theseare
BAIRD AND
THOMPSON:
READING
CHESS
551
recognizedand discarded.one or morecorrectcandi-
dates are found for
both
ply,thenare recordedand
thetext is neverinterpretedWhenthere is more
than onecandidate, theysorted decreasingon the
productof the match scores (i.e., classifier’s top choices
first). These
preparsed
moves comprise about 98
%
of the
total moves.
Reverse typesetting
is theuseof the boundingrectangle
of characters to prune listsof matches. Both heightand
width ofboxes are compared, butnot height above base-
line.is never requiredwhen pruning listsfromthe clas-
sifier,buthelpfulwhen scanning for missedbrackets
or eliminatingwild choices made themove generator.
VJI.
CORRECTING
MOVES USING
SEMANTICS
Preparsedmovesinterpretedthe currentchess
context and, if legal,themove madeandmatch
continues. If thepreparsedmoveillegalthematch is
terminated.themovepreparsed,have
reached an
impasse,
andalllegalnextmoves fromthe
currentboardposition are generated andprunedusing re-
verse typesetting. The potentiallyexponentialruntimeof
this exhaustivestepmanaged takingproduct
of
thescores
of
the matchedmovesandpruningbelowa
threshold.matchesabove threshold aremade and
matching continues adepth-first,best-match-firstman-
ner. The longest matchtaken as the game score.
It isconceivable that somegame interpretationcould
be legalandstilldifferfromthe clear intentionof
the
ed-
itor-but suchforced interpretationshave not been ob-
served.example of the resultsof semantic analysis are
shown in Fig.
13.
The textthatrepresents theplayers and eventare
matched against (separate)dictionariesnames.larg-
estsubstring algorithm is usedto find the closest name in
the dictionary. The scoresformatches recorded
on an error file and examined by hand for additions to the
dictionaries. The dictionaries are initiallycreatedfrom the
indexthe
Informant.
This is highlyredundantbetween
issueswithonly about
5%
new entries pervolume after a
two volumestartup.
VIII.
RESULTS
Among the142 games,two wererejected the
systemfortypographicalerrors.Threeothers could not
be corrected thesemanticanalysis.rest, 98% of
the typo-free (correctly-printed) games, were accepted by
thesemanticanalysis,andhave beenproofreadhand
toconfirmthe interpretation.
No
forcedinterpretations
(substitution errors)werefound.game interpretations
were selected fromthose thatcould beassembledfrom
the top three classifier choices (withsyntactic nose-
mantic analysis) this fractionfallsto
76%.
Interpretation
by shapealone (usingonlyclassifier’s topchoices)
would haveyielded 40
%
of the games completely correct,
for an error factor of
30
worse thanthatachieved
by semantic analysis.
One of the games thatcouldfullycorrected had
/
person: TRINGOV (Trinqov
-0)
/
person:KORTCIINOI (KOrtchnOi
-2)
white:Kortchnoi
black: Tringov
/
event: Luzernlol)1982 (Lurcrn
(01)
1982
-0)
event: Lurern (01) 1982
result: 1-0
a4
Nf6e6 Nf3d5 e:d5
Nc3
q6
93
%37
%32
0-0
0-0
Nab
e4
Nd7Bf4Qe7
Re1
f6
Be3
b5
f4 b4 NA~ Nb5 Rc1
Re8
Nf3 Of8 Bf2
Bb7
BflNc7 Nd4
Kh8
Nc6 RA8 f5
e5
d:e5
Of3 N:b4 O:b? Nd3
B:d3
Q:g3r
-2
Q:d3Rcdl Ob5
b4
Race
Qd5
0.36
e6
Qd3
e?
A6
QC6
class/
004
108.*lR76/a)A62
N:CSN:CS
B:CS
of7
~d6
~:d5
0c4
Of6 f:e5
w5
Fig.
13.
Results ofthe syntactic and semantic analysis
for
the gameopen-
ingshown in Figs.
1,
11.
and
12.
Notethatmovenumbersand com-
mentaryhavebeenstripped
off
and a player’snamehasbeen corrected.
Theopeningclassification
“(R76/a)A62”
has been recomputed.
garbled characters atvery end, wheretherewas no
later context from which backtrack.Anothertoo
many consecutiveimpasses, atthe outset ofa game, lead-
ing tounmanageably largesearchspace.line of
text involved
is
illustrated in Fig.
14.
The top line the
bitmap(aftercleaning updirt), andthelinethe
top-choiceinterpretationsclassifier. Bad matches
are markedwith surrounding boxes.
For 99.4% of the atone candidate suggested
by the classifier was syntactically correct. Of these, the
topchoice wassemantically legal 98.7% time.
About
0.5%
of
thewere associatedwith impasses,
and thecorrect rate amongaftersemantic anal-
ysis was 99.99%. Since ply are madeabout threechar-
acterson average,the effective per-character success rate
is probably better than 99.995
%
.
We have no direct mea-
surementthe uncorrected character recognition rate,
and itisdifficult to it unbiased way since,
among other problems,about half
of
the characters are
ignored as commentary.
We havecontinued reading-withoutmanuallyproof-
readingevery game-and after four volumes(945 pages,
2850
games,
2
176
865
characters)performancehold-
ing steady, legal interpretationsfound for over 97%
of the gamesfree from typographical errors.Since about
amillion of thesecharacters in commentary,the
semantic analysisappliedabout amillion charac-
tersforan effectiverateof successful interpretationof
99.995%.
The system waswrittenentirely the
C
programming
languageunder variousresearch editionsthe
UNIX@operating system. Runtime for the image analysis
phases averaged
10
CPU minutespagea DEC
VAX@ 11/8550, of which
87%
was consumedexclu-
sive-or operationsduringtemplatematching.Connected
components analysisand layoutanalysis required about
10
CPU secondsapiece.Resegmentationdirty, etc.
shapesrequired anaverage ofoneCPU minute, witha
moderately large variancedue to variations in imagequal-
ity from pagepage.syntacticsemantic anal-
ysiswasperformeda Sequent;timeaveraged
2
WNIX
is a registeredtrademark
of
AT&T.
WAX
is
a
registeredtrademark
of
DigitalEquipmentCorporation
558
IEEE TRANSACTIONS
ON
PATTERNANALYSIS AND MACHINE INTELLIGENCE.
VOL.
12.
NO.
6.
JUNE
1990
1.
d4
Qf6
2.
cl
c3
3,
dS
b3
I.
a‘3
1.
d4
mf6
2.
cl
e3
3.
da
bl
1.
mrl3
Fig.A linetext containingenough impasses to inducefailure
semantic analysis. The toplineis original bitmap,after dirtfrag-
mentshavebeen deleted, sometimes erroneously
(as
in
“f”).
The bot-
tomline showstheclassifier’s topchoices, with badmatchesenclosed
in boxes.
seconds per butof course a largevariance
since several games requiredupto one CPU minuteand
afew(not averaged in) never terminated normally.
IX.
DISCUSS~ON
The
Chess
Informant
is anunusualof machine
visionthatitscontenteffectively-andefficiently-
computable semantics. The remarkablyhighcompetency
achieved by exploiting thisfactshould encouragefurther
attacks onimages withsimilarly conventional, unambig-
uous
consistency constraints.
In otherrespects,however,the
Informant
is represen-
tativeabroad class printed documents. Nonrectili-
near layout,column-line-spacing,broken,
touching, or dirty character-imagesalloccurotherdoc-
uments.thisreason,believeseveral of the
methods first applied here will be usefulachievinghigh
performance in the general case.
The layout analysis approach
is
unusual in followinga
strictlytop-down policy. Relyingaminimum of
a
priori
assumptions,ischaracterizedsequence of
refinement stepsordered
so
that the broadest statistical
support may be applied to the inference layoutparam-
eters.
So,
for example, lineorientationestimated
fromevidencedistributedthewholepage,word-
breakspacing over eachcolumn, andbaseline isdeter-
mined by consensus within entire text lines. The success
of thisstrategy,withoutrecourse to backtracking, in the
facemanifold layoutdistortions largelyto the
accuracyofskew-finding method.
We attempted to base as much
of
the error management
as possible on a single robust measure of merit. The clas-
sifier’s template matchoccasionally modified (by,
e.g., height above baseline) or combinedscore words
and ply), wastocontrolvirtuallyall later analysis.
The attempt to apply semantic interpretationto long se-
quences oftextranasubstantial intolerable run-
times due to combinatorial explosion. Withoutabasisof
consistentlygoodresults fromthe early states layout
analysis and character-shaperecognition, the effort would
have collapsed.
Astriking feature of theengineering designapro-
gression from fast heuristicsearly stages to slower
and more exhaustive algorithms at the end.For example,
itisnot untiltextbeen segmented into wordsthata
nearly-exhaustivesearchunleashedto fix broken,
touching, anddirty characters. Fullbacktracking
served for the last stagesemantic processing after sev-
eralkindsofglobalandlocal context hasbeenappliedto
pruneinterpretations to a manageable number.
ACKNOWLEDGMENT
An earlier versionof thepaperappearedas
[3].
The
character-splittingheuristicwasavariantof onedevel-
oped by
S.
Kahan. The UNIBUSinterface for the Ricoh
scanner was designed by
J.
Condonand built by W. Moy.
Its UNIXdriver waswritten by
L.
Cherry.
REFERENCES
[I]
H.
S.
Baird,skew angle
of
printed documents,” Proc. SPSE
40thSymp. HybridImaging Systems, Rochester, NY, May
121
W.
W.
BledsoeandBrowning, “Pattern recognitionreading
by machine,” in 1959Eastern JointComputer Conf., 1959.
131
H.
S.
Baird and K. Thompson,“Reading chess,” Proc.IEEE
Comput. Soc.Workshop Computer Vision. Miami Beach,Nov.
30-Dec. 2, 1987.
[4] J.
H.
Condon and K. Thompson, “Belle,” Chess Skillin Man and
Machine, P. Frey, Ed.
[5] Chess Informant. vols. 29, 30, 33. Beograd,Yugoslavia:Sahovski
Informator.1980- 1982.
[6] The Chicago Manual of
Style.
13th ed. Chicago,
1L:
University of
ChicagoPress, 1982,
[7]
J.
J.
Hull.
and
S.
N.
Srihai-i, “Experiments textrecognition with
1987,pp. 21-24.
New York: Springer-Verlag, 1982.-
binary n-gramand Viterbi analysis,”
IEEE
Trans.Pattern Anal. MA-
chineIntell.
,
vol.PAMI-4,520-530,Sept.1982.
S.
Kahan,
T.
Pavlidis,and
H.
S.
Baird,recognition
of
printed
characters
of
anyand size,” IEEE Trans.Pattern Anal. Machine
Intell.,
vol.PAMI-9,no. 2,
pp.
274-288,Mar.1987.
H.
0.
Kida,
0.
Iwaki. and
K.
Kawada. “Document recognitionsys-
tem for office automation.” in Proc. 8th
Inr.
Con$ Pattern Recog-
nition. Paris. France. 1986.
pp.
446-448.
E.
Meynieux,
S.
Seisen, and
K.
Tombre, “Bilevelinformation cod-
ing in office paper documents,” in Proc. 8th
Int.
Con$
PatternRec-
ognition, Paris, France, Oct.1986, pp. 442-445.
W.
S.
Rosenbaumand J. J. Hilliard, “Multifont OCRpostprocessing
system,”
IBM
J.
Res.
Develop.,
vol.
19,
pp.
398-421,July 1975.
H. F. Schantz, The History of OCR, Recognition Technologies Users
Assoc.,1982.
S.
N. Srihari,
J.
J. Hull,andR.Choudhari,“Integrating diverse
knowledge sources textrecognition.” ACM Trans.
Ofice
Inform.
S~sr.,
vol.
I,
pp.
68-87, Jan. 1983.
R. Shinghal, and G. T. Toussaint,“Experiments in textrecognition
with themodified Viterbi algorithm,” IEEE Trans. Pattern Anal. Ma-
chine lnte/l., vol. PAMI-I, pp.184-193, Apr.1979.
S.
N. Srihari and
G.
W.
Zack, “Documentimageanalysis,” in Proc.
8th
Int.
Con$
Pattern Recognition, Paris, France,Oct. 1986,
pp.
434-
436.
1.
R. Ullman, “A binaryn-gramtechnique for automaticcorrection
of substitution,deletion,insertion, andreversal errorswords,”
Comput.
J.,
vol. 20,
pp.
141-147, 1977.
BAIRD AND
THOMPSON:
READING
CHESS
559
Henry
S.
Baird
(M’80)
received the Ph.D. de-
gree fromPrincetonUniversity. Princeton, NJ.
He is aMember
of
theTechnicalatthe
Computing Science Research Center,AT&T Bell
Laboratories,Hill,
NJ.
His research fo-
cuses on algorithms for machine vision.em-
phasisontheinterpretation
of
images of printed
documents.1976 he gave the first complete de-
scription
of
the sweep-linealgorithm,
a
funda-
mentaltechnique
in
computational geometryand
an essential engineering component
in
the analy-
sis
of
VLSI mask artwork.
Dr. Baird is anAssociate Editor of IEEE T-PAMI, and
a
member
EditorialBoard of
Patferri
Recognition
journal.serve
as
Program
Chair
of
the 1990IAPR Workshop on Syntactic andStructuralPatternRec-
ognition. Histhesisalgorithms for imagematching won
a
1984
ACMDistinguishedDissertationAwardandwaspublishedtheMIT
Press.a member
of
theAssociation for ComputingMachineryand is
active
in
the IAPR.
Ken
Thompson
wasborn in New Orleans, LA,
in 1943. Hereceivedthe
B.S.
anddegrees
in electrical engineering fromtheUniversity
of
Califomia, Berkeley.
In 1966 he joinedthe Computing Science Re-
search Center
of
Bell Laboratorieswhere he has
workeduntilthe present. Hewas involved in Bell
Laboratories’ participationthe Multicsproject.
In 1975, hereturned to Berkeley to teach Com-
puterSciencefor
a
year.He is one ofprincipal
designers
of
the UNIX time-sharingsystem.
also
one
of
the principal designers
of
the former World Computer Chess
Champion,“Belle.
”
Mr. Thompson
is
amember of theNationalAcademy of Science and
the NationalAcademy of Engineering.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK