Development of Appropriate Test Markings
for Optical Scan Voting Machines

Phase 2: Analysis of Ballot Markings
NIST Contract SB1341-10-SE-0745
Revision of January 11, 2012, 7:00 AM
by
Mitch Trachtenberg

1. OVERVIEW

The National Institute of Standards and Technology has been asked to develop a standard set of reference markings representative of the types of marks that voters make on each common type of optical scan / marksense ballot.

As the first step in that development process, during September and December 2010, I scanned over 300 000 ballots at three elections offices. In this second step, I have analyzed the markings on these ballots, primarily the markings at vote targets, but also additional marks which were entered on the ballots after they were printed.

Each of the elections offices used a different type of printed vote target area.

Vancouver, WA used Hart * ballots, which use a heavily marked rectangular box to designate the vote target. The voter is asked to fully fill the box.

[* Certain commercial equipment, instruments, or materials are identified in this paper in order to specify procedures and data adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.]

Everett, WA used Dominion/Sequoia ballots, which use a “broken arrow” to designate the vote target. The voter is asked to draw a line connecting the two halves of the broken arrow.

Champaign, IL used ES&S ballots, which use an oval to designate the vote target. The voter is asked to fill the oval.

Image pics/RectangularMarks.ps.png

Figure 1: Hart ballot style

Image pics/BrokenArrow.ps.png

Figure 2: Dominion/Sequoia ballot style

Image pics/OvalMarks.ps.png

Figure 3: ES&S ballot style

2. ISOLATION

Vote targets were cropped out of each ballot and initially categorized as voted or non-voted. Most subsequent attention was devoted to the voted category, though we checked to ensure that non-voted targets were indeed uniformly blank.

Ballots were also grouped by layout, aligned, and overlaid, such that for every pixel location of the ballots of a set, an individual image could be built using the darkest pixel value of that location. The effect is equivalent to taking a set of overhead transparencies, laying them one over another, and viewing them with a white backing underneath. The resulting image shows the material printed on the ballots together with any marks entered by voters on any ballot of the set.
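The overlay step can be sketched as a per-pixel minimum across aligned images. The following is a minimal illustration, assuming the images are already aligned, share the same dimensions, and are held as nested lists of grayscale intensities from 0 to 255. The function name `darkest_pixel_composite` is hypothetical and is not taken from the software used in this study.

```python
def darkest_pixel_composite(images):
    """Return an image holding, at each location, the darkest (lowest)
    value found at that location in any of the aligned input images."""
    if not images:
        raise ValueError("at least one image is required")
    height, width = len(images[0]), len(images[0][0])
    # Start from a white page and darken it wherever any ballot is darker.
    composite = [[255] * width for _ in range(height)]
    for image in images:
        for y in range(height):
            row = image[y]
            for x in range(width):
                if row[x] < composite[y][x]:
                    composite[y][x] = row[x]
    return composite


# Two tiny 2 x 3 "ballots": one marked at lower left, one at lower right.
ballot_a = [[255, 255, 255], [40, 255, 255]]
ballot_b = [[255, 200, 255], [255, 255, 10]]
composite = darkest_pixel_composite([ballot_a, ballot_b])
```

Every mark from either input survives in the composite, which is why a single stray mark on one ballot remains visible however many clean ballots are overlaid.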

3. APPROACH TO ANALYZING THE VOTE TARGETS

For each mark, we determined the average red, green, and blue intensities of the pixels in and around the target areas. We also determined the maximum vertical and horizontal extents of the darkened areas. For the oval and rectangular targets, we also measured the number of light-to-dark transitions in each of several single-pixel horizontal stripes and several single-pixel vertical stripes. By combining the patterns from these passes with vertical and horizontal span information, we were able to locate scribbled markings of various types, as well as x’s extending beyond the vote target.

Due to dust accumulating on the scanner glass, some of the images of the vote targets contain streaking. The impact of this streaking, which is not believed to affect the statistics generated, is discussed in section 4.10.

A question arose as to whether any colors were dropped out by the Kodak i4200 scanners we used. Logically, this should not be possible, as the scanners are designed to reproduce color images faithfully. We confirmed with Kodak that no color is dropped out on the i4200. We have not checked with the vendors of ballot-specific equipment to determine whether their scanning devices drop out certain colors.

4. VANCOUVER / BOXES

4.1 Procedure

Black rectangular vote targets were isolated from approximately 90 000 ballots which had been scanned at 300 dots per inch, and placed in cropped images 66 pixels tall by 117 pixels wide. Within these crops, printed vote targets were 52 pixels tall by 99 pixels wide, with an interior approximately 32 pixels tall by 73 pixels wide.

The crops were initially divided into voted, nonvoted, and ambiguous groups. Two criteria were used: intensity of the cropped image including the printed target itself and a surrounding margin, and total number of pixels darkened below half intensity. Both these tests were performed on intensity values averaged over the red, green, and blue individual intensities.

Using an intensity scale of 0 to 255, a crop was considered possibly voted if the average intensity over all pixels in the crop was below 155, or if the number of pixels with intensity beneath 128 exceeded 3300. If both conditions were satisfied, the crop was considered to represent a voted mark.
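The two-test rule above can be sketched as follows. This is an illustration only: it assumes a crop is supplied as a flat list of (red, green, blue) tuples, and the function name and return labels are hypothetical. Note that the "voted" crops are a subset of the "possibly voted" crops; the sketch reports the stronger label when both tests pass.

```python
def classify_crop(pixels, mean_threshold=155, dark_level=128,
                  dark_count_threshold=3300):
    """Classify a crop given as a flat list of (r, g, b) tuples, 0-255 scale.

    Returns "voted" when both tests pass, "possibly voted" when exactly
    one passes, and "not voted" when neither passes.
    """
    # Each pixel's intensity is the average of its three channels.
    intensities = [(r + g + b) / 3 for r, g, b in pixels]
    mean_intensity = sum(intensities) / len(intensities)
    dark_pixels = sum(1 for value in intensities if value < dark_level)

    passes_mean = mean_intensity < mean_threshold
    passes_count = dark_pixels > dark_count_threshold
    if passes_mean and passes_count:
        return "voted"
    if passes_mean or passes_count:
        return "possibly voted"
    return "not voted"


CROP_PIXELS = 66 * 117  # crop size reported in section 4.1
WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# Synthetic crops for illustration; real crops also contain the printed box.
blank_crop = [WHITE] * CROP_PIXELS
filled_crop = [BLACK] * 4000 + [WHITE] * (CROP_PIXELS - 4000)
gray_crop = [(150, 150, 150)] * CROP_PIXELS  # dim overall, no dark pixels
```

The uniformly gray crop passes only the average-intensity test, which mirrors how scanner streaking could push an unvoted target into the possibly voted set.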

Using these tests, 1 794 942 targets were considered possibly voted. These were further characterized.

513 000 of the targets which did not pass either test for “possibly voted” were searched for any interior pixel in a 50 × 15 pixel central region with combined red, green, and blue intensity values averaging below 192. 940 such targets were discovered, and are addressed in a later section.

The possibly voted crops were filtered based on the pixel offset into the crop at which the printed target box appeared; 1 066 092 were selected for further characterization because their starting x offsets and starting y offsets each fell between 4 and 10 pixels into their crop. (Basic results were confirmed not to vary substantially between the larger set and the smaller set.)

Of these, 14 112 (1.3 %) were found to have no pixels in the lower two average intensity quartiles and were inspected and removed to a separate table; none were found to have votes. The image groups are provided on a supporting disk; the following image is one of seventy and shows the first 200 such images.

Image pics/artifacts/artifacts_00.ps.png

Figure 4: Artifacts


This left 1 051 980 marks in the voted and well-centered group. The primary reason for the 14 112 ambiguous marks was that scanner streaking had lowered their average intensities into the potentially voted range, which had intentionally been set high to avoid missing any potentially voted marks.

4.2 Pixel counts within intensity quartiles

A 29 × 70 interior rectangle of each voted target was examined. More than half (592 024) of the voted targets had zero interior pixels in the highest intensity quartile; three quarters (757 122) had fewer than ten pixels remaining in that quartile; more than 9/10 had fewer than 100 pixels in that quartile.
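As an illustration of the counting involved, the sketch below bins interior pixels into four intensity quartiles. It assumes the quartiles are equal quarters of the 0 to 255 scale (0 to 63, 64 to 127, 128 to 191, 192 to 255), which the report does not state explicitly; the function name is hypothetical.

```python
def quartile_counts(intensities):
    """Count pixels in each quarter of the 0-255 intensity scale.

    Index 0 is the darkest quartile (0-63); index 3 is the lightest
    (192-255). Equal-width quartiles are an assumption of this sketch.
    """
    counts = [0, 0, 0, 0]
    for value in intensities:
        counts[min(int(value) // 64, 3)] += 1
    return counts


INTERIOR_PIXELS = 29 * 70  # interior rectangle examined in this section

# A synthetic interior: mostly dark fill with a few lighter pixels left.
interior = [20] * 1900 + [100] * 80 + [150] * 45 + [250] * 5
counts = quartile_counts(interior)
```

In this synthetic example five pixels remain in the highest intensity quartile, so the mark would fall in the "fewer than ten pixels" group described above.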

Two different sets of graphs are provided. The second set removes the extremes so that the mid-range is highlighted.

Image pics/markpixeldistribution/quartilesmontageaxisA.ps.png

Figure 5: Pixel counts within intensity quartiles

Image pics/markpixeldistribution/quartilesmontageaxisB.ps.png

Figure 6: Pixel counts within intensity quartiles, expanded y axis

Over the range where 200 to 1200 pixels have been darkened into the lowest two quartiles, the mark count is roughly flat: about as many marks have 200 darkened pixels as have 500 or 1000. As the total number of darkened pixels rises within this range, pixels tend to leave the second quartile and move to the darkest quartile.

Image pics/markpixeldistribution/quartiles7.ps.png

Figure 7: Pixel shift between quartiles

4.3 Color and Tint

Red, green, and blue average values were collected for horizontal lines across the center of each mark. To report tint statistics, these were converted to equivalent values in the HSV (hue, saturation, value) color space. Red intensity peaks somewhat higher than blue and green intensity.
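The conversion can be illustrated with Python's standard `colorsys` module, which reports hue, saturation, and value on a 0 to 1 scale, the same scale as the hue values quoted in this section. The sample channel values below are hypothetical and are not measurements from the ballots.

```python
import colorsys


def mean_rgb_to_hsv(red, green, blue):
    """Convert mean 0-255 channel intensities to HSV, each in 0..1."""
    return colorsys.rgb_to_hsv(red / 255.0, green / 255.0, blue / 255.0)


# Hypothetical mean channel values for a mark made in blue ink.
hue, saturation, value = mean_rgb_to_hsv(40, 60, 160)

# Pure blue sits at a hue of 2/3 on this scale.
pure_blue_hue = mean_rgb_to_hsv(0, 0, 255)[0]
```

The hypothetical blue ink lands at a hue of roughly 0.64, close to the small blue peak reported near 0.65.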

The following graphs indicate distributions of marks based on hue, saturation, and value, followed by red, green, and blue mean intensities. The hue graph shows a small peak representing blue marks at a hue value of 0.65. This hue graph also shows a small peak at a hue value of approximately 0.22. Following these graphs are examples of marks at hue values of 0 to 0.01, 0.2 to 0.25, and 0.65 to 0.67.

Image pics/hsv/hsvmontageaxisA.ps.png

Figure 8: HSV values

Image pics/hsv/hsvmontageaxisB.ps.png

Figure 9: RGB values

Image pics/colors/hue0to1_00.ps.png

Figure 10: Marks with hue value near zero

Image pics/colors/hue20to25_00.ps.png

Figure 11: Marks with hue between 0.2 and 0.25

Image pics/colors/hue65to67_00.ps.png

Figure 12: Marks with hue between 0.65 and 0.67


A small number of red marks were detected by searching using two different tests. First, a search was made for any marks with a red mean intensity greater than 152 and also greater than the average of green and blue intensity by at least 30. (Different values were tried until a set that excluded most darker brown marks was found.) The result, containing fewer than 200 marks, follows:

Image pics/colors/lightred_00.ps.png

Figure 13: Marks with red hue and light intensity †

[† Note that the yellow highlighter visible over an inked "x" in Figure 13 was added by a voter, and pink highlighter was used as the sole marking implement by one voter.]
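The first red-mark test described above can be sketched as follows. The thresholds are those given in the text; the function name and the sample values are hypothetical.

```python
def is_light_red(red_mean, green_mean, blue_mean, min_red=152, min_excess=30):
    """Apply the two-part RGB test to a mark's mean channel intensities.

    A mark passes when its red mean exceeds 152 and also exceeds the
    average of its green and blue means by at least 30.
    """
    other_mean = (green_mean + blue_mean) / 2
    return red_mean > min_red and (red_mean - other_mean) >= min_excess


light_red_mark = is_light_red(200, 120, 130)   # bright and clearly reddish
dark_brown_mark = is_light_red(140, 90, 80)    # reddish but too dark
neutral_mark = is_light_red(180, 170, 165)     # light but not reddish
```

The red floor of 152 is what excludes darker brown marks, whose red channel still leads the others but at a lower overall intensity.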


A test using HSV values was also run and turned up fewer results.

Although it may appear from the graphs that certain hues never occur, when the marks are filtered for saturation or “value” values greater than 0.2 and then divided by hue into 100 groups, every group is represented. The ratio between the most common hue (0.01 to 0.02) and the least common (0.39 to 0.40 for value > 0.2, 0.47 to 0.48 for saturation > 0.2) is greater than 3000 to 1.

4.4 Transitions

To provide an additional dimension by which marks can be grouped, the marks were examined for light to dark intensity transitions along three horizontal and five vertical “cut lines.”

These intensity transitions were defined as a drop to an intensity of 184 or below following an intensity that had been above 208, and serve as a proxy for individual well separated strokes.
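One way to count such transitions along a single cut line is a two-state scan, sketched below. The thresholds come from the definition above. The sketch assumes a drop is counted only after the line has first been seen above 208, so a cut line that is dark from its first pixel contributes no transition; the report does not say how that case was handled, and the function name is hypothetical.

```python
def count_transitions(cut_line, light_above=208, dark_at_or_below=184):
    """Count light-to-dark transitions along one single-pixel cut line."""
    transitions = 0
    seen_light = False  # assumption: must exceed 208 before a drop counts
    for value in cut_line:
        if value > light_above:
            seen_light = True
        elif value <= dark_at_or_below and seen_light:
            transitions += 1
            seen_light = False  # wait for the line to return to light
    return transitions


single_stroke = [250, 250, 60, 60, 250, 250]
two_strokes = [250, 60, 250, 250, 60, 250]
gray_wobble = [250, 200, 190, 200, 250]  # never drops to 184 or below
solid_fill = [60] * 6                     # never light, so nothing counted
```

The gap between the two thresholds acts as hysteresis: small fluctuations in a gray region do not register as separate strokes.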

Generally, when a transition count of 1 or 2 occurs at both a vertical and a horizontal “cut line,” the voted target contains one or two sharp strokes, as in a check or an X. Higher transition counts generally indicate scribbling consisting of lines separated from one another by enough space to allow intervening pixels to return to a very light shade.

The following series of graphs shows how the transition count pattern varies with the number of pixels in the lightest quartiles, pointing to a higher likelihood that particular intensities represent scribbles as opposed to X marks and check marks.

Image pics/tcount/tcount0.ps.png

Figure 14: Transition counts, 1500 to 2000 white pixels

Image pics/tcount/tcount1.ps.png

Figure 15: Transition counts, 1000 to 1500 white pixels

Image pics/tcount/tcount2.ps.png

Figure 16: Transition counts, 500 to 1000 white pixels

Image pics/tcount/tcount3.ps.png

Figure 17: Transition counts, 300 to 500 white pixels

Image pics/tcount/tcount4.ps.png

Figure 18: Transition counts, 100 to 300 white pixels

Image pics/tcount/tcount5.ps.png

Figure 19: Transition counts, < 100 white pixels

The following pages present samples of marks with differing transition counts along a horizontal line centered vertically on the mark.

Image pics/h2tcount_highestpixgt500/tcount0_00.ps.png

Figure 20: Transition count 0

Image pics/h2tcount_highestpixgt500/tcount1_00.ps.png

Figure 21: Transition count 1

Image pics/h2tcount_highestpixgt500/tcount2_00.ps.png

Figure 22: Transition count 2

Image pics/h2tcount_highestpixgt500/tcountgt2_00.ps.png

Figure 23: Transition count > 2

Transition counts along various lines can be combined in queries to isolate particular patterns. For example, here is the result of requiring zero transition counts at the top, bottom, left and right, while requiring nonzero transition counts at the center.

Image pics/h2tcount_highestpixgt500/tcount01000100_00.ps.png

Figure 24: Vertical, horizontal transition counts used to isolate marks

4.5 Stroke style

To get a sense of stroke form and direction, a sample of marks was taken from among those with 1600 to 1610 pixels in the darkest two quartiles. This sample was taken in the hope that these are filled out with similar stroking to the far more commonplace marks which have greater coverage.

An examination of these marks shows that about three quarters can reasonably be characterized as having a particular orientation: vertical strokes, horizontal strokes, forward leaning strokes, backward leaning strokes, and circular strokes. Categories blend, as many strokes are actually elliptical curlicues oriented strongly in a direction. Where there was no strong “winning direction,” the mark was categorized as random.

The following sampling gives the approximate percentages of each category noted in a manual examination.

•  Vertical 60 (20 %)
•  Horizontal 88 (30 %)
•  Forward leaning 50 (15 %)
•  Backward leaning 5 (1 %)
•  Circular 11 (3 %)
•  Even 6 (1 %)
•  Random 97 (30 %)
•  Total categorized: 317

Horizontal strokes are the most common, with vertical and forward leaning strokes somewhat less common. Other stroke types are rare.
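The rounded percentages in the list can be recomputed from the counts, as in the short sketch below. The counts are those listed above; the variable names are illustrative only.

```python
stroke_counts = {
    "Vertical": 60,
    "Horizontal": 88,
    "Forward leaning": 50,
    "Backward leaning": 5,
    "Circular": 11,
    "Even": 6,
    "Random": 97,
}

total = sum(stroke_counts.values())
# Share of the categorized sample in each category, in percent.
shares = {name: 100 * count / total for name, count in stroke_counts.items()}

for name, share in shares.items():
    print(f"{name:17s} {stroke_counts[name]:3d}  {share:5.1f} %")
```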

4.6 Lightest marks

The initial pass captured all marks with an average intensity below 155.

Those with no pixels whose average intensity fell in the lower two quartiles were removed. (For unknown reasons, 13 remain. Nine are uniform gray and four uniform pink.)

Image pics/remainingzeros.ps.png

Approximately 14 000 images were pulled from the main set because they did not have pixels in the darker two quartiles. These marks were generally thought to contain streaking and, indeed, between a third and a half did have substantial streaking. These 14 000 marks are on 70 mosaics that will be provided on a supporting disk.

The artifact marks were individually examined for votes and 53 possible votes were found on 39 different ballot sides. These possible votes represent 0.4 % of the artifacts and only 0.006 % of the initially screened, centered targets. A reasonable conclusion is that it is extremely rare for targets with overall intensity in the normally voted range to contain no interior pixels in the darkest two quartiles.

The possible votes follow:

Image pics/artifacts/votemontage.ps.png

Figure 25: Possible votes containing no pixels darkened to bottom half of intensity range


The “closest” marks to those in the artifacts collection are the 59 which contain fewer than 10 pixels in the darkest two quartiles, while having registered initial crop intensities between 152 and 155. At least nine of these can reasonably be interpreted as having been voted, 13 have been contacted by a vote mark placed elsewhere, 16 have specks which can reasonably be characterized as hesitation marks, one has a red dirt pattern, and somewhere between five and ten look like erasures which turned into smudges and/or roughened the paper, causing some darkened pixels.

4.7 Marks not Captured in the “Possibly Voted” Set

After the “possibly voted” marks were identified, 513 000 of the targets which did not pass either test for “possibly voted” were searched for any interior pixel in a 50 × 15 pixel central region with combined red, green, and blue intensity values averaging below 192. 940 such targets were discovered, about 40 % of which had an overall intensity of less than 158.3, and 60 % of which had overall intensity of 158.3 and above. For the targets with lower overall intensity, approximately 30 % had marks which could be construed as votes, giving roughly 110 votes. For the targets in the higher intensity range, at most 3 % (20) could be construed as votes. Of the 513 000 targets examined, these 130 represent 0.025 %. The actual frequency will be above 0.05 % as the tested area covered less than half of the interior of the vote targets. Furthermore, there are likely to be more “unvoted” targets in the typical ballot than “possibly voted” targets, and other frequencies are specified for targets in the “possibly voted” set.
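The arithmetic behind these estimates can be retraced as follows. The inputs are the figures quoted in the paragraph above and in section 4.1; the variable names are illustrative, and the rounding to 110, 20, and 130 is the report's own.

```python
examined = 513_000   # targets that failed both "possibly voted" tests
flagged = 940        # of those, targets with a dark central pixel

low_intensity = 0.40 * flagged    # overall intensity below 158.3
high_intensity = 0.60 * flagged   # overall intensity 158.3 and above

votes_low = 0.30 * low_intensity        # about 113; reported as roughly 110
votes_high_max = 0.03 * high_intensity  # about 17; reported as at most 20

reported_total = 130
frequency_pct = 100 * reported_total / examined

# The searched region covers under half of the target interior, which is
# why the report treats the true frequency as at least double.
searched_fraction = (50 * 15) / (32 * 73)
```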

Image pics/van_specks/specks_intgt4750_00.jpg.ps.png

Figure 26: Targets with intensities above 158.3 having individual interior pixel intensities beneath 192.

Image pics/van_specks/specks_intlt4750_00.jpg.ps.png

Figure 27: Targets with intensities above 155 and below 158.3 having individual interior pixel intensities beneath 192.

4.8 Identifying borderline votes

The existence of some apparent votes mixed into the intensity range which primarily contains specks, hesitation marks, and unmarked targets indicates that measurement of overall target intensity, while sufficient for capturing more than 99.9 % of votes, will leave some votes behind. It may only be possible to distinguish the remaining votes from the background noise of hesitation marks by analyzing the location and shape of the mark within the target.

It is possible that the scanners actually used in vote counting may become dirty in the same way ours did as we scanned. In our target collection, streaks can be identified by observing similar patterns of pixel darkening above and below the target (and, in an unvoted target, in the interior of the target). It may be worth investigating the impact of the streak-filter feature in altering the generated images, to determine the impact of streak removal on borderline voted regions.

In some of the artifact images, it is difficult to determine from the target image alone whether the mark represents a vote. It can be useful to compare the area immediately outside the target with the area within, to determine whether slight darkening represents a voter’s lightly shading the target or just a darkened ballot background. However, comparison against the entire ballot and the other marks of the ballot would be useful.

About 5 % of the images in the artifacts group had marks touching or very near to an unvoted mark. A few such images are listed here:

013013_1651_0856_V_a
186795_0212_1824_V_A
136144_0940_1271_V_A
169678_0196_3429_V_A
068755_0914_1897_V_A
019113_0929_3473_V_A
002000_0925_1265_V_A
185811_0196_1824_V_A (missed highlighter)
164733_1664_0791_V_a

Some of the images in the artifacts group are difficult to characterize; among other things, they may represent roughness caused by erasures. A list of some unusually marked artifacts follows:

054588_0191_0590_V_A
077812_1654_3715_V_A (rough?)
065564_1649_3548_V_A
196999_0211_0589_V_A
152614_0208_3576_V_A
016318_0920_2034_V_A
124271_0941_3085_V_A
164083_0935_1273_V_A
187781_0208_0591_V_A (footprint?)
067782_0915_2959_V_A
194127_0927_2667_V_A (drip?)
075885_1661_2923_V_A (mottled red stain?)
160645_0206_3574_V_A (footprint?)
168298_0184_0587_V_A
068580_0934_2584_V_A (vertical line not streak)
103145_0194_0713_V_A (pinkish cast)
193325_0209_0591_V_A (rough)
172950_0199_3570_V_A (rough)
055407_0932_3034_V_A (rough? marked?)
184353_0199_0591_V_A (rough?)
183950_0204_0592_V_A (rough? marked?)
175308_0192_0831_V_A (light fingerprint)
194675_0203_3693_V_A (vlg, erasure?)
028505_0197_0590_V_A (vlg, erasure?)
075005_1660_2034_V_A (red ar)
159587_0201_0592_V_A (footprint)
151971_0194_0591_V_A (spec)
169121_0204_0596_V_A (footprint)
067782_0915_2010_V_A (bluish pattern)
065861_0191_0586_V_A (footprint?)
173397_0198_2495_V_A (bluish)
182634_1653_0856_V_A (light droplet)
020199_1649_3730_V_A (bluish mark)
174747_0202_1099_V_A (erasure)
173687_1652_2164_V_A (light drop)
174662_0190_2494_V_A (erasure)
057670_1648_0854_V_A (haze)
06741_0193_2625_V_A
089278_0918_1343_V_A
174245_0193_0836_V_A (bluish wide stroked ve)
067776_0198_2555_V_A
186977_1655_0789_V_A (erased)
067782_0915_2575_V_A (blue pattern)
135811_0198_3486_V_a (highlighter, nonvote treated as vote)
198852_1652_1658_V_A (brown wash lightening target ??)
186473_0200_1756_V_A

4.9 Marks in non-target regions

Composite images were generated from 10 000 ballot sides to examine ballot marks not associated with vote targets. Separate composites were generated for each different bar code found on the ballots. These composite images are provided on disk and an example follows:

Image pics/composites/10000300200048.ps.jpg

Figure 28: Composite showing marks in non-target regions


Essentially all composites show write-in entries. Write-ins may be well above the write-in line and are sometimes level with the words “write-in” but rarely go above the words “write-in.” Some voters include their preferred candidate’s party affiliation with their write-ins.

Almost all composites show some crossout over vote targets. Crossouts may be single strokes or thick scribbles. Crossouts are generally drawn through the vote target but are sometimes drawn through the name of the candidate.

Most composites show some bleed through where marks made on the opposite side appear on the observed side; the ballots are all laid out such that the bleed through does not interfere with the vote target columns.

A summary of items found on examined composites follows. When a particular item was found on several composites, it was only noted the first several times. The numbers are those of the bar codes on the upper left of the ballots going into a composite image:

10000100100009 writeins, X through name
10000010200021 writeins, X through votes, single stroke through name, circled vote op
10000010100014 writeins, X through votes
10000020100035 circle around measure description, writeins, x's through voteops
10000020200042 X through votes, erasure name crossout single stroke
10000030100056 X through votes
10000030200063 X through votes, single stroke name crossout
10000040100077 spill in instructions, X through votes, horiz stroke crossout
10000040200084 stray line at top left
10000050100098 redo stamp long multivote crossouts
10000050200008 bleed through, write ins, x's through votes
10000070100043 blue outside vote box
10000070200050 scribbles through contests and near vote areas
10000080100064 x's through vote areas
10000080200071 x's through vote areas, heavy candidate crossout
10000090100085 valid ballot stamp
10000090200092 x's through vote areas, arrows trail into candidates
10000100100009 candidate crossout
10000100200016 heavy stray marks in third column
10000110100030 scribbles in governor area
10000110200037 heavy x crossout over complete contests
10000120100051 valid ballot stamp, circled initial, write in into margin
10000120200058
10000130200079 scribble at top
10000140200003 blue bleed through, candidate line out
10000150200024
10000160200045 question mark in candidate area, lines to side of candidates,
heavy crossout of candidate
10000170200066 heavy crossout in writein area
10000180200087
10000190200011 slashes across candidate areas, cut out in margin
10000200100025 signed at top, stamped at bottom, wine stain?
10000200200032 heavy crossout in writein, circled digit 2
10000210200053
10000220200074
10000230200095 long tails on check marks from vote boxes
10000240200019
10000250200040
10000260200061
10000270200082
10000280200006
10000290200027
10000300100041 note NO written in with oval
10000300200048 heavy cross out through both vote box and writein,
x's through unmarked writeins
10000310200069
10000320200090
10000330200014
10000340200035
10000350200056
10000360200077
10000370200098 large check tail extends out of contest
10000380200022
10000390200043 brown speck in third column
10000400200064 arrow pointing to candidate, second column
10000410200085
10000420200009 street address third column
10000430100023 NOTE signed upper left with address and ssn
10000430200030
10000440200051 cross out into margin impacting digits of bar code
10000450200072 horizontal line into left lower margin near bar code
10000460200093 ok initials, question mark
10000470200017
10000480200038
10000490200059
10000500200080 brown stains right margin
10000510200004 (picked up pink sheet over ballot)
10000520200025 scribble in third column beneath valid area
10000530200046 large explanation below contest, x's between vote marks
10000540200067 vote target crossout into right column
10000550200088
10000560200012 magic markered question mark
10000570200033
10000580200054 bleed through into numbers of left margin
10000590200075
10000600200096
10000610200020 cross out through vote target and candidate text,
candidate circled
10000620200041
10000630200062
10000640200083 write in into left margin
10000650200007
10000660200028 bleed through or crossout near top second column
10000670200049 contests lightly xd out
10000680200070
10000690200091
10000700200015
10000720200057 slashes through some contests
10000730200078
10000750200023 high write in
10000760200044
10000770200065
10000780100079 Note blue stain
10000780200086
10000790200010
10000800200031
10000810200052
10000820200073 blue ink in write margin, writing in contest and blank part
10000830200094 squiggle cross out, high write in, low write in
10000840200018 blue ink in right margin near numbers
10000850200039 crossout into left margin by lower numbers
10000860200060
10000870200081
10000880200005 light ink blotches in columns 1 and 2, (NO) as correction
10000890200026
10000900200047 dark mark left column at bar code
10000910200068 contests x'd
10000920200089 water/tea/coffee stains
10000930200012 initials in column 1, dark scribbled crossout
10000940200034 slashes exit columns into right margin
10000950200055 chevron in magic marker right first column?
10000960200076 torn and folded-over left margin
10000970200097 mark into left margin near bar code
(note two high writeins of same name in same handscript)
10000980200021 tall slashes through multiple contests
10000990200042
10001000200063
10001010200084
10001020200008
10001030200029 write in into left margin
10001040200050 slight mark into lower left margin
10001050200071 marks into lower left margin
10001060200092
10001070200016
10001080200037
10001090200058 squiggle crossout of check mark
10001100200079
10001110200003
10001120200024 initials first column
10001130200045
10001140200066
10001150200087 initials in left column
10001160200011
10001170200032
10001180200053
10001190100067 note signed second column, stamped
10001190200074
10001200200095
10001210200019
10001220200040 marks into lower left column
10001230200061 slash top left
10001240200082 crossout into left column
10001250200006
10001260200027 extremely heavy crossout
10001270200048 fold on top right, issue at bottom,
note purple streak through third column vote ops.

One issue that becomes apparent on examination of the Champaign composite images is the entry of “judge’s initials” into an area which may be tested by optical scan equipment to determine vote columns. Examples are in the images below:

Image pics/judges_initials.jpg.ps.png

Image pics/judges_initials2.jpg.ps.png

4.10 Scanner streaking

In order to get a sense of the impact of scanner streaking, unvoted marks from 1000 ballot sides were examined using the same technique as that used on all voted and ambiguous marks. None appeared human marked.

Within the unvoted mark interiors, a region of 29 × 70 pixels was examined. Of 22 844 marks, 22 334 had 0 pixels in the center two intensity quartiles and 510 had one or more pixels in those quartiles. Of these, 318 had more than 30 pixels in those quartiles, 247 had more than 60, and 34 had more than 80.

For unambiguous nonvotes, then, fewer than 1 % of the marks were impacted to the extent of having a 1 pixel wide vertical streak down the interior, and an additional 1 % were impacted to the extent of having streaking of two or more pixels in width. The largest impact of this streaking was to move pixels from the highest intensity quartile to the second intensity quartile. In the affected marks, the difference in intensity would have consisted of fewer than 70 of 2030 pixels being darkened, generally by no more than half intensity.

Approximately 2 % of marks may have had up to 4 % of their interior pixels darkened by anywhere from 1/4 to 1/2. Streaking might lower average interior intensity of a typical affected mark by 2 % in the worst case scenario, where the box is otherwise white along the streak. Because the vast majority of marks have more than 80 % coverage, the actual impact on the typical affected mark is likely to be no more than 0.4 % darkening, and this occurs on no more than 2 % of the marks.
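One reading of the arithmetic in this estimate is sketched below. The 80-pixel streak size, the half-intensity darkening, and the 80 % coverage figure are taken from the text; how they combine is an interpretation, and the variable names are illustrative.

```python
interior_pixels = 29 * 70   # examined interior region, 2030 pixels
streaked_pixels = 80        # upper end of the streak sizes observed

# Fraction of the interior touched by the streak (about 4 %).
streaked_fraction = streaked_pixels / interior_pixels

# Worst case: every streaked pixel was white and is darkened by one half.
worst_case_drop = streaked_fraction * 0.5

# Typical voted mark: more than 80 % of the interior is already dark,
# so at most about 20 % of the streaked pixels can be darkened further.
uncovered_fraction = 0.20
typical_drop = worst_case_drop * uncovered_fraction

print(f"worst case {100 * worst_case_drop:.1f} %, "
      f"typical {100 * typical_drop:.2f} %")
```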

5. EVERETT / BROKEN ARROWS

Everett’s Sequoia/Dominion ballots ask the voters to indicate their choices by connecting two halves of a “broken arrow” with a line. This results in much less range for variability than in the oval and rectangle targets.

The following analysis is still based on an incomplete subset of the vote database.

More than 5.1 million arrow targets were captured from the ballot sample, of which more than 1 500 000 were marked by the voter.

We measured the line heights, tilts, and colors of the marks, and isolated marks where the lines did not go all the way to the printed target. The targets remain divided into separate sets for the ballot fronts and ballot backs. There are no significant variations in statistics between the two sets; the graphs present the back data unless otherwise specified.

5.1 Line Heights

The height of each mark was measured at a series of locations across the break in the arrow. Location “b” is near the beginning of the break, location “l” is near the end, and locations “e” and “h” are nearer the center. No significant difference was noticed in the heights at the differing locations.

The most common heights were 6 pixels and 7 pixels, approximately 0.5 mm to 0.6 mm, making up 21 % and 20 % of all marks, respectively. Only 3 % of lines spanned 4 or fewer pixels, 10 % spanned 15 pixels or more, 1 % spanned 29 or more, and 0.1 % spanned 39 or more. [...] were greater than 27 pixels, and only 0.01 % of heights contained 36 pixels.
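The pixel-to-millimeter figures can be checked with a one-line conversion. The sketch assumes the Everett ballots were scanned at 300 dots per inch, the resolution stated for the Vancouver scans and the one implied by the conversions quoted in this section; the function name is hypothetical.

```python
MM_PER_INCH = 25.4


def pixels_to_mm(pixels, dpi=300):
    """Convert a length in scanned pixels to millimeters (dpi assumed)."""
    return pixels * MM_PER_INCH / dpi


height_6px_mm = pixels_to_mm(6)   # most common line height
height_7px_mm = pixels_to_mm(7)   # second most common line height
```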

Image pics/everett_statistics/thickness/thickness_back0.ps.png

Figure 31: Distribution of line heights

Image pics/everett_statistics/thickness/thickness10plus_back0.ps.png

Figure 32: Distribution of line heights, 10 pixels and greater

Image pics/everett_montages/thickness/thick02_back_00.jpg.ps.png

Figure 33: Marked lines spanning two pixels, 25 pixels into gap

Image pics/everett_montages/thickness/thick06_back_00.jpg.ps.png

Figure 34: Marked lines spanning six pixels, 25 pixels into gap

Image pics/everett_montages/thickness/thick11_back_00.jpg.ps.png

Figure 35: Marked lines spanning 11 pixels, 25 pixels into gap

Image pics/everett_montages/thickness/thick31_back_00.jpg.ps.png

Figure 36: Marked lines spanning 31 pixels, 25 pixels into gap

Image pics/everett_montages/thickness/thick51_back_00.jpg.ps.png

Figure 37: Marked lines spanning 51 or more pixels, 25 pixels into gap

5.2 Line tilt

Lines showed a tendency to move very slightly downwards, by about a single pixel, moving from left to right. Very few lines showed more than 5 pixels (0.4 mm or 1/60 inch) of tilt.

Image pics/everett_statistics/tilt/tilt0.ps.png

Figure 38: Distribution of line tilts across arrow gap

Image pics/everett_statistics/tilt/tilt1.ps.png

Figure 39: Distribution of line tilts across left half of arrow gap

Image pics/everett_statistics/tilt/tilt2.ps.png

Figure 40: Distribution of line tilts across right half of arrow gap

5.3 Line extent

Of 777 727 arrow targets marked at the center, 952 (0.12 %) were found to be unmarked at 5 pixels from the gap start, 738 (0.09 %) were found to be unmarked at 10 pixels from the gap start, and 649 (0.08 %) were found to be unmarked at 15 pixels from the gap start. 2125 marks (0.27 %) failed to reach within 5 pixels of the gap end, 1049 (0.13 %) failed to reach within 10, 708 (0.09 %) failed to reach within 15.
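The percentages above follow directly from the counts, as the short check below shows. The counts are those quoted in the paragraph; the dictionary keys are distances in pixels from the gap start or gap end, and the names are illustrative.

```python
center_marked = 777_727  # arrow targets marked at the center

unmarked_near_start = {5: 952, 10: 738, 15: 649}
short_of_end = {5: 2125, 10: 1049, 15: 708}


def percent(count):
    """Share of center-marked targets, in percent."""
    return 100 * count / center_marked


start_pct = {d: round(percent(n), 2) for d, n in unmarked_near_start.items()}
end_pct = {d: round(percent(n), 2) for d, n in short_of_end.items()}
```

At every distance, more lines stop short of the gap end than of the gap start, with the largest difference within 5 pixels of the edge.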

Image pics/everett_montages/missingleft2_back_00.jpg.ps.png

Figure 41: Lines failing to reach left edge of arrow gap

Image pics/everett_montages/missingright2_back_00.jpg.ps.png

Figure 42: Lines failing to reach right edge of arrow gap

5.4 Line intensity and color

The lines’ red, green, and blue intensities were measured across vertical test stripes, from the first darkened pixel to the last. These intensities were also used to calculate hue, saturation, and value in the HSV color model.
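The conversion from measured intensities to HSV can be illustrated with a minimal sketch. It assumes the mean red, green, and blue values of a stripe are converted in one step (the report does not say whether conversion was done per pixel or on the means), and the sample stripe values are hypothetical. The hue is returned on a 0 to 1 scale.

```python
import colorsys

def mean_rgb(pixels):
    """Average (r, g, b) over a list of 0-255 pixel triples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def stripe_hsv(pixels):
    """Convert a stripe's mean RGB to HSV, each component on a 0-1 scale.

    Assumption: the means are converted, not the individual pixels.
    """
    r, g, b = mean_rgb(pixels)
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

# Hypothetical stripe through a bluish ballpoint line.
stripe = [(40, 50, 120), (36, 46, 116), (44, 54, 124)]
h, s, v = stripe_hsv(stripe)
print(f"hue={h:.3f} saturation={s:.3f} value={v:.3f}")
```

A neutral gray or black line has saturation near zero, so its hue carries little information; hue is mainly useful for separating colored inks.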

Image pics/everett_statistics/color/rgb_mean_back0.ps.png

Figure 43: Distribution of red, green and blue mean intensity

Image pics/everett_statistics/color/hue_combined_unclipped.ps.png

Figure 44: HSV ’hue’ distribution

Image pics/everett_statistics/color/sat_mean_back0.ps.png

Figure 45: HSV ’saturation’ distribution

Image pics/everett_statistics/color/val_mean_back0.ps.png

Image pics/everett_montages/hues.ps.png

Figure 46: Lines of varying hue

5.5 Marks away from broken arrows

Composite images were generated from 30 000 ballot sides by overlaying sets of scans from ballots with the same layout codes, taking for each pixel location the darkest pixel of any ballot image in the set. The images were aligned at an upper left corner landmark and derotated to roughly align throughout the image.
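The darkest-pixel overlay can be sketched as follows. This is an illustration only: it assumes grayscale images of identical size that have already been aligned and derotated, held as nested lists, and the function name is hypothetical.

```python
def composite_darkest(images):
    """Overlay same-sized grayscale images, keeping the darkest value per pixel."""
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must share one size")
    # Lower intensity means darker, so the darkest pixel is the minimum.
    return [[min(img[y][x] for img in images) for x in range(width)]
            for y in range(height)]

# Three tiny hypothetical scans: blank, one stray mark, one voted line.
blank = [[255, 255, 255], [255, 255, 255]]
stray = [[255, 90, 255], [255, 255, 255]]
voted = [[255, 255, 255], [30, 30, 255]]
composite = composite_darkest([blank, stray, voted])
print(composite)
```

Because the minimum is taken, a mark made on any single ballot survives into the composite, which is why stray marks from one voter remain visible among thousands of overlaid sides.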

Because write-ins were not present in the scanned set of ballots, the composite images are remarkably clear of stray marks.

The voters’ marks connecting the arrow halves stay well within the arrow boundaries, almost entirely within the vertical range defined by the shaft of each arrow rather than the head.

The most serious potential problem appears to be from the folding and unfolding of the ballots, resulting in cut lines just above the entry for Brad Owen in the Lieutenant Governor race. Cut lines are also visible elsewhere along the folds.

Several voters appeared to place preliminary marks to the left of the arrows, prior to filling in the arrows.

A typical composite image follows.

Image pics/F213.ps.jpg

Figure 47: Composite of Everett ballots


Excursions outside the arrow and other unusual marks as noted are visible on the following:

F4 c2, Treas, Auditor
F7 c2, LG
F9 c1, 1000
F12 c1, 985, c2, Rep
F13 c1, 1000
F14 c1 mark right of column, exclamation points within
F15 c2, Gov
F17 c2, US Rep line
F19 c1, 985, c2, state, c2 LG
F21 c1, Pres
F22 c2, SoS
F25 marks to left of choices
F26 c2, US Rep
F27 c2, SoS
F28 c1, 985, c2, US Rep
F36 c2, USRep crossout
F37 c1, Pres, c2, Auditor
F41 c1, 1029, c2, Auditor
F47 c2, SoS, c2, Auditor
F48 c2, Auditor
F53 c2, clipped upper right
F60 c1, torn upper left
F61 c2, clipped upper right
F62 c2, Gov/LG boundary
F63 c1, right margin
F68 c2, stray mark by Treasurer
F72 c2, US Rep
F77 c2, US Rep
F78 c2, US Rep
F79 c2, SoS, Auditor
F81 c2, above USRep
F82 c1, footprint?, c2, checkmark at SoS
F83 c2, line USRep
F84 c1, 1029 crossout No
F85 c1, 1000 circle yes
F89 c2, cut at LG
F91 c1, 1000 stray mark in arrow column, misc in c2
F97 c1, brown stain left margin
F114 c1, lines, c2, lines
F119 c1, 985 checkmark, c2, brown stain in arrow tail channel
F123 c2, brown speck Gov
F124 c1, line
F132 c1, arrows, c2 arrows
F139 c2, LG
F146 c2, Gov crossout
F147 c1, 1000, c2, SoS
F155 c2, stray line LG
F157 c1, cutline, c2, USRep, blueline
F161 c1, circlesx2, c2, Gov
F162 c1, arrows, c2, arrows
F163 c2, USRep
F164 c1, 985 water damage and scribble, c2, USRep scribble, top damage
F166 c1, Pres w-i stray mark, c2, stray line
F170 c1, 985 x
F172 c2, Treas stray mark
F173 c2, brown stains
F174 c1, cutmark, c2, cutmark
F176 c1, crossout x 3
F178 c2, Gov
F181 c1, lines x 3
F183 c1, margin comment
F186 c1, smudge near top of arrow channel
F186 c2, crossed-out write-in, no write-in arrow
F190 c1, 1000 explanatory correction text
F197 c2, cutline
F200 c1, light asterisk
F201 c2, cut line at Auditor
F203 c2, cut line at LG (stopping note of this)
F207 c2, cut line at LG, stray marks beneath
F208 c1, 1000 below arrow
F213 c2, cut line at LG (noted because severe)
F219 c1, "sorry" in arrow column at Pres
F221 c2, low line at Treas
F229 c1, crossouts
F232 c1, 1029 low lines
F241 c1, c2, bluegreen dashes
F255 c2, State vertical line
F260 c2, blue and red marks
F265 c2, speck at Treas
F268 c2, USRep
F272 c2, stain at Gov
F276 c1, dashes to left of options, dash above 1000 arrow
F292 c1, scribble near top, stain, arc in many, low line in 985
F310 c1, 985 no
F318 c2, blob at halftone
F322 c1, dashes, c2, dashes
F328 c1, blue tail beneath 1029
F353 c1, 985 yes
F354 c1, tear at top, red in margin
F356 c2, curve near top touches barcode
F359 c2, Treas stray mark, Auditor low line
F360 c1, left margin stray mark
F367 c2, crossout USRep
F370 c2, several crossouts
F372 c1, tear, c2 Gov lowline, Auditor lowline
F373 c1, 985 low line
F374 c1, c2, dashes
F375 c2, dash SoS
F377 c2, Gov
F384 c1, dash and crossout, c2, Treas stain
F386 c2, torn at top
F387 c1, purple spread
F390 c2, dash
F394 c1, stray mark arrow yes 985
F395 c1, zigzag arrow 985, c2, stray mark SoS, Auditor
F398 c1, stray blue line, c2, torn at top
B8 c1, tear upper left
B9 c1, CPL line above arrow
B11 c1, SPI smudge above arrows
B13 c2, zigzag at arrow
B14 c1, right margin stray mark
B15 c1, blue line beneath arrow
B17 c1, AG Ladenburg arrow IC Adams arrow, c2, scribble near bottom

6. CHAMPAIGN / OVALS

Two sets of Champaign County ballots were examined in order to reach a sample size of 100 000 ballots. Approximately 20 000 ballots from the February 2008 election were examined separately from more than 80 000 ballots from the November 2008 election.

More than 3.9 million vote ovals were captured from the November ballot sample, of which more than 1.4 million were marked by the voter. An additional 800 000 ovals were captured from the February ballot sample.

Votes were examined by cropping regions of 87 × 60 (5220) pixels, with the printed vote target bounded by a rectangle of 72 × 30 (2160) pixels, or approximately 40 % of the crop region. Cropped regions not containing a centered vote oval have been almost entirely removed from the data, but several hundred such regions may remain in the more than 1 125 000 ovals studied. Because these represent fewer than 0.1 % of the ovals, they are not believed to pose a problem for the analysis.

With a cropped region’s pixels grouped into four intensity quartiles, the typical nonvote had in the vicinity of 4800 to 4900 pixels in the highest of the four intensity ranges, with another 150 to 200 pixels in each of the next two quartiles and fewer than 30 pixels in the lowest quartile.

The typical vote removed 1500 pixels from the top intensity quartile and increased the pixel count in the low two intensity ranges to between 1500 and 1600. For the red channel, only about 200 pixels were darkened to the lowest quartile, but for the green and blue channels approximately 1000 pixels were darkened to the lowest quartile.
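The quartile grouping can be illustrated with a short sketch. It assumes the four "intensity quartiles" are equal-width ranges of the 0 to 255 scale (0 to 63, 64 to 127, 128 to 191, 192 to 255), which the report does not state explicitly; the sample crop is synthetic.

```python
def quartile_counts(pixels):
    """Count 0-255 intensities in four equal-width ranges, darkest range first."""
    counts = [0, 0, 0, 0]
    for value in pixels:
        counts[min(value // 64, 3)] += 1
    return counts

# Synthetic 87 x 60 nonvote crop (5220 pixels): mostly white paper,
# plus the printed oval outline in the middle ranges.
nonvote = [250] * 4850 + [150] * 180 + [100] * 170 + [30] * 20
print(quartile_counts(nonvote))
```

Run per color channel, the same count gives the red, green, and blue figures quoted above.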

The following charts show the change in distribution of pixel counts as ovals are voted, first in the November ballot set and then in the February ballot set.

Image pics/cham2_plots/quartiles_summary1.ps.png

Figure 48: Distribution of pixel counts by intensity quartile, all, November

Image pics/chamfeb_plots/quartiles_summary1.ps.png

Figure 49: Distribution of pixel counts by intensity quartile, all, February

The following charts show the distribution of pixel counts by quartile in only the voted ovals, first in the November ballot set and then in the February ballot set.

Image pics/cham2_plots/quartiles_summary2.ps.png

Figure 50: Distribution of pixel counts by intensity quartile, voted, November

Image pics/chamfeb_plots/quartiles_summary2.ps.png

Figure 51: Distribution of pixel counts by intensity quartile, voted, February

The average intensity of the cropped regions drops from above 240 to approximately 190. The average intensity of voted ovals is shown both compared with nonvoted ovals and on an expanded y axis. In addition, the average intensity of the marked area along the vertical centerline of an oval is shown. Because this picks up only the marked pixels, it shows a lower intensity than the cropped rectangle as a whole, peaking at approximately 80 rather than 190.
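The two measurements differ only in which pixels are averaged, as the following sketch shows. It assumes a single-channel crop held as a list of rows, and it treats any pixel below a hypothetical threshold of 128 as "darkened"; the report does not give the threshold actually used.

```python
def crop_average(crop):
    """Mean intensity over every pixel of the cropped region."""
    values = [v for row in crop for v in row]
    return sum(values) / len(values)

def centerline_mark_average(crop, dark_threshold=128):
    """Mean intensity along the vertical centerline, first to last dark pixel.

    Returns None if the centerline holds no pixel below the threshold.
    """
    column = [row[len(row) // 2] for row in crop]
    dark = [i for i, v in enumerate(column) if v < dark_threshold]
    if not dark:
        return None
    span = column[dark[0]:dark[-1] + 1]
    return sum(span) / len(span)

# Tiny hypothetical crop: white margins around a three-pixel-tall mark.
crop = [
    [255, 255, 255],
    [255, 60, 255],
    [255, 80, 255],
    [255, 100, 255],
    [255, 255, 255],
]
print(crop_average(crop), centerline_mark_average(crop))
```

The white margin raises the whole-crop average, while the centerline figure reflects only the mark itself.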

Image pics/cham2_plots/votednonvoted_intensity0.ps.png

Figure 52: Distribution by average intensity, November

Image pics/chamfeb_plots/votednonvoted_intensity0.ps.png

Figure 53: Distribution by average intensity, February

Image pics/cham2_plots/avg_int0.ps.png

Figure 54: Distribution by average intensity, November

Image pics/chamfeb_plots/avg_int0.ps.png

Figure 55: Distribution by average intensity, February

Image pics/cham2_plots/avg_int2.ps.png

Figure 56: Distribution by average intensity, November

Image pics/chamfeb_plots/avg_int2.ps.png

Figure 57: Distribution by average intensity, February

Using the crop area’s average intensity, fewer than 1 % of votes have average intensity less than 163/255, and approximately 10 % of votes have average intensity less than 179/255. Half of voted ovals have an intensity across the cropped region of between 184 and 195, and fewer than 1 % of voted ovals have an intensity across the cropped region of 213 or above.

Using the vertical centerline, fewer than 1 % of votes have average intensity less than 47/255, and approximately 10 % of votes have average intensity less than 60/255. Half of voted ovals have an intensity along their vertical centerline of between 68 and 94, and fewer than 1 % of voted ovals have an intensity along their vertical centerline of 146 or above.

The characteristics of the marks, as expected, change as the cropped regions’ average intensity changes. The darkest cropped regions contain marks that were filled well outside of the printed target. The following marks have cropped region intensities below 120/255:

Image pics/cham2_july_montages/darkest.ps.png

Figure 58: Marks in darkest group

The following marks have cropped region intensities from 120 through 149; keep in mind that these represent less than 0.2 % of voted marks:

Image pics/cham2_july_montages/red_avg_int120to150_00.ps.png

Figure 59: Marks in dark group

Marks in the darkest 10 % (excluding those in the darkest 0.2 %) show nearly full coverage in the target area and some excess as well. AutoMARK printed marks show up in this set, at the right of the eighth row:

Image pics/cham2_july_montages/red_avg_int170to180_00.ps.png

Figure 60: Marks in dark/normal group

Marks in crops of average intensity tend to be neatly filled in. However, some crops with this intensity contain marks with less than complete coverage, with marking outside the target contributing to the intensity drop.

As the following montage represents the most typical marks, it is a useful place to point out characteristics which can be used to distinguish marks. (The image is divided into 8 blocks, each of which contains a 5 × 5 grid of marks. The blocks will be referred to as A to D down the left, then E to H down the right; rows and columns within a block will be designated r1 to 5 and c1 to 5. The mark at Er2c5 has a loop above the target.)

Image pics/cham2_july_montages/red_avg_int190_00.ps.png

Figure 61: Marks in typical intensity group

In addition to hue, brightness, “transition count,” and writing implement used, marks can be characterized by the presence and location of substantial voids and the manner in which the voter filled the target. Most of the marks in this typical set show that voters attempted to follow the target outline, probably starting at the perimeter and moving inward in an elliptical motion (the interior is often left slightly lighter than the rest).

Unusual strokes: Cr2c4 shows diagonal lines rather than elliptical curves, and Dr2c2 shows vertical lines. Dr5c3 shows a random pattern. Gr2c4 and c5 show a compromise between following the ellipse and drawing diagonal lines. Darker marks tend to be neatly filled in with ink. Marks with typical average intensities are still filled neatly, with lighter ink or pencil.

Voids: Ar1c5 shows a void at upper left; in addition, the entire mark is shifted right and down from the target. Br2c1 shows this to a lesser degree. Er2c1 and Er3c2 show minor voids but no shift of the mark with respect to the target.

Out of bounds: Gr5c5 shows a mark going substantially out of bounds to the left, and Ar1c5 goes substantially out of bounds to the right. Er2c5 goes out of bounds above the target. Gr2c4 and c5 both go out of bounds beneath the target.

As intensities rise, marks are incompletely filled in, and “x” marks, check marks, hollows, and miscellaneous variants appear.

The following three montages show, first, marks typical of those in the lightest 1.5 %, and then marks in the last 0.4 % and the last 0.1 %.

Image pics/cham2_july_montages/red_avg_int210to220_00.ps.png

Figure 62: Marks in lightest 1.5 % of targets passing vote tests

Image pics/cham2_july_montages/red_avg_int220to230_00.ps.png

Figure 63: Marks in lightest 0.4 % of targets passing vote tests

Image pics/cham2_july_montages/red_avg_int230to250_00.ps.png

Figure 64: Marks in lightest 0.1 % of targets passing vote tests

6.1 Spoiled Ballots

Ballots in the 178 000 range were spoiled by the voter. These have been included because they are a rich sample of problematic marks. However, when an election official wrote “SPOILED” across them, the writing produced artifacts: marks that cross the cropped regions without any connection to the target. These marks are distinguishable from voter-made marks and are concentrated in the range with average red intensity above 230.

6.2 Hesitation Marks

It is important that vote counting equipment be able to distinguish the marks by which a voter typically indicates their choice from the marks which probably occur when a voter touches their marking implement to their ballot without intending to register a vote.

Vote ovals in the previous montages represent only ovals which passed either a general intensity test (below 720 for red, green, and blue intensity values combined; each on a scale of 0 to 255) or a number of darkened pixels test (more than 300 pixels in the lowest half of intensity values).
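The two tests combine as a simple either/or rule, sketched below. The function and parameter names are hypothetical; the inputs are assumed to be the crop's mean red, green, and blue intensities and a precomputed count of pixels in the lower half of the intensity scale.

```python
def passes_vote_test(mean_red, mean_green, mean_blue, dark_pixel_count):
    """Return True if a crop passes either of the two vote tests.

    Intensity test: combined mean red + green + blue below 720.
    Dark-pixel test: more than 300 pixels in the lowest half of intensities.
    """
    combined = mean_red + mean_green + mean_blue
    return combined < 720 or dark_pixel_count > 300

print(passes_vote_test(190, 185, 180, 1500))  # typical filled oval
print(passes_vote_test(243, 243, 243, 10))    # blank oval
print(passes_vote_test(245, 245, 245, 350))   # light crop with a compact dark mark
```

The second test catches small but dark marks that barely move the average of an otherwise white crop.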

Ovals which failed both of the above tests but were between 720 (240 × 3) and 735 (245 × 3) in combined red, green, and blue average intensity were further checked for small marks. Each pixel in a central 41 × 14 region (of the 72 × 30 ovals, whose interiors’ maximum width and height were 67 and 25 respectively) was examined, and the presence of any pixel darkened by at least 1/4 was considered a mark. This generated 2860 marked ovals from a set of approximately 2 500 000 “unmarked” targets, or approximately 0.1 % of the set initially thought to be “unmarked”.
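The small-mark check can be sketched as follows. Two assumptions are made that the report does not confirm: the oval is centered in the 87 × 60 crop, so the 41 × 14 region is taken from the crop's center, and "darkened by at least 1/4" is read as an intensity at or below three quarters of full white (191 on the 0 to 255 scale).

```python
def central_region(crop, width=41, height=14):
    """Return the centered width x height window of a crop (list of rows)."""
    rows, cols = len(crop), len(crop[0])
    top = (rows - height) // 2
    left = (cols - width) // 2
    return [row[left:left + width] for row in crop[top:top + height]]

def has_speck(crop, white=255, darkening=0.25):
    """True if any central pixel is darkened by at least the given fraction."""
    limit = white * (1 - darkening)
    return any(v <= limit for row in central_region(crop) for v in row)

# Synthetic 87-wide, 60-tall crops of nearly white paper.
with_speck = [[245] * 87 for _ in range(60)]
with_speck[30][40] = 180           # one darkened pixel inside the window
edge_mark = [[245] * 87 for _ in range(60)]
edge_mark[2][2] = 100              # dark pixel far outside the window
print(has_speck(with_speck), has_speck(edge_mark))
```

The default window covers 41 × 14 = 574 pixels, matching the central rectangle size used in the subsample test.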

A subsample of 65 171 ovals at combined intensity of exactly 729.0 was taken for further testing. This subsample returned 73 hits in the central rectangles, which were 574 pixels in size; per pixel of region area, this is a hit frequency of 0.13. The subsample was then searched more thoroughly for low intensity pixels by masking off an approximately 10 pixel wide ring around the oval. This search found 83 hits over a region of 1451 pixels, giving only 10 additional hits in 877 additional pixels, or a per pixel hit rate outside the central rectangle of 0.01. This suggests that specks are concentrated in the central rectangle contained within the printed oval: 73 of the 83 hits (about 88 %) fell there, at more than ten times the per pixel rate of the surrounding region.
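The arithmetic in the preceding paragraph can be checked directly. The sketch uses only the counts quoted in the text; the variable names are illustrative.

```python
# Counts quoted in the text for the subsample at combined intensity 729.0.
central_hits = 73
central_pixels = 41 * 14                     # 574-pixel central rectangle
all_hits = 83
all_pixels = 1451                            # full region searched

outer_hits = all_hits - central_hits         # hits outside the central rectangle
outer_pixels = all_pixels - central_pixels   # pixels outside the central rectangle

central_rate = central_hits / central_pixels
outer_rate = outer_hits / outer_pixels
central_share = central_hits / all_hits

print(f"central rate: {central_rate:.3f} hits per pixel of region area")
print(f"outer rate:   {outer_rate:.3f} hits per pixel of region area")
print(f"share of hits in central rectangle: {central_share:.1%}")
```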

The distribution of specks shows a high rate in the small number of ovals with average intensity 240.0 to 242.0, then drops to approximately 0.05 % of all unvoted ovals. (Note that this figure includes many specks which are barely visible.)

Combined RGB   Ovals      Ovals with   Speck     Average
intensity                 specks       rate      intensity
720                158        54       35 %      240.0
721                341        57       17 %
722               1090        69       6 %
723               3589       104       3 %       241.0
724               9830       153       2 %
725             22 140       257       1 %
726             49 378       399       1 %       242.0
727            151 905       509       0.3 %
728            438 310       539       0.1 %
729            750 478       386       0.05 %    243.0
730            606 402       156       0.03 %
731            183 340        65       0.04 %
732             67 864        56       0.08 %    244.0
733             72 957        39       0.05 %
734             57 238        17       0.03 %

The specks vary substantially in size depending upon the exact intensity at which they were found:


(Marks with numbers beginning 178 887 and 178 947, the “full slashes,” are actually artifacts from spoiled ballots. There are no apparent voter slashes showing up in the speck intensity ranges.)

Image pics/cham2_specks/specks240.txt_00.jpg.ps.png

Figure 65: Specks, intensity 240 to 241

Image pics/cham2_specks/specks241.txt_00.jpg.ps.png

Figure 66: Specks, intensity 241 to 242

Image pics/cham2_specks/specks242.txt_00.jpg.ps.png

Figure 67: Specks, intensity above 242

6.3 Color and Tint

The following graphs summarize the red, green, and blue intensities found in the cropped rectangles and across the horizontal span of the contained ovals. Blue and green are consistently at lower intensity than red:

Image pics/cham2_plots/rgb_overall.ps.png

Figure 68: Red, green and blue intensities of crop, November

Image pics/chamfeb_plots/rgb_overall.ps.png

Figure 69: Red, green and blue intensities of crop, February

Image pics/cham2_plots/rgb_hline.ps.png

Figure 70: Red, green and blue centerline mark intensities, November

Image pics/chamfeb_plots/rgb_hline.ps.png

Figure 71: Red, green and blue centerline mark intensities, February


The predominating hue is slightly reddish. Following a graph in which the predominating hue swamps all other data, a second graph is presented with an expanded y axis to show the small number of marks with differing hues.

The graphs are followed by montages of marks at different H values in the HSV color system.

Image pics/cham2_plots/hue.ps.png

Figure 72: Hue distribution

Image pics/cham2_plots/hue_expanded_y.ps.png

Figure 73: Hue distribution, expanded y axis

Image pics/cham2_july_montages/hue95_00.ps.png

Figure 74: Hue approximately 0.95

Image pics/cham2_july_montages/hue22to27_00.ps.png

Figure 75: Hue 0.22 to 0.27

Image pics/cham2_july_montages/hue62to67_00.ps.png

Figure 76: Hue 0.62 to 0.67

6.4 Horizontal Spans, Voted Ovals

Image pics/cham2_queries/hspanvotes.ps.png

Figure 77: Horizontal spans, November set

Image pics/chamfeb_queries/hspanvotes.ps.png

Figure 78: Horizontal spans, February set

6.5 Vertical Spans, Voted Ovals

Image pics/cham2_queries/vspanvotes.ps.png

Figure 79: Vertical spans, November set

Image pics/chamfeb_queries/vspanvotes.ps.png

Figure 80: Vertical spans, February set

6.6 Transition Counts

The transition count represents the number of light to dark transitions encountered following the first dark pixel encountered (typically the left edge of the printed oval). Higher transition counts represent traversal of light regions prior to encountering dark regions; typical examples would be “x” marks, check marks, zigzags, and hollow marks not in contact with the printed target.
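A minimal sketch of this count over one scan line follows. The darkness threshold of 128 is an assumption for illustration; the report does not state the value used.

```python
def transition_count(line, dark_threshold=128):
    """Count light-to-dark transitions that occur after the first dark pixel."""
    count = 0
    seen_dark = False
    previous_dark = False
    for value in line:
        is_dark = value < dark_threshold
        if is_dark and not previous_dark:
            # The first dark run starts the measurement and is not counted.
            if seen_dark:
                count += 1
            seen_dark = True
        previous_dark = is_dark
    return count

filled = [255, 40, 40, 40, 40, 255]                      # solid fill
hollow = [255, 40, 255, 255, 40, 255]                    # outline only
crossed = [255, 40, 255, 40, 255, 40, 255, 40, 255]      # several separate strokes
print(transition_count(filled), transition_count(hollow), transition_count(crossed))
```

A solidly filled oval yields 0 because the scan stays dark from the left edge of the printed target to the right edge.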

The following graphs show the distribution of transition counts as measured at the horizontal centerline of voted marks. The second graph uses an expanded y axis.

Image pics/cham2_plots/tcount_horiz.ps.png

Figure 81: Transition counts

Image pics/cham2_plots/tcount_rare_horiz.ps.png

Figure 82: Transition counts, expanded y axis

The following montages compare typical marks with a transition count of 0 with marks with a transition count of 4. Although the difference between these two sets of marks is apparent, it is not clear that the transition count as calculated can be used to give much detail with regard to “degree of scribbledness.”

Image pics/cham2_july_montages/tcount0_00.ps.png

Figure 83: Transition count 0

Image pics/cham2_july_montages/tcount4_00.ps.png

Figure 84: Transition count 4

7. POSSIBLE MARK TAXONOMY AND NOTATION

Marks placed on oval and rectangular targets can be characterized by the intensity of the cropped region surrounding the target and/or by the intensity of the pixels across a span (for example, the centerline of the mark, from first darkened pixel to last). In addition, the hue of the mark can be used to characterize it, as can the marking implement used (when this can be discerned). The horizontal and vertical spans of the marks can be used as well, as can the number of transitions along a particular line.

Should additional characteristics be necessary, the nature of the stroking and coverage can provide additional dimensions. It is unclear whether a test set really needs to take these variations into account, but a relatively compact notation for the stroking could be as follows. The first part is based on compass direction notation:

W              out of bounds to left (west)
E              out of bounds to right (east)
N              out of bounds beyond top (north)
S              out of bounds beyond bottom (south)
NW, NE, etc.   out of bounds at top left, top right, etc.
w              void at interior left
e              void at interior right
n              void at interior top
s              void at interior bottom
o              void at center
nw, ne, etc.   void at top left, void at top right, etc.
V              vertical strokes
H              horizontal strokes
T              strokes conforming to target ellipse
t              strokes conforming to target ellipse, not extending to printed target
F              diagonal strokes leaning forward
B              diagonal strokes leaning backward
X              an X or check mark (voids would be assumed)
C              a surrounding circle (though this is extremely rare)
D              an interior dot or dash
G              uniform light coverage, no strokes evident
R              random or not otherwise defined stroke pattern
!              modifier suffix indicating the prior pattern is major
r              modifier suffix indicating strokes are rounded (curlicues)
[0..9]         alternate modifiers indicating degree to which prior pattern exists

Using this notation, mark Ar1c5 of the montage in Figure 61 could be described as “SE nw T”. This notation could be extended to incorporate the other characteristics mentioned: intensity, writing implement used, predominating color, etc.
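To show that the notation is mechanically readable, the following sketch expands a notation string into plain descriptions. It is an illustration under stated assumptions, not part of the proposed standard: codes are assumed to be separated by spaces, modifier suffixes are assumed to attach directly to the code they modify, and all names in the code are hypothetical.

```python
import re

SIDES = {"N": "top", "S": "bottom", "W": "left", "E": "right"}
CODES = {}
for letter, side in SIDES.items():
    CODES[letter] = f"out of bounds at {side}"
    CODES[letter.lower()] = f"void at interior {side}"
for vertical in "NS":
    for horizontal in "WE":
        corner = f"{SIDES[vertical]} {SIDES[horizontal]}"
        CODES[vertical + horizontal] = f"out of bounds at {corner}"
        CODES[(vertical + horizontal).lower()] = f"void at {corner}"
CODES.update({
    "o": "void at center",
    "V": "vertical strokes",
    "H": "horizontal strokes",
    "T": "strokes conforming to target ellipse",
    "t": "strokes conforming to target ellipse, not extending to printed target",
    "F": "diagonal strokes leaning forward",
    "B": "diagonal strokes leaning backward",
    "X": "an X or check mark",
    "C": "a surrounding circle",
    "D": "an interior dot or dash",
    "G": "uniform light coverage, no strokes evident",
    "R": "random or not otherwise defined stroke pattern",
})
MODIFIERS = {"!": "major", "r": "rounded"}
# Shortest run of letters followed only by modifier characters.
TOKEN = re.compile(r"([A-Za-z]+?)([!r0-9]*)")

def describe(notation):
    """Expand a space-separated notation string into a list of descriptions."""
    descriptions = []
    for token in notation.split():
        match = TOKEN.fullmatch(token)
        if not match or match.group(1) not in CODES:
            raise ValueError(f"unrecognized token: {token!r}")
        notes = [f"degree {ch}" if ch.isdigit() else MODIFIERS[ch]
                 for ch in match.group(2)]
        suffix = f" ({', '.join(notes)})" if notes else ""
        descriptions.append(CODES[match.group(1)] + suffix)
    return descriptions

print(describe("SE nw T"))
```

For “SE nw T” this yields an out-of-bounds excursion at bottom right, a void at top left, and strokes conforming to the target ellipse.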

8. NEXT STEPS

Following completion of the mark databases and mark characterization, procedures will be developed and documented for producing a set of reference marks on typical ballots using each of the three vote target types analyzed.

—END—