Development of
Appropriate Test Markings
for Optical Scan Voting Machines
Phase 2: Analysis of Ballot Markings
NIST Contract SB1341-10-SE-0745
Revision of January 11, 2012, 7:00 AM
by
Mitch Trachtenberg
1. OVERVIEW
The National Institute of Standards and Technology has been asked to develop a standard set of reference markings representative of the types of marks that voters make on each common type of optical scan / marksense ballot.
As the first step in that development process, during September and December 2010, I scanned over 300 000 ballots at three elections offices. In this second step, I have analyzed the markings on these ballots, primarily the markings at vote targets, but also additional marks which were entered on the ballots after they were printed.
Each of the elections offices used a different type of printed vote target area.
Vancouver, WA used Hart * ballots, which use a heavily marked rectangular box to designate the vote target. The voter is asked to fully fill the box.
[* Certain commercial equipment, instruments, or materials are identified in this paper in order to specify procedures and data adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.]
Everett, WA used Dominion/Sequoia ballots, which use a “broken arrow” to designate the vote target. The voter is asked to draw a line connecting the two halves of the broken arrow.
Champaign, IL used ES&S ballots, which use an oval to designate the vote target. The voter is asked to fill the oval.

Figure 1: Hart ballot style

Figure 2: Dominion/Sequoia ballot style

Figure 3: ES&S ballot style
2. ISOLATION
Vote targets were cropped out of each ballot and initially categorized as voted or non-voted. Most subsequent attention was devoted to the voted category, though we checked to ensure that non-voted targets were indeed uniformly blank.
Ballots were also grouped by layout, aligned, and overlaid, such that for every pixel location of the ballots of a set, an individual image could be built using the darkest pixel value of that location. The effect is equivalent to taking a set of overhead transparencies, laying them one over another, and viewing them with a white backing underneath. The resulting image shows the material printed on the ballots together with any marks entered by voters on any ballot of the set.
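The overlay described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: it assumes the images are already aligned, equally sized, and reduced to single 0 to 255 intensities, and the function name `darkest_pixel_composite` is hypothetical.

```python
def darkest_pixel_composite(images):
    """Overlay aligned grayscale images, keeping the darkest value at each pixel.

    Each image is a list of rows; each row is a list of intensities
    (0 = black, 255 = white). Alignment is assumed to have been done already.
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    composite = [[255] * width for _ in range(height)]
    for image in images:
        for y in range(height):
            for x in range(width):
                # Keep whichever value is darker, like stacked transparencies.
                if image[y][x] < composite[y][x]:
                    composite[y][x] = image[y][x]
    return composite


# Two tiny 2 x 3 "ballots": the same printed dot plus a different voter mark on each.
ballot_a = [[255, 0, 255], [90, 255, 255]]
ballot_b = [[255, 0, 255], [255, 255, 40]]
print(darkest_pixel_composite([ballot_a, ballot_b]))
```

The printed dot appears once in the result, and both voter marks survive, which is why a composite reveals any mark made on any ballot of the set.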
3. APPROACH TO ANALYZING THE VOTE TARGETS
In each mark, we determined the average red, green, and blue intensities of the pixels in and around the target areas. We also determined the maximum vertical and horizontal extents of the darkened areas. For the oval and rectangular targets, we also measured the number of light to dark transitions in each of several single-pixel horizontal stripes and several single-pixel vertical stripes. By combining the patterns from these passes with vertical and horizontal span information, we are able to locate scribbled markings of various types, as well as x’s extending beyond the vote target.
Due to dust accumulating on the scanner glass, some of the images of the vote targets contain streaking. We discuss the impact of this streaking, which is not believed to affect the statistics generated.
A question arose as to whether any colors were dropped out by the Kodak i4200 scanners we used. Logically, this should not be possible, as the scanners are designed to reproduce color images faithfully. We confirmed with Kodak that no color is dropped out on the i4200. We have not checked with the vendors of ballot-specific equipment to determine whether their scanning devices drop out certain colors.
4. VANCOUVER / BOXES
4.1 Procedure
Black rectangular vote targets were isolated from approximately 90 000 ballots which had been scanned at 300 dots per inch, and placed in cropped images 66 pixels tall by 117 pixels wide. Within these crops, printed vote targets were 52 pixels tall by 99 pixels wide, with an interior approximately 32 pixels tall by 73 pixels wide.
The crops were initially divided into voted, nonvoted, and ambiguous groups. Two criteria were used: intensity of the cropped image including the printed target itself and a surrounding margin, and total number of pixels darkened below half intensity. Both these tests were performed on intensity values averaged over the red, green, and blue individual intensities.
Using an intensity scale of 0 to 255, a crop was considered possibly voted if the average intensity over all pixels in the crop was below 155, or if the number of pixels with intensity beneath 128 exceeded 3300. If both conditions were satisfied, the crop was considered to represent a voted mark.
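The two tests can be expressed as a small function. This is a sketch under stated assumptions: the crop is supplied as a flat list of per-pixel intensities already averaged over red, green, and blue, and the function and label names are illustrative rather than taken from the study's software. Crops labelled "voted" also belong to the "possibly voted" set as the text uses that term.

```python
AVG_INTENSITY_LIMIT = 155      # crop average must fall below this
DARK_PIXEL_LEVEL = 128         # a pixel counts as dark beneath this intensity
DARK_PIXEL_COUNT_LIMIT = 3300  # dark-pixel count must exceed this


def classify_crop(pixels):
    """Classify a crop from its per-pixel intensities (0-255, RGB-averaged)."""
    average = sum(pixels) / len(pixels)
    dark_count = sum(1 for p in pixels if p < DARK_PIXEL_LEVEL)
    low_average = average < AVG_INTENSITY_LIMIT
    many_dark = dark_count > DARK_PIXEL_COUNT_LIMIT
    if low_average and many_dark:
        return "voted"            # both conditions satisfied
    if low_average or many_dark:
        return "possibly voted"   # exactly one condition satisfied
    return "nonvoted"


CROP_PIXELS = 66 * 117  # crop size given in the text (7722 pixels)
heavy = [20] * 4000 + [250] * (CROP_PIXELS - 4000)
light = [120] * 3400 + [250] * (CROP_PIXELS - 3400)
blank = [250] * CROP_PIXELS
print(classify_crop(heavy), classify_crop(light), classify_crop(blank))
```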
Using these tests, 1 794 942 targets were considered possibly voted. These were further characterized.
513 000 of the targets which did not pass either test for “possibly voted” were searched for any interior pixel in a 50 × 15 pixel central region with combined red, green, and blue intensity values averaging below 192. 940 such targets were discovered, and are addressed in a later section.
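A search of this kind might look like the following sketch. It assumes the 50 × 15 region is centered on the crop and that each pixel value is already the average of its red, green, and blue intensities; the report does not give the exact region placement, so the centering and the function name are assumptions.

```python
def has_dark_central_pixel(crop, region_width=50, region_height=15, limit=192):
    """Return True if any pixel in the central region is darker than `limit`.

    crop is a list of rows of RGB-averaged intensities (0-255).
    The region is assumed to be centered on the crop.
    """
    height, width = len(crop), len(crop[0])
    x0 = (width - region_width) // 2
    y0 = (height - region_height) // 2
    for row in crop[y0:y0 + region_height]:
        for intensity in row[x0:x0 + region_width]:
            if intensity < limit:
                return True
    return False


# A 66 x 117 crop that is uniformly light, then one with a single interior speck.
blank = [[200] * 117 for _ in range(66)]
speck = [row[:] for row in blank]
speck[30][60] = 150
print(has_dark_central_pixel(blank), has_dark_central_pixel(speck))
```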
The possibly voted crops were filtered based on the pixel offset into the crop at which the printed target box appeared; 1 066 092 were selected for further characterization because their starting x offsets and starting y offsets each fell between 4 and 10 pixels into their crop. (Basic results were confirmed not to vary substantially between the larger set and the smaller set.)
Of these, 14 112 (1.3 %) were found to have no pixels in the lower two average intensity quartiles and were inspected and removed to a separate table; none were found to have votes. The image groups are provided on a supporting disk; the following image is one of seventy and shows the first 200 such images.

Figure 4: Artifacts
4.2 Pixel counts within intensity quartiles
A 29 × 70 interior rectangle of each voted target was examined. More than half (592 024) of the voted targets had zero interior pixels in the highest intensity quartile; three quarters (757 122) had fewer than ten pixels remaining in that quartile; more than 9/10 had fewer than 100 pixels in that quartile.
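Counting pixels per intensity quartile can be sketched as follows. The sketch assumes that "quartile" means an equal-width band of the 0 to 255 intensity scale (0 to 63, 64 to 127, 128 to 191, 192 to 255), which is consistent with the "half intensity" language used earlier but is not spelled out in the text; the function name is hypothetical.

```python
def quartile_counts(pixels):
    """Count pixels in each intensity band of the 0-255 scale.

    Index 0 is the darkest band (0-63); index 3 is the lightest (192-255).
    pixels is an iterable of integer intensities.
    """
    counts = [0, 0, 0, 0]
    for p in pixels:
        counts[min(p // 64, 3)] += 1
    return counts


# A 29 x 70 interior (2030 pixels) that is mostly filled with dark ink.
interior = [30] * 1900 + [100] * 80 + [170] * 45 + [240] * 5
print(quartile_counts(interior))
```

Under this assumption, a target with "fewer than ten pixels remaining" in the highest quartile is one whose last count is below 10.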
Two different sets of graphs are provided. The second set removes the extremes so that the mid-range is highlighted.
Figure 5: Pixel counts within intensity quartiles
Figure 6: Pixel counts within intensity quartiles, expanded y axis
Over the range where 200 to 1200 pixels have been darkened into the lowest two quartiles, the mark count is roughly flat (equal numbers of ballots have 200 darkened pixels as have 500 darkened pixels, or 1000). As the total number of darkened pixels rises within this range, pixels tend to leave the second quartile and move to the darkest quartile.
Figure 7: Pixel shift between quartiles
4.3 Color and Tint
Red, green, and blue average values were collected for horizontal lines across the center of each mark. To report tint statistics, these were converted to equivalent values in the HSV (hue, saturation, value) color space. Red intensity peaks somewhat higher than blue and green intensity.
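The conversion to HSV can be illustrated with Python's standard `colorsys` module, which reports hue, saturation, and value on a 0 to 1 scale, the same scale used for the hue values quoted in this section. The sample color is a hypothetical blue-ink mark, not a measurement from the ballots.

```python
import colorsys


def mean_rgb_to_hsv(red, green, blue):
    """Convert mean 0-255 channel intensities to HSV, each on a 0-1 scale."""
    return colorsys.rgb_to_hsv(red / 255, green / 255, blue / 255)


# Hypothetical mean intensities for a mark made in blue ink.
hue, saturation, value = mean_rgb_to_hsv(40, 60, 200)
print(round(hue, 2), round(saturation, 2), round(value, 2))
```

A strongly blue mark such as this one lands near a hue of 0.65, which is where the small blue peak is reported.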
The following graphs indicate distributions of marks based on hue, saturation, and value, followed by red, green, and blue mean intensities. The hue graph shows a small peak representing blue marks at a hue value of 0.65. This hue graph also shows a small peak at a hue value of approximately 0.22. Following these graphs are examples of marks at hue values of 0 to 0.01, 0.2 to 0.25, and 0.65 to 0.67.

Figure 8: HSV values

Figure 9: RGB values

Figure 10: Marks with hue value near zero

Figure 11: Marks with hue between 0.2 and 0.25

Figure 12: Marks with hue between 0.65 and 0.67

Figure 13: Marks with red hue and light intensity †
[† Note that the yellow highlighter visible over an inked "x" in Figure 13 was added by a voter, and pink highlighter was used as the sole marking implement by one voter.]
Although it may appear from the graphs that certain hues never occur, when the marks are filtered for saturation or “value” values greater than 0.2 and then divided by hue into 100 groups, every group is represented. The ratio between the most common hue (0.01 to 0.02) and the least common (0.39 to 0.40 for value > 0.2, 0.47 to 0.48 for saturation > 0.2) is greater than 3000 to 1.
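The grouping by hue can be sketched as a filtered histogram. This is an illustrative reconstruction only: the input format (tuples of hue, saturation, and value on a 0 to 1 scale) and the function name are assumptions, and the sample data are invented.

```python
def hue_histogram(marks, min_saturation=0.2, bins=100):
    """Count marks per hue bin, keeping only marks above a saturation floor.

    marks is an iterable of (hue, saturation, value) tuples on a 0-1 scale.
    """
    counts = [0] * bins
    for hue, saturation, _value in marks:
        if saturation > min_saturation:
            counts[min(int(hue * bins), bins - 1)] += 1
    return counts


# Invented sample: three reddish marks, one greenish mark, one washed-out blue mark.
sample = [(0.015, 0.5, 0.5)] * 3 + [(0.395, 0.5, 0.5)] + [(0.65, 0.1, 0.9)]
counts = hue_histogram(sample)
print(counts[1], counts[39], counts[65])
```

The ratio between the most and least common hues is then the largest count divided by the smallest nonzero count.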
4.4 Transitions
To provide an additional dimension by which marks can be grouped, the marks were examined for light to dark intensity transitions along three horizontal and five vertical “cut lines.”
These intensity transitions were defined as a drop to an intensity of 184 or below following an intensity that had been above 208, and serve as a proxy for individual well separated strokes.
Generally, when a transition count of 1 or 2 occurs at both a vertical and a horizontal “cut line,” the voted target contains one or two sharp strokes as in a check or an X. Higher transition counts generally indicated scribbling which consisted of lines separated from one another by sufficient space to allow intervening pixels to return to a very light shade.
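The transition definition amounts to counting with hysteresis: a pixel lighter than 208 arms the counter, and the next pixel at 184 or darker registers one transition. The sketch below is one straightforward reading of that definition; the function name is hypothetical and the study's own implementation may differ in detail.

```python
def count_transitions(line, high=208, low=184):
    """Count light-to-dark transitions along one cut line of intensities.

    A transition is a drop to `low` or below after the intensity had been
    above `high`. Values between the two thresholds change nothing.
    """
    armed = False  # True once a pixel lighter than `high` has been seen
    transitions = 0
    for intensity in line:
        if intensity > high:
            armed = True
        elif intensity <= low and armed:
            transitions += 1
            armed = False
    return transitions


two_strokes = [255, 250, 100, 90, 240, 60, 200, 50]
print(count_transitions(two_strokes))
```

A cut line that is dark from end to end never sees a light pixel and therefore reports zero, which matches the appearance of fully filled targets in the zero-count group.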
The following series of graphs shows how the transition count pattern varies with the number of pixels in the lightest quartiles, pointing to a higher likelihood that particular intensities represent scribbles as opposed to X marks and check marks.

Figure 14: Transition counts, 1500 to 2000 white pixels

Figure 15: Transition counts, 1000 to 1500 white pixels

Figure 16: Transition counts, 500 to 1000 white pixels

Figure 17: Transition counts, 300 to 500 white pixels

Figure 18: Transition counts, 100 to 300 white pixels

Figure 19: Transition counts, < 100 white pixels
The following pages present samples of marks with differing transition counts along a horizontal line centered vertically on the mark.

Figure 20: Transition count 0

Figure 21: Transition count 1

Figure 22: Transition count 2

Figure 23: Transition count > 2
Transition counts along various lines can be combined in queries to isolate particular patterns. For example, here is the result of requiring zero transition counts at the top, bottom, left and right, while requiring nonzero transition counts at the center.
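A query of this kind might be sketched as a simple filter over per-mark records. The record layout, field names, and sample values below are hypothetical; only the outermost and center cut lines are modeled, and "nonzero at the center" is assumed to apply to both the horizontal and the vertical center lines.

```python
def is_isolated_center_mark(mark):
    """True when edge cut lines show no transitions but both center lines do."""
    h, v = mark["h"], mark["v"]
    edges_clear = (h["top"] == 0 and h["bottom"] == 0
                   and v["left"] == 0 and v["right"] == 0)
    center_marked = h["center"] > 0 and v["center"] > 0
    return edges_clear and center_marked


# Invented records: transition counts per horizontal (h) and vertical (v) cut line.
marks = [
    {"id": "m1", "h": {"top": 0, "center": 2, "bottom": 0},
     "v": {"left": 0, "center": 1, "right": 0}},
    {"id": "m2", "h": {"top": 1, "center": 2, "bottom": 1},
     "v": {"left": 1, "center": 2, "right": 1}},
    {"id": "m3", "h": {"top": 0, "center": 0, "bottom": 0},
     "v": {"left": 0, "center": 0, "right": 0}},
]
selected = [m["id"] for m in marks if is_isolated_center_mark(m)]
print(selected)
```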

Figure 24: Vertical, horizontal transition counts used to isolate marks
4.5 Stroke style
To get a sense of stroke form and direction, a sample of marks was taken from among those with 1600 to 1610 pixels in the darkest two quartiles. This sample was taken in the hope that these are filled out with similar stroking to the far more commonplace marks which have greater coverage.
An examination of these marks shows that about three quarters can reasonably be characterized as having a particular orientation: vertical strokes, horizontal strokes, forward leaning strokes, backward leaning strokes, and circular strokes. Categories blend, as many strokes are actually elliptical curlicues oriented strongly in a direction. Where there was no strong “winning direction,” the mark was categorized as random.
The following sampling gives the approximate percentages of each category noted in a manual examination.
• Vertical 60 (20 %)
• Horizontal 88 (30 %)
• Forward leaning 50 (15 %)
• Backward leaning 5 (1 %)
• Circular 11 (3 %)
• Even 6 (1 %)
• Random 97 (30 %)
• Total categorized: 317
Horizontal strokes are the most common, with vertical and forward leaning strokes somewhat less common. Other stroke types are rare.
4.6 Lightest marks
The initial pass captured all marks with an average intensity below 155.
Those with no pixels whose average intensity fell in the lower two quartiles were removed. (For unknown reasons, 13 remain. Nine are uniform gray and four uniform pink.)

Approximately 14 000 images were pulled from the main set because they did not have pixels in the darker two quartiles. These marks were generally thought to contain streaking and, indeed, between a third and a half did have substantial streaking. These 14 000 marks are on 70 mosaics that will be provided on a supporting disk.
The artifact marks were individually examined for votes and 53 possible votes were found on 39 different ballot sides. These possible votes represent 0.4 % of the artifacts and only 0.006 % of the initially screened, centered targets. A reasonable conclusion is that it is extremely rare for targets with overall intensity in the normally voted range to contain no interior pixels in the darkest two quartiles.
The possible votes follow:

Figure 25: Possible votes containing no pixels darkened to bottom half of intensity range
4.7 Marks not Captured in the “Possibly Voted” Set
After the “possibly voted” marks were identified, 513 000 of the targets which did not pass either test for “possibly voted” were searched for any interior pixel in a 50 × 15 pixel central region with combined red, green, and blue intensity values averaging below 192. 940 such targets were discovered, about 40 % of which had an overall intensity of less than 158.3, and 60 % of which had overall intensity of 158.3 and above. For the targets with lower overall intensity, approximately 30 % had marks which could be construed as votes, giving roughly 110 votes. For the targets in the higher intensity range, at most 3 % (20) could be construed as votes. Of the 513 000 targets examined, these 130 represent 0.025 %. The actual frequency will be above 0.05 % as the tested area covered less than half of the interior of the vote targets. Furthermore, there are likely to be more “unvoted” targets in the typical ballot than “possibly voted” targets, and other frequencies are specified for targets in the “possibly voted” set.

Figure 26: Targets with intensities above 158.3 having individual interior pixel intensities beneath 192.

Figure 27: Targets with intensities above 155 and below 158.3 having individual interior pixel intensities beneath 192.
4.8 Identifying borderline votes
The existence of some apparent votes mixed into the intensity range which primarily contains specks, hesitation marks, and unmarked targets indicates that measurement of overall target intensity, while sufficient for capturing more than 99.9 % of votes, will leave some votes behind. It may only be possible to distinguish the remaining votes from the background noise of hesitation marks by analyzing the location and shape of the mark within the target.
It is possible that the scanners actually used in vote counting may become dirty in the same way ours did as we scanned. In our target collection, streaks can be identified by observing similar patterns of pixel darkening above and below the target (and, in an unvoted target, in the interior of the target). It may be worth investigating the impact of the streak-filter feature in altering the generated images, to determine the impact of streak removal on borderline voted regions.
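One way to flag candidate streaks from the margins alone is sketched below. It treats a column as a streak candidate when it is darkened in every margin row both above and below the target. The margin height, the darkness threshold, and the function name are hypothetical choices for illustration, not values or code from the study.

```python
def streak_columns(crop, margin_rows=4, dark_below=192):
    """Return x positions darkened in both the top and bottom margins of a crop.

    crop is a list of rows of 0-255 intensities. margin_rows and dark_below
    are hypothetical settings chosen for this sketch.
    """
    height, width = len(crop), len(crop[0])
    top = crop[:margin_rows]
    bottom = crop[height - margin_rows:]
    columns = []
    for x in range(width):
        dark_top = all(row[x] < dark_below for row in top)
        dark_bottom = all(row[x] < dark_below for row in bottom)
        if dark_top and dark_bottom:
            columns.append(x)
    return columns


# A small 12 x 6 crop with a full-height gray streak in column 2
# and an isolated dark speck in the interior at column 4.
crop = [[255] * 6 for _ in range(12)]
for row in crop:
    row[2] = 150
crop[5][4] = 0
print(streak_columns(crop, margin_rows=3))
```

The interior speck is ignored because it does not continue into the margins, while the streak is reported.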
In some of the artifact images, it is difficult to determine from the target image alone whether the mark represents a vote. It can be useful to compare the area immediately outside the target with the area within, to determine whether slight darkening represents a voter’s lightly shading the target or just a darkened ballot background. However, comparison against the entire ballot and the other marks of the ballot would be useful.
About 5 % of the images in the artifacts group had marks touching or very near to an unvoted mark. A few such images are listed here:
| 013013_1651_0856_V_a | |
| 186795_0212_1824_V_A | |
| 136144_0940_1271_V_A | |
| 169678_0196_3429_V_A | |
| 068755_0914_1897_V_A | |
| 019113_0929_3473_V_A | |
| 002000_0925_1265_V_A | |
| 013013_1651_0856_V_a | |
| 185811_0196_1824_V_A | (missed highlighter) |
| 164733_1664_0791_V_a | |
Some of the images in the artifacts group are difficult to characterize; among other things, they may represent roughness caused by erasures. A list of some unusually marked artifacts follows:
| 054588_0191_0590_V_A | |
| 077812_1654_3715_V_A | (rough?) |
| 065564_1649_3548_V_A | |
| 196999_0211_0589_V_A | |
| 152614_0208_3576_V_A | |
| 016318_0920_2034_V_A | |
| 124271_0941_3085_V_A | |
| 164083_0935_1273_V_A | |
| 187781_0208_0591_V_A | (footprint?) |
| 067782_0915_2959_V_A | |
| 194127_0927_2667_V_A | (drip?) |
| 075885_1661_2923_V_A | (mottled red stain?) |
| 160645_0206_3574_V_A | (footprint?) |
| 168298_0184_0587_V_A | |
| 068580_0934_2584_V_A | (vertical line not streak) |
| 103145_0194_0713_V_A | (pinkish cast) |
| 193325_0209_0591_V_A | (rough) |
| 172950_0199_3570_V_A | (rough) |
| 055407_0932_3034_V_A | (rough? marked?) |
| 184353_0199_0591_V_A | (rough?) |
| 183950_0204_0592_V_A | (rough? marked?) |
| 175308_0192_0831_V_A | light fingerprint |
| 194675_0203_3693_V_A | vlg (erasure?) |
| 028505_0197_0590_V_A | vlg (erasure?) |
| 075005_1660_2034_V_A | red ar |
| 159587_0201_0592_V_A | footprint |
| 151971_0194_0591_V_A | spec |
| 169121_0204_0596_V_A | footprint |
| 067782_0915_2010_V_A | (bluish pattern) |
| 065861_0191_0586_V_A | footprint? |
| 173397_0198_2495_V_A | (bluish) |
| 182634_1653_0856_V_A | (light droplet) |
| 020199_1649_3730_V_A | (bluish mark) |
| 174747_0202_1099_V_A | erasure |
| 173687_1652_2164_V_A | light drop |
| 174662_0190_2494_V_A | erasure |
| 057670_1648_0854_V_A | haze |
| 06741_0193_2625_V_A | |
| 089278_0918_1343_V_A | |
| 174245_0193_0836_V_A | bluish wide stroked ve |
| 067776_0198_2555_V_A | |
| 186977_1655_0789_V_A | erased |
| 067782_0915_2575_V_A | blue pattern |
| 135811_0198_3486_V_a | highlighter, nonvote treated as vote |
| 198852_1652_1658_V_A | brown wash lightening target ?? |
| 186473_0200_1756_V_A | |
4.9 Marks in non-target regions
Composite images were generated from 10 000 ballot sides to examine ballot marks not associated with vote targets. Separate composites were generated for each different bar code found on the ballots. These composite images are provided on disk and an example follows:

Figure 28: Composite showing marks in non-target regions
Almost all composites show some crossout over vote targets. Crossouts may be single strokes or thick scribbles. Crossouts are generally drawn through the vote target but are sometimes drawn through the name of the candidate.
Most composites show some bleed through where marks made on the opposite side appear on the observed side; the ballots are all laid out such that the bleed through does not interfere with the vote target columns.
A summary of items found on examined composites follows. When a particular item was found on several composites, it was only noted the first several times. The numbers are those of the bar codes on the upper left of the ballots going into a composite image:
| 10000100100009 | writeins, X through name |
| 10000010200021 | writeins, X through votes, single stroke through name, circled vote op |
| 10000010100014 | writeins, X through votes |
| 10000020100035 | circle around measure description, writeins, x's through voteops |
| 10000020200042 | X through votes, erasure name crossout single stroke |
| 10000030100056 | X through votes |
| 10000030200063 | X through votes, single stroke name crossout |
| 10000040100077 | spill in instructions, X through votes, horiz stroke crossout |
| 10000040200084 | stray line at top left |
| 10000050100098 | redo stamp long multivote crossouts |
| 10000050200008 | bleed through, write ins, x's through votes |
| 10000070100043 | blue outside vote box |
| 10000070200050 | scribbles through contests and near vote areas |
| 10000080100064 | x's through vote areas |
| 10000080200071 | x's through vote areas, heavy candidate crossout |
| 10000090100085 | valid ballot stamp, |
| 10000090200092 | x's through vote areas, arrows trail into candidates |
| 10000100100009 | candidate crossout |
| 10000100200016 | heavy stray marks in third column |
| 10000110100030 | scribbles in governor area |
| 10000110200037 | heavy x crossout over complete contests |
| 10000120100051 | valid ballot stamp, circled initial, write in into margin |
| 10000120200058 | |
| 10000130200079 | scribble at top |
| 10000140200003 | blue bleed through, candidate line out |
| 10000150200024 | |
| 10000160200045 | question mark in candidate area, lines to side of candidates, heavy crossout of candidate |
| 10000170200066 | heavy crossout in writein area |
| 10000180200087 | |
| 10000190200011 | slashes across candidate areas, cut out in margin |
| 10000200100025 | signed at top, stamped at bottom, wine stain? |
| 10000200200032 | heavy crossout in writein, circled digit 2 |
| 10000210200053 | |
| 10000220200074 | |
| 10000230200095 | long tails on check marks from vote boxes |
| 10000240200019 | |
| 10000250200040 | |
| 10000260200061 | |
| 10000270200082 | |
| 10000280200006 | |
| 10000290200027 | |
| 10000300100041 | note NO written in with oval |
| 10000300200048 | heavy cross out through both vote box and writein, x's through unmarked writeins |
| 10000310200069 | |
| 10000320200090 | |
| 10000330200014 | |
| 10000340200035 | |
| 10000350200056 | |
| 10000360200077 | |
| 10000370200098 | large check tail extends out of contest |
| 10000380200022 | |
| 10000390200043 | brown speck in third column |
| 10000400200064 | arrow pointing to candidate, second column |
| 10000410200085 | |
| 10000420200009 | street address third column |
| 10000430100023 | NOTE signed upper left with address and ssn |
| 10000430200030 | |
| 10000440200051 | cross out into margin impacting digits of bar code |
| 10000450200072 | horizontal line into left lower margin near bar code |
| 10000460200093 | ok initials, question mark |
| 10000470200017 | |
| 10000480200038 | |
| 10000490200059 | |
| 10000500200080 | brown stains right margin |
| 10000510200004 | (picked up pink sheet over ballot) |
| 10000520200025 | scribble in third column beneath valid area |
| 10000530200046 | large explanation below contest, x's between vote marks |
| 10000540200067 | vote target crossout into right column |
| 10000550200088 | |
| 10000560200012 | magic markered question mark |
| 10000570200033 | |
| 10000580200054 | bleed through into numbers of left margin |
| 10000590200075 | |
| 10000600200096 | |
| 10000610200020 | cross out through vote target and candidate text, candidate circled |
| 10000620200041 | |
| 10000630200062 | |
| 10000640200083 | write in into left margin |
| 10000650200007 | |
| 10000660200028 | bleed through or crossout near top second column |
| 10000670200049 | contests lightly xd out |
| 10000680200070 | |
| 10000690200091 | |
| 10000700200015 | |
| 10000720200057 | slashes through some contests |
| 10000730200078 | |
| 10000750200023 | high write in |
| 10000760200044 | |
| 10000770200065 | |
| 10000780100079 | Note blue stain |
| 10000780200086 | |
| 10000790200010 | |
| 10000800200031 | |
| 10000810200052 | |
| 10000820200073 | blue ink in write margin, writing in contest and blank part |
| 10000830200094 | squiggle cross out, high write in, low write in |
| 10000840200018 | blue ink in right margin near numbers |
| 10000850200039 | crossout into left margin by lower numbers |
| 10000860200060 | |
| 10000870200081 | |
| 10000880200005 | light ink blotches in columns 1 and 2, (NO) as correction |
| 10000890200026 | |
| 10000900200047 | dark mark left column at bar code |
| 10000910200068 | contests x'd |
| 10000920200089 | water/tea/coffee stains |
| 10000930200012 | initials in column 1, dark scribbled crossout |
| 10000940200034 | slashes exit columns into right margin |
| 10000950200055 | chevron in magic marker right first column? |
| 10000960200076 | torn and folded-over left margin |
| 10000970200097 | mark into left margin near bar code (note two high writeins of same name in same handscript) |
| 10000980200021 | tall slashes through multiple contests |
| 10000990200042 | |
| 10001000200063 | |
| 10001010200084 | |
| 10001020200008 | |
| 10001030200029 | write in into left margin |
| 10001040200050 | slight mark into lower left margin |
| 10001050200071 | marks into lower left margin |
| 10001060200092 | |
| 10001070200016 | |
| 10001080200037 | |
| 10001090200058 | squiggle crossout of check mark |
| 10001100200079 | |
| 10001110200003 | |
| 10001120200024 | initials first column |
| 10001130200045 | |
| 10001140200066 | |
| 10001150200087 | initials in left column |
| 10001160200011 | |
| 10001170200032 | |
| 10001180200053 | |
| 10001190100067 | note signed second column, stamped |
| 10001190200074 | |
| 10001200200095 | |
| 10001210200019 | |
| 10001220200040 | marks into lower left column |
| 10001230200061 | slash top left |
| 10001240200082 | crossout into left column |
| 10001250200006 | |
| 10001260200027 | extremely heavy crossout |
| 10001270200048 | fold on top right, issue at bottom, note purple streak through third column vote ops. |
One issue that becomes apparent on examination of the Champaign composite images is the entry of “judge’s initials” into an area which may be tested by optical scan equipment to determine vote columns. Examples are in the images below:


4.10 Scanner streaking
In order to get a sense of the impact of scanner streaking, unvoted marks from 1000 ballot sides were examined using the same technique as that used on all voted and ambiguous marks. None appeared human marked.
Within the unvoted mark interiors, a region of 29 × 70 was examined. Of 22 844 marks, 22 334 had 0 pixels in the center two intensity quartiles and 510 had one or more pixels in those quartiles. 318 had more than 30 pixels in those quartiles, 247 had more than 60, and 34 had more than 80.
For unambiguous nonvotes, then, fewer than 1 % of the marks were impacted to the extent of having a 1 pixel wide vertical streak down the interior, and an additional 1 % were impacted to the extent of having streaking of two or more pixels in width. The largest impact of this streaking was to move pixels from the highest intensity quartile to the second intensity quartile. In the affected marks, the difference in intensity would have consisted of fewer than 70 of 2030 pixels being darkened, generally by no more than half intensity.
Approximately 2 % of marks may have had up to 4 % of their interior pixels darkened by anywhere from 1/4 to 1/2. Streaking might lower average interior intensity of a typical affected mark by 2 % in the worst case scenario, where the box is otherwise white along the streak. Because the vast majority of marks have more than 80 % coverage, the actual impact on the typical affected mark is likely to be no more than 0.4 % darkening, and this occurs on no more than 2 % of the marks.
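The arithmetic behind these estimates can be laid out step by step. The sketch below uses only figures quoted in this section (a 29 × 70 interior, up to 70 streak pixels, darkening of up to one half, and 80 % mark coverage); the variable names are illustrative.

```python
interior_pixels = 29 * 70   # examined interior region
streak_pixels = 70          # upper bound on pixels darkened by a streak

# Share of the interior touched by the streak: roughly 3.4 %, "up to 4 %".
streak_fraction = streak_pixels / interior_pixels

# Worst case: every streak pixel was white and is darkened by one half,
# lowering average interior intensity by roughly 2 %.
worst_case_darkening = streak_fraction * 0.5

# Typical voted mark: more than 80 % of the interior is already dark, so at most
# about 20 % of the streak falls on pixels that can still be darkened.
typical_darkening = worst_case_darkening * (1 - 0.80)

print(f"{streak_fraction:.1%} {worst_case_darkening:.1%} {typical_darkening:.2%}")
```

The typical figure comes out slightly under the 0.4 % bound stated above, because the text rounds the worst case up to 2 % before applying the coverage factor.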
5. EVERETT / BROKEN ARROWS
Everett’s Sequoia/Dominion ballots ask the voters to indicate their choices by connecting two halves of a “broken arrow” with a line. This results in much less range for variability than in the oval and rectangle targets.
The following analysis is still based on an incomplete subset of the vote database.
More than 5.1 million arrow targets were captured from the ballot sample, of which more than 1 500 000 were marked by the voter.
We measured the line heights, tilts, and colors of the marks, and isolated marks where the lines did not go all the way to the printed target. The targets remain divided into separate sets for the ballot fronts and ballot backs. There are no significant variations in statistics between the two sets; the graphs present the back data unless otherwise specified.
5.1 Line Heights
The heights of marks were tested at a series of locations across the break in the arrow. Location “b” is near the beginning of the break, location “l” near the end, and locations “e” and “h” nearer the center. No significant difference was noticed in the heights at the differing locations.
The most common heights were 6 pixels and 7 pixels, approximately 0.5 mm to 0.6 mm, making up 21 % and 20 % of all marks, respectively. Only 3 % of lines spanned 4 or fewer pixels, 10 % spanned 15 pixels or more, 1 % spanned 29 or more, and 0.1 % spanned 39 or more. … were greater than 27 pixels, and only 0.01 % of heights contained 36 pixels.
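The pixel-to-millimeter figures can be checked with a one-line conversion. The sketch assumes a scan resolution of 300 dots per inch, the resolution stated for the Vancouver scans; the Everett resolution is not restated in this section, but the quoted 0.5 mm to 0.6 mm range is consistent with it.

```python
MM_PER_INCH = 25.4
DPI = 300  # assumed scan resolution (stated for the Vancouver scans)


def pixels_to_mm(pixels, dpi=DPI):
    """Convert a length in scanned pixels to millimeters."""
    return pixels / dpi * MM_PER_INCH


for height in (6, 7):
    print(height, "pixels =", round(pixels_to_mm(height), 2), "mm")
```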

Figure 31: Distribution of line heights

Figure 32: Distribution of line heights, 10 pixels and greater

Figure 33: Marked lines spanning two pixels, 25 pixels into gap

Figure 34: Marked lines spanning six pixels, 25 pixels into gap

Figure 35: Marked lines spanning 11 pixels, 25 pixels into gap

Figure 36: Marked lines spanning 31 pixels, 25 pixels into gap

Figure 37: Marked lines spanning 51 or more pixels, 25 pixels into gap
5.2 Line tilt
Lines showed a tendency to move very slightly downwards, by about a single pixel, moving from left to right. Very few lines showed more than 5 pixels (0.4 mm or 1/60 inch) of tilt.

Figure 38: Distribution of line tilts across arrow gap

Figure 39: Distribution of line tilts across left half of arrow gap

Figure 40: Distribution of line tilts across right half of arrow gap
5.3 Line extent
Of 777 727 arrow targets marked at the center, 952 (0.12 %) were found to be unmarked at 5 pixels from the gap start, 738 (0.09 %) were found to be unmarked at 10 pixels from the gap start, and 649 (0.08 %) were found to be unmarked at 15 pixels from the gap start. 2125 marks (0.27 %) failed to reach within 5 pixels of the gap end, 1049 (0.13 %) failed to reach within 10, 708 (0.09 %) failed to reach within 15.
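A test of line extent along these lines might be sketched as follows. It assumes each mark has been reduced to one flag per pixel column of the arrow gap, true where the column contains a darkened pixel; that representation, the 50-pixel gap width in the example, and the function name are all assumptions made for illustration.

```python
def extent_gaps(marked, offsets=(5, 10, 15)):
    """Report the test offsets at which a line is missing near each gap edge.

    marked[i] is True when column i of the arrow gap contains a darkened pixel.
    Offsets are measured inward from the gap start and from the gap end.
    """
    width = len(marked)
    return {
        "start": [d for d in offsets if not marked[d]],
        "end": [d for d in offsets if not marked[width - 1 - d]],
    }


# A hypothetical 50-pixel gap where the voter's line begins 8 pixels late.
late_start = [False] * 8 + [True] * 42
print(extent_gaps(late_start))
```

Here the line is reported as unmarked 5 pixels from the gap start but present at every other test position.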

Figure 41: Lines failing to reach left edge of arrow gap

Figure 42: Lines failing to reach right edge of arrow gap
5.4 Line intensity and color
The lines’ red, green, and blue intensities were measured across vertical test stripes, from the first darkened pixel to the last. These intensities were also used to calculate hue, saturation, and value in the HSV color model.

Figure 43: Distribution of red, green and blue mean intensity

Figure 44: HSV ‘hue’ distribution

Figure 45: HSV ‘saturation’ distribution


Figure 46: Lines with hues of varying color.
5.5 Marks away from broken arrows
Composite images were generated from 30 000 ballot sides. The composite images were generated by overlaying sets of scans from ballots with the same layout codes, taking for each pixel location the darkest pixel of any ballot image in the set. The images were aligned at an upper left corner landmark and derotated to roughly align throughout the image.
Because write-ins were not present in the scanned set of ballots, the composite images are remarkably clear of stray marks.
The voters’ marks connecting the arrow halves stay well within the arrow boundaries, almost entirely within the vertical range defined by the shaft of each arrow rather than the head.
The most serious potential problem appears to be from the folding and unfolding of the ballots, resulting in cut lines just above the entry for Brad Owen in the Lieutenant Governor race. Cut lines are also visible elsewhere along the folds.
Several voters appeared to place preliminary marks to the left of the arrows, prior to filling in the arrows.
A typical composite image follows.

Figure 47: Composite of Everett ballots
| F4 | c2, Treas, Auditor |
| F7 | c2, LG |
| F9 | c1, 1000 |
| F12 | c1, 985, c2, Rep |
| F13 | c1, 1000 |
| F14 | c1 mark right of column, explanation points within |
| F15 | c2, Gov |
| F17 | c2, US Rep line |
| F19 | c1, 985, c2, state, c2 LG |
| F21 | c1, Pres |
| F22 | c2, SoS |
| F25 | marks to left of choices |
| F26 | c2, US Rep |
| F27 | c2, SoS |
| F28 | c1, 985, c2, US Rep |
| F36 | c2, USRep crossout |
| F37 | c1, Pres, c2, Auditor |
| F41 | c1, 1029, c2, Auditor |
| F47 | c2, SoS, c2, Auditor |
| F48 | c2, Auditor |
| F53 | c2, clipped upper right |
| F60 | c1, torn upper left |
| F61 | c2, clipped upper right |
| F62 | c2, Gov/LG boundary |
| F63 | c1, right margin |
| F68 | c2, stray mark by Treasurer |
| F72 | c2, US Rep |
| F77 | c2, US Rep |
| F78 | c2, US Rep |
| F79 | c2, SoS, Auditor |
| F81 | c2, above USRep |
| F82 | c1, footprint?, c2, checkmark at SoS |
| F83 | c2, line USRep |
| F84 | c1, 1029 crossout No |
| F85 | c1, 1000 circle yes |
| F89 | c2, cut at LG |
| F91 | c1, 1000 stray mark in arrow column, misc in c2 |
| F97 | c1, brown stain left margin |
| F114 | c1, lines, c2, lines |
| F119 | c1, 985 checkmark, c2, brown stain in arrow tail channel |
| F123 | c2, brown speck Gov |
| F124 | c1, line |
| F132 | c1, arrows, c2 arrows |
| F139 | c2, LG |
| F146 | c2, Gov crossout |
| F147 | c1, 1000, c2, SoS |
| F155 | c2, stray line LG |
| F157 | c1, cutline, c2, USRep, blueline |
| F161 | c1, circlesx2, c2, Gov |
| F162 | c1, arrows, c2, arrows |
| F163 | c2, USRep |
| F164 | c1, 985 water damage and scribble, c2, USRep scribble, top damage |
| F166 | c1, Pres w-i stray mark, c2, stray line |
| F170 | c1, 985 x |
| F172 | c2, Treas stray mark |
| F173 | c2, brown stains |
| F174 | c1, cutmark, c2, cutmark |
| F176 | c1, crossout x 3 |
| F178 | c2, Gov |
| F181 | c1, lines x 3 |
| F183 | c1, margin comment |
| F186 | c1, smudge near top of arrow channel |
| F186 | c2, crossed-out write-in, no write-in arrow |
| F190 | c1, 1000 explanatory correction text |
| F197 | c2, cutline |
| F200 | c1, light asterisk |
| F201 | c2, cut line at Auditor |
| F203 | c2, cut line at LG (stopping note of this) |
| F207 | c2, cut line at LG, stray marks beneath |
| F208 | c1, 1000 below arrow |
| F213 | c2, cut line at LG (noted because severe) |
| F219 | c1, "sorry" in arrow column at Pres |
| F221 | c2, low line at Treas |
| F229 | c1, crossouts |
| F232 | c1, 1029 low lines |
| F241 | c1, c2, bluegreen dashes |
| F255 | c2, State vertical line |
| F260 | c2, blue and red marks |
| F265 | c2, speck at Treas |
| F268 | c2, USRep |
| F272 | c2, stain at Gov |
| F276 | c1, dashes to left of options, dash above 1000 arrow |
| F292 | c1, scribble near top, stain, arc in many, low line in 985 |
| F310 | c1, 985 no |
| F318 | c2, blob at halftone |
| F322 | c1, dashes, c2, dashes |
| F328 | c1, blue tail beneath 1029 |
| F353 | c1, 985 yes |
| F354 | c1, tear at top, red in margin |
| F356 | c2, curve near top touches barcode |
| F359 | c2, Treas stray mark, Auditor low line |
| F360 | c1, left margin stray mark |
| F367 | c2, crossout USRep |
| F370 | c2, several crossouts |
| F372 | c1, tear, c2 Gov lowline, Auditor lowline |
| F373 | c1, 985 low line |
| F374 | c1, c2, dashes |
| F375 | c2, dash SoS |
| F377 | c2, Gov |
| F384 | c1, dash and crossout, c2, Treas stain |
| F386 | c2, torn at top |
| F387 | c1, purple spread |
| F390 | c2, dash |
| F394 | c1, stray mark arrow yes 985 |
| F395 | c1, zigzag arrow 985, c2, stray mark SoS, Auditor |
| F398 | c1, stray blue line, c2, torn at top |
| B8 | c1, tear upper left |
| B9 | c1, CPL line above arrow |
| B11 | c1, SPI smudge above arrows |
| B13 | c2, zigzag at arrow |
| B14 | c1, right margin stray mark |
| B15 | c1, blue line beneath arrow |
| B17 | c1, AG Ladenburg arrow IC Adams arrow, c2, scribble near bottom |
6. CHAMPAIGN / OVALS
Two sets of Champaign County ballots were examined in order to reach a sample size of 100 000 ballots. Approximately 20 000 ballots from the February 2008 election were examined separately from more than 80 000 ballots from the November 2008 election.
More than 3.9 million vote ovals were captured from the November ballot sample, of which more than 1.4 million were marked by the voter. An additional 800 000 ovals were captured from the February ballot sample.
Votes were examined by cropping regions of 87 × 60 (5220) pixels, with the printed vote target bounded by a rectangle of 72 × 30 (2160) pixels, or approximately 40 % of the crop region. Cropped regions not containing a centered vote oval have been almost entirely removed from the data, but several hundred such regions may remain in the more than 1 125 000 ovals studied. Because these represent fewer than 0.1 % of the ovals they are not believed to represent a problem to the analysis.
With a cropped region’s pixels grouped into four intensity quartiles, the typical nonvote had in the vicinity of 4800 to 4900 pixels in the highest of the four intensity ranges, with another 150 to 200 pixels in each of the next two quartiles and fewer than 30 pixels in the lowest quartile.
The typical vote removed 1500 pixels from the top intensity quartile and increased the pixel count in the low two intensity ranges to between 1500 and 1600. For the red channel, only about 200 pixels were darkened to the lowest quartile, but for the green and blue channels approximately 1000 pixels were darkened to the lowest quartile.
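The quartile bookkeeping in the two paragraphs above can be illustrated with a short sketch. It assumes that "intensity quartiles" means four equal ranges of the 0 to 255 scale (0 to 63, 64 to 127, 128 to 191, 192 to 255); the report does not state the exact boundaries. The function name and the synthetic crop are hypothetical.

```python
from typing import Iterable, List


def quartile_counts(pixels: Iterable[int]) -> List[int]:
    """Count pixels in each of four equal intensity ranges, darkest range first."""
    counts = [0, 0, 0, 0]
    for value in pixels:
        if not 0 <= value <= 255:
            raise ValueError(f"intensity out of range: {value}")
        counts[value // 64] += 1  # assumed boundaries: 64, 128, 192
    return counts


# Synthetic 87 x 60 crop (5220 pixels) resembling the typical nonvote described above.
nonvote = [250] * 4850 + [160] * 175 + [100] * 175 + [30] * 20
counts = quartile_counts(nonvote)
```

Here `counts` lists the lowest quartile first, so the final entry corresponds to the 4800 to 4900 pixel range reported for a typical nonvote.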
The following charts show the change in distribution of pixel counts as ovals are voted, first in the November ballot set and then in the February ballot set.

Figure 48: Distribution of pixel counts by intensity quartile, all, November

Figure 49: Distribution of pixel counts by intensity quartile, all, February
The following charts show the distribution of pixel counts by quartile in only the voted ovals, first in the November ballot set and then in the February ballot set.

Figure 50: Distribution of pixel counts by intensity quartile, voted, November

Figure 51: Distribution of pixel counts by intensity quartile, voted, February
The average intensity of the cropped regions drops from above 240 to approximately 190. The average intensity of voted ovals is shown both compared with nonvoted ovals and on an expanded y axis. In addition, the average intensity of the marked area along the vertical centerline of an oval is shown. Because this picks up only the marked pixels, it shows a lower intensity than the cropped rectangle as a whole, peaking at approximately 80 rather than 190.

Figure 52: Distribution by average intensity, November

Figure 53: Distribution by average intensity, February

Figure 54: Distribution by average intensity, November

Figure 55: Distribution by average intensity, February

Figure 56: Distribution by average intensity, November

Figure 57: Distribution by average intensity, February
Using the crop area’s average intensity, fewer than 1 % of votes have average intensity less than 163/255, and approximately 10 % of votes have average intensity less than 179/255. Half of voted ovals have an intensity across the cropped region of between 184 and 195, and fewer than 1 % of voted ovals have an intensity across the cropped region of 213 or above.
Using the vertical centerline, fewer than 1 % of votes have average intensity less than 47/255, and approximately 10 % of votes have average intensity less than 60/255. Half of voted ovals have an intensity along their vertical centerline of between 68 and 94, and fewer than 1 % of voted ovals have an intensity along their vertical centerline of 146 or above.
The characteristics of the marks, as expected, change as the cropped regions’ average intensity changes. The darkest cropped regions contain marks that were filled well outside of the printed target. The following marks have cropped region intensities below 120/255:

Figure 58: Marks in darkest group
The following marks have cropped region intensities from 120 through 149; keep in mind that these represent less than 0.2 % of voted marks:

Figure 59: Marks in dark group
Marks in the darkest 10 % (excluding those in the darkest 0.2 %) show nearly full coverage in the target area and some excess as well. Automark printed marks show up in this set, at the right of the eighth row:

Figure 60: Marks in dark/normal group
Marks in crops with the average intensity tend to be neatly filled in. However, some crops with this intensity contain marks with less than complete coverage, with marking outside the target contributing to the intensity drop.
Because the following montage represents the most typical marks, it is a useful place to point out characteristics that can be used to distinguish marks. (The image is divided into 8 blocks, each of which contains a 5 × 5 grid of marks. The blocks will be referred to as A to D down the left, then E to H down the right; rows and columns within a block will be designated r1 to r5 and c1 to c5. The mark at Er2c5 has a loop above the target.)

Figure 61: Marks in typical intensity group
In addition to hue, brightness, “transition count,” and writing implement used, marks can be characterized by the presence and location of substantial voids and the manner in which the voter filled the target. Most of the marks in this typical set show that voters attempted to follow the target outline, probably starting at the perimeter and moving inward in an elliptical motion (the interior is often left slightly lighter than the rest).
Unusual strokes: Cr2c4 shows diagonal lines rather than elliptical curves, and Dr2c2 shows vertical lines. Dr5c3 shows a random pattern. Gr2c4 and Gr2c5 show a compromise between following the ellipse and drawing diagonal lines. Marks with typical average intensities are still filled neatly, with lighter ink or pencil.
Voids: Ar1c5 shows a void at upper left; in addition, the entire mark is shifted right and down from the target. Br2c1 shows this to a lesser degree. Er2c1 and Er3c2 show minor voids but no shift of the mark with respect to the target.
Out of bounds: Gr5c5 shows a mark going substantially out of bounds to the left, and Ar1c5 goes substantially out of bounds to the right. Er2c5 goes out of bounds above the target. Gr2c4 and c5 both go out of bounds beneath the target.
As intensities rise, marks are incompletely filled in, and “x” marks, check marks, hollows, and miscellaneous variants appear.
The following three montages show, first, marks typical of those in the lightest 1.5 %, and then marks in the last 0.4 % and the last 0.1 %.

Figure 62: Marks in lightest 1.5 % of targets passing vote tests

Figure 63: Marks in lightest 0.4 % of targets passing vote tests

Figure 64: Marks in lightest 0.1 % of targets passing vote tests
6.1 Spoiled Ballots
Ballots in the 178 000 range were spoiled by the voter. These have been included because they are a rich sample of problematic marks. However, when an election official wrote “SPOILED” across them, the writing produced artifacts: marks that cross the cropped regions without any connection to the target. These marks are distinguishable from voter-made marks and are concentrated in the range with average red intensity above 230.
6.2 Hesitation Marks
It is important that vote counting equipment be able to distinguish the marks by which a voter typically indicates their choice from the marks which probably occur when a voter touches their marking implement to their ballot without intending to register a vote.
Vote ovals in the previous montages represent only ovals which passed either a general intensity test (below 720 for red, green, and blue intensity values combined; each on a scale of 0 to 255) or a number of darkened pixels test (more than 300 pixels in the lowest half of intensity values).
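The two tests described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: a crop is taken to be a flat list of (red, green, blue) pixels, and "lowest half of intensity values" is assumed to mean a per-pixel channel average below 128. All names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0 to 255

COMBINED_INTENSITY_LIMIT = 720.0  # from the text: below 720 counts as voted
DARK_PIXEL_LIMIT = 300            # from the text: more than 300 dark pixels


def combined_intensity(crop: List[Pixel]) -> float:
    """Sum of the crop's average red, green, and blue values (0 to 765)."""
    return sum(r + g + b for r, g, b in crop) / len(crop)


def dark_pixel_count(crop: List[Pixel]) -> int:
    """Pixels whose channel average is below 128 (assumed reading of 'lowest half')."""
    return sum(1 for r, g, b in crop if (r + g + b) / 3 < 128)


def passes_vote_test(crop: List[Pixel]) -> bool:
    """A crop passes if either the intensity test or the dark pixel test succeeds."""
    return (combined_intensity(crop) < COMBINED_INTENSITY_LIMIT
            or dark_pixel_count(crop) > DARK_PIXEL_LIMIT)


# Synthetic 5220-pixel crops: one blank, one with 1500 darkened pixels.
blank_crop = [(245, 245, 245)] * 5220
voted_crop = [(40, 40, 60)] * 1500 + [(245, 245, 245)] * 3720
```

With these synthetic values the blank crop has a combined intensity of 735 and fails both tests, while the voted crop passes both.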
Ovals which failed both of the above tests but were between 720 (240 × 3) and 735 (245 × 3) in combined red, green, and blue average intensity were further checked for small marks. Each pixel in a central 41 × 14 region was examined (the printed ovals are 72 × 30, with interiors of maximum width 67 and height 25), and the presence of any pixel darkened by at least 1/4 was considered a mark. This generated 2860 marked ovals from a set of approximately 2 500 000 “unmarked” targets, or approximately 0.1 % of the set initially thought to be “unmarked”.
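A minimal sketch of the small-mark check follows. It assumes a 72 × 30 grayscale oval image stored as rows of 0 to 255 values, a centered 41 × 14 search region, and that "darkened by at least 1/4" means a value at or below three quarters of full-scale white; the report does not specify the baseline, so that threshold is an assumption. The function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale values, 0 = black, 255 = white

SPECK_THRESHOLD = 255 * 0.75  # assumed meaning of "darkened by at least 1/4"


def central_region(oval: Image, width: int = 41, height: int = 14) -> Image:
    """Return the centered width x height window of the oval image."""
    top = (len(oval) - height) // 2
    left = (len(oval[0]) - width) // 2
    return [row[left:left + width] for row in oval[top:top + height]]


def has_speck(oval: Image) -> bool:
    """True if any pixel in the central region is at or below the threshold."""
    return any(v <= SPECK_THRESHOLD for row in central_region(oval) for v in row)


def blank_oval() -> Image:
    """Synthetic 72 x 30 oval image with uniform light background."""
    return [[245] * 72 for _ in range(30)]


speck_in_center = blank_oval()
speck_in_center[15][36] = 150   # dark pixel near the middle

speck_at_corner = blank_oval()
speck_at_corner[1][1] = 150     # dark pixel outside the central window

region = central_region(blank_oval())
```

Only the first synthetic oval is flagged, because the corner pixel lies outside the 574-pixel central window.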
A subsample of 65 171 ovals at a combined intensity of exactly 729.0 was taken for further testing. This subsample returned 73 hits in the central rectangles, which were 574 pixels in size, for a hit frequency of 0.13 per pixel position. The subsample was then searched more thoroughly for low intensity pixels by masking off an approximately 10 pixel wide ring around the oval. This search found 83 hits over a region of 1451 pixels, giving only 10 additional hits in 877 additional pixels, or a per pixel hit rate outside the central rectangle of 0.01. This suggests that roughly 90 % of specks (73 of 83) were in the central rectangle contained within the printed oval, where the per pixel hit rate was roughly ten times the rate outside it.
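The arithmetic behind these rates can be checked directly from the counts given above. The variable names are illustrative only.

```python
# Counts taken from the paragraph above.
central_hits = 73
central_pixels = 41 * 14          # 574-pixel central rectangle
total_hits = 83
total_pixels = 1451

outer_hits = total_hits - central_hits        # additional hits outside the rectangle
outer_pixels = total_pixels - central_pixels  # additional pixels searched

central_rate = central_hits / central_pixels  # hits per pixel position, central
outer_rate = outer_hits / outer_pixels        # hits per pixel position, outside

share_by_count = central_hits / total_hits    # fraction of hits in the rectangle
rate_ratio = central_rate / outer_rate        # how much denser the central hits are

print(f"central rate {central_rate:.2f}, outer rate {outer_rate:.2f}")
print(f"share in central rectangle {share_by_count:.0%}, rate ratio {rate_ratio:.1f}")
```

By count, about 88 % of the hits fall in the central rectangle; per pixel position, the central rate is about eleven times the outer rate.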
The distribution of specks shows a high rate in the small number of ovals with average intensity 240.0 to 242.0, then drops to approximately 0.05 % of all unvoted ovals. (Note that this figure includes many specks which are barely visible.)
| Combined RGB intensity | Ovals | Ovals with specks | Speck rate | Equivalent average intensity |
| 720 | 158 | 54 | 35 % | 240.0 |
| 721 | 341 | 57 | 17 % | |
| 722 | 1090 | 69 | 6 % | |
| 723 | 3589 | 104 | 3 % | 241.0 |
| 724 | 9830 | 153 | 2 % | |
| 725 | 22140 | 257 | 1 % | |
| 726 | 49378 | 399 | 1 % | 242.0 |
| 727 | 151905 | 509 | 0.3 % | |
| 728 | 438310 | 539 | 0.1 % | |
| 729 | 750478 | 386 | 0.05 % | 243.0 |
| 730 | 606402 | 156 | 0.03 % | |
| 731 | 183340 | 65 | 0.04 % | |
| 732 | 67864 | 56 | 0.08 % | 244.0 |
| 733 | 72957 | 39 | 0.05 % | |
| 734 | 57238 | 17 | 0.03 % | |
The specks vary substantially in size depending upon the exact intensity at which they were found:

Figure 65: Specks, intensity 240 to 241

Figure 66: Specks, intensity 241 to 242

Figure 67: Specks, intensity above 242
6.3 Color and Tint
The following graphs summarize the red, green, and blue intensities found in the cropped rectangles and across the horizontal span of the contained ovals. Blue and green are consistently at lower intensity than red:

Figure 68: Red, green and blue intensities of crop, November

Figure 69: Red, green and blue intensities of crop, February

Figure 70: Red, green and blue centerline mark intensities, November

Figure 71: Red, green and blue centerline mark intensities, February
The graphs are followed by montages of marks at different H values in the HSV color system.

Figure 72: Hue distribution

Figure 73: Hue distribution, expanded y axis

Figure 74: Hue approximately 0.95

Figure 75: Hue 0.22 to 0.27

Figure 76: Hue 0.62 to 0.67
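The hue values quoted in the figure captions are on a 0 to 1 scale. One way such a value could be obtained for a mark is sketched below, using Python's standard `colorsys` module. The report does not say how hue was computed, so averaging the marked pixels before conversion is an assumption, and the function names are hypothetical.

```python
import colorsys
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0 to 255


def mean_rgb(pixels: List[Pixel]) -> Tuple[float, float, float]:
    """Average each channel over the supplied pixels."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n,
            sum(p[1] for p in pixels) / n,
            sum(p[2] for p in pixels) / n)


def mark_hue(pixels: List[Pixel]) -> float:
    """Hue (0 to 1) of the average color of the supplied marked pixels."""
    r, g, b = mean_rgb(pixels)
    hue, _saturation, _value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return hue


# Synthetic blue-ink pixels; real marks would supply their darkened pixels here.
blue_ink = [(40, 60, 200), (50, 70, 210)]
hue = mark_hue(blue_ink)
```

For these synthetic blue-ink values the hue falls in the 0.62 to 0.67 band used for one of the montages.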
6.4 Horizontal Spans, Voted Ovals

Figure 77: Horizontal spans, November set

Figure 78: Horizontal spans, February set
6.5 Vertical Spans, Voted Ovals

Figure 79: Vertical spans, November set

Figure 80: Vertical spans, February set
6.6 Transition Counts
The transition count is the number of light-to-dark transitions encountered along a line after the first dark pixel (typically the left edge of the printed oval). Higher transition counts indicate that light regions were traversed between dark regions; typical examples would be “x” marks, check marks, zigzags, and hollow marks not in contact with the printed target.
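The definition above can be sketched for a single row of grayscale pixels. The dark/light threshold of 128 is an assumption, since the report does not state the cutoff used, and the function name is hypothetical.

```python
from typing import Sequence


def transition_count(row: Sequence[int], dark_threshold: int = 128) -> int:
    """Count light-to-dark transitions that occur after the first dark pixel."""
    count = 0
    seen_dark = False       # becomes True at the first dark pixel
    previous_dark = False
    for value in row:
        is_dark = value < dark_threshold
        if is_dark and not previous_dark:
            if seen_dark:
                count += 1  # a new dark run after an intervening light region
            seen_dark = True
        previous_dark = is_dark
    return count


# A solidly filled oval: one continuous dark run across the centerline.
filled = [255] * 5 + [30] * 60 + [255] * 5

# An "x"-like mark: oval edge, light gap, stroke, gap, stroke, gap, oval edge.
x_mark = ([255] * 5 + [30] * 2 + [255] * 20 + [30] * 4 + [255] * 10
          + [30] * 4 + [255] * 20 + [30] * 2 + [255] * 5)

filled_count = transition_count(filled)
x_count = transition_count(x_mark)
```

The filled row yields 0 and the "x"-like row yields 3, which matches the intuition that scribbled or hollow marks produce higher counts.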
The following graphs show the distribution of transition counts as measured at the horizontal centerline of voted marks. The second graph uses an expanded y axis.

Figure 81: Transition counts

Figure 82: Transition counts, expanded y axis
The following montages compare typical marks with a transition count of 0 with marks with a transition count of 4. Although the difference between these two sets of marks is apparent, it is not clear that the transition count as calculated can be used to give much detail with regard to “degree of scribbledness.”

Figure 83: Transition count 0

Figure 84: Transition count 4
7. POSSIBLE MARK TAXONOMY AND NOTATION
Marks placed on oval and rectangular targets can be characterized by the intensity of the cropped region surrounding the target and/or by the intensity of the pixels across a span (for example, the centerline of the mark, from first darkened pixel to last). In addition, the hue of the mark can be used to characterize it, as can the marking implement used (when this can be discerned). The horizontal and vertical spans of the marks can be used as well, as can the number of transitions along a particular line.
Should additional characteristics be necessary, the nature of the stroking and coverage can provide additional dimensions. It is unclear whether a test set really needs to take these variations into account, but a relatively compact notation for the stroking could be as follows. The first part is based on compass direction notation:
| W | out of bounds to left (west) |
| E | out of bounds to right (east) |
| N | out of bounds beyond top (north) |
| S | out of bounds beyond bottom (south) |
| NW,NE,etc... | out of bounds at top left, top right, etc... |
| w | void at interior left |
| e | void at interior right |
| n | void at interior top |
| s | void at interior bottom |
| o | void at center |
| nw,ne,etc... | void at top left, void at top right, etc... |
| V | vertical strokes |
| H | horizontal strokes |
| T | strokes conforming to target ellipse |
| t | strokes conforming to target ellipse, not extending to printed target |
| F | diagonal strokes leaning forward |
| B | diagonal strokes leaning backward |
| X | an X or check mark (voids would be assumed) |
| C | a surrounding circle (though this is extremely rare) |
| D | an interior dot or dash |
| G | uniform light coverage, no strokes evident |
| R | random or not otherwise defined stroke pattern |
| ! | modifier suffix indicating the prior pattern is major |
| r | modifier suffix indicating strokes are rounded (curlicues) |
| [0..9] | alternate modifiers indicating degree to which prior pattern exists |
Using this notation, mark Ar1c5 of the montage in Figure 61 (page 68) could be described as “SE nw T”. This notation could be extended to incorporate the other mentioned characteristics: intensity, writing implement used, predominating color, etc.
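To show that the proposed notation is mechanically readable, the following sketch expands a description such as “SE nw T” into plain text. It assumes tokens are separated by spaces and that modifiers (`!`, `r`, or a digit) trail the code they modify; neither convention is fixed by the table above, and the function names are hypothetical.

```python
from typing import List

COMPASS = {"n": "top", "s": "bottom", "e": "right", "w": "left",
           "ne": "top right", "nw": "top left",
           "se": "bottom right", "sw": "bottom left"}

STROKES = {
    "V": "vertical strokes",
    "H": "horizontal strokes",
    "T": "strokes conforming to target ellipse",
    "t": "strokes conforming to target ellipse, not extending to printed target",
    "F": "diagonal strokes leaning forward",
    "B": "diagonal strokes leaning backward",
    "X": "an X or check mark",
    "C": "a surrounding circle",
    "D": "an interior dot or dash",
    "G": "uniform light coverage, no strokes evident",
    "R": "random or not otherwise defined stroke pattern",
}

MODIFIER_CHARS = "!r0123456789"


def describe_token(token: str) -> str:
    """Expand one notation token, e.g. 'SE', 'nw', or 'T!'."""
    core, modifiers = token, ""
    # Peel trailing modifier characters, always leaving at least one code character.
    while len(core) > 1 and core[-1] in MODIFIER_CHARS:
        modifiers = core[-1] + modifiers
        core = core[:-1]
    if core in STROKES:
        text = STROKES[core]
    elif core == "o":
        text = "void at center"
    elif core in COMPASS:                      # lowercase compass: interior void
        text = f"void at interior {COMPASS[core]}"
    elif core.isupper() and core.lower() in COMPASS:  # uppercase: out of bounds
        text = f"out of bounds at {COMPASS[core.lower()]}"
    else:
        raise ValueError(f"unrecognized token: {token!r}")
    if modifiers:
        text += f" [modifiers: {modifiers}]"
    return text


def describe_mark(notation: str) -> List[str]:
    """Expand a space-separated mark description into one phrase per token."""
    return [describe_token(token) for token in notation.split()]


example = describe_mark("SE nw T")
```

For the example above, `example` contains three phrases: out of bounds at bottom right, void at interior top left, and strokes conforming to the target ellipse.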
8. NEXT STEPS
Following completion of the mark databases and mark characterization, procedures will be developed and documented for producing a set of reference marks on typical ballots using each of the three vote target types analyzed.
—END—