By Kevin Skoglund, Citizens for Better Elections
Published June 20, 2019
The barcodes on an ExpressVote paper record can be strategically edited to change vote selections to different candidates on the same ballot. As barcodes become more widely used in elections, it is imperative that election officials and citizens understand the parameters of barcode manipulation and how it could be used for malicious purposes.
“How the ExpressVote XL Could Alter Ballots” lists a hack named “The Fixer” which describes how hacked software could use the Opportunity to Mark design flaw on the ExpressVote XL voting machine to modify a barcode to a new value.
How does one edit a barcode and how easy is it to do?
It would not be easy for a person. Modifying a barcode requires calculating the necessary changes, making new marks with precision, and avoiding detection. A human could not perform the calculations quickly. Even someone equipped with a pre-designed stamp would have difficulty reading the barcodes, making precise marks, and avoiding detection.
The ExpressVote XL, on the other hand, is well-equipped for these tasks. It is capable of quickly calculating the required modifications. It already knows which barcodes were printed in which positions on the paper so that it can make alterations without needing to rescan the barcodes first. It can make additional marks with precision. It can use the same printing method as the initial marks which would make detecting modifications difficult. It has all of the fundamental traits the task requires. And it has the opportunity to edit any barcode—because of the single paper path design flaw.
There are three rules to follow when editing the lines in a barcode to convert it to a new value.
Of course, a barcode could be modified to make it unreadable or to a new value with no meaning. This is a potentially useful hack which could be used to invalidate the original vote or to create an error state in the voting machine. But “The Fixer” hack describes changing it to a new legitimate value, located somewhere on the same ballot. In the most serious circumstance, the new value will point to another candidate in the same contest, so that hacked software could “flip” a vote from one candidate to a different one.
A barcode in Code-128 format is composed of four parts: a start character, characters for the data being encoded, a check character, and a stop character.
| Start | Data | Check | Stop ||
The start and stop characters will not be important during modification. They simply tell the barcode scanner where to start and stop reading the data. They need to remain the same to keep the barcode readable. The data characters and the check character will be important. Those portions will need to be modified.
Each barcode character is made up of a small group of black lines and white spaces. The Wikipedia page for Code-128 provides a table of the characters. The table lists a value for each character in the first column. The ExpressVote barcodes will use the characters in the fourth column, labeled “128C”. The 128A and 128B columns encode numbers as single digits, while the 128C column encodes numbers 00 to 99 in pairs. The benefit is that the barcodes will be half as wide—a six-digit number can be represented by three barcode characters. In the second to last column, the table lists patterns for each barcode using zeros and ones as a convenient notation for the white spaces and black lines.
Here are the barcodes which represent 34 and 30 in Code 128C. The pattern of zeros and ones included below the barcodes makes the width of each of the lines obvious. What appears to be a thick line can also be thought of as several small lines next to each other, and a large white space can be thought of as several smaller columns without black lines. Each character is exactly 11 columns wide, where one column is the smallest line or space possible.
34
10001011000
30
11011011000
The first barcode character has four black lines and seven white spaces. This makes it a great candidate for editing. Black lines can be added to those seven white spaces in different combinations to get new values. The second barcode character has six black lines and five white spaces. It is not a strong candidate for editing. However, the second barcode character has many lines in common with the first barcode character when the two are lined up.
10001011000
 + +
11011011000
A 34 could be turned into a 30 by thickening the first two lines, specifically by adding two black lines to columns two and four. It is also clear that a 30 could never be turned into a 34 because that change would require removing two existing black lines.
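The comparison above reduces to one rule: a printed pattern can become another only if every black column in the original is also black in the target. Here is a minimal Python sketch of that check. The function names are hypothetical and chosen for illustration; they are not taken from any voting system software.

```python
def can_convert(original: str, target: str) -> bool:
    """Return True if `target` can be reached by only adding black columns."""
    if len(original) != len(target):
        raise ValueError("patterns must be the same width")
    # Every '1' (black) in the original must still be '1' in the target.
    return all(t == "1" for o, t in zip(original, target) if o == "1")


def columns_to_add(original: str, target: str) -> list[int]:
    """1-based columns that are white in `original` but black in `target`."""
    return [i for i, (o, t) in enumerate(zip(original, target), start=1)
            if o == "0" and t == "1"]


PATTERN_34 = "10001011000"
PATTERN_30 = "11011011000"

print(can_convert(PATTERN_34, PATTERN_30))     # True
print(columns_to_add(PATTERN_34, PATTERN_30))  # [2, 4]
print(can_convert(PATTERN_30, PATTERN_34))     # False
```

Ink can be added to paper but not removed, which is why the check only runs in one direction.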
The barcode data for a vote selection is always a six-digit number. The six-digit number corresponds to the position of the oval next to the candidate’s name on a standard hand-marked paper ballot. Even if the election will not use hand-marked paper ballots in the polling place, identifiers are assigned the same way. (Keep in mind that hand-marked paper ballots with ovals may still be used as absentee ballots.) A hand-marked paper ballot is laid out in a grid of 23 columns and up to 99 rows.
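The coordinates of any oval can be defined as a six-digit number which follows the pattern CCRRSP: two digits for the column, two digits for the row, one digit for the ballot side, and one digit for the page.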
A candidate with an oval in the ninth column, 34 rows down from the top, on side one of page one, would have an identifier of 093411. On the example paper ballot, column 9, row 34 contains an oval for Bea Have, a candidate for Judge of the Superior Court. The barcode would encode that number in pairs: 09, 34, 11.
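As a rough illustration of the CCRRSP pattern, the Python sketch below builds the identifier from grid coordinates and splits it into the three pairs that Code 128C encodes. The function names are hypothetical. The column and row limits come from the grid described above, while the single-digit limits on side and page are an assumption based on the one-digit fields.

```python
def oval_identifier(column: int, row: int, side: int, page: int) -> str:
    """Build the six-digit CCRRSP identifier for an oval position."""
    if not (1 <= column <= 23 and 1 <= row <= 99):
        raise ValueError("column must be 1-23 and row 1-99")
    # Assumption: side and page each occupy exactly one digit.
    if not (1 <= side <= 9 and 1 <= page <= 9):
        raise ValueError("side and page must each fit in one digit")
    return f"{column:02d}{row:02d}{side}{page}"


def code_c_pairs(identifier: str) -> list[int]:
    """Split a six-digit identifier into the three values Code 128C encodes."""
    return [int(identifier[i:i + 2]) for i in range(0, 6, 2)]


ident = oval_identifier(column=9, row=34, side=1, page=1)
print(ident)                # 093411
print(code_c_pairs(ident))  # [9, 34, 11]
```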
When an optical scanner reads the barcode, it will use that six-digit number to register a vote for the candidate with an oval at that position on the ballot. It works the same way as a filled-in oval at the same position would register a vote for the candidate.
A barcode vote can be changed to a vote for a new candidate if the data can be modified to the coordinates of a different candidate’s oval.
The check character plays an important role in the barcode. The purpose of the check character is to help the barcode scanner know that it has read the data characters correctly. When a barcode is being created, the characters being encoded are passed into a simple algorithm. The result of the algorithm is stored in the barcode, right after the data. When the barcode scanner reads the lines representing the data, it sends the values into the same algorithm and gets back a result. The scanner’s result should match the check character contained in the barcode. If not, then the barcode scanner assumes it scanned the lines incorrectly and keeps trying for a result that matches. This mechanism provides basic data integrity. It is not a robust security feature, but it does offer some protection against modification.
The algorithm is simple. As Wikipedia describes it: “It is calculated by summing the start code 'value' to the products of each symbol's 'value' multiplied by its position in the barcode string. The sum of the products is then reduced modulo 103. The remainder is then converted back to one of the 103 non-delimiter symbols.” Modulo 103 is a fancy way of saying divide by 103 and keep the remainder.
For example, the barcode data 093411 is calculated in 128C as:
Char | Value | Position | Value x Position |
---|---|---|---|
Start C | 105 | 1 | 105 |
09 | 9 | 1 | 9 |
34 | 34 | 2 | 68 |
11 | 11 | 3 | 33 |
Sum | | | 215 |
Mod 103 | | | 9 |
The barcode will include lines for the check character with a value of 9, to match the remainder.
9
11001001000
When the barcode scanner reads the data pattern, it will perform this same calculation and get the result 9. If the check character it reads is also 9 then it knows it read everything correctly. If not, then it keeps trying to scan a set of values that match the given check character.
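The table above can be reproduced in a few lines of Python. This is a minimal sketch of the check calculation as described here for Code 128C data. The names `START_C` and `check_value` are chosen for illustration.

```python
START_C = 105  # value of the Code 128 "Start C" character


def check_value(data_values: list[int]) -> int:
    """Start value plus each data value times its position, reduced modulo 103."""
    total = START_C
    for position, value in enumerate(data_values, start=1):
        total += value * position
    return total % 103


print(check_value([9, 34, 11]))  # 9
```

A scanner performing the same calculation on the values it reads would compare the result with the check character printed in the barcode.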
A barcode must have a check character which matches the encoded data to be valid.
As described earlier, a 34 can become a 30 by adding two black lines. It naturally follows that lines could be added to the barcode 093411 to attempt to make the barcode 093011.
This is a potentially useful modification for hacked software to make. The ninth column is commonly used for the middle column in a three-column ballot layout using hand-marked paper ballots. Row 34 is about halfway down the page where candidate names are likely. The modified value would be in the same column, four grid rows above the original. Since each candidate name typically takes up two grid rows on the ballot, there is a strong chance that these coordinates refer to two candidates in the same contest. It could allow flipping a vote!
The example paper ballot illustrates this well. It uses grid columns 1, 9, and 17 for its three columns of ovals, and 093011 is the oval for a candidate in the same contest, Jack B. Quick. Hacked software could potentially flip any barcoded votes for Bea Have to Jack B. Quick.
Here is a comparison of the original barcode and the barcode which results from adding two lines to make the 34 character into a 30 character.
| Start C | 9 |34 |11 | Check 9 | Stop ||
11010011100110010010001000101100011000100100110010010001100011101011
                       + +
11010011100110010010001101101100011000100100110010010001100011101011
| Start C | 9 |30 |11 | Check 9 | Stop ||
It is not so simple as adding two lines in this case. The modified barcode would be invalid because the check character does not match the data anymore. A barcode scanner would not accept it.
In some cases, it is possible to modify the encoded data characters and get a check character which also matches. There are over 150 such coordinates on one side of a ballot (around 7% of all coordinates), and many more if the new coordinates are allowed to be on another ballot side or page.
It is much more common that the check character will change as a result of the modification. There are 103 possible check characters (because the algorithm uses modulo 103, the remainder must be 0-102). The new barcode values can be sent through the algorithm to reveal which character the barcode scanner will expect.
Char | Value | Position | Value x Position |
---|---|---|---|
Start C | 105 | 1 | 105 |
09 | 9 | 1 | 9 |
30 | 30 | 2 | 60 |
11 | 11 | 3 | 33 |
Sum | | | 207 |
Mod 103 | | | 1 |
The calculation in the row for 30 is different than 34 in the previous table. It is now 60 instead of 68, a difference of 8. The remainder also went down by 8. After the data characters, the barcode scanner will expect to find the check character with a value of 1.
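Because each data value is weighted by its position, the new check value can be worked out from the old one without redoing the whole sum. The short Python sketch below illustrates this shortcut. The function name `shifted_check` is hypothetical.

```python
def shifted_check(old_check: int, position: int, old_value: int, new_value: int) -> int:
    """Check value after one data character changes from old_value to new_value."""
    # The weighted sum moves by position * (new_value - old_value).
    # Python's % always returns a non-negative result for a positive modulus.
    return (old_check + position * (new_value - old_value)) % 103


# Changing the second data character from 34 to 30 lowers the sum by 2 * 4 = 8.
print(shifted_check(9, position=2, old_value=34, new_value=30))  # 1
```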
1
11001101100
Often a different check character prevents modification. For example, as described earlier, a 30 cannot be converted into a 34. But in this case, it is a lucky combination. The initial check character has seven white spaces, making it easy to modify to a new value, and the check character needed for the new barcode lines up nicely. Two lines can be added to change the check character to match too!
11001001000
     +  +
11001101100
The complete edit would be:
| Start C | 9 |34 |11 | Check 9 | Stop ||
11010011100110010010001000101100011000100100110010010001100011101011
                       + +                       +  +
11010011100110010010001101101100011000100100110011011001100011101011
| Start C | 9 |30 |11 | Check 1 | Stop ||
A barcode could be converted successfully to a new valid value on the ballot by adding four black lines in precise locations. On the example ballot, this would flip a vote from Bea Have to Jack B. Quick.
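Putting the pieces together, the Python sketch below tests whether one six-digit identifier can be turned into another purely by adding black columns, taking the recalculated check character into account. It is an illustration under stated assumptions: `PATTERNS` is a partial table holding only the Code 128 patterns quoted in this example (any other value raises a `KeyError`), the function names are hypothetical, and the start and stop characters are left out because they never change.

```python
# Partial table: only the Code 128 patterns needed for this example.
PATTERNS = {
    1: "11001101100",
    9: "11001001000",
    11: "11000100100",
    30: "11011011000",
    34: "10001011000",
}
START_C_VALUE = 105


def check_value(values):
    """Start value plus position-weighted data values, modulo 103."""
    total = START_C_VALUE + sum(v * p for p, v in enumerate(values, start=1))
    return total % 103


def modules(identifier):
    """Zeros and ones for the three data characters plus the check character."""
    values = [int(identifier[i:i + 2]) for i in range(0, 6, 2)]
    values.append(check_value(values))
    return "".join(PATTERNS[v] for v in values)


def added_columns(original_id, target_id):
    """1-based columns to blacken, or None if a black column would have to be removed."""
    before, after = modules(original_id), modules(target_id)
    if any(b == "1" and a == "0" for b, a in zip(before, after)):
        return None
    return [i for i, (b, a) in enumerate(zip(before, after), start=1) if b != a]


print(added_columns("093411", "093011"))  # [13, 15, 39, 42]
print(added_columns("093011", "093411"))  # None
```

The column numbers count from the first data character, so columns 13 and 15 fall inside the second data character and columns 39 and 42 fall inside the check character.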
The example ballot has an even more serious vulnerability. There is a contest at the bottom of the first column for Straight Party. In many states, this option is included during a general election to allow voters to cast their votes for all of the candidates of a single party at once. By filling in a single oval, a voter can cast votes in every contest.
The second oval in the lower left of the example ballot would cast a vote for every candidate on the ballot who is a member of the Republican Party. The barcode representing this selection on an ExpressVote paper record would contain the data 015811 (column 1, row 58, side 1, page 1).
This barcode can be manipulated so that it contains 016011. That position is two grid rows lower and corresponds to the oval for the Green Party. It would require adding four additional lines.
| Start C | 1 |58 |11 | Check 49 | Stop ||
11010011100110011011001110110001011000100100110100011101100011101011
                            ++                  ++
11010011100110011011001110111101011000100100110111011101100011101011
| Start C | 1 |60 |11 | Check 53 | Stop ||
With a single edit, hacked software could affect every contest in the election, even on other ballot sides and other pages! It could steal votes from every Republican candidate and add votes for every Green Party candidate.
Barcodes in Code-128 format are editable. The data integrity features are suitable for preventing basic scanning errors, but offer weak protection against intentional data modification. Modification is facilitated by the grid layout of a ballot which creates predictable values at regular intervals.
Hacked software does not have the ability to change a barcode to any value it prefers. Many barcode characters are precluded by having black lines already printed where a white space would need to exist. The design of the ballot, the number of contests, and the number of candidates competing in each contest will determine which ovals can be manipulated most effectively. The opportunities will vary by ballot style. A vulnerable oval position on the ballot used in one precinct may not be occupied by an oval in another precinct. Hacked software must watch for opportunities and exploit them when available.
There are many potential opportunities. A long, 8.5" wide, single-sided ballot (with 23 columns, 99 rows) has 2,277 oval positions. If barcodes were created for all of them, 714 of those barcodes (31%) can be modified to a new value somewhere on the same side of the ballot. Most of those 714 vulnerable barcodes offer more than one option for modification. The white spaces can be filled in different combinations to achieve several values. For example, 093411 can be converted into nine valid barcodes, not just 093011. The barcode for 100811 has 26 possible alternate values. The average number of possible modifications is 3.22.
The given examples flip votes in the same contest. This is the most obvious use case. It is also possible to modify a vote in one contest to a vote in a different contest, even in a different column or on a different page. Chains of modifications are possible. Hacked software could shift a vote in one contest to a vote in another, then shift a vote in that second contest to a third contest, and finally shift a vote in the third contest back to the first one.
Since 2015, use of barcodes to encode ballot selections in a machine-readable format has spread rapidly throughout American election systems. This has been in large part due to the adoption of ES&S ExpressVote voting machines which utilize Code-128 barcodes. The introduction of the ExpressVote XL, which has the opportunity to print on the paper record after the voter has last seen it, will further that expansion. Awareness of the ways in which these barcodes could be manipulated for malicious purposes is important for making elections more secure.