Table of Contents
 1. Foreword & Outline
 2. Introduction to CRC
 2.1 CRC verification
 3. Concept of the CRC shift register
 4. Implementing CRC8 algorithms
 5. Extending to CRC16
 6. Extending to CRC32
 7. CRC algorithm specification
 8. Additional remarks (points worth to know)
 8.1 Basic mathematical view of CRC
 8.2 Background to CRC verification
 8.3 CRC1 is the same as a parity bit
 8.4 Why is addition is the same as subtraction in CRC arithmetic?
 8.5 Why does multiplication with x^{n} append n zeros?
 8.6 When using an initial value other than zero in the shift register, the result is incorrect.
 9. Conclusion & References
Article updated November 2016 (for details, click here to see the changelog).
1. Foreword & Outline
This article is the result of the fact that I found finally time to deal with CRC.
After reading Wikipedia and some other articles, I had the feeling to not really understand completely in depth.
Therefore I decided to write this article, trying to cover all topics I had difficulties with. And this in exactly the same order I concerned myself with CRC. Please note that this article is not indented to be a full comprehensive CRC guide explaining all details  it should be used as an additional, practical oriented note to all general explanations on the web.
Here's the outline:
 At first, the general idea and functionality of CRC is discussed.
 Subsequently, some examples are calculated by hand to get familiar with the process of CRC calculation.
 Based on those observations, implementations of CRC calculation are presented step by step, from naive ones till more efficient algorithms. Here the emphasis lies on the target to really get the point of the code compared to the manual calculation. Here an exemplary CRC8 polynomial is used.
 Afterwards, the achieved knowledge is expanded to CRC16 and CRC32 calculation, followed by some CRC theory and maybe a FAQ section.
2. Introduction
CRC (Cyclic Redundancy Check) is a checksum algorithm to detect inconsistency of data, e.g. bit errors during data transmission. A checksum, calculated by CRC, is attached to the data to help the receiver to detect such errors. Refer also to [1] for a short or to [4] for a very detailed CRC introduction.
CRC is based on division. The actual input data is interpreted as one long binary bit stream (divident) which is divided by another fixed binary number (divisor). The remainder of this division is the checksum value.
However, reality is a bit more complicated. The binary numbers (divident and divisor) are not treated as normal integer values, but as binary polyonimals where the actual bits are used as coefficients.
For example, the input data 0x25 = 0010 0101 is taken as 0*x^{7} + 0*x^{6} + 1*x^{5} + 0*x^{4} + 0*x^{3} + 1*x^{2} + 0*x^{1} + 1*x^{0}.
XOR  0  1 
0  0  1 
1  1  0 
XOR truth table
Division of polynomials differs from integer division. Without going into detail, the underlying used aritmetic for CRC calculation is based on the XOR (ExclusiveOR) operation (we'll come to an example soon!).
 The divident is the complete input data (interpreted as binary stream).
 The divisor, also called generator polynomial, is statically defined by the used CRC algorithm. CRCn using a fixed defined generator polynom with (n+1) bits.
 The CRC checksum value is defined as divident % divisor.
For manual calculation, n zero bits are appended to the input data before actual CRC calculation (polynomial division) is computed. Let's perform an example CRC computation:
Input data is the byte 0xC2 = b11000010.
As generator polynomial (=divisor), let's use b100011101.
The divisor has 9 bits (therefore this is a CRC8 polynomial), so append 8 zero bits to the input pattern .
Align the leading '1' of the divisor with the first '1' of the divident and perform a stepbystep schoollike division, using XOR operation for each bit:
ABCDEFGHIJKLMNOP 1100001000000000 100011101  010011001 100011101  000101111 100011101 (*)  001100101 100011101  010001001 100011101  000001111 = 0x0F ABCDEFGHIJKLMNOP
The actual CRC value is 0x0F.
Useful observations:
 In each step, the leading '1' of the divisor is always aligned with the first '1' of the divident. This implies that the divisor does not move only 1 bit right per step, but sometimes also several steps (e.g. like in line (*).
 The algorithms stops if the divisor zeroed out each bit of the actual input data (without padding bytes): The input data ranges from column A to H including. In the last step, column H and all prior columns contain 0, so the algorithm stops.
 The remainder (= CRC) is the value 'below' the padding zero bits (column I to P). Because we added n padding bytes, the actual CRC value has also n bits.
 Only the remainder in each step is of interest, the actual division result (quotient) is therefore not tracked at all.
2.1 CRC Verification
The remainder is the CRC value which is transmitted along with the input data to the receiver. The receiver can either verify the received data by computing the CRC and compare the calculated CRC value with the received one. Or, more commonly used, the CRC value is directly appened to the actual data. Then the receiver computes the CRC over the whole data (input with CRC value appended): If the CRC value is 0, then most likely no bit error occured during transmission. Let's do verification according the latter case:
The actual transmission data (input data + CRC) would be b1100001000001111 Note that we have used an 8bit CRC, so the actual CRC value is also 8bit long. The generator polynomial is statically defined by the used CRC algorithm and so it's known by the receiver.
ABCDEFGHIJKLMNOP 1100001000001111 100011101....... ....... 010011001....... 100011101..... ...... 000101111...... 100011101.. ... 001100100... 100011101 . 010001110. 100011101  000000000 > Remainder is 0, data ok! ABCDEFGHIJKLMNOP
3. Concept of the CRC shift register
So we have seen how to calculate the CRC checksum value manually, but how can it be implemented?
The input data stream is generally quite long (more than 1 bit) so it's not possible to perform a simple division like "Input data % generator polynomial". The computation has to be performed stepbystep and here the concept of a shift register comes into play.
A shift register has a fixed width and can shift it's content by one bit, removing the bit at the right or left border and shift in a new bit at the freed position. CRC uses a left shift register: When shifted, the most significant bit pops out the register, the bit at position MSB1 moves one position left to postion MSB, the bit at position MSB2 to MSB1 and so on. The bit position of the least significant bit is free: here the next bit of the input stream is shifted in.
MSB LSB       <    ....   < (shift in input message bits)      
The process of CRC calculation using a shift register is as follow:
 Initialize the register with 0.
 Shift in the input stream bit by bit. If the popped out MSB is a '1', XOR the register value with the generator polynomial.
 If all input bits are handled, the CRC shift register contains the CRC value.
Let's visualize the procedure with the example data from above
1. CRC8 register initialized with 0.          0  0  0  0  0  0  0  0  < b1100001000000000         2. LeftShift register by one position. MSB is 0, so nothing do happen, shift in next byte of input stream.          0  0  0  0  0  0  0  1  < b100001000000000         3. Repeat those steps. All steps are left out until there is a 1 in the MSB (nothing interesting happens), then the state looks like:          1  1  0  0  0  0  1  0  < b00000000         4. LeftShift register. MSB 1 pops out:         1 <  1  0  0  0  0  1  0  0  < b0000000         So XOR the CRC register (with popped out MSB) b110000100 with polynomial b100011101 = b010011001 = 0x99. The MSB is discarded, so the new CRC register value is 010011001:          1  0  0  1  1  0  0  1  < b0000000         5. LeftShift register. MSB 1 pops out: b100110010 ^ b100011101 = b000101111 = 0x2F:          0  0  1  0  1  1  1  1  < b000000         6. Leftshift register until a 1 is in the MSB position:          1  0  1  1  1  1  0  0  < b0000         7. LeftShift register. MSB 1 pops out: b101111000 ^ b100011101 = b001100101 = 0x65:          0  1  1  0  0  1  0  1  < b000         8. Leftshift register until a 1 is in the MSB position:          1  1  0  0  1  0  1  0  < b00         9. LeftShift register. MSB 1 pops out: b110010100 ^ b100011101 = b010001001 = 0x89:          1  0  0  0  1  0  0  1  < b0         10. LeftShift register. MSB 1 pops out: b10001001 ^ b100011101 = b000001111 = 0x0F:          0  0  0  0  1  1  1  1  < <empty>         All input bits are processed, the algorithm stops. The shift register contains now the CRC value which is 0x0F.
4. Implementing CRC8 algorithms
This chapter handles different algorithms and their implementations in C# for calculating CRC8 checksum values. It starts with simple algorithms for limited input data and ends with efficient tablebased implementations.
4.1 Simple CRC8 shift register implementation for one byte input data
Let's start with an implementation of a CRC8 algorithm for solely one byte input data. The implementation will stay very closely to the shift register process from the example above.
A CRC8 algorithm uses actually a 9bit generator polynom, but it would be cumbersome to track such an unaligned value in an algorithm. Fortunately, as described in the previous chapter, the most significant bit can be discarded. First, it is always 1. Second, because the divisor is always aligned in such a manner that this leading '1' alignes with the next '1' of the divident, the XOR result for this bit is always 0.
This means we leave out the MSB '1', so we can use the generator polyonimal b100011101 = 0x1D as the example polynomial from now on.
Let's start with an implementation which is as close as possible to the shift register approach:
{
const byte generator = 0x1D;
byte crc = 0; /* init crc register with 0 */
/* append 8 zero bits to the input byte */
byte[] inputstream = new byte[] { byteVal, 0x00 };
/* handle each bit of input stream by iterating over each bit of each input byte */
foreach (byte b in inputstream)
{
for (int i = 7; i >= 0; i)
{
/* check if MSB is set */
if ((crc & 0x80) != 0)
{ /* MSB set, shift it out of the register */
crc = (byte)(crc << 1);
/* shift in next bit of input stream:
* If it's 1, set LSB of crc to 1.
* If it's 0, set LSB of crc to 0. */
crc = ((byte)(b & (1 << i)) != 0) ? (byte)(crc  0x01) : (byte)(crc & 0xFE);
/* Perform the 'division' by XORing the crc register with the generator polynomial */
crc = (byte)(crc ^ generator);
}
else
{ /* MSB not set, shift it out and shift in next bit of input stream. Same as above, just no division */
crc = (byte)(crc << 1);
crc = ((byte)(b & (1 << i)) != 0) ? (byte)(crc  0x01) : (byte)(crc & 0xFE);
}
}
}
return crc;
}
4.2 Modified CRC8 bitwise implementation for one byte input data
Well, above implementation looks complicated! How can it be simplified?
 The first 8 leftshifts are useless because the CRC value is initialized with 0 so no XOR operation is performed. This means we can initialize the CRC value directly with the input byte.
 So only '0' are left in the input stream (the appended zeros). They do not have to be explcitely shiftedin, as the C# leftshift operator << fills in the LSB with '0' by default.
 This implies that the inputstream array is not required anymore.
Applying these simplifications result in following implementation (much better, isn't it?):
{
const byte generator = 0x1D;
byte crc = byteVal; /* init crc directly with input byte instead of 0, avoid useless 8 bitshifts until input byte is in crc register */
for (int i = 0; i < 8; i++)
{
if ((crc & 0x80) != 0)
{ /* most significant bit set, shift crc register and perform XOR operation, taking notsaved 9th set bit into account */
crc = (byte)((crc << 1) ^ generator);
}
else
{ /* most significant bit not set, go to next bit */
crc <<= 1;
}
}
return crc;
}
To illustrate how the algorithmus is working, the example from above (input byte 0xC2, generator polynomial 0x1D) is repeated  this time showing the intermediate values of each step of implementation Compute_CRC8_Simple_OneByte.
1100001000000000  crc = 0xC2 
100011101  i = 0: 0xC2 = b11000010 > MSB set: 
  
010011001  i = 0: crc = (crc << 1) ^ generator = (0xC2 << 1) ^ 0x1D = 0x184 ^ 0x1D = 0x99. 
100011101  i = 1: 0x99 = b10011001 > MSB set: 
  
000101111  i = 1: crc = (0x99 << 1) ^ 0x1D = 0x132 ^ 0x1D = 0x2F 
100011101  i = 2: 0x2F = b00101111 > MSB not set: crc = (0x2F << 1) = 0x5E 
100011101  i = 3: 0x5E = b01011110 > MSB not set: crc = (0x5E << 1) = 0xBC 
100011101  i = 4: 0xBC = b10111100 > MSB set: 
  
001100101  i = 4: crc = (0xBC << 1) ^ 0x1D = 0x178 ^ 0x1D = 0x65 
100011101  i = 5: 0x65 = b01100101 > MSB not set: crc = (0x65 << 1) = 0xCA 
100011101  i = 6: 0xCA = b11001010 > MSB set: 
  
010001001  i = 6: crc = (0xCA << 1) ^ 0x1D = 0x194 ^ 0x1D = 0x89 
100011101  i = 7: 0x89 = b10001001 > MSB set: 
  
000001111  i = 7: crc = (0x89 << 1) ^ 0x1D = 0x112 ^ 0x1D = 0x0F 
The fact that the crc value is leftshifted by one _before_ it's 'xored' with the divisor should become clear: it's due to the already discussed reason that the MSB bit of the generator polynomial is not stored / not used by the algorithm as well as it's result.
4.3 General CRC8 bitwise implementation
Till now only one byte was used as input data, so let's see what happens if the input data is extended to a byte array.
The first function Compute_CRC8_Simple_OneByte_ShiftReg() could easily be adapted (only the input parameter would be a byte array at which a 0x00 byte is appended), but what about Compute_CRC8_Simple_OneByte()?
The interesting point is at the border between two bytes: If one byte is completely processed, how is the subsequent byte incorporated into the calculation process? Again, let's start with a simple example (even more manual CRCcalculation action!):
000000010000001000000000 100011101  0000111110000 100011101  0111011010 100011101  0110001110 100011101  0100100110 100011101  0001110110 = 0x76
Actually the algorithm handles one byte at a time, and does not consider the next byte until the current one is completely processed.
Refering to Compute_CRC8_Simple_OneByte, the value crc is set to 0x01 and the input looks like 0000000100000000...
So let's see the state when the first byte is completely processed:
000000010000000000000000
100011101

000011101
Compare this to our manual approach where we have the second byte 0x02 already 'in range':
000000010000001000000000
100011101

000011111
So obviously the next byte has to be XORed with the current CRC value: 000011101 ^ 00000010 = 000011111 and the algorithm then continues with the 'new' xored value.
With this knowledge we can easily extend our algorithm to work with an input byte array of arbitrary length:
{
const byte generator = 0x1D;
byte crc = 0; /* start with 0 so first byte can be 'xored' in */
foreach (byte currByte in bytes)
{
crc ^= currByte; /* XORin the next input byte */
for (int i = 0; i < 8; i++)
{
if ((crc & 0x80) != 0)
{
crc = (byte)((crc << 1) ^ generator);
}
else
{
crc <<= 1;
}
}
}
return crc;
}
4.4 Improved CRC8 bytebybyte algorithm (lookup table based)
So far the algorithm is quite inefficient as it works bit by bit. For larger input data, this could be quite slow. But how can our CRC8 algorithm be accelerated?
The divident is the current crc byte value  and a byte can only take 256 different values. The polynomial (= divisor) is fixed. Why not precompute the division for each possible byte by the fixed polynomial and store these result in a lookup table? This is possible as the remainder is always the same for the same divident and divisor! Then the input stream can be processed byte by byte instead of bit by bit.
Let's use our common example to demonstrate the process manually:
1. Init crc = 0x00.
2. 'Xorin' next input byte 0x01: 0x00 ^ 0x01 = 0x01.
3. Calculate CRC8 of 0x01 using precomputed lookup table: table[0x01] = crc = 0x1D.
4. 'Xorin' next input byte 0x02: 0x1D ^ 0x02 = 0x1F.
5. Calculate CRC8 of 0x1F: using precomputed lookup table: table[0x1F] = crc = 0x76.
For step 3 and 5, the lookup table is used instead of bit by bit processing  that makes the actual speedup.
Here an implementation for calculating the 256element lookup table:
{
const byte generator = 0x1D;
crctable = new byte[256];
/* iterate over all byte values 0  255 */
for (int divident = 0; divident < 256; divident++)
{
byte currByte = (byte)divident;
/* calculate the CRC8 value for current byte */
for (byte bit = 0; bit < 8; bit++)
{
if ((currByte & 0x80) != 0)
{
currByte <<= 1;
currByte ^= generator;
}
else
{
currByte <<= 1;
}
}
/* store CRC value in lookup table */
crctable[divident] = currByte;
}
}
Then improved CRC8 algorithm using the lookup table is as follow:
{
byte crc = 0;
foreach (byte b in bytes)
{
/* XORin next input byte */
byte data = (byte)(b ^ crc);
/* get current CRC value = remainder */
crc = (byte)(crctable[data]);
}
return crc;
}
The improvement of speed comes at the cost of processing time to precalculate the table and of higher memory consumption of the 256byte elements, but that's worth it :)!
5. Extending to CRC16
The more bits the CRC value has the less is the probability of a collision: for CRC8 there are only 256 different CRC values. This means if the data is disturbed or modified between sender and receiver, there is a probability of 1/256 that the modified data stream has the same CRC value as the original data stream, thus the error is not detected. Beside other factors (more on this in the theory part), increasing the CRC width results in better error protection.
This raises the question: What is the impact on the implementation if we want to extend it from CRC8 to CRC16?
 1. Obviously a CRC16 uses a polynomial of degree 17, but similar to CRC8 the most significant bit is implicitely 1. Therefore the generator polynomial as well as the CRC value have now a 16bit data type.
 2. How to 'XorIn' the next input byte (8bit) into the CRC value (16bit)? The answer is into the most signficant byte of the CRC. Similar to chapter 4.3, let's visualize it with an example:
First, let's make another manual calculation, this time for CRC16 with polynomial 0x1021:
ABCDEFGHIJKLMNOPQRSTUVWXYZ 00000001000000100000000000000000 10001000000100001  00001001000100001 = 0x1221 (intermediate CRC value after completing first byte) 10001000000100001  00011001000110001 10001000000100001  01000000110101001 10001000000100001  00001001101110011 = 0x1373 (final CRC value after both bytes) ABCDEFGHIJKLMNOPQRSTUVWXYZ
What happens if the algorithm has handled the first input byte 0x01: 00000001000000000000000000000000 10001000000100001  00001000000100001 Here we see that the next input byte 0x02 = 00000010 has be xored into the MSB of 00001000000100001 to get 00001001000100001 to proceed.
 3. The check if the most significant bit set changes because bit15 instead of bit7 has to be tested: 0x80 > 0x8000
Therefore the CRC16 simple implementation looks like:
{
const ushort generator = 0x1021; /* divisor is 16bit */
ushort crc = 0; /* CRC value is 16bit */
foreach (byte b in bytes)
{
crc ^= (ushort(b << 8); /* move byte into MSB of 16bit CRC */
for (int i = 0; i < 8; i++)
{
if ((crc & 0x8000) != 0) /* test for MSB = bit 15 */
{
crc = (ushort((crc << 1) ^ generator);
}
else
{
crc <<= 1;
}
}
}
return crc;
}
The modification of the implementations for calculating the CRC16 lookup table is now quite easy:
{
const ushort generator = 0x1021;
crctable16 = new ushort[256];
for (int divident = 0; divident < 256; divident++) /* iterate over all possible input byte values 0  255 */
{
ushort curByte = (ushort(divident << 8); /* move divident byte into MSB of 16Bit CRC */
for (byte bit = 0; bit < 8; bit++)
{
if ((curByte & 0x8000) != 0)
{
curByte <<= 1;
curByte ^= generator;
}
else
{
curByte <<= 1;
}
}
crctable16[divident] = curByte;
}
}
The actual byte by byte is a bit tricky so let's first check our example again:
1. Init crc = 0
2. Handle first input byte 0x01:
2.1 'Xorin' first input byte 0x01 into MSB(!) of crc:
0000 0000 0000 0000 (crc)
0000 0001 0000 0000 (input byte 0x01 leftshifted by 8)

0000 0001 0000 0000 = 0x0100
The MSB of this result is our current divident: MSB(0x100) = 0x01.
2.2 So 0x01 is the divident. Get the remainder for divident from our table: crctable16[0x01] = 0x1021. (Well this value is famila from the manual computation above.)
Remember the current crc value is 0x0000. Shift out the MSB of current crc and xor it with the current remainder to get the new CRC:
0001 0000 0010 0001 (0x1021)
0000 0000 0000 0000 (CRC 0x0000 leftshifted by 8 = 0x0000)

0001 0000 0010 0001 = 0x1021 = intermediate crc.
3. Handle next input byte 0x02:
Currently we have intermediate crc = 0x1021 = 0001 0000 0010 0001.
3.1 'Xorin' input byte 0x02 into MSB(!) of crc:
0001 0000 0010 0001 (crc 0x1021)
0000 0010 0000 0000 (input byte 0x02 leftshifted by 8)

0001 0010 0010 0001 = 0x1221
The MSB of this result is our current divident: MSB(0x1221) = 0x12.
3.2 So 0x12 is the divident. Get the remainder for divident from our table: crctable16[0x12] = 0x3273.
Remember the current crc value is 0x1021. Shift out the MSB of current crc and xor it with the current remainder to get the new CRC:
0011 0010 0111 0011 (0x3273)
0010 0001 0000 0000 (CRC 0x1021 leftshifted by 8 = 0x2100)

0001 0011 0111 0011 = 0x1373 = final crc.
The important point is here that after xoring the current byte into the MSB of the intermediate CRC, the MSB is the index into the lookup table.
Here the corresponding implementation for the tablebased CRC16 algorithm:
{
ushort crc = 0;
foreach (byte b in bytes)
{
/* XORin next input byte into MSB of crc, that's our new intermediate divident */
byte pos = (byte)( (crc >> 8) ^ b); /* equal: ((crc ^ (b << 8)) >> 8) */
/* Shift out the MSB used for division per lookuptable and XOR with the remainder */
crc = (ushort)((crc << 8) ^ (ushort)(crctable16[pos]));
}
return crc;
}
6. Extending to CRC32
After having understood the extension process from CRC8 to CRC16 in the previous chapter, the modifications to CRC32 is pretty easy. I discard the detailed steps as they are very similar to the CRC16 case  just a quick overview of the changes:
 CRC32 uses a 33bit polynom, however again the most signficant bit is always '1' and can be discarded. Therefore, the polynomial and the CRC value are represented by 32 bit variables.
 The bit operations change slightly: Moving the input byte into the MSB of CRC requires now a shift by 24. Also the check for the most significant bit uses a different mask of 0x8000000 instead of 0x8000.
Actually that's it, so here is the bitwise CRC32 algorithm implementation:
{
const uint polynomial = 0x04C11DB7; /* divisor is 32bit */
uint crc = 0; /* CRC value is 32bit */
foreach (byte b in bytes)
{
crc ^= (uint)(b << 24); /* move byte into MSB of 32bit CRC */
for (int i = 0; i < 8; i++)
{
if ((crc & 0x80000000) != 0) /* test for MSB = bit 31 */
{
crc = (uint)((crc << 1) ^ polynomial);
}
else
{
crc <<= 1;
}
}
}
return crc;
}
Calculating the CRC32 lookup table and using it for the CRC computation is then straight forward:
{
const uint polynomial = 0x04C11DB7;
crcTable = new uint[256];
for (int divident = 0; divident < 256; divident++) /* iterate over all possible input byte values 0  255 */
{
uint curByte = (uint)(divident << 24); /* move divident byte into MSB of 32Bit CRC */
for (byte bit = 0; bit < 8; bit++)
{
if ((curByte & 0x80000000) != 0)
{
curByte <<= 1;
curByte ^= polynomial;
}
else
{
curByte <<= 1;
}
}
crcTable[divident] = curByte;
}
}
{
uint crc = 0;
foreach (byte b in bytes)
{
/* XORin next input byte into MSB of crc and get this MSB, that's our new intermediate divident */
byte pos = (byte)((crc ^ (b << 24)) >> 24);
/* Shift out the MSB used for division per lookuptable and XOR with the remainder */
crc = (uint)((crc << 8) ^ (uint)(crcTable[pos]));
}
return crc;
}
7. CRC algorithm specification
At this point, you (and me hopefully... ;) ) should know how CRC computation works and how to calculate it manually as well as how to implement it in your favourite programming language.
However, a CRC instance is defined by specific parameters. We have already seen that e.g. the width of the generator polynomial varies, but there are more definition parameters. In order to be able to implement each CRC instance, let's discuss how an CRC algorithm instance is described.
7.1 CRC parametrization
Following standard parameters are used to define a CRC algorithm instance:
 Name: Well, a CRC instance has to be identified somehow, so each public defined CRC parameter set has a name like e.g. CRC16/CCITT.
 Width (in bits): Defines the width of the result CRC value (n bits). Simultaneously, also the width of the generator polynomial is defined (n+1 bits). Most common used widths are 8, 16 and 32 bit. But thereotically all widths beginning from 1 are possible. In practice, even quite big (80 bit) or uneven (5 bit or 31 bit) widths are used.
 Polynomial: Used generator polynomial value. Note that different respresentations exist, see chapter 7.2.
 Initial Value: The value used to initialize the CRC value / register. In the examples above, always zero is used, but it could be any value.

Input reflected: If this value is TRUE, each input byte is reflected before being used in the calculation. Reflected means that the bits of the input byte are used in reverse order. So this also means that bit 0 is treated as the most significant bit and bit 7 as least significant.
Example with byte 0x82 = b10000010: Reflected(0x82) = Reflected(b10000010) = b01000001 = 0x41.  Result reflected: If this value is TRUE, the final CRC value is reflected before being returned. The reflection is done over the whole CRC value, so e.g. a CRC32 value is reflected over all 32 bits.
 Final XOR value: The Final XOR value is xored to the final CRC value before being returned. This is done after the 'Result reflected' step. Obviously a Final XOR value of 0 has no impact.
 Check value [Optional]: This value is not required but often specified to help to validate the implementation. This is the CRC value of input string "123456789" or as byte array: [0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39].
For a great overview over standard CRC algorithms, refer to [5].
Here a pseudocode CRC32 tablebased implementation taking the definition parameters into account:
{
uint crc = crcModel.Initial; /* CRC is set to specified initial value */
foreach (byte b in bytes)
{
/* reflect input byte if specified, otherwise input byte is taken as it is */
byte curByte = (crcModel.InputReflected ? CrcUtil.Reflect8(b) : b);
/* XORin next input byte into MSB of crc and get this MSB, that's our new intermediate divident */
byte pos = (byte)((crc ^ (curByte << 24)) >> 24);
/* Shift out the MSB used for division per lookuptable and XOR with the remainder */
crc = (uint)((crc << 8) ^ (uint)(crcTable[pos]));
}
/* reflect result crc if specified, otherwise calculated crc value is taken as it is */
crc = (crcModel.ResultReflected ? CrcUtil.Reflect32(crc) : crc);
/* Xor the crc value with specified final XOR value before returning */
return (uint)(crc ^ crcModel.FinalXor);
}
And finally a possible implementation to reflect a 16bit value:
{
ushort resVal = 0;
for (int i = 0; i < 16; i++)
{
if ((val & (1 << i)) != 0)
{
resVal = (ushort)(1 << (15  i));
}
}
return resVal;
}
7.2 Representation of generator polynomials
It is worth to know that there are different ways to represent a generator polynomial in hexadecimal.
Remember: A CRCn uses a generator polynomial of degree x+1 of the form x^{n} + x^{n1} + ... + x^{1} + x^{0} (Note that it has n+1 coefficients).
 Normal representation: The most significant bit (= x^{n}) of the generator polynomial is left out in the hexadecimal respresentation (as it's always 1). The hexadecial polynomial contains only the coefficients x^{n1} ... x^{0}. We used normal respresentation in this article.
Example: In the CRC8 discussion we used polynomial 100011101. Discarding the most significant bit results in100011101, which is 0x1D.  Reversed representation: The most significant bit (= x^{n}) of the generator polynomial is left out in the hexadecimal respresentation (like in the normal representation), but the tail is then reflected ('LSB first'), i.e. each nibble is reversed. So the mostsignificant bit does not match x^{n1} as in the normal representation, but x^{0}.
Example: Polynomial is 100011101. Discarding the most significant bit results in10001 1101. Reflection (reverse of each nibble) of 0001 1101 is 1011 1000 = 0xB8.  Koopman representation: The least(!) significant bit (= x^{0}) of the generator polynomial is discarded. The hexadecial polynomial contains only the coefficients x^{n} ... x^{1}.
Example: Polynomial is 100011101. Discarding the least significant bit results in 100011101= 0x8E.
Note that there are also socalled reciprocal polynomials which have their own representation, see chapter 7.3.
Please note that this only affects the representation of the polynomial. But the key point is that in each case the SAME polynomial is used for calculation.
Consider the often used CRC16 polynomial x^16 + x^12 + x^5 + 1. In binary, that's 1 0001 0000 0010 0001.
Here the three different representations for this example:
1 0001 0000 0010 0001 0001 0000 0010 0001 = 0x1021  Normal Representation 0001 0000 0010 0001 > reversed : 1000 0100 0000 1000 = 0x8408  Reversed representation 1 0001 0000 0010 000 > rearranged: 1000 1000 0001 0000 = 0x8810  Koopman representation
7.2.1 Choosing a generator polyonomial
The length on the generator polynomial depends on the maximum length of the input data and the desired error detection properties. The more likely one or more bit errors shall be detected and/or the longer the input data may be, the longer the generator polynomial has to be. This also decreases the probality of collisions (same CRC value for different input data).
Note that the actual value of the generator polynomial has also major impact on its error detection capabilities and its design is nontrivial and requires serious math knowledge (e.g. see [2] for an expert article.)
7.3 Reciprocal polynomials
Reciprocal polyonomials are polynomials that are reflected. So the least significant coefficient becomes the most significant and the other way. In other words, a reciprocal polynomial is created from a polynomial by assigning coefficient x^{n} to x^{0}, x^{n1} to x^{1} and so on.
The representation is based on the Koopman representation, but reflected ('LSB first').
CAUTION: Sometimes reciprocal polynomials are called reversed polynomials. This can easily be mixed up with the reversed representation of polynomal as described in chapter 7.2. However these are two different things.
Example: Keep with the CRC16 polynomial 0x1021 which has binary respresentation 1 0001 0000 0010 0001.
The Koopman representation is 1000 1000 0001 0000 1.
Reversing (or reflecting) the polynomial results in 0000 1000 0001 0001 = 0x0811.
It's a fact that reversed polynomials are 'as good as' the polynomials of which they are the reciprocal ones referring to their error detection properties.
Unfortunately such polynomials make it all a bit more complicated: the calculated CRC value of a polynomial is NOT the same as the calculated CRC value of its reciprocal polyonomial for the same message data!
Because the LSB and MSB are exchanged, you can think of the processing of reversed polynomials as they would be shifted from the other side into the CRC shift register: In the example of the CRC shift register in chapter 3 the input data was shifted from the right. If the reversed polynomial is used, you could get the same CRC result when shifting the input data from the left. Actually the whole processing is then just mirrored.
In the C# download package linked at the top, there is a 'reflected' implementation for each CRC class: Such a reflected CRC algorithm produces the same CRC result when using a reversed polynomial like the standard implementation using the nonreversed polynomial by 'mirroring' the processing.
7.3.1 Reversed CRC lookup table and calculation of reciprocal CRC
The fact that there are the two variants MSBtoLSB and LSBtoMSB often causes confusion. For example this is the reason why often two different lookup tables are found in the net for the same CRC instance: For the wellknown standard CRC32 instance (e.g. PKZIP) with polynomial 0x04c11db7, you can find a lookup table starting with the values 0x00000000 and 0x04C11DB7, the other one with 0x00000000 and 0x77073096. The first one correspond to the algorithms described in this article, while the second corresponds to the reciprocal variant where the coefficients are reflected.
Let's recall the calculation for the 'normal' CRC32 lookup table, compared to the 'reciprocal variant' of the CRC32 lookup table :
{
crcTable = new uint[256];
for (int divident = 0; divident < 256; divident++)
{
uint curByte = (uint)(divident << 24);
for (byte bit = 0; bit < 8; bit++)
{
if ((curByte & 0x80000000) != 0)
{
curByte <<= 1;
curByte ^= polynomial;
}
else
{
curByte <<= 1;
}
}
crcTable[divident] = curByte;
}
}
{
crcTable = new uint[256];
for (int divident = 0; divident < 256; divident++)
{
uint curByte = (uint)(divident);
for (byte bit = 0; bit < 8; bit++)
{
if ((curByte & 0x00000001) != 0)
{
curByte >>= 1;
curByte ^= Reflect32(polynomial);
}
else
{
curByte >>= 1;
}
}
crcTable[divident] = curByte;
}
}
To calculate the reciprocal variant lookuptable, two changes are required to handle the reflected bitorder: At first, the CRC parameter Polynomial has to be reflected, and second the order of bit processing in the algorithm itself needs to be changed from MSBtoLSB to LSBtoMSB, resulting in rightshifting instead of leftshifting.
So this results in two different lookup tables. This implies that also the actual CRC calculation needs to adapated for the reciprocal variant in order to retrieve the same CRC result as for the normal variant.
In general, to use the reciprocal variant of a CRCmodel, following changes are required for the CRC model: The CRC parameters Polynomial, Initial Value and Final XOR value have to be reflected, and the parameters Input reflected and Result reflected have to be negated. Here some code to change the CRC model to the reciprocal CRC model:
{
return new Crc32Model(
CrcUtil.Reflect32(model.Polynomial),
CrcUtil.Reflect32(model.Initial),
CrcUtil.Reflect32(model.FinalXor),
!model.InputReflected,
!model.ResultReflected
);
}
With this information, here the tablebased calculation of reciprocal CRC32 (right) compared to the normal calculation (left):
{
uint crc = crcModel.Initial;
foreach (byte b in bytes)
{
byte curByte = (crcModel.InputReflected ? CrcUtil.Reflect8(b) : b);
/* update the MSB of crc value with next input byte */
crc = (uint)(crc ^ (curByte << 24));
/* this MSB byte value is the index into the lookup table */
byte pos = (byte)(crc >> 24);
/* shift out this index */
crc = (uint)(crc << 8);
/* XORin remainder from lookup table using the calculated index */
crc = (uint)(crc ^ (uint)crcTable[pos]);
}
crc = (crcModel.ResultReflected ? CrcUtil.Reflect32(crc) : crc);
return (uint)(crc ^ crcModel.FinalXor);
}
{
crcModel = GetReflectedCrcModel(crcModel);
uint crc = crcModel.Initial;
foreach (byte b in bytes)
{
byte curByte = (crcModel.InputReflected ? CrcUtil.Reflect8(b) : b);
/* update the LSB of crc value with next input byte */
crc = (uint)(crc ^ curByte);
/* this byte value is the index into the lookup table, make sure it's a byte */
byte pos = (byte)(crc & 0xFF);
/* shift out this index */
crc = (uint)(crc >> 8);
/* XORin remainder from lookup table using the calculated index */
crc = (uint)(crc ^ (uint)crcTable[pos]);
}
crc = (crcModel.ResultReflected ? CrcUtil.Reflect32(crc) : crc);
return (uint)(crc ^ crcModel.FinalXor);
}
8. Additional remarks (points worth to know)
This last chapter contains interesting (and maybe not completely obvious) topics about CRC calculation. These points are optional and contain just additional information, for those who are interested in same background information.
8.1 Basic mathematical view of CRC (read it first)
First let's recall some mathematical basics for CRC definition for better understanding of the next points. Please note that the sub chapter actually represents the same information as chapter 2 (don't hesitate to have a quick look back at this introduction chapter):
CRCn uses a generator polynomial G(x) which has degree n and n+1 terms: x^{n} + x^{n1} + ... + x^{1} + x^{0}.
The n+1 terms means that the polynomial has a length of n + 1 bits (using normal representation the most significant bit is left out).
Example: A CRC8 with polynomial 0x07 is actually the value 100000111 = 1*x^{8} + 0*x^{7} + 0*x^{6} + 0*x^{5} + 0*x^{4} + 0*x^{3} + 1*x^{2} + 1*x^{1} + 1*x^{0}.
The computation of CRC, which we know is based on polynomial division (or more specific on Polynomial arithmetic modulo 2), can be stated as M(x) * x^{n} = G(x) * Q(x) + R(x) where:
 M(x) is the input binary string, so M(x)*x^{n} is the input string with n zero bits appended. (Stop! Why is it like that? Answer: We'll come to this later in chapter 8.5 ).
 G(x) is the generator polynomial with degree n as already stated.
 Q(x) is quotient of the division and not used further.
 R(x) is the remainder = the actual CRC value.
8.2 Background to CRC verification
Or: "Why is the verification result zero if the CRC is computed over the input data with the actual CRC value appended (in the manual approach)?"
Actually, the XOR arithmetic of CRC division is comparable to school arithmetic in this case, so let's start with an example.
Assume the input data M(x) * x^{n} is the number 195 and the divisor G(x) is 29. We would calculate:
195 / 29 = 6 remainder 21^{(side note 1)}. This is the same as 195 = 29 * 6 + 21 (which corresponds to M(x) * x^n = G(x) * Q(x) + R(x)). So the CRC value R(x) is 21.
First verification approch (CRC is transmitted separately to the input data):
The sender sends the input data 195 along with the CRC value 21. The receiver then would take the input data and perform the same calculation (remember the generator polynomial is fixed and statically known by transmitter and receiver): 195 % 29 = 21. The calculated remainder is the same as the received one, so the data transmission was faultless.
Second verification approach (CRC is appended to the input data):
The CRC value is appended to the input data which corresponds in school arithemtic to subtraction. So the CRC value 21 is subtracted from input data 195 resulting in 174 ( = M(x) * x^n  R(x)). The receiver would perform 174 % 29 = 0, so the remainder is zero and the data transmission was faultless.
We know that M(x) * x^{n} is the input data bit string with n zero bits appended. As we see later ( in chapters 8.4 and 8.5) appending the bits of R(x) to this input string can be stated as M(x) * x^n  R(x).
Rearrange the formula: M(x) * x^n = G(x) * Q(x) + R(x) ===> M(x) * x^n  R(x) = G(x) * Q(x)
Here we see that M(x) * x^n  R(x) (which is the input data with the CRC appended) is an integer multiple of the polynomial G(x). And this means that M(x) * x^n  R(x) divided by G(x) results in 0.
Side note 1: M(x) * x^n = G(x) * Q(x) + R(x) can also be written using the modulo operator mod as R(x) = (M(x) * x^n) mod G(x).
8.3 CRC1 is the same as a parity bit
This is correct. Here some clarifying words: A parity bit indicates if the number of bits with value 1 is even or odd. In the case of even parity, the number of bits whose value is 1 in a given set are counted. If that total is odd, the parity bit value is set to 1, making the total count of 1's in the set an even number.
Example: The value 0x34 = 0011 0100 has three '1' bits. Because three is odd, the parity bit is 1.
CRC1 has degree 1 and 2 terms: a*x^1 + b*x^0. The most significant bit is always 1. However a polynomial 10 has no sense because the actual CRC value would always be 0. So CRC1 has polyomial 11 (binary) which is just 0x01 in normal representation. And this is in fact the calculation of an even parity bit. Remember that the actual CRC value has n bits, so for CRC1 the remainder has 1 bit, either 0 or 1.
Here an example for the value 0x34:
001101000 11  0001000 11  0100 11  010 11  01 = CRC = 1
8.4 Why is addition is the same as subtraction in CRC arithmetic?
CRC computation is performed using socalled polynomial arithmetic. This polynomial arithmetic is based on division over the finite field with two elements: 0 and 1. Also called Galois field over two elements.
To define the addition operation, there are only four cases to distinct:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0  note that there is no carry!
Due to the fact that we perform the calculation over a finite field (refer to [6] for more info), there is only one way to define subtraction:
0  0 = 0
0  1 = 1
1  0 = 1
1  1 = 0
We see that addition and subtraction are the same: Basically this is the XOR operation we have already used several times above.
That's the reason why we stated above that M(x) * x^n  R(x) and M(x) * x^n + R(x) are the same in CRC arithmetic.
8.5 Why does multiplication with x^{n} append n zeros?
This is due to the fact that each trailing zero bit of adds a factor of x. Let's say we have the polynomial x^{n} + x^{n1} + ... + x^{1} + x^{0}. Multiply it with x:
x* (x^{n} + x^{n1} + ... + x^{1} + x^{0})
= x^{n+1} + x^{n} + ... + x^{1}
So by multiplying with x, we have increased the degree of the polynomial by 1, which equals appending a zero to the right. Maybe this gets clearer with an example:
Assume the polynomial 01010010 = x^6 + x^4 + x and add a zero:
010100100 = x^7 + x^5 + x^2 = x * (x^6 + x^4 + x). So each factor of x appends a zero.
8.6 When using an initial value other than zero in the shift register, the result is incorrect.
In chapter 7 initial values were introduced, with the description that it's the value used to initialize the shift register. Well, you can read this statement also in other CRC articles, however there is a trap that is commonly not well described (if at all).
Example: If that would be true than e.g. following two CRC instances should calculate the same CRC:
Instance a: Initial value = 0x00, polynomial = 0x9B, input message data = [0xFF, 0x01].
Instance b: Initial value = 0xFF, polynomial = 0x9B, input message data = [0x01].
Assumption: Because in CRC instance a), the initial value is zero, nothing will happen (meaning no xor operation will be computed), until the first data byte 0xFF is completely shifted into the register, because 0x00 has no bit set. Then we have actually the same state as CRC instance b), so the processing of the remaining byte 0x01 should result the in the same CRC value for both cases. Right?
Unfortunately, not. Above examples give the two different CRC values of 0x2A and 0xE0.
Looking at the simple (nonlookup table, but bytewise handling) CRC8 implementation, we see the reason:
{
byte crc = crcModel.Initial;
foreach (byte b in bytes)
{
byte curByte = (crcModel.InputReflected ? CrcUtil.Reflect8(b) : b);
crc ^= curByte; /* XORin the next input byte */
for (int i = 0; i < 8; i++)
{
if ((crc & 0x80) != 0)
{
crc = (byte)((crc << 1) ^ crcModel.Polynomial);
}
else
{
crc <<= 1;
}
}
}
crc = (crcModel.ResultReflected ? CrcUtil.Reflect8(crc) : crc);
return (byte)(crc ^ crcModel.FinalXor);
}
> Before the actual first xor operation of the CRC calculation is performed, the initial value is xored with the first input byte!
What does that mean for the initial value of the shift register when performing the CRC calculation bit by bit, like the pen and paper approach at the top? Answer: The register must be initialized with the initial value 'xored' with the first input bytes. For CRC16 the initial value must be xored with the first two input bytes, for CRC32 with the first four input bytes, and so on.
Here an example implementation how to initialize a CRC32 shift register and the corresponding shift register algorithm. This is taken from the C# source package linked at the top of the article:
{
byte b1 = 0, b2 = 0, b3 = 0, b4 = 0;
if (bytes.Length >= 1)
{
b1 = crcModel.InputReflected ? CrcUtil.Reflect8(bytes[0]) : bytes[0];
}
if (bytes.Length >= 2)
{
b2 = crcModel.InputReflected ? CrcUtil.Reflect8(bytes[1]) : bytes[1];
}
if (bytes.Length >= 3)
{
b3 = crcModel.InputReflected ? CrcUtil.Reflect8(bytes[2]) : bytes[2];
}
if (bytes.Length >= 4)
{
b4 = crcModel.InputReflected ? CrcUtil.Reflect8(bytes[3]) : bytes[3];
}
return (uint)(crcModel.Initial ^ (uint)((uint)b4 << 24  (uint)b3 << 16  (uint)b2 << 8  b1));
}
public uint Compute_Simple_ShiftReg(byte[] bytes)
{
uint crc = GetInitialShiftRegister(bytes);
/* skip first four bytes, already inside crc register */
for (int byteIndex = 4; byteIndex < bytes.Length + 4; byteIndex++)
{
byte curByte = (byteIndex < bytes.Length) ? (crcModel.InputReflected ? CrcUtil.Reflect8(bytes[byteIndex]) : bytes[byteIndex]) : (byte)0;
for (int i = 0; i <= 7; i++)
{
if ((crc & LOWBIT_MASK) != 0)
{
crc = (uint)(crc >> 1);
crc = ((uint)(curByte & (1 << i)) != 0) ? (uint)(crc  0x80000000) : (uint)(crc & 0x7FFFFFFF);
crc = (uint)(crc ^ crcModel.Polynomial);
}
else
{
crc = (uint)(crc >> 1);
crc = ((uint)(curByte & (1 << i)) != 0) ? (uint)(crc  0x80000000) : (uint)(crc & 0x7FFFFFFF);
}
}
}
crc = (crcModel.ResultReflected ? CrcUtil.Reflect32(crc) : crc);
return (uint)(crc ^ crcModel.FinalXor);
}
9. Conclusion & References
This article is my own guide to CRC: very practical and based on many examples  quite different to most of the other articles on the web. I hope you found it interesting and that it supported you in understanding and implementing CRC!
Drop me a line for corrections, hints, criticism, praise ;)
Sunshine, February 2015 (updated September 2018)
References
[1] Cyclic redundancy check @ Wikipedia[2] "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". The International Conference on Dependable Systems and Networks: 145–154. Retrieved 29 December 2014.
[3] Mathematics of cyclic redundancy checks @ Wikipedia
[4] A Painless Guide to CRC Error Detection Algorithms
[5] CRC RevEng  Catalogue of parametrised CRC algorithms
[6] Finite field @ Wikipedia
Updates
 2018/09/29:
 Fixed a glitch in the "Stepbystep visualization of simple CRC8 algorithmus" table. Thanks to the careful reader for pointing me to it.
 Reworked the chapters of representation of polynomials and reciprocal polynomials  hopefully they are more understandable now (chapter 7.2 and 7.3).
 Fixed a typo in chapter 8.6 (byte 0x01 instead of 0x11). Thanks to the careful reader for pointing me to it.
 2016/11/11:
 Added chapters 7.3 (reversed polynomials) and chapter 8.6 (initial value for shift register)
 Updated C# source package with CRC classes that work in the reversed way.
 Updated Javascript Online calculator to show optionally the CRC lookup table in the reversed way.
 2016/08/19: Fixed a typo in chapter 8.4: 1 + 1 = 0 instead of 1 + 1 = 1.
 2015/05/30: Added chapter 8.