Bit error counter - how to make it faster

gamer · Jun 27, 2007

My goal is to implement a bit-error counter targeting 1GHz. The
datawidth is parametrizable. I started off this way,

Verilog code:
----------
assign mismatch[datawidth-1:0] = input_data ^ expected_data;
assign matched = ~( | mismatch); // reduction NOR

always @(posedge clk or posedge reset) begin
if (reset)
error_count = 0;
else if (~matched)
for (i=0; i<datawidth; i=i+1)
error_count = error_count + mismatch;
end
---------/////////----------------------
The above meets timing for small datawidths (like 8-bits) and starts
failing to meet timing once the datawidth gets larger. I will be glad
for suggestions of alternate ways to implement this bit-error counter.
In practice our datawidths could go upto 256 bits.

Thanks

Günther Jehle · Jun 27, 2007

The above meets timing for small datawidths (like 8-bits) and starts

failing to meet timing once the datawidth gets larger. I will be glad
for suggestions of alternate ways to implement this bit-error counter.
In practice our datawidths could go upto 256 bits.
Click to expand...

Spread your algorithm to work over more than one clock cycle. Use a pipeline
stage over eigth clocks. The first stage calculates bit 0 to 7, the second
calculates 8 to 15 and so on. In an additional stage you sum up all results.
The result is always 9 clocks behind.

See http://en.wikipedia.org/wiki/Pipeline_(computer) for a general
explanation of pipelines.

hth
Günther Jehle

Paul A. Clayton · Jun 27, 2007

My goal is to implement a bit-error counter targeting 1GHz. The
datawidth is parametrizable. I started off this way,

Verilog code:
----------
assign mismatch[datawidth-1:0] = input_data ^ expected_data;
assign matched = ~( | mismatch); // reduction NOR

always @(posedge clk or posedge reset) begin
if (reset)
error_count = 0;
else if (~matched)
for (i=0; i<datawidth; i=i+1)
error_count = error_count + mismatch;
end
---------/////////----------------------
The above meets timing for small datawidths (like 8-bits) and starts
failing to meet timing once the datawidth gets larger. I will be glad
for suggestions of alternate ways to implement this bit-error counter.
In practice our datawidths could go upto 256 bits.
Click to expand...

I know no verilog and am not even a C programmer, but ISTM
that the summation could be highly parallel by using a tree
of adders (perhaps using carry save to further accelerate
the summation). Consider it like computing the population
count followed by an ordinary addition: one can add together
the even and odd mismatch bits in parallel, then add pairs
of these sums, etc., finally adding the sum to error_count.

(BTW, if I understand correctly one should not need the
"reduction NOR" and the "if (~matched)" since if the
popcount is zero, adding it to error_count would effectively
be a no-op, assuming worst-case speed is the only
consideration. As Gunther Jehle pointed out pipelining
could provide higher throughput.)

Paul A. Clayton
just a technophile

comp.arch.fpga · Jun 28, 2007

A few layers of full adders to reduce the 256 input lines into a more
reasonable number, then one or more carry-save accumulators to gather
these into something that can be handled very quickly.

Only when reading out the number of errors do you propagate the carries
to generate the final/real counts.
Click to expand...

In Virtex-5 the 6-LUTs point to a 6-to-3 reduction instead of full
adders.
You can also use BRAMs for an 11 to 4 reduction.
512 bits at 500MHz might be easier than 256 bits at 1GHz.

Kolja Sulimma

Andy Freeman · Jun 28, 2007

My goal is to implement a bit-error counter targeting 1GHz. The
datawidth is parametrizable. I started off this way,
Click to expand...

How is the input data created? If the register values come from bit
set/clear operations, you can keep track of the number of ones with a
separate counter that is updated appropriately when the register is
updated.

For example, suppose that you'd like to know the number of ones in a
shift register. If the input bit and the bit shifted off (and thus
removed from the register) are the same, the separate counter is
unchanged. If the input is 1 and the bit shifted off is 0, the
counter is incremented. If the input is 0 and the bit shifted off is
1, the counter is decremented. When the register as a whole is
cleared, the counter is also cleared.

It's easy enough to extend this to any update scheme where the number
of bits that can change each cycle is small, even if those can be in
arbitrary positions.

YUAN, Nan · Jun 30, 2007

My goal is to implement a bit-error counter targeting 1GHz. The
datawidth is parametrizable. I started off this way,

Verilog code:
Click to expand...

I guess the critical path of timming is summation chain.
There're serveral binary tricks to get number of bit "1" in a word,
take C code of 32-bit word for example, there's only 5 summations
instead of 32.

unsigned int bitcount32(unsigned int x)
{
x = (x & 0x55555555UL) + ((x >> 1) & 0x55555555UL); // 0-2 in 2
bits
x = (x & 0x33333333UL) + ((x >> 2) & 0x33333333UL); // 0-4 in 4
bits
#if 1
// Version 1
x = (x & 0x0f0f0f0fUL) + ((x >> 4) & 0x0f0f0f0fUL); // 0-8 in 8
bits
x = (x & 0x00ff00ffUL) + ((x >> 8) & 0x00ff00ffUL); // 0-16 in 16
bits
x = (x & 0x0000ffffUL) + ((x >> 16) & 0x0000ffffUL); // 0-31 in 32
bits
return x;
#else
// Version 2
x = (x + (x >> 4)) & 0x0f0f0f0fUL; // 0-8 in 4 bits
x += x >> 8; // 0-16 in 8 bits
x += x >> 16; // 0-32 in 8 bits
return x & 0xff;
#endif
}

always @(posedge clk or posedge reset) begin
if (reset)
error_count = 0;
else if (~matched)
for (i=0; i<datawidth; i=i+1)
error_count = error_count + mismatch;
end
---------/////////----------------------
The above meets timing for small datawidths (like 8-bits) and starts
failing to meet timing once the datawidth gets larger. I will be glad
for suggestions of alternate ways to implement this bit-error counter.
In practice our datawidths could go upto 256 bits.

Thanks
Click to expand...

Bit error counter - how to make it faster

gamer Guest

Günther Jehle Guest

Paul A. Clayton Guest

comp.arch.fpga Guest

Andy Freeman Guest

YUAN, Nan Guest

Want to reply to this thread or ask your own question?

PKS 5.0: puzzled by 32-bit vs 64-bit performance

How to tell Cadence to use a 32-bit or 64-bit executable?

64 bit linux and 64 bit icfb executable

64-bit vs 32-bit

Are 64-bit Binaries Really Slower than 32-bit Binaries? - The answer?

64 Bit XP-32 Bit SWX-3GB switch

Is 64 bit OS faster?

how to make index counter

Bit error counter - how to make it faster

gamer Guest

Günther Jehle Guest

Paul A. Clayton Guest

comp.arch.fpga Guest

Andy Freeman Guest

YUAN, Nan Guest

Want to reply to this thread or ask your own question?

Useful Searches