Tutorial: Fastest method for checking overflow? [duplicate]



Question:

Here's my attempt. Any tips on a better solution?

    // for loop to convert 32 to 16 bits
    uint32_t i;
    int32_t * samps32 = (int32_t *)&(inIQbuffer[0]);
    int16_t * samps16 = (int16_t *)&(outIQbuffer[0]);
    for( i = 0; i < ( num_samples * 2/* because each sample is two int32 s*/ ); i++ ) {
        overflowCount += ( abs(samps32[i]) & 0xFFFF8000 ) ? 1 : 0;
        samps16[i] = (int16_t)samps32[i];
    }

    // Only report error every 4096 accumulated overflows
    if( ( overflowCount & 0x1FFF ) > 4096 ) {
        printf( "ERROR: Overflow has occurred while scaling from 32 "
                "bit to 16 bit samples %d times",
                overflowCount );
    }

Here's the part that actually checks for overflow:

overflowCount += ( abs(samps32[i]) & 0xFFFF8000 ) ? 1 : 0;   


Solution:1

It seems that you are checking for the overflow of a 16-bit addition. You can avoid a branch in the generated assembly by writing:

overflowCount += (samps32[i] & 0x8000) >> 15;  

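This generates three ALU operations (mask, shift, add) but no branch in the code. It may or may not be faster than a branching version. Note that it inspects only bit 15, so it suits the 16-bit addition case rather than a general range check on a 32-bit value.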
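For the 32-to-16-bit narrowing shown in the question, a branch-free test can also cover negative values. The following is a minimal sketch, not the answerer's code; the name overflows_int16 is hypothetical. It adds a bias of 0x8000 in unsigned arithmetic, so every value that fits in int16_t lands in [0, 0xFFFF] and everything else lands above it.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if x does not fit in int16_t, else 0.
 * The addition is done in unsigned arithmetic, so it wraps instead of
 * overflowing, and no abs() call is needed. */
static inline uint32_t overflows_int16(int32_t x)
{
    return (uint32_t)((uint32_t)x + 0x8000u) > 0xFFFFu;
}
```

In the loop this would be used as overflowCount += overflows_int16(samps32[i]);. Whether the compiler emits a branch for the comparison is up to the compiler, so it is worth checking the generated assembly.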


Solution:2

I personally prefer to use the SafeInt class to do my overflow checking. It reduces the need for tedious error checking and turns an overflow into an exception that is easy to process, yet difficult to ignore.

http://blogs.msdn.com/david_leblanc/archive/2008/09/30/safeint-3-on-codeplex.aspx


Solution:3

What you already do is close to the fastest possible for a single cast. You can, however, omit some code:

overflowCount += ( abs(samps32[i]) & 0xFFFF8000 ) ? 1 : 0;

can be changed into:

if (samps32[i] & 0xFFFF8000) overflowCount++;

or, even simpler:

if (samps32[i] >> 15) overflowCount++;

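Both of these will be equally fast, and both will be faster than yours. Note that dropping abs() changes the result for negative samples: every negative value has its upper bits set, so both tests count it as an overflow. They match your version only when the samples are non-negative.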

If you are actually interested in the count of overflows, you might consider processing the array of integers with SIMD operations.
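As a rough illustration of the SIMD idea, here is a sketch that counts out-of-range samples four at a time with SSE2 intrinsics and falls back to a scalar loop elsewhere. It is an assumption-laden example rather than a tuned implementation: the function name count_int16_overflows is hypothetical, it only counts (it does not write the 16-bit output), and it uses a bias-and-shift range test instead of the masks shown above.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: counts the elements of src[0..n) that do not
 * fit in int16_t. */
size_t count_int16_overflows(const int32_t *src, size_t n)
{
    size_t count = 0;
    size_t i = 0;

#if defined(__SSE2__)
    const __m128i bias = _mm_set1_epi32(0x8000);
    const __m128i zero = _mm_setzero_si128();
    __m128i acc = zero; /* per-lane overflow counters */

    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        /* In-range values land in [0, 0xFFFF] after the bias, so
         * their upper 16 bits are zero. */
        __m128i hi = _mm_srli_epi32(_mm_add_epi32(v, bias), 16);
        /* hi is never negative, so hi > 0 gives an all-ones lane
         * (-1) for each overflowing sample. */
        __m128i over = _mm_cmpgt_epi32(hi, zero);
        /* Subtracting -1 increments that lane's counter. */
        acc = _mm_sub_epi32(acc, over);
    }

    uint32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    count = (size_t)lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Scalar tail, or the whole array when SSE2 is unavailable. */
    for (; i < n; i++) {
        count += ((uint32_t)src[i] + 0x8000u) > 0xFFFFu;
    }
    return count;
}
```

A call such as count_int16_overflows(samps32, num_samples * 2) would return the count for the whole buffer. Whether this beats the compiler's own auto-vectorized scalar loop depends on the target and the flags, so it should be measured.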


Solution:4

Bit ops would be my choice, too. The only faster way I can imagine at the moment is to use inline assembly, where you load the source operand, make a copy on the chip, truncate, and bitwise compare (that was pseudo-pseudocode).

Your code has an issue: It violates aliasing rules. You could use something like this instead:

    union conv_t {
        int32_t i32;
        int16_t i16;
    };

Then you could ensure that IQBuffer is of that type. Finally, you could run:

    for( i = 0; i < (num_samples * 2); i++ ) {
        <test goes here>
        samps[i].i16 = static_cast<int16_t>(samps[i].i32);
    }
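To make the union idea concrete, here is a minimal C sketch of that loop with a range test filled in. It assumes the buffer is declared as an array of union conv_t. The function name narrow_in_place and the choice of test are illustrative, not part of the answer, and a C cast stands in for the static_cast above.

```c
#include <stddef.h>
#include <stdint.h>

union conv_t {
    int32_t i32;
    int16_t i16;
};

/* Hypothetical helper: narrows each element in place and returns how
 * many values did not fit in int16_t. */
size_t narrow_in_place(union conv_t *samps, size_t n)
{
    size_t overflowCount = 0;

    for (size_t i = 0; i < n; i++) {
        int32_t v = samps[i].i32; /* read the wide value first */

        overflowCount += (v < INT16_MIN || v > INT16_MAX);

        /* For out-of-range v the result of this cast is
         * implementation-defined in C, as in the original loop. */
        samps[i].i16 = (int16_t)v;
    }
    return overflowCount;
}
```

One consequence of this layout is that each element still occupies sizeof(union conv_t) bytes, so the 16-bit results are not packed contiguously as they are in the question's separate output buffer.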

Edit: With your edit (https://stackoverflow.com/revisions/677427/list) you invalidated nearly my whole post. Thanks for not mentioning your edit in your question.

