Tutorial: Numerical optimization


I was wondering which integer or float types are the fastest.
I was thinking byte is faster than integer because it has a smaller range.
Some people told me that in some cases integer is faster than byte.

Second question:
The GPU is on its way to world domination,
so I asked myself: can a double "be faster" than an integer because of the FPU?
So where are the experts? :)


You have to think about more than the clock cycles needed to carry out arithmetic. You could say that adding two ints takes this many cycles, adding two doubles takes that many cycles, and so on, but that may not be relevant. If all your data fits into cache at the same time, then timing individual operations makes sense. But if not, the extra time lost to cache misses dominates the difference between individual operations. Sometimes working with smaller data types is faster because it makes the difference between finding something in cache or having to fetch it from main memory, or between staying in memory or having to go to disk.

These days computers spend most of their time moving data around, not doing arithmetic, even in number crunching applications. And the ratio of the former to the latter is increasing. You can't simply compare, for example, the time needed to multiply shorts versus doubles. You might find that given two versions of your program, one version runs faster on a small problem and the other version runs faster on a larger problem, all because of the relative efficiency of the different kinds of memory.
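
As a rough illustration of why the element type changes how much data has to be moved, here is a minimal Python sketch that uses the standard array module to compare storage sizes. The element count N and the names footprint_bytes and footprint are made up for this example, and the size of an int depends on the platform.

```python
from array import array

N = 1_000_000  # hypothetical element count, chosen only for illustration

# Type codes from the standard array module: signed char, short, int, double.
TYPE_CODES = {"byte": "b", "short": "h", "int": "i", "double": "d"}


def footprint_bytes(type_code, count):
    """Return the number of bytes needed to store `count` elements."""
    return array(type_code).itemsize * count


footprint = {name: footprint_bytes(code, N) for name, code in TYPE_CODES.items()}

for name, size in footprint.items():
    print(f"{name:>6}: {size / 1024:.0f} KiB")
```

Whether a given working set fits in cache depends on your hardware, so compare these numbers against the cache sizes of the machine you actually run on.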


"I was thinking byte is faster than integer because it has a smaller range."

Something I have experienced: using a short gave me a performance hit, whereas using an int was just fine. This is because shorts often don't exist natively on the architecture; they are convenience types. The processor actually works with its word size, and in my case the word size was that of an int. So when accessing a short, it had to widen the value into an int first, work with it, and then narrow the result back into a short, all of which cost time. So, shorter is not necessarily better.
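
The extra work described above can be sketched in Python, assuming a machine that only does word-sized arithmetic. The helper names wrap_int16 and add_short are hypothetical, and real hardware does this in a few instructions rather than function calls; the point is only that there are extra steps.

```python
def wrap_int16(value):
    """Narrow a word-sized result back to a signed 16-bit short."""
    value &= 0xFFFF                 # keep only the low 16 bits
    if value & 0x8000:              # sign bit set: value is negative
        return value - 0x10000
    return value


def add_short(a, b):
    """Add two shorts the way a word-only machine would (illustrative)."""
    word_result = a + b             # the arithmetic itself happens at word size
    return wrap_int16(word_result)  # extra step that a plain int would not need


print(add_short(100, 200))
print(add_short(32767, 1))          # overflows the 16-bit range and wraps around
```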


It depends on the number of data bits in the architecture. An x87-style floating point unit treats float and double identically when doing calculations: both are evaluated with 80-bit precision and therefore take the same amount of time. Loading and saving the values into the FPU registers might make a difference. A double takes twice the space in RAM and might therefore be slower due to cache misses, which is noticeable if you have large arrays that you tend to index randomly.
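
To make the storage difference concrete, here is a small sketch using Python's standard struct module. It packs the same value as a 4-byte float and as an 8-byte double. The value 0.1 is just an arbitrary example, and this shows size and precision only, not speed.

```python
import struct

value = 0.1  # arbitrary example value

as_float = struct.pack("<f", value)    # IEEE 754 single precision, 4 bytes
as_double = struct.pack("<d", value)   # IEEE 754 double precision, 8 bytes

float_roundtrip = struct.unpack("<f", as_float)[0]
double_roundtrip = struct.unpack("<d", as_double)[0]

print(len(as_float), len(as_double))
print(float_roundtrip == value, double_roundtrip == value)
```

The float takes half the memory, but it does not hold the value as precisely, so the choice is a trade-off between space and accuracy.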


At the CPU level, there are no bytes, only words, which are 32-bit or 64-bit nowadays. Arithmetic units are usually hardwired to deal with word-sized numbers (or larger, in the case of floating point).

So there is no speed advantage in using types smaller than a word for arithmetic operations, and there may be a speed penalty because you have to do additional work to simulate types that the CPU does not have natively. For example, writing a single byte to memory may require you to first read the word it is part of, modify it, and then write it back. To avoid this, compilers will often use a full word of memory for small standalone variables, so even a boolean variable may take up 32 or 64 bits.
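
The read, modify, write sequence mentioned above can be sketched as follows, assuming a 32-bit word and a machine that can only store whole words. The function name write_byte and the byte numbering (index 0 is the lowest byte) are choices made for this illustration.

```python
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit word


def write_byte(word, index, value):
    """Replace byte `index` (0 = lowest) of a 32-bit word with `value`."""
    shift = 8 * index
    mask = 0xFF << shift
    cleared = word & ~mask & WORD_MASK          # read the word and clear the target byte
    return cleared | ((value & 0xFF) << shift)  # merge the new byte, ready to write back


word = 0x11223344
print(hex(write_byte(word, 0, 0xAA)))
print(hex(write_byte(word, 3, 0xAA)))
```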

However, if you have a large amount of data, such as a large array, then using smaller types will usually yield better performance because you'll have fewer cache misses.
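
A back-of-the-envelope way to see the effect on a large array is to count how many cache lines a sequential scan has to touch. This sketch assumes a 64-byte cache line and an array that starts on a line boundary; both are assumptions, so check your own hardware. The function name cache_lines_touched is made up for this example.

```python
CACHE_LINE_BYTES = 64  # assumed line size; check your own hardware


def cache_lines_touched(count, element_bytes, line_bytes=CACHE_LINE_BYTES):
    """Cache lines covered by a contiguous, line-aligned array of `count` elements."""
    total_bytes = count * element_bytes
    return -(-total_bytes // line_bytes)  # ceiling division


n = 1_000_000  # hypothetical array length
lines_for_bytes = cache_lines_touched(n, 1)
lines_for_ints = cache_lines_touched(n, 4)
lines_for_doubles = cache_lines_touched(n, 8)

print(lines_for_bytes, lines_for_ints, lines_for_doubles)
```

With these assumptions, an array of doubles touches eight times as many cache lines as an array of bytes holding the same number of elements.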


The byte length of numeric types depends on the language and sometimes also on the platform you are using. For example, in Java both int and float use 4 bytes, so the processing time should be equal. It would surprise me, though, if longer types were processed faster. If there is evidence of that, I would like to read about it.


As for which one is faster, integer or byte: as long as they both fit into a register they work the same, or at least without measurable difference.

As for integer vs. double: maybe a GPU does arithmetic with doubles faster than a regular CPU does, but I doubt it does double arithmetic faster than integer arithmetic, since integer arithmetic is just register arithmetic.


The biggest optimization is moving from looped scalar calculations to vector calculations. Then you can take advantage of the GPU or the CPU's SSE.


Well, as long as you do not do any vector optimizations you can use integers as large as your registers (32/64 bit) without any actual performance hit.

Floating point numbers are a bit different: while CPUs are optimized for doubles, GPUs usually work with floats.
