Tutorial: Float to binary in C++



Question:

I'm wondering if there is a way to represent a float using a char in C++?

For example:

    int main()
    {
        float test = 4.7567;
        char result = charRepresentation(test);
        return 0;
    }

I read that I can probably do it using bitset, but I'm not sure.

Let's suppose that my float variable is 01001010 01001010 01001010 01001010 in binary.

If I want a char array of 4 elements, the first element will be 01001010, the second: 01001010 and so on.

Can I represent the float variable in a char array of 4 elements?


Solution:1

I suspect what you're trying to say is:

    #include <cstdio>
    #include <cstring>

    int main()
    {
        float test = 4.7567f;
        unsigned char result[sizeof(float)];

        std::memcpy(result, &test, sizeof(test));

        /* now result is storing the float,
           but you can treat it as an array of
           arbitrary chars
           for example: */
        for (std::size_t n = 0; n < sizeof(float); ++n)
            std::printf("%02x ", result[n]);   /* unsigned char avoids sign extension with %x */

        return 0;
    }

Edited to add: all the people pointing out that you can't fit a float into 8 bits are of course correct, but actually the OP is groping towards the understanding that a float, like all atomic datatypes, is ultimately a simple contiguous block of bytes. This is not obvious to all novices.
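Since the question mentions bitset, here is a minimal sketch of how the copied bytes could be shown as binary digits. The names `floatBits` and `floatByteBits` are made up for illustration, and the code assumes a 32-bit IEEE 754 float, which the static_asserts check.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>

// Assumption: float is a 32-bit IEEE 754 value on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE 754");

// Returns the 32 bits of `value` as text, most significant bit first.
std::string floatBits(float value)
{
    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);   // copy the bytes, no pointer casts
    return std::bitset<32>(raw).to_string();
}

// Returns one group of 8 bits from that text; index 0 is the most
// significant byte, index 3 the least significant.
std::string floatByteBits(float value, int index)
{
    assert(index >= 0 && index < 4);
    return floatBits(value).substr(static_cast<std::size_t>(index) * 8, 8);
}
```

Calling `floatByteBits(test, 0)` through `floatByteBits(test, 3)` gives the four 8-bit groups the question asks about, independent of how the machine orders the bytes in memory.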


Solution:2

Using a union is clean and easy:

    union
    {
        float f;
        unsigned int ul;
        unsigned char uc[4];
    } myfloatun;

    myfloatun.f = somenum;
    printf("0x%08X\n", myfloatun.ul);

Much safer from a compiler perspective than pointers. Memcpy works just fine too.

EDIT

Okay, okay, here are fully functional examples. Yes, you have to use unions with care: if you don't keep an eye on how the compiler allocates, pads, and aligns the union, it can break, which is why some (or many) say it is dangerous to use unions this way. Yet the alternatives are considered safe?

After doing some reading: C++ has its own problems with unions, and a union may very well just not work there. If you really meant C++ and not C, then this is probably a bad idea. If you said kleenex and meant tissues, then this might work.

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    typedef union
    {
        float f;
        unsigned char uc[4];
    } FUN;

    /* copy the bytes of f into uc in reversed order */
    void charRepresentation ( unsigned char *uc, float f)
    {
        FUN fun;

        fun.f=f;
        uc[0]=fun.uc[3];
        uc[1]=fun.uc[2];
        uc[2]=fun.uc[1];
        uc[3]=fun.uc[0];
    }

    /* rebuild a float from bytes stored in that reversed order */
    void floatRepresentation ( unsigned char *uc, float *f )
    {
        FUN fun;
        fun.uc[3]=uc[0];
        fun.uc[2]=uc[1];
        fun.uc[1]=uc[2];
        fun.uc[0]=uc[3];
        *f=fun.f;
    }

    int main()
    {
        unsigned int ra;
        float test;
        unsigned char result[4];
        FUN fun;

        if(sizeof(fun)!=4)
        {
            printf("It aint gonna work!\n");
            return(1);
        }

        test = 4.7567F;
        charRepresentation(result,test);
        for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");

        test = 1.0F;
        charRepresentation(result,test);
        for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");

        test = 2.0F;
        charRepresentation(result,test);
        for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");

        test = 3.0F;
        charRepresentation(result,test);
        for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");

        test = 0.0F;
        charRepresentation(result,test);
        for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");

        test = 0.15625F;
        charRepresentation(result,test);
        for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");

        result[0]=0x3E;
        result[1]=0xAA;
        result[2]=0xAA;
        result[3]=0xAB;
        floatRepresentation(result,&test);
        printf("%f\n",test);

        return 0;
    }

And the output looks like this

    gcc fun.c -o fun
    ./fun
    0x40 0x98 0x36 0xE3
    0x3F 0x80 0x00 0x00
    0x40 0x00 0x00 0x00
    0x40 0x40 0x00 0x00
    0x00 0x00 0x00 0x00
    0x3E 0x20 0x00 0x00
    0.333333

You can verify this by hand, or look at the website below. I took the examples directly from it, and the output matches what was expected.

http://en.wikipedia.org/wiki/Single_precision
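Verifying by hand means splitting the 32 bits into the sign, exponent, and fraction fields. A minimal sketch of that split follows, assuming a 32-bit IEEE 754 single precision float; `FloatFields`, `splitFloat`, and `joinFloat` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is a 32-bit IEEE 754 single precision value.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with a bias of 127
    std::uint32_t fraction;  // 23 bits, the digits after the implicit leading 1
};

// Splits the bit pattern of `value` into its three fields.
FloatFields splitFloat(float value)
{
    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);
    FloatFields fields;
    fields.sign = raw >> 31;
    fields.exponent = (raw >> 23) & 0xFFu;
    fields.fraction = raw & 0x7FFFFFu;
    return fields;
}

// Reassembles a float from the three fields.
float joinFloat(const FloatFields& fields)
{
    assert(fields.sign <= 1u);
    assert(fields.exponent <= 0xFFu);
    assert(fields.fraction <= 0x7FFFFFu);
    const std::uint32_t raw =
        (fields.sign << 31) | (fields.exponent << 23) | fields.fraction;
    float value = 0.0f;
    std::memcpy(&value, &raw, sizeof value);
    return value;
}
```

For 0.15625 (bytes 0x3E 0x20 0x00 0x00 in the output above) this gives sign 0, exponent 124, and fraction 0x200000, that is 1.25 * 2^(124 - 127) = 0.15625.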

What you do not ever want to do is point at memory with a pointer to look at it with a different type. I never understood why this practice is used so often, particularly with structs.

    int broken_code ( void )
    {
        float test;
        unsigned char *result;

        test = 4.567;
        result=(unsigned char *)&test;

        //do something with result here

        test = 1.2345;

        //do something with result here

        return 0;
    }

That code will work 99% of the time, but not 100% of the time. It will fail when you least expect it and at the worst possible time, like the day after your most important customer receives it. It's the optimizer that eats your lunch with this coding style. Yes, I know most of you do this, were taught this, and perhaps have never been burned... yet. That just makes it more painful when it finally does happen, because now you know that it can fail and has failed (with popular compilers like gcc, on common computers like a PC).

After seeing this method fail while testing an FPU (programmatically building specific floating point numbers and bit patterns), I switched to the union approach, which so far has never failed. By definition the elements of a union share the same chunk of storage, and the compiler and optimizer do not get confused about two items in that shared storage being, well, in the same storage. With the above code you are relying on the assumption that there is non-register memory behind every use of the variables, and that every variable is written back to that memory before the next line of code runs. That is fine if you never optimize or if you use a debugger. In this case the optimizer does not know that result and test share the same chunk of memory, and that is the root of the problem. To play the pointer game you have to start putting volatile on everything and, as with a union, you still have to know how the compiler aligns and pads, and you still have to deal with endianness.

The problem, in general, is that the compiler doesn't know the two items share the same memory. For the specific trivial example above, I have watched the compiler optimize out the assignment of the number to the floating point variable because that value is never used. The address of that variable's storage is used, and if you were to, say, printf the *result data, the compiler would not optimize out the result pointer, and thus not the address of test, and thus not the storage for test. But in this simple example it can happen, and has happened, that the numbers 4.567 and 1.2345 never make it into the compiled program. I have also seen the compiler allocate the storage for test but assign the numbers to a floating point register, then never use that register nor copy its contents to the storage it had assigned. The reasons it fails in less trivial examples can be harder to follow, often having to do with register allocation and eviction: change one line of code and it works, change another and it breaks.

Memcpy,

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    void charRepresentation ( unsigned char *uc, float *f)
    {
        memcpy(uc,f,4);
    }

    void floatRepresentation ( unsigned char *uc, float *f )
    {
        memcpy(f,uc,4);
    }

    int main()
    {
        unsigned int ra;
        float test;
        unsigned char result[4];

        ra=0;
        if(sizeof(test)!=4) ra++;
        if(sizeof(result)!=4) ra++;
        if(ra)
        {
            printf("It aint gonna work\n");
            return(1);
        }

        test = 4.7567F;
        charRepresentation(result,&test);
        printf("0x%02X ",(unsigned char)result[3]);
        printf("0x%02X ",(unsigned char)result[2]);
        printf("0x%02X ",(unsigned char)result[1]);
        printf("0x%02X\n",(unsigned char)result[0]);

        test = 0.15625F;
        charRepresentation(result,&test);
        printf("0x%02X ",(unsigned char)result[3]);
        printf("0x%02X ",(unsigned char)result[2]);
        printf("0x%02X ",(unsigned char)result[1]);
        printf("0x%02X\n",(unsigned char)result[0]);

        result[3]=0x3E;
        result[2]=0xAA;
        result[1]=0xAA;
        result[0]=0xAB;
        floatRepresentation(result,&test);
        printf("%f\n",test);

        return 0;
    }

    gcc fcopy.c -o fcopy
    ./fcopy
    0x40 0x98 0x36 0xE3
    0x3E 0x20 0x00 0x00
    0.333333
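For readers who really are writing C++, the same memcpy idea can be wrapped in a small template. This is only a sketch: `bitCast` and `bitCastDemo` are hypothetical names, and the expected bit patterns in the demo assume a 32-bit IEEE 754 float. C++20 provides `std::bit_cast` in `<bit>` for the same purpose.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies the object representation of `source` into a value of type To.
// Both types must have the same size and be trivially copyable.
template <typename To, typename From>
To bitCast(const From& source)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To result;
    std::memcpy(&result, &source, sizeof result);
    return result;
}

// Usage example; the constants assume IEEE 754 single precision.
void bitCastDemo()
{
    assert(bitCast<std::uint32_t>(1.0f) == 0x3F800000u);
    assert(bitCast<float>(std::uint32_t{0x40000000u}) == 2.0f);
}
```

Because the bytes are copied rather than viewed through a differently typed pointer, the compiler has to treat the source and the result as separate objects.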

With the flaming I am going to get for my comments above, and depending on which side of the argument you choose to be on, perhaps memcpy is your safest route. You still have to know the compiler very well and manage your endianness. The compiler should not screw up the memcpy: it should store the registers to memory before the call and execute in order.
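One way to take the endianness question out of the picture is to copy the float into a 32-bit integer and pull the bytes out with shifts, which behave the same on every machine. A minimal sketch follows, assuming a 32-bit float that uses the same byte order as 32-bit integers on the platform; `toBigEndianBytes` and `fromBigEndianBytes` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is 32 bits and shares its byte order with std::uint32_t.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Returns the four bytes of `value`, most significant byte first.
std::array<unsigned char, 4> toBigEndianBytes(float value)
{
    std::uint32_t raw = 0;
    std::memcpy(&raw, &value, sizeof raw);
    return {{ static_cast<unsigned char>((raw >> 24) & 0xFFu),
              static_cast<unsigned char>((raw >> 16) & 0xFFu),
              static_cast<unsigned char>((raw >> 8) & 0xFFu),
              static_cast<unsigned char>(raw & 0xFFu) }};
}

// Rebuilds a float from four bytes stored most significant byte first.
float fromBigEndianBytes(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    const std::uint32_t raw = (static_cast<std::uint32_t>(bytes[0]) << 24) |
                              (static_cast<std::uint32_t>(bytes[1]) << 16) |
                              (static_cast<std::uint32_t>(bytes[2]) << 8) |
                              static_cast<std::uint32_t>(bytes[3]);
    float value = 0.0f;
    std::memcpy(&value, &raw, sizeof value);
    return value;
}
```

For 1.0F this produces 0x3F 0x80 0x00 0x00, the same order as the printed output above, without indexing the array backwards by hand.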


Solution:3

The best you can do is create a custom float that is one byte in size, or use a char as a fixed-point decimal. In either case this will lead to a significant loss of precision.


Solution:4

You can only do so partially in a way that won't allow you to fully recover the original float. In general, this is called Quantization, and depending on your requirements there is an art to picking a good quantization. For example, floating point values used to represent R, G and B in a pixel will be converted to a char before being displayed on a screen.

Alternatively, it's easy to store a float in its entirety as four chars, with each char storing some of the information about the original number.
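The quantization mentioned above can be sketched for the pixel case. This assumes channel values in the range 0.0 to 1.0 mapped linearly onto 0 to 255, which is one common convention and not the only one; `quantizeUnit` and `dequantizeUnit` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps a value in [0.0, 1.0] to one of 256 levels, rounding to the nearest.
// Values outside the range are clamped first.
std::uint8_t quantizeUnit(float value)
{
    assert(!std::isnan(value));
    if (value < 0.0f) value = 0.0f;
    if (value > 1.0f) value = 1.0f;
    return static_cast<std::uint8_t>(std::lround(value * 255.0f));
}

// Maps a level back to [0.0, 1.0]. The result is only an approximation of
// whatever float was quantized, because 256 levels cannot hold every float.
float dequantizeUnit(std::uint8_t level)
{
    return static_cast<float>(level) / 255.0f;
}
```

The round trip loses information: every input between two adjacent levels comes back as the same float, which is the partial recovery described above.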


Solution:5

You can create, for that number, a fixed point value using 3 bits for the whole number (4 does not fit in 2 bits) and 4 bits for the fractional portion (or 5 if you want it to be unsigned). That would be able to store roughly 4.75 in terms of accuracy. You don't quite have enough size to represent that number much more accurately, unless you used a ROM lookup table of 256 entries where you are storing your info outside the number itself and in the translator.
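A minimal sketch of such a fixed point byte, assuming an unsigned layout with 3 integer bits and 5 fractional bits (a split chosen here for illustration, since the integer part of 4.7567 needs 3 bits). The names `toFixed3_5` and `fromFixed3_5` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Encodes a value in [0, 8) as an unsigned fixed point byte:
// the stored integer is the value multiplied by 2^5 = 32, rounded.
std::uint8_t toFixed3_5(float value)
{
    assert(value >= 0.0f && value < 8.0f);
    long scaled = std::lround(value * 32.0f);
    if (scaled > 255) scaled = 255;   // values just below 8 may round up to 256
    return static_cast<std::uint8_t>(scaled);
}

// Decodes the byte back to a float; the step size is 1/32 = 0.03125.
float fromFixed3_5(std::uint8_t fixed)
{
    return static_cast<float>(fixed) / 32.0f;
}
```

With this layout 4.7567 is stored as 152 and read back as 4.75.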


Solution:6

    int main()
    {
        float test = 4.7567;
        char result = charRepresentation(test);
        return 0;
    }

If we ignore that your float is a float and convert 47567 into binary, we get 10111001 11001111. This is 16 bits, which is twice the size of a char (8 bits). Floats store their numbers by storing a sign bit (+ or -), an exponent (where to put the decimal point, in this case 10^-4), and then the significant digits (47567). There's just not enough room in a char to store a float.

Alternatively, consider that a char can store only 256 different values. With four decimal places of precision, there are far more than 256 different values between 1 and 4.7567 or even 4 and 4.7567. Since you can't differentiate between more than 256 different values, you don't have enough room to store it.

You could conceivably write something that would 'translate' from a float to a char by limiting yourself to an extremely small range of values and only one or two decimal places*, but I can't think of any reason you would want to.

*You can store any value between 0 and 255 in a char, so if you always multiplied the value in the char by 10^-1 or 10^-2 (you could only use one of these options, not both, since there isn't enough room to store the exponent) you could store any number between 0 and 25.5 or 0 and 2.55. I don't know what use this would have though.
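The footnote's scheme with a fixed factor of 10^-1 might look like the sketch below. `encodeTenths` and `decodeTenths` are hypothetical names, and returning -1 for values that do not fit is just one possible way to report the limited range.

```cpp
#include <cassert>
#include <cmath>

// Stores `value` as a count of tenths in the range 0..255.
// Returns -1 when the value is outside 0 to 25.5 (or is not a number).
int encodeTenths(float value)
{
    if (!(value >= 0.0f && value <= 25.5f))
        return -1;
    return static_cast<int>(std::lround(value * 10.0f));
}

// Recovers the approximate value; anything finer than 0.1 is lost.
float decodeTenths(int stored)
{
    assert(stored >= 0 && stored <= 255);
    return static_cast<float>(stored) / 10.0f;
}
```

Here 4.7567 is stored as 48 and comes back as about 4.8, while 30.0 cannot be stored at all.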


Solution:7

A C char is only 8 bits (on most platforms). The basic problem this causes is two-fold. First, almost all FPUs in existence support IEEE floating point. That means floating point values either require 32 bits, or 64. Some support other non-standard sizes, but the only ones I'm aware of are 80 bits. None I have ever heard of support floats of only 8 bits. So you couldn't have hardware support for an 8-bit float.

More importantly, you wouldn't be able to get a lot of digits out of an 8-bit float. Remember that some bits are used to represent the exponent. You'd have almost no precision left for your digits.

Are you perhaps instead wanting to know about Fixed point? That would be doable in a byte.

