Tutorial: Detecting endianness programmatically in a C++ program



Question:

Is there a programmatic way to detect whether or not you are on a big-endian or little-endian architecture? I need to be able to write code that will execute on an Intel or PPC system and use exactly the same code (i.e. no conditional compilation).


Solution:1

I don't like the method based on type punning through a pointer cast; compilers often warn about it. That's exactly what unions are for!

#include <stdint.h>

bool is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    /* On a big-endian host the most significant byte (0x01) comes first. */
    return bint.c[0] == 1;
}

The principle is equivalent to the type cast suggested by others, but this is clearer and, according to C99, guaranteed to be correct. GCC prefers this to the direct pointer cast.

This is also much better than fixing the endianness at compile time: for operating systems that support multiple architectures (fat binaries on Mac OS X, for example), this will work for both ppc and i386, whereas it is very easy to mess things up otherwise.
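The C99 guarantee cited above does not carry over to C++, where reading a union member other than the one last written is undefined behavior (GCC supports it as an extension). A minimal sketch of the same test using std::memcpy, which is well defined in both languages, could look like this; the name is_big_endian_memcpy is hypothetical and not part of the original answer:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original answer): the same check as the
// union version, but the bytes are copied out with memcpy instead of punned.
bool is_big_endian_memcpy()
{
    const std::uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);  // copy the object representation
    return bytes[0] == 0x01;                   // most significant byte stored first
}
```

Compilers typically fold the memcpy away, so this should cost no more than the union version.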


Solution:2

You can do it by setting an int and masking off bits, but probably the easiest way is just to use the built-in network byte conversion ops (since network byte order is always big-endian).

// htonl() is declared in <arpa/inet.h> on POSIX systems
if ( htonl(47) == 47 ) {
    // Big endian
} else {
    // Little endian.
}

Bit fiddling could be faster, but this way is simple, straightforward and pretty impossible to mess up.
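Since network byte order is always big-endian, the same conversion functions can also be used to serialize values without ever asking which endianness the host has. The following is a minimal sketch assuming a POSIX system (htonl and ntohl from <arpa/inet.h>); put_be32 and get_be32 are hypothetical names chosen for illustration:

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cstdint>
#include <cstring>

// Hypothetical helper: write a 32-bit value to a buffer in big-endian order.
void put_be32(unsigned char* out, std::uint32_t host_value)
{
    const std::uint32_t wire = htonl(host_value);  // host order -> network order
    std::memcpy(out, &wire, sizeof wire);
}

// Hypothetical helper: read a big-endian 32-bit value from a buffer.
std::uint32_t get_be32(const unsigned char* in)
{
    std::uint32_t wire;
    std::memcpy(&wire, in, sizeof wire);
    return ntohl(wire);                            // network order -> host order
}
```

The buffer layout is the same on Intel and PPC, so the calling code needs no endianness test at all.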


Solution:3

Please see this article:

Here is some code to determine what is the type of your machine

int num = 1;
if(*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}


Solution:4

This is normally done at compile time (especially for performance reasons), using the header files available from the compiler or by creating your own. On Linux you have the header file "/usr/include/endian.h".
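As an illustration of what that header offers, here is a small sketch that assumes glibc's <endian.h> on Linux, which provides conversion functions such as htole32 and le32toh in addition to the byte-order macros. The wrapper names to_disk_le32 and from_disk_le32 are hypothetical:

```cpp
#include <endian.h>  // glibc: htole32, le32toh (assumed available, Linux only)
#include <cstdint>

// Hypothetical wrapper: convert a host-order value to little-endian layout.
std::uint32_t to_disk_le32(std::uint32_t host_value)
{
    return htole32(host_value);  // no-op on little-endian hosts, byte swap on big-endian
}

// Hypothetical wrapper: convert a little-endian value back to host order.
std::uint32_t from_disk_le32(std::uint32_t disk_value)
{
    return le32toh(disk_value);
}
```

The decision is made at compile time inside the header, so there is no run-time test in the calling code.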


Solution:5

You can use std::endian if you have access to a C++20 compiler. In the final standard it is declared in <bit>; early implementations such as GCC 8 and Clang 7 exposed it from <type_traits>:

#include <bit>

if constexpr (std::endian::native == std::endian::big)
{
    // Big endian system
}
else if constexpr (std::endian::native == std::endian::little)
{
    // Little endian system
}
else
{
    // Something else
}
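When the code must also build with pre-C++20 toolchains, one option is to use std::endian where the library advertises it and otherwise fall back on the predefined __BYTE_ORDER__ macros of GCC and Clang. This is only a sketch under those assumptions; ByteOrder and native_byte_order are hypothetical names, and other compilers end up in the "other" branch:

```cpp
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>  // defines __cpp_lib_endian when std::endian is available
#  endif
#endif

// Hypothetical result type for this sketch.
enum class ByteOrder { little, big, other };

constexpr ByteOrder native_byte_order()
{
#if defined(__cpp_lib_endian)
    // C++20 library support is present.
    return std::endian::native == std::endian::little ? ByteOrder::little
         : std::endian::native == std::endian::big    ? ByteOrder::big
                                                      : ByteOrder::other;
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && defined(__ORDER_BIG_ENDIAN__)
    // GCC/Clang predefined macros.
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? ByteOrder::little
         : __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__    ? ByteOrder::big
                                                     : ByteOrder::other;
#else
    // No compile-time information available in this sketch.
    return ByteOrder::other;
#endif
}
```

Because the function is constexpr, its result can be used in static_assert or as a template argument.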


Solution:6

Ehm... It surprises me that no one has realized that the compiler will simply optimize the test out and put a fixed result in as the return value. This renders all the code examples above effectively useless. The only thing that would be returned is the endianness at compile time! And yes, I tested all of the above examples. Here's an example with MSVC 9.0 (Visual Studio 2008).

Pure C code

int32 DNA_GetEndianness(void)
{
    union
    {
        uint8  c[4];
        uint32 i;
    } u;

    u.i = 0x01020304;

    if (0x04 == u.c[0])
        return DNA_ENDIAN_LITTLE;
    else if (0x01 == u.c[0])
        return DNA_ENDIAN_BIG;
    else
        return DNA_ENDIAN_UNKNOWN;
}

Disassembly

PUBLIC  _DNA_GetEndianness
; Function compile flags: /Ogtpy
; File c:\development\dna\source\libraries\dna\endian.c
;   COMDAT _DNA_GetEndianness
_TEXT   SEGMENT
_DNA_GetEndianness PROC                 ; COMDAT

; 11   :     union
; 12   :     {
; 13   :         uint8  c[4];
; 14   :         uint32 i;
; 15   :     } u;
; 16   :
; 17   :     u.i = 1;
; 18   :
; 19   :     if (1 == u.c[0])
; 20   :         return DNA_ENDIAN_LITTLE;

    mov eax, 1

; 21   :     else if (1 == u.c[3])
; 22   :         return DNA_ENDIAN_BIG;
; 23   :     else
; 24   :        return DNA_ENDIAN_UNKNOWN;
; 25   : }

    ret
_DNA_GetEndianness ENDP
END

Perhaps it is possible to turn off ANY compile-time optimization for just this function, but I don't know. Otherwise it may be possible to hardcode it in assembly, although that's not portable. And even then it might get optimized out. It makes me think I need some really crappy assembler, implement the same code for all existing CPUs/instruction sets, and well.... never mind.

Also, someone here said that endianness does not change during run time. WRONG. There are bi-endian machines out there. Their endianness can vary during execution. ALSO, there's not only little-endian and big-endian, but also other endiannesses (what a word).

I hate and love coding at the same time...


Solution:7

Declare an int variable:

int variable = 0xFF;

Now use char* pointers to various parts of it and check what is in those parts.

char* startPart = reinterpret_cast<char*>( &variable );
char* endPart = reinterpret_cast<char*>( &variable ) + sizeof( int ) - 1;

Depending on which one now points to the 0xFF byte, you can detect the endianness. This requires sizeof( int ) > sizeof( char ), which definitely holds for the platforms discussed.
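The comparison step can be sketched as a complete function. This version uses unsigned char pointers rather than char, because on platforms where plain char is signed the byte 0xFF reads back as -1 and would never compare equal to 0xFF. The names Order and detect_order are hypothetical:

```cpp
// Hypothetical result type for this sketch.
enum class Order { little, big, unknown };

Order detect_order()
{
    int variable = 0xFF;

    // unsigned char avoids sign-extension surprises when comparing with 0xFF.
    const unsigned char* startPart = reinterpret_cast<const unsigned char*>(&variable);
    const unsigned char* endPart   = startPart + sizeof(int) - 1;

    if (*startPart == 0xFF && *endPart == 0x00)
        return Order::little;   // low-order byte at the lowest address
    if (*endPart == 0xFF && *startPart == 0x00)
        return Order::big;      // low-order byte at the highest address
    return Order::unknown;      // e.g. sizeof(int) == 1 or a mixed layout
}
```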


Solution:8

I'm surprised no one has mentioned the macros that the preprocessor defines by default. While these vary depending on your platform, they are much cleaner than writing your own endian check.

For example, if we look at the built-in macros that GCC defines (on an x86-64 machine):

:| gcc -dM -E -x c - |grep -i endian
#define __LITTLE_ENDIAN__ 1

On a PPC machine I get:

:| gcc -dM -E -x c - |grep -i endian
#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

(The :| gcc -dM -E -x c - magic prints out all built-in macros).


Solution:9

For further details, you may want to check out this CodeProject article, Basic concepts on Endianness:

How to dynamically test for the Endian type at run time?

As explained in the Computer Animation FAQ, you can use the following function to see if your code is running on a little- or big-endian system:

#define BIG_ENDIAN      0
#define LITTLE_ENDIAN   1

int TestByteOrder()
{
    short int word = 0x0001;
    char *byte = (char *) &word;
    return(byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This code assigns the value 0001h to a 16-bit integer. A char pointer is then pointed at the first byte of the integer, i.e. the byte at the lowest address. If that byte is 01h, the system is little-endian (the least-significant byte is stored at the lowest address). If it is 00h, the system is big-endian.


Solution:10

As stated above, use union tricks.

There are a few problems with the ones advised above, though, most notably that unaligned memory access is notoriously slow on most architectures, and some compilers won't recognize such constant predicates at all unless the data is word-aligned.

Because a mere endian test is boring, here is a (template) function that flips the byte order of an arbitrary integer according to your spec, regardless of host architecture.

#include <stdint.h>

#define BIG_ENDIAN 1
#define LITTLE_ENDIAN 0

template <typename T>
T endian(T w, uint32_t endian)
{
    // this gets optimized out into if (endian == host_endian) return w;
    union { uint64_t quad; uint32_t islittle; } t;
    t.quad = 1;
    if (t.islittle ^ endian) return w;
    T r = 0;

    // decent compilers will unroll this (gcc)
    // or even convert straight into single bswap (clang)
    for (int i = 0; i < sizeof(r); i++) {
        r <<= 8;
        r |= w & 0xff;
        w >>= 8;
    }
    return r;
}

Usage:

To convert from given endian to host, use:

host = endian(source, endian_of_source)

To convert from host endian to given endian, use:

output = endian(hostsource, endian_you_want_to_output)

The resulting code is as fast as hand-written assembly on clang; on gcc it's a tad slower (unrolled &, <<, >>, | for every byte) but still decent.
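For a single fixed width, the byte-by-byte loop reduces to a handful of shifts and masks. A minimal sketch for 32-bit values follows; swap_bytes32 is a hypothetical name, not part of the answer above:

```cpp
#include <cstdint>

// Hypothetical helper: reverse the byte order of a 32-bit unsigned value.
constexpr std::uint32_t swap_bytes32(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |  // lowest byte  -> highest byte
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);   // highest byte -> lowest byte
}
```

For example, swap_bytes32(0x01020304u) yields 0x04030201u, and applying it twice returns the original value.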


Solution:11

Unless you're using a framework that has been ported to PPC and Intel processors, you will have to do conditional compiles, since PPC and Intel platforms have completely different hardware architectures, pipelines, busses, etc. This renders the assembly code completely different between the two.

As for finding endianness, do the following:

short temp = 0x1234;
char* tempChar = (char*)&temp;

The byte that tempChar points to will be either 0x12 (big-endian) or 0x34 (little-endian), from which you will know the endianness.


Solution:12

The C++ way has been to use Boost, where preprocessor checks and casts are compartmentalized away inside very thoroughly tested libraries.

The Predef Library (boost/predef.h) recognizes four different kinds of endianness.

The Endian Library was planned to be submitted to the C++ standard, and supports a wide variety of operations on endian-sensitive data.

As stated in the answers above, endianness detection (std::endian) is part of C++20.


Solution:13

I would do something like this:

bool isBigEndian() {
    static unsigned long x(1);
    static bool result(reinterpret_cast<unsigned char*>(&x)[0] == 0);
    return result;
}

Along these lines, you would get a time efficient function that only does the calculation once.


Solution:14

bool isBigEndian()
{
    static const uint16_t m_endianCheck(0x00ff);
    return ( *((uint8_t*)&m_endianCheck) == 0x0);
}


Solution:15

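A compile-time, non-macro, C++11 constexpr solution (note that reading the inactive union member is not formally a constant expression, so not every compiler accepts it):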

union {
  uint16_t s;
  unsigned char c[2];
} constexpr static d {1};

constexpr bool is_little_endian() {
  return d.c[0] == 1;
}


Solution:16

union {
    int i;
    char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");

This is another solution. Similar to Andrew Hare's solution.


Solution:17

Untested, but to my mind this should work, because the first byte will be 0x01 on little-endian and 0x00 on big-endian.

bool runtimeIsLittleEndian(void)
{
    volatile uint16_t i=1;
    return ((uint8_t*)&i)[0]==0x01; //0x01=little, 0x00=big
}


Solution:18

You can also do this via the preprocessor, using something like the Boost endian header file.


Solution:19

int i=1;
char *c=(char*)&i;
bool littleendian=(*c==1);  // test the first byte, not the pointer (which is never null)


Solution:20

How about this?

#include <cstdio>

int main()
{
    unsigned int n = 1;
    char *p = 0;

    p = (char*)&n;
    if (*p == 1)
        std::printf("Little Endian\n");
    else
        if (*(p + sizeof(int) - 1) == 1)
            std::printf("Big Endian\n");
        else
            std::printf("What the crap?\n");
    return 0;
}


Solution:21

Unless the endian header is GCC-only, it provides macros you can use.

#include "endian.h"
...
if (__BYTE_ORDER == __LITTLE_ENDIAN) { ... }
else if (__BYTE_ORDER == __BIG_ENDIAN) { ... }
else { throw std::runtime_error("Sorry, this version does not support PDP Endian!"); }
...


Solution:22

If you don't want conditional compilation, you can just write endian-independent code. Here is an example (taken from Rob Pike):

Reading an integer stored in little-endian order on disk, in an endian-independent manner:

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);  

The same code, trying to take into account the machine endianness:

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif
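The endian-independent approach can be wrapped in a pair of small functions. This is a sketch rather than Rob Pike's own code: load_le32 and store_le32 are hypothetical names, and each byte is widened to an unsigned 32-bit type before shifting so that the top byte never lands in the sign bit of an int:

```cpp
#include <cstdint>

// Hypothetical helper: read a little-endian 32-bit value from a byte buffer.
std::uint32_t load_le32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[0])        |
           (static_cast<std::uint32_t>(data[1]) << 8)  |
           (static_cast<std::uint32_t>(data[2]) << 16) |
           (static_cast<std::uint32_t>(data[3]) << 24);
}

// Hypothetical helper: write a 32-bit value to a byte buffer in little-endian order.
void store_le32(unsigned char* out, std::uint32_t value)
{
    out[0] = static_cast<unsigned char>(value & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFFu);
}
```

Both functions behave identically on Intel and PPC because shifts operate on values, not on the in-memory layout.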





        