Tutorial :Is Ruby really an interpreted language if all of its implementations are compiled into bytecode?


In the chosen answer for this question about Blue Ruby, Chuck says:

All of the current Ruby implementations are compiled to bytecode. Contrary to SAP's claims, as of Ruby 1.9, MRI itself includes a bytecode compiler, though the ability to save the compiled bytecode to disk disappeared somewhere in the process of merging the YARV virtual machine. JRuby is compiled into Java .class files. I don't have a lot of details on MagLev, but it seems safe to say it will take that road as well.

I'm confused about this compilation/interpretation issue with respect to Ruby.

I learned that Ruby is an interpreted language and that's why when I save changes to my Ruby files I don't need to re-build the project.

But if all of the Ruby implementations now are compiled, is it still fair to say that Ruby is an interpreted language? Or am I misunderstanding something?


Yes, Ruby's still an interpreted language, or more precisely, Matz's Ruby Interpreter (MRI), which is what people usually talk about when they talk about Ruby, is still an interpreter. The compilation step is simply there to reduce the code to something that's faster to execute than interpreting and reinterpreting the same code time after time.


Nearly every language is "compiled" nowadays, if you count bytecode as being compiled. Even Emacs Lisp is compiled. Ruby was a special case because until recently, it wasn't compiled into bytecode.

I think you're right to question the utility of characterizing languages as "compiled" vs. "interpreted." One useful distinction, though, is whether the language creates machine code (e.g. x86 assembler) directly from user code. C, C++, many Lisps, and Java with JIT enabled do, but Ruby, Python, and Perl do not.

People who don't know better will call any language that has a separate manual compilation step "compiled" and ones that don't "interpreted."


A subtle question indeed... It used to be that "interpreted" languages were parsed and transformed into an intermediate form which was faster to execute, but the "machine" executing them was a pretty language specific program. "Compiled" languages were translated instead into the machine code instructions supported by the computer on which it was run. An early distinction was very basic--static vs. dynamic scope. In a statically typed language, a variable reference could pretty much be resolved to a memory address in a few machine instructions--you knew exactly where in the calling frame the variable referred. In dynamically typed languages you had to search (up an A-list or up a calling frame) for the reference. With the advent of object oriented programming, the non-immediate nature of a reference expanded to many more concepts--classes(types), methods(functions),even syntactical interpretation (embedded DSLs like regex).

The distinction, in fact, going back to maybe the late 70's was not so much between compiled and interpreted languages, but whether they were run in a compiled or interpreted environment. For example, Pascal (the first high-level language I studied) ran at UC Berkeley first on Bill Joy's pxp interpreter, and later on the compiler he wrote pcc. Same language, available in both compiled and interpreted environments.

Some languages are more dynamic than others, the meaning of something--a type, a method, a variable--is dependent on the run-time environment. This means that compiled or not there is substantial run-time mechanism associated with executing a program. Forth, Smalltalk, NeWs, Lisp, all were examples of this. Initially, these languages required so much mechanism to execute (versus a C or a Fortran) that they were a natural for interpretation.

Even before Java, there were attempts to speed up execution of complex, dynamic languages with tricks, techniques which became threaded compilation, just-in-time compilation, and so on.

I think it was Java, though, which was the first wide-spread language that really muddied the compiler/interpreter gap, ironically not so that it would run faster (though, that too) but so that it would run everywhere. By defining their own machine language and "machine" the java bytecode and VM, Java attempted to become a language compiled into something close to any basic machine, but not actually any real machine.

Modern languages marry all these innovations. Some have the dynamic, open-ended, you-don't-know-what-you-get-until-runtime nature of traditional "interpreted languages (ruby, lisp, smalltalk, python, perl(!)), some try to have the rigor of specification allowing deep type-based static error detection of traditional compiled languages (java, scala). All compile to actual machine-independent representations (JVM) to get write once-run anywhere semantics.

So, compiled vs. interpreted? Best of both, I'd say. All the code's around in source (with documentation), change anything and the effect is immediate, simple operations run almost as fast as the hardware can do them, complex ones are supported and fast enough, hardware and memory models are consistent across platforms.

The bigger polemic in languages today is probably whether they are statically or dynamically typed, which is to say not how fast will they run, but will the errors be found by the compiler beforehand (at the cost of the programmer having to specify pretty complex typing information) or will the errors come up in testing and production.


You can run Ruby programs interactively using irb, the Interactive Ruby Shell. While it may generate intermediate bytecode, it's certainly not a "compiler" in the traditional sense.


A compiled language is generally compiled into machine code, as opposed to just byte code. Some byte code generators can actually further compile the byte code into machine code though.

Byte code itself is just an intermediate step between the literal code written by the user and the virtual machine, it still needs to be interpreted by the virtual machine though (as it's done with Java in a JVM and PHP with an opcode cache).


This is possibly a little off topic but...

Iron Ruby is a .net based implementation of ruby and therefore is usually compiled to byte code and then JIT compiled to machine language at runtime (i.e. not interpreted). Also (at least with other .net languages, so I assume with ruby) ngen can be used to generate a compiled native binary ahead of time, so that's effectively a machine code compiled version of ruby code.


As for the information I got from RubyConf 2011 in Shanghai, Matz is developing a 'MRuby'(stands for Matz's Ruby) to targeting running on embedded devices. And Matz said the the MRuby, will provide the ability to compile the ruby code into machine code to boost the speed and decrease the usage of the (limited) resources on the embedded devices. So, there're various kind of Ruby implementation and definitely not all of them are just interpreted during the runtime.

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »