Tutorial :Setting up multiple MySQL databases with scalability options



Question:

I need to set up a MySQL environment that will support adding many unique databases over time (thousands, actually). I assume that at some point I will need to start adding MySQL servers, and would like my environment to be prepared for the case beforehand, to make the transition to a 2nd, 3rd, 100th server easy.

And just to make it interesting, It would be very convenient if the solution was modeled so the application that queries the databases sends all the queries to a single address and receives a result. It should be unaware of the number and location of the servers. The database name is unique and can be used to figure out which server holds the database.

I've done some research, and MySQL Proxy pops out as the main candidate, but I haven't been able to find anything specific about making it perform as described above.

Anyone?


Solution:1

Great question. I know of several companies that have done this (Facebook jumps out as the biggest). None are happy, but alternatives kind of suck, too.

More things for you to consider -- what happens when some of these databases or servers fail? What happens when you need to do a cross-database query (and you will, even if you don't think so right now).

Here's the FriendFeed solution: http://bret.appspot.com/entry/how-friendfeed-uses-mysql

It's a bit "back-asswards" since they are basically using MySQL as a glorified key-value store. I am not sure why they don't just cut out the middleman and use something like BerkeleyDB for storing their objects. Connection management, maybe? Seems like the MySQL overhead would be too high a price to pay for something that could be added pretty easily (famous last words).

What you are really looking for (I think) is a distributed share-nothing database. Several have been built on top of open-source technologies like MySQL and PostgreSQL, but none are available for free. If you are in the buying mood, check out these companies: Greenplum, AsterData, Netezza, Vertica.

There is also a large number of various distributed key-value storage solutions out there. For lack of a better reference, here's a starting point: http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/ .


Solution:2

Your problem sounds similar to one we faced - that you are acting as a white-label, and that each client needs to have their own separate database. Assuming this concept parallels yours, what we did was leverage a "master" database that stored the hostname and database name for the client (which could be cached in the application tier). The server the client was accessing could then dynamically shift its datasource to the required database. This allowed us to scale up to thousands of client databases, scattered across servers.


Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »