MySQL: The Dissing and the Defense
By Channel Insider Staff | Posted 2003-12-31
Here's a select sampling of what Slashdot users had to say about the recent, independent study of MySQL code on the part of Reasoning Inc. To read the whole string, click here.
It does indeed sound a bit like that, and with good reason. If you notice, the "independent review" was carried out by Reasoning Inc., and we've heard of them before in these parts.
For the benefit of those who haven't seen this trollfest^H^H^H^H^H^H^H^H^Hstory in its previous incarnations, Reasoning's services spot what some people call "systematic" errors, things like NULL pointer dereferencing or the use of uninitialized variables. As many people note every time this subject comes up, any smart development team will use a tool like Lint to check their code anyway, as a required step before check-in and/or as a regular, automated check of the entire code base, and so any smart development team should find all such errors immediately. IOWs, it's grossly unfair to compare open and closed source "code quality" on this basis. Any project that has errors like this in it at all isn't serious about quality, and it shouldn't take an external study to point this out.
Serious code quality is not dictated by how many mechanical errors there are that slip through because of weaknesses in the implementation language. Rather, it is indicated by how many "genuine" logic errors (cases where the output differs unintentionally from the specifications) there are. Of course, no automated process can identify those, but to get a meaningful comparison of code quality, you'd need to investigate that aspect, rather than kindergarten mistakes.
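To make the distinction concrete, here is a minimal sketch of a "genuine" logic error in C. The function name `larger_of` and its one-line specification are hypothetical, invented purely for illustration. The code has no NULL dereference, no uninitialized variable, and no leak, so a checker looking only for that class of mechanical defect would have nothing to report, yet the result contradicts the stated spec.

```c
/* Assumed spec (hypothetical, for illustration only):
   return the larger of a and b. */
int larger_of(int a, int b)
{
    /* Deliberate logic error: the comparison is inverted, so this
       returns the SMALLER value. Nothing here is memory-unsafe. */
    return (a < b) ? a : b;
}
```

Calling `larger_of(3, 5)` returns 3 rather than the 5 the spec asks for. Only a comparison against the specification reveals the problem.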
There are other objections to their principal metric as well. For starters, source code layout is not normally significant in C, C++ or Java, so any metric based on line count is going to be flawed at best. But the big objection is that they're talking about childish mistakes, and comparing supposedly world-class software based on childish mistakes isn't helpful (except to dispel the myth that some big name products have sensible development processes).
Re: they quantified it by dividing verified defects by lines of code.
Problem with that is that it assumes the same "code density". Granted, it's probably not going to differ by a factor of six, but remember the old question about programmer productivity:
Who's more productive: the coder who solves a given problem with 100 lines of code written in 1 hour, or the coder who solves it with 10 lines in 2 hours?
I mean, simple stuff like doing this:
bool function(int i);
//blah blah blah
bool function(int i);
foo = false;
foo = function(i);
//blah blah blah
...will give you a threefold difference in line count (specifically counting lines in the main() function). Throw in an identical line using malloc in each, both forgetting to free it later, and you've got a "bug density" of .33 for the former, and .14 for the latter. Heck, you could have two un-freed mallocs in the latter and it'd still only be at .25! I'm not saying the study is wrong (I'd rather have the code out where I can see it, no matter WHAT the "bug density"); I'm just saying that I wouldn't take any statistic that is derived using "lines of code" as a variable as a serious, hard number.
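The arithmetic in that comment can be checked with a small sketch. The helper name `defect_density` is hypothetical, and the line counts used below (3, 7 and 8) are inferred from the figures the commenter quotes, not taken from the study itself.

```c
#include <assert.h>

/* Defects per counted line: the metric the comment is questioning.
   Hypothetical helper, not part of Reasoning's methodology. */
double defect_density(int defects, int lines)
{
    assert(lines > 0);  /* a density over zero lines is undefined */
    return (double)defects / (double)lines;
}
```

With one leaked malloc, `defect_density(1, 3)` is about 0.33 for the compact version and `defect_density(1, 7)` is about 0.14 for the verbose one. Two leaks in the verbose version give `defect_density(2, 8)`, which is 0.25. The same mistake scores very differently depending only on how the surrounding code is laid out.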
If only it were just a matter of MySQL lacking features that would, after much mudslinging at the ideas themselves, be grudgingly retrofitted into a new table type. MySQL's brokenness goes deeper than that.
MySQL's attitude toward data integrity can be summed up as "if the constraint can't be satisfied, do it half-assed anyway." I find myself having to write application code to manage data integrity with MySQL, something I can take for granted with a real database.
No defects != good software.
A flawless implementation of a crap algorithm is still crap. I don't care if your bubble-sort routine has no memory leaks or buffer overruns; it still scales O(N^2). Likewise, a so-called "database" which does not implement key features like transactions and stored procedures is fundamentally flawed even if there are zero coding errors.
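As a rough illustration of the bubble-sort point, the sketch below is a plain bubble sort in C that also counts its comparisons. The function name and the counting are additions for illustration. The routine has no leaks or overruns, yet the comparison count still grows as N(N-1)/2.

```c
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order, in place, and returns the number
   of element comparisons performed. Illustrative sketch only. */
unsigned long bubble_sort(int *a, size_t n)
{
    unsigned long comparisons = 0;
    for (size_t pass = 0; pass + 1 < n; pass++) {
        /* After each pass the largest remaining element has settled
           at the end, so the inner loop shrinks by one each time. */
        for (size_t i = 0; i + 1 < n - pass; i++) {
            comparisons++;
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
    return comparisons;
}
```

Sorting 4 elements takes 6 comparisons and sorting 8 takes 28: doubling the input roughly quadruples the work, however clean the code is.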
MySQL may be well-written, but it's still a piece of crap by the standards of any professional DBA.
Sorry, but until MySQL has a mode where ALL tables are transaction safe, or at least throws an error when you try to create a fk reference to a non-transaction safe table, its transactions are too prone to data loss due to human error.
It's a good data store, but the guys programming it have to "get it" that transactions can't be optional in certain types of databases, and neither can constraints, or fk enforcement.
MySQL has a tendency of failing to do what you thought it did, and failing to report an error so you know. This is a legacy left over from being a SQL interpreter over ISAM files. It makes MySQL a great choice for content management, but a dangerous choice for transactional systems.
Yeah, and the 3 users on the planet who actually need a full fledged SQL database can install Oracle or DB2. Although I've had my indexes corrupted and other horrible things with both those database packages.
I've worked on several projects interacting with SQL databases and I've only seen one really take advantage of the power of the database. Most of them are using Oracle as a glorified DBASE III, and as a glorified DBASE III, MySQL is much less expensive. And I've seen entire companies built around DBASE III applications.
Re: Six times better?
Sadly, this isn't what most people assume it means. Reasoning's software only finds "obvious" defects, such as null pointer assignments. It doesn't (and can't) determine if a bit of code does what it's supposed to do, only that it does whatever it does without any danger of crashing.
Basically, it's no different from running your code through BoundsChecker or CodeWizard, or any number of other such tools that check for obvious errors (Null pointers, obvious buffer overflows, dangling references, etc.)
While I have no doubt that MySQL's code is perhaps "cleaner" than your typical unpublished code, I have plenty of doubt that MySQL's code is "better" than unpublished code in terms of efficiency, logic errors, etc.
Re: On paper it looks better
That's like asking how a little red wagon compares to a Formula 1 racecar -- there is simply no comparison. The list of missing features in MySQL could fill a book. MySQL is not a true relational database, so comparing it to Oracle, Sybase, DB2, or MS-SQL is like comparing apples and very small rocks. They're not the same thing at all.
It would be more accurate to compare MySQL to dBASE III, Berkeley DB, or Microsoft Access. Against those products, MySQL compares favorably. MySQL performs well for tasks in a narrowly-constrained domain of problems, and is totally incapable of dealing with anything else.
This "proves" that MySQL is better than commercial offerings. Good. A lot of people knew that. Hats off to the developers. But... 1. This cannot be generalized into a property of all open source projects. 2. It's more a tribute to the architecture and original core developers of MySQL than anything else. 3. Realize that even though MySQL is an open source product, MySQL AB is the *company* that organizes and pays for MySQL development. So, again, you can't generalize this into something that covers late night hackers working on personal projects in their basements (the open source geek fantasy). MySQL is awesome! But let's be careful about this story, okay? It's the over-generalization that gives OSS/Linux advocates a bad name ("The Gimp is equivalent to Photoshop!").
MySQL is a "TOY" as far as RDBMSs go
First off, I think MySQL is a fantastic product. It's the perfect mix of speed and ease of use, well suited for small to medium sized datastores where speed and reliability are a must. That being said, I think it's unfair to describe this product alongside others such as Oracle, MSSQL (blow me guys, it's a great product) and even PostgreSQL and SAP DB (which is the best open source option in my opinion). The codebase for MySQL will never achieve the magnitude of the aforementioned products so it should be used that way. Just my 2 cents.